Security Analysis in Psalm
Psalm can attempt to find connections between user-controlled input (like $_GET['name']) and places that we don’t want unescaped user-controlled input to end up (like echo "<h1>$name</h1>") by looking at the ways that data flows through your application (via assignments, function/method calls and array/property access).
You can enable this mode with the --taint-analysis command line flag. When taint analysis is enabled, no other analysis is performed. To ensure comprehensive results, Psalm should be run normally prior to taint analysis, and any errors should be fixed.
Tainted input is anything that can be controlled, wholly or in part, by a user of your application. In taint analysis, tainted input is called a taint source.
Taint analysis tracks how data flows from taint sources into taint sinks. Taint sinks are places you really don’t want untrusted data to end up, for example HTML output or a SQL query:
<div id="section_<?= $id ?>">
$pdo->exec("select * from users where name='" . $name . "'")
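To make the source-to-sink flow concrete, here is a minimal sketch. The helper names (readName, renderUnsafe, renderSafe) are hypothetical and exist only for illustration. The comments describe what taint analysis would be expected to report, which may differ between Psalm versions.

```php
<?php

/**
 * Hypothetical helper: reads a user-controlled value.
 * $_GET is treated by Psalm as a taint source.
 */
function readName(): string
{
    $value = $_GET['name'] ?? '';

    return is_string($value) ? $value : '';
}

/**
 * Hypothetical helper: interpolates the value without escaping it.
 */
function renderUnsafe(string $name): string
{
    return "<h1>$name</h1>";
}

/**
 * Hypothetical helper: escapes the value for an HTML context first.
 */
function renderSafe(string $name): string
{
    return '<h1>' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</h1>';
}

// Source -> assignment -> function call -> echo (sink).
// Under --taint-analysis this path would be expected to raise TaintedHtml.
$name = readName();
echo renderUnsafe($name);

// Same source, but escaped before it reaches the sink.
echo renderSafe($name);
```

The two echo statements differ only in whether the value is escaped on its way to the sink, and that difference is what the analysis tracks.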
Psalm recognises a number of taint types by default, defined in the Psalm\Type\TaintKind class:
sql - used for strings that could contain SQL
ldap - used for strings that could contain an LDAP DN or filter
html - used for strings that could contain angle brackets or unquoted strings
has_quotes - used for strings that could contain unquoted strings
shell - used for strings that could contain shell commands
callable - used for callable strings that could be user-controlled
unserialize - used for strings that could contain a serialized string
include - used for strings that could contain a path being included
eval - used for strings that could contain code
ssrf - used for strings that could contain text passed to cURL or similar
file - used for strings that could contain a path
cookie - used for strings that could contain an HTTP cookie
header - used for strings that could contain an HTTP header
user_secret - used for strings that could contain user-supplied secrets
system_secret - used for strings that could contain system secrets
You're also free to define your own taint types when defining custom taint sources – they're just strings.
Psalm currently defines three default taint sources: the $_GET, $_POST and $_COOKIE server variables.
You can also define your own taint sources.
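As a rough sketch of a custom taint source, the class below marks a method's return value as user-controlled with the @psalm-taint-source annotation. The Request class and its getQueryParam method are hypothetical. The input taint kind is assumed here to stand for generic user input, so check the annotation reference for your Psalm version before relying on it.

```php
<?php

/**
 * Hypothetical request wrapper; class and method names are illustrative.
 */
final class Request
{
    /** @var array<string, string> */
    private array $query;

    /** @param array<string, string> $query */
    public function __construct(array $query)
    {
        $this->query = $query;
    }

    /**
     * Values returned here should be treated as user-controlled.
     *
     * @psalm-taint-source input
     */
    public function getQueryParam(string $name): string
    {
        return $this->query[$name] ?? '';
    }
}
```

With an annotation like this, a value obtained from getQueryParam() and passed unescaped to a sink would be expected to be reported in the same way as a value read directly from $_GET.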
Psalm currently defines a number of different sinks for builtin functions and methods, including
You can also define your own taint sinks.
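A custom sink can be sketched in a similar way with the @psalm-taint-sink annotation, which names a taint type and the parameter it applies to. The Page class below is hypothetical. It declares that appendRawHtml() must not receive html-tainted data, and offers an escaping alternative for plain text.

```php
<?php

/**
 * Hypothetical page builder; class and method names are illustrative.
 */
final class Page
{
    private string $body = '';

    /**
     * Appends markup verbatim, so tainted input must not reach $fragment.
     *
     * @psalm-taint-sink html $fragment
     */
    public function appendRawHtml(string $fragment): void
    {
        $this->body .= $fragment;
    }

    /**
     * Escapes plain text for an HTML context before appending it.
     */
    public function appendText(string $text): void
    {
        $this->appendRawHtml(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
    }

    public function body(): string
    {
        return $this->body;
    }
}
```

Callers holding user input would use appendText(). Passing that input straight to appendRawHtml() is the flow the sink annotation is meant to surface.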
Nobody likes to wade through a ton of false-positives – here’s a guide to avoiding them.
Taint Analysis relies on not making any mistakes when escaping values, e.g.
To avoid these issues, use Parameterised Queries for SQL and Commands (e.g. exec); and a context-aware templating engine for HTML. Then use the literal-string type to ensure sensitive strings are defined in your application (i.e. have been written by a developer).
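The following sketch combines both suggestions for the SQL case. The fetchRows function is a hypothetical helper: its $sql parameter is annotated as literal-string, so only developer-written query text is accepted by the type checker, and user-supplied values travel separately as bound parameters.

```php
<?php

/**
 * Hypothetical helper: runs a parameterised query.
 *
 * @param literal-string $sql query text written by a developer
 * @param array<string, string|int> $params values bound to named placeholders
 * @return array<int, array<string, mixed>>
 */
function fetchRows(PDO $pdo, string $sql, array $params): array
{
    $statement = $pdo->prepare($sql);
    if ($statement === false) {
        throw new RuntimeException('Could not prepare statement');
    }

    // Values are bound by the driver and never concatenated into the SQL.
    $statement->execute($params);

    return $statement->fetchAll(PDO::FETCH_ASSOC);
}
```

A call such as fetchRows($pdo, 'select * from users where name = :name', ['name' => $name]) keeps $name out of the query string entirely, so there is no escaping step to get wrong.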
Using Baseline With Taint Analysis
Since taint analysis is performed separately from other static code analysis, it makes sense to use a separate baseline for it.
You can use the --use-baseline=PATH option to set a different baseline for taint analysis.
Viewing Results in a User Interface
Psalm supports the SARIF standard for exchanging static analysis results. This enables you to view the results in any SARIF compatible software, including the taint flow.
GitHub Code Scanning
Alternatively, the generated SARIF file can be manually uploaded as described in the GitHub documentation.
The results will then be available in the "Security" tab of your repository.
Other SARIF compatible software
To generate a SARIF report, run Psalm with the --report flag and a file name with a .sarif extension. For example:
psalm --report=results.sarif
Debugging the taint graph
Psalm can output the taint graph using the DOT language. This is useful when expected taints are not detected. To generate a DOT graph, run Psalm with the --dump-taint-graph flag. For example:
psalm --taint-analysis --dump-taint-graph=taints.dot
dot -Tsvg -o taints.svg taints.dot