Advanced Static Analysis with Psalm
Static analysis is an essential practice in modern software development for ensuring code quality, identifying potential bugs, and enforcing coding standards without executing the code. For PHP developers, Psalm has emerged as a powerful and flexible static analysis tool that goes beyond basic type checking. This post dives into some of Psalm's advanced features, demonstrating how you can leverage them to catch subtle bugs, improve security, and maintain a high level of code quality in your projects.
What is Psalm and Why Use It?
Psalm is a free and open-source static analysis tool for PHP, originally developed at Vimeo. It helps developers find errors in their code, from simple typos to more complex logical issues. By integrating Psalm into your development workflow, you can:
- Catch errors before they reach production: Psalm can identify a wide range of issues, including method calls and property fetches on possibly null values, type mismatches, and undefined variables.
- Improve code quality and maintainability: By enforcing consistent coding standards and highlighting potential issues, Psalm helps you write cleaner, more robust code.
- Enhance security: Psalm's taint analysis feature can help you identify and mitigate security vulnerabilities like SQL injection and cross-site scripting (XSS).
Getting Started with Psalm
If you're new to Psalm, setting it up is straightforward. You can add it to your project using Composer:
composer require --dev vimeo/psalm
After installing, you need to create a configuration file. You can do this automatically by running:
./vendor/bin/psalm --init
This will generate a psalm.xml file in your project's root directory. You can then run Psalm with the following command:
./vendor/bin/psalm
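To make a first run more concrete, here is a minimal sketch of the kind of code Psalm checks. The User class and the findUser and displayName helpers are hypothetical names invented for this illustration, and the sketch assumes PHP 8.0 or later for constructor property promotion. If the null check in displayName were removed, Psalm would be expected to report a PossiblyNullPropertyFetch issue on the property access.

```php
<?php
declare(strict_types=1);

// Hypothetical value object used only for this illustration.
final class User
{
    public function __construct(public string $name)
    {
    }
}

/**
 * Looks up a user by ID and returns null when the ID is unknown.
 *
 * @param array<int, User> $users
 */
function findUser(array $users, int $id): ?User
{
    return $users[$id] ?? null;
}

/**
 * @param array<int, User> $users
 */
function displayName(array $users, int $id): string
{
    $user = findUser($users, $id);

    // Guard against the null case before touching $user->name.
    if ($user === null) {
        return 'unknown';
    }

    return $user->name;
}
```

The nullable return type on findUser is what lets Psalm reason about the missing-user case, so declaring it honestly is worth more than returning a placeholder object.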
Advanced Psalm Features
Now, let's explore some of Psalm's more advanced capabilities.
Taint Analysis for Security
Taint analysis tracks the flow of user-provided data through your application to prevent security vulnerabilities. Psalm can identify when "tainted" data (e.g., from $_GET or $_POST) reaches a "sensitive" sink (e.g., a SQL query or echo).
Taint analysis is off by default. To turn it on, run Psalm with the --taint-analysis flag. Built-in sources (such as $_GET and $_POST) and sinks (such as echo) are already known to Psalm, so psalm.xml only needs to list the files to analyze:
<psalm
    errorLevel="1"
    resolveFromConfigFile="true"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="https://getpsalm.org/schema/config"
    xsi:schemaLocation="https://getpsalm.org/schema/config vendor/vimeo/psalm/config.xsd"
    findUnusedCode="true"
>
    <projectFiles>
        <directory name="src" />
    </projectFiles>
</psalm>
Then run the analysis with:
./vendor/bin/psalm --taint-analysis
Here’s an example of how Psalm can catch a potential XSS vulnerability:
<?php
$name = $_GET['name'];
echo "Hello, " . $name;
When you run Psalm with the --taint-analysis flag, it will report a TaintedHtml issue, indicating that you're echoing tainted data without proper sanitization.
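One way to resolve this kind of finding is to escape the value before it reaches the sink. The sketch below is illustrative rather than definitive: renderRaw, greet, and greetFromQuery are hypothetical helpers, and it assumes that Psalm's bundled stubs treat htmlspecialchars as removing the html taint. It also shows the @psalm-taint-sink docblock annotation, which marks a parameter of your own function as a sink.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical template helper. The annotation below asks Psalm to
 * treat $html as an HTML sink, just like the argument of echo.
 *
 * @psalm-taint-sink html $html
 */
function renderRaw(string $html): string
{
    return '<p>' . $html . '</p>';
}

/**
 * Escapes the name before it reaches the sink.
 */
function greet(string $name): string
{
    return renderRaw('Hello, ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8'));
}

/**
 * Reads the tainted value from the query string and passes it on.
 */
function greetFromQuery(): string
{
    $name = $_GET['name'] ?? '';

    // Query parameters can also be arrays, so accept strings only.
    return greet(is_string($name) ? $name : '');
}
```

Passing $name to renderRaw without the htmlspecialchars call would be expected to bring the TaintedHtml report back, this time pointing at the custom sink.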
Custom Plugins and Hooks
Psalm's functionality can be extended with custom plugins. This is useful when you have project-specific conventions or want to enforce rules that are not covered by Psalm's core features.
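Here is a skeleton of a custom plugin that hooks into method-call analysis and matches calls to a specific method name. A naming-convention check or any other project rule would go where the placeholder comment is:
<?php
namespace MyPsalmPlugins;

use PhpParser\Node\Identifier;
use Psalm\Plugin\EventHandler\AfterMethodCallAnalysisInterface;
use Psalm\Plugin\EventHandler\Event\AfterMethodCallAnalysisEvent;
use Psalm\Plugin\PluginEntryPointInterface;
use Psalm\Plugin\RegistrationInterface;

class MyPlugin implements PluginEntryPointInterface
{
    public function __invoke(RegistrationInterface $registration, ?\SimpleXMLElement $config = null): void
    {
        // Register a hook that runs after each method call is analyzed
        $registration->registerHooksFromClass(MyMethodAnalyzer::class);
    }
}

// Event handlers are interfaces, so the class implements (not extends) one
class MyMethodAnalyzer implements AfterMethodCallAnalysisInterface
{
    public static function afterMethodCallAnalysis(AfterMethodCallAnalysisEvent $event): void
    {
        $expr = $event->getExpr();
        // The name is an Identifier only for literal calls like $obj->foo();
        // dynamic calls such as $obj->$method() carry an expression instead
        if ($expr->name instanceof Identifier && $expr->name->name === 'mySpecialFunction') {
            // Your custom logic here
        }
    }
}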
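For Psalm to load a plugin like the one above, it also has to be registered in psalm.xml. A minimal sketch, assuming the MyPsalmPlugins\MyPlugin class can be autoloaded through Composer; the fragment goes inside the root psalm element:

```xml
<plugins>
    <pluginClass class="MyPsalmPlugins\MyPlugin" />
</plugins>
```

If the plugin lives in a single file outside the autoloader, the plugins element also accepts a plugin entry with a filename attribute instead.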
Generics and Advanced Types
While PHP has made significant strides in its type system, it still lacks generics. Psalm, however, allows you to use generics in your docblocks, which can greatly improve the accuracy of static analysis, especially for collections and data structures.
Here’s an example of using generics with an array of objects:
<?php
/**
 * @param array<int, User> $users
 * @return array<int, string>
 */
function getUserNames(array $users): array
{
    return array_map(fn(User $user) => $user->name, $users);
}
In this example, Psalm understands that the $users array contains User objects and that the function returns an array of strings. The User class, with a public string $name property, is assumed to be defined elsewhere.
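Generic arrays are only one use of this feature. Psalm's @template annotation lets you declare type parameters on your own classes as well. The following is a minimal sketch, not a recommended collection library: TypedCollection and its methods are hypothetical names invented for this illustration, and the code assumes PHP 7.4 or later.

```php
<?php
declare(strict_types=1);

/**
 * A tiny immutable list wrapper with one type parameter.
 *
 * @template T
 */
final class TypedCollection
{
    /** @var list<T> */
    private array $items;

    /**
     * @param list<T> $items
     */
    public function __construct(array $items)
    {
        $this->items = $items;
    }

    /**
     * Applies $fn to every item and wraps the results in a new collection.
     *
     * @template U
     * @param callable(T): U $fn
     * @return TypedCollection<U>
     */
    public function map(callable $fn): self
    {
        return new self(array_map($fn, $this->items));
    }

    /**
     * @return list<T>
     */
    public function all(): array
    {
        return $this->items;
    }
}

// Psalm should infer TypedCollection<int> here and TypedCollection<string> after map().
$numbers = new TypedCollection([1, 2, 3]);
$labels = $numbers->map(fn(int $n): string => "n{$n}");
```

With these annotations, passing a callback that expects a string to $numbers->map() would be expected to produce a type error at analysis time, even though PHP itself only sees a plain callable.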
Baseline File for Legacy Codebases
Introducing static analysis to a large, legacy codebase can be overwhelming due to the sheer number of reported issues. Psalm provides a solution for this with its baseline file feature. A baseline file allows you to ignore existing errors and focus on new or modified code.
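To generate a baseline file, run Psalm with the --set-baseline option:
./vendor/bin/psalm --set-baseline=psalm-baseline.xml
This will create a psalm-baseline.xml file containing all the current errors. Commit this file to your repository so that other environments, such as CI, can use it. Psalm picks the baseline up through the errorBaseline attribute in psalm.xml or the --use-baseline option, and from then on reports only issues that are not in the baseline. As you fix old issues, running Psalm with --update-baseline removes them from the file.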
Psalm in CI/CD
To get the most out of Psalm, you should integrate it into your Continuous Integration/Continuous Deployment (CI/CD) pipeline. This ensures that every code change is automatically checked for potential issues.
Here’s an example of how to run Psalm in a GitHub Actions workflow:
name: Psalm CI
on: [push]
jobs:
  psalm:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: '8.1'
      - name: Install dependencies
        run: composer install
      - name: Run Psalm
        run: ./vendor/bin/psalm
Conclusion
Psalm is an incredibly powerful tool that can significantly improve the quality and security of your PHP projects. By moving beyond the basics and leveraging its advanced features like taint analysis, custom plugins, and generics, you can catch a wider range of errors and build more robust applications. Whether you're working on a new project or a legacy codebase, integrating Psalm into your workflow is a worthwhile investment.
Resources
- Official Psalm Documentation
- Psalm GitHub Repository
- Awesome Psalm: A curated list of Psalm resources.