Advanced Static Analysis with Psalm

Static analysis is an essential practice in modern software development for ensuring code quality, identifying potential bugs, and enforcing coding standards without executing the code. For PHP developers, Psalm has emerged as a powerful and flexible static analysis tool that goes beyond basic type checking. This post dives into some of Psalm's advanced features, demonstrating how you can leverage them to catch subtle bugs, improve security, and maintain a high level of code quality in your projects.

What is Psalm and Why Use It?

Psalm is a free and open-source static analysis tool for PHP, originally developed at Vimeo. It helps developers find errors in their code, from simple typos to more complex logical issues. By integrating Psalm into your development workflow, you can:

  • Catch errors before they reach production: Psalm can identify a wide range of issues, including method calls on possibly-null values, type mismatches, and undefined variables.
  • Improve code quality and maintainability: By enforcing consistent coding standards and highlighting potential issues, Psalm helps you write cleaner, more robust code.
  • Enhance security: Psalm's taint analysis feature can help you identify and mitigate security vulnerabilities like SQL injection and cross-site scripting (XSS).

Getting Started with Psalm

If you're new to Psalm, setting it up is straightforward. You can add it to your project using Composer:

composer require --dev vimeo/psalm

After installing, you need to create a configuration file. You can do this automatically by running:

./vendor/bin/psalm --init

This will generate a psalm.xml file in your project's root directory. You can then run Psalm with the following command:

./vendor/bin/psalm

Advanced Psalm Features

Now, let's explore some of Psalm's more advanced capabilities.

Taint Analysis for Security

Taint analysis is a powerful feature that helps you track the flow of user-provided data through your application to prevent security vulnerabilities. Psalm can identify when "tainted" data (e.g., from $_GET or $_POST) is used in a "sensitive" sink (e.g., a SQL query or echo).
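The source-and-sink idea also extends to your own code. As a rough sketch, assuming Psalm's @psalm-taint-source and @psalm-taint-sink docblock annotations, a project could mark a custom input reader and a custom query runner as shown here. The names readHeader and QueryRunner are hypothetical and exist only for illustration.

```php
<?php

/**
 * Hypothetical helper that reads a request header from a $_SERVER-style array.
 * The annotation asks Psalm to treat the return value as user input.
 *
 * @psalm-taint-source input
 * @param array<string, mixed> $server
 */
function readHeader(array $server, string $name): string
{
    $value = $server[$name] ?? '';

    // Only accept scalar string values; anything else becomes an empty string.
    return is_string($value) ? $value : '';
}

/**
 * Hypothetical query runner, used here only to demonstrate a custom sink.
 */
final class QueryRunner
{
    /** @var list<string> */
    public array $executed = [];

    /**
     * The annotation marks $sql as a place where tainted SQL must not arrive.
     *
     * @psalm-taint-sink sql $sql
     */
    public function run(string $sql): void
    {
        // A real implementation would send $sql to the database here.
        $this->executed[] = $sql;
    }
}

$runner = new QueryRunner();

// Tainted header data is concatenated into SQL, so a taint-analysis run
// would be expected to flag this call.
$runner->run("SELECT * FROM users WHERE token = '" . readHeader($_SERVER, 'HTTP_X_TOKEN') . "'");
```

The code runs as ordinary PHP; the annotations only matter to Psalm, which uses them to connect the custom source to the custom sink.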

Taint analysis does not need its own section in psalm.xml. Psalm already treats superglobals such as $_GET and $_POST as taint sources and constructs such as echo as sinks, so a standard configuration is enough:

<psalm
    errorLevel="1"
    resolveFromConfigFile="true"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns="https://getpsalm.org/schema/config"
    xsi:schemaLocation="https://getpsalm.org/schema/config vendor/vimeo/psalm/config.xsd"
    findUnusedCode="true"
>
    <projectFiles>
        <directory name="src" />
    </projectFiles>
</psalm>

You then switch taint analysis on from the command line:

./vendor/bin/psalm --taint-analysis

Here’s an example of how Psalm can catch a potential XSS vulnerability:

<?php

// $_GET is a built-in taint source, so no annotation is needed here.
$name = $_GET['name'];

echo "Hello, " . $name;

When you run Psalm with the --taint-analysis flag, it reports a TaintedHtml issue for this code, because tainted data reaches echo without being escaped.
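One way to resolve the report is to escape the value before it is output. The following is a minimal sketch, assuming the value is rendered in an HTML context; renderGreeting is a hypothetical helper, not part of Psalm or of the original example.

```php
<?php

/**
 * Hypothetical helper: builds the greeting with the name escaped for HTML.
 */
function renderGreeting(string $name): string
{
    // ENT_QUOTES escapes single and double quotes in addition to <, > and &.
    return 'Hello, ' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
}

// The request value is still untrusted, but it is escaped before reaching echo.
$raw = $_GET['name'] ?? 'guest';
$name = is_string($raw) ? $raw : 'guest';

echo renderGreeting($name);
```

Because htmlspecialchars() is a recognised HTML escape, a taint-analysis run should no longer report TaintedHtml for this path. Escaping is context-specific, so the same call would not make the value safe for a SQL query.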

Custom Plugins and Hooks

Psalm's functionality can be extended with custom plugins. This is useful when you have project-specific conventions or want to enforce rules that are not covered by Psalm's core features.

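Here is a simple example of a custom plugin that hooks into method-call analysis, which is a natural starting point for checking a naming convention on called methods. To load it, register the entry point class in psalm.xml with a <plugins><pluginClass class="MyPsalmPlugins\MyPlugin" /></plugins> element: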

<?php

namespace MyPsalmPlugins;

use Psalm\Plugin\PluginEntryPointInterface;
use Psalm\Plugin\RegistrationInterface;

class MyPlugin implements PluginEntryPointInterface
{
    public function __invoke(RegistrationInterface $registration, ?\SimpleXMLElement $config = null): void
    {
        // Register a hook to analyze method calls
        $registration->registerHooksFromClass(MyMethodAnalyzer::class);
    }
}

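// Hook handlers implement the event interface; an interface cannot be extended by a class.
class MyMethodAnalyzer implements \Psalm\Plugin\EventHandler\AfterMethodCallAnalysisInterface
{
    public static function afterMethodCallAnalysis(
        \Psalm\Plugin\EventHandler\Event\AfterMethodCallAnalysisEvent $event
    ): void {
        $expr = $event->getExpr();

        // Dynamic calls such as $obj->$method() have an expression as their name, so skip them.
        if (!$expr->name instanceof \PhpParser\Node\Identifier) {
            return;
        }

        if ($expr->name->name === 'mySpecialFunction') {
            // Your custom logic here
        }
    }
}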

Generics and Advanced Types

While PHP has made significant strides in its type system, it still lacks generics. Psalm, however, allows you to use generics in your docblocks, which can greatly improve the accuracy of static analysis, especially for collections and data structures.

Here’s an example of using generics with an array of objects:

<?php

/**
 * @param array<int, User> $users
 * @return array<int, string>
 */
function getUserNames(array $users): array
{
    return array_map(fn(User $user) => $user->name, $users);
}

In this example, Psalm understands that the $users array contains User objects and that the function returns an array of strings.
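Generics are not limited to array shapes. As a sketch of the same idea applied to a class, assuming Psalm's @template annotation, a small collection can carry its element type through its methods. TypedCollection is a hypothetical class written for this illustration.

```php
<?php

/**
 * Hypothetical immutable collection whose element type is tracked by Psalm.
 *
 * @template T
 */
final class TypedCollection
{
    /** @var list<T> */
    private array $items;

    /**
     * @param list<T> $items
     */
    public function __construct(array $items)
    {
        $this->items = $items;
    }

    /**
     * @return T|null
     */
    public function first()
    {
        return $this->items[0] ?? null;
    }

    /**
     * @template U
     * @param callable(T): U $fn
     * @return TypedCollection<U>
     */
    public function map(callable $fn): self
    {
        // array_map() on a list keeps sequential integer keys.
        return new self(array_map($fn, $this->items));
    }
}

// Psalm can infer TypedCollection<string> from the constructor argument.
$names = new TypedCollection(['Ada', 'Grace']);

// The callback's return type determines the element type of the result.
$lengths = $names->map(fn (string $name): int => strlen($name));
```

At runtime this is plain PHP with no type parameters. The benefit is at analysis time, where passing $lengths->first() to a function that expects a string should be reported as a type error.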

Baseline File for Legacy Codebases

Introducing static analysis to a large, legacy codebase can be overwhelming due to the sheer number of reported issues. Psalm provides a solution for this with its baseline file feature. A baseline file allows you to ignore existing errors and focus on new or modified code.

To generate a baseline file, run Psalm with the --set-baseline option:

./vendor/bin/psalm --set-baseline=psalm-baseline.xml

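This will create a psalm-baseline.xml file containing all the current errors, which you can commit to your repository. Psalm applies the baseline when psalm.xml references it through the errorBaseline="psalm-baseline.xml" attribute on the <psalm> element, or when you pass --use-baseline=psalm-baseline.xml. It then reports only errors that are not in the baseline. As you fix old issues, running ./vendor/bin/psalm --update-baseline removes them from the file.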

Psalm in CI/CD

To get the most out of Psalm, you should integrate it into your Continuous Integration/Continuous Deployment (CI/CD) pipeline. This ensures that every code change is automatically checked for potential issues.

Here’s an example of how to run Psalm in a GitHub Actions workflow:

name: Psalm CI

on: [push]

jobs:
  psalm:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup PHP
        uses: shivammathur/setup-php@v2
        with:
          php-version: '8.1'

      - name: Install dependencies
        run: composer install

      - name: Run Psalm
        run: ./vendor/bin/psalm

Conclusion

Psalm is an incredibly powerful tool that can significantly improve the quality and security of your PHP projects. By moving beyond the basics and leveraging its advanced features like taint analysis, custom plugins, and generics, you can catch a wider range of errors and build more robust applications. Whether you're working on a new project or a legacy codebase, integrating Psalm into your workflow is a worthwhile investment.

Author

Efe Omoregie

Software engineer with a passion for computer science, programming and cloud computing