Python API Reference

Write taint rules for Python applications (Flask, Django, FastAPI, scripts, anything). Python rules use pattern-based matchers (calls, variable, attribute) rather than pre-defined QueryType classes.

Heads up: Python has no pre-defined QueryType classes

Unlike for Go, the SDK does not ship pre-defined QueryType classes for Python frameworks: there is no FlaskRequest or Sqlite3Module in codepathfinder. Python rules instead use the pattern-based matchers calls(), variable(), and attribute(). You can still subclass QueryType yourself if you want type-inferred matching, because the base class is language-agnostic.

Canonical imports

Standard imports for a Python rule file:

python
from codepathfinder.python_decorators import python_rule
from codepathfinder import calls, variable, attribute, flows, And, Or, Not
from codepathfinder.presets import PropagationPresets

These imports are taken from real rule files in python-sdk/rules/python/. Import only what each rule needs.

@python_rule decorator

Declares a Python-scoped rule. The parameters are identical to those of @go_rule, but the executor scopes analysis to .py files.

python
@python_rule(
    id="PYTHON-DESER-001",            # required
    name="",
    severity="MEDIUM",                # CRITICAL | HIGH | MEDIUM | LOW | INFO
    category="security",
    cwe="",
    cve="",
    tags="",
    message="",
    owasp="",
)
def detect_pickle_loads():
    return flows(...)

Parameters

id: str, required. Unique rule identifier.
name: str = "". Human-readable name. Auto-generated from the function name when empty.
severity: str = "MEDIUM". One of "CRITICAL", "HIGH", "MEDIUM", "LOW", "INFO".
category: str = "security". Free-form grouping (e.g., flask, django, deserialization).
cwe: str = "". CWE identifier, e.g., "CWE-502".
cve: str = "". Associated CVE ID if applicable.
tags: str = "". Comma-separated taxonomy tags.
message: str = "". Finding message. Defaults to a generic message with the rule ID.
owasp: str = "". OWASP Top 10 mapping, e.g., "A08:2021".

calls()

Pattern-based function or method call matcher. Function-name patterns support the * wildcard ("request.*", "*.execute"). Argument-value constraints in match_position / match_name additionally support ?. Chainable with .tracks().

python
from codepathfinder import calls

# Exact match
calls("pickle.loads")
calls("eval")

# Multiple patterns (OR semantics)
calls("pickle.loads", "pickle.load", "marshal.loads")

# Wildcards
calls("*.execute")                                 # any method named execute
calls("request.*")                                 # any method called on request

# Positional argument constraint
calls("subprocess.Popen", match_position={0: "sh"})

# Keyword argument constraint
calls("yaml.load", match_name={"Loader": ["Loader", "UnsafeLoader"]})

# Tuple indexing for nested positional args
calls("socket.connect", match_position={"0[0]": "192.168.*"})

# Taint precision
calls("os.system").tracks(0)

Signature

*patterns: str, required. One or more function / method name patterns to match. The * wildcard is allowed.
match_position: dict[int | str, value | list[value]] | None. Filter by positional argument value. Keys are 0-based indices or tuple-index strings like '0[0]'. Values can be literals or lists of acceptable values.
match_name: dict[str, value | list[value]] | None. Filter by keyword argument. Keys are argument names; values are literals or lists.
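
To make the wildcard behavior concrete, here is a minimal sketch that uses Python's standard fnmatch module as a stand-in for the SDK's matcher. It assumes glob-style semantics (* matches any run of characters, ? matches exactly one character); the SDK's real matcher may differ in details such as case handling. The helpers matches_any and arg_matches are hypothetical and are not part of codepathfinder.

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """Return True if a dotted call name matches at least one pattern (OR semantics)."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def arg_matches(value, allowed):
    """Return True if an argument value matches a literal or any entry of a list."""
    candidates = allowed if isinstance(allowed, list) else [allowed]
    return any(fnmatchcase(str(value), str(candidate)) for candidate in candidates)


# Function-name patterns: only the * wildcard is used here.
print(matches_any("cursor.execute", ["*.execute"]))   # a method named execute
print(matches_any("execute", ["*.execute"]))          # a bare call has no "." to match
print(matches_any("pickle.load", ["pickle.loads", "pickle.load", "marshal.loads"]))

# Argument-value constraints: literals, lists of literals, and wildcards.
print(arg_matches("192.168.1.10", "192.168.*"))
print(arg_matches("UnsafeLoader", ["Loader", "UnsafeLoader"]))
print(arg_matches("SafeLoader", ["Loader", "UnsafeLoader"]))
```

Under these assumptions, "*.execute" requires a receiver before the dot, so a bare execute(...) call does not match, and a list of values behaves as an OR over its entries.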

variable()

Matches variable references by name pattern. Useful for marking sources or sinks by naming convention. This matcher is terminal: it has no .tracks() or .where() modifiers.

python
from codepathfinder import variable

variable("user_input")        # exact name
variable("*_password")        # any variable ending in _password
variable("request_*")         # any request_* variable

attribute()

Matches attribute / property access (as opposed to method calls). Common for properties like request.url that aren't function calls. Supports multiple patterns. Like variable(), it is terminal and has no chained modifiers.

python
from codepathfinder import attribute

attribute("request.url")
attribute("request.GET", "request.POST")
attribute("file.filename")

Don't confuse attribute() with QueryType.attr(). attribute() is a standalone pattern matcher. .attr() is a method on QueryType subclasses that requires type inference.
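
The difference between an attribute read and a method call can be seen in Python's own syntax tree. The sketch below uses the standard ast module to separate the two in a small sample; it only illustrates the distinction and is not how codepathfinder is implemented. The sample source and the helpers dotted_name and classify are hypothetical.

```python
import ast

SAMPLE = """
url = request.url
body = request.get_json()
name = upload.filename
"""


def dotted_name(node):
    """Rebuild 'a.b.c' from nested ast.Attribute / ast.Name nodes, or return None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def classify(source):
    """Split dotted references into method calls and plain attribute reads."""
    tree = ast.parse(source)
    # An Attribute node that is the callee of a Call is a method call, not a read.
    callee_ids = {id(n.func) for n in ast.walk(tree) if isinstance(n, ast.Call)}
    calls_found, attrs_found = [], []
    for node in ast.walk(tree):
        if not isinstance(node, ast.Attribute):
            continue
        name = dotted_name(node)
        if name is None:
            continue
        if id(node) in callee_ids:
            calls_found.append(name)
        else:
            attrs_found.append(name)
    return sorted(calls_found), sorted(attrs_found)


method_calls, attribute_reads = classify(SAMPLE)
print("method calls:   ", method_calls)
print("attribute reads:", attribute_reads)
```

In this sample, request.get_json() is the kind of reference calls() is described as matching, while request.url and upload.filename are the kind attribute() is described as matching.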

.tracks() on calls()

Attach .tracks() to calls() to pin taint to specific parameters.

python
calls("cursor.execute").tracks(0)                  # first positional arg
calls("cursor.execute").tracks(0, "query")         # positional or keyword
calls("requests.get").tracks("return")             # the return value (for source)

On sinks, taint must reach a tracked parameter to produce a finding. On sources, only the tracked return value or argument is marked tainted.
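
As an illustration of why pinning taint to a parameter matters, the sketch below shows two pieces of target code that both pass user input to cursor.execute. With a sink declared as calls("cursor.execute").tracks(0), the description above suggests that only the first would be reported, because only there does the input reach argument 0; this expectation has not been verified against the SDK. The table, data, and function names are hypothetical.

```python
import sqlite3


def find_user_unsafe(cursor, username):
    # User input is concatenated into the SQL text, so it reaches argument 0.
    query = "SELECT id FROM users WHERE name = '" + username + "' ORDER BY id"
    cursor.execute(query)
    return cursor.fetchall()


def find_user_safe(cursor, username):
    # The SQL text (argument 0) is constant; user input only reaches argument 1.
    cursor.execute("SELECT id FROM users WHERE name = ? ORDER BY id", (username,))
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()
cursor.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cursor.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(cursor, payload)  # injection changes the query
safe_rows = find_user_safe(cursor, payload)      # payload is treated as plain data
print(unsafe_rows)
print(safe_rows)
```

Tracking only argument 0 keeps the parameterized form from being reported, which is the precision gain the section describes.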

Complete example: Pickle deserialization

Real rule from python-sdk/rules/python/deserialization/pickle_loads.py. It detects user-controlled data flowing into pickle.loads, which can lead to remote code execution (RCE).

python
from codepathfinder.python_decorators import python_rule
from codepathfinder import calls, flows
from codepathfinder.presets import PropagationPresets


@python_rule(
    id="PYTHON-DESER-001",
    name="Unsafe Pickle Deserialization",
    severity="CRITICAL",
    category="deserialization",
    cwe="CWE-502",
    cve="CVE-2021-3177",
    tags="python,deserialization,pickle,rce,untrusted-data,owasp-a08,cwe-502",
    message=(
        "Unsafe pickle deserialization: Untrusted data flows to pickle.loads(). "
        "Use JSON for untrusted input, or sign+verify pickled payloads."
    ),
    owasp="A08:2021",
)
def detect_pickle_deserialization():
    return flows(
        from_sources=[
            calls("request.data"),
            calls("request.GET"),
            calls("input"),
        ],
        to_sinks=[
            calls("pickle.loads"),
            calls("pickle.load"),
        ],
        sanitized_by=[calls("*.verify")],
        propagates_through=PropagationPresets.standard(),
        scope="local",
    )
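
For context, here is a sketch of the kind of application code this rule is concerned with, together with the sign-and-verify remediation that the rule message recommends. It uses only the standard library. The PayloadSigner class, the demo key, and the function names are hypothetical; whether the SDK treats signer.verify(...) as matching the "*.verify" sanitizer pattern is an assumption based on the wildcard description, not something confirmed here.

```python
import hashlib
import hmac
import pickle


class PayloadSigner:
    """Prefixes payloads with an HMAC-SHA256 tag and checks it on the way back."""

    TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes

    def __init__(self, key: bytes):
        self._key = key

    def sign(self, payload: bytes) -> bytes:
        tag = hmac.new(self._key, payload, hashlib.sha256).digest()
        return tag + payload

    def verify(self, blob: bytes) -> bytes:
        """Return the payload if the tag is valid, otherwise raise ValueError."""
        tag, payload = blob[: self.TAG_SIZE], blob[self.TAG_SIZE :]
        expected = hmac.new(self._key, payload, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            raise ValueError("signature mismatch")
        return payload


signer = PayloadSigner(b"demo-key-not-for-production")  # hypothetical key


def load_unsafe(blob: bytes):
    # Untrusted bytes go straight to pickle.loads: the flow the rule targets.
    return pickle.loads(blob)


def load_verified(blob: bytes):
    # The bytes pass through signer.verify() before reaching pickle.loads.
    return pickle.loads(signer.verify(blob))


signed = signer.sign(pickle.dumps({"user": "alice"}))
print(load_verified(signed))
```

A payload whose bytes were altered after signing fails the tag check in verify() and never reaches pickle.loads.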