FlaskRequest or Sqlite3Module in codepathfinder. Python rules use pattern-based matchers: calls(), variable(), attribute(). You can still subclass QueryType yourself if you want type-inferred matching. The base class is language-agnostic.
Standard imports for a Python rule file:
from codepathfinder.python_decorators import python_rule
from codepathfinder import calls, variable, attribute, flows, And, Or, Not
from codepathfinder.presets import PropagationPresets
Taken from real files in python-sdk/rules/python/. Import only what you need per rule.
Declares a Python-scoped rule. Parameters are identical to @go_rule but the executor scopes analysis to .py files.
@python_rule(
    id="PYTHON-DESER-001",  # required
    name="",
    severity="MEDIUM",  # CRITICAL | HIGH | MEDIUM | LOW | INFO
    category="security",
    cwe="",
    cve="",
    tags="",
    message="",
    owasp="",
)
def detect_pickle_loads():
    return flows(...)
Parameters: id (required), name, severity, category, cwe, cve, tags, message, owasp.
Pattern-based function or method call matcher. Function-name patterns support the * wildcard ("request.*", "*.execute"). Argument-value constraints in match_position / match_name additionally support ?. Chainable with .tracks().
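As a rough mental model of the * wildcard in function-name patterns, the sketch below compiles a pattern into a regular expression in which * stands for any run of characters. The helpers pattern_to_regex and matches_any are hypothetical names used for illustration only, and letting * span dots is an assumption; codepathfinder's actual matcher may behave differently.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Compile a name pattern where '*' is the only wildcard (hypothetical helper)."""
    # Escape each literal piece, then rejoin the pieces with '.*' where '*' stood.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


def matches_any(call_name: str, *patterns: str) -> bool:
    """Return True if call_name fully matches at least one pattern (OR semantics)."""
    return any(pattern_to_regex(p).fullmatch(call_name) for p in patterns)


if __name__ == "__main__":
    for name in ("cursor.execute", "request.args", "pickle.loads"):
        hit = matches_any(name, "*.execute", "request.*")
        print(f"{name}: {hit}")
```

Under this model "*.execute" accepts cursor.execute and db.session.execute alike, which is why such a pattern casts a wide net when used as a sink.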
from codepathfinder import calls
# Exact match
calls("pickle.loads")
calls("eval")
# Multiple patterns (OR semantics)
calls("pickle.loads", "pickle.load", "marshal.loads")
# Wildcards
calls("*.execute") # any method named execute
calls("request.*") # any attribute on request
# Positional argument constraint
calls("subprocess.Popen", match_position={0: "sh"})
# Keyword argument constraint
calls("yaml.load", match_name={"Loader": ["Loader", "UnsafeLoader"]})
# Tuple indexing for nested positional args
calls("socket.connect", match_position={"0[0]": "192.168.*"})
# Taint precision
calls("os.system").tracks(0)
Parameters: *patterns (required), match_position, match_name.
Matches variable references by name pattern. Useful for marking naming-convention-based sources or sinks. This matcher is terminal, with no .tracks() or .where().
from codepathfinder import variable
variable("user_input") # exact name
variable("*_password") # any variable ending in _password
variable("request_*") # any request_* variable
Matches attribute / property access (as opposed to method calls). Common for properties like request.url that aren't function calls. Supports multiple patterns. Also terminal, with no chained modifiers.
from codepathfinder import attribute
attribute("request.url")
attribute("request.GET", "request.POST")
attribute("file.filename")
Do not confuse attribute() with QueryType.attr(). attribute() is a standalone pattern matcher. .attr() is a method on QueryType subclasses that requires type inference.
Attach .tracks() to calls() to pin taint to specific parameters.
calls("cursor.execute").tracks(0) # first positional arg
calls("cursor.execute").tracks(0, "query") # positional or keyword
calls("requests.get").tracks("return") # the return value (for source)
On sinks, taint must reach a tracked parameter to produce a finding. On sources, only the tracked return / argument is marked tainted.
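The sink-side behavior can be pictured with a small model: a finding requires at least one argument that is both tainted and tracked. The function sink_reports and its parameters are hypothetical and only mirror the prose; this is not how codepathfinder represents calls internally.

```python
def sink_reports(tracked, tainted_positions, tainted_keywords):
    """Decide whether a sink call yields a finding (illustrative model only).

    tracked: the values passed to .tracks(), e.g. (0, "query")
    tainted_positions: indexes of positional arguments that carry taint
    tainted_keywords: names of keyword arguments that carry taint
    """
    for target in tracked:
        # Integer targets refer to positional arguments.
        if isinstance(target, int) and target in tainted_positions:
            return True
        # String targets refer to keyword arguments.
        if isinstance(target, str) and target in tainted_keywords:
            return True
    return False


if __name__ == "__main__":
    # cursor.execute(sql, user_params): only position 1 is tainted.
    print(sink_reports((0, "query"), {1}, set()))
    # cursor.execute(query=user_sql): the tracked keyword is tainted.
    print(sink_reports((0, "query"), set(), {"query"}))
```

In this model, tracking only position 0 of cursor.execute keeps parameterized queries with tainted bind values from being reported, which is the precision gain .tracks() is meant to provide.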
Real rule from python-sdk/rules/python/deserialization/pickle_loads.py. Detects user-controlled data flowing into pickle.loads (RCE).
from codepathfinder.python_decorators import python_rule
from codepathfinder import calls, flows
from codepathfinder.presets import PropagationPresets
@python_rule(
    id="PYTHON-DESER-001",
    name="Unsafe Pickle Deserialization",
    severity="CRITICAL",
    category="deserialization",
    cwe="CWE-502",
    cve="CVE-2021-3177",
    tags="python,deserialization,pickle,rce,untrusted-data,owasp-a08,cwe-502",
    message=(
        "Unsafe pickle deserialization: Untrusted data flows to pickle.loads(). "
        "Use JSON for untrusted input, or sign+verify pickled payloads."
    ),
    owasp="A08:2021",
)
def detect_pickle_deserialization():
    return flows(
        from_sources=[
            calls("request.data"),
            calls("request.GET"),
            calls("input"),
        ],
        to_sinks=[
            calls("pickle.loads"),
            calls("pickle.load"),
        ],
        sanitized_by=[calls("*.verify")],
        propagates_through=PropagationPresets.standard(),
        scope="local",
    )
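The rule's message suggests signing and verifying pickled payloads, and its sanitizer pattern is "*.verify". A minimal sketch of application code that fits that shape is shown below, using only the standard library. PayloadSigner and load_signed are hypothetical names, not part of codepathfinder or of the rule file, and whether the rule treats this exact call as sanitized is an assumption based on the name pattern.

```python
import hashlib
import hmac
import pickle


class PayloadSigner:
    """Hypothetical helper that prefixes a payload with an HMAC-SHA256 tag."""

    TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes

    def __init__(self, key: bytes) -> None:
        self._key = key

    def sign(self, payload: bytes) -> bytes:
        tag = hmac.new(self._key, payload, hashlib.sha256).digest()
        return tag + payload

    def verify(self, blob: bytes) -> bytes:
        """Return the payload if the tag is valid, otherwise raise ValueError."""
        tag, payload = blob[: self.TAG_SIZE], blob[self.TAG_SIZE:]
        expected = hmac.new(self._key, payload, hashlib.sha256).digest()
        # compare_digest avoids leaking the mismatch position through timing.
        if not hmac.compare_digest(tag, expected):
            raise ValueError("signature mismatch")
        return payload


def load_signed(signer: PayloadSigner, blob: bytes):
    # signer.verify(...) is the call a "*.verify" pattern would match by name.
    return pickle.loads(signer.verify(blob))


if __name__ == "__main__":
    signer = PayloadSigner(b"demo-key-not-for-production")
    blob = signer.sign(pickle.dumps({"user": "alice"}))
    print(load_signed(signer, blob))
```

Because the sanitizer is matched by name, any method called verify would count as sanitizing in the rule above; a narrower pattern may be worth considering if a codebase has unrelated verify methods.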