QueryType

Tell the engine about types, not string patterns. QueryType is how you go from "match anything called execute" to "match only sqlite3.Cursor.execute".

What Is QueryType

A QueryType describes a type you want the engine to recognize. Instead of writing regex patterns that try to guess what a variable is, you give the engine the fully qualified name (FQN) of the type. The engine resolves imports, follows assignments, and figures out the rest.

Think of it this way. When you write a rule, you are telling the engine: "I care about this specific type from this specific library." The engine takes that information and uses it to understand your codebase. It knows that req on line 14 is a flask.wrappers.Request because it traced the type through function parameters, imports, and assignments.

That is what FQN-based resolution means. You specify the canonical name of a type, and the engine does the work of figuring out where that type appears in real code.

from codepathfinder import QueryType

# Define a type you care about
FlaskRequest = QueryType(
    fqns=["flask.wrappers.Request", "flask.Request"]
)
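To make the "resolves imports" step concrete, here is a minimal sketch of how a local name can be mapped back to a dotted name using Python's standard ast module. It is an illustration only, not the engine's implementation; the import_aliases helper and the sample source are hypothetical.

```python
import ast


def import_aliases(source: str) -> dict:
    """Map each locally bound name to the dotted name it was imported as."""
    aliases = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    # "import a.b as x" binds x to a.b
                    aliases[alias.asname] = alias.name
                else:
                    # "import a.b" binds only the top-level name a
                    top = alias.name.split(".")[0]
                    aliases[top] = top
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for alias in node.names:
                # "from m import n as x" binds x to m.n
                aliases[alias.asname or alias.name] = f"{node.module}.{alias.name}"
    return aliases


code = """
import sqlite3 as db
from flask import Request
from flask.wrappers import Request as WrappedRequest
"""
aliases = import_aliases(code)
print(aliases)
```

The source is only parsed, never executed, so Flask does not need to be installed. A real resolver would also have to follow assignments and function parameters, which this sketch leaves out.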

Why It Matters

Most SAST tools match function names with regex. You write a pattern like execute and hope for the best. The problem is obvious once you think about it: execute() matches everything.

  • cursor.execute(query) - a SQL call you want to find
  • my_dict.execute() - a custom dict method, totally harmless
  • task_runner.execute(job) - your internal task system, not relevant
  • strategy.execute(plan) - a design pattern, nothing to do with SQL

String matching catches all four. QueryType catches only the first one. It knows the difference because it resolves the type of cursor to sqlite3.Cursor before deciding whether to match. That is not a small improvement. It is the difference between a tool that floods you with false positives and one you actually trust.
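The difference can be reproduced in a few lines of plain Python. The sketch below finds calls by method name alone, then filters them with a small, hand-written type table. KNOWN_TYPES is a hypothetical stand-in for what a type-resolving engine would infer; it is not part of any real API.

```python
import ast

SNIPPET = """
cursor.execute(query)
my_dict.execute()
task_runner.execute(job)
strategy.execute(plan)
"""


def calls_named(source, method_name):
    """Return the receiver name of every call to .method_name(), ignoring types."""
    receivers = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == method_name
            and isinstance(node.func.value, ast.Name)
        ):
            receivers.append(node.func.value.id)
    return receivers


# Name-only matching: every receiver with an execute() method is reported
name_hits = calls_named(SNIPPET, "execute")

# Hypothetical type table standing in for real type resolution
KNOWN_TYPES = {"cursor": "sqlite3.Cursor"}
typed_hits = [r for r in name_hits if KNOWN_TYPES.get(r) == "sqlite3.Cursor"]

print(len(name_hits), typed_hits)
```

Name-only matching reports all four calls, while the type-filtered list keeps only the cursor call.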

The core idea:

String matching asks "does this function name look right?" QueryType asks "is this variable actually that type?" One is guessing. The other is knowing.

How to Find FQNs

The FQN is the full dotted path to a class or module as it exists in the library's source code. Finding it is straightforward, but there are a few things to watch out for.

Open the Library Source or Type Stubs

The most reliable way to find an FQN is to look at the source. If you use Flask, go to the Flask package and look at where the class is defined. For example, Flask's Request class lives in flask/wrappers.py. That makes its FQN flask.wrappers.Request.

Type stubs (the .pyi files) work the same way. If a library ships type stubs or has them on typeshed, the paths in those stubs give you the FQN.

Check PyPI Package Structure

Look at the package on PyPI or GitHub. The directory structure tells you the module path. A file at requests/sessions.py containing a class Session means the FQN is requests.sessions.Session.
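If the library is installed locally, you can also ask Python where a class says it is defined. This is a small sketch using only standard attributes (__module__ and __qualname__); the defining_fqn helper is a hypothetical name. Treat the result as a starting point, since some classes report a private implementation module rather than the path users import from.

```python
import http.client
import sqlite3


def defining_fqn(cls):
    """Build a dotted name from the module and qualified name the class reports."""
    return f"{cls.__module__}.{cls.__qualname__}"


conn_fqn = defining_fqn(http.client.HTTPConnection)
cursor_fqn = defining_fqn(sqlite3.Cursor)
print(conn_fqn)
print(cursor_fqn)
```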

Watch Out for Re-exports

Many libraries re-export classes from their top-level __init__.py. Flask's Request is defined in flask.wrappers, but users import it as flask.Request. Both are valid FQNs for the same type. List both to catch all usage patterns.

# Flask re-exports Request from flask.wrappers
# Users might write either:
from flask import Request           # resolves to flask.Request
from flask.wrappers import Request  # resolves to flask.wrappers.Request

# So list both FQNs
FlaskRequest = QueryType(
    fqns=["flask.wrappers.Request", "flask.Request"]
)
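The same re-export pattern exists in the standard library, which makes it easy to observe without installing anything. In this sketch, json.JSONDecoder is used as a stand-in for Flask's Request: it is defined in json.decoder and re-exported from the json package, so one class object is reachable through two import paths.

```python
import json
import json.decoder

# One class object, two import paths
same_object = json.JSONDecoder is json.decoder.JSONDecoder

# The module where the class is actually defined
defined_in = json.JSONDecoder.__module__

# Both dotted names a user might import it by
import_paths = ["json.JSONDecoder", f"{defined_in}.JSONDecoder"]
print(same_object, import_paths)
```

When a class is reachable through more than one path like this, both dotted names are candidates for the fqns list.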

Standard Library Types

For stdlib modules, the FQN is just the module name. No package prefix, no special treatment. os is "os". subprocess is "subprocess". sqlite3 is "sqlite3".

For classes inside stdlib modules, include the class name: sqlite3.Cursor, http.client.HTTPConnection, pathlib.Path.

# stdlib modules
OSModule = QueryType(fqns=["os"])
SubprocessModule = QueryType(fqns=["subprocess"])

# stdlib classes
SQLiteCursor = QueryType(fqns=["sqlite3.Cursor"])
HTTPConn = QueryType(fqns=["http.client.HTTPConnection"])

Fields Reference

QueryType has three fields. Only fqns is required.

fqns (required)

A list of fully qualified names for the type. The engine matches if a variable resolves to any of these names. Always a list, even if you only have one FQN.

# Single FQN
CursorType = QueryType(fqns=["sqlite3.Cursor"])

# Multiple FQNs for the same logical type
FlaskRequest = QueryType(
    fqns=["flask.wrappers.Request", "flask.Request"]
)

patterns

Wildcard patterns that match variable names. This is a fallback for cases where type resolution cannot determine the type, but the variable name is a strong enough signal. Use * as the wildcard character.

# Match variables named like "cursor", "db_cursor", "my_cursor"
DBCursor = QueryType(
    fqns=["sqlite3.Cursor"],
    patterns=["*cursor*", "*Cursor*"]
)

Use patterns sparingly. They exist for when type resolution hits a dead end. If the engine can resolve the type, it will use the FQN. Patterns only kick in when type information is unavailable. Overly broad patterns bring back the false-positive problem that QueryType was designed to solve.
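To get a feel for how quickly a broad pattern over-matches, you can try candidate patterns against variable names from your own codebase. The sketch below uses Python's fnmatch module as an approximation of * wildcards. That the engine's matching behaves the same way (for example, being case-sensitive) is an assumption, and the variable names are made up.

```python
from fnmatch import fnmatchcase


def matches_any(name, patterns):
    """True if the name matches at least one case-sensitive wildcard pattern."""
    return any(fnmatchcase(name, p) for p in patterns)


names = ["cursor", "db_cursor", "MyCursor", "cache", "config"]

narrow = ["*cursor*", "*Cursor*"]
broad = ["*c*", "*C*"]

narrow_hits = [n for n in names if matches_any(n, narrow)]
broad_hits = [n for n in names if matches_any(n, broad)]

print(narrow_hits)
print(broad_hits)
```

The narrow patterns keep the three cursor-like names. The broad ones also pull in cache and config, which is the kind of noise patterns should avoid.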

match_subclasses

Set to True to match not just the exact type, but any class that inherits from it. This is essential for frameworks where users subclass library types.

# Match sqlite3.Cursor and any subclass of it
DBCursor = QueryType(
    fqns=["sqlite3.Cursor"],
    match_subclasses=True
)

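# This catches:
# - sqlite3.Cursor (exact match)
# - MyCustomCursor(sqlite3.Cursor) (subclass)
# It does not catch cursors from other drivers, such as
# psycopg2.extensions.cursor, which does not inherit from sqlite3.Cursor.
# Add those FQNs to the list explicitly.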
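Conceptually, subclass matching means checking a class's whole inheritance chain against the FQN list instead of only the class itself. The sketch below shows that idea at runtime with a class's __mro__. The engine works on source code rather than live objects, so this is an analogy under that assumption; AuditedCursor, Unrelated, and matches_type are hypothetical names.

```python
import sqlite3


class AuditedCursor(sqlite3.Cursor):
    """Hypothetical application subclass of the stdlib cursor."""


class Unrelated:
    """Has an execute() method but no relationship to sqlite3.Cursor."""

    def execute(self, sql):
        return None


def matches_type(cls, fqns, match_subclasses=False):
    """Compare the class (and optionally its base classes) against a set of FQNs."""
    candidates = cls.__mro__ if match_subclasses else (cls,)
    return any(f"{c.__module__}.{c.__qualname__}" in fqns for c in candidates)


FQNS = {"sqlite3.Cursor"}
exact_only = matches_type(AuditedCursor, FQNS)
with_subclasses = matches_type(AuditedCursor, FQNS, match_subclasses=True)
unrelated = matches_type(Unrelated, FQNS, match_subclasses=True)
print(exact_only, with_subclasses, unrelated)
```

Without subclass matching the custom cursor is missed. With it, the custom cursor matches and the unrelated class still does not.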

Real Examples

Here are QueryType definitions you can use directly in your rules. Each one shows a different pattern you will encounter in practice.

Flask Request Object

The Flask request object is what you reach for when detecting user input flowing into dangerous sinks. Users import it from two paths, so list both.

FlaskRequest = QueryType(
    fqns=["flask.wrappers.Request", "flask.Request"]
)

# Now use it in a rule to find tainted input.
# DBCursor is defined in the next example; the rule decorator is assumed to be imported.
@rule(id="flask-sqli", severity="critical", cwe="CWE-89")
def detect_flask_sql_injection():
    """Detect SQL injection from Flask request data"""
    source = FlaskRequest.method("get_json").tracks(0)
    sink = DBCursor.method("execute").tracks(0)
    return source >> sink

Database Cursor with Subclasses

Database cursors are a good case for combining all three fields. You want the FQNs for precise matching, patterns as a safety net, and subclass matching because applications often wrap a driver's cursor in their own subclass.

DBCursor = QueryType(
    fqns=["sqlite3.Cursor", "psycopg2.extensions.cursor",
          "mysql.connector.cursor.MySQLCursor"],
    patterns=["*Cursor*", "*cursor*"],
    match_subclasses=True
)

OS Module

Use this to detect dangerous OS operations such as command injection. The FQN for a stdlib module is just its name.

OSModule = QueryType(fqns=["os"])

@rule(id="os-command-injection", severity="critical", cwe="CWE-78")
def detect_os_injection():
    """Detect OS command injection"""
    return OSModule.method("system", "popen").tracks(0)

Subprocess Module

Similar to the OS module, but subprocess has its own set of dangerous functions.

SubprocessModule = QueryType(fqns=["subprocess"])

@rule(id="subprocess-injection", severity="critical", cwe="CWE-78")
def detect_subprocess_injection():
    """Detect command injection via subprocess"""
    return SubprocessModule.method("run", "call", "Popen").tracks(0)

.method() - Selecting Methods

Once you have a QueryType, you need to say which methods on that type you care about. That is what .method() does. Pass one or more method names, and the engine will match calls to those methods on variables of that type.

# Match cursor.execute() and cursor.executemany()
DBCursor.method("execute", "executemany")

# Match request.get_json() only
FlaskRequest.method("get_json")

# Match os.system() and os.popen()
OSModule.method("system", "popen")

Without .method(), the QueryType just identifies the type. It does not match any calls. You need .method() to go from "I know about this type" to "I want to find calls to these methods on this type."

Think of it in two steps:

  • QueryType = "I care about this type"
  • .method() = "Specifically, these methods on it"

You can chain .method() with .tracks() and .where() to build the full query. The order matters: type, then method, then what you want to do with the arguments.

# Full chain: type -> method -> argument tracking
SubprocessModule.method("run").tracks(0)

# Full chain: type -> method -> argument constraint
SubprocessModule.method("run").where("shell", True)

.tracks(N) - Argument Position

In dataflow analysis, you need to tell the engine which argument carries the tainted data. That is what .tracks(N) does. The number is the zero-indexed position of the argument you want to track.

# Track the first argument (index 0) to cursor.execute()
# In: cursor.execute(query, params)
#                     ^^^^^
#                     This is argument 0
DBCursor.method("execute").tracks(0)

# Track the second argument (index 1)
# In: some_function(safe_value, user_input)
#                                ^^^^^^^^^^
#                                This is argument 1
SomeType.method("dangerous_call").tracks(1)

When you use .tracks(0) as a source, the engine marks the return value of that method call as tainted. When you use it as a sink, the engine checks whether tainted data flows into that argument position.
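Zero-indexed positions are easy to get wrong, so it can help to check which expression sits at a given index before writing the rule. This sketch uses the standard ast module (Python 3.9+ for ast.unparse). The tracked_argument helper is hypothetical and only illustrates positional indexing, not the engine's dataflow analysis.

```python
import ast


def tracked_argument(call_source, position):
    """Return the source text of the positional argument at `position`, or None."""
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call) or position >= len(call.args):
        return None
    return ast.unparse(call.args[position])


first = tracked_argument("cursor.execute(query, params)", 0)
second = tracked_argument("some_function(safe_value, user_input)", 1)
missing = tracked_argument("cursor.execute(query)", 1)
print(first, second, missing)
```

Keyword arguments are stored separately from positional ones in the AST, so a value passed by keyword does not occupy a positional index in this sketch.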

Source and Sink Example

FlaskRequest = QueryType(
    fqns=["flask.wrappers.Request", "flask.Request"]
)
DBCursor = QueryType(
    fqns=["sqlite3.Cursor"],
    match_subclasses=True
)

@rule(id="sqli-via-flask", severity="critical", cwe="CWE-89")
def detect_sql_injection():
    """Detect user input flowing into SQL queries"""
    # Source: track return value of request.get_json()
    source = FlaskRequest.method("get_json").tracks(0)

    # Sink: track first argument to cursor.execute()
    sink = DBCursor.method("execute").tracks(0)

    # Tainted data flows from source to sink
    return source >> sink

What this catches:

@app.route("/search")
def search():
    data = request.get_json()       # source: tainted
    query = data["search_term"]     # still tainted (tracked)
    cursor.execute(
        "SELECT * FROM items WHERE name = '" + query + "'"
    )                                # sink: tainted data reaches here
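To see why tainted data in argument 0 of execute() is worth flagging, the following self-contained sketch runs the same kind of concatenated query against an in-memory SQLite database, then repeats it with a bound parameter. The table, rows, and attacker string are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (name TEXT)")
conn.executemany("INSERT INTO items VALUES (?)", [("apple",), ("pear",)])
cursor = conn.cursor()

search_term = "x' OR '1'='1"  # attacker-controlled input

# Vulnerable: the input is concatenated into the SQL text (argument 0)
unsafe_rows = cursor.execute(
    "SELECT * FROM items WHERE name = '" + search_term + "'"
).fetchall()

# Safer: the SQL text is constant and the input travels as a bound parameter
safe_rows = cursor.execute(
    "SELECT * FROM items WHERE name = ?", (search_term,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

The concatenated query returns every row, while the parameterized one returns none. In the parameterized form the user input sits in argument 1, not argument 0. Under the semantics described here, a sink defined with .tracks(0) would presumably not match it, though that depends on how the engine models the call.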

.where(key, value) - Argument Constraints

Sometimes you do not want to track an argument. You want to check that a specific keyword argument has a specific value. That is what .where() does. It adds a constraint: only match this method call if the given keyword argument equals the given value.

# Only match subprocess.run() when shell=True
SubprocessModule.method("run").where("shell", True)

# Only match when verify=False (insecure SSL)
RequestsSession = QueryType(fqns=["requests.sessions.Session", "requests.Session"])
RequestsSession.method("get", "post").where("verify", False)

This is how you cut through noise. Without .where(), you would flag every call to subprocess.run(). With it, you only flag the ones that actually use shell=True, which is where the real danger is.

Combining .tracks() and .where()

You can use both on the same method call. Track an argument for dataflow and constrain another argument to a specific value.

# Track the command (arg 0) but only when shell=True
SubprocessModule.method("run").tracks(0).where("shell", True)

# This matches:
#   subprocess.run(user_input, shell=True)   -> flagged
#   subprocess.run(user_input, shell=False)  -> not flagged
#   subprocess.run(user_input)               -> not flagged
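The three cases listed above can be checked mechanically. This sketch uses the standard ast module to test whether a call passes a keyword argument with a given literal value. The keyword_equals helper is hypothetical and only handles literal values. How the engine treats computed values such as shell=flag is not covered here.

```python
import ast


def keyword_equals(call_source, key, value):
    """True if the call passes `key` as a literal keyword argument equal to `value`."""
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call):
        return False
    for kw in call.keywords:
        if kw.arg == key and isinstance(kw.value, ast.Constant):
            literal = kw.value.value
            # Compare the type as well, so shell=1 is not treated as shell=True
            return type(literal) is type(value) and literal == value
    return False


results = {
    src: keyword_equals(src, "shell", True)
    for src in (
        "subprocess.run(user_input, shell=True)",
        "subprocess.run(user_input, shell=False)",
        "subprocess.run(user_input)",
    )
}
for src, flagged in results.items():
    print(src, "->", "flagged" if flagged else "not flagged")
```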

Full Rule with .where()

FlaskRequest = QueryType(
    fqns=["flask.wrappers.Request", "flask.Request"]
)
SubprocessModule = QueryType(fqns=["subprocess"])

@rule(id="shell-injection", severity="critical", cwe="CWE-78")
def detect_shell_injection():
    """Detect user input in subprocess calls with shell=True"""
    source = FlaskRequest.method("get_json", "args").tracks(0)
    sink = SubprocessModule.method("run", "call", "Popen") \
        .tracks(0) \
        .where("shell", True)
    return source >> sink

Putting it all together:

  • QueryType tells the engine what type to look for
  • .method() narrows it to specific methods
  • .tracks(N) marks which argument to follow through the dataflow
  • .where(key, value) adds constraints on other arguments

Each step makes the match more precise. Start with the type. Pick the methods. Choose what to track. Add constraints. The result is a rule that finds exactly what you are looking for and nothing else.

Next Steps

Now that you understand QueryType, you have the foundation for writing precise, type-aware rules. Continue with these guides: