lxml is a C-backed XML/HTML parser built on libxml2. Calling etree.parse or etree.fromstring with a custom XMLParser(resolve_entities=True) is an XXE (XML external entity) sink. Recent lxml releases ship safer defaults, but the API still allows unsafe configurations.
.parse() (Sink)
lxml.etree.parse(source, parser=None, base_url=None) -> ElementTree
Parses XML. XXE sink when the parser has resolve_entities=True.
.fromstring() (Sink)
lxml.etree.fromstring(text, parser=None, base_url=None) -> Element
Parses XML from a string. XXE sink under an unsafe parser configuration.
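
To make the difference between an unsafe and a hardened parser concrete, here is a minimal sketch. It assumes lxml is installed, and it uses an internal entity (the name `x` and the sample document are made up for illustration) so that nothing is read from disk or the network. With resolve_entities=False no entity references are expanded, external ones included.

```python
from lxml import etree

# Hypothetical sample document declaring one internal entity.
DOC = b'<!DOCTYPE r [<!ENTITY x "hello">]><r>&x;</r>'

# Unsafe configuration: entity references are expanded while parsing.
unsafe_parser = etree.XMLParser(resolve_entities=True)
unsafe_root = etree.fromstring(DOC, parser=unsafe_parser)

# Hardened configuration: references are left unexpanded and
# network lookups are disabled.
safe_parser = etree.XMLParser(resolve_entities=False, no_network=True)
safe_root = etree.fromstring(DOC, parser=safe_parser)

print(repr(unsafe_root.text))  # expanded entity text
print(repr(safe_root.text))    # no text: the reference is kept as an entity node
```

Passing the hardened parser explicitly at every parse/fromstring call site keeps the configuration visible to both reviewers and static analysis.
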
.XMLParser() (Neutral)
lxml.etree.XMLParser(resolve_entities=False, no_network=True, ...) -> XMLParser
Creates an XML parser. Reported as a finding when constructed with resolve_entities=True or no_network=False.
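
The check described above can be approximated with the standard-library `ast` module. This is only an illustrative stand-in, not how Code Pathfinder implements the rule: the function name `find_unsafe_xmlparser_calls` and the sample source are hypothetical, and the sketch matches on the call name and literal keyword values only.

```python
import ast

# Keyword values that make an XMLParser(...) call worth reporting.
UNSAFE_KWARGS = {"resolve_entities": True, "no_network": False}


def find_unsafe_xmlparser_calls(source: str) -> list[tuple[int, str]]:
    """Return (line, "kwarg=value") pairs for unsafe XMLParser calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Accept both etree.XMLParser(...) and a bare XMLParser(...).
        if isinstance(func, ast.Attribute):
            name = func.attr
        else:
            name = getattr(func, "id", None)
        if name != "XMLParser":
            continue
        for kw in node.keywords:
            if (
                kw.arg in UNSAFE_KWARGS
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is UNSAFE_KWARGS[kw.arg]
            ):
                findings.append((node.lineno, f"{kw.arg}={kw.value.value}"))
    return sorted(findings)


SAMPLE = """\
from lxml import etree
a = etree.XMLParser(resolve_entities=True)
b = etree.XMLParser(no_network=False)
c = etree.XMLParser(resolve_entities=False, no_network=True)
"""

print(find_unsafe_xmlparser_calls(SAMPLE))
```

Only the first two parsers in the sample are reported; the third uses the hardened values and is left alone.
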
.HTMLParser() (Neutral)
lxml.etree.HTMLParser(recover=True, ...) -> HTMLParser
HTML parser variant. Carries less XXE risk than the XML parser but still processes entities.
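
As a small illustration of the HTML parser still processing entities, the sketch below (assuming lxml is installed; the markup is a made-up sample) parses a fragment with named HTML entities and an unclosed tag.

```python
from lxml import etree

# recover=True lets the parser tolerate the unclosed <p> element.
html_parser = etree.HTMLParser(recover=True)
root = etree.fromstring("<p>caf&eacute; &amp; bar", parser=html_parser)

# The parser wraps the fragment in <html><body>...</body></html>.
p = root.find(".//p")
print(root.tag, repr(p.text))
```

The named entities arrive already decoded in the element text.
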
| FQN | Field |
|---|---|
| lxml | fqns[0] |
| lxml.etree | fqns[1] |
A wrong FQN silently yields 0 findings. To verify that a rule really matches on the FQN, change fqns to a garbage value: the rule must then produce 0 results.
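
The garbage-FQN check can be pictured with a toy matcher. The sketch below is an assumption-laden stand-in written with the standard-library `ast` module, not Code Pathfinder's matching logic: `count_calls`, `_dotted_name`, and the sample source are all hypothetical names used for illustration.

```python
import ast


def _dotted_name(node):
    """Turn an attribute chain like etree.parse into the string "etree.parse"."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def count_calls(source: str, fqns: list[str]) -> int:
    """Count calls whose import-resolved name falls under any entry in fqns."""
    tree = ast.parse(source)
    aliases = {}  # local name -> fully qualified name
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom) and node.module:
            for a in node.names:
                aliases[a.asname or a.name] = f"{node.module}.{a.name}"
        elif isinstance(node, ast.Import):
            for a in node.names:
                if a.asname:
                    aliases[a.asname] = a.name
                else:
                    top = a.name.split(".")[0]
                    aliases[top] = top
    hits = 0
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        dotted = _dotted_name(node.func)
        if dotted is None:
            continue
        head, _, rest = dotted.partition(".")
        if head not in aliases:
            continue
        fqn = aliases[head] + ("." + rest if rest else "")
        if any(fqn == p or fqn.startswith(p + ".") for p in fqns):
            hits += 1
    return hits


SAMPLE = """\
from lxml import etree
tree = etree.parse("in.xml")
root = etree.fromstring(b"<r/>")
"""

print(count_calls(SAMPLE, ["lxml", "lxml.etree"]))  # real FQNs
print(count_calls(SAMPLE, ["lxmll.etre"]))          # garbage FQN
```

If the garbage run still reported matches, the rule would be keying on something other than the FQN, such as the bare method name.
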
from codepathfinder.go_rule import PyLxml