Description
Hi,
I'm using pydruid.db.connector to run a query that returns a row whose content ends in "...\\". This appears to break pydruid: it either drops rows from the data or fails with a JSONDecodeError.
e.g. "SELECT x FROM y"
-> [{"x": "some row"},{"x": "...\\"},{"x": "another row"},{"x": "more rows"}]
2020-11-27 10:44:23: [CRITICAL] JSONDecodeError('Unterminated string starting at: line 1 column 85919 (char 85918)')
2020-11-27 10:44:23: [CRITICAL] Traceback (most recent call last):
File "xxxxx", line 291, in main
data_paths = pull_data(tracker.last_data_dt, tracker.next_data_dt)
File "xxxxx", line 162, in pull_data
data_path = collector.execute_and_save()
File "xxxxx", line 226, in execute_and_save
for i, row in enumerate(cursor):
File "xxxxx", line 181, in _get_cursor
raise err
File "xxxxx", line 164, in _get_cursor
raise err
File "xxxxx", line 161, in _get_cursor
r = next(cursor)
File "/xxxx/venv/lib64/python3.8/site-packages/pydruid/db/api.py", line 62, in g
return f(self, *args, **kwargs)
File "/xxxx/venv/lib64/python3.8/site-packages/pydruid/db/api.py", line 320, in next
return next(self._results)
File "/xxxx/venv/lib64/python3.8/site-packages/pydruid/db/api.py", line 370, in _stream_query
for row in rows_from_chunks(chunks):
File "/xxxx/venv/lib64/python3.8/site-packages/pydruid/db/api.py", line 420, in rows_from_chunks
for row in json.loads(
File "/usr/lib64/python3.8/json/__init__.py", line 370, in loads
return cls(**kw).decode(s)
File "/usr/lib64/python3.8/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib64/python3.8/json/decoder.py", line 353, in raw_decode
obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Unterminated string starting at: line 1 column 85919 (char 85918)
Any rows following the {"x": "...\\"} row either return no data or raise a JSONDecodeError. I'm guessing this is because pydruid.db.api.rows_from_chunks tries to parse the JSON itself and gets the end of the string wrong when it sees "\\"?
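To make the guess concrete, here is a minimal sketch of how a hand-written row splitter could go wrong on this input. It is not pydruid's actual code: split_rows and its track_escapes flag are hypothetical names used only for illustration. With track_escapes=False the scanner decides whether a quote closes a string by looking only at the previous character, so the quote after an escaped backslash is taken for an escaped quote and the rest of the response is swallowed. With track_escapes=True it keeps an explicit escape state instead.

```python
import json


def split_rows(body, track_escapes):
    """Split a JSON array of objects into row strings by scanning characters.

    Hypothetical helper for illustration only, not pydruid's implementation.
    """
    rows, start, depth = [], None, 0
    in_string = False
    escaped = False
    for i, char in enumerate(body):
        if in_string:
            if track_escapes:
                if escaped:
                    # The previous backslash consumed this character.
                    escaped = False
                elif char == "\\":
                    escaped = True
                elif char == '"':
                    in_string = False
            elif char == '"' and body[i - 1] != "\\":
                # Naive check: any quote preceded by a backslash is treated
                # as escaped, even when the backslash itself was escaped.
                in_string = False
            continue
        if char == '"':
            in_string = True
        elif char == "{":
            if depth == 0:
                start = i
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                rows.append(body[start:i + 1])
    return rows


# Same shape as the example result; the second value ends in one backslash,
# which json.dumps serialises as "...\\".
data = [{"x": "some row"}, {"x": "...\\"}, {"x": "another row"}, {"x": "more rows"}]
body = json.dumps(data)

naive = split_rows(body, track_escapes=False)
fixed = split_rows(body, track_escapes=True)

print(len(naive), len(fixed))  # 1 4
```

In this sketch the naive scan returns only the first row, which matches the dropped rows I am seeing. The escape-aware scan returns all four rows, and each one parses with json.loads.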
I have attached a script and a dummy JSON file (scratch.zip) that show the rows being dropped by the function, but they do not trigger the JSONDecodeError. That appears to happen only when I read this row and the surrounding rows from the database.
Many thanks in advance