-
I am parsing PL/SQL with a grammar derived from the ANTLR4 PL/SQL grammar. My parser is written in Java; it reads SQL files from a zip input file and parses them serially. Say I have two SQL files, parse1.sql and parse2.sql. If I package them as separate zip files, they parse without error. However, if I package them together, the second file loops in closure. I instantiate a new lexer and parser for every parse, and after each parse I call getInterpreter().clearDFA() on both the lexer and the parser. The symptoms seem to imply that the first parse is having some effect on the second parse.
-
I discovered that if I add a method to PredictionContextCache that clears the cache Map, the loop goes away and both files parse fine. Would this be considered a bug?
-
I'll give it a try. If I can create a smaller reproducible test case. I'll close this issue.
-
I have a stripped-down example. In my original post I said that I was parsing serially; this example currently parses each file in a different thread. When I reproduced the error while parsing serially, I had to rearrange the zip file to get the files to parse in a different order. With the attached example, you can just run it multiple times: sometimes it hangs and sometimes it parses successfully.
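Because the hang is intermittent, one way to see where a stuck run is spinning is a watchdog that periodically prints every thread's stack. The following is a minimal sketch using only the JDK; the class name `HangWatchdog`, the thread name, and the delays are hypothetical choices for illustration and are not taken from the attached example.

```java
import java.util.Map;

public class HangWatchdog {
    /** Formats the current stack of every live thread into one string. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Starts a daemon thread that waits initialDelayMs, then dumps all stacks every periodMs. */
    static Thread start(long initialDelayMs, long periodMs) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(initialDelayMs);
                while (true) {
                    System.err.println(dumpAllThreads());
                    Thread.sleep(periodMs);
                }
            } catch (InterruptedException ignored) {
                // Interrupted: stop the watchdog quietly.
            }
        }, "hang-watchdog");
        t.setDaemon(true); // a daemon thread does not keep the JVM alive after the parses finish
        t.start();
        return t;
    }

    public static void main(String[] args) {
        start(15_000, 5_000); // illustrative delays; tune them to the expected parse time
        // ... run the parses here; if one hangs, the dumps show which frames keep repeating
    }
}
```

If the same frames show up at the top of a worker thread's stack in several consecutive dumps, that thread is the likely culprit.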
-
The problem is your grammar. It's extraordinarily slow because of ambiguity and large max-k (maximum lookahead) values. You can't work around this by parallelizing the parses. I can't see the .dot files because there is a bug in the ANTLR4 tool, so I can't see why table_ref_aux has a max-k of 1323. The problem is that you use EOF on the RHS of a lexer rule. You really should not do that.
-
The problem is in merge(). When two threads are working on the same context, one thread can create a partially completed result that another thread reads. When this happens, the parent chains can become circular.
Yes, it's a bug. We could fix it by single-threading merge(), but that would kill performance. So, it's recommended to just replace the DFA[] and PredictionContextCache caches. Admittedly a hack.
So, here is an update to your code that seems to work: antlr-report-fixed.zip. It changes four things and adds a diagnostic thread.
Deleted plsqlParser.getInterpreter().clearDFA() and lexer.getInterpreter().clearDFA() after each parse. Why? The DFA is static state shared across all parser instances. In a parallel stream, one thread could call clearDFA() while other threads were actively using that DFA to parse, corrupting their state mid-parse.
Changed ParseInput to carry the zip File path instead of the shared ZipFile handle. Each parseEntry() call now opens its own ZipFile. Why? A single ZipFile instance shared across parallel threads means concurrent calls to getInputStream() share underlying native inflater state, which is not thread-safe. With one ZipFile per call, no handle is shared at all.
After constructing the PlSqlLexer and PlSqlParser, we immediately replace their ATNSimulator with a fresh one backed by a newly allocated DFA[] and a new PredictionContextCache. Why? ANTLR's generated classes store their DFA[] in a static field shared across all instances. Concurrent threads racing to build DFA state can corrupt the shared PredictionContext graph, creating cyclic parent-pointer chains (A→B→A) that cause equals() to recurse infinitely. With a private DFA[] per parse, all mutable prediction state is local to one parse, and there is nothing to corrupt.
Before calling sql_script(), set PredictionMode.SLL. If the parse reports syntax errors, reset the token stream and re-parse with PredictionMode.LL. Why? This was the root cause of the hang. ANTLR's full LL mode tracks complete calling contexts using PredictionContext objects linked together in a graph via parent pointers. For deeply recursive grammar rules (common in PL/SQL), PredictionContext.merge() can create cycles in that graph even within a single-threaded parse. Any subsequent HashMap.get() that calls equals() on those cyclic contexts then recurses forever. SLL mode uses a context-free prediction algorithm that never builds these deep context graphs, so the cycle can never form. The LL fallback runs only when the SLL pass reports a syntax error, and in practice those inputs parsed correctly and terminated within the observed ~15-17 seconds.
Added a daemon thread that wakes after 15 seconds and then dumps all thread stack traces every 5 seconds. Those delays work at least on my slow-ish system. Why? This is the code used to identify the root cause of the hang. The thread dump showed ForkJoinPool.commonPool-worker-2 burning CPU in an infinite SingletonPredictionContext.equals() recursion deep inside PredictionContext.mergeSingletons(), which pointed directly to the cyclic context graph problem. The output is commented out, but the thread remains, so it can be re-enabled if a future hang needs diagnosing.
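To make the circular parent chain concrete, here is a small JDK-only model of the failure. It is a sketch under stated assumptions: `Ctx` is a hypothetical stand-in for a singleton prediction context (a return state plus a parent link), not ANTLR's actual class, and the cycle is built by hand rather than by a race.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

public class ParentChainCycleDemo {
    /** Hypothetical simplified context node: a return state plus a link to its parent. */
    static final class Ctx {
        final int returnState;
        Ctx parent; // mutable only so the demo can build a cycle by hand

        Ctx(int returnState, Ctx parent) {
            this.returnState = returnState;
            this.parent = parent;
        }
    }

    /** Recursive structural equality that follows parent links with no cycle guard. */
    static boolean deepEquals(Ctx a, Ctx b) {
        if (a == b) return true;
        if (a == null || b == null) return false;
        return a.returnState == b.returnState && deepEquals(a.parent, b.parent);
    }

    /** Walks the parent chain and reports whether any node is reached twice (by identity). */
    static boolean hasCycle(Ctx start) {
        Set<Ctx> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        for (Ctx c = start; c != null; c = c.parent) {
            if (!seen.add(c)) return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Ctx a = new Ctx(1, null);
        Ctx b = new Ctx(2, a);
        System.out.println("cycle before: " + hasCycle(b));
        a.parent = b; // now a -> b -> a
        System.out.println("cycle after: " + hasCycle(b));

        // A second, structurally identical cyclic chain: comparing the two never bottoms out.
        Ctx a2 = new Ctx(1, null);
        Ctx b2 = new Ctx(2, a2);
        a2.parent = b2;
        try {
            deepEquals(b, b2);
        } catch (StackOverflowError e) {
            System.out.println("deepEquals did not terminate on the cyclic chains");
        }
    }
}
```

An identity-based walk like `hasCycle` is one way to assert, in a debugging build, that a context graph is still acyclic before it is handed to a hash map.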
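The "one ZipFile per call" change can be sketched with the JDK alone. This is an illustration under assumptions, not the code from antlr-report-fixed.zip: `PerTaskZipDemo`, `readEntry`, `readAll`, and `writeSampleZip` are hypothetical names, and the step that would hand the text to the lexer and parser is left out.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class PerTaskZipDemo {
    /** Reads one entry through a ZipFile that this call opens and closes itself. */
    static String readEntry(File zipPath, String entryName) {
        try (ZipFile zip = new ZipFile(zipPath);
             InputStream in = zip.getInputStream(zip.getEntry(entryName))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the named entries in parallel; worker threads share only the immutable path. */
    static Map<String, String> readAll(File zipPath, List<String> names) {
        Map<String, String> out = new ConcurrentHashMap<>();
        names.parallelStream().forEach(n -> out.put(n, readEntry(zipPath, n)));
        return out;
    }

    /** Builds a small temporary zip so the sketch can run on its own. */
    static File writeSampleZip() {
        try {
            File f = File.createTempFile("demo", ".zip");
            f.deleteOnExit();
            try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(f.toPath()))) {
                for (String name : List.of("parse1.sql", "parse2.sql")) {
                    zos.putNextEntry(new ZipEntry(name));
                    zos.write(("select '" + name + "' from dual;").getBytes(StandardCharsets.UTF_8));
                    zos.closeEntry();
                }
            }
            return f;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File zip = writeSampleZip();
        readAll(zip, List.of("parse1.sql", "parse2.sql"))
                .forEach((name, text) -> System.out.println(name + ": " + text));
    }
}
```

Opening the archive once per entry costs a little extra I/O, but it means no handle or stream state is shared between worker threads.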