-
Notifications
You must be signed in to change notification settings - Fork 5
Description
So just for my own amusement, I tried the stupidest memoization I could think of, to see if it helped. I figured hey, I already did the tracing stuff, which means that a single call is wrapping every rule check, perfect place to put memoization, right? The diffs are at https://gist.github.com/rlpowell/43ff4ab1550004b8a802406b49f7b4d2 , but all it really is is I added:
memo: Box<HashMap<String, &'static str>>,
to ParserState (yes, it's a &str, I was going for quick and dirty) and lines like this to the tracing module wherever a production failed:
(*new_state.memo).insert(memo_string.clone(), "Err");
memo_string there is the rule plus the position.
This took my 5 character pathological example from ~390 seconds to 0.4 seconds. (Doing the equivalent thing with successful productions just doesn't work, and given how well failed-only worked I didn't bother to figure out why.)
So, yeah. I'd prefer to stay with Pest, partly because sunk costs and partly because it's the most popular Rust PEG parser, so if there's a version of this that y'all would be willing to mainline, please let me know; I have zero interest in maintaining my own fork.
If you do want me to polish this up, I'd love a pointer to a real-time place to discuss it, as that pretty clearly isn't the Discord.
FWIW, it turns out ( https://docs.rs/peg/latest/peg/#caching-and-left-recursion ) that rust-peg is also not a packrat parser by default, but you can mark individual productions for caching. It seems very Rust-y that the top two PEG parsers are both like "make your grammar performant that's not our job". :) rust-peg also has left-recursion support via caching, which my project also needs, so if Pest doesn't work out I'll go there. But I dunno, maybe y'all want to do a similar thing where productions can be marked for caching
Originally posted by @rlpowell in pest-parser/pest#1081 (comment)