Description
The Rebar regex benchmarks have been updated to .NET 8.0. The good news is that we now appear to be faster than PCRE in aggregate on this benchmark, and faster than Rust regex on a few individual benchmarks. However, there are a few outliers where .NET is significantly behind Rust regex, which is one of the fastest engines available.
See
BurntSushi/rebar#10 (comment)
BurntSushi/rebar#10 (comment)
Those comments include the command line to reproduce the numbers, and the repo itself is very well documented and has powerful scripts.
The task here is to investigate any of the biggest outliers, the largest of which are
benchmark                     dotnet/compiled        rust/regex
---------                     ---------------        ----------
curated/03-date/ascii         1162.4 KB/s (139.40x)  158.2 MB/s (1.00x)
curated/03-date/unicode       1167.8 KB/s (137.05x)  156.3 MB/s (1.00x)
curated/12-dictionary/single  1436.7 KB/s (507.60x)  712.2 MB/s (1.00x)
and determine whether there is an optimization we are missing that would help close the gap.
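For anyone sanity-checking the gap, here is a minimal sketch of how the ratios in parentheses relate to the throughput columns. It assumes binary units (1 MB = 1024 KB), which is an assumption that happens to reproduce the printed ratios; Rebar itself presumably derives them from unrounded timings, so small differences are expected. The `slowdown` helper and the `rows` dictionary are illustrative names, not part of Rebar.

```python
# Throughput figures copied from the table above.
# Assumption: binary units, i.e. 1 MB/s = 1024 KB/s.
KB = 1024
MB = 1024 * KB

# benchmark name -> (dotnet/compiled bytes per second, rust/regex bytes per second)
rows = {
    "curated/03-date/ascii": (1162.4 * KB, 158.2 * MB),
    "curated/03-date/unicode": (1167.8 * KB, 156.3 * MB),
    "curated/12-dictionary/single": (1436.7 * KB, 712.2 * MB),
}


def slowdown(dotnet_bps: float, rust_bps: float) -> float:
    """How many times slower dotnet/compiled is than rust/regex."""
    return rust_bps / dotnet_bps


for name, (dotnet_bps, rust_bps) in rows.items():
    print(f"{name}: {slowdown(dotnet_bps, rust_bps):.1f}x")
```

The computed values agree with the table to within rounding, which suggests the parenthesized numbers are throughput ratios relative to the fastest engine in each row.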
It is OK to look at the Rust regex source code, which is available under the MIT license: https://docs.rs/crate/regex/latest/source/LICENSE-MIT.
NOTE: this benchmark does not support source-generated regexes; these numbers are from the compiled engine (dotnet/compiled). However, almost all optimizations in one apply to the other.