Skip to content

Tiered compilation #6

@no-defun-allowed

Description

@no-defun-allowed

The compiler is a bit slow. It's faster than compilers for many programming languages, but it's weird for a funny-looking regular expression to take 150ms or so to compile. I've done my part in optimizing DFA generation, so most compile-time is taken by the Common Lisp compiler. On the other hand, the compiler also generates really good code - I wouldn't be surprised if most projects could go without full compilation. While we could sink DFA compilation into compile time when the regular expression is known at compile time, we do provide a function which conceptually accepts a string at runtime, and so it would be nice to do something useful if we need to compile at runtime.

So here is an idea: we do something more like JIT compilation and have a tiered compiler. The current compiler is used for "hot" regexes which are used to scan lots of vectors, and we use a chain of closures for cold code. Some heuristic would be used to make the switch to hot code, such as if we've matched more than some length with a cached chain of closures, and the optimizing compiler could run concurrently in another thread, too.

The chain of closures would take some of the benefits of DFA generation: we know how many registers are needed (though, without a compiler with copy propagation we overestimate substantially; oh well) and we still have linear complexity while scanning. We also could still use type splitting for chains of closures. All in all I wouldn't be too surprised if the cold compiler was still faster than cl-ppcre and other bytecode/closure-based implementations.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requestquestionFurther information is requested

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions