Merged
xavivars
reviewed
Jul 22, 2022
Contributor
Author
I tested this on
I also observed a slowdown at runtime. If that is due to the different fst structure, it would roughly cancel out the benefits for any workflow that runs a large corpus through the pipeline after each recompilation. It would probably also be worth checking whether a language with less divergence between variants sees as much of a slowdown from merging them. And I should add tests.
Contributor
Author
I tested this on
So it seems that the usefulness of this will need to be determined on a language-by-language basis. I also made
The goal of this PR is to make it so that in place of
we can instead write
Why, you might ask, would we want to replace 2 commands with 4 (or 3, if I make `lt-restrict` invert the fst when the direction is `rl`)? Well, if `LT_RELEASE` is unset or is set to `no`, `lt-restrict` will not minimize the transducer (which, even after recent optimizations, is still by far the biggest piece of the process), significantly cutting down on overall compile time, especially for languages like `-oci` where the dictionary is getting compiled 6 times.

This PR is a draft because, in order for this to be fully usable, I also need to write a tool to apply an ACX file to an already-compiled transducer.
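The `LT_RELEASE` check described above can be sketched roughly as follows. This is a Python illustration of the decision only, not the code in this PR (lttoolbox itself is C++). `should_minimize` is a hypothetical helper name, and treating every value other than unset or `no` as "minimize" is an assumption based on the description.

```python
import os


def should_minimize(env=None):
    """Decide whether the expensive minimization step should run.

    Hypothetical helper: returns False when LT_RELEASE is unset or set
    to "no", and True for any other value (an assumption; the real tool
    may interpret other values differently).
    """
    if env is None:
        env = os.environ  # default to the real process environment
    value = env.get("LT_RELEASE")
    return value is not None and value != "no"
```

Passing a plain dict instead of `os.environ` makes it easy to try the three cases (unset, `no`, anything else) without touching the real environment.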
Oh, and I wrote a wrapper around `getopt` because I was tired of typing the same boilerplate over and over again.
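To give a sense of the kind of boilerplate such a wrapper removes, here is a minimal sketch using Python's standard `getopt` module. It is not the wrapper from this PR: `parse_args`, the option table format, and the option names `output` and `verbose` are all invented for illustration and are not flags of any lttoolbox tool.

```python
import getopt


def parse_args(argv, options):
    """Parse argv against a table of (short, long, takes_value) triples.

    Hypothetical helper. Returns (opts, positional), where opts maps each
    long option name to its value, or to True for options without a value.
    """
    short_spec = ""
    long_spec = []
    names = {}        # maps "-x" and "--long" to the long name
    takes_value = {}  # maps long name to whether it expects an argument
    for short, long_name, wants_arg in options:
        short_spec += short + (":" if wants_arg else "")
        long_spec.append(long_name + ("=" if wants_arg else ""))
        names["-" + short] = long_name
        names["--" + long_name] = long_name
        takes_value[long_name] = wants_arg

    # getopt.getopt stops at the first non-option argument.
    parsed, positional = getopt.getopt(argv, short_spec, long_spec)

    opts = {}
    for flag, value in parsed:
        name = names[flag]
        opts[name] = value if takes_value[name] else True
    return opts, positional
```

With this shape, each tool only has to declare its option table once, for example `parse_args(sys.argv[1:], [("o", "output", True), ("v", "verbose", False)])`, instead of keeping the short-option string, the long-option list, and the dispatch code in sync by hand.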