@tiramisuflavor commented Mar 8, 2025

Disabled regex to find better matches as agreed on F1Carreras Discord. Searching for "Formula1 2024 97" returns nothing, but "Formula1 S2024E97" works just fine.

- name: re_replace # S2024 to 2024 and S2024E97 to 2024 97
  args: ["\\b(?:S(\\d{2,4}))(?:E(\\d{2,4}))?\\b", "$1 $2"]
# disabled to find better matches as agreed on F1Carreras Discord
# searching for "Formula1 2024 97" returns nothing, but "Formula1 S2024E97" works just fine
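For reference, a minimal Python sketch of what the `re_replace` filter above does to a query. It assumes Python's `re` module treats this particular pattern the same way as the indexer's regex engine, which has not been checked here; `strip_season_episode_markers` is a hypothetical helper name, not something from the indexer definition.

```python
import re

# Pattern copied from the re_replace filter above. The doubled backslashes in
# the YAML string become single backslashes in a Python raw string.
PATTERN = re.compile(r"\b(?:S(\d{2,4}))(?:E(\d{2,4}))?\b")


def strip_season_episode_markers(query: str) -> str:
    """Hypothetical helper: rewrite S2024 to 2024 and S2024E97 to 2024 97."""
    # Python writes backreferences as \1 and \2 where the filter uses $1 and $2.
    # An unmatched optional group is substituted as an empty string.
    return PATTERN.sub(r"\1 \2", query)


print(strip_season_episode_markers("Formula1 S2024E97"))  # Formula1 2024 97
print(strip_season_episode_markers("Formula1 2024x97"))   # Formula1 2024x97
```

With the filter enabled, an `S2024E97` query reaches the tracker as `2024 97`, which is the form the description says returns nothing. A season-only query such as `Formula1 S2024` comes out with a trailing space in this sketch; whether the indexer trims it is not shown in this thread.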
Contributor

This was a workaround for both the Formula1 S2024E97 and Formula1 2024x97 formats. Unless 2024x97 is not being used at all, your change breaks matching for both formats.

Sadly, when I pushed the change it worked fine for both, AFAIK.

Author

Thanks for your review and insights. Let me collect some data and see if there’s a better alternative.

Author

@tiramisuflavor Mar 10, 2025

I've been playing around with the Meilisearch engine they use and finally found a way to get both formats working. It's not very intuitive, but that's the price of a full-text search mechanism.

Example Inputs & Expected Outputs:

| Input | Step 1 (SxxExx → SxxXxx) | Step 2 (Sxx → xx, if no x) | Final Output | Expected Results |
| --- | --- | --- | --- | --- |
| S2024E103 | S2024x103 | (unchanged) | S2024x103 | S2024E103 and 2024x103 |
| S2024 | (unchanged) | 2024 | 2024 | 2024 Season Release |
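The two steps in the table can be sketched in Python as follows. The exact regexes used in the PR are not quoted in this thread, so both patterns and the `normalize_query` name are assumptions chosen only to reproduce the table rows.

```python
import re

# Assumed pattern for step 1: SxxExx becomes Sxxxxx with a literal "x" separator.
STEP1 = re.compile(r"\bS(\d{2,4})E(\d{2,4})\b")
# Assumed pattern for step 2: a bare Sxx loses its "S". The trailing \b keeps it
# from matching S2024x103, because "4" followed by "x" is not a word boundary.
STEP2 = re.compile(r"\bS(\d{2,4})\b")


def normalize_query(query: str) -> str:
    """Hypothetical helper applying step 1, then step 2, as in the table."""
    query = STEP1.sub(r"S\1x\2", query)
    return STEP2.sub(r"\1", query)


print(normalize_query("Formula1 S2024E103"))  # Formula1 S2024x103
print(normalize_query("Formula1 S2024"))      # Formula1 2024
```

In this sketch the "if no x" condition of step 2 needs no separate check: it falls out of the word boundary at the end of the second pattern.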

Contributor

@ilike2burnthing Aug 16, 2025

Formula1 "S2024E97" returns 1 result - Formula1 S2024E97
Formula1 S2024x97 returns 1 result - Formula1 S2024E97 (same as ^)
Formula1 "2024x97" returns 1 result - Formula1 2024x97

Quotation marks seem to be required in some cases, as the tracker will return irrelevant results otherwise.

It doesn't look like there's a simple one-size-fits-all fix for this. The best-case scenario is going to be finding which format returns the most results most of the time.

From a very quick look, I found that NTT "S2025e37" and NTT S2025x37 return the same single result, but there's nothing for NTT "2025x37". You'd need to look at more examples.

@tiramisuflavor marked this pull request as draft March 9, 2025 14:22
@tiramisuflavor marked this pull request as ready for review March 10, 2025 19:00
@bakerboy448 added the "Status: Ready for Review" label Apr 25, 2025
@bakerboy448 changed the title from "Update keyword filter for improved search accuracy" to "F1Carreras : Update keyword filter for improved search accuracy" Apr 27, 2025
@bakerboy448 added the "Status: Testing Needed" (Indexer needs testing and validation) label Apr 27, 2025
Co-authored-by: ilike2burnthing <[email protected]>