Re-tune work search ranking parameters #11777

@cdrini

Description

We've made a number of changes to our search, and it's time to re-evaluate / re-tune some of our search parameters via the evaluation spreadsheet.

Potential tweaks:

  1. We should decrease the edition_count boosting: bf='min(100,edition_count) min(100,def(readinglog_count,0))'. I think this served us well while we were still building up readinglog_count usage, but now that we have a lot of readinglog_count data, I think it's causing non-relevant classics to get boosted. We should cap it closer to maybe 20, if that.
  2. We might want to increase the phrase boosting, pf. I'm wary of adding the pf2/etc parameters, since I'm afraid they might have a perf impact, but worth experimenting with.
  3. We explicitly specify AND; but an mm parameter that lets long queries match only partially (~75% of terms) might be interesting! That said, I don't think the data shows this to be the problem.
  4. Stopwords. We had to disable stopwords many years ago due to a bug Solr had; it would be worth revisiting that to see if we can re-enable them. Note this would require a full reindex.
  5. We have a really annoying subtle/low-level issue with the way search works related to term frequency and merging; that would require a non-trivial Solr config change/investigation. See Investigate peculiar search ranking #10969.
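
Tweaks 1 to 3 could be compared side by side by generating one set of request parameters per experiment. The following is a minimal sketch, not the production search code: the helper names `build_ranking_params` and `min_should_match` are hypothetical, and it assumes the AND behaviour is expressed through `q.op`. Only the `bf` expression is taken from the list above; `bf`, `pf`, `pf2`, `mm`, and `q.op` are standard eDisMax parameters.

```python
def build_ranking_params(edition_cap=100, pf=None, pf2=None, mm=None):
    """Build eDisMax parameters for one ranking experiment.

    Hypothetical helper for illustration only. `pf` and `pf2` are passed
    through untouched, so the caller decides which fields to phrase-boost.
    """
    params = {
        "defType": "edismax",
        # Only the edition_count cap varies; the readinglog_count term
        # is kept exactly as quoted in the issue.
        "bf": f"min({edition_cap},edition_count) min(100,def(readinglog_count,0))",
    }
    if pf:
        params["pf"] = pf
    if pf2:
        params["pf2"] = pf2
    if mm is None:
        # Assumption: the current "AND" behaviour is set via q.op.
        params["q.op"] = "AND"
    else:
        params["mm"] = mm
    return params


def min_should_match(num_clauses, percent):
    """Clauses required by a positive percentage mm (Solr rounds down)."""
    return (num_clauses * percent) // 100
```

For example, `build_ranking_params(edition_cap=20, mm="75%")` yields the lower cap from tweak 1 together with the partial matching from tweak 3. With `mm="75%"`, a four-term query must match three terms and a five-term query must also match three, since the computed value is rounded down.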

Note this will require exposing some low-level search parameters for us to play with in testing, e.g. #8482.
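
One possible shape for exposing those parameters is an allowlist of overridable keys, so that a test request can change ranking knobs without opening up arbitrary Solr parameters. This is only a sketch under that assumption: `ALLOWED_OVERRIDES` and `apply_overrides` are hypothetical names, and nothing here reflects what #8482 actually implements.

```python
# Hypothetical allowlist of ranking parameters that a test request may override.
ALLOWED_OVERRIDES = {"bf", "pf", "pf2", "pf3", "mm", "q.op"}


def apply_overrides(defaults, requested):
    """Merge allowlisted overrides into the default search parameters.

    Returns the merged parameters and a sorted list of rejected keys,
    so callers can report which overrides were ignored.
    """
    merged = dict(defaults)  # copy so the defaults are never mutated
    rejected = []
    for key, value in requested.items():
        if key in ALLOWED_OVERRIDES:
            merged[key] = value
        else:
            rejected.append(key)
    return merged, sorted(rejected)
```

Returning the rejected keys rather than silently dropping them makes it easier to notice a mistyped parameter name while filling in the evaluation spreadsheet.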

Stakeholders

@lokesh

Metadata

Assignees

No one assigned

    Labels

    Lead: @cdrini - Issues overseen by Drini (Staff: Team Lead & Solr, Library Explorer, i18n) [managed]
    Module: Solr - Issues related to the configuration or use of the Solr subsystem. [managed]
    Needs: Response - Issues which require feedback from lead
    Priority: 2 - Important, as time permits. [managed]
    Theme: Search - Issues related to search UI and backend. [managed]
