Disorder browser search lacks relevance ranking
Problem
Searching for a disorder by name in the browser app (app/index.html) returns results in alphabetical order rather than by relevance. This means exact or near-exact name matches can be buried below unrelated disorders that happen to mention the search term somewhere in their descriptions, phenotypes, or other fields.
Example: Searching "Parkinson" returns Parkinson's Disease as the 4th result, behind other disorders that merely reference "parkinson" or "parkinsonian" in their pathophysiology or phenotype descriptions.
Root cause
The current search implementation is boolean-only:
- Tokenizes all searchable fields into a flat Set per record
- Checks whether every query token appears as a substring of at least one record token
- Matching records are then sorted alphabetically by name (or by date if selected)
There is no scoring or ranking — a match in the name field is treated identically to a match buried in a phenotype list or description paragraph. There is no prefix match advantage, no term frequency weighting, and no field boosting.
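For illustration, the behavior described above can be reproduced with a minimal sketch. This is not the app's actual code: the record shape, the sample records, and the function names (`tokenize`, `buildTokenSet`, `booleanSearch`) are assumptions made up for this example.

```javascript
// Hypothetical sample records; field names are assumed, not the app's real schema.
const records = [
  { name: "Multiple System Atrophy", description: "Presents with parkinsonian features." },
  { name: "Parkinson's Disease", description: "Progressive loss of dopaminergic neurons." },
  { name: "Sickle Cell Disease", description: "Inherited hemoglobin disorder." },
];

// Lowercase the text and split on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// One flat Set per record: the field each token came from is lost at this point.
function buildTokenSet(record) {
  return new Set(Object.values(record).flatMap((value) => tokenize(String(value))));
}

// Boolean match: every query token must be a substring of some record token.
// Matches are then sorted alphabetically by name, with no notion of relevance.
function booleanSearch(allRecords, query) {
  const queryTokens = tokenize(query);
  return allRecords
    .filter((record) => {
      const tokens = [...buildTokenSet(record)];
      return queryTokens.every((q) => tokens.some((t) => t.includes(q)));
    })
    .sort((a, b) => a.name.localeCompare(b.name));
}

const results = booleanSearch(records, "Parkinson").map((r) => r.name);
console.log(results);
```

With these sample records, the disorder that only mentions "parkinsonian" in its description sorts ahead of Parkinson's Disease, because both are plain matches and the tie is broken by name.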
Expected behavior
- Searching "Parkinson" should return Parkinson's Disease as the #1 result
- Name matches should rank above matches in descriptions or other fields
- Prefix matches (e.g., "Parkin") should rank the matching disorder first
- Multi-word queries like "Sickle Cell" should rank Sickle Cell Disease first
- When a search query is active, results should default to relevance sort rather than alphabetical
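One possible way to meet these expectations is a simple field-weighted score, sketched below. This is a suggestion under stated assumptions rather than a proposed final implementation: the weights, the record shape, the sample records, and the names `WEIGHTS`, `scoreRecord`, and `rankedSearch` are all hypothetical, and the weights would need tuning against the real data.

```javascript
// Illustrative weights only (assumed values, not tuned).
const WEIGHTS = { nameExact: 100, namePrefix: 50, nameSubstring: 20, otherField: 1, phrasePrefix: 200 };

// Hypothetical sample records; field names are assumed, not the app's real schema.
const records = [
  { name: "Beta Thalassemia", description: "May be co-inherited with sickle cell trait." },
  { name: "Multiple System Atrophy", description: "Presents with parkinsonian features." },
  { name: "Parkinson's Disease", description: "Progressive loss of dopaminergic neurons." },
  { name: "Sickle Cell Disease", description: "Inherited hemoglobin disorder." },
];

function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

// Score one record; 0 means "no match". Name hits outweigh hits in other fields.
function scoreRecord(record, query) {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return 0;
  const nameTokens = tokenize(record.name);
  const otherTokens = Object.entries(record)
    .filter(([field]) => field !== "name")
    .flatMap(([, value]) => tokenize(String(value)));

  let score = 0;
  for (const q of queryTokens) {
    if (nameTokens.includes(q)) score += WEIGHTS.nameExact;
    else if (nameTokens.some((t) => t.startsWith(q))) score += WEIGHTS.namePrefix;
    else if (nameTokens.some((t) => t.includes(q))) score += WEIGHTS.nameSubstring;
    else if (otherTokens.some((t) => t.includes(q))) score += WEIGHTS.otherField;
    else return 0; // keep the existing AND semantics: every query token must match somewhere
  }
  // Bonus when the normalized name starts with the normalized query (e.g. "Sickle Cell").
  if (nameTokens.join(" ").startsWith(queryTokens.join(" "))) score += WEIGHTS.phrasePrefix;
  return score;
}

// Relevance sort when a query is active; alphabetical when the query is empty.
function rankedSearch(allRecords, query) {
  const byName = (a, b) => a.name.localeCompare(b.name);
  if (tokenize(query).length === 0) return [...allRecords].sort(byName);
  return allRecords
    .map((record) => ({ record, score: scoreRecord(record, query) }))
    .filter((entry) => entry.score > 0)
    .sort((a, b) => b.score - a.score || byName(a.record, b.record))
    .map((entry) => entry.record);
}

for (const query of ["Parkinson", "Parkin", "Sickle Cell"]) {
  console.log(query, "->", rankedSearch(records, query).map((r) => r.name));
}
```

Ties fall back to alphabetical order, so records with equal scores keep the ordering users see today. Term-frequency weighting is left out of this sketch to keep it short.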