Use locale-aware names for search result reranking#4088

Merged
lonvia merged 6 commits into
osm-search:masterfrom
Itz-Agasta:language-reranking
May 20, 2026

@Itz-Agasta
Contributor

@Itz-Agasta Itz-Agasta commented May 3, 2026

Summary

Searching for "Athens" with accept-language: en returned Athens, Georgia, US before Athens, Greece. The reranking logic used display_name for word matching, and for Athens, Greece that was the local Greek name "Αθήνα". Because "Athens" had no match in the resulting word set, Athens, Greece received a distance penalty.

Introduce _get_result_rerank_text() which builds the reranking text using locale-aware name resolution via locales.display_name(). This ensures that translated name variants (e.g., name:en) are included in the word matching pool when the caller specifies a language preference.
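
To make the failure mode concrete, here is a minimal, self-contained sketch of the word-matching problem described above. It is not the code from this PR: `words`, `pick_name` and `unmatched_words` are hypothetical helpers, the tokenisation is a simplification, and `pick_name` only stands in for locale-aware name resolution.

```python
import re
from typing import Dict, List, Set


def words(text: str) -> Set[str]:
    """Split text into lowercase word tokens (simplified normalisation)."""
    return set(re.findall(r"\w+", text.lower()))


def pick_name(names: Dict[str, str], languages: List[str]) -> str:
    """Hypothetical stand-in for locale-aware name resolution:
    return the first name:<lang> tag that exists, else the local name."""
    for lang in languages:
        key = f"name:{lang}"
        if key in names:
            return names[key]
    return names.get("name", "")


def unmatched_words(query: str, rerank_text: str) -> int:
    """Count query words missing from the rerank text. A higher count
    stands in for a larger reranking penalty."""
    return len(words(query) - words(rerank_text))


athens_gr = {"name": "Αθήνα", "name:en": "Athens"}

# Local name only: the query word is not found in the word set.
penalty_local = unmatched_words("Athens", athens_gr["name"])

# Locale-aware name: the query word matches.
penalty_locale = unmatched_words("Athens", pick_name(athens_gr, ["en"]))
```

With the local name alone the query word goes unmatched, while the name resolved for `en` matches it. That is the effect the change aims for.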

closes #3871 #4062

AI usage

None

Contributor guidelines (mandatory)

  • I have adhered to the coding style
  • I have tested the proposed changes
  • I have disclosed above any use of AI to generate code, documentation, or the pull request description

@Itz-Agasta
Contributor Author

Results

toto-result

Member

@lonvia lonvia left a comment

Looks good in general, just some minor suggestions for code improvements.

Comment thread src/nominatim_api/search/geocoder.py Outdated
Comment thread src/nominatim_api/search/geocoder.py Outdated
Comment thread test/bdd/features/db/query/search_simple.feature
@lonvia
Member

lonvia commented May 7, 2026

Please rebase this on master. #4090 contained a few new fixes for mypy.

Itz-Agasta added 3 commits May 9, 2026 21:07
When reranking search results, the word matching pool was built only
from display_name, which returns the local name (e.g., "Αθήνα" for
Athens, Greece). This caused results with translated names to receive
unfair distance penalties when the caller used accept-language.

Introduce _get_result_rerank_text() which builds the reranking text
using locale-aware name resolution from locales.display_name(). This
ensures that name variants like name:en are included in the word
matching pool when the caller requests a specific language.

Add a test scenario where a place with a non-English local name
and an English name:en alias competes against a place with only
an English name. Verifies that accept-language=en correctly
- Refines how result labels are aggregated for reranking by using sets of localized names instead of concatenated strings.

- Updates tests to reflect improved handling of locale-specific queries.
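
As an illustration of why a set of names can behave differently from one concatenated string, the sketch below scores each name on its own and keeps the best result. This is an assumption-laden example, not the PR's implementation: `collect_rerank_names` and `best_word_overlap` are hypothetical, and the overlap ratio is an invented scoring rule, not Nominatim's actual formula.

```python
from typing import Dict, List, Set


def collect_rerank_names(names: Dict[str, str], languages: List[str]) -> Set[str]:
    """Hypothetical helper: gather the local name plus every name:<lang>
    variant for the requested languages. A set removes duplicates."""
    keys = ["name"] + [f"name:{lang}" for lang in languages]
    return {names[k] for k in keys if k in names}


def best_word_overlap(query: str, candidates: Set[str]) -> float:
    """Score each candidate separately and keep the best ratio of
    shared words to candidate words (1.0 means a full match)."""
    qwords = set(query.lower().split())
    best = 0.0
    for name in candidates:
        nwords = set(name.lower().split())
        if nwords:
            best = max(best, len(qwords & nwords) / len(nwords))
    return best


names = {"name": "Αθήνα", "name:en": "Athens", "name:de": "Athen"}
as_set = collect_rerank_names(names, ["en", "de"])
as_string = " ".join(sorted(as_set))

# Each name is compared on its own, so "Athens" is a full match.
score_set = best_word_overlap("Athens", as_set)

# One concatenated string dilutes the match with the other variants.
score_joined = best_word_overlap("Athens", {as_string})
```

Under this toy scoring, the set yields a full match while the joined string scores only a third, because the unrelated variants count against it.
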
@Itz-Agasta Itz-Agasta force-pushed the language-reranking branch from 3fd7bd2 to 01c2c26 Compare May 9, 2026 15:42
Comment thread src/nominatim_api/search/geocoder.py Outdated
@Itz-Agasta
Contributor Author

Itz-Agasta commented May 13, 2026

I guess it's good to go... please have a look now.

Comment thread src/nominatim_api/search/geocoder.py Outdated
Enhances address ranking accuracy by comparing normalized query text
against all localized name variants and country code, rather than
a single locale name. Reduces false positives and improves ranking
for multilingual search scenarios.
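
A rough sketch of the idea in this commit message: normalise the query, then compare it against a pool built from every localized name variant plus the country code. All helper names here are hypothetical, and the `unicodedata`-based normalisation is a simplified stand-in for illustration only. Nominatim's own normalisation is not reproduced here.

```python
import unicodedata
from typing import Iterable, Set


def normalize(text: str) -> str:
    """Simplified normalisation (assumption): decompose characters,
    drop combining marks, then casefold."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


def match_pool(name_variants: Iterable[str], country_code: str) -> Set[str]:
    """Build the pool of words a query may match: every word of every
    name variant, plus the country code."""
    pool = {w for name in name_variants for w in normalize(name).split()}
    pool.add(normalize(country_code))
    return pool


def unmatched(query: str, pool: Set[str]) -> Set[str]:
    """Return the normalised query words that are not in the pool."""
    return set(normalize(query).split()) - pool


pool = match_pool(["Αθήνα", "Athens", "Athènes"], "gr")

# Matches a name variant (accents ignored) and the country code.
missing_full = unmatched("Athenes GR", pool)

# "Georgia" is neither a name variant nor the country code.
missing_other = unmatched("Athens Georgia", pool)
```

In this sketch a query such as "Athenes GR" leaves no unmatched words for Athens, Greece, while "Athens Georgia" leaves "georgia" unmatched, which a reranker could treat as a penalty.
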
@lonvia lonvia merged commit 242c73d into osm-search:master May 20, 2026
8 checks passed
@lonvia
Member

lonvia commented May 20, 2026

All good. Thank you!

Development

Successfully merging this pull request may close these issues.

Athens: unexpected search result order

2 participants