Docs updates to reranker section from audit workflow (#228)

prrao87 · web-flow · commit b3e4a5a446c1 · 2026-05-04T12:25:34.000-04:00
* Fix bug with preparing artifact outputs

* Add reranking manifest

* Update workflow automation doc

* Update reranker docs based on audit
diff --git a/docs/reranking/custom-reranker.mdx b/docs/reranking/custom-reranker.mdx
@@ -5,9 +5,16 @@ description: Learn how to create custom rerankers in LanceDB by extending the ba
 icon: "code"
 ---
 
-You can build your own custom reranker in LanceDB by subclassing the `Reranker` class and implementing the
-`rerank_hybrid()` method. Optionally, you can also implement the `rerank_vector()` and `rerank_fts()`
-methods if you want to support reranking for vector and FTS search separately.
+You can build your own custom reranker in LanceDB by subclassing the base `Reranker` class. At a
+minimum, you need to implement `rerank_hybrid()`, which is the logic that combines vector and
+full-text search results. Beyond that, you can optionally implement `rerank_vector()` and
+`rerank_fts()` if you want your reranker to also handle pure vector or pure full-text searches.
+
+Decide up front which surfaces — hybrid, pure vector, or pure full-text — your reranker should
+cover, and only override the ones you need. The base class leaves `rerank_vector()` and
+`rerank_fts()` unimplemented, so calling `.rerank(...)` on a single-modality search you haven't
+overridden raises `NotImplementedError` rather than silently returning unsorted results. That's a
+useful guard, but worth knowing about before you wire up a query path you didn't plan for.
 
 ## Interface
 
@@ -18,6 +25,11 @@ first copy of the row encountered. This works well in cases that don't require t
 and full-text search to combine the results. If you want to use the scores or want to support
 `return_score="all"`, you'll need to implement your own merging algorithm.
 
+Whichever methods you override, your reranker has one job on the way out: attach a
+`_relevance_score` column with the most relevant rows at the top. LanceDB will reject the result
+if that column is missing, and downstream `.limit(...)` calls trust the order you return, so
+sort descending before handing the table back.
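
The output contract described above can be illustrated without LanceDB at all. The following is a minimal sketch, not the library's implementation: plain Python dicts stand in for the Arrow table a real reranker receives, and the helper name `attach_relevance_score`, the `vector_score` / `fts_score` keys, and the `weight` parameter are all hypothetical. It assumes both input scores are already normalized so that higher means more relevant.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def attach_relevance_score(rows: List[Row], weight: float = 0.7) -> List[Row]:
    """Blend two scores into `_relevance_score` and sort best-first.

    Hypothetical helper: `vector_score` and `fts_score` are assumed to be
    pre-normalized to [0, 1], with higher values meaning more relevant.
    """
    scored = []
    for row in rows:
        combined = weight * row["vector_score"] + (1 - weight) * row["fts_score"]
        scored.append({**row, "_relevance_score": combined})
    # Most relevant rows first, so a later limit keeps the best ones.
    return sorted(scored, key=lambda r: r["_relevance_score"], reverse=True)


rows = [
    {"id": 1, "vector_score": 0.2, "fts_score": 0.9},
    {"id": 2, "vector_score": 0.8, "fts_score": 0.4},
    {"id": 3, "vector_score": 0.5, "fts_score": 0.5},
]
ranked = attach_relevance_score(rows)
print([r["id"] for r in ranked])
```

In a real `rerank_hybrid()` override the same two steps (add the column, sort descending) would be applied to the merged table before returning it.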
+
 Below, we show the pseudocode of a custom reranker that combines the results of semantic and full-text
 search using a linear combination of the scores:
 
diff --git a/docs/reranking/eval.mdx b/docs/reranking/eval.mdx
@@ -15,9 +15,26 @@ Combining results from multiple searches thus requires a reranking step.
 
 There are two common approaches for reranking search results from multiple sources.
 
-- **Score-based**: Calculate final relevance scores based on a weighted linear combination of individual search algorithm scores. Example: Weighted linear combination of semantic search & keyword-based search results.
+- **Score-based**: Calculate final relevance scores from the individual search algorithm scores. Examples: Reciprocal Rank Fusion (the default in LanceDB), and weighted linear combination of semantic & keyword-based search scores.
 
-- **Relevance-based**: Discards the existing scores and calculates the relevance of each search result-query pair. Example: Cross Encoder models
+- **Relevance-based**: Discards the existing scores and calculates the relevance of each search result-query pair. Example: Cross Encoder models
+
+<Info>
+If you call `.rerank()` on a hybrid query without passing a reranker, LanceDB defaults to
+`RRFReranker()` — a score-based reranker that uses Reciprocal Rank Fusion. This is the
+score-based path most readers encounter first; `LinearCombinationReranker` is an alternative
+score-based strategy you opt into explicitly.
+</Info>
+
+The hybrid `rerank(...)` method also accepts a `normalize` argument that controls how the raw
+vector and FTS scores are made comparable before reranking:
+
+- `normalize="score"` (the default) — normalizes the raw vector and FTS scores directly.
+- `normalize="rank"` — converts each result list to ranks first, then normalizes.
+
+This choice materially affects score-based rerankers (such as `LinearCombinationReranker`), so
+when you evaluate score-based strategies, treat `normalize` as a tunable hyperparameter
+alongside the reranker itself.
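
To make the difference between these strategies concrete, here is a small library-free sketch. It is an illustration of the general techniques, not LanceDB's code: the function names are hypothetical, `k = 60` is a commonly used RRF constant assumed here, and the two normalization helpers only approximate what "normalize by score" versus "normalize by rank" could mean, assuming higher scores are better.

```python
from typing import Dict, List


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> Dict[str, float]:
    """Reciprocal Rank Fusion: sum 1 / (k + rank) over each list an id appears in."""
    fused: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return fused


def min_max(scores: List[float]) -> List[float]:
    """Scale raw scores to [0, 1]; gaps between scores are preserved."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def rank_then_normalize(scores: List[float]) -> List[float]:
    """Replace each score by its rank position, then scale (best -> 1.0)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0.0] * len(scores)
    for position, idx in enumerate(order):
        ranks[idx] = float(len(scores) - position)  # best gets n, worst gets 1
    return min_max(ranks)


fused = rrf_fuse([["a", "b"], ["b", "c"]])  # vector ranking, FTS ranking
raw = [0.9, 0.89, 0.1]
by_score = min_max(raw)
by_rank = rank_then_normalize(raw)
```

With `raw = [0.9, 0.89, 0.1]`, score normalization keeps the first two results nearly tied, while rank normalization spaces all three evenly. That is why the choice can change what a weighted combination returns.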
 
 Even though there may be many more strategies for reranking, there are no "universally best"
 ones that work well for all cases, because they can be dataset- or application-specific.
diff --git a/docs/reranking/index.mdx b/docs/reranking/index.mdx
@@ -42,6 +42,16 @@ LanceDB supports several rerankers out of the box. Here are a few examples:
 
 You can find more details about these and other rerankers in the [integrations](/integrations/reranking) section.
 
+<Note>
+**SDK coverage differs across languages**
+
+The provider-specific rerankers in the table above
+(`CohereReranker`, `CrossEncoderReranker`, `ColbertReranker`, and others under `lancedb.rerankers`)
+are currently **Python-only**. The TypeScript and Rust SDKs currently expose the generic `Reranker`
+interface (`rerankHybrid` / `rerank_hybrid`) and the built-in `RRFReranker`. To use a
+model-based reranker from TypeScript or Rust, you must implement the `Reranker` interface yourself.
+</Note>
+
 
 ### Multi-vector reranking
 Most rerankers support reranking based on multiple vectors. To do so, pass a list of search results, one per vector column, to the `rerank_multivector` method. Here's an example of how to rerank based on multiple vector columns using the `CrossEncoderReranker`:
@@ -54,14 +64,22 @@ reranker = CrossEncoderReranker()
 
 query = "hello"
 
-res1 = table.search(query, vector_column_name="vector").limit(3)
-res2 = table.search(query, vector_column_name="text_vector").limit(3)
-res3 = table.search(query, vector_column_name="meta_vector").limit(3)
+# `deduplicate=True` requires `_rowid` on every input result set,
+# so call `.with_row_id(True)` on each search before passing it in.
+res1 = table.search(query, vector_column_name="vector").limit(3).with_row_id(True)
+res2 = table.search(query, vector_column_name="text_vector").limit(3).with_row_id(True)
+res3 = table.search(query, vector_column_name="meta_vector").limit(3).with_row_id(True)
 
-reranked = reranker.rerank_multivector([res1, res2, res3],  deduplicate=True)
+reranked = reranker.rerank_multivector([res1, res2, res3], deduplicate=True)
 ```
 </CodeGroup>
 
+- Passing `deduplicate=True` to `rerank_multivector(...)` raises a `ValueError` if any of the
+input result sets is missing the `_rowid` column. Add `.with_row_id(True)` to every
+`table.search(...)` call before reranking, or omit `deduplicate=True` if you don't need it.
+- `RRFReranker.rerank_multivector(...)` always requires `_rowid` on its inputs, regardless of
+the `deduplicate` flag.
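
Why deduplication needs `_rowid` is easier to see in a stripped-down model. The sketch below is an assumption-laden illustration, not the `rerank_multivector` implementation: lists of dicts stand in for search results, and `merge_result_sets` is a hypothetical helper that mirrors the documented behavior of raising `ValueError` when a row identifier is missing.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def merge_result_sets(result_sets: List[List[Row]], deduplicate: bool = False) -> List[Row]:
    """Concatenate result sets, optionally keeping one row per `_rowid`."""
    if deduplicate:
        for i, rows in enumerate(result_sets):
            if any("_rowid" not in row for row in rows):
                raise ValueError(f"result set {i} is missing '_rowid'; cannot deduplicate")
    merged: List[Row] = []
    seen = set()
    for rows in result_sets:
        for row in rows:
            if deduplicate:
                if row["_rowid"] in seen:
                    continue  # same underlying row already kept
                seen.add(row["_rowid"])
            merged.append(row)
    return merged


with_ids = [[{"_rowid": 1, "text": "a"}], [{"_rowid": 1, "text": "a"}, {"_rowid": 2, "text": "b"}]]
deduped = merge_result_sets(with_ids, deduplicate=True)
print([row["_rowid"] for row in deduped])
```

Without a stable row identifier there is nothing reliable to compare on, since the same row can come back from different vector columns with different distances.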
+
 ## Creating Custom Rerankers
 
 LanceDB also allows you to create custom rerankers by extending the base `Reranker` class. The custom reranker
diff --git a/workflows/docs-audit/README.md b/workflows/docs-audit/README.md
@@ -115,14 +115,17 @@ So `--area indexing` maps to `manifests/indexing.toml`. If you add `manifests/se
 uv run python scripts/run_audit.py prepare --area search --refresh
 ```
 
-This creates a new run directory under `artifacts/runs/<run_id>/` and prints a JSON summary to stdout.
+This creates a pending run directory under `artifacts/pending/<run_id>/` and prints a JSON summary to stdout.
 
-After the LLM phase writes the expected outputs into that run directory, complete the run with:
+After the LLM phase writes the expected outputs into that pending run directory, complete the run with:
 
 ```bash
 uv run python scripts/run_audit.py complete --run-id <run_id>
 ```
 
+Completion publishes the directory to `artifacts/runs/<run_id>/`. Directories under `artifacts/runs/`
+are completed audit artifacts and should contain `report.md`.
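
The pending-to-published lifecycle can be sketched as follows. This is a hypothetical illustration of the flow described above, not the code in `scripts/run_audit.py`: the `publish_run` helper and the shape written to `latest_run.json` are assumptions made for the example.

```python
import json
import shutil
import tempfile
from pathlib import Path


def publish_run(artifacts: Path, run_id: str) -> Path:
    """Move a pending run into `runs/`, refusing if `report.md` is absent."""
    pending = artifacts / "pending" / run_id
    if not (pending / "report.md").is_file():
        raise FileNotFoundError(f"{pending} has no report.md; refusing to publish")
    completed = artifacts / "runs" / run_id
    completed.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(pending), str(completed))
    # Assumed pointer format; the real file may carry different fields.
    (artifacts / "latest_run.json").write_text(json.dumps({"run_id": run_id}))
    return completed


with tempfile.TemporaryDirectory() as tmp:
    artifacts = Path(tmp) / "artifacts"
    pending = artifacts / "pending" / "demo-run"
    pending.mkdir(parents=True)
    (pending / "report.md").write_text("# Docs gaps\n")
    published = publish_run(artifacts, "demo-run")
    published_ok = (published / "report.md").is_file() and not pending.exists()
    print(published_ok)
```

Keeping incomplete work under a separate directory means anything listed under `artifacts/runs/` can be treated as finished without further checks.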
+
 To clean up old generated run artifacts, use:
 
 ```bash
@@ -146,7 +149,7 @@ uv run python scripts/run_audit.py prepare \
 
 ## Inspecting Artifacts
 
-Each run directory contains:
+Each completed run directory under `artifacts/runs/<run_id>/` contains:
 
 - `metadata.json`: run-level metadata, repo refresh results, selection decisions
 - `page_bundles/*.json`: deterministic evidence bundles per page
@@ -156,6 +159,10 @@ Each run directory contains:
 
 `artifacts/latest_run.json` points to the most recently completed run.
 
+Pending run directories under `artifacts/pending/<run_id>/` are working directories from `prepare`.
+They are used for manifest validation and LLM drafting, and are not considered completed artifacts
+until `complete` publishes them.
+
 ## Using and Updating Area Manifests
 
 The manifest is the only thing you usually need to change when you want to audit another docs domain. Treat it as a mapping file:
@@ -272,7 +279,7 @@ A practical workflow:
 5. Replace each source block with a small set of relevant files.
 6. Make sure every `applies_to` entry refers to a real page `id`.
 7. Add the area to `enabled_areas` in `config.toml` if your automation depends on that list.
-8. Run `prepare --area <new-area>` and inspect the generated `page_bundles/*.json`.
+8. Run `prepare --area <new-area>` and inspect the generated `page_bundles/*.json` in the printed pending `run_dir`.
 
 ### 6. Sanity-check the manifest before using it weekly
 
@@ -284,9 +291,10 @@ uv run python scripts/run_audit.py prepare --area <new-area>
 
 Then inspect:
 
-- `artifacts/runs/<run_id>/metadata.json`
-- `artifacts/runs/<run_id>/selected_pages.json`
-- `artifacts/runs/<run_id>/page_bundles/*.json`
+- the `run_dir` printed by `prepare` (normally `artifacts/pending/<run_id>`)
+- `<run_dir>/metadata.json`
+- `<run_dir>/selected_pages.json`
+- `<run_dir>/page_bundles/*.json`
 
 If the bundles look noisy, the fix is usually one of:
 
diff --git a/workflows/docs-audit/manifests/reranking.toml b/workflows/docs-audit/manifests/reranking.toml
@@ -0,0 +1,108 @@
+name = "reranking"
+description = "Audit the user-guide reranking docs against public SDK reranker APIs, tested behavior, and user-facing examples in lancedb and docs snippets/tests."
+docs_repo = "docs"
+rotation_unit = "page"
+keywords = [
+  "rerank_hybrid",
+  "rerank_vector",
+  "rerank_fts",
+  "rerank_multivector",
+  "return_score",
+  "RRFReranker",
+  "MRRReranker",
+  "LinearCombinationReranker",
+]
+
+[[pages]]
+id = "overview"
+title = "Reranking Search Results"
+path = "docs/reranking/index.mdx"
+keywords = ["reranking", "CohereReranker", "CrossEncoderReranker", "ColbertReranker", "rerank_multivector", "deduplicate"]
+
+[[pages]]
+id = "custom-reranker"
+title = "Building Custom Rerankers"
+path = "docs/reranking/custom-reranker.mdx"
+keywords = ["custom reranker", "Reranker", "rerank_hybrid", "rerank_vector", "rerank_fts", "merge_results", "return_score"]
+
+[[pages]]
+id = "evaluation"
+title = "Evaluating Hybrid Search Performance"
+path = "docs/reranking/eval.mdx"
+keywords = ["hybrid search", "reranking strategies", "score-based", "relevance-based", "Linear Combination", "Cross Encoder", "Cohere", "ColBERT"]
+
+[[sources]]
+id = "lancedb-python-reranker-core"
+repo = "lancedb"
+kind = "public_python_api"
+applies_to = ["overview", "custom-reranker", "evaluation"]
+paths = [
+  "python/python/lancedb/rerankers/__init__.py",
+  "python/python/lancedb/rerankers/base.py",
+  "python/python/lancedb/rerankers/linear_combination.py",
+  "python/python/lancedb/rerankers/mrr.py",
+  "python/python/lancedb/rerankers/rrf.py",
+  "python/python/lancedb/query.py",
+]
+extract_keywords = ["Reranker", "rerank_hybrid", "rerank_vector", "rerank_fts", "merge_results", "rerank_multivector", "return_score", "_relevance_score", "RRFReranker", "MRRReranker", "LinearCombinationReranker"]
+
+[[sources]]
+id = "lancedb-python-reranker-tests"
+repo = "lancedb"
+kind = "public_python_tests"
+applies_to = ["overview", "custom-reranker", "evaluation"]
+paths = [
+  "python/python/tests/test_rerankers.py",
+  "python/python/tests/test_hybrid_query.py",
+]
+extract_keywords = ["return_score", "_relevance_score", "rerank_multivector", "deduplicate", "RRFReranker", "MRRReranker", "LinearCombinationReranker", "CohereReranker", "CrossEncoderReranker", "ColbertReranker"]
+
+[[sources]]
+id = "lancedb-python-provider-rerankers-overview"
+repo = "lancedb"
+kind = "public_python_api"
+applies_to = ["overview"]
+paths = [
+  "python/python/lancedb/rerankers/cohere.py",
+  "python/python/lancedb/rerankers/colbert.py",
+  "python/python/lancedb/rerankers/cross_encoder.py",
+]
+extract_keywords = ["CohereReranker", "ColbertReranker", "CrossEncoderReranker", "model_name", "return_score", "_relevance_score"]
+
+[[sources]]
+id = "lancedb-typescript-rust-rerankers"
+repo = "lancedb"
+kind = "typescript_rust_api"
+applies_to = ["overview", "custom-reranker"]
+paths = [
+  "nodejs/lancedb/rerankers/index.ts",
+  "nodejs/lancedb/rerankers/rrf.ts",
+  "nodejs/__test__/rerankers.test.ts",
+  "rust/lancedb/src/rerankers.rs",
+  "rust/lancedb/src/rerankers/rrf.rs",
+  "rust/lancedb/src/query.rs",
+]
+extract_keywords = ["Reranker", "RRFReranker", "rerankHybrid", "rerank_hybrid", "_relevance_score", "custom reranker"]
+
+[[sources]]
+id = "lancedb-generated-js-reranker-docs"
+repo = "lancedb"
+kind = "generated_api_docs"
+applies_to = ["overview"]
+paths = [
+  "docs/src/js/namespaces/rerankers/README.md",
+  "docs/src/js/namespaces/rerankers/interfaces/Reranker.md",
+  "docs/src/js/namespaces/rerankers/classes/RRFReranker.md",
+  "docs/src/js/classes/VectorQuery.md",
+]
+extract_keywords = ["rerankers", "Reranker", "RRFReranker", "create", "rerank"]
+
+[[sources]]
+id = "sophon-reranking-example-surface"
+repo = "sophon"
+kind = "enterprise_surface"
+applies_to = ["overview"]
+paths = [
+  "src/dash/src/components/examples/components/ExampleCards.tsx",
+]
+extract_keywords = ["reranking", "RRF", "hybrid-search", "custom reranking"]
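
A manifest like the one above can be sanity-checked mechanically before running `prepare`. The sketch below is a hypothetical helper, not part of the audit scripts: it takes the manifest as an already-parsed dict (for example, the result of `tomllib.load` on Python 3.11+) and reports any `applies_to` entry that does not match a declared page `id`.

```python
from typing import Any, Dict, List


def find_unknown_page_refs(manifest: Dict[str, Any]) -> List[str]:
    """Return one message per `applies_to` entry that names no declared page."""
    page_ids = {page["id"] for page in manifest.get("pages", [])}
    problems: List[str] = []
    for source in manifest.get("sources", []):
        for ref in source.get("applies_to", []):
            if ref not in page_ids:
                problems.append(f"{source['id']}: unknown page id '{ref}'")
    return problems


# Trimmed-down stand-in for a parsed manifest; the second source is a
# deliberately broken, made-up entry to show what a failure looks like.
manifest = {
    "pages": [{"id": "overview"}, {"id": "custom-reranker"}, {"id": "evaluation"}],
    "sources": [
        {"id": "lancedb-python-reranker-core", "applies_to": ["overview", "custom-reranker", "evaluation"]},
        {"id": "typo-example", "applies_to": ["overveiw"]},
    ],
}
problems = find_unknown_page_refs(manifest)
print(problems)
```

An empty list means every source maps onto a real page; anything else is worth fixing before the bundles are generated.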
diff --git a/workflows/docs-audit/prompts/weekly_automation.md b/workflows/docs-audit/prompts/weekly_automation.md
@@ -39,23 +39,27 @@ Then read the manifest file for each area listed in `enabled_areas` in `config.t
      - `uv run python scripts/run_audit.py prepare --area <first-area> --refresh`
    - For subsequent areas in the same weekly run, skip the refresh to avoid repeating `git pull`:
      - `uv run python scripts/run_audit.py prepare --area <next-area>`
-4. Read the JSON summary printed by each `prepare` command and locate each new run directory.
-5. For each run directory, read `selected_pages.json` and the corresponding files in `page_bundles/`.
+4. Read the JSON summary printed by each `prepare` command and locate each pending run directory.
+   - Use the printed `run_dir`; it should point under `artifacts/pending/<run_id>`.
+   - Do not create or write directly under `artifacts/runs/<run_id>` before completion.
+5. For each pending run directory, read `selected_pages.json` and the corresponding files in `page_bundles/`.
 6. For each selected page bundle:
    - apply `prompts/page_audit_guidelines.md` as the page-level review rubric
    - infer normalized code claims from the evidence bundle
    - infer normalized doc claims from the docs bundle
    - identify only the missing documentation
-7. Write semantic outputs under `llm_outputs/` in each run directory.
+7. Write semantic outputs under `llm_outputs/` in each pending run directory.
    - one file per page for code claims
    - one file per page for doc claims
    - one file per page for candidate gaps
-8. Write `report.md` in each run directory.
+8. Write `report.md` in each pending run directory.
    - `report.md` is the docs-gap summary only.
    - Do not include refresh status, manifest-maintenance notes, selected-pages bookkeeping, or any other workflow narration in `report.md`.
    - Include operational notes only if they materially affected audit quality, such as an unrefreshable repo, missing source files, or a manifest ambiguity that changes confidence in the findings.
 9. Complete each run:
    - `uv run python scripts/run_audit.py complete --run-id <run_id>`
+   - Completion publishes the pending directory to `artifacts/runs/<run_id>` and updates `artifacts/latest_run.json`.
+   - Only completed runs with `report.md` should appear under `artifacts/runs/`.
 10. Return a concise markdown summary suitable for the Codex inbox item.
 
 ## Manifest maintenance rules
diff --git a/workflows/docs-audit/scripts/run_audit.py b/workflows/docs-audit/scripts/run_audit.py
diff --git a/workflows/docs-audit/skills/area-manifest-authoring/SKILL.md b/workflows/docs-audit/skills/area-manifest-authoring/SKILL.md