Skip to content

[BUG] RrfRank.Log: silent degenerate to insertion order with empty Scores on all-negative baselines #497

@tazarov

Description

@tazarov

Observed behavior

Captured from the deferred t.Logf observation in TestCloudClientSearchRRFArithmetic/Log (Phase 21.1 Pass 1 run against Chroma Cloud):

pass1 degenerate Log:    err=<nil> IDs=[[1 2 3 4 5]] Scores=[[]]

The server returned searchErr=nil, an outer IDs slice with exactly one inner entry [1 2 3 4 5] in default insertion order, and an outer Scores slice with one inner entry that is an empty slice.

Expected behavior

The server should EITHER:

  1. Reject the request with a structured error (400 with a message like "log of non-positive score is undefined"), OR
  2. Return well-defined sentinel scores (NaN or -Inf) for every doc so the client can detect and propagate the degenerate state.

Silently falling back to insertion order with an empty inner Scores slice is indistinguishable from "ranking was bypassed entirely" from the client's perspective.

Wire-level explanation

From .planning/phases/21.1-rrf-cloud-integration-test-coverage-including-arithmetic-com/21.1-RESEARCH.md § Pitfall 4:

RrfRank.MarshalJSON() in pkg/api/v2/rank.go:1217 auto-negates the fusion score before serializing:

// Negate (RRF gives higher scores for better, Chroma needs lower for better)
result := rrfSum.Negate()
return result.MarshalJSON()

Consequently, rrf.Log() on the wire operates on -rrf_sum, which is non-positive for the full all-negative baseline this test exercises (empirical Pass 1 baseline RRF scores: [-0.0333, -0.0328, -0.0323, -0.0318, -0.0313]). log(x) for x <= 0 is mathematically undefined — hence the server needs a defined policy.

Classification

Server-behavior defect per the L1 rubric in .planning/phases/21.1-rrf-cloud-integration-test-coverage-including-arithmetic-com/21.1-REVIEWS.md § Action Item L1. The server should either reject with a structured error or return well-defined sentinel scores (NaN/Inf). Silently degenerating to insertion order with an empty inner Scores slice violates the silent-failure prevention guideline in CLAUDE.md § Panic Prevention Guidelines (as a library consumer, the Go client has no way to detect this degenerate state without out-of-band knowledge of the request shape).

Surfaced by

pkg/api/v2/client_cloud_test.go :: TestCloudClientSearchRRFArithmetic/Log (added in Phase 21.1 Pass 1; pinned as a regression assertion in Pass 2).

Repro

export CHROMA_API_KEY=<your key>
export CHROMA_DATABASE=<your database>
export CHROMA_TENANT=<your tenant>
go test -tags="basicv2 cloud" -v -run "TestCloudClientSearchRRFArithmetic/Log" ./pkg/api/v2/...

The test PASSES today because Pass 2 pins the observed degenerate behavior as a regression assertion (require.Empty(sr.Scores[0], ...) and require.Equal([]DocumentID{"1","2","3","4","5"}, sr.IDs[0])). When the server fix lands, the pin will flip to the correct behavior as part of the fix phase.

Scope

Phase 21.1 pins this behavior as a regression assertion. Fix deferred per D-20 (Phase 21.1 does not fold fixes into its own scope — each discovered bug becomes its own phase).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions