Add score visibility, cache-or-compute pattern, and performance improvements #8

Open
4RSIM3R wants to merge 2 commits into upstash:master from 4RSIM3R:master

4RSIM3R commented Dec 7, 2025


Summary

This PR introduces four enhancements to the semantic cache library that improve developer experience, performance, and observability:

  1. Default minProximity with validation - Makes the API more user-friendly
  2. Batch upsert optimization - Significantly improves performance for bulk operations
  3. getWithScore() method - Provides visibility into similarity scores for debugging and tuning
  4. getOrSet() method - Implements cache-or-compute pattern for expensive operations

Motivation

Problem 1: No default minProximity

Currently, minProximity is required in the config, but the README claims it defaults to 0.9. This creates confusion and forces users to always specify a value even when the default would work.

Problem 2: Poor bulk upsert performance

The current implementation loops through items and makes sequential HTTP calls:

for (const upsert of upserts) {
  await this.index.upsert(upsert); // N network calls!
}

Problem 3: No visibility into similarity scores

Users can't see why something matched or didn't match, making it difficult to:

  • Debug cache behavior
  • Tune the minProximity threshold
  • Compare embedding models
  • Monitor cache effectiveness

Problem 4: Manual cache-miss handling

Users must manually implement the cache-or-compute pattern for expensive operations like LLM calls, leading to boilerplate code.

Changes

1. Default minProximity with validation ✅

Before:

const cache = new SemanticCache({ index, minProximity: 0.9 }); // Required!

After:

const cache = new SemanticCache({ index }); // Optional, defaults to 0.9

  • minProximity is now optional and defaults to 0.9
  • Values outside [0, 1] are automatically clamped (e.g., -0.5 → 0, 1.5 → 1)
  • Type signature updated to minProximity?: number
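
The default-and-clamp behavior described above could be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `resolveMinProximity` and `DEFAULT_MIN_PROXIMITY` are hypothetical names chosen for the example.

```typescript
// Hypothetical helper: name and structure are assumptions, not the PR's code.
const DEFAULT_MIN_PROXIMITY = 0.9;

function resolveMinProximity(value?: number): number {
  // Fall back to the default when the option is omitted.
  if (value === undefined) return DEFAULT_MIN_PROXIMITY;
  // Clamp into [0, 1]. NaN is not handled in this sketch.
  return Math.min(1, Math.max(0, value));
}
```

Under these assumptions, `resolveMinProximity()` yields the default, while `resolveMinProximity(1.5)` and `resolveMinProximity(-0.5)` are pulled back to the nearest bound.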

2. Batch upsert optimization 🚀

Before: N sequential HTTP requests

for (const upsert of upserts) {
  await this.index.upsert(upsert); // 10 items = 10 HTTP calls
}

After: Single batch HTTP request

await this.index.upsert(upserts); // 10 items = 1 HTTP call

Performance impact:

  • Up to ~10x faster for 10 items when network round trips dominate (1 request instead of 10)
  • Reduced network overhead and latency
  • Lower costs due to fewer API calls
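
To make the difference in request count concrete, here is a small self-contained sketch. `CountingIndex` is a hypothetical in-memory stand-in for the vector index client that only counts calls; it does not model real latency or the real client's types.

```typescript
type Entry = { id: string; data: string };

// Hypothetical stand-in for the index client: each upsert() call
// represents one simulated HTTP request.
class CountingIndex {
  requests = 0;
  stored: Entry[] = [];

  // Accepts a single entry or an array of entries.
  async upsert(entries: Entry | Entry[]): Promise<void> {
    this.requests += 1;
    this.stored.push(...(Array.isArray(entries) ? entries : [entries]));
  }
}

// Old approach: one request per entry.
async function upsertSequentially(index: CountingIndex, entries: Entry[]): Promise<void> {
  for (const entry of entries) {
    await index.upsert(entry);
  }
}

// New approach: one request for the whole batch.
async function upsertBatched(index: CountingIndex, entries: Entry[]): Promise<void> {
  if (entries.length === 0) return; // nothing to send
  await index.upsert(entries);
}
```

With 10 entries, the sequential version issues 10 simulated requests and the batched version issues 1. The actual speedup depends on how much of the total time is spent on round trips.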

3. getWithScore() method 🔍

Returns cached values with similarity scores for observability:

const result = await cache.getWithScore("What's the capital of France?");
// { value: "Paris", score: 0.96 }

Use cases:

  • Debugging: See why something matched or didn't
  • Tuning: Analyze score distribution to set optimal minProximity
  • Model comparison: Measure quality of different embedding models
  • Monitoring: Track cache effectiveness over time
  • Confidence thresholds: Different actions based on score ranges

Supports both single and bulk operations:

// Single
const result = await cache.getWithScore(key);

// Bulk
const results = await cache.getWithScore([key1, key2, key3]);
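
As an illustration of the tuning use case, the sketch below estimates what fraction of logged lookups would count as hits at different thresholds. It assumes scores were collected with a low minProximity so that near misses are visible, and that a hit means `score >= minProximity`. `ScoredResult`, `hitRateAt`, and the sample scores are hypothetical.

```typescript
// Hypothetical result shape, mirroring the { value, score } example above.
type ScoredResult = { value: string; score: number };

// Fraction of logged lookups that would be hits at the given threshold.
// Assumes a hit means score >= minProximity.
function hitRateAt(results: ScoredResult[], minProximity: number): number {
  if (results.length === 0) return 0;
  const hits = results.filter((r) => r.score >= minProximity).length;
  return hits / results.length;
}

// Made-up scores, for illustration only.
const logged: ScoredResult[] = [
  { value: "Paris", score: 0.96 },
  { value: "Paris", score: 0.91 },
  { value: "Berlin", score: 0.88 },
  { value: "Rome", score: 0.72 },
];

const rateAtDefault = hitRateAt(logged, 0.9); // 2 of 4 entries qualify
const rateAtLower = hitRateAt(logged, 0.85); // 3 of 4 entries qualify
```

Comparing such rates against manually checked match quality is one way to choose a threshold. The same scores could also drive confidence tiers, for example serving high scores directly and re-verifying borderline ones.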

4. getOrSet() method 💡

Implements cache-or-compute pattern to reduce boilerplate:

const capital = await cache.getOrSet("capital of Japan", async () => {
  // Only called on cache miss
  return await expensiveLLMCall();
});

Features:

  • Returns the cached value if one exists (similarity score above minProximity)
  • Computes and caches on miss
  • Supports sync and async compute functions
  • Bulk operations with selective computation (only computes missing items)
  • Uses batch upsert for performance

Bulk example:

const capitals = await cache.getOrSet(
  ["capital of Germany", "capital of Italy"],
  [async () => await fetchCapital("Germany"), async () => await fetchCapital("Italy")]
);
// Only calls functions for cache misses, batches the upserts
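
The selective-computation flow for bulk calls can be sketched as below. This is not the PR's implementation: `MapCache` is a hypothetical exact-match stand-in for the semantic lookup, and `getOrSetMany`, `getMany`, and `setMany` are names invented for this example.

```typescript
type Compute = () => string | Promise<string>;

// Hypothetical exact-match cache standing in for the semantic lookup.
class MapCache {
  private store = new Map<string, string>();
  writes = 0; // number of batched write calls

  async getMany(keys: string[]): Promise<(string | undefined)[]> {
    return keys.map((k) => this.store.get(k));
  }

  async setMany(entries: [string, string][]): Promise<void> {
    if (entries.length === 0) return; // skip the write when nothing is new
    this.writes += 1;
    for (const [k, v] of entries) this.store.set(k, v);
  }
}

async function getOrSetMany(
  cache: MapCache,
  keys: string[],
  computes: Compute[]
): Promise<string[]> {
  if (keys.length !== computes.length) {
    throw new Error("keys and computes must have the same length");
  }
  const cached = await cache.getMany(keys);
  const fresh: [string, string][] = [];
  const results = await Promise.all(
    cached.map(async (hit, i) => {
      if (hit !== undefined) return hit; // cache hit: compute is skipped
      const value = await computes[i](); // works for sync and async functions
      fresh.push([keys[i], value]);
      return value;
    })
  );
  await cache.setMany(fresh); // one batched write for all misses
  return results;
}
```

In this sketch the compute functions for misses run concurrently via `Promise.all`, and results keep the order of the input keys. Whether the PR computes misses concurrently or sequentially is not stated here.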

Testing

Added comprehensive test coverage:

Validation tests (4 tests):

  • ✅ Default minProximity is 0.9
  • ✅ Clamps values > 1 to 1
  • ✅ Clamps values < 0 to 0
  • ✅ Accepts valid values in range

getWithScore() tests (4 tests):

  • ✅ Returns value with score on hit
  • ✅ Bulk retrieval with scores
  • ✅ Returns undefined on miss
  • ✅ Scores help debug similarity

getOrSet() tests (5 tests):

  • ✅ Computes and caches on miss
  • ✅ Returns cached without computing on hit
  • ✅ Bulk with all misses
  • ✅ Bulk with mixed hits/misses
  • ✅ Async compute functions

Test results:

✓ 20 pass
✓ 0 fail
✓ 37 expect() calls

All existing tests pass - no breaking changes.

Documentation

Updated README.md with:

  • ✅ New highlights for score visibility and cache-or-compute
  • ✅ Updated minProximity documentation (optional, defaults to 0.9, auto-clamping)
  • ✅ getWithScore() examples (single + bulk)
  • ✅ getOrSet() examples (single + bulk, async functions)
  • ✅ Complete API reference for all methods
  • ✅ Export CacheWithScoreResult type for TypeScript users

Breaking Changes

None. This is a backwards-compatible enhancement:

  • Existing code continues to work unchanged
  • New features are opt-in
  • minProximity can still be explicitly provided

Additional Changes

  • Fixed .husky/pre-commit hook (was running bun test run instead of bun test)
  • Added CacheWithScoreResult type export for TypeScript consumers
  • Improved test stability with proper async cleanup in afterEach

Migration Guide

No migration needed! But you can take advantage of new features:

Optional: Simplify initialization

// Before
const cache = new SemanticCache({ index, minProximity: 0.9 });

// After (if using default)
const cache = new SemanticCache({ index });

Optional: Add score visibility

const result = await cache.getWithScore(query);
// Guard against a miss before reading the fields
if (result?.value !== undefined) {
  console.log(`Match: ${result.value} (confidence: ${result.score})`);
}

Optional: Use cache-or-compute

const answer = await cache.getOrSet(userQuery, async () => {
  return await callLLM(userQuery);
});

Checklist

  • ✅ Code follows project style guidelines
  • ✅ All tests pass (20/20)
  • ✅ Added tests for new functionality
  • ✅ Updated documentation (README)
  • ✅ No breaking changes
  • ✅ Backward compatible
  • ✅ Type exports updated

Related Issues

Addresses common feature requests:

  • Default configuration values
  • Performance optimization for bulk operations
  • Debugging and observability
  • Cache-or-compute pattern

Thank you for reviewing! I'm happy to make any changes or answer questions.

Muhammad Ilzam Mulkhaq and others added 2 commits December 7, 2025 12:31