A Go implementation of semantic caching for LLM responses using Valkey vector search.
- Semantic similarity matching using embeddings
- Configurable similarity threshold
- TTL-based cache expiration
- HTTP REST API
- Minimal dependencies (only the go-redis client)
cd ../../../deployment/docker
docker-compose up -d valkey-stack

go mod download
go run main.go

export VALKEY_HOST=localhost
export VALKEY_PORT=6379
export SIMILARITY_THRESHOLD=0.92
export CACHE_TTL_SECONDS=86400
export PORT=8000

# First query - cache miss
curl -X POST http://localhost:8000/query \
-H "Content-Type: application/json" \
-d '{"query": "What is machine learning?"}'
# Check stats
curl http://localhost:8000/cache/stats

{
  "query": "What is machine learning?",
  "skip_cache": false
}

Health check endpoint.
Get cache statistics.
Clear all cache entries.
.
├── main.go # Complete implementation
├── go.mod # Go module file
└── README.md # This file
| Variable | Default | Description |
|---|---|---|
| VALKEY_HOST | localhost | Valkey server host |
| VALKEY_PORT | 6379 | Valkey server port |
| SIMILARITY_THRESHOLD | 0.92 | Cache hit threshold |
| CACHE_TTL_SECONDS | 86400 | Cache TTL (24 hours) |
| PORT | 8000 | API server port |
package main
import (
	"context"
	"fmt"
)
func main() {
config := Config{
ValkeyHost: "localhost",
ValkeyPort: 6379,
SimilarityThreshold: 0.92,
CacheTTLSeconds: 86400,
EmbeddingDimensions: 1536,
}
cache := NewSemanticCache(config)
ctx := context.Background()
if err := cache.Initialize(ctx); err != nil {
panic(err)
}
// Lookup
embedding := generateEmbedding("What is AI?")
result, _ := cache.Lookup(ctx, embedding)
if result.Hit {
fmt.Println("Cache hit!", result.Response)
} else {
// Store
cache.Store(ctx, "What is AI?", "AI is...", "gpt-4", embedding)
}
}This example uses mock embeddings for demonstration. In production, integrate with an embedding API like OpenAI:
import "github.com/sashabaranov/go-openai"
// getEmbedding returns the embedding vector for text using OpenAI's
// small embedding model. It wraps API errors with context and guards
// against an empty response body before indexing into it.
func getEmbedding(client *openai.Client, text string) ([]float32, error) {
	resp, err := client.CreateEmbeddings(context.Background(), openai.EmbeddingRequest{
		Input: []string{text},
		Model: openai.SmallEmbedding3,
	})
	if err != nil {
		return nil, fmt.Errorf("creating embedding: %w", err)
	}
	// The API returns one Data element per input; never assume it is non-empty.
	if len(resp.Data) == 0 {
		return nil, fmt.Errorf("embedding response contained no data")
	}
	return resp.Data[0].Embedding, nil
}