A script to inspect the payload in ChromaDB or any vector database.

trevianxyz/testchroma

ChromaDB Collection Inspector

A command-line tool for inspecting ChromaDB collections, viewing documents, metadata, and embeddings with rich formatting. Available as both a Python script and a convenient shell script wrapper.

Features

  • Collection Discovery: List all available collections with document counts
  • Document Preview: View documents, metadata, and embedding vectors
  • Rich Formatting: Color-coded output with syntax highlighting
  • Flexible Filtering: Limit results and customize embedding display
  • Error Handling: Graceful handling of missing collections with helpful suggestions
  • Shell Integration: Convenient shell script wrapper with quick commands

Installation

# Install Python dependencies
uv add chromadb rich

# Make shell script executable (optional)
chmod +x testchroma

Usage

Shell Script (Recommended)

The shell script provides a more convenient interface with quick commands:

# Quick collection discovery
./testchroma list
./testchroma collections
./testchroma ls

# Preview a collection
./testchroma my_collection
./testchroma -c my_collection

# Show help
./testchroma help

Python Script (Direct)

# List all available collections
uv run python testchroma.py --list-only

# Preview a specific collection
uv run python testchroma.py --collection my_collection

# Preview with custom limits
uv run python testchroma.py --collection my_collection --limit 50 --head 12
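
As a rough illustration of what --list-only might do internally, the sketch below maps each collection name to its document count. It is not taken from testchroma.py: open_client and collection_counts are hypothetical names, and the code assumes the chromadb PersistentClient API. Depending on the chromadb release, list_collections() returns either plain names or collection objects, so the sketch accepts both.

```python
from typing import Any


def open_client(path: str = ".chromadb") -> Any:
    """Open a persistent ChromaDB client (chromadb is imported lazily)."""
    import chromadb  # third-party; installed with `uv add chromadb`

    return chromadb.PersistentClient(path=path)


def collection_counts(client: Any) -> dict[str, int]:
    """Map each collection name to its document count."""
    counts: dict[str, int] = {}
    for entry in client.list_collections():
        # Depending on the chromadb version, entries are names or objects.
        name = entry if isinstance(entry, str) else entry.name
        counts[name] = client.get_collection(name).count()
    return counts


# Hypothetical usage (requires an existing database directory):
# for name, n in sorted(collection_counts(open_client(".chromadb")).items()):
#     print(f"{name}: {n} documents")
```

Passing the client in as an argument keeps the counting logic independent of how the database is opened, which makes it easy to exercise with a stand-in object.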

Command Line Options

Shell Script Options

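| Short | Long         | Description           | Default                                 |
|-------|--------------|-----------------------|-----------------------------------------|
| -p    | --path       | ChromaDB directory    | $CHROMA_PATH or .chromadb               |
| -c    | --collection | Collection name       | $CHROMA_COLLECTION or test_collection   |
| -l    | --limit      | Document limit        | 100                                     |
| -h    | --head       | Embedding components  | 8                                       |
| -i    | --index      | Embedding index       | 5                                       |
|       | --list-only  | List collections only | false                                   |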

Python Script Options

| Option       | Description                           | Default                               |
|--------------|---------------------------------------|---------------------------------------|
| --path       | ChromaDB persist directory            | $CHROMA_PATH or .chromadb             |
| --collection | Collection name to inspect            | $CHROMA_COLLECTION or test_collection |
| --limit      | Number of documents to preview        | 100                                   |
| --head       | Embedding components to show          | 8                                     |
| --show-index | Specific embedding index to highlight | 5                                     |
| --list-only  | Only list collections, don't preview  | False                                 |
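
The options and defaults listed for the Python script could be declared with argparse roughly as follows. This is a minimal sketch under stated assumptions, not the actual parser in testchroma.py: build_parser is a hypothetical name, and the environment variables are assumed to be read when the parser is built.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options and defaults."""
    parser = argparse.ArgumentParser(description="Inspect a ChromaDB collection")
    parser.add_argument(
        "--path",
        default=os.environ.get("CHROMA_PATH") or ".chromadb",
        help="ChromaDB persist directory",
    )
    parser.add_argument(
        "--collection",
        default=os.environ.get("CHROMA_COLLECTION") or "test_collection",
        help="Collection name to inspect",
    )
    parser.add_argument("--limit", type=int, default=100,
                        help="Number of documents to preview")
    parser.add_argument("--head", type=int, default=8,
                        help="Embedding components to show")
    parser.add_argument("--show-index", type=int, default=5,
                        help="Specific embedding index to highlight")
    parser.add_argument("--list-only", action="store_true",
                        help="Only list collections, don't preview")
    return parser


if __name__ == "__main__":
    # Print the parsed options so the effect of flags and env vars is visible.
    print(vars(build_parser().parse_args()))
```

With this layout an explicit flag always wins over the environment variable, and the environment variable wins over the built-in default.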

Quick Commands (Shell Script)

  • ./testchroma list - List all collections
  • ./testchroma collections - Same as list
  • ./testchroma ls - Same as list
  • ./testchroma help - Show help

Environment Variables

  • CHROMA_PATH: Default ChromaDB directory path
  • CHROMA_COLLECTION: Default collection name

Examples

Shell Script Examples

# Quick collection discovery
./testchroma list

# Preview specific collection
./testchroma educational_content

# Use different ChromaDB path
./testchroma -p ./my_chroma_db -c my_data

# Custom inspection
./testchroma -c my_collection -l 50 -h 12 -i 10

Python Script Examples

# Discover what collections exist
uv run python testchroma.py --list-only

# Preview educational content collection
uv run python testchroma.py --collection educational_content

# Use different ChromaDB path
uv run python testchroma.py --path ./my_chroma_db --collection my_data

# Show more embedding details
uv run python testchroma.py --collection my_collection --head 16 --show-index 10

Output Format

The script displays:

  • Document content (truncated to 2000 chars)
  • Metadata with color-coded keys
  • Embedding statistics (dimension, L2 norm, sample values)
  • Collection information (document count, metadata structure)
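
The document truncation and embedding statistics listed above can be computed with the standard library alone. The sketch below is illustrative only: truncate and embedding_summary are hypothetical names, the trailing "..." marker is an assumption, and --head is assumed to select the first N components of the vector.

```python
import math
from typing import Sequence


def truncate(text: str, max_chars: int = 2000) -> str:
    """Cut long documents to max_chars and mark the cut with an ellipsis."""
    return text if len(text) <= max_chars else text[:max_chars] + "..."


def embedding_summary(vector: Sequence[float], head: int = 8) -> dict:
    """Return the dimension, L2 norm, and first `head` components."""
    return {
        "dimension": len(vector),
        # L2 norm: square root of the sum of squared components.
        "l2_norm": math.sqrt(sum(x * x for x in vector)),
        "head": list(vector[:head]),
    }


if __name__ == "__main__":
    print(embedding_summary([3.0, 4.0]))
    print(len(truncate("x" * 5000)))
```

An L2 norm close to 1.0 usually indicates that the embeddings were normalized before storage, which is a quick sanity check when debugging similarity results.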

Error Handling

When a collection doesn't exist:

  • Shows available collections
  • Provides usage examples
  • Suggests using --list-only for a detailed view
  • Exits with a helpful error message
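
The behaviour described above could be structured as a message builder plus a small exit helper. This is a sketch, not the script's actual code: missing_collection_message and fail_missing are hypothetical names, the message wording is invented for illustration, and the exit status of 1 is an assumption.

```python
import sys


def missing_collection_message(requested: str, available: list[str]) -> str:
    """Build the text shown when the requested collection does not exist."""
    lines = [f"Collection '{requested}' not found."]
    names = sorted(available)
    if names:
        lines.append("Available collections:")
        lines.extend(f"  - {name}" for name in names)
        lines.append(
            f"Example: uv run python testchroma.py --collection {names[0]}"
        )
    else:
        lines.append("No collections exist in this database.")
    lines.append("Run with --list-only for a detailed view.")
    return "\n".join(lines)


def fail_missing(requested: str, available: list[str]) -> None:
    """Print the message to stderr and exit with a non-zero status."""
    print(missing_collection_message(requested, available), file=sys.stderr)
    sys.exit(1)  # assumed exit code; the real script may differ
```

Keeping the message construction separate from the exit call makes the text easy to check without terminating the process.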

Shell Script Features

The shell script wrapper provides additional benefits:

  • Dependency Checking: Automatically verifies required tools (uv, python3)
  • Quick Commands: Short aliases for common operations
  • Better Error Messages: Clear, colored error output
  • Environment Integration: Respects environment variables
  • Help System: Built-in help and usage examples

Use Cases

  • Debugging: Inspect collection contents and embedding quality
  • Exploration: Discover available data in ChromaDB
  • Validation: Verify document structure and metadata
  • Development: Quick collection inspection during development
  • Automation: Shell script integration for CI/CD pipelines

Files

  • testchroma.py - Main Python script
  • testchroma - Shell script wrapper (optional)
  • README.md - This documentation

Dependencies

  • chromadb - ChromaDB client library
  • rich - Rich text and beautiful formatting
  • uv - Python package manager (for shell script)
  • python3 - Python interpreter
