FastAPI backend for multimodal search powered by ImageBind embeddings and ChromaDB. Includes optional NSFW flagging (precomputed boolean on items) and a simple text-based batch flagging script.
- Multimodal search endpoints (web, image, audio, video, all)
- Persistent ChromaDB index stored under index_data/ (git-ignored)
- ImageBind model loading with CUDA support and diagnostics
- Optional NSFW metadata on results (is_nsfw, nsfw_score, nsfw_confidence)
- Batch NSFW flagging scripts (GPU-aware and text-only)
- ChromaDB/PostHog telemetry suppression for noise-free logs
- Python 3.9+
- CUDA-compatible GPU (optional but recommended)
- uv (fast Python package/environment manager)
Install uv:
curl -LsSf https://astral.sh/uv/install.sh | sh
From this directory:
uv sync
uv run uvicorn main:app --host 0.0.0.0 --port 8000 --reload
The API will start on http://localhost:8000.
- ChromaDB persistence directory: index_data/chroma_db (ignored in git)
- Telemetry is disabled in main.py to avoid PostHog warnings
- GET /searchweb?query=...&top_k=100&filter_nsfw=false
- GET /searchimage?query=...&top_k=100&filter_nsfw=false
- GET /searchaudio?query=...&top_k=100&filter_nsfw=false
- GET /searchvideo?query=...&top_k=100&filter_nsfw=false
- GET /status
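As a minimal client-side sketch, the search URLs listed above can be assembled with the standard library. The helper name build_search_url and the SEARCH_ENDPOINTS table are hypothetical; only the paths and query parameters come from the endpoint list, and the base URL assumes the default uvicorn command.

```python
from urllib.parse import urlencode

# Paths taken from the endpoint list above; the mapping itself is illustrative.
SEARCH_ENDPOINTS = {
    "web": "/searchweb",
    "image": "/searchimage",
    "audio": "/searchaudio",
    "video": "/searchvideo",
}


def build_search_url(
    modality: str,
    query: str,
    top_k: int = 100,
    filter_nsfw: bool = False,
    base_url: str = "http://localhost:8000",  # assumes the default host/port
) -> str:
    """Return the full GET URL for one of the documented search endpoints."""
    if modality not in SEARCH_ENDPOINTS:
        raise ValueError(f"unknown modality: {modality!r}")
    params = urlencode(
        {
            "query": query,
            "top_k": top_k,
            # The README writes the flag as lowercase true/false.
            "filter_nsfw": str(filter_nsfw).lower(),
        }
    )
    return f"{base_url}{SEARCH_ENDPOINTS[modality]}?{params}"


if __name__ == "__main__":
    print(build_search_url("image", "red car", top_k=5, filter_nsfw=True))
```

With the server running, the resulting URL can be fetched with urllib.request.urlopen or any HTTP client. The response schema is not documented here, so the sketch stops at building the request.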
When NSFW flags exist in metadata, results include fields like:
{
  "is_nsfw": true,
  "nsfw_score": 0.8,
  "nsfw_confidence": 0.9
}
Use filter_nsfw=true to exclude flagged items from results.
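The effect of filter_nsfw=true can be pictured as a simple filter over result items. This is a sketch of the idea rather than the code in main.py: the function name exclude_nsfw, the "id" field, and the choice to keep items that carry no flag at all are assumptions.

```python
from typing import Any, Dict, List


def exclude_nsfw(results: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Drop items flagged as NSFW; items without a flag are kept (assumption)."""
    return [item for item in results if not item.get("is_nsfw", False)]


# Hypothetical result items; only the nsfw_* field names come from this README.
sample_results = [
    {"id": "a", "is_nsfw": True, "nsfw_score": 0.8, "nsfw_confidence": 0.9},
    {"id": "b", "is_nsfw": False, "nsfw_score": 0.1, "nsfw_confidence": 0.9},
    {"id": "c"},  # never flagged, so no NSFW metadata is present
]

if __name__ == "__main__":
    print([item["id"] for item in exclude_nsfw(sample_results)])
```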
Two approaches are provided to (pre)compute NSFW flags in the index metadata.
Scans documents/metadata for NSFW keywords and updates metadata.
uv run python simple_nsfw_flag.py
- Adds is_nsfw, nsfw_score, nsfw_confidence, and nsfw_keywords
- Works without GPU
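To illustrate the keyword approach, here is a minimal sketch that produces the four metadata fields named above. It is not the logic of simple_nsfw_flag.py: the keyword list is a placeholder, and the way nsfw_score and nsfw_confidence are derived is an assumption made only for this example.

```python
import re
from typing import Dict, Sequence, Union

# Placeholder keywords for illustration; the real script's list is not shown here.
PLACEHOLDER_KEYWORDS: Sequence[str] = ("nsfw", "explicit")

Metadata = Dict[str, Union[bool, float, str]]


def flag_text(text: str, keywords: Sequence[str] = PLACEHOLDER_KEYWORDS) -> Metadata:
    """Scan text for whole-word keyword matches and build NSFW metadata."""
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    matched = sorted(word for word in keywords if word in tokens)
    # Assumed scoring: fraction of keywords that matched.
    score = len(matched) / len(keywords) if keywords else 0.0
    return {
        "is_nsfw": bool(matched),
        "nsfw_score": score,
        # Assumed confidence: 1.0 on any match, 0.0 otherwise.
        "nsfw_confidence": 1.0 if matched else 0.0,
        # Joined into one string so the value stays a plain scalar.
        "nsfw_keywords": ",".join(matched),
    }


if __name__ == "__main__":
    print(flag_text("An explicit photo"))
```

Matching on whole tokens rather than substrings avoids flagging unrelated words that merely contain a keyword.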
Computes similarities between content and NSFW/safe prompts and updates metadata.
# Prefer GPU 1 if present, otherwise fall back automatically
CUDA_VISIBLE_DEVICES=1 uv run python add_nsfw_flags.py
Notes:
- Attempts to use cuda:1 when multiple GPUs are present; otherwise picks best available device
- Handles various embedding shapes/types retrieved from ChromaDB
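The similarity comparison can be sketched in plain Python with toy vectors. This is an illustration under stated assumptions, not the implementation in add_nsfw_flags.py: in the real script the vectors would be ImageBind embeddings, and the decision rule (best NSFW-prompt similarity versus best safe-prompt similarity) and the confidence formula are hypothetical.

```python
import math
from typing import Dict, Sequence, Union

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_embedding(
    item: Vector,
    nsfw_prompts: Sequence[Vector],
    safe_prompts: Sequence[Vector],
) -> Dict[str, Union[bool, float]]:
    """Compare an item embedding against NSFW and safe prompt embeddings."""
    nsfw_sim = max(cosine(item, p) for p in nsfw_prompts)
    safe_sim = max(cosine(item, p) for p in safe_prompts)
    return {
        # Assumed rule: flag when the closest NSFW prompt beats the closest safe one.
        "is_nsfw": nsfw_sim > safe_sim,
        "nsfw_score": nsfw_sim,
        # Assumed confidence: margin between the two similarities.
        "nsfw_confidence": abs(nsfw_sim - safe_sim),
    }


if __name__ == "__main__":
    # Toy 2-D vectors stand in for real embeddings.
    print(score_embedding([1.0, 0.0], [[1.0, 0.0]], [[0.0, 1.0]]))
```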
- Follow PEP 8; include type hints where practical
- Keep functions small and well-named; prefer early returns
- Avoid catching exceptions without meaningful handling
(If configured) run from this folder:
uv run ruff check .
Place tests under tests/ and run:
uv run pytest -q
We welcome contributions! Please follow these steps:
- Fork the repository and create a feature branch
git checkout -b feat/your-feature
- Set up the environment
uv sync
- Make focused, atomic commits with clear messages
- Conventional commits encouraged (feat:, fix:, docs:, chore:)
- Run the API locally and verify endpoints
- Update documentation if behavior changes
- Open a Pull Request
- Describe the change, motivation, and testing steps
- Link related issues if any
- Keep changes scoped; avoid unrelated refactors
- Include before/after behavior where relevant
- Ensure no large binary or index files are committed
Please include:
- Environment (OS, Python, CUDA, GPU)
- Logs (errors/warnings)
- Steps to reproduce
This backend ignores heavy and generated assets:
- index_data/ (ChromaDB persistence)
- log files (e.g., indexer.log, **/logs/**, *.log*)
- virtual envs and caches
If you notice other large/generated paths, propose an update to .gitignore.
Maintained by the Glimpse team. Thanks for contributing!