
fix: reduce VLM hallucinations and improve enrichment pipeline#52

Merged
antoniomtz merged 1 commit into main from antoniomtz/vlm-hallucination-fixes
Apr 16, 2026

Conversation


@antoniomtz antoniomtz commented Apr 16, 2026

Summary

  • Reduce VLM hallucinations by splitting into short VLM prompt + LLM structuring step
  • Skip LLM enhancement when no user product data to prevent fabrication
  • User-provided data is treated as the merge base: materials from the user are trusted over VLM guesses, while printed label text overrides user claims about the product type
  • Separate /vlm/faqs endpoint for async FAQ loading (details show immediately, FAQs load in background)
  • Anti-hallucination guardrail added to branding step
  • Locale threaded through full VLM pipeline for proper localization
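The pipeline split above can be sketched as follows. This is a minimal, self-contained illustration, not the PR's implementation: the model calls are stubbed, and while the function names mirror those mentioned in the PR (`extract_vlm_observation`, `_call_nemotron_structure_vlm`, `_call_nemotron_enhance`), their signatures and return shapes here are assumptions.

```python
VLM_PROMPT = "Describe this product in detail..."  # the short prompt from the PR

def call_vlm(image_bytes, prompt):
    # Stub: a real implementation sends the image plus the short prompt to the VLM
    # and gets back free-form text rather than structured JSON.
    return "A brown shoulder bag with a gold-tone clasp."

def _call_nemotron_structure_vlm(free_text, locale):
    # Stub: a real implementation asks the LLM to convert the free text
    # into catalog JSON, localized to the requested locale.
    return {"description": free_text, "locale": locale}

def _call_nemotron_enhance(catalog, product_data):
    # Stub: merge, with non-empty user-provided fields taking precedence.
    return {**catalog, **{k: v for k, v in product_data.items() if str(v).strip()}}

def extract_vlm_observation(image_bytes, locale="en", product_data=None):
    free_text = call_vlm(image_bytes, VLM_PROMPT)              # step 1: short VLM prompt
    catalog = _call_nemotron_structure_vlm(free_text, locale)  # step 2: LLM structuring
    # Skip enhancement entirely when no user field has content, so the LLM
    # has nothing to "fill in" (i.e., fabricate) from an empty base.
    if product_data and any(str(v).strip() for v in product_data.values()):
        catalog = _call_nemotron_enhance(catalog, product_data)
    return catalog
```

The key point of the split is that the VLM only describes what it sees; converting that description to catalog fields is a separate, text-only LLM step, which is easier to constrain.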

Changes

Backend (src/backend/vlm.py):

  • Short VLM prompt ("Describe this product in detail...") replaces the previous 35-line structured prompt
  • New _call_nemotron_structure_vlm() converts VLM free text to catalog JSON with locale support
  • _call_nemotron_enhance skips Step 1 when no product data has content
  • Enhancement prompt: user data is the base, VLM adds visual details; materials from user trusted, printed labels override user product type
  • Branding prompt: follows user format requests; does not add ingredients/specs not in the content
  • FAQ function now takes enriched result (not raw VLM) for consistency with Details tab
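The merge precedence described in the enhancement-prompt bullet could look roughly like this. A sketch only: the field names (`material`, `product_type`, `label_product_type`) are illustrative assumptions, not the PR's actual schema.

```python
def merge_catalog(user, vlm):
    # User data is the base: any non-empty user field wins over the VLM's guess
    # (e.g. a "synthetic leather" title beats a VLM guess of "leather").
    merged = {**vlm, **{k: v for k, v in user.items() if v}}
    # Exception: text printed on the product's own label is stronger evidence
    # of product type than the user's claim, so it overrides.
    label_type = vlm.get("label_product_type")
    if label_type:
        merged["product_type"] = label_type
    return merged
```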

Backend (src/backend/main.py):

  • New POST /vlm/faqs endpoint with locale validation
  • FAQs removed from /vlm/analyze response (called separately by frontend)
  • Locale passed to extract_vlm_observation for structuring step
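The locale validation the new `POST /vlm/faqs` endpoint performs might be shaped like the helper below. This is an assumption-heavy sketch: the supported-locale set, the requirement of an enriched result in the body, and the error shape are all illustrative, since the PR does not spell them out.

```python
SUPPORTED_LOCALES = {"en", "es"}  # assumption: the real set may be larger

def validate_faq_request(payload):
    # Returns (status_code, body) so the endpoint can reject bad input
    # before spending an LLM call on FAQ generation.
    locale = payload.get("locale", "en")
    if locale not in SUPPORTED_LOCALES:
        return 422, {"error": f"unsupported locale: {locale}"}
    if not payload.get("enriched_result"):
        # FAQs are generated from the enriched result (not raw VLM output),
        # matching the Details tab, so the caller must supply it.
        return 422, {"error": "enriched_result is required"}
    return 200, {"locale": locale}
```

Because FAQs were removed from `/vlm/analyze`, the frontend calls this endpoint separately after details render, which is what lets the Details tab appear immediately.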

Frontend:

  • Async FAQ loading with spinner in FAQs tab
  • Details display immediately after /vlm/analyze completes
  • generateFaqs() API function calls new endpoint

Docs:

  • docs/API.md — new /vlm/faqs endpoint documented
  • PRD.md — FR-10 (FAQ Generation), FR-11 (Policy Compliance), user stories
  • README.md — FAQ and Policy Compliance in key features
  • AGENTS.md — added rule: never hardcode product-specific examples in prompts
  • docs/hallucination-report.md — full investigation with VLM test evidence
  • CLAUDE.md — references AGENTS.md

Tests (tests/test_vlm_unit.py):

  • 30 unit tests (new: structuring, empty data check, product data merge, full pipeline)
  • Updated existing tests for new VLM + structuring flow
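One of the new unit tests (the empty-product-data check) plausibly has this shape. The predicate and test names are hypothetical; only the behavior being tested comes from the PR.

```python
def has_content(product_data):
    # The "empty product data" check: enhancement runs only if at least one
    # user-provided field is non-blank after stripping whitespace.
    return bool(product_data) and any(str(v).strip() for v in product_data.values())

def test_empty_product_data_skips_enhancement():
    assert not has_content(None)
    assert not has_content({"title": "", "material": "  "})
    assert has_content({"title": "Tote bag"})
```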

Test results

  • pytest tests/test_vlm_unit.py — 30/30 passed
  • pytest tests/ — 176/176 passed
  • pnpm lint — no new errors
  • Eval suite (42 cases): 34 pass, 5 known issues, 1 locale regression, 2 mixed (ingredient hallucination from brand instructions)

Test plan

  • Unit tests pass (176/176)
  • Frontend lint clean
  • Eval suite run with quality checks
  • Manual test: upload product with no data — description should be factual, no label dumps
  • Manual test: upload with Spanish locale — output in Spanish
  • Manual test: upload with "synthetic leather" title — material preserved
  • Manual test: FAQs tab shows spinner then loads

🤖 Generated with Claude Code

- Short VLM prompt + LLM structuring step to reduce hallucinations
- Skip LLM enhancement when no product data (prevents fabrication)
- User data as merge base — materials trusted over VLM guesses,
  printed label text overrides user product type claims
- Separate /vlm/faqs endpoint for async FAQ loading
- Anti-hallucination guardrail in branding step
- Locale threaded through VLM structuring for proper localization
- Branding prompt respects user-requested format (paragraphs vs bullets)
- FAQ locale validation, empty product data check
- CLAUDE.md, AGENTS.md prompt rules, hallucination report, docs updates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@antoniomtz antoniomtz self-assigned this Apr 16, 2026
@antoniomtz antoniomtz added the enhancement New feature or request label Apr 16, 2026
@antoniomtz antoniomtz merged commit f649166 into main Apr 16, 2026
4 of 5 checks passed
@antoniomtz antoniomtz deleted the antoniomtz/vlm-hallucination-fixes branch April 16, 2026 22:52
