Weaviate: stale Redis cache causes _create_collection() skip, auto-schema infers wrong types, vectors silently lost #32458

@lagameon

Description

Self Checks

  • This is only a bug report, not seeking help or asking questions.
  • I have searched for existing issues, including closed ones.
  • I have read the README carefully and confirmed this is a bug.

Dify version

v1.13.0 (self-hosted Docker), Weaviate server 1.27.0, Weaviate Python client 4.17.0

Cloud or Self-Hosted

Self-Hosted (Docker Compose)

Steps to reproduce

  1. Create a Knowledge Base using Weaviate as the vector store
  2. Index documents successfully (Weaviate class is created with correct schema by _create_collection(), Redis cache key vector_indexing_{collection_name} is set with TTL=3600s)
  3. Delete the Weaviate class externally (e.g., via REST API, or Weaviate container restart with ephemeral storage) while the Redis cache key is still alive
  4. Trigger document re-indexing (e.g., via update-by-text API or manual re-index from UI)
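For step 3, one way to delete the class externally is Weaviate's REST schema endpoint. The sketch below only builds the request and does not send it. The base URL `http://localhost:8080` and the class name `Vector_index_demo_Node` are placeholders for illustration, not values from this report.

```python
import urllib.request


def build_delete_class_request(base_url: str, class_name: str) -> urllib.request.Request:
    """Build (but do not send) a DELETE request for a Weaviate class."""
    url = f"{base_url.rstrip('/')}/v1/schema/{class_name}"
    return urllib.request.Request(url, method="DELETE")


# Placeholder host and class name; substitute your own deployment's values.
req = build_delete_class_request("http://localhost:8080", "Vector_index_demo_Node")
print(req.get_method(), req.full_url)

# To actually delete the class, send the request while the Redis key is still alive:
# urllib.request.urlopen(req)
```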

Expected: _create_collection() detects the class is missing and creates it with the correct schema (doc_id: DataType.TEXT)

Actual: _create_collection() checks Redis cache first (line 179), finds the stale key, and returns immediately — skipping collection creation entirely. The subsequent add_texts() batch insert triggers Weaviate's auto-schema feature, which infers property types from the first object's values. Since doc_id, document_id, and dataset_id contain UUID-formatted strings, auto-schema types them as uuid instead of text.
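To illustrate why UUID-formatted strings end up typed as uuid, here is a rough sketch of value-based type inference. It is a simplification written for this report, not Weaviate's actual auto-schema code; the function name `infer_property_type` and the sample values are hypothetical.

```python
import re

# Canonical 8-4-4-4-12 hex layout; an assumption about what "UUID-formatted" means here.
_UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def infer_property_type(value: object) -> str:
    """Guess a property type from one sample value (simplified stand-in)."""
    if isinstance(value, str):
        return "uuid" if _UUID_RE.match(value) else "text"
    return "unknown"


# Made-up first object of a batch: doc_id merely looks like a UUID.
first_object = {
    "doc_id": "3f2b8c1e-9d4a-4e7b-8a61-5c0d2e9f1a34",
    "text": "hello world",
}
inferred = {name: infer_property_type(value) for name, value in first_object.items()}
print(inferred)
```

Because inference only sees values, it cannot know that Dify intends doc_id to be free-form text; declaring the schema up front is what avoids the mismatch.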

What went wrong

The Weaviate client v4 batch handler sends objects via gRPC. With doc_id typed as uuid in the schema, some values fail Weaviate's strict UUID validation (or certain batch operations hit validation edge cases), and Weaviate rejects those objects with HTTP 422 errors. The client's dynamic batch handler swallows these errors silently: no exception is raised and no error is logged.

Result: All objects are created in Weaviate without vectors (vec_len=0). Dify marks the documents as indexing_status: completed (because embedding generation succeeded), but RAG retrieval returns 0 results because there are no vectors to search against. The UI shows everything as "available" while the knowledge base is completely non-functional.
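Since the UI gives no hint, a small health check that counts vector-less objects can surface the problem. The sketch below assumes each object exposes a `vector` mapping of name to list of floats, which is how I understand the v4 client returns objects when vectors are requested; the helper name `count_missing_vectors` and the sample data are hypothetical.

```python
from types import SimpleNamespace
from typing import Any, Iterable


def count_missing_vectors(objects: Iterable[Any]) -> tuple[int, int]:
    """Return (total, missing), where missing counts objects with no vector values."""
    total = 0
    missing = 0
    for obj in objects:
        total += 1
        vectors = getattr(obj, "vector", None) or {}
        # An object counts as vector-less when every named vector is empty.
        if not any(len(v) > 0 for v in vectors.values()):
            missing += 1
    return total, missing


# Stand-in objects; with a real client they might come from iterating the
# collection with vectors included (assumption, not verified against Dify).
sample = [
    SimpleNamespace(vector={"default": [0.1, 0.2, 0.3]}),
    SimpleNamespace(vector={}),
    SimpleNamespace(vector={"default": []}),
]
total, missing = count_missing_vectors(sample)
print(f"{missing}/{total} objects have no vector")
```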

Root cause analysis

In api/core/rag/datasource/vdb/weaviate/weaviate_vector.py:

def _create_collection(self):
    lock_name = f"vector_indexing_lock_{self._collection_name}"
    with redis_client.lock(lock_name, timeout=20):
        cache_key = f"vector_indexing_{self._collection_name}"
        if redis_client.get(cache_key):   # <-- stale key still set, so this is truthy
            return                         # <-- skips everything!

        # ... collection creation with correct schema never runs ...
        # ... including: wc.Property(name="doc_id", data_type=wc.DataType.TEXT) ...

The Redis cache key has a 3600s TTL, but nothing checks whether the Weaviate class it refers to still exists. If the class is deleted externally (container restart, manual cleanup, debugging), the key goes stale and prevents proper schema creation.

Evidence from debugging

Auto-schema created class (wrong — doc_id as uuid):

Class: Vector_index_061fa672_..._Node
  description: "This property was generated by Weaviate's auto-schema feature"
  document_id: ['uuid']    ← should be text
  doc_id: ['uuid']         ← should be text  
  dataset_id: ['uuid']     ← acceptable (actually is UUID)
  text: ['text']

Correct schema (when _create_collection() runs properly):

Class: Vector_index_061fa672_..._Node
  document_id: ['text']    ✅
  doc_id: ['text']         ✅
  chunk_index: ['int']     ✅ (missing in auto-schema)
  dataset_id: ['uuid']     (auto-added by _ensure_properties)
  text: ['text']
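The two listings can be compared mechanically. This is a minimal sketch that diffs a name-to-type mapping against the expected types from the correct schema above; `diff_schema` is a hypothetical helper, and the dictionaries are transcribed by hand from the listings rather than fetched from Weaviate.

```python
# Expected property types, taken from the correct schema listing.
EXPECTED_TYPES = {
    "document_id": "text",
    "doc_id": "text",
    "chunk_index": "int",
    "text": "text",
}


def diff_schema(expected: dict[str, str], actual: dict[str, str]) -> list[str]:
    """List properties that are missing or have an unexpected type."""
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems.append(f"{name}: missing (expected {want})")
        elif got != want:
            problems.append(f"{name}: {got} (expected {want})")
    return problems


# Property types from the auto-schema listing.
auto_schema = {
    "document_id": "uuid",
    "doc_id": "uuid",
    "dataset_id": "uuid",
    "text": "text",
}
problems = diff_schema(EXPECTED_TYPES, auto_schema)
for line in problems:
    print(line)
```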

Suggested fix

Add a class existence check alongside the Redis cache check:

def _create_collection(self):
    lock_name = f"vector_indexing_lock_{self._collection_name}"
    with redis_client.lock(lock_name, timeout=20):
        cache_key = f"vector_indexing_{self._collection_name}"
        if redis_client.get(cache_key):
            # Validate the class still exists in Weaviate
            if self._client.collections.exists(self._collection_name):
                return
            # Stale cache — class was deleted, clear cache and recreate
            redis_client.delete(cache_key)

        try:
            if not self._client.collections.exists(self._collection_name):
                # ... create collection with correct schema ...

This ensures the cache is invalidated if the Weaviate class no longer exists.
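The control flow of the suggested fix can be exercised without a running Redis or Weaviate. The sketch below uses in-memory stand-ins (`FakeRedis`, `FakeCollections`, both hypothetical) and a free function `ensure_collection` that mirrors the proposed logic. The Redis lock and the real schema definition are left out, and the collection name is a placeholder.

```python
class FakeRedis:
    """In-memory stand-in for the few Redis calls used here (TTL is ignored)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ex=None):
        self._data[key] = value

    def delete(self, key):
        self._data.pop(key, None)


class FakeCollections:
    """In-memory stand-in for a collections API with exists/create/delete."""

    def __init__(self):
        self._names = set()

    def exists(self, name):
        return name in self._names

    def create(self, name):
        self._names.add(name)

    def delete(self, name):
        self._names.discard(name)


def ensure_collection(redis_client, collections, name):
    """Return True if the collection had to be (re)created."""
    cache_key = f"vector_indexing_{name}"
    if redis_client.get(cache_key):
        if collections.exists(name):
            return False  # cache is valid, nothing to do
        redis_client.delete(cache_key)  # stale key: class is gone
    created = False
    if not collections.exists(name):
        collections.create(name)  # the real code would pass the explicit schema
        created = True
    redis_client.set(cache_key, 1, ex=3600)
    return created


redis_client = FakeRedis()
collections = FakeCollections()
name = "Vector_index_demo_Node"  # placeholder name

first = ensure_collection(redis_client, collections, name)   # initial creation
collections.delete(name)                                     # external deletion, key survives
second = ensure_collection(redis_client, collections, name)  # stale key detected
third = ensure_collection(redis_client, collections, name)   # cache valid again
print(first, second, third)
```

The extra `exists()` call costs one round trip per cache hit, which seems a reasonable price for never trusting a key that outlives the class it describes.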

Impact

  • Severity: High — complete RAG failure with no visible error
  • Detectability: Very low — Dify UI shows all documents as "completed/available"
  • User experience: Users see "I can't answer your question" from the chatbot with no indication that vectors are missing

Labels: 🐞 bug