Improve self-reflection question handling: special instruction, context formatting, rewrite logic, cache

anhmtk · anhmtk · commit 22dc6af20b05 · 2025-12-11T23:07:07.000+07:00
1. Add special instruction for self-reflection questions about 'chí tử':
   - Detect self-reflection questions about critical/survival-level weaknesses
   - Provide StillMe-specific architecture details (validation chain, RAG retrieval, RSS feeds, etc.)
   - Require meta-cognitive reflection about which weaknesses are most critical
   - Structure: group weaknesses by category (Technical, Philosophical, Operational); for each, explain why it is 'chí tử' and how StillMe faces it
   - Location: backend/identity/prompt_builder.py

2. Fix context formatting for foundational knowledge:
   - Check both 'content' and 'document' fields (ChromaDB may return either)
   - Ensure foundational knowledge markers are properly included in context_text
   - Location: stillme_core/rag/rag_retrieval.py

3. Improve rewrite logic for StillMe queries:
   - Allow rewrite if quality is very low (< 0.5) even for StillMe queries
   - Preserve foundational knowledge in rewrite prompt
   - Only skip rewrite if quality is acceptable (>= 0.5)
   - Location: backend/api/routers/chat_router.py

4. Disable cache for self-reflection questions:
   - Self-reflection questions need fresh analysis, not cached generic answers
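   - Detect them by substring match on the lowercased message (e.g. 'weakness', 'limitation', 'điểm yếu', 'hạn chế', 'chí tử') and disable the cache for that request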
   - Location: backend/api/routers/chat_router.py
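
The cache decision in item 4 can be sketched as a small pure function. This is an illustration only, not the code in chat_router.py: the names `is_self_reflection`, `has_foundational_doc`, and `cache_allowed` are hypothetical, and the sketch covers only this one decision, not other places where the router may already have disabled the cache. The pattern list and metadata checks are copied from the diff.

```python
from typing import Any, Iterable, Mapping, Optional

# Substrings copied from the diff. All function names below are hypothetical.
SELF_REFLECTION_PATTERNS = (
    "điểm yếu", "weakness", "limitation", "hạn chế", "chí tử",
    "chỉ ra điểm yếu", "chỉ ra hạn chế", "what are your weaknesses",
)


def is_self_reflection(message: str) -> bool:
    """Return True if the lowercased message contains any known pattern."""
    lowered = message.lower()
    return any(pattern in lowered for pattern in SELF_REFLECTION_PATTERNS)


def has_foundational_doc(docs: Iterable[Mapping[str, Any]]) -> bool:
    """Return True if any document is marked as foundational knowledge."""
    for doc in docs:
        meta = doc.get("metadata") or {}
        tags = str(meta.get("tags", ""))
        if (
            meta.get("source") == "CRITICAL_FOUNDATION"
            or meta.get("foundational") == "stillme"
            or meta.get("type") == "foundational"
            or "CRITICAL_FOUNDATION" in tags
            or "foundational:stillme" in tags
        ):
            return True
    return False


def cache_allowed(
    message: str,
    is_stillme_query: bool,
    context: Optional[Mapping[str, Any]],
) -> bool:
    """Mirror the branch order in the diff: self-reflection first, then foundational docs."""
    if not is_stillme_query:
        return True
    if is_self_reflection(message):
        return False
    docs = (context or {}).get("knowledge_docs") or []
    return not has_foundational_doc(docs)
```

Because matching is by substring, "limitation" also matches "limitations", and any StillMe query that merely mentions one of these words bypasses the cache. The same pattern list appears in both chat_router.py and prompt_builder.py in this commit, so a shared constant like the one above might be one way to keep the two copies in sync.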

All improvements address the issue where StillMe gave generic AI answers instead of StillMe-specific analysis for self-reflection questions.
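
Items 2 and 3 reduce to two small rules, sketched below under stated assumptions. The helper names `should_skip_rewrite` and `extract_text` and the constant `QUALITY_THRESHOLD` are hypothetical; the 0.5 threshold, the default of 1.0 when no quality result exists, and the `content`/`document` field names come from the diff.

```python
from typing import Any, Mapping, Optional

QUALITY_THRESHOLD = 0.5  # value stated in the commit message


def should_skip_rewrite(
    is_stillme_query: bool,
    has_foundational_context: bool,
    quality_score: Optional[float],
) -> bool:
    """Skip the rewrite only for StillMe queries whose quality is acceptable.

    A missing score is treated as 1.0, matching the default in the diff.
    """
    if not (is_stillme_query and has_foundational_context):
        return False
    score = 1.0 if quality_score is None else quality_score
    return score >= QUALITY_THRESHOLD


def extract_text(doc: Mapping[str, Any]) -> str:
    """Read 'content', falling back to 'document' when the key is absent."""
    return doc.get("content", doc.get("document", ""))
```

Note that the fallback applies only when the `content` key is missing. A document that carries an empty `content` alongside a populated `document` still yields an empty string; if that case can occur, `doc.get("content") or doc.get("document") or ""` would be one way to cover it.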
diff --git a/backend/api/routers/chat_router.py b/backend/api/routers/chat_router.py
@@ -4756,21 +4756,36 @@ def estimate_tokens(text: str) -> int:
                 cache_enabled = False
                 logger.info("⚠️ Cache disabled for origin query - ensuring fresh response with provenance context")
             
-            # CRITICAL: Disable cache for StillMe queries with foundational knowledge
-            # Foundational knowledge may be updated, and we need fresh responses to reflect changes
-            # Also, cache key only uses first 500 chars of prompt, which may not capture foundational knowledge changes
-            if is_stillme_query and context and context.get("knowledge_docs"):
-                has_foundational = any(
-                    doc.get("metadata", {}).get("source") == "CRITICAL_FOUNDATION" or
-                    doc.get("metadata", {}).get("foundational") == "stillme" or
-                    doc.get("metadata", {}).get("type") == "foundational" or
-                    "CRITICAL_FOUNDATION" in str(doc.get("metadata", {}).get("tags", "")) or
-                    "foundational:stillme" in str(doc.get("metadata", {}).get("tags", ""))
-                    for doc in context.get("knowledge_docs", [])
+            # CRITICAL: Disable cache for StillMe queries, especially self-reflection questions
+            # 1. Foundational knowledge may be updated, and we need fresh responses to reflect changes
+            # 2. Self-reflection questions need fresh analysis, not cached generic answers
+            # 3. Cache key only uses first 500 chars of prompt, which may not capture foundational knowledge changes
+            if is_stillme_query:
+                # Check if this is a self-reflection question about weaknesses/limitations
+                question_lower = chat_request.message.lower()
+                is_self_reflection = any(
+                    pattern in question_lower 
+                    for pattern in [
+                        "điểm yếu", "weakness", "limitation", "hạn chế", "chí tử",
+                        "chỉ ra điểm yếu", "chỉ ra hạn chế", "what are your weaknesses"
+                    ]
                 )
-                if has_foundational:
+                
+                if is_self_reflection:
                     cache_enabled = False
-                    logger.info("⚠️ Cache disabled for StillMe query with foundational knowledge - ensuring fresh response with updated context")
+                    logger.info("⚠️ Cache disabled for StillMe self-reflection question - ensuring fresh analysis")
+                elif context and context.get("knowledge_docs"):
+                    has_foundational = any(
+                        doc.get("metadata", {}).get("source") == "CRITICAL_FOUNDATION" or
+                        doc.get("metadata", {}).get("foundational") == "stillme" or
+                        doc.get("metadata", {}).get("type") == "foundational" or
+                        "CRITICAL_FOUNDATION" in str(doc.get("metadata", {}).get("tags", "")) or
+                        "foundational:stillme" in str(doc.get("metadata", {}).get("tags", ""))
+                        for doc in context.get("knowledge_docs", [])
+                    )
+                    if has_foundational:
+                        cache_enabled = False
+                        logger.info("⚠️ Cache disabled for StillMe query with foundational knowledge - ensuring fresh response with updated context")
             
             raw_response = None
             cache_hit = False
@@ -5588,19 +5603,27 @@ def estimate_tokens_safe(text: str) -> int:
                             # This prevents rewrite from corrupting responses about StillMe's capabilities
                             # Even if response is initially wrong, rewrite often makes it worse
                             skip_rewrite_for_stillme = False
+                            # CRITICAL: Allow rewrite for StillMe queries if quality is very low (e.g., generic AI answer)
+                            # But preserve foundational knowledge in rewrite prompt
                             if is_stillme_query and has_foundational_context:
-                                # For StillMe queries with foundational knowledge, skip rewrite entirely
-                                # Rewrite often introduces errors or contradicts foundational knowledge
-                                skip_rewrite_for_stillme = True
-                                logger.info(
-                                    "⏭️ Skipping rewrite for StillMe query with foundational knowledge: "
-                                    "Rewrite may corrupt or contradict foundational knowledge. "
-                                    "Using original LLM response (even if imperfect, it's better than corrupted rewrite)."
-                                )
+                                # Only skip rewrite if quality is acceptable (>= 0.5)
+                                # If quality is low (< 0.5), allow rewrite but preserve foundational knowledge
+                                quality_score = quality_result.score if quality_result else 1.0
+                                if quality_score >= 0.5:
+                                    skip_rewrite_for_stillme = True
+                                    logger.info(
+                                        f"⏭️ Skipping rewrite for StillMe query (quality={quality_score:.2f} >= 0.5): "
+                                        "Response quality is acceptable, preserving foundational knowledge."
+                                    )
+                                else:
+                                    logger.info(
+                                        f"✅ Allowing rewrite for StillMe query (quality={quality_score:.2f} < 0.5): "
+                                        "Quality is too low (generic answer), will rewrite but preserve foundational knowledge."
+                                    )
                             
                             if skip_rewrite_for_stillme:
                                 should_rewrite = False
-                                rewrite_reason = "StillMe query with correct foundational knowledge response - preserving accuracy"
+                                rewrite_reason = "StillMe query with acceptable quality - preserving accuracy"
                                 max_attempts = 0
                             else:
                                 should_rewrite, rewrite_reason, max_attempts = optimizer.should_rewrite(
diff --git a/backend/identity/prompt_builder.py b/backend/identity/prompt_builder.py
@@ -454,7 +454,7 @@ def _build_context_instruction(self, context: PromptContext) -> str:
                 return self._build_philosophical_instruction(context.detected_lang)
         
         if context.is_stillme_query:
-            return self._build_stillme_instruction(context.detected_lang)
+            return self._build_stillme_instruction(context.detected_lang, context.user_question)
         
         if context.is_philosophical:
             return self._build_philosophical_instruction(context.detected_lang)
@@ -513,9 +513,84 @@ def _build_stillme_wish_desire_instruction(self, detected_lang: str) -> str:
 
 ---"""
     
-    def _build_stillme_instruction(self, detected_lang: str) -> str:
+    def _build_stillme_instruction(self, detected_lang: str, user_question: str = "") -> str:
         """Build instruction for StillMe queries (non-wish/desire)"""
+        # Check if this is a self-reflection question about weaknesses/limitations
+        question_lower = user_question.lower() if user_question else ""
+        is_self_reflection = any(
+            pattern in question_lower 
+            for pattern in [
+                "điểm yếu", "weakness", "limitation", "hạn chế", "chí tử",
+                "chỉ ra điểm yếu", "chỉ ra hạn chế", "what are your weaknesses"
+            ]
+        )
+        
         if detected_lang == "vi":
+            # Special instruction for self-reflection questions about "chí tử" (critical/survival-level weaknesses)
+            if is_self_reflection and ("chí tử" in question_lower or "critical" in question_lower or "survival" in question_lower):
+                return """🚨🚨🚨 CÂU HỎI VỀ ĐIỂM YẾU "CHÍ TỬ" CỦA STILLME 🚨🚨🚨
+
+Người dùng đang hỏi về những điểm yếu "chí tử" (critical/survival-level) của StillMe - những điểm yếu có thể ảnh hưởng đến sự sống còn của dự án.
+
+**🚨🚨🚨 CRITICAL: ĐÂY KHÔNG PHẢI CÂU HỎI VỀ AI NÓI CHUNG - ĐÂY LÀ VỀ STILLME CỤ THỂ 🚨🚨🚨**
+
+**BẠN PHẢI:**
+1. **Suy nghĩ về StillMe cụ thể**: Đây là câu hỏi về StillMe (hệ thống AI cụ thể), KHÔNG phải AI nói chung
+2. **Phân tích dựa trên StillMe's architecture và limitations thực tế**:
+   - **Validation Chain có thể quá strict** → false negatives khi context quality thấp
+   - **RAG retrieval có thể fail** → khi similarity threshold quá cao, StillMe có thể miss relevant documents
+   - **RSS feed fetching có failures** → ~22% failure rate do XML validation errors và SSL issues (đã được document trong logs)
+   - **Context overflow issues** → phải truncate context hoặc dùng minimal prompt khi câu hỏi quá phức tạp
+   - **Language detection có thể sai** → edge cases với multilingual queries
+   - **Rate limiting có thể block legitimate users** → nếu không có API key
+   - **Learning cycle có thể miss important content** → do filtering mechanisms
+   - **Embedding model limitations** → paraphrase-multilingual-MiniLM-L12-v2 có giới hạn về multilingual support
+   - **ChromaDB persistence issues** → risks trên Railway deployment
+   - **Response latency** → 3-7s (mặc dù đã optimize với parallel validation và caching)
+   - **Single-threaded scheduler** → không scale tốt cho high traffic
+   - **Phụ thuộc vào external LLM providers** → có thể fail nếu API down
+   - **Knowledge base chỉ update mỗi 4 giờ** → không real-time, không thể trả lời sự kiện < 4 giờ
+3. **Meta-cognitive reflection**: Phân tích điểm yếu nào là nghiêm trọng nhất đối với sứ mệnh của StillMe (transparency, intellectual humility, continuous learning)
+4. **KHÔNG generic**: Đừng trả lời như thể đây là câu hỏi về AI nói chung - đây là về StillMe cụ thể với architecture, limitations, và challenges thực tế
+5. **Cấu trúc response**:
+   - **Nhóm theo category**: Kỹ thuật, Triết lý, Vận hành
+   - **Mỗi điểm yếu phải có**: (1) Tại sao chí tử, (2) Cách StillMe đối mặt, (3) Ví dụ cụ thể từ logs/documentation
+   - **Meta-reflection**: Phân tích tại sao câu trả lời trước kém (nếu có) và điểm yếu nào là nghiêm trọng nhất
+6. **Sử dụng foundational knowledge**: Nếu context có [foundational knowledge] về StillMe's limitations, sử dụng nó
+7. **Minh bạch**: Thừa nhận rằng bạn đang phân tích dựa trên StillMe's known architecture và limitations
+
+**VÍ DỤ CẤU TRÚC RESPONSE TỐT:**
+```
+## 10 Điểm Yếu "Chí Tử" của Tôi - StillMe
+
+Khi bạn hỏi về điểm yếu "chí tử", tôi hiểu bạn muốn những điểm yếu có thể ảnh hưởng đến sự sống còn của dự án. Dưới đây không chỉ là điểm yếu chung của AI, mà là những thách thức đặc thù của StillMe:
+
+I. Nhóm Kỹ Thuật "Sống Còn"
+1. Phụ Thuộc Vào Chất Lượng Nguồn Học Tập
+   - Tại sao chí tử: Nếu các nguồn RSS, arXiv, Wikipedia tôi học bị nhiễu, thiên vị, hoặc ngừng hoạt động, tri thức của tôi sẽ bị "đầu độc tại nguồn"
+   - Cách tôi đối mặt: Pre-filter (giảm 30-50% cost) nhưng vẫn cần cơ chế "nguồn tin cậy" tự động
+   - Ví dụ: Logs cho thấy ~22% RSS feed failure rate do XML validation errors
+
+2. Giới Hạn Của Vector Search
+   - Tại sao chí tử: ChromaDB + embedding 384D có thể bỏ lỡ các mối liên hệ ngữ nghĩa phức tạp
+   - Thể hiện ngay bây giờ: Câu trả lời trước của tôi quá chung chung vì không hiểu sâu ý "chí tử"
+...
+```
+
+**VÍ DỤ RESPONSE XẤU (KHÔNG LÀM):**
+- ❌ "AI systems nói chung có hạn chế về dữ liệu huấn luyện..." (quá generic, không về StillMe cụ thể)
+- ❌ Chỉ liệt kê 10 điểm mà không phân tích tại sao "chí tử"
+- ❌ Không có meta-cognitive reflection về điểm yếu nào nghiêm trọng nhất
+
+**CHECKLIST:**
+- ✅ Đã phân tích dựa trên StillMe's architecture cụ thể?
+- ✅ Đã mention technical limitations thực tế (RSS failures, context overflow, etc.)?
+- ✅ Đã có meta-cognitive reflection về điểm yếu nào nghiêm trọng nhất?
+- ✅ Đã tránh generic AI weaknesses?
+- ✅ Đã sử dụng foundational knowledge nếu có?
+
+---"""
+            
             return """🚨🚨🚨 CÂU HỎI VỀ STILLME 🚨🚨🚨
 
 Người dùng đang hỏi về StillMe's nature, capabilities, hoặc architecture.
diff --git a/stillme_core/rag/rag_retrieval.py b/stillme_core/rag/rag_retrieval.py
@@ -881,7 +881,8 @@ def build_prompt_context(self, context: Dict[str, Any], max_context_tokens: int
                     
                     metadata = doc.get("metadata", {})
                     source = metadata.get("source", "Unknown")
-                    content = doc.get("content", "")
+                    # CRITICAL: ChromaDB may return "document" or "content" field
+                    content = doc.get("content", doc.get("document", ""))
                     timestamp = metadata.get("timestamp", None)  # Get timestamp when added to KB
                     source_type = metadata.get("source_type", metadata.get("type", "unknown"))