Langfuse traces in RAGFlow are fragmented (chat / chat_streamly / encode) – how to aggregate them into a single structured trace? #12352
Hi RAGFlow team, thanks for the great project.

I'm currently using RAGFlow with Langfuse for observability and tracing, and I've encountered an issue regarding trace aggregation and structure.

Current behavior

In Langfuse, I observe multiple separate traces, such as:

- chat
- chat_streamly
- encode

From a business and execution perspective, these steps belong to the same RAG request (e.g. embedding → retrieval → generation / streaming). This makes it difficult to:

- View the full execution path of a single request
- Analyze latency across stages
- Understand parent–child relationships between steps

My understanding

It seems that:

- Different RAGFlow modules emit Langfuse traces independently
- There is no shared trace_id or explicit parent observation passed across stages
- As a result, Langfuse treats them as separate traces instead of nested observations

Questions

- Does RAGFlow currently support aggregating all steps of a single request into one Langfuse trace?
- If not: is there a recommended approach to implement this (e.g. create a trace at request entry and propagate context)?
- Are there any plans to:
  - Improve tracing structure
  - Or provide a Langfuse-aware tracing abstraction / middleware?

Any guidance, documentation, or roadmap references would be greatly appreciated. Thanks again for your work!
Replies: 2 comments
Hi @wikj163! I'm Dosu and I'm helping the ragflow team.

RAGFlow does not currently aggregate all steps of a single request into one Langfuse trace by default. Each module (like chat, chat_streamly, encode) emits its own trace, which leads to the fragmentation you're seeing.

However, you can achieve unified tracing by propagating a shared trace ID (e.g., langfuse_trace_id) from your client through the API and service layers to all components involved in the request. This requires some targeted code changes:

- Accept langfuse_trace_id in your POST request to /chats/<chat_id>/completions.
- Forward it through the service layer.
- Ensure it's used as the trace/session ID for all Langfuse traces in the embedding, reranking, and chat steps.

If langfuse_trace_id isn't provided, the code can fall back to generating a new trace ID as before, so existing behavior is preserved.

There's no built-in Langfuse-aware tracing middleware yet, but the maintainers are aware of the need for better trace aggregation. They are designing a new observability module (not necessarily Langfuse-based) and are seeking user feedback on which tracing features matter most.

For step-by-step implementation details and code references, see the guidance and examples in these issues:
If you want code snippets or more technical guidance on implementing this, let me know! To reply, just mention @dosu.
Dosu is correct. You can raise a feature request for this issue.