RFC: Signed receipts for Haystack pipeline component calls #11039

@tomjwxf

Problem

Haystack pipelines chain multiple components (retrievers, generators, rankers) in production NLP workflows, but there is no cryptographic audit trail of component-level decisions. For enterprise RAG deployments, compliance teams need to prove: which retriever was used, what documents were ranked, what the generator produced, and that none of this was altered after the fact.

Proposal

Add an optional ReceiptSigningComponent (or pipeline middleware) that wraps Haystack components and emits Ed25519-signed receipts for each invocation:

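from haystack import Pipeline
from haystack.components.generators import OpenAIGenerator
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from protect_mcp import ReceiptMiddleware

pipe = Pipeline()
pipe.add_component("retriever", InMemoryBM25Retriever(...))
pipe.add_component("generator", OpenAIGenerator(...))
# Proposed in this RFC: add_middleware is not an existing Pipeline method.
pipe.add_middleware(ReceiptMiddleware(policy="enterprise.json"))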
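To make the "wraps Haystack components" idea concrete, here is a minimal plain-Python sketch of the interception step only. It assumes nothing beyond a component exposing a `run(**kwargs)` method that returns a dict; `RecordingWrapper`, `sha256_digest`, and `EchoRetriever` are hypothetical names for illustration, and the wrapper is not registered with a real `Pipeline`.

```python
import hashlib
import json
from typing import Any, Dict, List


def sha256_digest(value: Any) -> str:
    """Hex SHA-256 of a deterministic JSON encoding of `value`."""
    encoded = json.dumps(value, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


class RecordingWrapper:
    """Hypothetical wrapper: forwards run() and records one entry per call."""

    def __init__(self, name: str, inner: Any, log: List[Dict[str, str]]) -> None:
        self.name = name
        self.inner = inner
        self.log = log

    def run(self, **kwargs: Any) -> Dict[str, Any]:
        # Call the wrapped component unchanged, then record what went in and out.
        output = self.inner.run(**kwargs)
        self.log.append({
            "component": self.name,
            "type": type(self.inner).__name__,
            "input_sha256": sha256_digest(kwargs),
            "output_sha256": sha256_digest(output),
        })
        return output


class EchoRetriever:
    """Stand-in component used only for this example."""

    def run(self, query: str) -> Dict[str, Any]:
        return {"documents": [f"doc about {query}"]}


log: List[Dict[str, str]] = []
wrapped = RecordingWrapper("retriever", EchoRetriever(), log)
result = wrapped.run(query="receipts")
```

The wrapped call returns the same output as the inner component, so callers are unaffected; the log only stores digests, not the payloads themselves.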

Each receipt captures:

  • Component name and type
  • Input/output SHA-256 digests
  • Policy evaluation result (allow/deny)
  • Ed25519 signature
  • OpenTelemetry trace/span IDs for correlation
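The fields above could be assembled into a receipt roughly as follows. This is a sketch under stated assumptions: the field names, `canonical_bytes`, `sign_receipt`, and `verify_receipt` are hypothetical, the trace and span IDs are placeholder values, and the signer is injected as a callable. To keep the example dependency-free it uses an HMAC-SHA256 stand-in, which is not Ed25519; a real deployment would pass in an Ed25519 signer instead.

```python
import hashlib
import hmac
import json
from typing import Any, Callable, Dict


def canonical_bytes(payload: Dict[str, str]) -> bytes:
    """Deterministic encoding so signer and verifier hash identical bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_receipt(payload: Dict[str, str], sign: Callable[[bytes], bytes]) -> Dict[str, Any]:
    """Attach a hex signature computed over the canonical payload."""
    return {"payload": payload, "signature": sign(canonical_bytes(payload)).hex()}


def verify_receipt(receipt: Dict[str, Any], verify: Callable[[bytes, bytes], bool]) -> bool:
    """Recompute the canonical bytes and check the signature against them."""
    message = canonical_bytes(receipt["payload"])
    return verify(message, bytes.fromhex(receipt["signature"]))


# Stand-in signer for the demo only: HMAC-SHA256, NOT Ed25519.
DEMO_KEY = b"demo-key-not-for-production"


def demo_sign(message: bytes) -> bytes:
    return hmac.new(DEMO_KEY, message, hashlib.sha256).digest()


def demo_verify(message: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(demo_sign(message), signature)


payload = {
    "component": "retriever",
    "type": "InMemoryBM25Retriever",
    "input_sha256": hashlib.sha256(b'{"query":"receipts"}').hexdigest(),
    "output_sha256": hashlib.sha256(b'{"documents":[]}').hexdigest(),
    "policy_decision": "allow",
    "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",  # placeholder value
    "span_id": "00f067aa0ba902b7",  # placeholder value
}
receipt = sign_receipt(payload, demo_sign)

# Changing any field after signing invalidates the signature.
tampered = {
    "payload": {**payload, "policy_decision": "deny"},
    "signature": receipt["signature"],
}
```

With the `cryptography` package, the injected callables could wrap `Ed25519PrivateKey.sign(data)` and `Ed25519PublicKey.verify(signature, data)`; the latter raises `InvalidSignature` rather than returning a boolean, so the verifier would need to catch it. Signing a canonical encoding matters because two JSON serializations of the same receipt can differ byte for byte.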

Why Haystack

Haystack already has excellent pipeline telemetry and component tracing. Receipt signing is a natural extension: it turns existing observability data into verifiable evidence that anyone can validate independently and offline.

Happy to discuss architecture and submit a PR.

Labels: P2 (Medium priority, add to the next sprint if no P1 available)
