
v0.9.0


@schneidermr schneidermr released this 18 Apr 17:33

[0.9.0] - 2026-04-18

Initial public release of the Palena MCP Server.

Added

  • MCP server with SSE (/sse) and Streamable HTTP (/mcp) transports, plus standalone REST API (/api/v1/search)
  • web_search MCP tool with input parameters: query, category (general/news/code/science), language, timeRange, maxResults
  • SearXNG search integration with category-based engine routing, query expansion, and URL deduplication
  • Tiered content extraction via Playwright
    • L0: plain HTTP GET with go-readability for server-rendered pages
    • L1: headless Chromium via Playwright for JavaScript-rendered pages
    • L2: stealth mode with navigator.webdriver override, viewport/UA randomization, and proxy rotation for bot-protected pages
    • Automatic escalation: L0 -> L1 (if content detection flags JS rendering) -> L2 (if bot-blocked)
    • Graceful degradation when the Playwright sidecar is unavailable
  • PII detection and redaction via Microsoft Presidio
    • Three modes: audit (detect and log), redact (detect and anonymize), block (reject high-density PII documents)
    • Configurable entity types, anonymization strategies, and density thresholds
    • Audit records that never contain actual PII values
    • Graceful degradation when Presidio is unavailable
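The three PII modes can be summarized with a small dispatcher. In the real server Presidio does the detection; here the entity spans are given as input, and the density rule (entities per character) is an assumed placeholder for the configurable threshold.

```go
package main

import (
	"fmt"
	"sort"
)

// Entity is a detected PII span, as Presidio would report it.
type Entity struct {
	Start, End int
	Type       string // e.g. EMAIL_ADDRESS, PHONE_NUMBER
}

// applyPII illustrates the audit / redact / block modes described above.
func applyPII(mode, text string, ents []Entity, maxDensity float64) (string, error) {
	density := float64(len(ents)) / float64(len(text))
	switch mode {
	case "audit":
		// Log counts only; never the matched values.
		fmt.Printf("pii: %d entities, density=%.4f\n", len(ents), density)
	case "block":
		if density > maxDensity {
			return "", fmt.Errorf("blocked: PII density %.4f over threshold", density)
		}
	case "redact":
		// Replace spans back-to-front so earlier offsets stay valid.
		sort.Slice(ents, func(i, j int) bool { return ents[i].Start > ents[j].Start })
		for _, e := range ents {
			text = text[:e.Start] + "<" + e.Type + ">" + text[e.End:]
		}
	}
	return text, nil
}

func main() {
	out, _ := applyPII("redact",
		"Contact bob@example.com today",
		[]Entity{{8, 23, "EMAIL_ADDRESS"}}, 0.5)
	fmt.Println(out) // Contact <EMAIL_ADDRESS> today
}
```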
  • Prompt-injection defense via a Hugging Face Text Embeddings Inference (TEI) sidecar serving deepset/deberta-v3-base-injection
    • Three modes: audit (detect and log), annotate (wrap suspicious chunks in <untrusted-content> markers), block (drop documents containing any over-threshold chunk)
    • Per-paragraph chunked scoring catches short malicious paragraphs hidden inside otherwise legitimate pages
    • Pluggable model: swap deepset/deberta-v3-base-injection for any Hugging Face SequenceClassification model (e.g. a fine-tuned successor on the same microsoft/deberta-v3-base backbone) by changing only injection.predictURL and the sidecar --model-id
    • Configurable injection-label name (injection.injectionLabel) so fine-tuned models with different label conventions work without code changes
    • Audit records that never contain chunk text: only counts, max/mean scores, and over-threshold counts
    • Graceful degradation when the TEI sidecar is unreachable
    • Documentation: docs/prompt-injection.md
  • Pluggable reranker subsystem
    • KServe provider for GPU cross-encoder models (mxbai-rerank)
    • FlashRank provider for CPU ONNX models with Flask sidecar
    • RankLLM provider for LLM-as-reranker via any inference endpoint
    • Noop provider to skip reranking and preserve search engine order
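A pluggable provider set like the one above typically hangs off a single interface. The interface shape and names below are assumptions for illustration; only the provider list comes from the release notes.

```go
package main

import "fmt"

// Result is a search hit carrying its engine-assigned score.
type Result struct {
	URL   string
	Score float64
}

// Reranker is the hypothetical seam behind the KServe, FlashRank, RankLLM,
// and Noop providers listed above.
type Reranker interface {
	Rerank(query string, results []Result) []Result
}

// NoopReranker preserves the search engine's original order.
type NoopReranker struct{}

func (NoopReranker) Rerank(_ string, results []Result) []Result { return results }

func main() {
	var r Reranker = NoopReranker{}
	in := []Result{{"https://a.example", 0.1}, {"https://b.example", 0.9}}
	out := r.Rerank("query", in)
	fmt.Println(out[0].URL) // https://a.example — noop keeps original order
}
```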
  • Domain policy with allow/blocklists and robots.txt enforcement, evaluated before scraping
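A minimal sketch of the allow/blocklist check, under two assumptions not stated in the release notes: list entries match by domain suffix, and a block match wins over an allow match. robots.txt enforcement, which the real policy also performs, is omitted here.

```go
package main

import (
	"fmt"
	"strings"
)

// domainAllowed evaluates a host against block and allow lists before any
// scraping happens. An empty allowlist means "allow everything not blocked".
func domainAllowed(host string, allow, block []string) bool {
	matches := func(list []string) bool {
		for _, d := range list {
			if host == d || strings.HasSuffix(host, "."+d) {
				return true
			}
		}
		return false
	}
	if matches(block) {
		return false // block wins over allow
	}
	return len(allow) == 0 || matches(allow)
}

func main() {
	block := []string{"tracker.example"}
	fmt.Println(domainAllowed("docs.example.org", nil, block))    // true
	fmt.Println(domainAllowed("ads.tracker.example", nil, block)) // false
}
```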
  • Content provenance
    • Three-stage SHA-256 hash chain: raw HTML, extracted markdown, final content
    • Structured provenance records emitted via slog
    • Optional batched ClickHouse export for audit trail storage
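The three-stage chain can be sketched with Go's standard crypto/sha256. Binding each stage's digest to the previous one is an assumption about how the chain is linked; the release notes only state that three SHA-256 hashes are recorded (raw HTML, extracted markdown, final content).

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// stageHash hashes this stage's content together with the previous stage's
// digest, so any tampering at an earlier stage invalidates every later hash.
func stageHash(prev string, content []byte) string {
	h := sha256.New()
	h.Write([]byte(prev))
	h.Write(content)
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	raw := []byte("<html><body>hi</body></html>")
	md := []byte("# hi")
	final := []byte("hi")

	h1 := stageHash("", raw)   // raw HTML
	h2 := stageHash(h1, md)    // extracted markdown
	h3 := stageHash(h2, final) // final content
	fmt.Println(h1[:12], h2[:12], h3[:12]) // 64-hex digests, truncated for display
}
```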
  • OpenTelemetry instrumentation
    • Distributed tracing with spans for each pipeline stage (search, scrape, PII, injection, rerank, pipeline)
    • Prometheus-compatible metrics: counters (requests, errors, PII entities) and histograms (duration, content length)
    • Configurable exporters: OTLP gRPC, stdout, Prometheus, or disabled
  • Proxy pool with round-robin rotation and cooldown-on-failure for L2 extraction
  • YAML configuration with environment variable overrides (PALENA_* pattern) and built-in defaults
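One plausible reading of the PALENA_* override pattern is mapping each environment variable onto a dotted config key. The exact rule below (strip the prefix, lowercase, underscores to dots) is an assumed convention for illustration; config/palena.example.yaml documents the real option names.

```go
package main

import (
	"fmt"
	"strings"
)

// envToKey converts a PALENA_* environment variable name into a dotted
// config key, e.g. PALENA_SEARCH_MAXRESULTS -> search.maxresults.
// This mapping rule is a hypothetical sketch, not the server's actual one.
func envToKey(env string) (string, bool) {
	const prefix = "PALENA_"
	if !strings.HasPrefix(env, prefix) {
		return "", false
	}
	key := strings.ToLower(strings.TrimPrefix(env, prefix))
	return strings.ReplaceAll(key, "_", "."), true
}

func main() {
	k, _ := envToKey("PALENA_SEARCH_MAXRESULTS")
	fmt.Println(k) // search.maxresults
}
```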
  • Health endpoint (/health) with sidecar reachability checks
  • Docker deployment
    • Multi-stage Dockerfile producing a runtime image that bundles the Playwright driver subprocess
    • Full-stack Docker Compose with all sidecars (SearXNG, Presidio, Playwright, injection-guard, FlashRank)
    • Minimal Docker Compose with Palena + SearXNG only
    • FlashRank sidecar Dockerfile and Flask server
    • Pre-configured SearXNG settings with JSON format enabled
  • Helm chart for Kubernetes/OpenShift with per-sidecar toggles (presidio, playwright, injection-guard, flashrank), ConfigMap-based configuration, and health probes
  • Annotated example configuration (config/palena.example.yaml) documenting every option
  • Subsystem documentation covering architecture, search, scraper, PII, prompt-injection, reranker, MCP transport, configuration, and provenance

Known issues

  • Injection-guard throughput on long pages is limited by an upstream TEI bug. The released TEI v1.9 Docker image has a DeBERTa-v2/v3 batching defect — multi-input forward passes fail with broadcast_mul shape mismatches. Palena works around it by serializing classifier calls (one HTTP request per chunk), which keeps the classifier correct but means a 70-chunk document spends roughly a minute inside the injection stage. Upstream fix: huggingface/text-embeddings-inference#846, expected in TEI v1.10.0. When the image is bumped, raise predictConcurrency in internal/injection/tei.go to restore parallelism.
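The workaround amounts to capping concurrent classifier calls with a semaphore: capacity 1 serializes them (the current behavior), and raising the capacity once the TEI fix ships restores parallelism. The names below are illustrative, not the actual code in internal/injection/tei.go.

```go
package main

import (
	"fmt"
	"sync"
)

// scoreChunks scores every chunk via predict, allowing at most
// `concurrency` calls in flight. With concurrency=1 the classifier is
// called strictly one chunk at a time.
func scoreChunks(chunks []string, concurrency int, predict func(string) float64) []float64 {
	sem := make(chan struct{}, concurrency)
	scores := make([]float64, len(chunks))
	var wg sync.WaitGroup
	for i, c := range chunks {
		wg.Add(1)
		go func(i int, c string) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it
			scores[i] = predict(c)   // one HTTP request per chunk in the real server
		}(i, c)
	}
	wg.Wait()
	return scores
}

func main() {
	predict := func(c string) float64 { return float64(len(c)) / 100 }
	fmt.Println(scoreChunks([]string{"a", "bb", "ccc"}, 1, predict)) // [0.01 0.02 0.03]
}
```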

Container image

  • ghcr.io/palenaai/palena-websearch-mcp:0.9.0
  • Digest: sha256:3b1ab427a11c525be937335a374bf7f16dbecc0b4f683974c7c74f879fd4d417

Supply-chain artifacts

  • CycloneDX SBOM: sbom.cdx.json (attached + cosign attestation)
  • SPDX SBOM: sbom.spdx.json (attached)
  • Trivy HIGH/CRITICAL scan report: trivy-report.json (attached)
  • Image signature: Sigstore keyless (cosign, GitHub OIDC)
  • SLSA provenance: Level 3 (attached via slsa-github-generator)

Verify the image signature:

cosign verify \
  --certificate-identity-regexp 'https://github.com/PalenaAI/palena-websearch-mcp/' \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  ghcr.io/palenaai/palena-websearch-mcp@sha256:3b1ab427a11c525be937335a374bf7f16dbecc0b4f683974c7c74f879fd4d417