artefactual-labs
diff --git a/.goreleaser.yml b/.goreleaser.yml
Lines changed: 4 additions & 0 deletions
diff --git a/Makefile b/Makefile
Lines changed: 11 additions & 2 deletions
diff --git a/README.md b/README.md
Lines changed: 149 additions & 49 deletions
@@ -68,6 +68,10 @@ nfpms:
       - src: web/assets/altcha
         dst: /etc/haproxy/assets/altcha
         type: tree
+      # BotD JS asset (versioned) and VERSION file; served via /assets/botd/active/botd.esm.js
+      - src: web/assets/botd
+        dst: /etc/haproxy/assets/botd
+        type: tree
     dependencies:
       - haproxy
     scripts:
 
@@ -1,6 +1,6 @@
 BIN := bin/cookie-guard-spoa
 
-.PHONY: all build clean altcha-assets install-altcha-assets altcha-go-bump
+.PHONY: all build clean altcha-assets install-altcha-assets altcha-go-bump botd-assets install-botd-assets
 
 all: build
 
@@ -13,8 +13,9 @@ build:
 clean:
 	rm -rf bin dist
 
-# ----- ALTCHA assets management -----
+# ----- ALTCHA / BotD assets management -----
 ALTCHA_VER := $(shell sed -n '1p' web/assets/altcha/VERSION 2>/dev/null || echo unset)
+BOTD_VER := $(shell sed -n '1p' web/assets/botd/VERSION 2>/dev/null || echo unset)
 
 altcha-assets:
 	@[ -x tools/altcha-sync.sh ] || chmod +x tools/altcha-sync.sh || true
@@ -24,6 +25,14 @@ install-altcha-assets:
 	install -d -m0755 /etc/haproxy/assets/altcha
 	cp -a web/assets/altcha/* /etc/haproxy/assets/altcha/
 
+botd-assets:
+	@[ -x tools/botd-sync.sh ] || chmod +x tools/botd-sync.sh || true
+	./tools/botd-sync.sh $(BOTD_VER)
+
+install-botd-assets:
+	install -d -m0755 /etc/haproxy/assets/botd
+	cp -a web/assets/botd/* /etc/haproxy/assets/botd/
+
 altcha-go-bump:
 	@if [ -z "$(VERSION)" ]; then echo "usage: make altcha-go-bump VERSION=vX.Y.Z"; exit 2; fi
 	go get github.com/altcha-org/altcha-lib-go@$(VERSION)
 
@@ -2,7 +2,12 @@
 
 `cookie-guard-spoa` is an HAProxy SPOE (Stream Processing Offload Engine) agent that issues and validates HMAC‑signed cookies.
 
-It ships with a privacy‑friendly browser challenge powered by ALTCHA and dedicated endpoints built into the agent. HAProxy serves a small HTML page and the agent verifies the puzzle solution, issuing the `hb_v2` cookie on success. Only clients presenting a valid cookie reach your backend.
+It ships with two first-party protections enabled by default:
+
+- **ALTCHA** – a lightweight, open-source puzzle that proves the visitor can execute JavaScript, persist cookies, and solve a human-friendly challenge before the origin ever sees the request.
+- **BotD (FingerprintJS)** – a local copy of the BotD detector that fingerprints the browser for automation traits (headless Chrome, Selenium drivers, emulators) and reports the verdict back to Cookie Guard, allowing HAProxy or downstream SPOEs to block, throttle, or log suspect sessions.
+
+HAProxy serves a small HTML page that embeds both protections. The agent verifies the ALTCHA solution, ingests the BotD verdict, and issues the `hb_v2` cookie on success. Only clients presenting a valid cookie reach your backend unless you explicitly disable the challenge or BotD via CLI flags.
 
 Learn more about ALTCHA:
 
@@ -16,14 +21,14 @@ Learn more about ALTCHA:
 
 ## Overview
 
-The agent offloads cookie lifecycle management from HAProxy:
+Cookie Guard inserts an inline checkpoint between HAProxy and your origin that:
 
-1. Generates short-lived, signed cookies derived from the client IP and User-Agent.
-2. Exposes helper endpoints that HAProxy can embed in a challenge page.
-3. Validates cookies on subsequent requests and reports the outcome back to HAProxy via SPOE frames.
-4. Enables HAProxy to allow, rate-limit, or block requests that fail the validation.
+1. **Challenges new sessions** – serves the bundled ALTCHA puzzle and locally hosted BotD detector so only browsers that can execute JavaScript, persist cookies, and pass automation fingerprinting obtain an `hb_v2` cookie.
+2. **Issues and tracks tokens** – mints short-lived, HMAC-signed cookies bound to the client IP and (optionally) User-Agent, then caches recent BotD verdicts for the same tuple.
+3. **Validates on subsequent requests** – verifies `hb_v2` via SPOE on each request that carries it and reuses the cached BotD verdict so downstream policies can treat “good”, “suspect”, or “bad” sessions differently.
+4. **Feeds HAProxy/SPOE peers** – exposes fresh transaction variables (`cookieguard.valid`, `cookieguard.botd_kind`, `cookieguard.session_hmac`, etc.) that HAProxy, Decision-SPOA, or other agents can use to block, rate-limit, or log.
 
-This setup filters out most headless bots, generic scanners, or curl-based tooling that cannot execute JavaScript or persist cookies.
+Because the HTML and JavaScript are served from your own HAProxy backend, no third-party calls or trackers are involved. The combination of ALTCHA (prove you are interactive) and BotD (fingerprint automation) removes most headless browsers, cURL scripts, and basic scrapers before they ever see your real site.
 
 ---
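
To make the "HMAC-signed cookies bound to the client IP and (optionally) User-Agent" idea concrete, here is a minimal Go sketch. The token layout (`<expiry>.<hex HMAC>`), the helper names `sign`, `mintToken`, and `verifyToken`, and the choice of HMAC-SHA256 are assumptions made for illustration; the real `hb_v2` format lives in the agent's source and is not shown in this diff.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// sign returns a hex HMAC-SHA256 over the client tuple and expiry.
// The field order and newline separator are illustrative choices.
func sign(secret []byte, ip, ua string, expiry int64) string {
	mac := hmac.New(sha256.New, secret)
	fmt.Fprintf(mac, "%s\n%s\n%d", ip, ua, expiry)
	return hex.EncodeToString(mac.Sum(nil))
}

// mintToken issues a token valid for ttl seconds after issuedAt (Unix seconds).
func mintToken(secret []byte, ip, ua string, issuedAt, ttl int64) string {
	expiry := issuedAt + ttl
	return strconv.FormatInt(expiry, 10) + "." + sign(secret, ip, ua, expiry)
}

// verifyToken reports whether token is unexpired and was signed for this IP and UA.
func verifyToken(secret []byte, token, ip, ua string, now int64) bool {
	expStr, sig, ok := strings.Cut(token, ".")
	if !ok {
		return false
	}
	expiry, err := strconv.ParseInt(expStr, 10, 64)
	if err != nil || now > expiry {
		return false
	}
	// hmac.Equal compares in constant time so timing does not leak the signature.
	return hmac.Equal([]byte(sig), []byte(sign(secret, ip, ua, expiry)))
}

func main() {
	secret := []byte("example-secret") // placeholder; a real deployment loads its key from disk
	tok := mintToken(secret, "203.0.113.7", "Mozilla/5.0", 1000, 300)
	fmt.Println("same client:", verifyToken(secret, tok, "203.0.113.7", "Mozilla/5.0", 1100))
	fmt.Println("other IP:", verifyToken(secret, tok, "198.51.100.1", "Mozilla/5.0", 1100))
	fmt.Println("expired:", verifyToken(secret, tok, "203.0.113.7", "Mozilla/5.0", 1301))
}
```

Passing an empty `ua` on both sides would approximate the "IP only" binding. Timestamps are plain Unix seconds here only to keep the sketch deterministic.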
 
@@ -101,6 +106,7 @@ Additionally, packages include the challenge pages and ALTCHA assets under `/etc
 
 - `/etc/haproxy/altcha_challenge.html.lf`.
 - ALTCHA JS is installed under `/etc/haproxy/assets/altcha/<version>/altcha.min.js[.lf]` with `/etc/haproxy/assets/altcha/active` symlink updated to the packaged version.
+- BotD JS is installed under `/etc/haproxy/assets/botd/<version>/botd.esm.js[.lf]` with `/etc/haproxy/assets/botd/active` baked into the package so the challenge page can import `/assets/botd/active/botd.esm.js` immediately.
 
 After installation, adjust `/etc/cookie-guard-spoa/secret.key` or edit the systemd unit as needed, then `systemctl restart cookie-guard-spoa`.
 
@@ -123,61 +129,118 @@ After editing, run `systemctl restart cookie-guard-spoa`.
 
 ## HAProxy integration
 
-1. **SPOE engine definition** (`/etc/haproxy/cookie-guard.cfg`)
+### SPOE engine definition (`/etc/haproxy/cookie-guard-spoa.cfg`)
 
-   ```ini
-   [spoe]
-   max-frame-size 16384
-   max-waiting-frames 2000
+```ini
+[cookie-guard]
+spoe-agent cookie-guard
+    option var-prefix cookieguard
+    groups issue-token verify-token
+    option pipelining
+    timeout hello      2s
+    timeout idle       30s
+    timeout processing 2s
+    use-backend cookie_guard_spoa_backend
 
-   agent cookie_guard
-       use-backend cookie_guard_backend
-       messages issue-token verify-token
-       option pipelining
-       timeout hello      2s
-       timeout idle       30s
-       timeout processing 2s
+spoe-message issue-token
+    args src-ip=src ua="req.fhdr(User-Agent)"
 
-   message issue-token
-       args src-ip=ip.src ua="req.fhdr(User-Agent)"
+spoe-message verify-token
+    args src-ip=src ua="req.fhdr(User-Agent)" cookie=req.cook(hb_v2)
 
-   message verify-token
-       args src-ip=ip.src ua="req.fhdr(User-Agent)" cookie=req.cook(hb_v2)
-   ```
+spoe-group issue-token
+    messages issue-token
 
-2. **Backend connection**
+spoe-group verify-token
+    messages verify-token
+```
 
-   ```haproxy
-   backend cookie_guard_backend
-       mode tcp
-       server spoa1 127.0.0.1:9903 check
-   ```
+```haproxy
+backend cookie_guard_spoa_backend
+    mode tcp
+    server spoa1 127.0.0.1:9903 check inter 2s fall 2 rise 1
+```
 
-3. **Example application backend**
+### Reference frontend/backends
 
-   ```haproxy
-   backend be_app
-       option http-buffer-request
+Below is a compact `public_www` setup that wires Cookie Guard alone in front of a single backend. Swap the binds/hosts for your environment and layer in additional SPOEs (Decision, Coraza, etc.) later once the basic flow works.
 
-       acl chal_safe_meth method GET HEAD
-       acl chal_exempt_path path_beg -i /health /status /static/ /favicon.ico
-       acl chal_exempt_cookie req.cook(hb_v2) -m found
-       acl chal_target chal_safe_meth !chal_exempt_path
+```haproxy
+frontend public_www
+    bind :80
+    bind :443 ssl crt /etc/haproxy/certs/example.pem alpn h2,http/1.1
+    option httplog
+
+    # Force HTTPS except for ACME
+    acl is_certbot path_beg -i /.well-known/acme-challenge
+    http-request redirect scheme https unless { ssl_fc } || is_certbot
+
+    # Cookie Guard: verify hb_v2 only when present
+    filter spoe engine cookie-guard config /etc/haproxy/cookie-guard-spoa.cfg
+    option http-buffer-request
+    acl has_cookie req.cook(hb_v2) -m found
+    http-request send-spoe-group cookie-guard verify-token if has_cookie
+    acl cookie_ok var(txn.cookieguard.valid) -m str 1
+
+    # Route challenge assets and BotD reports back to the Cookie Guard HTTP listener
+    acl altcha_routes path_beg -i /altcha /altcha- /assets/altcha/
+    acl botd_path     path -i /botd-report
+    acl botd_js       path -i /assets/botd/active/botd.esm.js
+    use_backend cookie_guard_http_backend if altcha_routes or botd_path or botd_js
+
+    use_backend certbot if is_certbot
+    default_backend app_backend
+```
 
-       http-request set-spoe-group cookie_guard verify-token if chal_target chal_exempt_cookie
-       acl cookie_ok var(txn.cookie_guard.valid) -m str 1
+Backends reuse the same Cookie Guard SPOE engine. The snippet below illustrates challenge orchestration plus silent token issuance when Decision (or another policy component) is not involved yet. Feel free to inline your own exemption ACLs.
+
+```haproxy
+backend app_backend
+    option http-buffer-request
+
+    # Verify hb_v2 only when present
+    filter spoe engine cookie-guard config /etc/haproxy/cookie-guard-spoa.cfg
+    acl has_cookie req.cook(hb_v2) -m found
+    http-request send-spoe-group cookie-guard verify-token if has_cookie
+    acl cookie_ok var(txn.cookieguard.valid) -m str 1
+
+    # Simple policy: challenge every request until hb_v2 validates
+    acl need_challenge !cookie_ok
+    http-request redirect code 302 location /altcha?url=%[url] if need_challenge
+
+    # Auto-issue hb_v2 when you prefer a silent token (e.g., authenticated users)
+    http-request send-spoe-group cookie-guard issue-token if !cookie_ok !need_challenge
+    acl new_token var(txn.cookieguard.token) -m found
+    http-response add-header Set-Cookie "hb_v2=%[var(txn.cookieguard.token)]; Max-Age=%[var(txn.cookieguard.max_age)]; Path=/; HttpOnly; Secure; SameSite=Lax" if !need_challenge !has_cookie new_token
+
+    # Forward headers to your origin
+    http-request set-header X-Real-IP %[src]
+    http-request add-header X-Forwarded-Proto https if { ssl_fc }
+    option forwarded
+    option forwardfor
+    server app1 127.0.0.1:8080 check
+```
 
-       http-request set-spoe-group cookie_guard issue-token if chal_target !cookie_ok
+Cookie Guard’s HTTP listener serves the ALTCHA HTML, ALTCHA JS, BotD bundle, and `/botd-report`. Route traffic there using:
 
-       server app1 127.0.0.1:8080 check
-   ```
+```haproxy
+backend cookie_guard_http_backend
+    mode http
+    option forwarded
+    option forwardfor
+    http-request set-header X-Forwarded-For %[src]
+    server spoa_http 127.0.0.1:9904 check
+```
 
-   When HAProxy runs the `verify-token` message, the agent populates the following transaction-scoped variables (prefixed via `option var-prefix cookieguard`):
+### What HAProxy gets back
 
-   - `txn.cookieguard.valid`: `"1"` when the hb_v2 cookie validates, otherwise `"0"`.
-   - `txn.cookieguard.age_seconds`: age of the accepted cookie (stringified integer seconds).
-   - `txn.cookieguard.session_hmac`: HMAC handle derived from the cookie value for downstream session tracking (empty when invalid or missing).
-   - `txn.cookieguard.challenge_level`: textual label for the challenge that produced the cookie (currently `"altcha"` for hb_v2).
+When HAProxy runs the `verify-token` message, the agent populates transaction-scoped variables (prefixed by `option var-prefix cookieguard`):
+
+- `txn.cookieguard.valid`: `"1"` when the hb_v2 cookie validates, otherwise `"0"`.
+- `txn.cookieguard.age_seconds`: age of the accepted cookie.
+- `txn.cookieguard.session_hmac`: deterministic handle for downstream correlation.
+- `txn.cookieguard.challenge_level`: label for the challenge that produced the cookie (`"altcha"` today).
+- `txn.cookieguard.botd_*`: BotD verdict metadata (`botd_verdict`, `botd_kind`, `botd_confidence`, `botd_request_id`; `botd_tool` aliases `botd_kind` for legacy rules).
 
 ## SPOE inputs and outputs
 
@@ -205,6 +268,7 @@ With `option var-prefix cookieguard`, HAProxy sees the following variables under
 - `age_seconds` (stringified integer, always set): age of the accepted cookie. Remains "0" for invalid/missing cookies. You can rate-limit or log based on freshness.
 - `session_hmac` (hex string, optional): deterministic HMAC derived from the hb_v2 payload. Decision-SPOA uses this value as `cookieguard_session` to correlate sessions without exposing the token itself. Empty when validation fails.
 - `challenge_level` (string, optional): label describing how the cookie originated. Currently always `"altcha"` when verification succeeds; keep space for future challenge types.
+- `botd_verdict`/`botd_kind`/`botd_confidence`/`botd_request_id` (strings, optional): populated when a recent BotD report exists for the same client IP + UA hash. `botd_tool` remains as a backward-compatible alias of `botd_kind`. These let [decision-spoa](https://github.com/artefactual-labs/decision-spoa) or native HAProxy ACLs act on BotD detections without re-running the script.
 
 By design, `verify-token` always resets every output to a safe default before attempting validation so stale data never leaks between transactions.
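
The `botd_*` outputs above depend on a short-lived verdict cache keyed by client IP and UA hash. The following Go sketch shows one way such a cache could behave, with a TTL, a capacity limit, and an eviction counter. The type and function names (`verdict`, `verdictCache`, `put`, `get`) and the "evict the entry closest to expiry" policy are assumptions for illustration, not the agent's actual implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// verdict is a hypothetical cached BotD result; expires is in Unix seconds.
type verdict struct {
	Verdict string
	Kind    string
	expires int64
}

// verdictCache keeps at most max entries, each for ttl seconds.
type verdictCache struct {
	mu        sync.Mutex
	ttl       int64
	max       int
	entries   map[string]verdict
	evictions int // counts expired and capacity-evicted entries
}

func newVerdictCache(ttlSeconds int64, max int) *verdictCache {
	return &verdictCache{ttl: ttlSeconds, max: max, entries: make(map[string]verdict)}
}

func cacheKey(ip, uaHash string) string { return ip + "|" + uaHash }

// put stores v for the client tuple; max <= 0 disables storage.
func (c *verdictCache) put(ip, uaHash string, v verdict, now int64) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.max <= 0 {
		return
	}
	// Drop expired entries first so they do not count toward capacity.
	for k, e := range c.entries {
		if now >= e.expires {
			delete(c.entries, k)
			c.evictions++
		}
	}
	k := cacheKey(ip, uaHash)
	if _, exists := c.entries[k]; !exists && len(c.entries) >= c.max {
		// At capacity: evict the entry that would expire soonest.
		oldest, first := "", true
		for ek, e := range c.entries {
			if first || e.expires < c.entries[oldest].expires {
				oldest, first = ek, false
			}
		}
		delete(c.entries, oldest)
		c.evictions++
	}
	v.expires = now + c.ttl
	c.entries[k] = v
}

// get returns the cached verdict if one exists and has not expired.
func (c *verdictCache) get(ip, uaHash string, now int64) (verdict, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	v, ok := c.entries[cacheKey(ip, uaHash)]
	if !ok || now >= v.expires {
		return verdict{}, false
	}
	return v, true
}

func main() {
	c := newVerdictCache(300, 1024) // 300 s matches the documented 5m default TTL
	c.put("203.0.113.7", "abc123", verdict{Verdict: "bad", Kind: "selenium"}, 1000)
	v, ok := c.get("203.0.113.7", "abc123", 1100)
	fmt.Println(v.Verdict, v.Kind, ok)
}
```

A lookup miss leaves the `botd_*` variables at their defaults, which is consistent with `verify-token` resetting every output before validation.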
 
@@ -251,7 +315,32 @@ By design, `verify-token` always resets every output to a safe default before at
    - The agent also serves the page at `/altcha` from `-altcha-page` (default `/etc/haproxy/altcha_challenge.html.lf`).
    - Packages enable `-cookie-secure` by default so `hb_v2` ships with the `Secure` attribute. Comment it out in `/etc/default/cookie-guard-spoa` if you must disable it.
 
- 
+5. **BotD verdict ingestion (optional)**
+
+   When `-botd` is enabled (default), the metrics listener exposes `POST /botd-report`. The shipped challenge page loads FingerprintJS BotD in the browser, detects automation, and POSTs the verdict before ALTCHA begins. The payload includes `verdict`, `botKind` (only set when automation is detected), `confidence`, `requestId`, and `ua_hash`. The agent caches each verdict for `-botd-ttl` (default `5m`) keyed by client IP and UA hash, exposes it via SPOE transaction variables (`botd_verdict`, `botd_kind`, `botd_confidence`, `botd_request_id`; `botd_tool` remains as an alias), and emits Prometheus metrics.
+
+   - `botd_confidence` mirrors Fingerprint’s 0–1 confidence score (the bundled OSS detector reports `0` for “no automation observed” and `1` for confirmed bots; the hosted SaaS may emit fractional probabilities).
+   - `botd_request_id` surfaces Fingerprint’s request identifier when present, which is useful for correlating detections in their dashboards/logs. The bundled detector runs entirely in the browser, so this field is usually empty.
+
+   - Route `/botd-report` to the same backend that serves `/altcha*` so the agent receives reports.
+   - Serve the bundled JS from `/assets/botd/active/botd.esm.js`; packages install it under `/etc/haproxy/assets/botd`, and `make botd-assets && sudo make install-botd-assets` refreshes the version.
+   - Enable or disable the endpoint with `-botd`; set cache capacity with `-botd-cache-max` (use `0` to disable storage).
+
+   Prometheus metrics:
+
+   - `cookie_guard_botd_reports_total{verdict="..."}` counts inbound reports.
+   - `cookie_guard_botd_cache_entries` shows live cache cardinality.
+   - `cookie_guard_botd_cache_evictions_total` increments when entries expire or capacity forces eviction.
+
+   Downstream policy engines (e.g., [decision-spoa](https://github.com/artefactual-labs/decision-spoa)) can read the new SPOE variables to make the final allow/challenge/block decision without changing cookie-guard’s core logic. Cookie Guard focuses on proving “is this a real, interactive browser?”, while Decision consumes the resulting `cookieguard.*` and `botd_*` variables (plus GeoIP/session context) to apply richer rules. Together they form a layered defense that challenges unknown traffic, fingerprints automation, and then enforces nuanced policies.
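
For readers who want to see the `/botd-report` payload described in step 5 in concrete terms, the Go sketch below decodes and sanity-checks a report. The JSON field names come from the text above. The Go types, the `parseReport` helper, and the rule that `verdict` and `ua_hash` are required are assumptions, not the agent's confirmed behaviour.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// botdReport mirrors the fields named in the README; the Go types are assumed.
type botdReport struct {
	Verdict    string  `json:"verdict"`
	BotKind    string  `json:"botKind,omitempty"` // only set when automation is detected
	Confidence float64 `json:"confidence"`
	RequestID  string  `json:"requestId,omitempty"`
	UAHash     string  `json:"ua_hash"`
}

// parseReport decodes a JSON body and applies basic validation.
func parseReport(body string) (botdReport, error) {
	var r botdReport
	if err := json.Unmarshal([]byte(body), &r); err != nil {
		return botdReport{}, err
	}
	if r.Verdict == "" || r.UAHash == "" {
		return botdReport{}, errors.New("verdict and ua_hash are required")
	}
	if r.Confidence < 0 || r.Confidence > 1 {
		return botdReport{}, errors.New("confidence must be between 0 and 1")
	}
	return r, nil
}

func main() {
	// Example body; the verdict and botKind values are placeholders.
	body := `{"verdict":"bad","botKind":"selenium","confidence":1,"ua_hash":"abc123"}`
	r, err := parseReport(body)
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Printf("verdict=%s kind=%s confidence=%.1f\n", r.Verdict, r.BotKind, r.Confidence)
}
```

Rejecting out-of-range confidence values early keeps malformed client input out of the cache and the metrics.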
+
+## Optional integration with decision-spoa
+
+[decision-spoa](https://github.com/artefactual-labs/decision-spoa) is Artefactual’s policy SPOE for HAProxy. Pairing it with Cookie Guard combines:
+
+- **Cookie Guard** – first-party ALTCHA + BotD challenge, `hb_v2` issuance/verification, and BotD verdict caching.
+- **Decision** – GeoIP lookups, session-rate tracking, JA3/UA heuristics, and a rule engine that consumes `cookieguard.*` / `botd_*` variables to choose block/allow/challenge routes.
+
+Together they deliver a layered defense: Cookie Guard proves the visitor is an interactive browser and fingerprints automation; Decision ingests those signals plus its own telemetry to decide whether to serve the origin, throttle, or escalate.
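
As a rough illustration of how a downstream policy component might combine these signals, consider the Go sketch below. It is purely hypothetical: the `signals` struct, the `decide` function, the verdict strings, and the rule ordering are invented for this example and do not describe decision-spoa's actual rule engine.

```go
package main

import "fmt"

// signals holds the subset of cookieguard.* variables this sketch looks at.
type signals struct {
	Valid       bool   // txn.cookieguard.valid == "1"
	BotdVerdict string // txn.cookieguard.botd_verdict; empty when no report is cached
}

// decide maps the signals to an action; the ordering is an illustrative choice.
func decide(s signals) string {
	switch {
	case s.BotdVerdict == "bad":
		return "block" // detected automation overrides a valid cookie
	case !s.Valid:
		return "challenge" // no valid hb_v2 yet
	case s.BotdVerdict == "suspect":
		return "throttle"
	default:
		return "allow"
	}
}

func main() {
	fmt.Println(decide(signals{Valid: true}))
	fmt.Println(decide(signals{Valid: false}))
	fmt.Println(decide(signals{Valid: true, BotdVerdict: "bad"}))
}
```

Checking the BotD verdict before cookie validity means a client that solved ALTCHA but was flagged as automation is still blocked. Whether that ordering is right depends on your tolerance for false positives.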
 
 6. **Frontend**
 
@@ -333,6 +422,17 @@ Versioning policy and updates:
   systemctl reload haproxy
   ```
 
+### Update BotD
+
+- JS asset: pinned under `web/assets/botd/<version>` with `web/assets/botd/active` symlinked to the active version. To bump:
+  ```bash
+  echo vX.Y.Z > web/assets/botd/VERSION
+  make botd-assets
+  sudo make install-botd-assets
+  systemctl reload haproxy
+  ```
+- Browser challenge: `web/altcha_challenge.html.lf` imports `/assets/botd/active/botd.esm.js`. Ensure HAProxy routes that path (and `/botd-report`) to the Cookie Guard HTTP listener so the new version is served immediately.
+
 ---
 
 ## Security notes