Administration

This page covers everything an administrator needs to operate BatleHub: configuration, storage, auth providers, registry management, health monitoring, cache cleanup, hot reloading, and the global banner.

For the complete TOML reference see docs/configuration.md.

[[toc]]


Configuration {#configuration}

BatleHub reads a single TOML file, defaulting to config.toml in the working directory. Override the path with --config /path/to/config.toml.

Loading order

  1. TOML file is read from disk.
  2. ${VAR_NAME} placeholders inside string values are replaced with their environment variable values.
  3. The resulting TOML is parsed.
  4. Named PROXY_CACHE__* environment variable overrides are applied on top.
  5. Registry names and types are validated.

Secret injection with ${VAR_NAME} {#env-inline}

Write ${VAR_NAME} inside any TOML string value. BatleHub replaces the placeholder with the named environment variable before parsing. This works for every field — auth secrets, upstream tokens, passwords, and more.

::: danger Missing variable = startup failure
If a referenced variable is not set, BatleHub exits immediately with a clear error message naming the missing variable. There is no silent fallback or empty-string default.
:::

OIDC client secret:

[[auth]]
type          = "oidc"
issuer_url    = "https://sso.example.com/application/o/batlehub/"
client_id     = "batlehub"
client_secret = "${OIDC_CLIENT_SECRET}"   # export OIDC_CLIENT_SECRET=...
redirect_uri  = "https://batlehub.example.com/api/v1/auth/oidc/callback"

Upstream registry credentials:

# Bearer token (GitHub PAT, Gitea token, npm auth token)
[registries.upstream_auth]
type  = "bearer"
token = "${REGISTRY_TOKEN}"

# Basic auth (Nexus, Artifactory)
[registries.upstream_auth]
type     = "basic"
username = "deploy"
password = "${REGISTRY_PASSWORD}"

# Custom header (X-API-Key, etc.)
[registries.upstream_auth]
type  = "header"
name  = "X-API-Key"
value = "${REGISTRY_API_KEY}"

Kubernetes / Docker Compose injection:

# docker-compose.yml
services:
  batlehub:
    env_file: .env.secrets   # OIDC_CLIENT_SECRET=...
    volumes:
      - ./config.toml:/etc/batlehub/config.toml:ro

# Kubernetes Deployment
env:
  - name: OIDC_CLIENT_SECRET
    valueFrom:
      secretKeyRef:
        name: batlehub-secrets
        key: oidc-client-secret

To write a literal ${...} string (no variable lookup), escape the placeholder by doubling the leading $:

# Stores the literal string "${MY_VAR}" — no substitution performed:
some_field = "$${MY_VAR}"
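
The substitution and escape rules above can be sketched in a few lines of Python. This is a minimal illustration, not BatleHub's implementation: the names `expand_placeholders` and `MissingVariableError` are hypothetical, the allowed character set for variable names is an assumption, and the sketch works on raw text rather than only on TOML string values.

```python
import re

# Matches "$${NAME}" (escaped) or "${NAME}" (placeholder). The optional
# leading "$" is captured so the two cases can be told apart.
_PLACEHOLDER = re.compile(r"(\$?)\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


class MissingVariableError(Exception):
    """Raised when a referenced environment variable is not set."""


def expand_placeholders(text: str, env: dict) -> str:
    """Replace ${NAME} with env[NAME]; turn $${NAME} into a literal ${NAME}."""

    def replace(match):
        escaped, name = match.group(1), match.group(2)
        if escaped:
            # "$${NAME}" is an escape: emit the literal text, no lookup.
            return "${" + name + "}"
        if name not in env:
            # No silent fallback: a missing variable is a hard error.
            raise MissingVariableError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, text)
```

Calling `expand_placeholders(raw_toml, dict(os.environ))` before handing the text to a TOML parser mirrors steps 2 and 3 of the loading order.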

Named environment variable overrides {#env-named}

A fixed set of top-level fields can also be overridden with named environment variables. This is useful for adjusting infrastructure addresses (host, port, database URL) in containerised deployments without modifying the config file.

| Variable | Config field |
| --- | --- |
| PROXY_CACHE__SERVER__PORT | server.port |
| PROXY_CACHE__SERVER__HOST | server.host |
| PROXY_CACHE__SERVER__STATIC_DIR | server.static_dir |
| PROXY_CACHE__DATABASE__URL | database.url |
| PROXY_CACHE__DATABASE__MAX_CONNECTIONS | database.max_connections |
| PROXY_CACHE__STORAGE__PATH | storage.path (single filesystem backend) |
| PROXY_CACHE__STORAGE__BUCKET | storage.bucket (single S3 backend) |
| PROXY_CACHE__STORAGE__REGION | storage.region (single S3 backend) |
| PROXY_CACHE__STORAGE__ENDPOINT_URL | storage.endpoint_url (single S3 backend) |
| PROXY_CACHE__OTEL__ENDPOINT | otel.endpoint |
| PROXY_CACHE__OTEL__SERVICE_NAME | otel.service_name |
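
As a rough illustration of step 4 of the loading order, the sketch below applies a few of the overrides from the table to an already-parsed config. The function name `apply_overrides`, the dict-of-dicts config shape, and the integer coercion for `server.port` are assumptions made for the example, not a description of BatleHub's internals.

```python
# Subset of the override table: environment variable -> (section, key).
OVERRIDES = {
    "PROXY_CACHE__SERVER__PORT": ("server", "port"),
    "PROXY_CACHE__SERVER__HOST": ("server", "host"),
    "PROXY_CACHE__DATABASE__URL": ("database", "url"),
    "PROXY_CACHE__STORAGE__PATH": ("storage", "path"),
}

# Fields that hold integers in the TOML file (assumed from the examples).
INT_FIELDS = {("server", "port")}


def apply_overrides(config: dict, env: dict) -> dict:
    """Return a copy of config with any set override variables applied."""
    result = {section: dict(values) for section, values in config.items()}
    for var, (section, key) in OVERRIDES.items():
        if var not in env:
            continue  # variable not set: keep the value from the file
        value = env[var]
        if (section, key) in INT_FIELDS:
            value = int(value)
        result.setdefault(section, {})[key] = value
    return result
```

Because the override runs after parsing, the environment value always wins over whatever the file contains for that field.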

::: tip When to use which
Use ${VAR_NAME} placeholders for secrets (auth tokens, passwords, client secrets). They work for any field and keep credentials out of the TOML file entirely.

Use PROXY_CACHE__* variables for infrastructure addresses (database URL, storage path, host/port) where the value is not secret but varies between environments.
:::

Minimal production config

[server]
host = "0.0.0.0"
port = 8080
static_dir = "/app/ui/dist"
cors_allowed_origins = ["https://batlehub.example.com"]

[database]
type = "postgresql"
url  = "postgresql://batlehub:changeme@postgres:5432/batlehub"

[[auth]]
type = "token"

[[auth.tokens]]
value   = "change-me-admin-token"
role    = "admin"
user_id = "admin"

[storage]
type = "filesystem"
path = "/var/cache/batlehub"

[[registries]]
type = "npm"
name = "npm"

[registries.rbac]
anonymous = ["releases:read", "source:read"]

Registry modes

Every registry can run in one of three modes:

| Mode | Behaviour |
| --- | --- |
| proxy | Default. Forwards all requests to upstream; publishing is rejected. |
| local | BatleHub is the only source. No upstream needed. Teams publish directly. |
| hybrid | Local-first. Serves locally-published packages; falls back to upstream for everything else. |

[[registries]]
type = "cargo"
name = "internal"
mode = "local"         # or "hybrid"

[registries.rbac]
user  = ["source:read"]
admin = ["*"]

Auth providers {#auth}

Auth providers are evaluated in declaration order. The first provider that recognises a credential wins. Requests with no matching credential are treated as anonymous.
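
The evaluation order can be pictured as a simple chain. The following is a sketch under stated assumptions: `authenticate`, `static_tokens`, and the `(user_id, role)` identity tuple are hypothetical names chosen for illustration, and real providers (OIDC, Kubernetes) do considerably more work to decide whether they recognise a credential.

```python
from typing import Callable, Optional

# A provider inspects a credential and returns (user_id, role) if it
# recognises it, or None to let the next provider try.
Provider = Callable[[str], Optional[tuple]]

ANONYMOUS = ("anonymous", "anonymous")


def static_tokens(tokens: dict) -> Provider:
    """Hypothetical provider backed by a fixed token table."""
    return lambda credential: tokens.get(credential)


def authenticate(credential: Optional[str], providers: list) -> tuple:
    """Try providers in declaration order; the first match wins."""
    if credential is not None:
        for provider in providers:
            identity = provider(credential)
            if identity is not None:
                return identity
    # No credential, or no provider recognised it.
    return ANONYMOUS
```

One consequence of first-match-wins is that a credential known to two providers always resolves to the identity from the provider declared earlier in the file.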

Static tokens

[[auth]]
type = "token"

[[auth.tokens]]
value   = "ci-pipeline-token"
role    = "user"
user_id = "ci"

OIDC (Authentik, Keycloak, Dex, …)

[[auth]]
type          = "oidc"
issuer_url    = "https://sso.example.com/application/o/batlehub/"
client_id     = "batlehub"
client_secret = "${OIDC_CLIENT_SECRET}"   # inject from env — never commit secrets
redirect_uri  = "https://batlehub.example.com/api/v1/auth/oidc/callback"
scopes        = ["openid", "profile", "email", "groups"]

user_id_claim = "preferred_username"
role_claim    = "groups"

[auth.role_mappings]
"authentik Admins" = "admin"
"proxy-users"      = "user"

Kubernetes service accounts

[[auth]]
type = "kubernetes"
# api_server, ca_cert_path, token_path all default to in-cluster values

[auth.role_mappings]
"system:serviceaccount:prod:ci-deployer" = "admin"
"system:serviceaccounts:staging"         = "user"

User-generated API tokens

Authenticated users (OIDC sessions) can generate short-lived tokens via the Web UI or API:

curl -X POST \
  -H "Authorization: Bearer <oidc-token>" \
  -H "Content-Type: application/json" \
  -d '{"name": "my-token", "expires_in_days": 30, "role": "user"}' \
  https://batlehub.example.com/api/v1/auth/tokens

The raw token value is returned once — save it immediately.


Storage {#storage}

Filesystem

[storage]
type = "filesystem"
path = "/var/cache/batlehub"

S3-compatible (AWS S3, MinIO, RustFS)

[storage]
type   = "s3"
bucket = "batlehub-artifacts"
region = "us-east-1"

# For self-hosted S3 (MinIO, RustFS): set a custom endpoint
# endpoint_url = "http://rustfs:9900"

# Credentials (omit to use IAM role / instance profile on AWS)
# access_key_id     = "AKIAIOSFODNN7EXAMPLE"
# secret_access_key = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

Multi-backend storage

Different registries can use different backends — for example, filesystem for most registries and dedicated S3 for large GitHub release artifacts:

[storage]
type = "filesystem"
path = "/var/cache/batlehub"

[[storage.backends]]
name = "github-s3"
type = "s3"
bucket = "batlehub-github"
region = "us-east-1"

[[registries]]
type    = "github"
name    = "github"
storage = "github-s3"

S3 with RustFS (self-hosted)

Start RustFS via the bundled Compose file, then create the bucket:

task compose:s3:db            # start RustFS + Postgres + Authentik
mc alias set local http://localhost:9900 rustfsadmin rustfsadmin
mc mb local/artifacts         # or: task compose:s3:bucket:create
task run:s3                   # run the server with the S3 config

Health & Observability {#health}

Health endpoint

curl -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/health

Returns per-registry status (upstream reachability, cache hit rate) and overall server status.

Clear registry cache

Forces the next request for any package in the registry to re-fetch from upstream:

curl -X POST \
  -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/registries/npm/clear-cache

OpenTelemetry (Jaeger, Tempo)

Enable distributed tracing by adding an [otel] block:

[otel]
endpoint = "http://jaeger:4317"

Start the full observability stack locally:

task compose:otel   # starts Postgres + server + Jaeger

Then open http://localhost:16686 for the Jaeger UI.


Cache policy {#cache-policy}

For a full explanation of how caching works end-to-end — request lifecycle, backend selection, rate-limit counters, deduplication — see the dedicated Caching guide.

All cache settings live under [registries.cache] and are per-registry.

Eviction

[registries.cache]
metadata_ttl_secs = 300      # re-check version lists after 5 minutes (default)
serve_stale       = true     # serve cached metadata when upstream is down (default)

artifact_ttl_secs = 2592000  # delete artifacts older than 30 days
idle_days         = 14       # delete artifacts not accessed for 14 days
max_size_bytes    = 10737418240  # 10 GiB storage cap — evicts LRU when exceeded
keep_latest_n     = 5        # keep only the 5 most-recently-cached versions per package

All eviction fields are optional. Omitting a field disables that eviction strategy. Strategies compose: an artifact is evicted as soon as any active strategy triggers.

| Field | Default | Description |
| --- | --- | --- |
| metadata_ttl_secs | 300 | Metadata cache TTL in seconds |
| serve_stale | true | Serve stale metadata on upstream 5xx instead of propagating the error |
| artifact_ttl_secs | unset | Evict artifacts older than N seconds |
| idle_days | unset | Evict artifacts not accessed for N days |
| max_size_bytes | unset | Storage cap; LRU artifacts are removed when exceeded |
| keep_latest_n | unset | Keep only the N most recent versions per package |
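
To make "strategies compose" concrete, here is a minimal sketch of the per-artifact checks. It covers only `artifact_ttl_secs` and `idle_days`; `max_size_bytes` and `keep_latest_n` need a view of the whole cache and are left out. The `Artifact` and `CachePolicy` types and the `should_evict` function are hypothetical names, not BatleHub types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Artifact:
    cached_at: float      # Unix time the artifact was stored
    last_access: float    # Unix time of the most recent download


@dataclass
class CachePolicy:
    artifact_ttl_secs: Optional[int] = None   # None = strategy disabled
    idle_days: Optional[int] = None           # None = strategy disabled


def should_evict(artifact: Artifact, policy: CachePolicy, now: float) -> bool:
    """True as soon as any configured strategy triggers."""
    if policy.artifact_ttl_secs is not None:
        if now - artifact.cached_at > policy.artifact_ttl_secs:
            return True
    if policy.idle_days is not None:
        if now - artifact.last_access > policy.idle_days * 86400:
            return True
    return False
```

With both fields set, a frequently downloaded artifact is still removed once it exceeds the TTL, and a recently cached one is still removed once it has been idle too long.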

Cache warming {#cache-warming}

Cache warming pre-fetches artifact versions so they are available with zero latency on first request. Configure it alongside eviction:

[registries.cache]
warm_packages    = ["lodash", "react", "typescript@5.4.5"]
warm_latest_n    = 3   # warm the 3 most recent versions of bare-name entries
warm_concurrency = 4   # up to 4 parallel downloads

| Field | Default | Description |
| --- | --- | --- |
| warm_packages | [] | Packages to warm at startup. "name" warms the latest warm_latest_n versions; "name@version" warms exactly one. |
| warm_latest_n | 1 | Versions to pre-fetch per bare-name entry |
| warm_concurrency | 2 | Maximum parallel downloads per warming run |

BatleHub starts warming immediately after binding the server socket, so the HTTP server is available while warming runs in the background.
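
The two entry forms accepted by `warm_packages` can be told apart with a small parser. This is an illustrative sketch: `parse_warm_entry` is a hypothetical helper, and treating a leading `@` as part of the name (npm scopes such as `@types/node`) is an assumption about how scoped packages would be written.

```python
def parse_warm_entry(entry: str, warm_latest_n: int = 1):
    """Split a warm_packages entry into (name, pinned_version, count).

    A bare name warms the latest warm_latest_n versions; "name@version"
    warms exactly one. The search for "@" starts at index 1 so that a
    leading "@" is not mistaken for the version separator.
    """
    at = entry.rfind("@", 1)
    if at == -1:
        return entry, None, warm_latest_n
    return entry[:at], entry[at + 1:], 1
```

For the example config above, `"lodash"` and `"react"` each resolve to three versions, while `"typescript@5.4.5"` resolves to exactly one.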

On-demand warming via admin API

Re-warm a package at any time without restarting:

# Warm using the registry's configured warm_latest_n
curl -X POST http://localhost:8080/api/v1/admin/registries/npm/warm \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"package": "lodash"}'

# Override the version count for this request only
curl -X POST http://localhost:8080/api/v1/admin/registries/npm/warm \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"package": "lodash", "versions": 10}'

# Warm a single pinned version
curl -X POST http://localhost:8080/api/v1/admin/registries/cargo/warm \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"package": "serde@1.0.200"}'

Response:

{"warmed": 3, "skipped": 0, "errors": 0}

  • warmed: artifact versions fetched and stored in this run
  • skipped: versions already present in the cache (no download needed)
  • errors: versions that failed to fetch or store

::: tip Registry support
Version enumeration (used for bare-name warming) is implemented for all registry types except Maven, Terraform, RubyGems, and Composer. For those four, use pinned version strings (e.g. "lodash@4.17.21") to warm specific versions.

For GitHub, bare names enumerate releases via the Releases API (paginated). For VS Code Marketplace, bare names enumerate all extension versions via the Gallery API. For Conda, BatleHub synthesises the version list by scanning repodata.json across noarch, linux-64, osx-64, osx-arm64, and win-64.
:::

Content-addressable deduplication

BatleHub stores artifact bytes at a content-addressed key (blob/{sha256}) and maps logical artifact keys (e.g. artifact:npm/lodash:4.17.21) to that blob via a reference count. When identical bytes appear under multiple logical keys — the same package mirrored across two registries, a yanked-then-re-released version — only one copy is stored on disk or in S3.

This is automatic and requires no configuration. Pre-deduplication artifacts stored before upgrading continue to be served normally.
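
The blob-plus-reference-count scheme can be modelled in memory as follows. This is a sketch of the general technique, assuming a `blob/{sha256}` key layout as described above; the `DedupStore` class and its methods are hypothetical and say nothing about how BatleHub persists the index or handles concurrent writers.

```python
import hashlib


class DedupStore:
    """In-memory model: logical key -> blob key, blob key -> bytes + refcount."""

    def __init__(self):
        self.blobs = {}      # "blob/{sha256}" -> bytes
        self.refcount = {}   # "blob/{sha256}" -> number of logical keys
        self.index = {}      # logical key -> "blob/{sha256}"

    def put(self, logical_key: str, data: bytes) -> str:
        blob_key = "blob/" + hashlib.sha256(data).hexdigest()
        if self.index.get(logical_key) == blob_key:
            return blob_key                      # already stored under this key
        self.delete(logical_key)                 # drop any previous mapping
        self.blobs.setdefault(blob_key, data)    # store bytes only once
        self.refcount[blob_key] = self.refcount.get(blob_key, 0) + 1
        self.index[logical_key] = blob_key
        return blob_key

    def get(self, logical_key: str) -> bytes:
        return self.blobs[self.index[logical_key]]

    def delete(self, logical_key: str) -> None:
        blob_key = self.index.pop(logical_key, None)
        if blob_key is None:
            return
        self.refcount[blob_key] -= 1
        if self.refcount[blob_key] == 0:         # last reference gone
            del self.refcount[blob_key]
            del self.blobs[blob_key]
```

Storing the same bytes under two logical keys leaves one blob with a reference count of 2; the bytes are only removed when the last logical key is deleted.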


Package management {#package-management}

List packages

# All packages
curl -H "Authorization: Bearer <admin-token>" \
  "http://localhost:8080/api/v1/admin/packages"

# Filter by registry and name
curl -H "Authorization: Bearer <admin-token>" \
  "http://localhost:8080/api/v1/admin/packages?registry=npm&name=lodash"

Block a package version

Blocked packages return 403 Forbidden to all clients, regardless of role.

curl -X POST \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"registry": "npm", "name": "lodash", "version": "4.17.20"}' \
  http://localhost:8080/api/v1/admin/packages/block

Unblock

curl -X POST \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"registry": "npm", "name": "lodash", "version": "4.17.20"}' \
  http://localhost:8080/api/v1/admin/packages/unblock

Bulk block

curl -X POST \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"packages": [{"registry":"npm","name":"bad-pkg","version":"1.0.0"}]}' \
  http://localhost:8080/api/v1/admin/packages/bulk-block

Invalidate cache

Removes the cached artifact so the next request re-fetches from upstream:

curl -X POST \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"registry": "npm", "name": "lodash", "version": "4.17.21"}' \
  http://localhost:8080/api/v1/admin/packages/invalidate

Team Namespaces & Package Visibility {#team-namespaces}

Team namespaces let you assign a package name prefix within a registry to an auth-provider group. Only group members — and admins — may publish packages under that prefix. Package visibility independently controls who can download a package.

This feature requires no TOML changes and no server restart — claims and visibility are managed entirely via the admin API.

For the full reference (visibility levels, download-time enforcement, longest-prefix rule, registry support matrix) see the Access Control guide.

Managing namespace claims

# List claims for a registry
curl -H "Authorization: Bearer <admin-token>" \
  https://batlehub.example.com/api/v1/admin/registries/internal-npm/namespaces

# Claim a prefix for a group
curl -X POST \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"prefix":"frontend","group_id":"oidc:frontend-team","claimed_by":"admin"}' \
  https://batlehub.example.com/api/v1/admin/registries/internal-npm/namespaces

# Release a claim (prefix may contain slashes, passed verbatim in the path)
curl -X DELETE \
  -H "Authorization: Bearer <admin-token>" \
  https://batlehub.example.com/api/v1/admin/registries/internal-npm/namespaces/frontend

Managing package visibility

Visibility is package-level — all versions share the same setting. Accepted values: public (default), internal, team.

# Read current visibility
curl -H "Authorization: Bearer <admin-token>" \
  https://batlehub.example.com/api/v1/admin/registries/internal-npm/packages/frontend%2Futils/visibility

# Restrict to team members only
curl -X PUT \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"visibility":"team"}' \
  https://batlehub.example.com/api/v1/admin/registries/internal-npm/packages/frontend%2Futils/visibility

Package names containing slashes must be percent-encoded in the URL (/ becomes %2F).
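
When building these URLs from a script, the encoding is easy to get wrong because most URL helpers leave `/` untouched by default. A small Python sketch, where `visibility_url` and the `BASE` constant are hypothetical names for illustration:

```python
from urllib.parse import quote

BASE = "https://batlehub.example.com/api/v1/admin"


def visibility_url(registry: str, package: str) -> str:
    """Build the visibility endpoint URL for a package.

    safe="" makes quote() encode "/" as well; by default it is left as-is.
    """
    return (
        f"{BASE}/registries/{quote(registry, safe='')}"
        f"/packages/{quote(package, safe='')}/visibility"
    )
```

`visibility_url("internal-npm", "frontend/utils")` produces the same path used in the curl examples above.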


Audit log {#audit-log}

Every access-control decision (allow or deny) is recorded in PostgreSQL.

# Last 50 decisions across all registries
curl -H "Authorization: Bearer <admin-token>" \
  "http://localhost:8080/api/v1/admin/audit-log?limit=50"

# Filter by registry and outcome
curl -H "Authorization: Bearer <admin-token>" \
  "http://localhost:8080/api/v1/admin/audit-log?registry=npm&outcome=deny&limit=100"

Example entry:

{
  "id": "01j...",
  "timestamp": "2025-05-22T10:00:00Z",
  "registry": "npm",
  "package": "lodash",
  "version": "4.17.21",
  "user_id": "ci",
  "role": "user",
  "outcome": "allow",
  "rule": null
}

Beta/Pre-Release Channel {#beta-channel}

Gate pre-release versions (e.g. 1.0.0-beta.1) to specific users or groups. Non-members see only stable versions and get 404 on pre-release artifact downloads.
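
The filtering behaviour can be sketched as below. How BatleHub decides that a version is a pre-release is not specified here and likely differs per ecosystem; this example assumes Semantic Versioning 2.0.0 syntax and treats anything that does not parse as stable. `is_prerelease` and `visible_versions` are hypothetical names.

```python
import re

# SemVer 2.0.0: MAJOR.MINOR.PATCH, optional "-prerelease", optional "+build".
_SEMVER = re.compile(
    r"^\d+\.\d+\.\d+(?:-([0-9A-Za-z.-]+))?(?:\+[0-9A-Za-z.-]+)?$"
)


def is_prerelease(version: str) -> bool:
    """True if the version carries a SemVer pre-release suffix."""
    match = _SEMVER.match(version)
    return bool(match and match.group(1))


def visible_versions(versions, is_member: bool):
    """Non-members see only stable versions; members see everything."""
    if is_member:
        return list(versions)
    return [v for v in versions if not is_prerelease(v)]
```

Build metadata alone (for example `1.0.0+build.5`) does not make a version a pre-release under SemVer, so it stays visible to everyone in this sketch.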

Enable per registry:

[registries.beta_channel]
enabled = true

Manage members at runtime:

# Add a user
curl -s -X POST \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"principal_type":"user","principal_id":"alice"}' \
  http://localhost:8080/api/v1/admin/registries/my-npm/beta-channel

# List members
curl -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/registries/my-npm/beta-channel

# Remove a member
curl -X DELETE -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/registries/my-npm/beta-channel/user/alice

See the Access Control guide for the full reference, including group membership, per-registry support table, and user-facing behaviour.


IP-Based Blocking {#ip-blocking}

Automatically block IPs that trigger too many violations (rate-limit hits, auth failures) within a time window.

[ip_blocking]
enabled               = true
violation_threshold   = 10
violation_window_secs = 300      # 5-minute window
ban_duration_secs     = 3600     # 1-hour block
trigger_on_status     = [429, 401]

Manage blocks manually:

# List blocked IPs
curl -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/ip-blocks

# Block an IP
curl -s -X POST \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"ip":"1.2.3.4","reason":"bad actor","duration_secs":86400}' \
  http://localhost:8080/api/v1/admin/ip-blocks

# Unblock
curl -s -X DELETE \
  -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/ip-blocks/1.2.3.4

Blocked IPs receive 403 Forbidden with an X-Block-Expires header. The check runs before authentication. Violation counts and blocks are stored in the same backend as the rate-limit store (memory / postgres / redis).
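
The interplay of threshold, window, and ban duration can be modelled with a sliding window per IP. This is a simplified in-memory sketch: the `IpBlocker` class is hypothetical, and whether the real threshold comparison is "at least" or "more than" is an assumption (the sketch bans on reaching the threshold).

```python
from collections import defaultdict, deque


class IpBlocker:
    """In-memory model of threshold-within-window banning."""

    def __init__(self, threshold=10, window_secs=300, ban_secs=3600,
                 trigger_on_status=(429, 401)):
        self.threshold = threshold
        self.window_secs = window_secs
        self.ban_secs = ban_secs
        self.trigger_on_status = set(trigger_on_status)
        self.violations = defaultdict(deque)   # ip -> violation timestamps
        self.banned_until = {}                 # ip -> expiry timestamp

    def is_blocked(self, ip: str, now: float) -> bool:
        expiry = self.banned_until.get(ip)
        if expiry is not None and now >= expiry:
            del self.banned_until[ip]          # ban has expired
            return False
        return expiry is not None

    def record_response(self, ip: str, status: int, now: float) -> None:
        if status not in self.trigger_on_status:
            return
        stamps = self.violations[ip]
        stamps.append(now)
        while stamps and stamps[0] <= now - self.window_secs:
            stamps.popleft()                   # drop violations outside the window
        if len(stamps) >= self.threshold:
            self.banned_until[ip] = now + self.ban_secs
            stamps.clear()
```

Violations that are spread out over more than `window_secs` never accumulate to a ban, which is what keeps occasional auth failures from locking out legitimate clients.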

See the Access Control guide for the full reference including load-balancer setup and storage backend comparison.


Rules {#rules}

Rules are optional per-registry policies evaluated after RBAC.

Release age gate

Block packages published less than min_age_secs ago:

[[registries.rules]]
kind         = "release_age_gate"
min_age_secs = 3600       # 1 hour
bypass_roles = ["admin"]  # admins can still install new packages
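
The decision this rule makes is small enough to state as a function. A minimal sketch, assuming the publish time is available as a Unix timestamp; `release_age_gate` is a hypothetical name and the defaults simply mirror the example config above.

```python
def release_age_gate(published_at: float, now: float, role: str,
                     min_age_secs: int = 3600,
                     bypass_roles=("admin",)) -> bool:
    """Return True if the request is allowed by the age gate."""
    if role in bypass_roles:
        return True                    # bypass roles skip the check entirely
    # Packages published less than min_age_secs ago are blocked.
    return now - published_at >= min_age_secs
```

A package published 30 minutes ago is rejected for a `user` but allowed for an `admin`; once it is an hour old, everyone can install it.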

Deny latest tag

Force clients to pin exact versions:

[[registries.rules]]
kind         = "deny_latest"
bypass_roles = ["admin"]

Hot reload {#hot-reload}

BatleHub can reload its configuration at runtime — add or remove registries, update RBAC rules, or change policy settings — without restarting the process. In-flight requests finish with the old configuration before the new one takes effect.

How it works

  1. When config.toml changes on disk, the built-in file watcher validates the new config, runs connectivity probes against upstream URLs, and stores a pending reload in memory.
  2. An administrator reviews the pending diff in the Config Reload admin page (/admin/config-reload) and clicks Apply — or discards it.
  3. Alternatively, the POST /api/v1/admin/config/reload endpoint applies a reload immediately (load + validate + apply atomically), which is useful in CI/CD pipelines.

# Immediate reload (no confirmation step)
curl -s -X POST \
  -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/config/reload

# Check for a pending reload loaded by the file watcher
curl -s -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/config/pending

# Apply the pending reload
curl -s -X POST \
  -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/config/pending/apply

# Discard without applying
curl -s -X DELETE \
  -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/config/pending

Pending reloads expire after 10 minutes if not applied or discarded.
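
The stage / apply / discard lifecycle with expiry can be sketched as a single-slot holder. This is an illustration only: `PendingReload` and its methods are hypothetical, and validation and connectivity probes are assumed to have happened before `stage` is called.

```python
PENDING_TTL_SECS = 10 * 60  # pending reloads expire after 10 minutes


class PendingReload:
    """Holds at most one validated-but-unapplied config."""

    def __init__(self):
        self._config = None
        self._loaded_at = None

    def stage(self, config: dict, now: float) -> None:
        """Store a validated config, replacing any earlier pending one."""
        self._config, self._loaded_at = config, now

    def get(self, now: float):
        """Return the pending config, or None if absent or expired."""
        if self._config is not None and now - self._loaded_at >= PENDING_TTL_SECS:
            self.discard()
        return self._config

    def apply(self, now: float):
        """Return the config to activate and clear the pending slot."""
        config = self.get(now)
        self.discard()
        return config

    def discard(self) -> None:
        self._config = self._loaded_at = None
```

In this model an expired pending reload behaves exactly like a discarded one: `apply` returns `None` and the running configuration is left untouched.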

What can be hot-reloaded

| Component | Hot-reloadable |
| --- | --- |
| Registry list (add / remove / update) | ✅ |
| Per-registry RBAC (anonymous, user, admin, groups) | ✅ |
| Per-registry rules (age gate, deny latest) | ✅ |
| Per-registry versioning / signing / beta-channel | ✅ |
| Artifact size limit | ✅ |
| Server host / port | ❌ requires restart |
| Database URL | ❌ requires restart |
| Auth providers | ❌ requires restart |
| Storage backends | ❌ requires restart |

Audit trail

Every reload (applied or rejected) is written to the config_changes table and visible in the admin page change history:

curl -s -H "Authorization: Bearer <admin-token>" \
  "http://localhost:8080/api/v1/admin/config/changes?per_page=20"

Disabling hot reload

Set BATLEHUB_DISABLE_HOT_RELOAD=1 in the server environment to disable the file watcher. All reload endpoints then respond with 503 Service Unavailable. This is recommended when config.toml is mounted as a read-only Kubernetes ConfigMap, where the file will not change at runtime.

# Kubernetes Deployment env
- name: BATLEHUB_DISABLE_HOT_RELOAD
  value: "1"

Global banner {#global-banner}

Administrators can broadcast a short message to all website visitors — authenticated or not. Common uses: maintenance windows, reload-in-progress notices, and policy announcements.

The banner is automatically set to "Configuration reload in progress…" when a hot reload starts and cleared when it completes.

Set the banner

From the Config Reload admin page, fill in the message and select a level (info / warning / error), then click Set Banner.

# Set via API
curl -s -X PUT \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{"message":"Scheduled maintenance in 30 min","level":"warning"}' \
  http://localhost:8080/api/v1/admin/banner

# Clear
curl -s -X DELETE \
  -H "Authorization: Bearer <admin-token>" \
  http://localhost:8080/api/v1/admin/banner

The frontend polls GET /api/v1/banner every 30 seconds (no authentication required) and displays the banner as a dismissible bar at the top of every page.

High-availability banner propagation

The banner backend is selected from the same pool as the metadata cache:

| [cache] type | Banner storage |
| --- | --- |
| "memory" (default) | In-process; not shared across replicas |
| "redis" | Redis key batlehub:system:banner; shared across all replicas |
| "postgres" | system_kv table; shared across all replicas |

In an HA deployment, use "redis" or "postgres" so that all replicas show the same banner regardless of which instance the client reaches.