Allow creating custom user #4640

Open
korney4eg wants to merge 11 commits into omnivore-app:main from korney4eg:main

Conversation

@korney4eg

This change allows creating a user without going through registration, so registration can be disabled on a self-hosted instance.

korney4eg and others added 10 commits March 4, 2026 14:41
- Add packages/content-fetch-go: a Go rewrite of packages/content-fetch
  that is fully API-compatible (same HTTP endpoints, env vars, Redis key
  schema, and BullMQ v5 job format) but produces a significantly smaller
  Docker image (Alpine + Chromium only, no Node.js runtime)

- Internal packages:
  - bullmq: BullMQ v5-compatible producer and consumer (Lua moveToActive)
  - queue/worker: concurrent job worker (4 goroutines, 500 ms poll)
  - fetch: Chromium page fetch via chromedp replacing puppeteer-parse
  - handler: processFetchContentJob logic (cache, GCS upload, job queuing)
  - gcs: Google Cloud Storage upload
  - analytics: PostHog failure event capture
  - server: HTTP endpoints (/_ah/health, /metrics, /lifecycle/prestop, /)
  - redisutil: dual Redis connections (cache + BullMQ MQ)
  - config: all env var loading

- Add CLAUDE.md with build/test/lint commands and architecture overview

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace hand-rolled Prometheus text output in /metrics with the official
prometheus/client_golang library. Adds an internal/metrics package that
registers five GaugeVec collectors (active, failed, completed, prioritized,
oldest_job_age_seconds) and refreshes them from Redis on each request via
a thin wrapper around promhttp.Handler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace short variable/field/parameter names with self-documenting ones:
- rds  → redisDS  (RedisDataSource)
- br   → browser
- cfg  → config
- rdb  → redisClient
- w    → worker  (in New() signatures; http.ResponseWriter stays as w)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tests spin up a Redis container via testcontainers-go and cover:
- HTTP endpoints: health, metrics, token auth, method validation, 404
- Handler pipeline: cache hit, multi-user jobs, save-page job enqueueing
- Domain blocking: hardcoded list (weibo.com) and failure-count threshold
- BullMQ primitives: AddBulk/PopJob round-trip, priority ordering, complete/fail
- Full end-to-end: worker consumes from content-fetch queue and produces to backend queue
- HTTP POST: valid token + cached result → save-page job in Redis

No PostgreSQL required; service only depends on Redis.
Run with: go test -v -timeout 120s ./...

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces internal/gcs (cloud.google.com/go/storage) with a new
internal/storage package backed by gocloud.dev/blob, giving self-hosted
users a choice of object storage backend via a single env var:

  BLOB_STORAGE_URL=gs://bucket              → GCS (unchanged behaviour)
  BLOB_STORAGE_URL=s3://bucket?region=...   → AWS S3
  BLOB_STORAGE_URL=s3://bucket?endpoint=http://minio:9000&use_path_style=true&disable_https=true&region=us-east-1
                                            → MinIO

Backward compatibility: when BLOB_STORAGE_URL is not set, a gs:// URL is
constructed from the existing GCS_UPLOAD_BUCKET env var, so existing GCS
deployments require no config changes.

Changes:
- internal/gcs/gcs.go deleted
- internal/storage/storage.go created (gocloud.dev/blob, gcsblob, s3blob)
- internal/storage/storage_test.go created (6 memblob unit tests, no Docker)
- config.go: BlobStorageURL field + BlobURL() fallback method
- handler.go: swaps gcs import for storage, bridges GCS key-file via
  GOOGLE_APPLICATION_CREDENTIALS for the gcsblob URL opener

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replaces the Node.js content-fetch container with the Go implementation
in both self-hosted compose files, and enables MinIO-backed original
content uploads via BLOB_STORAGE_URL.

Changes:
- self-hosting/docker-compose/docker-compose.yml
  - content-fetch image: sh-content-fetch → sh-content-fetch-go
  - Remove USE_FIREFOX (Go service uses Chromium, no Firefox needed)
  - Add dependency on createbuckets (bucket must exist before uploads)

- self-hosting/docker-compose/self-build/docker-compose.yml
  - content-fetch build: packages/content-fetch → packages/content-fetch-go
  - Remove USE_FIREFOX
  - Add dependency on createbuckets

- self-hosting/docker-compose/.env.example
  - Remove SKIP_UPLOAD_ORIGINAL=true (uploads now work via MinIO)
  - Add BLOB_STORAGE_URL pointing to the MinIO container
  - Consolidate AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY entries
    (were split between two comment blocks, now in one place)

- packages/content-fetch-go/Dockerfile
  - Build stage: golang:1.24-alpine → golang:1.25-alpine (matches go.mod)

MinIO URL used: s3://omnivore?endpoint=http%3A%2F%2Fminio%3A9000&use_path_style=true&disable_https=true&region=us-east-1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The queue-processor sends labels as an array of objects matching
TypeScript's CreateLabelInput interface, e.g.:

  labels: [{"name":"RSS"}]

The Go struct had this declared as []string, causing an unmarshal error
whenever an RSS feed job arrived:

  json: cannot unmarshal object into Go struct field JobData.labels of type string

Fix both JobData (incoming jobs) and savePageJobData (outgoing jobs) to
use []LabelInput{Name, Color, Description}, matching the TS types exactly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
handler_test.go (unit, no Docker):
- TestJobData_UnmarshalLabelsAsObjects  — exact payload from refreshFeed.ts
- TestJobData_UnmarshalLabelsWithColor  — optional color field
- TestJobData_UnmarshalNoLabels         — absent labels field is valid
- TestJobData_UnmarshalMultipleLabels   — multiple label objects with all fields
- TestSavePageJobData_MarshalLabels     — outgoing jobs serialise labels as objects

integration_test.go (Redis container):
- TestIntegration_RSSJobWithLabelObjects — enqueues a job with labels:[{"name":"RSS"}]
  through the full worker pipeline and asserts the resulting save-page job carries
  the label object through to the backend queue

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…re binary)

- Add src-go/ with cobra CLI: omnivore server content-fetcher
- Support GCS, S3, and MinIO via gocloud.dev/blob (BLOB_STORAGE_URL)
- Fix labels type mismatch ([]string → []LabelInput) for BullMQ jobs
- Wire Go content-fetcher into self-hosted Docker Compose setup
- Add docker/content-fetcher.Dockerfile built from repo root
- Add Makefile targets: content_fetch_go, docker_build/push_content_fetcher
- Add integration tests (testcontainers-go) and unit tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>