[pull] main from mhdzumair:main #260

Merged
pull[bot] merged 15 commits into geek-cookbook:main from mhdzumair:main
May 19, 2026

pull[bot] commented May 19, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )

mhdzumair added 15 commits May 17, 2026 01:07
…t endpoints

- contributions.rs: add data::jsonb cast. The PG json type (OID 114) differs
  from jsonb (OID 3802), so sqlx silently failed to decode the data column
- stream_suggestions.rs: add status filter to PendingQuery so
  list_pending_stream_suggestions respects status=all/pending/approved/rejected
  instead of always hardcoding WHERE status = 'pending'
- suggestions.rs: rewrite list_pending_suggestions filter logic to use a
  dynamic WHERE 1=1 plus appended conditions, removing the fragile
  $1::text IS NULL trick that could silently fail when field_name is not provided
- Add is_adult_contribution() helper: skips approval when target media
  has adult=true or nudity_status MODERATE/SEVERE, or when contribution
  data itself carries those flags (for new-media contributions)
- Fix counter comparison: new_status is "APPROVED" (uppercase) so
  approved counter was always 0 due to comparing against "approved"
- Fix SQL injection: contribution_type filter now uses $1 parameterized
  bind instead of manual string interpolation with quote escaping
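
The "WHERE 1=1 + appended conditions" pattern with parameterized binds can be sketched roughly as follows. This is a minimal illustration and not the code from suggestions.rs: the `SuggestionFilter` struct, the `build_pending_query` function, and the `suggestions` table name are all assumptions made for the example. Only numbered placeholders are written into the SQL text; the values travel separately so they can be bound by the driver.

```rust
/// Hypothetical filter set. The field names follow the commit message;
/// the struct itself is an assumption.
#[derive(Default)]
pub struct SuggestionFilter {
    pub status: Option<String>,
    pub field_name: Option<String>,
}

/// Builds the SQL text and the ordered list of values to bind.
/// User input is never interpolated into the SQL string.
pub fn build_pending_query(filter: &SuggestionFilter) -> (String, Vec<String>) {
    // "suggestions" is an assumed table name.
    let mut sql = String::from("SELECT * FROM suggestions WHERE 1=1");
    let mut binds: Vec<String> = Vec::new();

    // A missing status or status=all adds no status condition.
    if let Some(status) = filter.status.as_deref() {
        if !status.eq_ignore_ascii_case("all") {
            binds.push(status.to_string());
            sql.push_str(&format!(" AND status = ${}", binds.len()));
        }
    }
    // The condition is only appended when field_name is actually provided.
    if let Some(field) = filter.field_name.as_deref() {
        binds.push(field.to_string());
        sql.push_str(&format!(" AND field_name = ${}", binds.len()));
    }
    (sql, binds)
}
```

Because each optional filter either contributes a condition and a bind or contributes nothing, the placeholder numbering always matches the bind list, which is what the `$1::text IS NULL` approach could not guarantee.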
…k review

Replace the 9-keyword regex in contains_adult_keywords() with the full
1,043-keyword list from PTT's combined-keywords.txt, embedded at compile
time via include_str!. Matching uses case-insensitive substring search
identical to Python's is_adult_content() logic.

Bulk contribution approval now checks data["name"], data["title"], and
per-file fields (filename, meta_title, episode_title, title) in
data["file_data"] against this list, mirroring _contribution_stream_titles_indicate_adult().

Note: combined-keywords.txt is the pre-merged output of short-adult-words.txt
and short-adult-compound-words.txt (generated by PTT's cli.py combine_keywords).
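
A case-insensitive substring check of this kind might look like the sketch below. The commit embeds the real list with include_str!; here a short literal with placeholder entries stands in for it so the example compiles on its own, and `load_keywords` is a hypothetical helper name.

```rust
/// Stand-in for the embedded list. The real code uses include_str! on the
/// keyword file; these two entries are placeholders, not real list content.
const KEYWORDS: &str = "xxx\nplaceholder-term\n";

/// Splits the raw file into lowercase keywords, skipping blank lines.
pub fn load_keywords(raw: &str) -> Vec<String> {
    raw.lines()
        .map(|line| line.trim().to_lowercase())
        .filter(|line| !line.is_empty())
        .collect()
}

/// Returns true if any keyword occurs anywhere in the title,
/// ignoring case on both sides.
pub fn contains_adult_keywords(title: &str, keywords: &[String]) -> bool {
    let haystack = title.to_lowercase();
    keywords.iter().any(|k| haystack.contains(k.as_str()))
}
```

Plain substring matching is simple and fast but will also match keywords inside longer, unrelated words, which is the kind of false positive a whitelist can compensate for.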
…ort-video category labels from titles

- Add apply_metadata_field_change() to suggestions.rs; wires it into both
  the auto-approve path (create_suggestion) and the manual approve path
  (review_suggestion) so approved suggested_values are written back to
  the media/media_external_id tables instead of being silently dropped
- Strip the " / CATEGORY" suffix (e.g. "/ FILESHARING") from sport-video
  spider titles at parse time so scraped match titles are clean
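
The suffix stripping could be sketched as below. The exact rule used by the spider is not shown in the commit message, so this assumes the label is whatever follows the last " / " and consists only of uppercase ASCII letters, digits, and spaces; `strip_category_suffix` is a hypothetical name.

```rust
/// Removes a trailing " / CATEGORY" label from a scraped title.
/// Assumption: a label is all-uppercase ASCII (plus digits and spaces),
/// so "Team A / Team B" is left untouched.
pub fn strip_category_suffix(title: &str) -> &str {
    if let Some(idx) = title.rfind(" / ") {
        // " / " is three ASCII bytes, so idx + 3 is a valid char boundary.
        let suffix = &title[idx + 3..];
        let looks_like_label = !suffix.is_empty()
            && suffix
                .chars()
                .all(|c| c.is_ascii_uppercase() || c.is_ascii_digit() || c == ' ');
        if looks_like_label {
            return title[..idx].trim_end();
        }
    }
    title
}
```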
Add DB-backed keyword filter system replacing the compile-time-only PTT
keyword list. Admins can now manage blocked keywords and whitelist phrases
at runtime without a redeploy.

- Migration: keyword_filters + keyword_whitelist tables with unique
  case-insensitive indexes; seeded from embedded adult-keywords.txt on
  first startup
- AppState: KeywordFilterCache (Arc<RwLock>) loaded/refreshed from DB;
  in-memory fast path for all contribution checks
- Whitelist-first logic: if the title contains a whitelisted phrase (e.g.
  "Sex Education"), the keyword check is skipped entirely
- 8 admin-only REST endpoints under /api/v1/admin/keyword-filters and
  /api/v1/admin/keyword-whitelist (list, add, toggle, delete, reload)
- Bulk contribution approval now uses the runtime cache to skip adult titles
- Frontend: KeywordFiltersTab in admin moderator dashboard with paginated
  keyword list, search, enable/disable toggle, and whitelist management
- Move adult-keywords.txt from src/ptt/ to backend/resources/ (compile-time
  embedded resources separated from runtime config)
- Replace one-shot seed with sync_keywords_from_file(): computes SHA-256 of
  the embedded file, skips sync if hash unchanged, otherwise atomically
  replaces all source='file' rows in keyword_filters and keyword_whitelist
- Support '!phrase' prefix in the file to declare whitelist entries alongside
  blocked keywords in one place
- Add migration 0007: source column on both tables + keyword_sync_state table
  to persist the last-synced hash
- Add /dashboard/keyword-filters route under RoleGuard(admin)
- Add Keyword Filters entry to Administration sidebar section
- Remove keyword-filters tab from ModeratorDashboardPage — it was
  admin-only but buried in the moderator view
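
The whitelist-first check and the '!phrase' file convention from the keyword-filter commits above can be sketched together as follows. This is an illustration under assumptions: `KeywordLists`, `parse_keyword_file`, and `is_blocked` are hypothetical names, and the real KeywordFilterCache is only known from the commit message to be an Arc<RwLock>, so its inner type here is a guess. The SHA-256 hash comparison and the DB sync are left out.

```rust
use std::sync::{Arc, RwLock};

#[derive(Default, Debug, PartialEq)]
pub struct KeywordLists {
    pub blocked: Vec<String>,
    pub whitelist: Vec<String>,
}

/// Parses the keyword file: a leading '!' marks a whitelist phrase, every
/// other non-empty line is a blocked keyword. Entries are lowercased so
/// matching can be case-insensitive.
pub fn parse_keyword_file(raw: &str) -> KeywordLists {
    let mut lists = KeywordLists::default();
    for line in raw.lines().map(str::trim).filter(|l| !l.is_empty()) {
        match line.strip_prefix('!') {
            Some(phrase) => lists.whitelist.push(phrase.trim().to_lowercase()),
            None => lists.blocked.push(line.to_lowercase()),
        }
    }
    lists
}

/// Assumed shape of the shared in-memory cache.
pub type KeywordFilterCache = Arc<RwLock<KeywordLists>>;

/// Returns true if the title should be treated as blocked.
pub fn is_blocked(cache: &KeywordFilterCache, title: &str) -> bool {
    let lists = cache.read().expect("keyword cache lock poisoned");
    let title = title.to_lowercase();
    // Whitelist first: a whitelisted phrase skips the keyword check entirely.
    if lists.whitelist.iter().any(|p| title.contains(p.as_str())) {
        return false;
    }
    lists.blocked.iter().any(|k| title.contains(k.as_str()))
}
```

A read lock keeps concurrent contribution checks cheap; a reload after an admin change would take the write lock briefly and swap in the freshly loaded lists.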
sync_keywords_from_file was called inside AppState::build which runs
before migrate::run — if migration 0007 hadn't been applied yet the
keyword_sync_state table wouldn't exist and the sync would fail.

Move the call to after migrate::run in both main.rs and worker.rs,
then reload the in-memory keyword cache from the freshly synced DB.
- Add / → /app redirect and nest ServeDir with ServeFile fallback at /app
  so React Router client-side routes fall through to index.html
- Use spa_cache_headers middleware: no-cache on HTML, default caching on
  content-hashed assets
- Add FRONTEND_DIST_DIR config (default: clients/frontend/dist)
- Exclude node_modules and frontend source files from Docker runtime image
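
The caching decision behind spa_cache_headers might reduce to something like the sketch below. The actual middleware is not shown in the commit, so `cache_control_for` is a hypothetical helper, and the rule "no file extension means a client-side route served from index.html" is an assumption.

```rust
/// Returns the Cache-Control value to set for a request path, or None to
/// leave the default caching in place (content-hashed assets).
pub fn cache_control_for(path: &str) -> Option<&'static str> {
    // Last path segment, e.g. "index.html" or "main.3f9a1c.js".
    let last = path.rsplit('/').next().unwrap_or("");
    // Assumption: directory paths and extension-less paths are SPA routes
    // that fall through to index.html, so they count as HTML.
    let is_html = last.is_empty() || last.ends_with(".html") || !last.contains('.');
    if is_html {
        Some("no-cache")
    } else {
        None
    }
}
```

Marking HTML as no-cache lets browsers pick up a new index.html after each deploy, while hashed asset filenames change whenever their content does and can safely be cached.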
Implements a Rust scraper for sport-video.org.ua that solves the adm.tools
bot challenge (custom JS fingerprinting, not Cloudflare) by running the
challenge-solving fetch entirely inside a real Chrome session via browserless
v2's /chromium/function endpoint.

- browser.rs: new browserless v2 integration — navigates to a static category
  page to prime the browser session, then executes challenge-solving fetch()
  inside page.evaluate() (same cookie jar / TLS fingerprint); handles the
  ___ack token POST flow and returns the torrent binary as decoded bytes
- sport_video.rs: rewritten to use browser::fetch_torrent_via_browser;
  requires BROWSERLESS_URL, rate-limited at 15 RPM; parses category pages,
  resolves torrent URLs, extracts info_hash from bencoded data, persists streams
- config.rs: added browserless_url Option<String> field
- media_resolve.rs: find_or_create_sports_stub for creating media stubs without
  TMDB lookup (is_add_title_to_poster=true, poster URL stored)
- docker-compose.yml: switch browserless to ghcr.io/browserless/chromium,
  add byparr service for CF bypass on public indexers, wire env vars to worker
- docker-compose-minimal.yml: stripped to api + postgres + redis only
- rate_limit.rs: added wait_rpm helper for per-minute rate limiting
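
Per-minute rate limiting of the kind wait_rpm provides can be sketched as a fixed-interval scheduler. The real helper's signature is not shown in the commit, so `RpmLimiter` and `reserve` are hypothetical; the sketch only computes how long a caller should wait and leaves the actual sleep (for example with an async timer) to the caller.

```rust
use std::time::{Duration, Instant};

/// Spaces requests evenly: at 15 RPM, one request every 4 seconds.
pub struct RpmLimiter {
    interval: Duration,
    next_allowed: Option<Instant>,
}

impl RpmLimiter {
    pub fn new(rpm: u32) -> Self {
        assert!(rpm > 0, "rpm must be positive");
        Self {
            interval: Duration::from_secs(60) / rpm,
            next_allowed: None,
        }
    }

    /// Reserves the next free slot and returns how long the caller should
    /// wait, measured from `now`, before sending its request.
    pub fn reserve(&mut self, now: Instant) -> Duration {
        let start = match self.next_allowed {
            Some(t) if t > now => t, // slot is still in the future
            _ => now,                // idle long enough, go immediately
        };
        self.next_allowed = Some(start + self.interval);
        start - now
    }
}
```

Passing `now` in explicitly keeps the logic deterministic and easy to test; production code would call `reserve(Instant::now())` and sleep for the returned duration.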
All torrent and usenet provider API calls are now routed through the
MediaFlow /proxy/forward endpoint when a non-local MediaFlow proxy is
configured, so debrid services see MediaFlow's IP on every TCP connection
instead of the addon server's IP.

Providers wired:
- realdebrid, alldebrid: forward + {mediaflow_ip} placeholder for ip= field
- torbox (torrents): forward + {mediaflow_ip} for user_ip= param
- debridlink: forward; CDN URL gets actual public IP via /proxy/ip
- premiumize: forward for both Bearer and ApiKey auth modes
- offcloud: forward with key= embedded in URL for GET, in body for POST
- seedr: forward through all API helpers including folder create (form POST)
- pikpak: forward with query params embedded in dest URL via append_query
- easydebrid: forward (X-Forwarded-For stripped by proxy — correct)
- stremthru: intentionally skipped (is itself a proxy service)
- usenet/torbox: forward including multipart NZB file upload via post_raw
- usenet/debrider: forward
- usenet/easynews: forward with Basic auth via get_auth helper

New transport.rs helpers added:
- delete(), get_no_auth(), post_form_no_auth() — for auth-in-URL patterns
- get_auth(authorization) — for verbatim auth header (Basic, etc.)
- post_raw(content_type, body: Vec<u8>) — for binary/multipart uploads
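
Two of the URL-building steps mentioned above, filling the {mediaflow_ip} placeholder and embedding query params in the destination URL, could look roughly like this. Both functions are illustrative stand-ins: the real append_query and the placeholder handling in transport.rs are not shown in the commit, and the request to /proxy/forward itself is omitted because its parameters are not described here.

```rust
/// Replaces the {mediaflow_ip} placeholder with the proxy's public IP.
pub fn fill_ip_placeholder(template: &str, mediaflow_ip: &str) -> String {
    template.replace("{mediaflow_ip}", mediaflow_ip)
}

/// Percent-encodes everything except RFC 3986 unreserved characters.
fn encode(component: &str) -> String {
    let mut out = String::new();
    for b in component.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Appends query params to a destination URL, using '?' for the first
/// param unless the URL already has a query string.
pub fn append_query(url: &str, params: &[(&str, &str)]) -> String {
    let mut out = String::from(url);
    for (i, (key, value)) in params.iter().enumerate() {
        let sep = if i == 0 && !url.contains('?') { '?' } else { '&' };
        out.push(sep);
        out.push_str(&encode(key));
        out.push('=');
        out.push_str(&encode(value));
    }
    out
}
```

Embedding the params in the destination URL means the forwarding proxy only needs to pass one opaque URL along, rather than having to merge its own query string with the provider's.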
pull[bot] locked and limited conversation to collaborators May 19, 2026
pull[bot] added the ⤵️ pull label May 19, 2026
pull[bot] merged commit 2426d3e into geek-cookbook:main May 19, 2026