This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Zotero-arXiv-Daily recommends new arXiv/bioRxiv/medRxiv papers based on a user's Zotero library. It computes embedding similarity between new papers and the user's existing library, generates TLDRs via LLM, and delivers results by email. Designed to run as a GitHub Actions workflow at zero cost.
# Run the application
uv run src/zotero_arxiv_daily/main.py
# Run tests (excludes slow tests by default)
uv run pytest
# Run all tests including slow ones
uv run pytest -m ""
# Run a single test
uv run pytest tests/test_utils.py::TestGlobMatch -v
# Install/sync dependencies
uv sync
No linter or formatter is configured.
The app follows a linear pipeline orchestrated by Executor (src/zotero_arxiv_daily/executor.py):
- Fetch Zotero corpus — retrieves user's library papers via pyzotero API
- Filter corpus — applies include_path glob patterns to select relevant collections
- Retrieve new papers — fetches from configured sources (arXiv RSS, bioRxiv/medRxiv REST API)
- Rerank — scores candidates by weighted similarity to corpus (newer Zotero papers weighted higher)
- Generate TLDRs + affiliations — via OpenAI-compatible LLM API
- Render + send email — HTML email via SMTP
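The Rerank step above can be sketched as follows. This is a minimal, pure-Python illustration rather than the project's actual code: the logarithmic decay in `recency_weights`, the helper names, and the toy two-dimensional embeddings are all assumptions made for the example. The only idea taken from the description is that each candidate is scored by similarity to the corpus, with newer Zotero papers weighted higher.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recency_weights(n):
    """Hypothetical decay: index 0 is the newest corpus paper and gets the
    largest weight. The weights are normalized to sum to 1."""
    raw = [1.0 / (1.0 + math.log10(i + 1)) for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def rerank(candidates, corpus):
    """Score candidates by recency-weighted similarity to the corpus.

    candidates: dict mapping a paper name to its embedding.
    corpus: list of embeddings, sorted newest first (an assumption).
    Returns (name, score) pairs, best match first.
    """
    weights = recency_weights(len(corpus))
    scores = {
        name: sum(w * cosine(vec, ref) for w, ref in zip(weights, corpus))
        for name, vec in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: two corpus papers (newest first) and two candidates.
corpus = [[1.0, 0.0], [0.0, 1.0]]
candidates = {"close_to_newest": [1.0, 0.0], "close_to_oldest": [0.0, 1.0]}
ranked = rerank(candidates, corpus)
```

With this toy data, the candidate that matches the newest corpus paper ranks first, because its similarity is multiplied by the larger weight.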
Retrievers (src/zotero_arxiv_daily/retriever/): Register via @register_retriever decorator, discovered by get_retriever_cls(). Each retriever implements _retrieve_raw_papers() and convert_to_paper().
Rerankers (src/zotero_arxiv_daily/reranker/): Register via @register_reranker decorator, discovered by get_reranker_cls(). Two implementations: local (sentence-transformers) and api (OpenAI-compatible embeddings endpoint).
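The decorator-based registration described for retrievers and rerankers might look roughly like the sketch below. The names register_retriever, get_retriever_cls, _retrieve_raw_papers, and convert_to_paper come from the description above. Everything else is an assumption for illustration: the decorator taking a name argument, the `_RETRIEVERS` dict, `BaseRetriever`, its `retrieve_papers` method, `DemoRetriever`, and the placeholder paper record. The real signatures in the repository may differ.

```python
from abc import ABC, abstractmethod

# Hypothetical registry mapping a retriever name to its class.
_RETRIEVERS = {}


def register_retriever(name):
    """Class decorator that records the class under `name` (assumed signature)."""
    def decorator(cls):
        _RETRIEVERS[name] = cls
        return cls
    return decorator


def get_retriever_cls(name):
    """Look up a registered retriever class by name."""
    try:
        return _RETRIEVERS[name]
    except KeyError:
        known = sorted(_RETRIEVERS)
        raise ValueError(f"Unknown retriever {name!r}; registered: {known}") from None


class BaseRetriever(ABC):
    """Assumed base class: subclasses supply the two hooks named in the text."""

    @abstractmethod
    def _retrieve_raw_papers(self):
        """Return source-specific raw records."""

    @abstractmethod
    def convert_to_paper(self, raw):
        """Convert one raw record into the shared paper representation."""

    def retrieve_papers(self):
        # Hypothetical driver method that chains the two hooks.
        return [self.convert_to_paper(raw) for raw in self._retrieve_raw_papers()]


@register_retriever("demo")
class DemoRetriever(BaseRetriever):
    def _retrieve_raw_papers(self):
        # Placeholder record, not real data from any source.
        return [{"title": "An Example Paper", "abstract": "Placeholder abstract."}]

    def convert_to_paper(self, raw):
        return {"source": "demo", "title": raw["title"], "abstract": raw["abstract"]}


papers = get_retriever_cls("demo")().retrieve_papers()
```

A registry like this lets a new source be added by defining one decorated class, without editing the code that selects retrievers by name.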
Configuration uses Hydra + OmegaConf. The config is composed from config/base.yaml (defaults) and config/custom.yaml (user overrides), with values in custom.yaml taking precedence. Environment variables are interpolated via the ${oc.env:VAR_NAME,default} syntax. The entry point uses @hydra.main.
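To make the composition and interpolation behavior concrete, here is a standard-library emulation of the two ideas: user overrides layered on defaults, and ${oc.env:VAR_NAME,default} falling back to the default when the variable is unset. It does not call Hydra or OmegaConf. The helper names, the config keys (`llm`, `api_key`, `model`), and the `DEMO_API_KEY` variable are all hypothetical, and the regex handles only the simple two-argument form shown above.

```python
import os
import re

# Matches the simple form ${oc.env:VAR_NAME,default} only.
_ENV_PATTERN = re.compile(r"\$\{oc\.env:([A-Za-z_][A-Za-z0-9_]*),([^}]*)\}")


def deep_merge(base, override):
    """Return a new dict in which `override` values win, merging nested dicts."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_env(node):
    """Recursively substitute env interpolations inside string values."""
    if isinstance(node, dict):
        return {key: resolve_env(value) for key, value in node.items()}
    if isinstance(node, str):
        return _ENV_PATTERN.sub(
            lambda m: os.environ.get(m.group(1), m.group(2)), node
        )
    return node


# Stand-ins for config/base.yaml and config/custom.yaml (hypothetical keys).
base = {"llm": {"api_key": "${oc.env:DEMO_API_KEY,unset}", "model": "base-model"}}
custom = {"llm": {"model": "custom-model"}}

os.environ["DEMO_API_KEY"] = "secret-from-env"
config = resolve_env(deep_merge(base, custom))
```

Here `config["llm"]["model"]` comes from the override, while `config["llm"]["api_key"]` is read from the environment. If `DEMO_API_KEY` were unset, the value would fall back to `unset`.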
Paper and CorpusPaper are defined in src/zotero_arxiv_daily/protocol.py. Paper has LLM-powered methods (generate_tldr, generate_affiliations) that call the OpenAI API directly.
Tests marked @pytest.mark.slow require heavy dependencies (e.g., sentence-transformers model download) and are skipped locally by default (addopts = "-m 'not slow'" in pyproject.toml). All other tests run with pure Python stubs (no Docker containers needed).
# Run tests (excludes slow tests)
uv run pytest
# Run all tests including slow ones
uv run pytest -m ""
# Run with coverage
uv run pytest --cov=src/zotero_arxiv_daily --cov-report=term-missing
Use the /browse skill from gstack for all web browsing. Never use mcp__claude-in-chrome__* tools.
Available skills: /office-hours, /plan-ceo-review, /plan-eng-review, /plan-design-review, /design-consultation, /design-shotgun, /design-html, /review, /ship, /land-and-deploy, /canary, /benchmark, /browse, /connect-chrome, /qa, /qa-only, /design-review, /setup-browser-cookies, /setup-deploy, /retro, /investigate, /document-release, /codex, /cso, /autoplan, /plan-devex-review, /devex-review, /careful, /freeze, /guard, /unfreeze, /gstack-upgrade, /learn.
If gstack skills aren't working, run cd .claude/skills/gstack && ./setup to build the binary and register skills.
- PRs should target the dev branch, not main
- Current development branch: dev