๐ค Authors: GPT-5.3-Codex, Haodong Li
arXivClaw is a daily arXiv recommender that fetches new papers, scores relevance, and sends digest emails.
In default settings (fetching 500 latest arXiv papers per weekday, generally sufficient for about 3 categories; LLM_MODEL=gemini-3.1-flash-lite-preview), arXivClaw is free.
- Fetches new papers from arXiv for your selected categories/query
- Scores each paper using
keywords + title + abstract - Sorts papers by score (high to low)
- Uses threshold-first delivery with fallback minimum count:
- if papers above
MIN_RELEVANCE_SCOREare greater thanMIN_DAILY_PUSH_COUNT, sends all above-threshold papers (no upper limit) - otherwise sends top
MIN_DAILY_PUSH_COUNTpapers by score
- if papers above
- Runs automatically at 2:00 PM (by default) Los Angeles time on weekdays
- Sends one startup/init email when the process starts, including a brief explanation of key runtime settings
๐ฎ Startup Confirmation Email (an example) โฌ๏ธ

๐ฎ Daily Digest Email (an example) โฌ๏ธ

.env.example: template file with explanations and placeholder values (never put real secrets here).env: your real local config with API keys and SMTP password (ignored by Git)
First-time setup:
cp .env.example .envThen edit only .env and replace values marked as required.
git clone https://github.com/haodong2000/arXivClaw.git
cd arXivClawThen install dependencies:
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt- Sign up on Brevo.
- Go to
SMTP & APIand create anSMTP key. - Fill these values in
.env:SMTP_HOST=smtp-relay.brevo.comSMTP_PORT=587SMTP_USER=<your Brevo SMTP login>(usually*@smtp-brevo.com)SMTP_PASSWORD=<your Brevo SMTP key>
EMAIL_FROM: must be a valid/verified sender in BrevoEMAIL_TO: where you want to receive the digest
Recommended for easier delivery:
- Use a personal inbox as recipient first (for example
*@gmail.com) instead of school/work email systems. - Add your Brevo sender address (the actual
Fromaddress shown in Brevo logs) to your Contacts/Safe Senders list.
Note: Free-plan limits and policies may change. Always check the latest Brevo dashboard information.
Quota reminder: Before large runs, verify your Brevo sending limits and remaining quota.
- Create an API key in Google AI Studio.
- Set these values in
.env:
LLM_BASE_URL=https://generativelanguage.googleapis.com/v1beta/openai
LLM_API_KEY=YOUR_GEMINI_API_KEY
LLM_MODEL=gemini-3.1-flash-lite-preview- Set your interests:
KEYWORDS=agent, video generation, world model, LLM, VLM
MIN_RELEVANCE_SCORE=50
MIN_DAILY_PUSH_COUNT=50Quota reminder: Check your Gemini API quota/rate limits in Google AI Studio before increasing ARXIV_MAX_RESULTS.
In .env, set:
RUN_ONCE=trueARXIV_MAX_RESULTS=5(small and cheap test)ARXIV_TIMEOUT_SECONDS=30(recommended if your network is slow)ARXIV_MAX_RETRIES=3
Run:
PYTHONPATH=src python main.pyWhen RUN_ONCE=true, the app will:
- Enable verbose debug logs automatically
- Ignore
state.db(no dedup and no run persistence)
In .env, set:
RUN_ONCE=falseRUN_HOUR=14(24-hour format)RUN_MINUTE=0TIMEZONE=America/Los_AngelesINIT_EMAIL_ON_STARTUP=true(setfalseif you do not want startup confirmation email)
Run:
PYTHONPATH=src python main.pyIf logs say sent but inbox is empty, delivery is often blocked on the receiver side.
Try these steps:
- Check Spam/Junk folders.
- Add the Brevo sender address to Contacts and Safe Senders.
- Add sender domain allowlist when available:
brevosend.com. - For school/work mail systems, contact IT to allowlist at gateway level.
-
Digest email sentappears but no email received:- App-side sending usually succeeded.
- Check Brevo
Transactional Logsfor final status (delivered,blocked,bounced).
-
No scoring logs appear:
- In normal mode (
RUN_ONCE=false), already-processed papers are deduplicated. - Use
RUN_ONCE=truewhen debugging.
- In normal mode (
-
ReadTimeoutappears when fetching arXiv:- Increase
ARXIV_TIMEOUT_SECONDS(for example60or90). - Reduce
ARXIV_MAX_RESULTS(for example100for daily runs). - Keep
ARXIV_MAX_RETRIESat3or higher for unstable networks.
- Increase
.
โโโ main.py
โโโ requirements.txt
โโโ .env.example
โโโ src/arxivclaw
โโโ config.py
โโโ models.py
โโโ pipeline.py
โโโ clients
โ โโโ arxiv_client.py
โ โโโ llm_client.py
โ โโโ email_client.py
โโโ storage
โโโ state_store.py