AI agent for scientific data extraction
Part of Schmidt OxRSE Workshop (Sep 11–20, 2025)
Curaitor Agent is an AI-powered tool designed to extract, organize, and process scientific data.
It provides:
- A web interface for running the agent.
- Model Context Protocol (MCP) inspector integration to test tools and server connections.
https://curaitor-agent-docs.readthedocs.io/latest/
curl -LsSf https://astral.sh/uv/install.sh | sh
git clone [email protected]:ritesh001/curaitor-agent.git
cd curaitor-agent
uv sync
Choose the model you want to use under llm:
- provider: openai
- model: "gpt-5-mini"
- Send your Gmail address to [email protected] to be added to the user pool.
Create a .env file in the agent folder with the following variables:
OPENAI_API_KEY=
OPENROUTER_API_KEY=
GMAIL_CREDENTIALS_PATH=
GMAIL_TOKEN_PATH=secrets/token.json
Create the Gmail token. This will work for 1 hour.
uv run python curaitor_agent_v2/gmail_create_token.py
uv run adk web
This runs the literature RAG workflow orchestrated by LangGraph using your config and API keys.
uv run python -m curaitor_agent.langraph_pipeline --query "your research question"
Automate the pipeline via cron or macOS launchd.
- Edit crontab:
crontab -e
- Add lines (update absolute paths):
SHELL=/bin/zsh
PATH=/opt/homebrew/bin:/usr/local/bin:/usr/bin:/bin
0 7 * * * cd /absolute/path/to/curaitor-agent && /opt/homebrew/bin/uv run python scripts/run_daily.py --query "plastic recycling" --max-days 7 --db data/curaitor.sqlite >> logs/langraph_daily.log 2>&1
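The schedule in the crontab entry is the five fields before the command: minute, hour, day of month, month, and day of week. The sketch below labels those fields so it is easier to see what to change when picking a different time of day. `describe_cron` is a hypothetical helper written for illustration; it is not part of curaitor-agent and does not validate field values.

```python
# Hypothetical helper for illustration only; not part of curaitor-agent.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def describe_cron(schedule: str) -> dict[str, str]:
    """Map the five space-separated cron schedule fields to their names."""
    parts = schedule.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 fields, got {len(parts)}")
    return dict(zip(CRON_FIELDS, parts))


# The schedule used in the crontab line: minute 0 of hour 7, every day.
fields = describe_cron("0 7 * * *")
print(fields)
```

To run the job at 18:30 instead of 07:00, the schedule would become `30 18 * * *`.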
- Copy scripts/launchd/curaitor.langraph.sample.plist to ~/Library/LaunchAgents/com.curaitor.langgraph.daily.plist
- Edit the plist and replace all /absolute/path/to/curaitor-agent with your repo path
- Ensure log directory exists:
mkdir -p /absolute/path/to/curaitor-agent/logs
- Load:
launchctl load ~/Library/LaunchAgents/com.curaitor.langgraph.daily.plist
launchctl start com.curaitor.langgraph.daily
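If you would rather not edit the plist by hand, the copy and replace steps can be scripted. The following is a minimal sketch, assuming the sample plist contains the literal placeholder `/absolute/path/to/curaitor-agent`; the function names and the sample string are made up for illustration and are not part of the repository.

```python
from pathlib import Path

# Placeholder string named in the setup steps.
PLACEHOLDER = "/absolute/path/to/curaitor-agent"


def fill_repo_path(template_text: str, repo_path: str) -> str:
    """Replace every occurrence of the placeholder with the real repo path."""
    return template_text.replace(PLACEHOLDER, repo_path.rstrip("/"))


def install_plist(template: Path, target: Path, repo_path: str) -> None:
    """Write a filled-in copy of the template plist to the target location."""
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(fill_repo_path(template.read_text(), repo_path))


# Demonstration on an invented one-line snippet (not the real plist contents).
sample = "<string>/absolute/path/to/curaitor-agent/logs/langraph_daily.log</string>"
filled = fill_repo_path(sample, "/Users/me/code/curaitor-agent")
print(filled)
```

Calling `install_plist` with the sample plist path, the LaunchAgents target path, and your repo path would perform the copy and the replacement in one step; the `launchctl` commands are still run afterwards.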
The CLI wrapper scripts/run_daily.py runs the pipeline and upserts results into data/curaitor.sqlite by default.
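An upsert inserts a row if its key is new and updates the existing row otherwise, so rerunning the daily job does not create duplicates. The sketch below shows the general pattern with Python's built-in `sqlite3` module. The `papers` table, its columns, and the sample IDs are assumptions made for illustration; the actual schema used by scripts/run_daily.py may differ.

```python
import sqlite3

# Hypothetical schema for illustration; the real tables may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS papers (
    arxiv_id TEXT PRIMARY KEY,
    title    TEXT NOT NULL,
    summary  TEXT
)
"""

# Insert a new row, or update title and summary if arxiv_id already exists.
UPSERT = """
INSERT INTO papers (arxiv_id, title, summary)
VALUES (:arxiv_id, :title, :summary)
ON CONFLICT(arxiv_id) DO UPDATE SET
    title = excluded.title,
    summary = excluded.summary
"""


def upsert_papers(conn: sqlite3.Connection, papers: list[dict]) -> None:
    """Create the table if needed, then upsert all rows in one transaction."""
    conn.execute(SCHEMA)
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, papers)


conn = sqlite3.connect(":memory:")  # in-memory database for the demo
upsert_papers(conn, [{"arxiv_id": "0000.00001", "title": "Draft title", "summary": None}])
upsert_papers(conn, [{"arxiv_id": "0000.00001", "title": "Final title", "summary": "Short summary."}])
rows = conn.execute("SELECT arxiv_id, title, summary FROM papers").fetchall()
print(rows)
```

The second call updates the existing row rather than adding a new one, so the table still holds a single entry for that ID.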
- create database
- query database
- search and summarize papers from arXiv
- schedule time of day for daily search
- send email summary to yourself
- send email to [email protected] to be added to the user pool
- Sync when requirements.txt is updated:
uv sync
- Add a new package:
uv add package-name
(Don’t forget to update requirements.txt!)
The MCP Inspector helps verify your MCP server connection and test available tools.
- nvm (Node Version Manager)
- Node.js ≥ 18 (v22 recommended)
- Install nvm:
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.3/install.sh | bash
\. "$HOME/.nvm/nvm.sh"
- Install Node.js v22:
nvm install 22
- Verify versions:
node -v # v22.19.0
npm -v # 10.9.3
- Run the MCP Inspector:
npx @modelcontextprotocol/inspector uv run tools/mcp_server.py
- In the MCP Inspector UI, click Connect → test tools.
- Ensure you’re using Node.js v22.x when running the inspector.
- Always keep your environment in sync with requirements.txt for reproducibility.
This project is licensed under the MIT License.