OpenResearchFlow

English | 中文

OpenResearchFlow is an agentic research automation workspace for turning research goals, papers, and experiment evidence into executable plans, runnable experiment repositories, paper drafts, figures, titles, paper reviews, and reference-quality check reports.

What OpenResearchFlow Does
Architecture
Quick Start
Configure LLM and API Keys
Start and Stop Services
One-Command Mock Demo
Functional Modules
Frontend Surface
Pipeline Handoff
Interface Preview
Artifact Contract
Development

What OpenResearchFlow Does

OpenResearchFlow focuses on the paper-to-experiment-to-paper loop:

Generate and validate research ideas from a target research problem.
Convert promising ideas into experiment plans and implementation workspaces.
Run rapid validation and full experiment stages.
Generate paper drafts, figures, titles, paper reviews, and reference-quality reports.

The active backend modules live under backend/src/modules:

idea_gen_v2
idea2exp
exp_gen_v2
exp_run
draft_gen
paper2figure
paper2title
papercheck, also referred to as paper2check in product discussions
review_gen

Other module directories are historical or compatibility code unless explicitly called by one of the active modules above.

Architecture

OpenResearchFlow runs as a Next.js frontend plus a FastAPI gateway and Python workflow modules. The frontend owns module entry points, run creation, status views, artifact inspection, and workspace Runtime/API Key configuration. The backend owns workflow execution, model calls, experiment execution, and artifact persistence.

frontend/
  src/app/                  Next.js app routes
  src/app/workspace/        workspace pages for module run management
  src/core/                 typed API clients, loaders, hooks, and utilities

backend/
  src/gateway/              FastAPI run-management and artifact APIs
  src/modules/              active workflow modules and historical modules
  src/agents/               lead-agent runtime and thread state
  src/tools/, src/sandbox/  tool and execution infrastructure

Typical run flow:

Workspace page or home module launcher
  -> FastAPI gateway run API
  -> backend/src/modules/<module>/workflow.py
  -> backend/output/<module>/...
  -> event logs, reports, and artifacts loaded back into the frontend

Default local ports: frontend at http://127.0.0.1:3001, gateway at http://127.0.0.1:8101. The frontend proxies /api/* to the gateway via next.config.js, so the browser does not need a separate backend address.

Quick Start

The steps below are the shortest path from a clean checkout to a running pipeline in the browser.

1. Install local tools

You need node, pnpm, uv, and curl. Rendering real PDFs (Draft Gen) also needs a LaTeX engine; on macOS with Homebrew, tectonic is recommended:

make check            # verify node / pnpm / uv / curl are present
brew install tectonic # optional: required only for Draft Gen PDF rendering

2. Install dependencies

make install          # backend: uv sync;  frontend: pnpm install

3. Generate local config files

make config

make config copies three local config files from safe templates (existing files are not overwritten):

Generated file	Template	Purpose
`.env`	`.env.example`	Main backend config: LLM / retrieval / parsing API keys
`frontend/.env`	`frontend/.env.example`	Frontend local overrides, usually no change needed
`config.yaml`	`config.example.yaml`	Gateway / module runtime parameters

4. Configure your LLM API key

Edit the root .env and fill in at least one LLM provider (see the next section). To just look at the UI first, skip this step and run the mock demo.

5. Start the services

make demo             # optional: seed mock data that needs no API key
make dev              # start gateway + frontend

Open http://127.0.0.1:3001/workspace to start using it.

Configure LLM and API Keys

Backend configuration is centralized in the root .env (generated by make config from .env.example). The template is grouped by dependency type; fill in only the providers you actually use and leave the rest empty.

Tip: the frontend Runtime/API Keys settings page can override some model/service parameters in the browser for quick trials, but backend modules that read environment variables directly still need the matching values in .env. Do not put private API keys in frontend/.env.

Required: LLM provider

Most generation modules (idea_gen_v2, idea2exp, exp_gen_v2, draft_gen, paper2title, review_gen) depend on an LLM provider. Fill in at least one set in .env:

# Project-native / compatible gateway
DF_API_URL=https://your-llm-gateway/v1
DF_API_KEY=sk-...

# Or an OpenAI-compatible endpoint
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY=sk-...

# Default generation model and language
IDEA_GEN_MODEL=gpt-4o
IDEA_GEN_LANGUAGE=en
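
Before starting the services, it can help to confirm that at least one provider set is actually filled in. The following is a minimal sketch, not part of the project: it parses a .env file and reports which of the two variable pairs shown above are complete. The helper names (parse_env, configured_providers) and the rule that a set counts as configured only when both its URL and key are non-empty are assumptions of this example.

```python
from pathlib import Path

# Variable pairs taken from the README; treating "both set" as "configured"
# is an assumption of this sketch, not the backend's documented rule.
PROVIDER_SETS = {
    "project-native gateway": ("DF_API_URL", "DF_API_KEY"),
    "OpenAI-compatible endpoint": ("OPENAI_BASE_URL", "OPENAI_API_KEY"),
}


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def configured_providers(env: dict[str, str]) -> list[str]:
    """Return the provider sets whose URL and key are both non-empty."""
    return [
        name
        for name, keys in PROVIDER_SETS.items()
        if all(env.get(key) for key in keys)
    ]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env(env_path.read_text()) if env_path.exists() else {}
    found = configured_providers(env)
    print("configured:", ", ".join(found) if found else "none (use make demo)")
```

Run it from the repository root. If it reports none, the mock demo is still available without any key.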

Optional: everything else

Dependency group	Key variables	Modules
Paper / web retrieval	`SEMANTIC_SCHOLAR_API_KEY`, `AI4SCHOLAR_API_KEY`, `SERPAPI_API_KEY`, `TAVILY_API_KEY`	`idea_gen_v2`, `idea2exp`, `draft_gen`, `papercheck`, `review_gen`
PDF parsing	`MINERU_API_TOKEN` (or a local MinerU service)	`idea2exp`, `draft_gen`, `paper2figure`, `review_gen`
Figure generation	`TEXT_API_URL` / `TEXT_API_KEY`, `IMAGE_API_URL` / `IMAGE_API_KEY`	`paper2figure`
Experiment execution agents	`PAPERAGENT_EXP_RUN_CODEX_*`, `PAPERAGENT_EXP_RUN_CC_*`	`exp_run`
ExpGen plan/gen/eval agents	`EXP_GEN_PLAN_*`, `EXP_GEN_GEN_*`, `EXP_GEN_EVAL_*`	`exp_gen_v2`
Repository access	`GITHUB_TOKEN`, `EXP_GEN_GITHUB_TOKEN`	`exp_gen_v2`, `exp_run`

Every block in .env.example is annotated with its purpose; see docs/open-source/runtime-config.md for the full guide. Keep local secret files such as .env and .env.cc out of git.

Start and Stop Services

make dev runs scripts/start-services.sh, which:

checks that uv, pnpm, and curl are available;
starts the gateway in backend/ via uv run uvicorn and waits for /health;
starts the frontend in frontend/ via pnpm next dev and waits for /workspace;
writes logs to logs/ and process PIDs to logs/pids/.

make dev                       # equivalent to ./scripts/start-services.sh
./scripts/start-services.sh    # or run the script directly

To run the same targets inside a conda environment (pzw-dev in this example):

conda run -n pzw-dev make demo
conda run -n pzw-dev make dev

On success the script prints the frontend and gateway URLs along with each workspace route.

Common scripts and port overrides:

./scripts/check-services.sh    # report whether gateway / frontend are running
./scripts/stop-services.sh     # stop services (same as make stop)
make clean                     # stop services and clean local logs

# Override ports when the defaults (3001 / 8101) are taken
FRONTEND_PORT=4001 GATEWAY_PORT=9101 make dev
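
The readiness wait on the gateway's /health endpoint can be approximated from Python, for example in a test harness that needs the gateway up before it sends requests. This is a rough sketch rather than a port of scripts/start-services.sh: the function names, the retry count, the delay, and the assumption that a ready gateway answers /health with HTTP 200 are all illustrative. It honors the same GATEWAY_PORT override shown above.

```python
import os
import time
import urllib.error
import urllib.request


def gateway_health_url(environ=os.environ) -> str:
    """Build the /health URL, honoring the GATEWAY_PORT override."""
    port = environ.get("GATEWAY_PORT", "8101")
    return f"http://127.0.0.1:{port}/health"


def _http_ok(url: str) -> bool:
    """Return True if the URL answers with HTTP 200 (assumed ready signal)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False


def wait_until_ready(url: str, attempts: int = 30, delay: float = 1.0,
                     probe=None) -> bool:
    """Poll url until the probe succeeds or the attempts run out."""
    probe = probe or _http_ok
    for attempt in range(attempts):
        if probe(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A typical call is wait_until_ready(gateway_health_url()). The probe parameter exists only so the loop can be exercised without a running gateway.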

Tail the live logs:

tail -f logs/frontend.log
tail -f logs/gateway.log

One-Command Mock Demo

Open-source users can inspect the full OpenResearchFlow pipeline without configuring provider keys:

make config
make demo
make dev

Open:

Home: http://127.0.0.1:3001
Mock start: http://127.0.0.1:3001/workspace/idea-gen/openresearchflow-mock-idea-001
Idea2Exp: http://127.0.0.1:3001/workspace/idea2exp/openresearchflow-mock-idea2exp-001
ExpGen V2: http://127.0.0.1:3001/workspace/exp-gen_v2/openresearchflow-mock-expgen-001
ExpRun: http://127.0.0.1:3001/workspace/exp-run/openresearchflow-mock-exprun-001
Draft Gen: http://127.0.0.1:3001/workspace/draft-gen/openresearchflow-mock-draft-001
Mock end: http://127.0.0.1:3001/workspace/review-gen/openresearchflow-mock-review-001

make demo writes deterministic mock runs for idea_gen_v2 -> idea2exp -> exp_gen_v2 -> exp_run -> draft_gen -> papercheck -> review_gen, plus DraftGen PDF handoff data for paper2figure and paper2title. See docs/open-source/demo.md.

Functional Modules

Module	Main role	Typical inputs	Key outputs	Frontend
`idea_gen_v2`	Generates research seeds, hypotheses, ideas, validation evidence, critique, and final reports.	Research target, domain, venue, run strategy	Idea report, stage artifacts, validation evidence	`Idea Gen`
`idea2exp`	Converts an idea card and upstream evidence into a structured experiment plan using retrieval and paper extraction.	Idea card, related papers, parser/search credentials	Experiment plan, extracted evidence	`Idea2Exp`
`exp_gen_v2`	Generates milestone-based experiment implementation plans, code workspaces, reviews, and handoff artifacts.	Idea/experiment plan, reference repos, generation constraints	Milestone workspace, repo plan, handoff artifacts	`ExpGen V2`
`exp_run`	Executes rapid validation and full experiment stages.	Generated repo, run matrix, agent runtime credentials	Smoke-test results, full run outputs, final report	`ExpRun`
`draft_gen`	Generates and validates paper drafts from idea, experiment, parse, and reference artifacts.	Idea artifacts, experiment artifacts, references, parsed paper context	LaTeX draft, section artifacts, citation checks	`Draft Gen`
`paper2figure`	Generates charts or conceptual figures from paper context, tables, and related-work evidence.	Paper text, table text, image/text model credentials	Figure assets, review artifacts	`Paper2Figure`
`paper2title`	Generates concise English academic titles from paper body text.	Paper body text, title-generation model	Candidate title	`Paper2Title`
`papercheck`	Parses BibTeX or PDF references, runs static and optional online checks, and reports reference health.	BibTeX/PDF references, optional online-check settings	Reference-quality report	`Paper Check`
`review_gen`	Parses a paper PDF and generates review feedback across novelty, technical correctness, experimental rigor, clarity, and impact.	Paper PDF, language, paper-search settings, extra context	Review report, annotations, key references, related-work analysis	`Paper Review`

Each module has a dedicated flow and design note with a Mermaid flowchart.

Frontend Surface

The home page at / is the OpenResearchFlow module overview. It introduces active modules, module capabilities, and the OpenResearchFlow architecture diagram.

The workspace sidebar intentionally shows only current user-facing module surfaces:

idea_gen_v2 as Idea Gen
idea2exp as Idea2Exp
exp_gen_v2 as ExpGen V2
exp_run as ExpRun
draft_gen as Draft Gen
papercheck as Paper Check
review_gen as Paper Review
paper2figure as Paper2Figure
paper2title as Paper2Title

The built-in DeerFlow chat feature and historical preview modules are not part of the primary OpenResearchFlow navigation. /workspace redirects to /workspace/idea-gen. Each module page owns run creation, upstream session import, status display, artifact inspection, and module-level Runtime/API Key configuration.

Pipeline Handoff

OpenResearchFlow can continue from one module run into the next module without manually copying paths or context:

idea_gen_v2 -> idea2exp -> exp_gen_v2 -> exp_run -> draft_gen
                                             |
                                             -> papercheck / review_gen
draft_gen -> paper2figure / paper2title

Frontend handoff helpers compact upstream run fields, artifact paths, and summaries into downstream initial inputs. UpstreamSessionImporter imports existing runs into module forms, and the shared artifact preview uses the same relative-path contract for text, JSON, PDF, and image outputs.
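
To make the compaction step concrete, here is a small sketch of what reducing an upstream run to downstream initial inputs could look like. The real helpers live in the frontend; this Python version is only an illustration, and every field name in it (module, run_id, artifacts, summary, and the upstream_* keys) is hypothetical rather than taken from the codebase.

```python
def compact_for_handoff(run: dict, max_summary_chars: int = 280) -> dict:
    """Reduce an upstream run record to the fields a downstream form needs.

    All field names here are hypothetical; only the idea of carrying run
    fields, artifact paths, and a summary forward comes from the README.
    """
    summary = (run.get("summary") or "").strip()
    if len(summary) > max_summary_chars:
        summary = summary[:max_summary_chars].rstrip() + "..."
    return {
        "upstream_module": run.get("module"),
        "upstream_run_id": run.get("run_id"),
        # Keep paths relative so the downstream module can resolve them
        # against the same run directory the artifact preview uses.
        "artifact_paths": dict(run.get("artifacts") or {}),
        "summary": summary,
    }
```

Anything not listed, such as event logs, is dropped, which keeps the downstream form input small while the full run stays available on disk.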

Interface Preview

The screenshots below come from the mock demo and reproduce locally without any API key.

Screenshots (not reproduced here) cover the home module overview and each module interface: Idea Gen, Idea2Exp, ExpGen V2, ExpRun, Draft Gen, Paper2Figure, Paper2Title, Paper Check, and Paper Review.

Artifact Contract

Module artifacts use this shared layout:

backend/output/<module>/<run_id>/
  run.json
  events.jsonl
  <artifact files and directories>

run.json.artifacts must contain relative paths. Frontend artifact previews and downstream session imports rely on those stable paths. See docs/open-source/artifact-contract.md for the complete contract and key artifacts by module.
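
The relative-path rule is easy to check mechanically. The sketch below is an illustration under stated assumptions, not the project's validator: it assumes run.json is a JSON object whose artifacts value is either a list of paths or a name-to-path mapping, that paths use POSIX separators, and that a path containing .. should also be flagged. The function names are invented for this example; consult docs/open-source/artifact-contract.md for the real schema.

```python
import json
from pathlib import Path, PurePosixPath


def artifact_paths(artifacts) -> list[str]:
    """Flatten an artifacts value (list or name->path mapping) to paths."""
    if isinstance(artifacts, dict):
        return [str(value) for value in artifacts.values()]
    return [str(value) for value in artifacts or []]


def non_relative_paths(artifacts) -> list[str]:
    """Return artifact paths that are absolute or escape the run directory."""
    bad = []
    for path in artifact_paths(artifacts):
        pure = PurePosixPath(path)
        if pure.is_absolute() or ".." in pure.parts:
            bad.append(path)
    return bad


def check_run(run_dir: Path) -> list[str]:
    """Load run.json from a run directory and report offending paths."""
    run = json.loads((run_dir / "run.json").read_text(encoding="utf-8"))
    return non_relative_paths(run.get("artifacts"))
```

Calling check_run on a directory under backend/output/<module>/ returns an empty list when every artifact path is relative to that run directory.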

Development

Run the frontend in isolation (when the gateway runs elsewhere):

cd frontend
pnpm dev

Backend tests:

cd backend
uv run pytest tests/ -v

Frontend verification:

cd frontend
pnpm typecheck
pnpm lint

Other conventions:

Gateway APIs are mounted under /api/... and proxied by frontend/next.config.js during local development.
Module run artifacts are stored under backend/output/<module>/... unless a module accepts an explicit output path.
Long-running experiment modules may require model provider credentials, GitHub credentials, dataset access, or CLI-agent configuration.
Local .env, .env.cc, generated output, and ad-hoc research artifacts should stay uncommitted unless they are intentionally promoted into docs or tests.
README figure prompts are saved next to the generated images under docs/assets.
