The Research Project Template provides two main entry points for pipeline operations:
- `run.sh` - Main entry point for manuscript pipeline operations (interactive TUI)
- `uv run python scripts/execute_pipeline.py --project {name} --core-only` - Core 10-stage DAG pipeline without LLM features
The Research Project Template follows a thin orchestrator pattern where all business logic resides in infrastructure/ and projects/{name}/src/ modules, while entry points and scripts act as lightweight coordinators.
```text
┌─────────────────────────────────────────────────────────────┐
│                      User Interface                         │
│        run.sh → execute_pipeline.py → PipelineExecutor      │
└─────────────────────┬───────────────────────────────────────┘
                      │ delegates to
                      ▼
┌─────────────────────────────────────────────────────────────┐
│                   Orchestration Layer                       │
│ scripts/00_*.py … scripts/07_*.py → infrastructure/ modules │
│ projects/{name}/scripts/*.py → projects/{name}/src/         │
└─────────────────────┬───────────────────────────────────────┘
                      │ implements
                      ▼
┌─────────────────────────────────────────────────────────────┐
│                     Business Logic                          │
│ infrastructure/ (reusable) + projects/{name}/src/ (custom)  │
└─────────────────────────────────────────────────────────────┘
```
Layer 1: Entry Points (Thin Orchestrators)
- `run.sh`: Bash menu system that delegates to Python orchestrators
- `execute_pipeline.py`: Python pipeline coordinator using `PipelineExecutor`
- `execute_multi_project.py`: Multi-project orchestration using `MultiProjectOrchestrator`
- Purpose: User interface and high-level coordination only
Layer 2: Stage Scripts (Thin Orchestrators)
- `scripts/00_*.py`–`scripts/07_*.py`: Import from `infrastructure/` for business logic (numbered stage entry points; not all run in a single `--core-only` pass)
- `projects/{name}/scripts/*.py`: Import from `projects/{name}/src/` for business logic
- Purpose: Stage-specific coordination and I/O handling
Layer 3: Business Logic (Actual Implementation)
- `infrastructure/`: Generic, reusable algorithms and utilities
- `projects/{name}/src/`: Project-specific scientific code and analysis
- Purpose: All computational logic and algorithms
```mermaid
graph TD
    A[User] --> B[run.sh]
    B --> C[execute_pipeline.py]
    C --> D[PipelineExecutor]
    D --> E[scripts/00_setup_environment.py]
    D --> F[scripts/01_run_tests.py]
    D --> G[scripts/02_run_analysis.py]
    D --> H[scripts/03_render_pdf.py]
    E --> I[infrastructure.core.runtime.environment]
    F --> J[infrastructure.reporting.test_reporter]
    G --> K[infrastructure.core.runtime.script_discovery]
    H --> L[infrastructure.rendering.RenderManager]
    K --> M["projects/{name}/scripts/*.py"]
    M --> N["projects/{name}/src/"]
```
- ✅ Separation of Concerns: Clear boundaries between orchestration and computation
- ✅ Reusability: Infrastructure modules work across all projects
- ✅ Testability: Business logic isolated and thoroughly tested
- ✅ Maintainability: Changes to algorithms don't affect orchestration
- ✅ Extensibility: New projects inherit infrastructure
✅ CORRECT: Thin Orchestrator Pattern

```python
# scripts/03_render_pdf.py (orchestrator)
from infrastructure.rendering import RenderManager

def run_render_pipeline():
    renderer = RenderManager()                   # Import business logic
    pdf = renderer.render_pdf("manuscript.tex")  # Delegate computation
    return validate_output(pdf)                  # Orchestrate validation
```

❌ INCORRECT: Violates Architecture

```python
# scripts/03_render_pdf.py (WRONG - implements logic)
def render_pdf_to_tex(content):
    # Business logic in orchestrator - WRONG!
    lines = content.split('\n')
    tex_lines = []
    for line in lines:
        if line.startswith('# '):
            tex_lines.append(f'\\section{{{line[2:]}}}')
        # ... complex rendering logic ...
    return '\n'.join(tex_lines)
```

The template now supports multiple research projects in a single repository. You can:
- Run individual projects: `./run.sh --project <name> --pipeline`
- Run all projects sequentially: `./run.sh --all-projects --pipeline`
- Interactive project selection: `./run.sh` (shows the menu of available projects)
Projects are discovered dynamically from projects/ (see infrastructure.project.discovery.discover_projects()). Authoritative names: _generated/active_projects.md (see _generated/README.md for policy and regeneration). Examples in this guide use code_project as the stable control-positive layout under projects/.
Archived and in-progress work lives under projects_archive/ and projects_in_progress/ and is not executed by ./run.sh until moved into projects/.
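Conceptually, discovery reduces to scanning the `projects/` directory; the following is an illustrative sketch only (the real logic lives in `infrastructure.project.discovery.discover_projects()`, and the "has a `src/` subdirectory" heuristic is an assumption):

```python
from pathlib import Path

def discover_projects(root="projects"):
    """Return sorted project names found under root.

    Hypothetical heuristic: a directory counts as a project
    if it contains a src/ subdirectory.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        p.name for p in base.iterdir()
        if p.is_dir() and (p / "src").is_dir()
    )
```

Sorting makes the result deterministic, which matters for reproducible multi-project runs.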
```bash
# Interactive project selection
./run.sh

# Run specific project
./run.sh --project code_project --pipeline

# Run all projects sequentially
./run.sh --all-projects --pipeline

# Alternative orchestrator (all projects)
uv run scripts/execute_multi_project.py
```

`run.sh` provides an interactive menu for all manuscript pipeline operations:
```bash
./run.sh
```

```text
============================================================
            Manuscript Pipeline - Main Menu
============================================================
⚙️  CORE STAGES
  0  Setup Environment
  1  Run Tests (infra + project)
  2  Run Analysis Scripts
  3  Render PDF
  4  Validate Output
  5  LLM Scientific Review
  6  LLM Translations
🚀 ORCHESTRATION
  7  Core Pipeline        [+infra] [-LLM]  Stages [1/9]..[9/9] (no LLM stages)
  8  Full Pipeline        [+infra] [+LLM]  Stages [1/9]..[9/9] + optional LLM stages
  9  Full Pipeline (fast) [-infra] [+LLM]  Skip infra tests
📚 MULTI-PROJECT
  a  All projects full        [+infra] [+LLM] [+report]
  b  All projects full (fast) [-infra] [+LLM] [+report]
  c  All projects core        [+infra] [-LLM] [+report]
  d  All projects core (fast) [-infra] [-LLM] [+report]
🔧 PROJECT MANAGEMENT
  p  Change Active Project   [Current: <project_name>]
  i  Show Project Info
  q  Quit
============================================================
```
Verifies the environment is ready for the pipeline.
- Checks Python version (requires >=3.10)
- Verifies dependencies are installed
- Confirms build tools (pandoc, xelatex) are available
- Validates directory structure
- Sets up environment variables
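The checks above can be sketched in a few lines. This is an illustrative simplification, not the actual `infrastructure` implementation; the function name and return shape are assumptions:

```python
import shutil
import sys

def check_environment(min_version=(3, 10), tools=("pandoc", "xelatex")):
    """Return a list of problems; an empty list means the environment is ready."""
    problems = []
    if sys.version_info[:2] < min_version:
        problems.append(
            f"Python >= {min_version[0]}.{min_version[1]} required, "
            f"found {sys.version_info[0]}.{sys.version_info[1]}"
        )
    for tool in tools:  # build tools checked by stage 0
        if shutil.which(tool) is None:
            problems.append(f"missing build tool: {tool}")
    return problems
```

Returning a list of human-readable problems (rather than raising on the first one) lets the stage report every issue in a single run.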
Executes the test suite with coverage validation.
- Runs infrastructure tests (`tests/infra_tests/`) with a 60%+ coverage threshold
- Runs project tests (`projects/{name}/tests/`) with a 90%+ coverage threshold
- Generates HTML coverage reports for both suites
- Generates structured test reports (JSON, Markdown)

Coverage Reports: `htmlcov/index.html`
Executes project analysis scripts with progress tracking.
- Discovers scripts in `projects/{name}/scripts/`
- Executes each script in order with progress tracking
- Collects outputs to `projects/{name}/output/`
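Deterministic ordering matters for reproducible runs. A minimal discover-and-run sketch (hypothetical; the real orchestrator in `infrastructure.core.runtime.script_discovery` adds progress tracking and output collection):

```python
import subprocess
import sys
from pathlib import Path

def discover_scripts(scripts_dir):
    """Analysis scripts run in sorted (lexicographic) order."""
    return sorted(Path(scripts_dir).glob("*.py"))

def run_scripts(scripts_dir):
    """Run each discovered script; check=True aborts the stage on failure."""
    for i, script in enumerate(discover_scripts(scripts_dir), start=1):
        print(f"[{i}] running {script.name}")
        subprocess.run([sys.executable, str(script)], check=True)
```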
Generates manuscript PDFs with progress tracking.
- Processes `projects/{name}/manuscript/` markdown files
- Converts to LaTeX via pandoc
- Compiles to PDF via xelatex
- Also runs analysis scripts first (option 2)

Output: `projects/{name}/output/pdf/`
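The two conversion steps reduce to command invocations like the following sketch (the exact flags used by `RenderManager` may differ; these are minimal assumed forms):

```python
def pandoc_cmd(md_path, tex_path):
    """Markdown -> LaTeX conversion step."""
    return ["pandoc", md_path, "-o", tex_path]

def xelatex_cmd(tex_path):
    """LaTeX -> PDF compilation step (non-interactive for pipelines)."""
    return ["xelatex", "-interaction=nonstopmode", tex_path]
```

Each command would typically be executed via `subprocess.run(cmd, check=True)` so that a non-zero exit stops the stage.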
Validates build quality with reporting.
- Checks generated PDFs for issues
- Validates markdown references
- Checks figure integrity
- Generates validation reports (JSON, Markdown)
Generates AI-powered manuscript reviews using local Ollama LLM.
- Checks Ollama availability and selects best model
- Extracts full text from combined PDF manuscript
- Generates executive summary, quality review, methodology review, and improvement suggestions
- Saves all reviews to `projects/{name}/output/llm/`
Requires: Running Ollama server with at least one model installed. Skips gracefully if unavailable.
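A graceful-skip check can be as simple as probing Ollama's default local API (port 11434). This is an illustrative sketch, not the template's actual detection code:

```python
import urllib.error
import urllib.request

def ollama_available(url="http://localhost:11434/api/tags", timeout=2):
    """Return True if a local Ollama server answers; False means skip LLM stages."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

Returning `False` instead of raising lets the orchestrator map "Ollama unavailable" to the skipped exit code rather than a failure.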
Generates multi-language technical abstract translations.
- Translates the abstract into configured languages (see `projects/{name}/manuscript/config.yaml`)
- Uses the local Ollama LLM for translation
- Saves translations to `projects/{name}/output/llm/`
Requires: Running Ollama server and translation configuration in config.yaml.
Executes the core pipeline (stages 00-05) without LLM features.
- Runs all core stages: Setup → Tests → Analysis → PDF → Validate → Copy
- Stops on the first failure with clear error messages
- Suitable for CI/CD environments
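Fail-fast sequencing is straightforward; a hedged sketch of the loop (the real `PipelineExecutor` adds checkpointing, reporting, and richer error messages):

```python
import subprocess
import sys

def run_stages(stage_scripts):
    """Run each stage script in order; return the first failing script, or None."""
    for script in stage_scripts:
        result = subprocess.run([sys.executable, script])
        if result.returncode != 0:
            print(f"✗ stage failed: {script}")
            return script  # stop on first failure
    return None
```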
Executes the full pipeline (9 stages displayed as [1/9] to [9/9] in progress logs, with an initial Clean Output Directories pre-step shown as [0/9]):
- All core stages (setup → tests → analysis → PDF → validate → copy)
- LLM review and translations (optional, requires Ollama)
- Automatic checkpointing and resume capability
Executes the full pipeline but skips infrastructure tests.
- Useful for multi-project execution where infrastructure tests may have already passed
- Runs project tests only to save time in development workflows
```bash
# Core build operations
./run.sh --pipeline           # Run pipeline (stages [1/9]..[9/9], clean shown as [0/9], includes optional LLM stages)
./run.sh --pipeline --resume  # Resume from last checkpoint
./run.sh --infra-tests        # Run infrastructure tests only
./run.sh --project-tests      # Run project tests only
./run.sh --render-pdf         # Render PDF manuscript only

# LLM operations (requires Ollama)
./run.sh --reviews            # LLM manuscript review only (English)
./run.sh --translations       # LLM translations only

# Show help
./run.sh --help
```

For programmatic access or CI/CD integration, use the Python orchestrator:
```bash
# Core pipeline (10-stage DAG) - Python orchestrator
uv run python scripts/execute_pipeline.py --project {name} --core-only
```

Features:
- Eight executor stages by default (clean, setup, infrastructure tests, project tests, analysis, PDF rendering, validation, copy outputs); omit infrastructure tests with `--skip-infra` (seven stages)
- No LLM stages (uses scripts through `05_copy_outputs.py` for the main path; `06`/`07` are optional add-ons for LLM review and multi-project reporting)
- No LLM dependencies required for `--core-only`
- Suitable for automated environments
- Checkpoint/resume support:

```bash
uv run python scripts/execute_pipeline.py --project {name} --core-only --resume
```
| Stage | Script | Purpose |
|---|---|---|
| 00 | `00_setup_environment.py` | Environment setup & validation |
| 01 | `01_run_tests.py` | Run test suite (infrastructure + project) |
| 02 | `02_run_analysis.py` | Discover & run `projects/{name}/scripts/` |
| 03 | `03_render_pdf.py` | PDF rendering orchestration |
| 04 | `04_validate_output.py` | Output validation & reporting |
| 05 | `05_copy_outputs.py` | Copy final deliverables to `output/` |
| 06 | `06_llm_review.py` | LLM manuscript review & translations (optional, requires Ollama) |
| 07 | `07_generate_executive_report.py` | Executive summaries & dashboards (multi-project only) |

`--core-only` runs the executor stages through copy outputs and does not run 06 or 07; those are optional or multi-project entry points.
| Entry Point | Pipeline Stages | LLM Support | Use Case |
|---|---|---|---|
| `./run.sh` | Main entry point | Optional | Interactive menu or manuscript pipeline with LLM |
| `./run.sh --pipeline` | Full 10 stages | Optional | Manuscript pipeline with LLM |
| `uv run python scripts/execute_pipeline.py --project {name} --core-only` | Core stages | None | Core pipeline, CI/CD automation |
```bash
# Main dispatcher (interactive menu of manuscript operations)
./run.sh

# Run manuscript pipeline
./run.sh --pipeline

# Resume manuscript pipeline from checkpoint
./run.sh --pipeline --resume

# Run core pipeline (Python)
uv run python scripts/execute_pipeline.py --project {name} --core-only
```

Individual stages can also be run directly via Python:
```bash
uv run scripts/00_setup_environment.py              # Setup environment
uv run scripts/01_run_tests.py                      # Run tests only
uv run scripts/01_run_tests.py --verbose            # Run tests with verbose output
uv run scripts/02_run_analysis.py                   # Run project scripts
uv run scripts/03_render_pdf.py                     # Render PDFs only
uv run scripts/04_validate_output.py                # Validate outputs only
uv run scripts/05_copy_outputs.py                   # Copy final deliverables
uv run scripts/06_llm_review.py                     # LLM manuscript review
uv run scripts/06_llm_review.py --reviews-only      # Reviews only
uv run scripts/06_llm_review.py --translations-only # Translations only
```

Exit codes:
- 0: Operation succeeded
- 1: Operation failed - review errors and fix issues
- 2: Operation skipped (e.g., Ollama not available for LLM review)
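Callers such as CI jobs can branch on these codes; a minimal mapping sketch (the constants mirror the list above; the helper name is hypothetical):

```python
# Exit-code meanings from the pipeline scripts
EXIT_MESSAGES = {
    0: "succeeded",
    1: "failed",
    2: "skipped (dependency unavailable)",
}

def describe_exit(returncode):
    """Translate a pipeline script's return code into a short label."""
    return EXIT_MESSAGES.get(returncode, f"unknown exit code {returncode}")
```

Treating 2 as non-fatal lets automation continue past optional LLM stages when Ollama is not running.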
The scripts automatically set:
- `PROJECT_ROOT`: Repository root directory
- `PYTHONPATH`: Includes root, `infrastructure/`, and `projects/{name}/src`
You can override by setting before running:
```bash
export LOG_LEVEL=0   # Enable debug logging
./run.sh --pipeline
```

| Variable | Default | Description |
|---|---|---|
| `LLM_MAX_INPUT_LENGTH` | `500000` | Max chars to send to the LLM. Set to 0 for unlimited. |
| `LLM_REVIEW_TIMEOUT` | `300` | Timeout per review in seconds |
| `LLM_LONG_MAX_TOKENS` | `4096` | Maximum tokens per review response |
The scripts use strict error handling:
- Stops immediately on first failure
- Provides clear error messages
- Shows which stage/operation failed
- Returns to menu after each operation (interactive mode)
Example error output:

```text
✗ Infrastructure tests failed
Operation completed in 45s

Press Enter to return to menu...
```
Make the script executable:

```bash
chmod +x run.sh
```

Verify `conftest.py` is in the repository root and contains proper path setup.
Check `pyproject.toml` (`[tool.coverage.report]`) for coverage thresholds. Increase test coverage in `tests/` and `projects/{name}/tests/`.
Ensure pandoc and xelatex are installed:

```bash
# macOS
brew install pandoc
brew install --cask mactex

# Ubuntu/Debian
sudo apt-get install -y pandoc texlive-xetex texlive-fonts-recommended
```

Ensure Ollama is running:

```bash
# Start Ollama server
ollama serve

# Install a model (if needed)
ollama pull llama3-gradient
```

Related documentation:
- `scripts/README.md` - Stage orchestrators documentation
- `scripts/AGENTS.md` - scripts documentation
- `AGENTS.md` - system documentation
- `CLOUD_DEPLOY.md` - Headless / cloud server deployment guide ⭐
- `core/workflow.md` - Development workflow
- `RUN_GUIDE.md` - Pipeline orchestration reference (this document)
For a fresh headless server (Ubuntu/Debian), all dependencies including uv are installed
automatically when you invoke any non-interactive pipeline flag:
```bash
# 1. Install system deps (LaTeX + git + curl)
sudo apt-get update && sudo apt-get install -y \
    curl git python3 pandoc \
    texlive-xetex texlive-latex-extra texlive-fonts-recommended

# 2. Clone the repository
git clone https://github.com/docxology/template.git && cd template

# 3. Run - uv is installed automatically, .venv is synced
./run.sh --pipeline
```

The `MPLBACKEND=Agg` environment variable is required on headless servers (no display):

```bash
export MPLBACKEND=Agg
./run.sh --pipeline
```

📖 Full guide: See `CLOUD_DEPLOY.md` for system prerequisites, optional dependency groups, Docker alternative, Ollama setup, and troubleshooting.