Commit 3524b57

edmundmiller and claude committed
docs: add CLAUDE.md with project guidance for Claude Code
Add comprehensive documentation file to help Claude Code understand the codebase structure, build commands, architecture, and development workflow for the nf-core stats dashboard project.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1 parent 56eebae commit 3524b57

1 file changed

Lines changed: 62 additions & 0 deletions

CLAUDE.md

@@ -0,0 +1,62 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

This is the nf-core stats dashboard project. It consists of two main components:
1. **Evidence.dev frontend** - A data visualization dashboard for nf-core statistics
2. **DLT data pipelines** - Python pipelines that collect data from GitHub, Slack, and Twitter APIs

## Build Commands

### Frontend (Evidence.dev)
```bash
npm install          # Install dependencies
npm run sources      # Refresh data sources
npm run dev          # Start development server
npm run build        # Build for production
npm run test         # Run tests (builds the project)
```

### Data Pipelines (Python/DLT)
```bash
cd pipeline
uv run python github_pipeline.py    # Run GitHub stats collection
uv run python slack_pipeline.py     # Run Slack stats collection
```

## Architecture

### Data Flow
1. **Data Collection**: Python DLT pipelines (`pipeline/`) fetch data from external APIs (GitHub, Slack)
2. **Data Storage**: Data is stored in MotherDuck (cloud DuckDB) database `nf_core_stats_bot`
3. **Data Visualization**: Evidence.dev reads from MotherDuck and renders interactive dashboards

### Key Directories
- `pipeline/` - DLT data pipelines for collecting stats
- `pages/` - Evidence.dev markdown pages with SQL queries and visualizations
- `sources/` - Database connection configurations
- `.github/workflows/` - GitHub Actions for daily pipeline runs and Netlify builds

### Database Schema
The MotherDuck database contains tables for:
- `github_traffic_stats` - Repository views and clones
- `github_contributor_stats` - Contributor activity by week
- `github_issue_stats` - Issues and pull requests
- `slack_messages` - Slack channel message counts
- `slack_members` - Slack member statistics

### Environment Variables
Required secrets for pipelines (set in GitHub Actions or local `.env`):
- `SOURCES__GITHUB_PIPELINE__GITHUB__API_TOKEN` - GitHub personal access token
- `SOURCES__SLACK_PIPELINE__SLACK__API_TOKEN` - Slack user token
- `DESTINATION__MOTHERDUCK__CREDENTIALS__DATABASE` - MotherDuck database name
- `DESTINATION__MOTHERDUCK__CREDENTIALS__PASSWORD` - MotherDuck token
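
Before running a pipeline locally, it can help to confirm that these secrets are actually set. The following is a minimal sketch using only the Python standard library. The helper names `missing_vars` and `config_path` are hypothetical and not part of this repository. `config_path` assumes the usual dlt convention that double underscores in a variable name separate configuration sections.

```python
import os

# Variable names copied from the list of required secrets above.
REQUIRED_VARS = [
    "SOURCES__GITHUB_PIPELINE__GITHUB__API_TOKEN",
    "SOURCES__SLACK_PIPELINE__SLACK__API_TOKEN",
    "DESTINATION__MOTHERDUCK__CREDENTIALS__DATABASE",
    "DESTINATION__MOTHERDUCK__CREDENTIALS__PASSWORD",
]


def missing_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


def config_path(name):
    """Split a variable name on double underscores into lowercase sections.

    Assumption: this mirrors how dlt maps environment variables to
    nested configuration sections; it is an illustration only.
    """
    return [part.lower() for part in name.split("__")]
```

Calling `missing_vars()` with no arguments checks the current process environment. Passing a dictionary instead makes the check easy to try without touching real secrets.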
56+
57+
## Development Notes
58+
59+
- Evidence pages use SQL queries embedded in markdown to fetch data
60+
- The GitHub pipeline uses incremental loading with merge strategy to update existing records
61+
- Pipelines run daily via GitHub Actions and are monitored with runitor
62+
- The frontend is deployed to Netlify and rebuilt daily after pipeline runs
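
To illustrate what a merge strategy means for repeated daily runs, here is a minimal plain-Python sketch of upsert-by-key semantics. It is not the project's pipeline code and not dlt's implementation. The function `merge_records`, the `id` key, and the sample rows are all hypothetical.

```python
def merge_records(existing, incoming, primary_key):
    """Upsert incoming rows into existing rows, matching on primary_key."""
    merged = {row[primary_key]: row for row in existing}
    for row in incoming:
        # A matching key replaces the stored row; a new key adds a row.
        merged[row[primary_key]] = row
    return list(merged.values())


# Hypothetical rows; the column names are illustrative, not the real schema.
stored = [
    {"id": 1, "state": "open"},
    {"id": 2, "state": "open"},
]
fetched = [
    {"id": 2, "state": "closed"},  # updates the existing row
    {"id": 3, "state": "open"},    # inserts a new row
]

result = merge_records(stored, fetched, primary_key="id")
```

Under these assumptions, a record fetched again on a later run overwrites its earlier version instead of producing a duplicate row, which is the property the merge strategy is meant to provide.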
