# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Remote EU job board aggregator. Next.js 16 frontend + GraphQL API backed by Cloudflare D1 (SQLite), with an AI/ML pipeline for job classification, skill extraction, and resume matching.
## Commands

```bash
# Dev
pnpm dev                  # Start Next.js dev server (localhost:3000)
pnpm build                # Production build
pnpm lint                 # ESLint (next lint)

# GraphQL codegen — run after modifying any schema/**/*.graphql file
pnpm codegen

# Database
pnpm db:generate          # Generate Drizzle migration files
pnpm db:migrate           # Apply locally with Drizzle Kit
pnpm db:push              # Apply migrations to remote D1
pnpm db:studio            # Drizzle Studio

# Evaluation
pnpm eval:langfuse        # Run LLM classification eval with Langfuse Datasets

# Strategy enforcement
pnpm strategy:check       # Validate staged changes against optimization strategy
pnpm strategy:check:all   # Validate all tracked files

# Scripts
pnpm jobs:ingest          # Ingest jobs from ATS platforms
pnpm jobs:enhance         # Enhance all jobs with ATS data
pnpm jobs:status          # Check ingestion status
pnpm jobs:extract-skills  # Extract skills during ingestion
pnpm skills:extract       # Extract skills from jobs
pnpm skills:seed          # Seed skill taxonomy
pnpm boards:discover      # Discover Ashby boards
pnpm janitor:trigger      # Manually trigger janitor worker

# Workers
wrangler deploy --config wrangler.d1-gateway.toml   # Deploy D1 gateway
wrangler tail --config wrangler.d1-gateway.toml     # Stream gateway logs
cd workers/ashby-crawler && wrangler dev            # Ashby crawler local dev

# Deployment
pnpm deploy                                                   # Vercel deploy (runs scripts/deploy.ts)
wrangler deploy --config workers/ashby-crawler/wrangler.toml  # Ashby crawler
```

## Architecture

- Production: Next.js (Vercel) → D1 Gateway Worker (CF) → D1 Database (binding)
- Dev fallback: Next.js → Cloudflare REST API → D1 Database
The D1 Gateway Worker (workers/d1-gateway.ts) supports batched queries — prefer batching for multi-query operations. Authenticated via API_KEY secret. Database schema is in src/db/schema.ts (Drizzle ORM), migrations in migrations/.
## Data Pipeline

1. Discovery: ATS Sources (D1) --[Cron Worker]--> Trigger Ingestion
2. Board Crawl: Common Crawl CDX --[ashby-crawler (Rust)]--> Ashby boards → D1
3. Ingestion: ATS APIs (Greenhouse/Lever/Ashby) --[Insert Worker]--> D1
4. Enhancement: Job IDs --[Trigger.dev / GraphQL Mutation]--> ATS API → D1
5. Classification: Unprocessed jobs --[process-jobs (Python) / DeepSeek]--> is_remote_eu → D1
6. Skill Extract: Job descriptions --[LLM pipeline]--> Skills → D1
7. Resume Match: Resumes --[resume-rag (Python) / Vectorize]--> Vector search
8. Serving: Browser --[Apollo Client]--> /api/graphql --[D1 HTTP]--> Gateway → D1
9. Evaluation: Langfuse / Vitest --[LLM calls]--> Accuracy scores
## GraphQL Codegen

Configuration in codegen.ts. Generates from schema/**/*.graphql into src/__generated__/:

- Client preset (typed `gql` function, fragment masking)
- `hooks.tsx` — React Apollo hooks
- `types.ts` — TypeScript types (strict scalars)
- `resolvers-types.ts` — Resolver types with `GraphQLContext`
Custom scalars: DateTime/URL/EmailAddress → string, Upload → File, JSON → any.
## Workers

| Worker | Config | Runtime | Key details |
|---|---|---|---|
| `janitor` | `wrangler.toml` | TypeScript | Daily midnight UTC, triggers ATS ingestion |
| `d1-gateway` | `wrangler.d1-gateway.toml` | TypeScript | On-demand HTTP, D1 binding |
| `insert-jobs` | `wrangler.insert-jobs.toml` | TypeScript | Queue-based, still uses Turso (legacy) |
| `process-jobs` | `workers/process-jobs/wrangler.jsonc` | Python/LangGraph | Every 6h + queue, DeepSeek classification |
| `ashby-crawler` | `workers/ashby-crawler/wrangler.toml` | Rust/WASM | Common Crawl → D1, `rig_compat` module |
| `resume-rag` | `workers/resume-rag/wrangler.jsonc` | Python | Vectorize + Workers AI |
## API Routes

| Route | Purpose |
|---|---|
| `/api/graphql` | Apollo Server GraphQL endpoint (main API) |
| `/api/text-to-sql` | Natural language → SQL query |
| `/api/enhance-greenhouse-jobs` | Trigger Greenhouse job enhancement |
| `/api/companies/bulk-import` | Bulk import companies |
| `/api/companies/enhance` | Enhance company data |
GraphQL Playground: http://localhost:3000/api/graphql. Vercel routes have 60s max duration (vercel.json).
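For quick manual testing outside the Playground, a raw POST against `/api/graphql` works too. A minimal sketch of the request shape — the `jobs` field and its arguments here are illustrative assumptions, not the actual schema (check `schema/` for the real fields):

```typescript
// Hypothetical query shape — field and argument names are assumptions.
const gqlRequest = {
  query: `
    query Jobs($limit: Int) {
      jobs(limit: $limit) {
        id
        title
      }
    }
  `,
  variables: { limit: 10 },
};

// POST it the way Apollo Client does under the hood:
// await fetch("http://localhost:3000/api/graphql", {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body: JSON.stringify(gqlRequest),
// });
```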
## Tech Stack

| Layer | Technology |
|---|---|
| Frontend | Next.js 16, React 19, App Router |
| Language | TypeScript 5.9 |
| Database | Cloudflare D1 (SQLite) via D1 Gateway Worker |
| ORM | Drizzle ORM |
| API | Apollo Server 5 (GraphQL) |
| Auth | Clerk |
| AI/ML | Vercel AI SDK, Anthropic Claude (+ Agent SDK), DeepSeek, OpenRouter |
| Background jobs | Trigger.dev, Cloudflare Workers (cron + queues) |
| Observability | Langfuse, LangSmith, OpenTelemetry (partially active) |
| Evaluation | Langfuse, Vitest |
| Deployment | Vercel (app), Cloudflare Workers (workers) |
| Package manager | pnpm 10.10 |
| UI | Radix UI (Themes + Icons) |
## Key Directories

- GraphQL schema lives in `schema/` (by domain: `base/`, `jobs/`, `companies/`, `applications/`, `prompts/`). Query/mutation/fragment documents are in `src/graphql/`.
- Resolvers are in `src/apollo/resolvers/` — job resolvers in `src/apollo/resolvers/job/`.
- ATS ingestion fetchers: `src/ingestion/{greenhouse,lever,ashby}.ts` — primary job discovery channel.
- Skills subsystem: `src/lib/skills/` — taxonomy, extraction, vector ops, filtering.
- AI agents: `src/agents/` (Vercel AI SDK — SQL, admin, strategy enforcer), `src/anthropic/` (Claude client, MCP, sub-agents, architect).
- Database tools for agents: `src/tools/database/` (introspection + SQL execution).
- Rust worker: `workers/ashby-crawler/src/lib.rs` — the `rig_compat` module implements VectorStore, Pipeline, and Tool patterns for WASM (swap to `rig::*` when rig-core ships wasm32 support).
## Optimization Strategy

See OPTIMIZATION-STRATEGY.md for the full strategy document. Key constraints:
| Meta Approach | Status | What It Guarantees |
|---|---|---|
| Eval-First | PRIMARY | Every prompt/model change tested against >= 80% accuracy bar |
| Grounding-First | PRIMARY | LLM outputs schema-constrained; skills validated against taxonomy |
| Multi-Model Routing | SECONDARY | Cheap model first (Workers AI), escalate on low confidence only |
| Spec-Driven | CROSS-CUTTING | GraphQL + Drizzle + Zod schemas as formal contracts |
| Observability | EMERGING | Langfuse scoring active; production tracing partial |
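The Multi-Model Routing row amounts to a confidence-gated escalation policy: classify with the cheap model first and only pay for the stronger model when confidence is low. A dependency-free sketch — the threshold value and model labels below are illustrative assumptions, not values from the repo:

```typescript
// Confidence-gated escalation: accept the cheap model's answer when it is
// confident, escalate otherwise. Threshold and model names are assumptions.
type Classification = { is_remote_eu: boolean; confidence: number };

const CONFIDENCE_THRESHOLD = 0.8;

function pickModel(cheapResult: Classification): "workers-ai" | "deepseek" {
  return cheapResult.confidence >= CONFIDENCE_THRESHOLD ? "workers-ai" : "deepseek";
}
```

The same gate generalizes to any cheap→expensive pair; the Eval-First accuracy bar (>= 80%) is what justifies a chosen threshold.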
The strategy enforcer (src/agents/strategy-enforcer.ts) is available as a plain async function.
## Conventions

- Files: kebab-case (`jobs-query.ts`). Components: PascalCase (`JobsSearchBar.tsx`).
- DB columns: snake_case. GraphQL fields: camelCase. Variables: camelCase.
- Path alias: `@/*` maps to `./src/*` (tsconfig.json).
- Module type: ES Modules (`"type": "module"`).
- Use Drizzle ORM for all DB queries — no raw SQL strings.
- Mutations that modify production data must include the `isAdminEmail()` guard (from `src/lib/admin.ts`).
- Prefer generated types from `src/__generated__/resolvers-types.ts` over `any`.
- React providers: `*-provider.tsx` in `src/components/`.
- Bash scripts: never create `.sh` files or bash scripts in the repository. Use the bash tool for simple commands only (e.g., `git status`, `npm run build`); complex operations should use the Task tool with agents.
## Environment Variables

Copy `.env.example` to `.env.local`. Key groups: D1 Gateway (or Cloudflare REST API for dev), Clerk auth, AI provider keys (Anthropic, DeepSeek, OpenAI, Gemini), Langfuse/LangSmith observability, admin email, app URL. See `.env.example` for the full list.
## Known Issues & Tech Debt

- Full table scan in `src/apollo/resolvers/job/enhance-job.ts` — fetches all jobs to find one by `external_id`.
- N+1 queries for skills, company, and ATS board sub-fields — no DataLoader.
- CORS on the D1 Gateway is `*`.
- No GraphQL query complexity/depth limiting.
- `ignoreBuildErrors: true` in `next.config.ts` masks TS errors in builds.
- 283+ `any` types in resolvers.
- `scripts/ingest-jobs.ts` still documents Turso env vars in its help text (stale).
- `@libsql/client` and `pg` are likely unused after the D1 migration — can be removed from dependencies.
- `@ai-sdk/anthropic` pinned to `"latest"` — should use a specific version.
## Ashby Crawler (Rust)

```bash
cargo install worker-build                                     # Install WASM build tool (once)
cd workers/ashby-crawler && wrangler dev                       # Local dev
wrangler deploy --config workers/ashby-crawler/wrangler.toml   # Deploy
wrangler d1 execute nomadically-work-db --remote \
  --file workers/ashby-crawler/migrations/0001_init/up.sql     # Apply migrations
```

Key endpoints: /crawl (paginated CC crawl), /boards (list/search), /search (TF-IDF vector search), /enrich / /enrich-all (enrichment pipeline), /tools (OpenAI function-calling schemas), /indexes, /progress, /stats.
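The /search endpoint is described as TF-IDF vector search. The crawler implements it in Rust, but the core scoring idea is small enough to sketch in TypeScript — tokenization and the smoothing used here are simplified assumptions, not the worker's actual implementation:

```typescript
// Simplified TF-IDF ranking: score each document by the summed tf * idf
// of the query terms. Real tokenization/normalization will differ.
function tfidfScores(docs: string[], query: string): number[] {
  const tokenize = (s: string) => s.toLowerCase().split(/\W+/).filter(Boolean);
  const docTokens = docs.map(tokenize);
  const n = docs.length;
  return docTokens.map((tokens) => {
    let score = 0;
    for (const term of tokenize(query)) {
      const tf = tokens.filter((t) => t === term).length / (tokens.length || 1);
      const df = docTokens.filter((d) => d.includes(term)).length;
      const idf = Math.log((n + 1) / (df + 1)) + 1; // smoothed IDF
      score += tf * idf;
    }
    return score;
  });
}
```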
## Spec-Driven Development (SDD)

Uses agent-teams-lite for spec-driven development. The orchestrator (`.claude/commands/sdd.md`) delegates work to specialized sub-agents via the Task tool, using native agent teams (TeamCreate, shared task lists, messaging) for parallel phases.
| Path | Contents |
|---|---|
| `.claude/commands/sdd.md` | SDD orchestrator — routing, DAG detection, team/subagent dispatch |
| `.claude/skills/sdd-*/SKILL.md` | 9 SDD sub-agent skill files (explore, propose, spec, design, tasks, apply, verify, archive, init) |
| `.claude/skills/improve-*/SKILL.md` | 6 job search self-improvement skills (mine, audit, evolve, apply, verify, meta) |
| `.claude/skills/codefix-*/SKILL.md` | 6 codebase self-improvement skills (mine, audit, evolve, apply, verify, meta) |
| `openspec/` | Specs, change proposals, designs, and task breakdowns (created by sdd-init) |
| `.claude/commands/` | Project-specific commands (build-and-push, gql-agent, improve, codefix, sdd) |
| Command | Action | Mode |
|---|---|---|
| `/sdd:init` | Bootstrap openspec/ in current project | single subagent |
| `/sdd:explore <topic>` | Investigate an idea (no files created) | single subagent |
| `/sdd:new <change-name>` | Start a new change (creates proposal) | single subagent |
| `/sdd:continue` | Create next artifact in dependency chain | auto-detect |
| `/sdd:ff <change-name>` | Fast-forward: create all planning artifacts | agent team (specs+design parallel) |
| `/sdd:apply` | Implement tasks | agent team if multi-phase |
| `/sdd:verify` | Validate implementation against specs | single subagent |
| `/sdd:archive` | Sync specs + archive completed change | single subagent |
- The lead agent NEVER executes phase work inline — always delegate to sub-agents
- Use subagents for single sequential phases (explore, propose, verify, archive)
- Use native agent teams when specs+design or multi-phase apply can parallelize
- Between phases, show the user what was done and ask to proceed
- Keep orchestrator context minimal — pass file paths, not file contents
- Require plan approval for apply teammates (they write code)
- Do NOT force SDD on small tasks (single file edits, quick fixes, questions)
```
proposal → specs ──→ tasks → apply → verify → archive
             ↕
           design
```
Specs and design run in parallel (agent team in /sdd:ff); tasks depends on both; verify is optional but recommended before archive.
## Job Search Improvement Team (`/improve`)

Goal-driven team of 6 specialists focused on helping find a fully remote EU AI engineering role. Grounded in autonomous agent research (AutoRefine, Meta Context Engineering, CASTER, ROMA, Phase Transition theory).
| Path | Agent | Mission |
|---|---|---|
| `.claude/skills/improve-mine/SKILL.md` | Pipeline Monitor | Is the pipeline healthy? Are AI jobs flowing? |
| `.claude/skills/improve-audit/SKILL.md` | Discovery Expander | Find more companies hiring AI engineers remotely in EU |
| `.claude/skills/improve-evolve/SKILL.md` | Classifier Tuner | Reduce missed opportunities in remote EU classification |
| `.claude/skills/improve-apply/SKILL.md` | Skill Optimizer | Better AI/ML skill taxonomy, extraction, and matching |
| `.claude/skills/improve-verify/SKILL.md` | Application Coach | Learn from application patterns, improve interview prep |
| `.claude/skills/improve-meta/SKILL.md` | Strategy Brain | Coordinate toward the goal: get hired |
| `.claude/commands/improve.md` | Orchestrator | Entry point and team coordination |
| Command | Action |
|---|---|
| `/improve` | Full autonomous cycle (Strategy Brain decides what to do) |
| `/improve status` | Pipeline health check |
| `/improve discover` | Find new AI engineering job sources |
| `/improve classify` | Tune classification accuracy |
| `/improve skills` | Optimize AI/ML skill matching |
| `/improve coach` | Application coaching and prep improvement |
| Phase | Focus | Trigger |
|---|---|---|
| BUILDING | Discovery + classification | < 5 AI jobs/week |
| OPTIMIZING | Classifier + skills | Jobs flowing but low relevance |
| APPLYING | Coaching + prep | Good jobs surfacing, need to convert |
| INTERVIEWING | Deep prep + research | Applications converting to interviews |
- stop_hook.py + improvement_agent.py → session scoring and learning
- Langfuse → observability and score trends
- Strategy Enforcer → validates changes align with optimization strategy
- State files in `~/.claude/state/` → continuity across sessions
## Codebase Improvement Team (`/codefix`)

Separate team focused purely on code quality, performance, type safety, security, and dead code — independent of business goals.
| Path | Agent | Role |
|---|---|---|
| `.claude/skills/codefix-mine/SKILL.md` | Trajectory Miner | Mine session transcripts for code quality patterns |
| `.claude/skills/codefix-audit/SKILL.md` | Codebase Auditor | Deep code investigation with file:line findings |
| `.claude/skills/codefix-evolve/SKILL.md` | Skill Evolver | Improve skills, prompts, CLAUDE.md |
| `.claude/skills/codefix-apply/SKILL.md` | Code Improver | Implement fixes (perf, types, security, dead code) |
| `.claude/skills/codefix-verify/SKILL.md` | Verification Gate | Validate changes, run builds, catch regressions |
| `.claude/skills/codefix-meta/SKILL.md` | Meta-Optimizer | Coordinate, prioritize, track progress |
| `.claude/commands/codefix.md` | Orchestrator | Entry point |
| Command | Action |
|---|---|
| `/codefix` | Full autonomous cycle |
| `/codefix audit [target]` | Targeted audit (resolvers, workers, security, types, etc.) |
| `/codefix apply` | Implement pending findings |
| `/codefix verify` | Verify recent changes |
| `/codefix status` | Show meta-state |
Safety: Max 3 code changes + 2 skill evolutions per cycle. Phase detection (IMPROVEMENT/SATURATION/COLLAPSE_RISK). Mandatory verification. State in ~/.claude/state/codefix-*.json.
- SKILLS-REMOTE-WORK-EU.md — Curated agent skills and subagents for remote EU job market focus.
- OPTIMIZATION-STRATEGY.md — Full Two-Layer Model strategy document.
An MCPDoc MCP server is configured in `.claude/settings.json` with docs for Drizzle, Next.js, Vercel AI SDK, Trigger.dev, Cloudflare Workers. Call `list_doc_sources` to see available sources, then `fetch_docs` on specific URLs when you need deeper detail on any API.
## Drizzle ORM Patterns

Setup — always cast `D1HttpClient` as `any` (it implements a subset of the CF binding interface):

```typescript
import { drizzle } from "drizzle-orm/d1";
import { createD1HttpClient } from "@/db/d1-http";

const db = drizzle(createD1HttpClient() as any);
```

Querying — use Drizzle expressions, never raw SQL template literals:
```typescript
import { eq, and, or, like, inArray, desc, count, sql } from "drizzle-orm";
import { jobs, jobSkillTags } from "@/db/schema";

// Paginate with hasMore trick (avoids extra COUNT on first page)
const rows = await db.select().from(jobs).where(eq(jobs.is_remote_eu, true))
  .orderBy(desc(jobs.posted_at)).limit(limit + 1).offset(offset);
const hasMore = rows.length > limit;

// Subquery
const skillFilter = inArray(
  jobs.id,
  db.select({ job_id: jobSkillTags.job_id }).from(jobSkillTags)
    .where(inArray(jobSkillTags.tag, skills))
    .groupBy(jobSkillTags.job_id)
    .having(sql`count(distinct ${jobSkillTags.tag}) = ${skills.length}`)
);
```

Types — always derive from schema, never hand-write:
```typescript
import type { Job, NewJob, Company } from "@/db/schema";
// typeof jobs.$inferSelect → Job
// typeof jobs.$inferInsert → NewJob
```

Migration workflow — schema change → generate → apply:
```bash
pnpm db:generate   # creates migration file in migrations/
pnpm db:migrate    # applies locally
pnpm db:push       # applies to remote D1
```

Anti-patterns:
- Never write raw SQL strings in resolvers — use Drizzle ORM methods.
- Never use ``db.execute(sql`...`)`` for application queries — use the typed builder.
- Never import from `drizzle-orm/pg-core` or `drizzle-orm/mysql-core` — use `drizzle-orm/sqlite-core`.
Docs: `fetch_docs` on https://orm.drizzle.team/docs/overview
## GraphQL Resolver Patterns

Context — always type context as `GraphQLContext`:

```typescript
import type { GraphQLContext } from "../../context";
import type { QueryJobsArgs, JobResolvers } from "@/__generated__/resolvers-types";
import { jobs } from "@/db/schema";

// Query resolver
async function jobsQuery(_parent: unknown, args: QueryJobsArgs, context: GraphQLContext) {
  return context.db.select().from(jobs)...;
}

// Field resolver — parent type is the raw Drizzle row (Job)
const Job: JobResolvers<GraphQLContext, Job> = {
  async skills(parent, _args, context) {
    return context.loaders.jobSkills.load(parent.id); // always use DataLoaders
  },
  async company(parent, _args, context) {
    if (!parent.company_id) return null;
    return context.loaders.company.load(parent.company_id);
  },
};
```

JSON column pattern — D1 stores JSON as text; always parse in field resolvers:
```typescript
departments(parent) {
  if (!parent.departments) return [];
  try { return JSON.parse(parent.departments); }
  catch { return []; }
},
```

Boolean columns — D1 returns 0/1 for SQLite integers. Fields defined with `{ mode: "boolean" }` in the schema are auto-coerced by Drizzle; fields without it need manual coercion in resolvers:
```typescript
is_remote_eu(parent) {
  return (parent.is_remote_eu as unknown) === 1 || parent.is_remote_eu === true;
},
```

Admin guard — any mutation that modifies production data must check:
```typescript
import { isAdminEmail } from "@/lib/admin";

if (!context.userId || !isAdminEmail(context.userEmail)) {
  throw new Error("Forbidden");
}
```

Anti-patterns:
- Never query the DB directly inside field resolvers — always go through `context.loaders.*` DataLoaders to avoid N+1.
- Never use `any` for context — use the generated `GraphQLContext` type.
- Never edit files in `src/__generated__/` — they are overwritten by `pnpm codegen`.
- Prefer generated types from `@/__generated__/resolvers-types.ts` over `any` in resolver signatures.
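`context.loaders.*` is built on the DataLoader pattern; a dependency-free miniature of the batching mechanics (the real implementation presumably uses the `dataloader` package, and this sketch skips its per-key caching and error handling):

```typescript
// Miniature DataLoader: collect keys loaded in the same tick, then issue
// one batch call instead of N individual queries — the N+1 fix above.
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;

class MiniLoader<K, V> {
  private queue: { key: K; resolve: (v: V) => void }[] = [];
  private scheduled = false;

  constructor(private batchFn: BatchFn<K, V>) {}

  load(key: K): Promise<V> {
    return new Promise((resolve) => {
      this.queue.push({ key, resolve });
      if (!this.scheduled) {
        this.scheduled = true;
        queueMicrotask(() => this.flush()); // flush once per tick
      }
    });
  }

  private async flush() {
    const batch = this.queue;
    this.queue = [];
    this.scheduled = false;
    const values = await this.batchFn(batch.map((b) => b.key));
    batch.forEach((b, i) => b.resolve(values[i]));
  }
}
```

Three parallel `load()` calls in one resolver pass trigger a single batch call, which is exactly what the first anti-pattern above is asking for.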
## Codegen Workflow

Run `pnpm codegen` after any change to `schema/**/*.graphql`. It generates into `src/__generated__/`:
| File | Contents |
|---|---|
| `types.ts` | TS types for schema (strict scalars) |
| `resolvers-types.ts` | Resolver types with `GraphQLContext` |
| `hooks.tsx` | React Apollo hooks |
| `typeDefs.ts` | Merged type definitions |
Custom scalar mappings (in codegen.ts): DateTime/URL/EmailAddress → string, JSON → any, Upload → File.
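In GraphQL Code Generator these mappings typically live under a `scalars` config key; a sketch of the shape — the surrounding structure is an assumption about this repo's codegen.ts, only the mappings themselves come from this document:

```typescript
// Scalar map as it would appear in a GraphQL Code Generator config.
// Exact codegen.ts structure in this repo may differ.
const scalars = {
  DateTime: "string",
  URL: "string",
  EmailAddress: "string",
  JSON: "any",
  Upload: "File",
} as const;

export const codegenScalarConfig = {
  config: { scalars, strictScalars: true },
};
```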
Anti-patterns:
- Never skip codegen after schema changes — stale types cause silent runtime mismatches.
- Never manually edit `src/__generated__/` files.
## Trigger.dev Tasks

Tasks live in `src/trigger/` and must be registered. Pattern:
```typescript
import { task, logger } from "@trigger.dev/sdk/v3";
import { drizzle } from "drizzle-orm/d1";
import { createD1HttpClient } from "../db/d1-http";

// Lazy DB init — don't create at module level
function getDb() {
  return drizzle(createD1HttpClient() as any);
}

export const myTask = task({
  id: "my-task", // unique kebab-case, matches trigger.config.ts registration
  maxDuration: 120, // seconds
  retry: {
    maxAttempts: 3,
    minTimeoutInMs: 2000,
    maxTimeoutInMs: 30000,
    factor: 2,
  },
  queue: { concurrencyLimit: 5 },
  run: async (payload: MyPayload) => {
    logger.info("Starting task", { ...payload }); // use logger, not console
    const db = getDb();
    // ... do work
    return { success: true };
  },
  handleError: async (payload, error) => {
    const msg = error instanceof Error ? error.message : String(error);
    if (msg.includes("404")) {
      logger.info("Resource not found, skipping retry");
      return { skipRetrying: true }; // prevents retry for known terminal errors
    }
    logger.error("Task failed", { error: msg });
    // return nothing = allow retry
  },
});
```

Anti-patterns:
- Never import from `@trigger.dev/sdk` — use `@trigger.dev/sdk/v3`.
- Never create the DB client at module level — always lazy-init inside `run` or a factory function.
- Never use `console.log` inside tasks — use `logger.*` so logs appear in the Trigger.dev dashboard.
- Never forget to export the task — unregistered tasks silently fail to trigger.
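The retry block in the pattern above (`minTimeoutInMs: 2000`, `factor: 2`, `maxTimeoutInMs: 30000`) implies exponential backoff. A sketch of the resulting delay schedule, assuming plain doubling with no jitter — Trigger.dev's actual scheduler may add jitter and other details:

```typescript
// Idealized backoff curve for { minTimeoutInMs, factor, maxTimeoutInMs }.
function retryDelays(attempts: number, min = 2000, factor = 2, max = 30000): number[] {
  return Array.from({ length: attempts }, (_, i) => Math.min(min * factor ** i, max));
}
```

So the first few retries wait roughly 2s, 4s, 8s, with later ones capped at the 30s ceiling.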
Docs: `fetch_docs` on https://trigger.dev/docs/tasks-overview
## D1 Gateway Batching

Prefer batching when making multiple independent queries in one request:
```typescript
// Single exec (via Drizzle — preferred for typed queries)
const result = await context.db.select().from(jobs).where(eq(jobs.id, id));

// Raw batch (for admin scripts / multi-statement operations)
const client = createD1HttpClient();
const [jobsResult, companiesResult] = await client.batch([
  "SELECT count(*) FROM jobs",
  "SELECT count(*) FROM companies",
]);
```

The `D1HttpClient` singleton is cached (`_cachedClient`) — do not re-instantiate per request.
Anti-patterns:
- Never make N sequential `fetch()` calls to the D1 gateway when a batch would work.
- Never bypass `createD1HttpClient()` by constructing `D1HttpClient` directly — the factory handles env var selection and caching.
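The caching note above is the standard memoized-factory pattern. A dependency-free sketch of why repeated `createD1HttpClient()` calls are safe while direct construction is not — the class name, config resolution, and URL here are simplified stand-ins for the real code in `src/db/d1-http`:

```typescript
// Stand-in for env var selection (gateway URL vs Cloudflare REST fallback);
// in the real factory this logic runs exactly once.
function resolveBaseUrl(): string {
  return "https://d1-gateway.example"; // illustrative value
}

class D1HttpClientSketch {
  constructor(readonly baseUrl: string) {}
}

let _cachedClient: D1HttpClientSketch | null = null;

// Memoized factory: every call returns the same instance, so config is
// resolved once and per-request construction cost disappears.
function createClient(): D1HttpClientSketch {
  if (!_cachedClient) {
    _cachedClient = new D1HttpClientSketch(resolveBaseUrl());
  }
  return _cachedClient;
}
```

Constructing the class directly skips both the env selection and the cache, which is exactly what the second anti-pattern warns against.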