WikiTree Intelligence is a local-first genealogy workbench for reconciling GEDCOM data with WikiTree.
The core job is not bulk import. The core job is:
- finding likely existing WikiTree matches before creating duplicates
- preserving durable match memory between runs
- resuming large imports safely
- surfacing missing matches and data discrepancies
- preparing later sync-review items for already-matched profiles
This repo is in active development.
✅ Google Authentication And App Session Boundary (PR #10, merged 2026-04-08)
- Frontend: React `AuthProvider` restores auth state on app load
- Backend: FastAPI login/logout/current-user endpoints are live
- Session cookies: Starlette `SessionMiddleware` persists app session state
- User flow: Google sign-in, returning-session restore, and logout all work
- Coverage: UI 20 tests, API 16 tests with 93.66% backend coverage
✅ Database Spine with SQLModel Tables and State Machines (PR #12, merged 2026-04-09)
- SQLModel table definitions for all 15 minimum v1 tables (single source of truth)
- StrEnum-based state machines: ImportJobStatus (7 states), ImportJobStageStatus (5 states), MatchReviewStatus (5 states)
- Explicit transition validation functions with terminal state detection
- Async database engine initialization with auto table creation
- Database-level enum constraints for state validation
- Type-safe status fields for Pydantic validation at API layer
- 52 passing state machine tests with 100% state_machines.py coverage
- Coverage: 81.84% backend (meets 80% requirement)
✅ WikiTree OAuth Integration (PR #30, merged 2026-04-09)
- WikiTree API client with async httpx for login/logout/profile operations
- Session manager for WikiTree connection state with 30-day expiry tracking
- REST API routes: connect/initiate, callback, disconnect, status, profile retrieval
- Browser-based OAuth-like flow with backend-owned session mapping
- Security: open redirect prevention, backend-only WikiTree user_id storage
- UI: WikiTree settings page with connect/disconnect flow
- Coverage: 114 backend tests (92.24%), 44 frontend tests (100% passing)
✅ Worker Package Scaffold (PR #33, pending merge 2026-04-13)
- Separate `apps/worker/` package with FastAPI structure
- Health endpoints: `/health/live` (liveness) and `/health/ready` (readiness)
- Worker ID auto-generation from hostname and PID
- Docker integration with docker-compose.yml
- CI workflows for worker tests, linting, and Docker builds
- Comprehensive README with architecture, scaling, and troubleshooting guides
- Coverage: 2 tests passing, 100% worker routes coverage
Next: Import job infrastructure (PR6) - see `pr6-import-job-plan.md` for the detailed implementation plan.
Planning and architecture documentation:
- `office-hours-design.md` — approved product/design doc
- `implementation-plan.md` — PR-by-PR build plan
- `pr6-import-job-plan.md` — detailed PR6 implementation spec
- `eng-review-test-plan.md` — test strategy and critical flows
- `TODOS.md` — deferred follow-up work
Note: PR5 (WikiTree dump ingestion) deferred until dump access is available. Building job infrastructure first.
- `apps/api/` — Python backend
- `apps/worker/` — background worker process for staged import/search jobs
- `apps/ingestion/` — WikiTree dump loading service (runs weekly)
- `apps/ui/` — React + TypeScript frontend
- `apps/api/tests/` — backend tests
- `apps/ui/tests/` — frontend tests
- `e2e/` — Playwright end-to-end tests
- `migrations/` — database migrations
- shared Docker volume — raw uploaded GEDCOM storage in local development
- PostgreSQL — WikiTree dump cache (refreshed weekly) + app data
- `docker-compose.yml` — local orchestration
- Google-authenticated app session
- WikiTree-authenticated private-data reads through the backend
- WikiTree weekly dump cache for fast local search (millions of profiles)
- hybrid search: local dump first, API supplement when needed
- staged, resumable GEDCOM imports
- background worker execution for large import/search jobs
- one canonical person/relationship model
- snapshot-backed review receipts and evidence packets
- outward traversal through resolved matches
- later sync-review queue for GEDCOM facts not yet in WikiTree
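The "local dump first, API supplement" idea can be sketched as a small async helper. The searcher interfaces, result shape, and threshold below are illustrative stand-ins, not the repo's real search layer:

```python
# Hedged sketch of hybrid search: hit the local weekly dump cache first,
# and only fall back to the live API when too few candidates are found.
# Searcher signatures and the `min_local_hits` threshold are assumptions.
from collections.abc import Awaitable, Callable

Searcher = Callable[[str], Awaitable[list[dict]]]

async def hybrid_search(
    query: str,
    local_search: Searcher,
    api_search: Searcher,
    min_local_hits: int = 3,
) -> list[dict]:
    # Local dump cache is fast and covers the vast majority of profiles.
    results = await local_search(query)
    if len(results) >= min_local_hits:
        return results
    # Supplement from the live API, de-duplicating by profile id.
    seen = {r["id"] for r in results}
    for r in await api_search(query):
        if r["id"] not in seen:
            results.append(r)
            seen.add(r["id"])
    return results
```

The threshold trades recall against API traffic: a confident local hit set skips the network entirely, which matters when matching millions of GEDCOM rows.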
- every PR must touch fewer than 10 files
- every PR must be easy to review
- repo coverage target is at least 80%
- critical flows get Playwright coverage
- Python 3.12+
- Node.js 18+
- Google OAuth credentials (get from Google Cloud Console)
```bash
cd apps/api

# Create virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -e ".[dev]"

# Configure environment (copy .env.example to .env and fill in values)
cp .env.example .env

# Run development server
uvicorn api.app:app --reload
```

Backend runs at http://localhost:8000
API docs at http://localhost:8000/docs
```bash
cd apps/ui

# Install dependencies
npm install

# Configure environment (copy .env.example to .env and fill in values)
cp .env.example .env

# Run development server
npm run dev
```

Frontend runs at http://localhost:5173
Backend:

```bash
cd apps/api
source .venv/bin/activate
pytest -v
```

Frontend:

```bash
cd apps/ui
npm run test
npm run test:ci  # with coverage
```

Continue with the next unfinished boundary from `implementation-plan.md`: WikiTree connection, import job storage/worker execution, and the canonical data model.