FAIRy Core

Local-first validator and packager for FAIR-compliant research datasets.

Why FAIRy

Local-first: All processing is on your machine. Your raw and fixed data never leave without your consent.
Extensible: Add new repository templates or contribute improved schemas/rules — keep up with evolving standards.
Practical: Catch real submission pain points (dates, IDs, vocab, file names), export clean packages, avoid resubmission headaches.

What’s in this repo This repo contains the core validation engine and CLI (e.g., fairy preflight, fairy validate).

✅ Validates tabular metadata against repository-specific rulepacks
✅ Emits machine-readable (JSON) and human-readable (Markdown) reports
✅ Writes attestation & provenance, with optional export bundle (zip)
🧪 Includes intentionally "failing" fixtures for smoketests
🚧 Early alpha; interfaces may change prior to v1.0

💡 Want the full UI experience? For project workspaces, guided fixes, visual workflows, and demo examples, see fairy-lab — a Streamlit-based demo tenant that uses this core engine.

Quick look

Validate (CLI)

Try it in 60 seconds

This produces a Markdown file for the penguins kata like the screenshot above.

python -m venv .venv && source .venv/bin/activate
pip install -e .
mkdir -p .tmp

fairy validate \
  --rulepack rulepacks/examples/penguins/rulepack.yml \
  --inputs default=tests/fixtures/penguins_small.csv \
  --report-json .tmp/penguins_report.json \
  --report-md .tmp/penguins_report.md

less .tmp/penguins_report.md   # press q to quit

# (Optional) Open in VS Code if you have it:
code .tmp/penguins_report.md

Choose your path

UI (fairy-lab): local project workspaces, guided fixes, visual workflow, export bundles. → https://github.com/yuummmer/fairy-lab
CLI (this repo): run fairy preflight (human-friendly default) or fairy validate (engine command) against a rulepack, review JSON/Markdown reports, iterate until submission-ready.

Command overview: validate = checks; preflight = checks + outputs + guidance. Most users should run preflight. See CLI usage for details.

Quickstart

# 0) Python 3.10+
python -m venv .venv && source .venv/bin/activate
pip install -e .

# 1) Check CLI
fairy --help
fairy --version

# 2) (Optional) Rulepack schema validation (example penguins rulepack)
python -m fairy.cli rulepack --rulepack rulepacks/examples/penguins/rulepack.yml

📦 Rulepacks

FAIRy-core is the engine. Rulepacks are versioned and released independently in separate CC0 repos.

Rulepack registry (recommended start here): yuummmer/fairy-rulepack-registry Machine-readable index of available rulepacks + versions.
GEO rulepacks: yuummmer/fairy-rulepacks-geo
- geo_bulk_seq (available)
- geo_single_cell (planned)

This repo (fairy-core) includes only small example/kata rulepacks under rulepacks/examples/ for learning and tests.

🗂️ Rulepack registry

Official (and community) rulepacks are tracked in fairy-rulepack-registry. Tools (and FAIRy Lab) can use this to discover rulepacks.

📌 Machine-readable list of rulepack repos + versions (so tools can discover them)
✅ Schema-validated via CI (prevents broken entries)
🧭 Use it to find the latest GEO/ENA/etc rulepacks without hunting across GitHub

Example: GEO bulk RNA-seq (external rulepack repo)

# Get the GEO rulepack repo (CC0)
git clone https://github.com/yuummmer/fairy-rulepacks-geo.git

# Run preflight using its fixtures
mkdir -p .tmp

fairy preflight \
  --rulepack .deps/fairy-rulepacks-geo/rulepacks/geo_bulk_seq/v0_2_0.json \
  --samples  .deps/fairy-rulepacks-geo/rulepacks/geo_bulk_seq/fixtures/samples.tsv \
  --files    .deps/fairy-rulepacks-geo/rulepacks/geo_bulk_seq/fixtures/files.tsv \
  --out      .tmp/geo_bulk_seq_report.json

# View outputs:
ls .tmp/
cat .tmp/geo_bulk_seq_report.md | head -n 80  # or open in your editor

Available rule types: required, unique, enum, range, dup/no_duplicate_rows, foreign_key, url, non_empty_trimmed, regex

See Rule types reference for complete documentation on all rule types and their configuration options.

Reports

JSON: Structured v1.0.0 schema reports with deterministic ordering (see schemas/preflight_report_v1.schema.json and docs/reporting.md)
Markdown: Curator-friendly one-pager (generated alongside JSON)
Exit code: 0 if no FAIL, else 1

Documentation

For full documentation, see the docs/ folder:

Development

pip install -e ".[dev]"
pytest -q
ruff check . --fix
ruff format .

Developer notes

Source files use SPDX headers:

# SPDX-License-Identifier: AGPL-3.0-only
# Copyright (c) 2025 Jennifer Slotnick

We package rulepacks as package data:

[tool.setuptools.package-data]
fairy = ["rulepacks/**/*.json","rulepacks/**/*.yaml","rulepacks/**/*.yml"]

Local artifacts to ignore are preconfigured (.tmp/, .venv/, pycache/).
Coverage config: .coveragerc, pytest.ini
Demo fixtures: tests/fixtures/*, demos/

Repo layout

src/fairy/
  cli/         # CLI entrypoints (validate, preflight, rulepack)
  core/        # services/models/validators (evolving)
  rulepack/    # loader + schema (YAML)
  rulepacks/
    examples/    # small kata/example rulepacks (CC0)
demos/         # demo rulepacks / scratch data (not shipped)
tests/         # unit + smoke tests
decisions/     # Architecture Decision Records (ADRs)

See Architecture Decision Records for major design decisions and rationale.

License

FAIRy-core uses a mixed licensing model:

Core engine code (src/fairy/**): Licensed under AGPL-3.0-only. See LICENSE.
Example/kata rulepacks (rulepacks/examples/**): Licensed under CC0-1.0. See rulepacks/examples/LICENSE.
Official public rulepacks (separate repos, e.g. fairy-rulepacks-geo): Licensed under CC0-1.0 (each repo includes its own LICENSE).
Samples and fixtures (e.g. samples/**, tests/fixtures/**): Licensed under CC BY-4.0 (documented per folder).

Commercial licensing for FAIRy-core is available for organizations that cannot adopt AGPL. See COMMERCIAL.md or contact [email protected].

Contributing

We welcome contributions! See CONTRIBUTING.md for guidelines on:

How to get started (forking, setup, making changes)
Ways to contribute (rulepacks, tests, documentation, engine improvements)
Specs and PRDs for major features
Licensing of contributions

For information about maintainer roles and module stewards, see MAINTAINERS.md.

Citation

If you use FAIRy in a project, demo, or talk, please cite:

FAIRy (v0.2). Local-first validator for FAIR research data. FAIRy-core (engine): https://github.com/yuummmer/fairy-core FAIRy Lab (UI & labs): https://github.com/yuummmer/fairy-lab

Roadmap

Profiles as workflow composition over rulepacks (see ADR-0007)
Preflight evolving to universal operator mode (profile-based + output-dir oriented)

For the full roadmap, see ROADMAP.md.

Who FAIRy is for

Researchers and lab teams tired of “submission rejected → fix → resubmit” loops.
Anyone handling sensitive/pre-publication data who needs local-first validation.
Curators and institutions who want consistent, transparent, hackable validation templates.

Pilots / institutional use

Institutions and labs interested in pilots or dashboards — we’d love to hear from you. Email [email protected] or open an issue labeled pilot-inquiry.

Name		Name	Last commit message	Last commit date
Latest commit History 116 Commits
.github/workflows		.github/workflows
decisions		decisions
demos		demos
docs		docs
fixtures/manifest_v1		fixtures/manifest_v1
rulepacks/examples		rulepacks/examples
schemas		schemas
scripts		scripts
src/fairy		src/fairy
tasks		tasks
tests		tests
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CHANGELOG.md		CHANGELOG.md
CITATION.cff		CITATION.cff
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
LICENSES.md		LICENSES.md
MAINTAINERS.md		MAINTAINERS.md
NOTICE		NOTICE
README.md		README.md
ROADMAP.md		ROADMAP.md
SECURITY.md		SECURITY.md
THIRD_PARTY_LICENSES.md		THIRD_PARTY_LICENSES.md
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
requirements-dev.txt		requirements-dev.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

FAIRy Core

Quick look

Validate (CLI)

Try it in 60 seconds

Choose your path

Quickstart

📦 Rulepacks

🗂️ Rulepack registry

Example: GEO bulk RNA-seq (external rulepack repo)

See Rule types reference for complete documentation on all rule types and their configuration options.

Reports

Documentation

Development

Developer notes

Repo layout

See Architecture Decision Records for major design decisions and rationale.

License

Contributing

Citation

FAIRy (v0.2). Local-first validator for FAIR research data. FAIRy-core (engine): https://github.com/yuummmer/fairy-core FAIRy Lab (UI & labs): https://github.com/yuummmer/fairy-lab

Roadmap

Who FAIRy is for

Pilots / institutional use

About

Uh oh!

Releases 3

Packages

Languages

License

yuummmer/fairy-core

Folders and files

Latest commit

History

Repository files navigation

FAIRy Core

Quick look

Validate (CLI)

Try it in 60 seconds

Choose your path

Quickstart

📦 Rulepacks

🗂️ Rulepack registry

Example: GEO bulk RNA-seq (external rulepack repo)

See Rule types reference for complete documentation on all rule types and their configuration options.

Reports

Documentation

Development

Developer notes

Repo layout

See Architecture Decision Records for major design decisions and rationale.

License

Contributing

Citation

FAIRy (v0.2). Local-first validator for FAIR research data. FAIRy-core (engine): https://github.com/yuummmer/fairy-core FAIRy Lab (UI & labs): https://github.com/yuummmer/fairy-lab

Roadmap

Who FAIRy is for

Pilots / institutional use

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Languages

Packages