| 1 | + |
| 2 | + |
| 3 | +# 🚀 Epic: Develop `git-gen3` Tool for Git-Based Gen3 Integration |
| 4 | + |
| 5 | +> Create a Git-native utility to track and synchronize remote object metadata, generate FHIR-compliant metadata, and manage Gen3 access control using `git-sync`. |
| 6 | + |
| 7 | +--- |
| 8 | + |
| 9 | +## 🧭 Sprint 0: Architecture Spike |
| 10 | + |
| 11 | +### 🎯 Goal: |
| 12 | +De-risk implementation by validating core architectural assumptions and tool compatibility. |
| 13 | + |
| 14 | +### 🔬 Tasks: |
| 15 | +| ID | Task Description | Est. | |
| 16 | +|--------|------------------------------------------------------------------------|------| |
| 17 | +| SPK-1 | Prototype `track-remote` to fetch metadata (e.g., ETag, size) from S3/GCS | 1d | |
| 18 | +| SPK-2 | Simulate `.lfs-meta/metadata.json` usage in Git repo + commit/push | 0.5d | |
| 19 | +| SPK-3 | Test `init-meta` to produce `DocumentReference.ndjson` via `g3t`-style logic | 1d | |
| 20 | +| SPK-4 | Validate `git-sync` role mappings and diffs against Gen3 Fence API | 1d | |
| 21 | +| SPK-5 | Evaluate GitHub template DX: hooks, portability, local usage | 0.5d | |
| 22 | + |
| 23 | +### ✅ Deliverables: |
| 24 | +- Prototype CLI for `track-remote` |
| 25 | +- Sample `.lfs-meta/metadata.json` and generated `META/DocumentReference.ndjson` |
| 26 | +- Credential access matrix (S3, GCS, Azure) |
| 27 | +- Feasibility report for Git-driven role syncing via `git-sync` |
| 28 | +- Recommendation on proceeding with full implementation |
| 29 | + |
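As a rough illustration of SPK-1, the sketch below builds a metadata record from a HEAD-style lookup without reading any object bytes. The function names, the record keys (`remote_url`, `etag`, `size`), and the injected `head` callable are all assumptions made for this example; only the `ETag` and `ContentLength` response keys follow boto3's `head_object` response.

```python
import json
from urllib.parse import urlparse


def parse_object_url(url):
    """Split 's3://bucket/key' or 'gs://bucket/key' into (scheme, bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme not in ("s3", "gs") or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"unsupported or incomplete object URL: {url}")
    return parsed.scheme, parsed.netloc, parsed.path.lstrip("/")


def track_remote(url, head):
    """Build a metadata record from a HEAD-style lookup; no object bytes are read.

    `head(bucket, key)` is a hypothetical callable returning a dict with
    'ETag' and 'ContentLength', the keys boto3's head_object response uses.
    """
    scheme, bucket, key = parse_object_url(url)
    response = head(bucket, key)
    return {
        "remote_url": url,
        "etag": response["ETag"].strip('"'),  # S3 wraps ETags in double quotes
        "size": int(response["ContentLength"]),
    }


def fake_head(bucket, key):
    # Stand-in for a real client call so the sketch runs offline.
    return {"ETag": '"abc123"', "ContentLength": 2048}


record = track_remote("s3://example-bucket/data/sample.bam", fake_head)
print(json.dumps(record, indent=2))
```

With boto3, `head` could plausibly be a thin wrapper around the S3 client's `head_object(Bucket=..., Key=...)` call. Injecting it keeps the prototype testable without cloud credentials.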
| 30 | +--- |
| 31 | + |
| 32 | +## 🧭 Sprint 1: CLI Bootstrapping & Remote File Tracking |
| 33 | + |
| 34 | +### 🎯 Goal: |
| 35 | +Scaffold the `git-gen3` CLI and implement tracking of remote cloud objects in Git without downloading them. |
| 36 | + |
| 37 | +### 🔨 Tasks: |
| 38 | +| ID | Task Description | Est. | |
| 39 | +|------|------------------------------------------------------|------| |
| 40 | +| S1-1 | Scaffold `git-gen3` CLI with Click (Python) or Cobra (Go) | 2d | |
| 41 | +| S1-2 | Implement `track` and `track-remote` subcommands | 2d | |
| 42 | +| S1-3 | Write to `.lfs-meta/metadata.json` | 1d | |
| 43 | +| S1-4 | Support auth with AWS, GCS, Azure (env vars + profiles) | 1d | |
| 44 | +| S1-5 | Add `pre-push` hook to validate metadata before push | 1d | |
| 45 | +| S1-6 | Unit tests for `track-remote` and metadata structure | 1d | |
| 46 | + |
| 47 | +### ✅ Deliverables: |
| 48 | +- Functional CLI command: `git-gen3 track-remote s3://...` |
| 49 | +- `.lfs-meta/metadata.json` updated and committed in Git |
| 50 | +- Git hook active for metadata validation |
| 51 | +- CI-ready foundation for next sprint |
| 52 | + |
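Tasks S1-3 and S1-5 could look roughly like the following: one helper writes entries to `.lfs-meta/metadata.json`, and a validator returns the problems a `pre-push` hook would report. The required keys and the one-entry-per-path layout are assumptions for illustration, since the plan does not fix a schema.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("remote_url", "etag", "size")  # assumed schema, not confirmed by the plan


def load_metadata(repo_root):
    path = Path(repo_root) / ".lfs-meta" / "metadata.json"
    return json.loads(path.read_text()) if path.exists() else {}


def save_entry(repo_root, local_path, entry):
    """Add or replace one entry, keyed by the repo-relative path."""
    metadata = load_metadata(repo_root)
    metadata[local_path] = entry
    path = Path(repo_root) / ".lfs-meta" / "metadata.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    # Sorted keys and a trailing newline keep Git diffs small and stable.
    path.write_text(json.dumps(metadata, indent=2, sort_keys=True) + "\n")


def validate_metadata(metadata):
    """Return a list of problems; an empty list means the push may proceed."""
    problems = []
    for local_path, entry in sorted(metadata.items()):
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append(f"{local_path}: missing '{key}'")
        if "size" in entry and (not isinstance(entry["size"], int) or entry["size"] < 0):
            problems.append(f"{local_path}: 'size' must be a non-negative integer")
    return problems


with tempfile.TemporaryDirectory() as repo:
    save_entry(repo, "data/sample.bam",
               {"remote_url": "s3://example-bucket/data/sample.bam", "etag": "abc123", "size": 2048})
    save_entry(repo, "data/broken.bam", {"remote_url": "s3://example-bucket/data/broken.bam"})
    problems = validate_metadata(load_metadata(repo))

print(problems)
```

A `pre-push` hook script would call `validate_metadata` and exit with a non-zero status when the list is non-empty, which causes Git to abort the push.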
| 53 | +--- |
| 54 | + |
| 55 | +## 🧭 Sprint 2: Metadata Initialization + FHIR Generation |
| 56 | + |
| 57 | +### 🎯 Goal: |
| 58 | +Transform `.lfs-meta/metadata.json` entries into Gen3-compatible FHIR `DocumentReference` resources, written to `META/DocumentReference.ndjson`. |
| 59 | + |
| 60 | +### 🔨 Tasks: |
| 61 | +| ID | Task Description | Est. | |
| 62 | +|------|--------------------------------------------------------------------|------| |
| 63 | +| S2-1 | Implement `init-meta` to emit `META/DocumentReference.ndjson` | 2d | |
| 64 | +| S2-2 | Populate FHIR fields: `subject`, `context.related`, `attachment` | 1d | |
| 65 | +| S2-3 | Create `validate-meta` command to check metadata completeness | 1d | |
| 66 | +| S2-4 | Write tests for `init-meta` and FHIR formatting | 1d | |
| 67 | +| S2-5 | Document schema, CLI usage, and FHIR integration points | 1d | |
| 68 | + |
| 69 | +### ✅ Deliverables: |
| 70 | +- `git-gen3 init-meta` produces valid FHIR NDJSON |
| 71 | +- Tool handles patient/specimen references |
| 72 | +- Tests validate output conformance |
| 73 | +- Documentation aligns with `g3t upload` workflows |
| 74 | + |
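One possible shape for `init-meta` (S2-1, S2-2) is sketched below, assuming FHIR R4, where `DocumentReference` carries `subject`, `context.related`, and `content[].attachment`. The metadata keys (`patient`, `specimen`), the ID namespace, and the function names are hypothetical; the actual `g3t` conventions may differ.

```python
import json
import uuid

# Hypothetical namespace so the same remote URL always maps to the same resource id.
ID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/git-gen3")


def to_document_reference(local_path, entry):
    """Map one metadata entry to a FHIR R4 DocumentReference dict."""
    resource = {
        "resourceType": "DocumentReference",
        "id": str(uuid.uuid5(ID_NAMESPACE, entry["remote_url"])),
        "status": "current",
        "content": [{
            "attachment": {
                "url": entry["remote_url"],
                "size": entry["size"],
                "title": local_path,
            }
        }],
    }
    # 'patient' and 'specimen' are assumed optional keys in metadata.json.
    if entry.get("patient"):
        resource["subject"] = {"reference": f"Patient/{entry['patient']}"}
    if entry.get("specimen"):
        resource["context"] = {"related": [{"reference": f"Specimen/{entry['specimen']}"}]}
    return resource


def to_ndjson(metadata):
    """Serialize every entry as one JSON object per line, in a stable order."""
    lines = [json.dumps(to_document_reference(path, entry), sort_keys=True)
             for path, entry in sorted(metadata.items())]
    return "\n".join(lines) + "\n"


metadata = {
    "data/sample.bam": {
        "remote_url": "s3://example-bucket/data/sample.bam",
        "size": 2048,
        "patient": "P001",
        "specimen": "S001",
    }
}
ndjson = to_ndjson(metadata)
print(ndjson, end="")
```

Deriving the `id` from the remote URL with `uuid5` makes repeated runs produce identical output, so regenerating the NDJSON file does not create spurious Git diffs.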
| 75 | +--- |
| 76 | + |
| 77 | +## 🧭 Sprint 3: Git-Sync Integration & Access Control |
| 78 | + |
| 79 | +### 🎯 Goal: |
| 80 | +Replace `collaborator` and `project-management` with Git-based role assignments using `git-sync` and Gen3 Fence APIs. |
| 81 | + |
| 82 | +### 🔨 Tasks: |
| 83 | +| ID | Task Description | Est. | |
| 84 | +|------|-------------------------------------------------------------------|------| |
| 85 | +| S3-1 | Integrate `git-sync` YAML/CSV parser into `git-gen3 sync-users` | 2d | |
| 86 | +| S3-2 | Implement dry-run and apply modes for syncing to Gen3 Fence | 1d | |
| 87 | +| S3-3 | Add change auditing (diff viewer from Git commits) | 1d | |
| 88 | +| S3-4 | End-to-end test: Git → Gen3 user role propagation | 1d | |
| 89 | +| S3-5 | Write user guide and governance documentation | 1d | |
| 90 | + |
| 91 | +### ✅ Deliverables: |
| 92 | +- `git-gen3 sync-users` CLI reads Git-tracked access config |
| 93 | +- Git diffs capture permission changes over time |
| 94 | +- Gen3 access control (via Fence) is synced reliably |
| 95 | +- Finalized documentation for institutional onboarding |
| 96 | + |
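The dry-run and apply modes in S3-2 can be pictured as a diff between the roles declared in Git and the roles currently held. The sketch below is a minimal version under stated assumptions: the role names, the `{user: roles}` mapping, and the `apply` callable are hypothetical, and no real Fence API call is shown.

```python
def plan_changes(desired, current):
    """Compare {user: set(roles)} mappings and return sorted grant/revoke actions."""
    actions = []
    for user in sorted(set(desired) | set(current)):
        want = set(desired.get(user, ()))
        have = set(current.get(user, ()))
        actions += [("grant", user, role) for role in sorted(want - have)]
        actions += [("revoke", user, role) for role in sorted(have - want)]
    return actions


def sync_users(desired, current, apply=None):
    """Dry run by default; pass `apply` (a hypothetical callable) to execute each action."""
    actions = plan_changes(desired, current)
    for action in actions:
        if apply is None:
            print("DRY-RUN", *action)
        else:
            apply(*action)
    return actions


# 'desired' would come from the Git-tracked config, 'current' from the access service.
desired = {"alice@example.org": {"reader", "writer"}, "bob@example.org": {"reader"}}
current = {"alice@example.org": {"reader"}, "carol@example.org": {"writer"}}
planned = sync_users(desired, current)
```

Keeping the planning step free of side effects means the same action list can be printed for review, logged for auditing, or handed to the apply step unchanged.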
| 97 | +--- |
| 98 | + |
| 99 | +## 📅 Sprint Timeline Summary |
| 100 | + |
| 101 | +| Sprint | Focus | Duration | Deliverables | |
| 102 | +|--------|----------------------------------|----------|-----------------------------------------------| |
| 103 | +| 0 | Architecture validation (spike) | 1 week | Prototypes + greenlight for implementation | |
| 104 | +| 1 | Remote file tracking | 2 weeks | `track-remote`, `.lfs-meta`, validation hooks | |
| 105 | +| 2 | Metadata generation (FHIR) | 2 weeks | FHIR output, `init-meta`, validation tooling | |
| 106 | +| 3 | Git-based access control | 2 weeks | `sync-users`, Git audit trail, Fence sync | |
| 107 | + |
| 108 | +--- |
| 109 | + |
| 110 | +## 🛠 Toolchain |
| 111 | + |
| 112 | +| Purpose | Tool/Stack | |
| 113 | +|------------------------|---------------------------| |
| 114 | +| CLI Language | Python (Click) or Go (Cobra) | |
| 115 | +| Object Store APIs | boto3 (S3), gcsfs (GCS), Azure SDK | |
| 116 | +| Metadata Serialization | JSON, FHIR NDJSON | |
| 117 | +| Access Sync | git-sync + Gen3 Fence | |
| 118 | +| Testing | `pytest` or `go test` | |
| 119 | +| Docs | Markdown, GitHub Pages | |
| 120 | + |
| 121 | +--- |
| 122 | + |
| 123 | +## 🧭 Sprint 4: User Testing, Documentation, and Release Planning |
| 124 | + |
| 125 | +### 🎯 Goal: |
| 126 | +Conduct functional and usability testing, finalize user documentation, and prepare for internal/external release of the `git-gen3` tool. |
| 127 | + |
| 128 | +--- |
| 129 | + |
| 130 | +### 🔨 Tasks: |
| 131 | +| ID | Task Description | Est. | |
| 132 | +|------|------------------------------------------------------------------------------|------| |
| 133 | +| S4-1 | Recruit early adopters from internal teams or pilot projects | 0.5d | |
| 134 | +| S4-2 | Collect and triage feedback via GitHub issues or survey | 1d | |
| 135 | +| S4-3 | Perform functional validation of all workflows (track, init-meta, sync) | 1d | |
| 136 | +| S4-4 | Finalize and polish all CLI command help strings and usage messages | 0.5d | |
| 137 | +| S4-5 | Write end-user guide (markdown or GitHub Pages) with examples and FAQs | 1d | |
| 138 | +| S4-6 | Create changelog and release notes for v1.0 | 0.5d | |
| 139 | +| S4-7 | Define release checklist and governance process (e.g., approval flow) | 0.5d | |
| 140 | +| S4-8 | Tag first release, publish GitHub release, optionally publish to PyPI/Homebrew | 0.5d | |
| 141 | + |
| 142 | +--- |
| 143 | + |
| 144 | +### ✅ Deliverables: |
| 145 | +- End-user documentation published and linked from the repo |
| 146 | +- Feedback collected from test users and filed as GitHub issues |
| 147 | +- Final `v1.0.0` tag and release notes |
| 148 | +- Optional: Package published to PyPI (Python) or Homebrew (Go binary) |
| 149 | + |
| 150 | +--- |
| 151 | + |
| 152 | +## 📅 Sprint Timeline Summary (Updated) |
| 153 | + |
| 154 | +| Sprint | Focus | Duration | Deliverables | |
| 155 | +|--------|----------------------------------|----------|-----------------------------------------------| |
| 156 | +| 0 | Architecture validation (spike) | 1 week | Prototypes + greenlight for implementation | |
| 157 | +| 1 | Remote file tracking | 2 weeks | `track-remote`, `.lfs-meta`, validation hooks | |
| 158 | +| 2 | Metadata generation (FHIR) | 2 weeks | FHIR output, `init-meta`, validation tooling | |
| 159 | +| 3 | Git-based access control | 2 weeks | `sync-users`, Git audit trail, Fence sync | |
| 160 | +| 4 | Testing, docs, release planning | 1 week | Docs, feedback, `v1.0.0` release | |
| 161 | + |
| 162 | + |
| 163 | +--- |