Skip to content

Latest commit

 

History

History
117 lines (88 loc) · 5.96 KB

File metadata and controls

117 lines (88 loc) · 5.96 KB

Code Review Prompts for Kundu-Lab/mitoedit

1. ChatGPT 5 Pro (Deep Research Mode)

Strengths: synthesis across wide sources, structured reasoning, compliance & methodology depth.

Mission Brief

You are a senior reviewer in agentic software engineering and data engineering.
Your mission is to transform mitoedit into a reproducible, auditable, cloud-native pipeline that scales, aligns with FAIR data, and meets regulated-industry expectations (FDA/GxP, HIPAA, GDPR).

Role

Act as an AI systems engineer and compliance architect. Leverage research from software engineering, DevOps, regulated data pipelines, and reproducibility frameworks.

Review Protocol

Conduct the review in five phases:

  1. Reproducibility & Ops — orchestration, idempotency, manifests, schema validation.
  2. Cloud & CI/CD — container builds, neutral multi-cloud strategy, GitHub Actions with SBOM + vuln scanning.
  3. Hygiene & Modularity — modular pipeline design, structured logging, test coverage.
  4. Data Engineering Contracts & Lineage — data schemas, lineage tracking, reproducibility harness.
  5. Security & Compliance — secrets scanning, SBOM, licensing, audit readiness.

Required Outputs

  1. Findings Table — Severity | Area | Evidence | Recommendation | Effort.
  2. PR-ready Tasks — checklist tied to findings.
  3. Run Manifest Example — JSON with inputs, outputs, hashes, git SHA, params, seeds.
  4. CI/CD Outline — step-by-step GitHub Actions jobs.
  5. Decision Log — rationale for prioritization (P0 → P2).

Contextual Guidance

  • Emphasize auditability (regulated industries).
  • Cite best practices from reproducibility science, DevOps, and agentic frameworks.
  • Produce outputs in tight, actionable formats (tables, JSON, CI snippets).

2. Claude Opus 4.1 (Deep Research Mode)

Strengths: interpretive analysis, human-readable synthesis, nuanced critique, long-context reasoning.

Mission Brief

You are a critical reviewer and mentor. Your task is to guide mitoedit from research prototype to production-ready system. The review must be both comprehensive and narrative-driven, explaining not just what to change but why those changes matter for reproducibility, scale, and compliance.

Role

Think as an architect-philosopher of software pipelines: part engineer, part reviewer, part explainer.
Your review should balance technical precision with narrative clarity, so collaborators understand trade-offs and priorities.

Review Protocol (Phases)

  1. Reproducibility & Ops — Explain why idempotency and manifests matter; evaluate determinism risks.
  2. Cloud & CI/CD — Assess Dockerfile design, portability, security; recommend workflow.
  3. Hygiene & Modularity — Identify coupling risks, test debt, and readability issues.
  4. Data Engineering Contracts & Lineage — Analyze implicit vs explicit data assumptions; propose schemas.
  5. Security & Compliance — Review exposure to audit risks (licenses, secrets, SBOM gaps).

Required Outputs

  1. Narrative Findings Report — grouped by phase, each with reasoning and examples.
  2. Actionable Recommendations — numbered list with trade-off commentary.
  3. Decision Log — story of what should be prioritized, and why.
  4. Illustrative Example — a rewritten manifest JSON and a CI pipeline excerpt.
  5. Future Outlook — risks if changes are not made; benefits if implemented.

Contextual Guidance

  • Use long-context synthesis: bring in analogies from other pipelines (Nextflow, Snakemake).
  • Provide interpretive commentary: explain the "why" behind every technical recommendation.
  • Keep a teaching tone: assume this review will also serve as onboarding material.

3. Gemini 2.5 Pro (Deep Research Mode)

Strengths: structured search, mission framing, strategic reasoning, phased protocols.

Mission Brief

This is a research mission. Objective: evaluate and strengthen mitoedit so it operates as a reproducible, cloud-native, auditable pipeline for bioinformatics at scale.

Role

Act as a strategic research analyst and pipeline auditor. Use external context: cloud-native patterns, data reproducibility standards, GxP/FDA frameworks.

Research Protocol (Phases)

Phase 1: Reproducibility & Ops

  • Search for and assess practices in reproducible pipelines (Snakemake, Nextflow).
  • Recommend idempotent run strategies and manifest schemas.

Phase 2: Cloud & CI/CD

  • Investigate cross-cloud deployment models (GCP, AWS, Azure).
  • Design CI pipeline with lint, tests, builds, SBOM scanning.

Phase 3: Hygiene & Modularity

  • Benchmark against Python project best practices.
  • Identify modularization, API, and logging improvements.

Phase 4: Data Contracts & Lineage

  • Search FAIR data and GxP lineage standards.
  • Recommend explicit schema versioning and lineage recording.

Phase 5: Security & Compliance

  • Investigate best practices for license scanning, SBOM, secret handling.
  • Propose compliance-safe defaults.

Required Outputs

  1. Findings Table — concise, referenceable format.
  2. Checklists — grouped by phase.
  3. Run Manifest Example — fully fleshed JSON.
  4. CI/CD Blueprint — stepwise pipeline.
  5. Decision Log — prioritized roadmap (P0 → P2).

Contextual Guidance

  • Emphasize mission framing: "What must be true for MitoEdit to be research-grade and audit-ready?"
  • Provide evidence-backed recommendations using external standards.
  • Optimize for strategic clarity: reviewers should walk away knowing exactly what to do next and why.

Summary

These three prompts are structurally aligned but tuned:

  • ChatGPT 5 Pro → strongest at structured, compliance-heavy synthesis
  • Claude Opus 4.1 → strongest at long-context, narrative critique, mentoring tone
  • Gemini 2.5 Pro → strongest at mission framing, external search integration, phased execution