feat(data-engineering): add data-engineering extension family by jleto · Pull Request #266 · awslabs/aidlc-workflows

jleto · 2026-05-14T21:59:34Z

Summary

Introduces an opt-in data-engineering extension family under aidlc-rules/aws-aidlc-rule-details/extensions/data-engineering/ with seven extensions: baseline, catalog, cicd, orchestration, redshift, s3-lakehouse, and glue-etl.
Each extension follows the established pattern: a rule file (<name>.md) plus an opt-in prompt file (<name>.opt-in.md) that uses the A/B/X answer format and an [Answer]: tag.
Rules carry P0 (blocking) / P1 (warning) / P2 (advisory) severity tiers defined in baseline, and verification bullets keyed to AIDLC stages (Requirements Analysis, Functional Design, Infrastructure Design, Code Generation, Build and Test, NFR Requirements).
glue-etl covers Glue as a compute surface (Spark, Spark Streaming, Ray, Python shell, interactive sessions, Studio, DataBrew), complementing the catalog and s3-lakehouse extensions, which govern the Glue Data Catalog and Lake Formation.
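
As a rough illustration of the layout and severity tiers described above, the sketch below models them in Python. The directory path, extension names, and tier meanings come from this summary; the flat file layout (both files sitting directly in the family directory) and every Python name here are assumptions made for illustration, not part of the AIDLC loader.

```python
from enum import Enum
from pathlib import PurePosixPath

# Directory and extension names are taken from the PR summary.
EXTENSION_ROOT = PurePosixPath(
    "aidlc-rules/aws-aidlc-rule-details/extensions/data-engineering"
)
EXTENSIONS = [
    "baseline", "catalog", "cicd", "orchestration",
    "redshift", "s3-lakehouse", "glue-etl",
]


class Severity(Enum):
    """Severity tiers as described for the baseline extension."""
    P0 = "blocking"
    P1 = "warning"
    P2 = "advisory"


def expected_files(name):
    """Return (rule file, opt-in file) for one extension.

    ASSUMPTION: both files sit directly in the family directory; the PR
    text does not say whether each extension has its own subdirectory.
    """
    return (EXTENSION_ROOT / f"{name}.md", EXTENSION_ROOT / f"{name}.opt-in.md")


def blocks_workflow(severity):
    """Only P0 findings are described as blocking."""
    return severity is Severity.P0


if __name__ == "__main__":
    for ext in EXTENSIONS:
        rule_file, opt_in_file = expected_files(ext)
        print(rule_file.name, opt_in_file.name)
```
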

Test plan

Verify each *.opt-in.md file is discovered by the extensions loader at workflow start
Confirm that opting IN for an extension loads the corresponding rules file on demand
Confirm that opting OUT skips loading the full rules file
Run a sample inception flow that opts into baseline plus glue-etl, and validate that the rules surface in Requirements Analysis
Confirm composition references between extensions (e.g., s3-lakehouse S3LH-02 referenced by glue-etl GLUE-05) resolve as expected
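
The discovery and cross-reference checks in this test plan could be approximated offline with a small script such as the one below. It is a sketch under stated assumptions, not the extensions loader itself: the rule-ID pattern is inferred from the two IDs quoted above (S3LH-02, GLUE-05), and treating a markdown heading that contains an ID as that rule's definition is a guess about how the rule files are written. All function names are hypothetical.

```python
import re
from pathlib import Path

# ASSUMPTION: rule IDs look like S3LH-02 or GLUE-05 (prefix, hyphen, two digits).
RULE_ID = re.compile(r"\b[A-Z][A-Z0-9]*-\d{2}\b")
OPT_IN_SUFFIX = ".opt-in.md"


def discover_opt_ins(root):
    """Map extension name -> opt-in file for every *.opt-in.md under root."""
    return {
        p.name[: -len(OPT_IN_SUFFIX)]: p
        for p in sorted(Path(root).rglob("*" + OPT_IN_SUFFIX))
    }


def missing_rule_files(root):
    """Extension names whose opt-in prompt has no sibling <name>.md rule file."""
    return [
        name
        for name, opt_in in discover_opt_ins(root).items()
        if not (opt_in.parent / f"{name}.md").is_file()
    ]


def unresolved_references(root):
    """Rule IDs mentioned in rule files but not defined in any of them.

    ASSUMPTION: an ID counts as defined when it appears on a markdown
    heading line; any other occurrence counts as a reference.
    """
    defined, referenced = set(), set()
    for path in Path(root).rglob("*.md"):
        if path.name.endswith(OPT_IN_SUFFIX):
            continue  # opt-in prompts are not rule files
        for line in path.read_text(encoding="utf-8").splitlines():
            ids = RULE_ID.findall(line)
            if line.lstrip().startswith("#"):
                defined.update(ids)
            else:
                referenced.update(ids)
    return referenced - defined
```

Running `missing_rule_files` and `unresolved_references` against the data-engineering directory and expecting an empty result from both would cover the discovery and composition items statically; the opt-in and opt-out loading behavior still needs a real workflow run.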

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of the project license.

Introduce opt-in data-engineering extensions for AIDLC: baseline (cross-cutting data pipeline rules), catalog (dataset registration and discovery), cicd (version control and promotion), orchestration (MWAA and Step Functions), redshift (provisioned and Serverless), s3-lakehouse (Iceberg/Delta on S3 with Glue Catalog), and glue-etl (Spark, Streaming, Ray, Python shell, interactive sessions, Studio, DataBrew). Each extension follows the rule + opt-in file pattern with P0/P1/P2 severity tiers and AIDLC-stage-keyed verification.

jleto requested review from a team, Kalindi-Dev, harmjeff, leandrodamascena, raj-jain-aws, scottschreckengaust and spraja08 May 14, 2026 21:59

jleto requested review from a team as code owners May 14, 2026 21:59

github-actions Bot added the rules label May 14, 2026

jleto wants to merge 1 commit into awslabs:main from jleto:extensions-data-engineering
