Add request archive data lake design doc #188

Open
alekhya1098 wants to merge 1 commit into saayam-for-all:main from alekhya1098:request-archive-data-lake-design
@alekhya1098

Summary

This PR adds a design document for archiving closed-request data from the operational database into an S3-based data lake.

The design covers:

  • Problem statement and goals
  • Functional and non-functional requirements
  • Proposed AWS architecture
  • Extraction strategy and tradeoffs
  • S3 layout, partitioning, Parquet format, and lifecycle policy
  • Glue Data Catalog and Athena query layer
  • Schema evolution strategy
  • PII and privacy handling
  • Alternatives considered
  • Failure modes and observability
  • Cost estimate at current and 10x scale
  • Rollout and rollback plan
  • Open questions for the team

Notes

This is a design-doc-only change. No pipeline implementation is included.

Issue

References #175
