Skip to content

sahilgundu/tier1-swiss-bank-regulatory-reporting-lakehouse-gcp

Regulatory Reporting Lakehouse — Tier-1 Swiss Bank (GCP) · Sanitized Case Study

Docs Version CI Pre-Commit License: MIT

Docs-only, sanitized case study. No executable code. Names/data are illustrative; patterns are production-style and interview-ready.

⚙️ Note on Structure:
This repository represents a docs-only case study.
The following folders exist purely as placeholders to illustrate a production-grade layout:

  • dbt/ – transformation layer (models, macros, configs)
  • composer/ – orchestration DAGs for dataflow jobs and checks
  • sql/ – validation or reporting queries used in regulatory pipelines

These folders contain no executable code in this sanitized version; they are retained to preserve authentic Data Engineering structure.

L2 Architecture

%% Source duplicated from assets/l2_arch.mmd
%% Keep GitHub-friendly: plain rectangles, \n for line breaks
flowchart TB
  subgraph Perimeter[VPC‑SC + CMEK]
    DS[(Ingestion)]
    LS[Bronze]
    SS[Silver]
    BQ[(BigQuery raw/stage/gold)]
    CMP[Orchestration]
    DLP[PII Controls]
  end
  DS --> LS --> SS --> BQ --> EX[Exports]
  CMP -. schedules .-> DS & SS & BQ & EX

Loading

Docs Index

Note: This is a docs-only repo. Folders dbt/, composer/, and sql/ are placeholders to show production structure.

Quick Facts

Area Choice
Cloud Google Cloud Platform (GCP)
Core services (design target) BigQuery, Cloud Storage, Pub/Sub, Dataflow/Dataproc, Cloud Composer, DLP, KMS, VPC‑SC
Workload types T+1 batch reporting; near-real-time surveillance feeds
Data layers Bronze (raw), Silver (validated), Gold (reg marts/views)
Controls Contracts, DQ, lineage, SLOs, audit trail, PII minimization
Compliance mapping BCBS 239, MiFID II, EMIR, Basel III (illustrative)
Export pattern XBRL/CSV + manifest + checksums handed to RegOps
Safety Docs-only. Folders dbt/, composer/, sql/ are placeholders (no runnable artifacts).

Executive summary

  • Problem: Legacy reg stacks were siloed and audit-weak.
  • Solution: GCP lakehouse blueprint with contracted ingestion, curated gold marts, DQ/lineage, and automated export packs.
  • Impact (illustrative): T+1 SLA met; fewer manual exceptions; lower BigQuery scan costs; zero‑touch pipeline replays.

Program context & timeline

During the parent‑bank integration (2023–2025), statutory filings remained the duty of the Swiss legal entities until merger dates. The lakehouse was designed/overseen under the integration umbrella. For the Swiss legal entity, the transformation ran mid‑2023 → mid‑2024, with migration and cut‑over in July 2024 ahead of broader rollout.


Repository structure (docs-only)

  • assets/ — Mermaid architecture diagrams.
  • contracts/specs for datasets & governance (YAML).
  • docs/ — business context, controls, SLOs, lineage, exports, FAQ, runbooks.
    • docs/patterns/pseudocode for orchestration, DQ, and gold marts (annotated examples).
  • ADRs/ — architectural decisions.
  • samples/ — tiny, obviously fake deliverable examples (manifest/checksums/CSV).
  • dbt/, composer/, sql/placeholders only (README + .gitkeep).

This repository intentionally excludes runnable code. See docs/patterns/* for non‑executable examples.

Reuse this pattern

  • Adapt contracts in contracts/datasets/*.yml.

  • Implement orchestration per docs/patterns/orchestration.md.

  • Model gold marts from docs/patterns/gold_marts.md.

  • Export packs via docs/exports.md structure.


🧭 Git Study References

This repository includes my personal Git study notes and live command tracker that demonstrate disciplined version control used across all my Data Engineering projects.

File Purpose
tools/README.md How this works + daily routine. One-click or manual flow to append today’s git ... commands and push.
docs/git-commands-cheatsheet.md Comprehensive reference of essential Git commands with context, explanations, and workflow notes.
tools/git-snippets.sh Running log of actual Git commands executed during repo setup, experiments, and cleanup tasks.

These files show my applied Git workflow discipline — from branching and commits to CI/CD hygiene — following industry practices for production-grade repositories.

License

MIT.

About

GCP-based Regulatory Reporting Lakehouse — Tier-1 Swiss Bank (Simulated Case Study):- Documentation-only repo illustrating a cloud-native data lakehouse architecture for regulatory reporting on Google Cloud Platform (GCS + BigQuery + Dataflow + Composer). Includes ADRs, runbooks, and compliance data contracts.

Topics

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages