Data engineering pipeline for gym data processing, built with PySpark, Databricks, and Azure Data Lake Storage (ADLS).
- Developed the pipeline in PySpark on Databricks, with data stored in Azure Data Lake Storage (ADLS).
- Ingested data from multiple sources: CSV files, JSON files, and Kafka topics.
- Implemented data processing workflows that use Databricks Unity Catalog for centralized data management and access control.
- Applied the Medallion architecture (bronze, silver, and gold layers) to structure data within the lakehouse.
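
The layering described above can be illustrated with a minimal sketch. It is written in plain Python rather than PySpark so that it runs without a cluster, and it is not the project's actual code: the field names (`member_id`, `workout_date`, `minutes`), the sample records, and the function names are all hypothetical, and Kafka ingestion is left out. In the real pipeline each layer would presumably be a table in the lakehouse rather than an in-memory list.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical raw inputs; the real sources and schemas are not described here.
RAW_CSV = (
    "member_id,workout_date,minutes\n"
    "1,2024-01-01,45\n"
    "2,2024-01-01,\n"  # missing value, rejected at the silver layer
    "1,2024-01-02,30\n"
)
RAW_JSON = '[{"member_id": 2, "workout_date": "2024-01-02", "minutes": 60}]'


def to_bronze(csv_text, json_text):
    """Bronze: land records unchanged, tagging each with its source."""
    rows = [dict(r, source="csv") for r in csv.DictReader(io.StringIO(csv_text))]
    rows += [dict(r, source="json") for r in json.loads(json_text)]
    return rows


def to_silver(bronze_rows):
    """Silver: enforce types and drop records that fail validation."""
    silver_rows = []
    for row in bronze_rows:
        try:
            silver_rows.append({
                "member_id": int(row["member_id"]),
                "workout_date": str(row["workout_date"]),
                "minutes": int(row["minutes"]),
            })
        except (KeyError, TypeError, ValueError):
            # A production pipeline would more likely quarantine bad rows.
            continue
    return silver_rows


def to_gold(silver_rows):
    """Gold: aggregate total workout minutes per member."""
    totals = defaultdict(int)
    for row in silver_rows:
        totals[row["member_id"]] += row["minutes"]
    return dict(totals)


bronze = to_bronze(RAW_CSV, RAW_JSON)
silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

Keeping the bronze layer untouched means the silver and gold layers can be rebuilt from it if validation or aggregation rules change later.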