
DataEngineering-Internship

πŸ‘‹ Welcome to my 3-month internship at ByteWise Limited as a Data Engineering intern. Throughout the internship, I worked on a diverse range of tasks that helped me develop a strong grasp of Data Engineering principles. This repository reflects the culmination of my internship journey, showcasing the skills I acquired in Data Engineering ✌️.

Table of Contents

  • 🌟 Project Overview
  • πŸš€ Technologies and Tools Used
  • πŸ“‚ Folder Structure
  • πŸ’₯ Project
  • πŸ† Results and Achievements
  • βœ”οΈ Conclusion

🌟 Project Overview

During the internship, I worked on a series of tasks assigned by our lead to gain a strong understanding of Data Engineering concepts and practices. The tasks covered every step of learning Data Engineering from beginning to end, enabling me to develop practical skills and knowledge in the field. ✨

πŸš€ Technologies and Tools Used

Programming Languages: Python, SQL

Data Processing Frameworks: Apache Spark

Operating System:

Cloud Services: Microsoft Azure (Databricks, Data Factory), AWS, Google Cloud Platform (GCP)

πŸ“‚ Folder Structure

The repository is organized into folders, each representing a specific task assigned during the internship. Here's an overview of the folders and their contents:

➑️ Month 1: Covered all the basics of Data Engineering, such as:

  • Data Lake, Database, Data Warehouse, Data Marts, Data Mesh
  • OLTP, OLAP
  • ETL, ELT, and the fundamentals of Structured Query Language (SQL) and Python (a minimal ETL sketch follows this list).
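
Below is a minimal ETL sketch in Python and SQL in the spirit of these basics. It is only illustrative: the file name sales.csv, its columns, and the SQLite database warehouse.db are hypothetical placeholders, not part of the actual internship tasks.

```python
import csv
import sqlite3

# Extract: read raw rows from a CSV file (hypothetical file and schema).
with open("sales.csv", newline="") as f:
    raw_rows = list(csv.DictReader(f))

# Transform: normalise product names and cast amounts to numbers.
clean_rows = [
    (row["date"], row["product"].strip().lower(), float(row["amount"]))
    for row in raw_rows
]

# Load: write the cleaned rows into a SQLite table standing in for a warehouse.
conn = sqlite3.connect("warehouse.db")
conn.execute("CREATE TABLE IF NOT EXISTS sales (date TEXT, product TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", clean_rows)
conn.commit()

# An OLAP-style aggregation over the loaded data, expressed in SQL.
for product, total in conn.execute("SELECT product, SUM(amount) FROM sales GROUP BY product"):
    print(product, total)
conn.close()
```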

➑️ Month 2: Studied Microsoft Azure Databricks and covered the following concepts with hands-on experience in Azure Databricks:

  • Databricks Clusters
  • Apache Spark Overview, Using SQL in Spark
  • Databases
  • Tables
  • Views
  • Analysis
  • Data Ingestion – CSV, JSON, Multiple Files (a minimal ingestion sketch follows this list)
  • Azure Data Factory
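
The sketch below illustrates, under stated assumptions, how CSV, JSON, and multiple files can be ingested and then queried with SQL in Spark. It assumes a local PySpark installation (on Azure Databricks the spark session is already provided), and the paths, column names, and view names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

# On Databricks a `spark` session already exists; build one here for a local run.
spark = SparkSession.builder.appName("ingestion-demo").getOrCreate()

# Data ingestion: a single CSV file, a JSON file, and multiple files via a glob pattern.
orders = spark.read.option("header", True).csv("/data/orders.csv")
customers = spark.read.json("/data/customers.json")
events = spark.read.option("header", True).csv("/data/events/*.csv")

# Register temporary views so the DataFrames can be queried as tables.
orders.createOrReplaceTempView("orders")
customers.createOrReplaceTempView("customers")

# Using SQL in Spark: join the views and aggregate.
result = spark.sql("""
    SELECT c.name, COUNT(*) AS order_count
    FROM orders o
    JOIN customers c ON o.customer_id = c.id
    GROUP BY c.name
""")
result.show()
```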

➑️ Month 3: During this month, we covered the cloud services listed below and gained in-depth knowledge of each cloud provider:

  1. AWS Services

    • Extract, Transform and Load (ETL) in AWS (a minimal sketch follows this list)

  2. Google Cloud Platform (GCP) Services

    • Extract, Transform and Load (ETL) in GCP

  3. Slowly Changing Dimensions (SCD)

    • Gained sound knowledge of the types of SCDs, as listed below, and practiced SCD Type 2 on a dataset (a pandas-based sketch follows this list).
      πŸ‘‰ SCD Type 1
      πŸ‘‰ SCD Type 2
      πŸ‘‰ SCD Type 3
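
Below is a minimal sketch of ETL in AWS, as referenced in the AWS item above. It is an illustrative assumption rather than the exact internship solution: the bucket names raw-zone and curated-zone, the object keys, and the column names are hypothetical, and it assumes boto3 is installed and AWS credentials are configured.

```python
import csv
import io

import boto3

s3 = boto3.client("s3")

# Extract: download the raw CSV from the (hypothetical) raw S3 bucket.
obj = s3.get_object(Bucket="raw-zone", Key="orders/orders.csv")
rows = list(csv.DictReader(io.StringIO(obj["Body"].read().decode("utf-8"))))

# Transform: keep only completed orders and add a computed total column.
transformed = [
    {**r, "total": float(r["price"]) * int(r["quantity"])}
    for r in rows
    if r["status"] == "completed"
]

# Load: write the curated CSV back to the (hypothetical) curated bucket.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=transformed[0].keys())
writer.writeheader()
writer.writerows(transformed)
s3.put_object(
    Bucket="curated-zone",
    Key="orders/orders_curated.csv",
    Body=out.getvalue().encode("utf-8"),
)
```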
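
And here is the pandas-based SCD Type 2 sketch referenced in the SCD item above: a hypothetical customer dimension where a changed attribute expires the current row and inserts a new current row. The column names and sample data are assumptions for illustration only.

```python
import pandas as pd

# Existing dimension table with validity columns; None in valid_to marks the current row.
dim = pd.DataFrame({
    "customer_id": [1, 2],
    "city": ["Lahore", "Karachi"],
    "valid_from": ["2024-01-01", "2024-01-01"],
    "valid_to": [None, None],
    "is_current": [True, True],
})

# Incoming change: customer 1 moved to a new city (hypothetical sample data).
change = {"customer_id": 1, "city": "Islamabad", "effective_date": "2024-06-01"}

# Step 1: expire the current row for that business key.
mask = (dim["customer_id"] == change["customer_id"]) & dim["is_current"]
dim.loc[mask, "valid_to"] = change["effective_date"]
dim.loc[mask, "is_current"] = False

# Step 2: insert a new current row carrying the changed attribute.
new_row = pd.DataFrame([{
    "customer_id": change["customer_id"],
    "city": change["city"],
    "valid_from": change["effective_date"],
    "valid_to": None,
    "is_current": True,
}])
dim = pd.concat([dim, new_row], ignore_index=True)

print(dim)
```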

πŸ’₯ Project

The project was for a client with a social media app who required a comprehensive solution architecture for cloud-based data management. In this project, I did the following:
πŸ‘‰ Provided a cloud platform
πŸ‘‰ Devised a comprehensive end-to-end solution
πŸ‘‰ Conducted a thorough cost analysis for the project

πŸ† Results and Achievements

Throughout the internship, I successfully completed all the assigned tasks, gaining valuable experience and insights into Data Engineering practices.

βœ”οΈ Conclusion

Feel free to explore the repository and its tasks to get a comprehensive understanding of the project and my capabilities in Data Engineering.
