nickefy/Data-Engineering---Gmail-Pipeline

Data-Engineering---Gmail-Pipeline

Gmail Data Pipeline

Automatically extract, transform and load data from your Gmail inbox into your preferred data warehouse on a daily basis.

An automation system that organises your Gmail attachments into your database. Keep only what you need and scrap the rest. Easy to use and no hassle. Stop downloading your attachments and uploading them into your data warehouse manually.

This repo contains the main operators and the DAG to execute the Pipeline.

Operators to execute the Pipeline in order:

1. Crawl through the Gmail inbox and download all attachments into GCS

2. Check whether there are any attachments to be loaded

3. Load all the attachments into Google BigQuery

4. Check for duplicate loads in Google BigQuery

5. Write logs

6. Send email
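
The order of the steps above can be sketched in plain Python. This is only an illustration of the control flow, not the repo's operators: `run_pipeline`, `fetch_attachments`, `warehouse` and `send_email` are hypothetical names, a list stands in for both GCS and BigQuery, and it assumes that logging and the email still run when there is nothing to load.

```python
from collections import Counter
from typing import Callable, Iterable, List


def run_pipeline(
    fetch_attachments: Callable[[], Iterable[str]],
    warehouse: List[str],
    send_email: Callable[[str], None],
) -> List[str]:
    """Run the six steps once and return the log lines (hypothetical helper)."""
    log: List[str] = []

    # 1. Crawl the inbox and stage attachments (stands in for the GCS upload).
    staged = list(fetch_attachments())
    log.append(f"staged {len(staged)} attachment(s)")

    # 2. Only load when there is something to load.
    if staged:
        # 3. Load every staged attachment (stands in for the BigQuery load).
        warehouse.extend(staged)
        log.append(f"loaded {len(staged)} attachment(s)")

        # 4. Flag anything that now appears more than once in the warehouse.
        duplicates = sorted(
            name for name, count in Counter(warehouse).items() if count > 1
        )
        log.append(f"duplicates: {duplicates}")
    else:
        log.append("nothing to load")

    # 5. Write logs (printed here), then 6. send the summary email.
    for line in log:
        print(line)
    send_email("\n".join(log))
    return log
```

Calling `run_pipeline(lambda: ["sales.csv"], [], print)` runs all six steps against an empty in-memory warehouse. In the actual DAG each numbered step would presumably be its own Airflow task, with step 2 deciding whether the load tasks run.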

Documentation: https://towardsdatascience.com/data-engineering-how-to-build-a-gmail-data-pipeline-on-apache-airflow-ce2cfd1f9282
