ETL Conagua

What is it?

This repository is a small ETL pipeline built with Spark (Scala) on top of a public API. It performs three steps:

Requests the following endpoint to download the GZIP file with the daily weather forecast for Mexico by municipality: https://smn.conagua.gob.mx/tools/GUI/webservices/?method=1
Converts the GZIP file into a JSON file
Reads the data with Spark and writes it to a Parquet file
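
The GZIP-to-JSON step can be sketched with nothing but the JDK's `java.util.zip` classes. This is a minimal illustration, not the code in this repository: the object name `GunzipSketch` and its helpers are hypothetical, and the sample payload is made up so the example runs without network access.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, InputStream, OutputStream}
import java.nio.charset.StandardCharsets
import java.util.zip.{GZIPInputStream, GZIPOutputStream}

// Hypothetical helper object; the names here are not taken from the repository.
object GunzipSketch {

  // Copies the decompressed contents of a GZIP stream into `out`.
  def gunzip(in: InputStream, out: OutputStream): Unit = {
    val gz = new GZIPInputStream(in)
    try {
      val buffer = new Array[Byte](8192)
      var n = gz.read(buffer)
      while (n != -1) {
        out.write(buffer, 0, n)
        n = gz.read(buffer)
      }
    } finally gz.close()
  }

  // Demo-only helper: compresses a string so there is something to decompress.
  def gzip(text: String): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val gz = new GZIPOutputStream(bytes)
    try gz.write(text.getBytes(StandardCharsets.UTF_8))
    finally gz.close()
    bytes.toByteArray
  }

  // Compresses and then decompresses `text` in memory.
  def roundTrip(text: String): String = {
    val out = new ByteArrayOutputStream()
    gunzip(new ByteArrayInputStream(gzip(text)), out)
    new String(out.toByteArray, StandardCharsets.UTF_8)
  }
}
```

For the real download, `gunzip` would presumably be given a `FileInputStream` over the downloaded archive and a `FileOutputStream` for the resulting JSON file.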

Running it is pretty simple: you just need to check that sbt and Scala are properly installed.

To install dependencies:

sbt compile

To run a health check:

sbt "runMain etl.hello.Hello"

If everything went well, then run:

sbt "runMain etl.Main"
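
For orientation, the Spark step (JSON in, Parquet out) might look roughly like the sketch below. It is an assumption-laden illustration rather than the contents of `etl.Main`: the object name, the file paths, the local master, and the `multiLine` option (needed if the JSON file is a single array spanning several lines) are all guesses, and it requires the `spark-sql` dependency on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

// Hypothetical sketch; not the repository's actual entry point.
object JsonToParquetSketch {

  // Reads a JSON file into a DataFrame and writes it back out as Parquet.
  def run(spark: SparkSession, jsonPath: String, parquetPath: String): Unit = {
    val forecast: DataFrame = spark.read
      .option("multiLine", "true") // assumption: the file is one multi-line JSON document
      .json(jsonPath)

    forecast.write
      .mode(SaveMode.Overwrite) // replace any output left by a previous run
      .parquet(parquetPath)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("etl-conagua-sketch")
      .master("local[*]") // assumption: local run on all available cores
      .getOrCreate()

    // Both paths are placeholders chosen for illustration.
    try run(spark, "data/forecast.json", "data/forecast.parquet")
    finally spark.stop()
  }
}
```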

To run the tests:

sbt test
