A pipeline to retrieve data from the Superhero API, and export it to a csv file. You can read more about this API here: https://superheroapi.com/index.html. This project includes the pipeline script, a simple test suite, and a scheduled action/cron job script
- A Github access token is required to authenticate with the API. You can quickly login to your Github and generate one using the link above.
- If you do not have a Python virtual env set up, run
python -m venv /path/to/new/virtual/environmentto create it, and activate it by running./<venv>\Scripts\activateon windows or<venv>\Scripts\activate.baton linux - Run
pip install -r requirements.txt - To run the pipeline, run
python transformation.pyfrom the project directory and follow the prompts. The csv will be generated in the same directory
All main pipeline code is confined to one python script, for simplicity. The script is divided into functions for: API calls, data manipulation, and exporting. If this project were to grow, I envision a more modular, scaled setup with individual packages for each of the functions I listed.