This repository contains a sample Spark job in Python. The script demonstrates:
- Reading data from a CSV file (the Titanic dataset) into a Spark DataFrame.
- Performing a simple transformation (grouping by the `Age` column and counting the rows in each group).
- Optionally running a user-provided SQL query on the data (via Valohai parameters).
- Writing the transformation result and, when a query is provided, the SQL result to disk.
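
The steps above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the argument names (`--input`, `--output`, `--query`), the default file name `titanic.csv`, the temp view name `titanic`, and the output subdirectories are all assumptions made for the example. It also assumes the Valohai parameter reaches the script as a command-line argument.

```python
import argparse


def parse_args(argv=None):
    """Parse job arguments. All argument names and defaults here are hypothetical."""
    parser = argparse.ArgumentParser(description="Sample Spark job (sketch)")
    parser.add_argument("--input", default="titanic.csv", help="Path to the input CSV")
    parser.add_argument("--output", default="output", help="Directory for results")
    parser.add_argument("--query", default="", help="Optional SQL query to run")
    return parser.parse_args(argv)


def run(args):
    """Read the CSV, count rows per Age, optionally run SQL, and write results."""
    # Imported here so that argument parsing works without a Spark installation.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("sample-spark-job").getOrCreate()
    try:
        # Read the CSV into a DataFrame, using the first row as column names.
        df = spark.read.csv(args.input, header=True, inferSchema=True)

        # Simple transformation: number of rows for each distinct Age value.
        age_counts = df.groupBy("Age").count()
        age_counts.write.mode("overwrite").csv(f"{args.output}/age_counts", header=True)

        # Optional step: run the user-provided query against a temp view.
        if args.query:
            df.createOrReplaceTempView("titanic")  # assumed view name
            result = spark.sql(args.query)
            result.write.mode("overwrite").csv(f"{args.output}/query_result", header=True)
    finally:
        spark.stop()


# Entry point for a real run (requires pyspark and the input file):
# run(parse_args())
```

With these assumed names, a query such as `SELECT Sex, COUNT(*) FROM titanic GROUP BY Sex` could be passed as `--query=...`. Leaving `--query` empty skips the SQL step and writes only the per-age counts.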