At the moment the final product of the `current_elections` stack is only defined as input to the outcodes lambda (`dc-data-baker/cdk/stacks/current_elections.py`, lines 201 to 202 in 4b7da07):

```python
"dest_bucket_name": pollingstations_private_data.bucket_name,
"dest_path": f"addressbase/{self.dc_environment}/current_elections_parquet",
```

This means that if we wanted another step that checks the data we have written (e.g. the single source of addressbase, or #21), we would have to copy and paste the output definition. I have a feeling Glue can't handle the level of partitioning involved, so I'm not sure it makes sense to define it as a `GlueTable`. So we probably want to define a new model and create the output from that.
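
As a rough illustration of what such a model could look like, here is a minimal sketch. The class name `S3ParquetDataset`, its methods, and the `current_elections_parquet` helper are all hypothetical and do not exist in the repo; only the `dest_bucket_name` / `dest_path` keys and the path pattern are taken from the snippet in the stack. The idea is to define the output once and let both the outcodes lambda and any later checking step derive their inputs from it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3ParquetDataset:
    """Hypothetical model: a partitioned Parquet dataset in S3 that is
    deliberately not registered as a Glue table."""

    bucket_name: str
    path: str

    @property
    def s3_uri(self) -> str:
        # Full prefix under which the partitioned files live.
        return f"s3://{self.bucket_name}/{self.path.strip('/')}/"

    def as_dest_input(self) -> dict:
        # Same keys the outcodes lambda input currently hard-codes.
        return {
            "dest_bucket_name": self.bucket_name,
            "dest_path": self.path,
        }


def current_elections_parquet(bucket_name: str, dc_environment: str) -> S3ParquetDataset:
    # Hypothetical helper: builds the single definition of the stack's
    # final output, using the path pattern from the existing snippet.
    return S3ParquetDataset(
        bucket_name=bucket_name,
        path=f"addressbase/{dc_environment}/current_elections_parquet",
    )
```

In the stack this would presumably be built once from `pollingstations_private_data.bucket_name` and `self.dc_environment`, with `as_dest_input()` spread into the lambda input and the same object handed to any follow-up check.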