Mechanism for creating and populating tables #9

@jthandy

There are multiple cases where we want to supply supplemental data to be joined with the data used for analysis:

  • Mapping data to decode status codes received from services. For example, Pardot's visitor_activities has type and type_name fields that we're decoding and then mapping to event_action.
  • Creating calendar tables. There isn't a standard way to do this in Redshift and, after much investigation, the best approach really is to join against a table that contains all relevant dates.
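
To make the calendar-table case concrete, here is a minimal sketch of generating the rows in Python before loading them. The function name `calendar_rows` and the column names (`date_day`, `year`, `month`, `day_of_week`) are assumptions for illustration, not an agreed schema.

```python
from datetime import date, timedelta
from typing import Dict, Iterator, Union


def calendar_rows(start: date, end: date) -> Iterator[Dict[str, Union[date, int]]]:
    """Yield one row per day from start to end, inclusive.

    Column names are hypothetical; adjust them to whatever the
    analysis models expect to join against.
    """
    current = start
    while current <= end:
        yield {
            "date_day": current,
            "year": current.year,
            "month": current.month,
            # ISO weekday: Monday is 1, Sunday is 7
            "day_of_week": current.isoweekday(),
        }
        current += timedelta(days=1)
```

The rows could then be written out to a CSV or inserted directly, depending on which population method the table uses.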

To deal with this, we need a procedural way to build and tear down datasets that is fully integrated into the way we're deploying models. This should likely be a Python script that is integrated into runner.py and runs multiple Python files, each responsible for building or tearing down a particular table. Some tables will be best built via code and some will be best loaded from a CSV, so it should be flexible enough to handle multiple methods of data population.
