Add new fields for metadata tracking #64

@natemcintosh

Goal

We would like to add more and better metadata tracking to our Rt model runs. Ideally, this would come close to what a tool like MLflow's tracking can do, but without needing to set up and manage our own central server.

Context

We currently save a number of fields from the config into the metadata output of the model, e.g. disease, geo_value, task_id, job_id, etc. As we use our new Rt model more and more, new use cases arise that require new metadata fields and tags. To that end, we would like to add three more fields to the configs generated here.

Required

  • group_id: A string, default null (JSON) / None (Python). When set, it is a string describing the group. The group ID can be applied to a collection of job IDs that make up, e.g., a backtest (API_v2_backtest), the job IDs that make up production in a given week (2025-05-16-production), or some other group ...
  • tags: A dict[str, Any], default {}. You can add as many key: value tags as you like. This could be useful for noting which runs are production runs, that a run is of importance to a certain person, or the git hash of the model used. It's definitely worth thinking about whether some of these should be broken out into their own top-level key: value pairs. This was mostly inspired by MLflow's set of default tags. Potential issue: a dynamic dictionary won't play nicely with our downstream conversion to a polars dataframe, because the set of keys can differ from run to run. Maybe we make this just a list of strings?
  • notes: Default null. A place for writing notes for our later selves, e.g. "because of issue X, we decided to use a different prior for parameter Y".
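
To make the proposal concrete, here is a minimal sketch of how the three fields and their defaults might look in Python. The class name `RunMetadata` and the helpers `tags_as_strings` and `to_record` are hypothetical, not part of the existing config code, and typing `notes` as an optional string is an assumption. It also shows one possible way to handle the polars concern: flattening `tags` into a list of "key=value" strings so that every record has the same shape.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class RunMetadata:
    """Hypothetical container for the three proposed fields."""

    group_id: Optional[str] = None  # null in JSON, None in Python
    tags: Dict[str, Any] = field(default_factory=dict)  # fresh {} per instance
    notes: Optional[str] = None  # assumed to be free text when set

    def tags_as_strings(self) -> List[str]:
        """Flatten tags into "key=value" strings, sorted by key.

        A list of strings has the same type for every run, unlike a
        dict whose keys vary from run to run.
        """
        return [f"{key}={value}" for key, value in sorted(self.tags.items())]

    def to_record(self) -> Dict[str, Any]:
        """Return a JSON-serializable record with flattened tags."""
        return {
            "group_id": self.group_id,
            "tags": self.tags_as_strings(),
            "notes": self.notes,
        }


# All defaults: nothing set.
default = RunMetadata()

# A run in a production group; the git hash is a placeholder value.
run = RunMetadata(
    group_id="2025-05-16-production",
    tags={"production": True, "git_hash": "abc1234"},
    notes="because of issue X, we decided to use a different prior for parameter Y",
)

print(json.dumps(default.to_record()))
print(json.dumps(run.to_record(), indent=2))
```

The trade-off in this sketch is that flattened tags lose their value types (`True` becomes the string "production=True"), so anything we expect to filter or join on may be better promoted to its own top-level field.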

Labels

  • documentation: Improvements or additions to documentation
  • metadata: For managing metadata throughout the Rt estimation process
