Put metadata for each column under the stats for that column #122
Hi @SirMore,
Thanks so much for adding the operation metadata to the output of the quends results. Parsing the resulting output dictionaries (e.g. the compute_statistics() output), I see a key for each processed column holding its stats results, plus a separate "metadata" key that holds the metadata for all columns.
I was wondering if it would make more sense to add a metadata key under the results of the stats for each column, rather than have the metadata be a separate key in the results. For example:
{'Q_D/Q_GBD': {'confidence_interval': (34.88, 40.4),
               'effective_sample_size': 24,
               'mean': 37.64,
               'mean_uncertainty': 1.41,
               'pm_std': (36.23, 39.05),
               'window_size': 49,
               'sss_start': 162.5,
               'metadata': [{'operation': 'is_stationary'},
                            {'operation': 'trim',
                             'options': {'batch_size': 50,
                                         'method': 'rolling_variance',
                                         'robust': True,
                                         'start_time': 0,
                                         'threshold': 0.5}},
                            {'operation': 'effective_sample_size',
                             'options': {'alpha': 0.05}},
                            {'operation': 'compute_statistics',
                             'options': {'ddof': 1,
                                         'method': 'non-overlapping',
                                         'window_size': None}}]}}
In the example above, I also moved sss_start out of the metadata and into the stats results for that column.
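To make the restructuring concrete, here is a rough sketch of a helper that moves the metadata under each column. The name nest_metadata is hypothetical, and the shape I use for the current top-level "metadata" entry (a dict keyed by column name, holding "sss_start" and an "operations" list) is only an assumption for illustration; the real layout in quends may differ.

```python
from copy import deepcopy


def nest_metadata(results, metadata_key="metadata"):
    """Return a new dict with each column's metadata stored under that column.

    Hypothetical helper: assumes results[metadata_key] maps each column name
    to a dict with an optional 'sss_start' and an 'operations' list.
    """
    shared = results.get(metadata_key, {})
    nested = {}
    for column, stats in results.items():
        if column == metadata_key:
            continue  # the separate metadata entry is folded into the columns
        column_stats = deepcopy(stats)
        column_meta = deepcopy(shared.get(column, {}))
        # Promote sss_start to a regular statistic when it is present.
        if "sss_start" in column_meta:
            column_stats["sss_start"] = column_meta.pop("sss_start")
        column_stats["metadata"] = column_meta.get("operations", [])
        nested[column] = column_stats
    return nested


# Made-up input in the assumed "separate metadata key" layout.
old_style = {
    "Q_D/Q_GBD": {"mean": 37.64, "mean_uncertainty": 1.41},
    "metadata": {
        "Q_D/Q_GBD": {
            "sss_start": 162.5,
            "operations": [{"operation": "is_stationary"}],
        }
    },
}
new_style = nest_metadata(old_style)
```

The helper copies rather than mutates, so the original results dictionary stays usable while both layouts are being compared.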
I can see that the current layout takes an operational view, since the metadata can originate from one operation being applied to multiple columns. However, I think storing the metadata with each column's result would make it easier to parse automatically: a user who grabs the statistics for one column keeps all of its metadata, rather than having to remember to fetch the metadata separately and store it in a different object.
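As a small illustration of the parsing benefit, the sketch below works on a single column's result in the proposed layout. The values are copied (and shortened) from my example; options_for is a hypothetical helper, not something quends provides.

```python
# One column's result in the proposed layout (shortened from the example).
column_result = {
    "mean": 37.64,
    "mean_uncertainty": 1.41,
    "sss_start": 162.5,
    "metadata": [
        {"operation": "is_stationary"},
        {"operation": "trim",
         "options": {"batch_size": 50, "method": "rolling_variance",
                     "robust": True, "start_time": 0, "threshold": 0.5}},
        {"operation": "effective_sample_size", "options": {"alpha": 0.05}},
    ],
}


def options_for(result, operation):
    """Return the options recorded for an operation, or None if it was not applied."""
    for record in result.get("metadata", []):
        if record["operation"] == operation:
            # Operations recorded without options yield an empty dict.
            return record.get("options", {})
    return None


# Everything needed to describe this column travels with the one object.
applied = [record["operation"] for record in column_result["metadata"]]
trim_options = options_for(column_result, "trim")
```

Because the operation history sits inside the column's own dictionary, passing column_result to another function or saving it to disk carries the provenance along with no extra bookkeeping.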
What are your thoughts?