In the context of environmental impact assessment, data often comes in the form of time series, where observations of various metrics are recorded at specific points in time with associated durations. The Impact Framework inherently works with time series data, as each observation within a component's inputs
array includes a timestamp
indicating the start time of the observation and a duration
specifying the length of the observation period. The collection of these timestamped observations forms the time series for that particular component.
However, a manifest file can encompass numerous components, each potentially having its own time series data. These individual time series might vary in their start and end times, their temporal resolution (frequency of observations), and may even contain gaps where data is missing. This heterogeneity in time series can pose challenges for tasks such as aggregation, visualisation, and performing meaningful comparisons between different components.
To address these challenges, Impact Framework provides a built-in time-sync
feature. This feature allows for the synchronisation of time series across all the components within a manifest file's tree. The time-sync
feature takes a global start time, end time, and a common interval (resolution) as configuration. It then processes each component's time series to conform to this global temporal configuration. This process involves several key steps:
- Upsampling: Each time series is first upsampled to a common base resolution (typically 1 second) to create a more granular foundation.
- Gap Filling: Any discontinuities or gaps within a component's time series (periods where no data was recorded) are filled in with "zero objects". These are observations with the same structure as the real data but with all usage metrics set to zero, effectively assuming no activity occurred during the missing periods.
- Start and End Time Alignment: The
time-sync
feature compares the start and end times of each component's time series with the globally defined start and end times. If a component's data starts after the global start, the beginning of its time series is padded with "zero objects" to align the start times. Conversely, if a component's data starts before the global start, the initial observations are trimmed. A similar trimming logic is applied to the end times to ensure all time series end at the global end time. - Batching to Interval: Once the start and end times are synchronised and any discontinuities are handled, the upsampled time series are then batched together into time bins according to the globally specified
interval
. This ensures that all synchronised time series have an identical temporal resolution.
The time-sync
feature can be configured within a component's pipeline
in the manifest file. While it can be placed at various points in the pipeline, it is generally recommended to position time-sync
as the first plugin whenever possible. This is because operating on fewer parameters earlier in the pipeline can improve efficiency, and it ensures that all subsequent plugins receive consistently synchronised time series data. However, specific use cases might necessitate placing time-sync
at a different point in the pipeline, and users should be aware of the potential impact this might have on the final results. The padding with zero values can also be toggled off in the manifest if a strict requirement for continuous, originally provided time series exists.
By yielding synchronised time series across all components, the time-sync
feature greatly facilitates visualisation, intercomparison, and is a prerequisite for the framework's aggregation functionality. For more details, refer to the Time documentation.
############################## ORIG COURSE ###############################
🤔
ASIM
This is really a section about “time”. We’ve not really discussed it before. What other time conversations are their worthwhile having?
For instance how to sync time across component, why it matters for aggregation, perhaps how to use the vizualiser for time base analysis, why is time based analysis important?
So far, your IMPs have only had a single element in the inputs
array. However, many applications will actually track usage metrics over time. Therefore inputs
will often be a time series.
When there are multiple observations, the inputs
array contains more than one object. Each one must have a timestamp
and duration
. When the length of inputs
is greater than one, Impact Framework will pass each and every element in inputs
through the plugin pipeline. Every entry in inputs
is treated identically and atomically.
An inputs
array with three timesteps might look as follows:
inputs:
- timestamp: 2023-08-01T00:00
duration: 3600
memory-utilization: 8.8
cpu-utilization: 80
tdp: 205
memory-coefficient: 0.00039
carbon-intensity: 163
- timestamp: 2023-08-02T00:00
duration: 3600
memory-utilization: 2.1
cpu-utilization: 60
tdp: 205
memory-coefficient: 0.00039
carbon-intensity: 163
- timestamp: 2023-08-03T00:00
duration: 3600
memory-utilization: 19.5
cpu-utilization: 70
tdp: 205
memory-coefficient: 0.00039
carbon-intensity: 163
When your IMP runs, Impact Framework applies each plugin in your pipeline to each element in your inputs array. In outputs
, although the values might vary, the fields in each element will be identical, because an identical set of plugins was executed.
With the three-entry inputs
described above, your outputs could look as follows:
outputs:
- timestamp: 2023-08-01T00:00
duration: 3600
memory-utilization: 8.8
cpu-utilization: 80
tdp: 205
memory-coefficient: 0.00039
carbon-intensity: 163
memory-energy-kwh: 0.003432
tdp-multiplier: 0.912
cpu-power: 186.96
cpu-energy-kwh: 0.18696000000000002
energy: 0.190392
operational-carbon: 31.033896000000002
- timestamp: 2023-08-02T00:00
duration: 3600
memory-utilization: 2.1
cpu-utilization: 60
tdp: 205
memory-coefficient: 0.00039
carbon-intensity: 163
memory-energy-kwh: 0.0008190000000000001
tdp-multiplier: 0.804
cpu-power: 164.82000000000002
cpu-energy-kwh: 0.16482000000000002
energy: 0.165639
operational-carbon: 26.999157
- timestamp: 2023-08-03T00:00
duration: 3600
memory-utilization: 19.5
cpu-utilization: 70
tdp: 205
memory-coefficient: 0.00039
carbon-intensity: 163
memory-energy-kwh: 0.007605
tdp-multiplier: 0.858
cpu-power: 175.89
cpu-energy-kwh: 0.17589
energy: 0.183495
operational-carbon: 29.909685
Try this yourself in Exercise 6!