Managing Time Series Data with the Time Feature

In the context of environmental impact assessment, data often comes in the form of time series, where observations of various metrics are recorded at specific points in time with associated durations. The Impact Framework inherently works with time series data, as each observation within a component's inputs array includes a timestamp indicating the start time of the observation and a duration specifying the length of the observation period. The collection of these timestamped observations forms the time series for that particular component.

However, a manifest file can encompass numerous components, each potentially having its own time series data. These individual time series might vary in their start and end times, their temporal resolution (frequency of observations), and may even contain gaps where data is missing. This heterogeneity in time series can pose challenges for tasks such as aggregation, visualisation, and performing meaningful comparisons between different components.

To address these challenges, Impact Framework provides a built-in time-sync feature. This feature allows for the synchronisation of time series across all the components within a manifest file's tree. The time-sync feature takes a global start time, end time, and a common interval (resolution) as configuration. It then processes each component's time series to conform to this global temporal configuration. This process involves several key steps:

Upsampling: Each time series is first upsampled to a common base resolution (typically 1 second) to create a more granular foundation.
Gap Filling: Any discontinuities or gaps within a component's time series (periods where no data was recorded) are filled in with "zero objects". These are observations with the same structure as the real data but with all usage metrics set to zero, effectively assuming no activity occurred during the missing periods.
Start and End Time Alignment: The time-sync feature compares the start and end times of each component's time series with the globally defined start and end times. If a component's data starts after the global start, the beginning of its time series is padded with "zero objects" to align the start times. Conversely, if a component's data starts before the global start, the initial observations are trimmed. A similar trimming logic is applied to the end times to ensure all time series end at the global end time.
Batching to Interval: Once the start and end times are synchronised and any discontinuities are handled, the upsampled time series are then batched together into time bins according to the globally specified interval. This ensures that all synchronised time series have an identical temporal resolution.

The time-sync feature can be configured within a component's pipeline in the manifest file. While it can be placed at various points in the pipeline, it is generally recommended to position time-sync as the first plugin whenever possible. This is because operating on fewer parameters earlier in the pipeline can improve efficiency, and it ensures that all subsequent plugins receive consistently synchronised time series data. However, specific use cases might necessitate placing time-sync at a different point in the pipeline, and users should be aware of the potential impact this might have on the final results. The padding with zero values can also be toggled off in the manifest if a strict requirement for continuous, originally provided time series exists.

By yielding synchronised time series across all components, the time-sync feature greatly facilitates visualisation, intercomparison, and is a prerequisite for the framework's aggregation functionality. For more details, refer to the Time documentation.

############################## ORIG COURSE ###############################

Observation arrays

🤔

ASIM

This is really a section about “time”. We’ve not really discussed it before. What other time conversations are their worthwhile having?

For instance how to sync time across component, why it matters for aggregation, perhaps how to use the vizualiser for time base analysis, why is time based analysis important?

So far, your IMPs have only had a single element in the inputs array. However, many applications will actually track usage metrics over time. Therefore inputs will often be a time series.

When there are multiple observations, the inputs array contains more than one object. Each one must have a timestamp and duration. When the length of inputs is greater than one, Impact Framework will pass each and every element in inputs through the plugin pipeline. Every entry in inputs is treated identically and atomically.

An inputs array with three timesteps might look as follows:

inputs:
	- timestamp: 2023-08-01T00:00
		duration: 3600
	  memory-utilization: 8.8
	  cpu-utilization: 80
	  tdp: 205
	  memory-coefficient: 0.00039
	  carbon-intensity: 163
	- timestamp: 2023-08-02T00:00
		duration: 3600
	  memory-utilization: 2.1
	  cpu-utilization: 60
	  tdp: 205
	  memory-coefficient: 0.00039
	  carbon-intensity: 163
	- timestamp: 2023-08-03T00:00
		duration: 3600
	  memory-utilization: 19.5
	  cpu-utilization: 70
	  tdp: 205
	  memory-coefficient: 0.00039
	  carbon-intensity: 163

When your IMP runs, Impact Framework applies each plugin in your pipeline to each element in your inputs array. In outputs, although the values might vary, the fields in each element will be identical, because an identical set of plugins was executed.

With the three-entry inputs described above, your outputs could look as follows:

outputs:
  - timestamp: 2023-08-01T00:00
    duration: 3600
    memory-utilization: 8.8
    cpu-utilization: 80
    tdp: 205
    memory-coefficient: 0.00039
    carbon-intensity: 163
    memory-energy-kwh: 0.003432
    tdp-multiplier: 0.912
    cpu-power: 186.96
    cpu-energy-kwh: 0.18696000000000002
    energy: 0.190392
    operational-carbon: 31.033896000000002
  - timestamp: 2023-08-02T00:00
    duration: 3600
    memory-utilization: 2.1
    cpu-utilization: 60
    tdp: 205
    memory-coefficient: 0.00039
    carbon-intensity: 163
    memory-energy-kwh: 0.0008190000000000001
    tdp-multiplier: 0.804
    cpu-power: 164.82000000000002
    cpu-energy-kwh: 0.16482000000000002
    energy: 0.165639
    operational-carbon: 26.999157
  - timestamp: 2023-08-03T00:00
    duration: 3600
    memory-utilization: 19.5
    cpu-utilization: 70
    tdp: 205
    memory-coefficient: 0.00039
    carbon-intensity: 163
    memory-energy-kwh: 0.007605
    tdp-multiplier: 0.858
    cpu-power: 175.89
    cpu-energy-kwh: 0.17589
    energy: 0.183495
    operational-carbon: 29.909685

Try this yourself in Exercise 6!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!