Many of the data streams I deal with on fleets are AC energy streams, and they are frequently cumulative, i.e. always increasing. I need to correct these streams by differencing them so they look like normal (non-cumulative) data. I wrote a simple function that checks whether the data is almost always increasing (my passing threshold is that at least 95% of the point-to-point differences are increasing) and differences the data if so. Here it is:
import warnings


def cumulativeEnergyCheck(energy_series, pct_increase_threshold=95):
    """
    Check if an energy time series represents cumulative energy or not.

    Returns the (possibly differenced) series and a boolean flag.
    """
    differenced_series = energy_series.diff().dropna()
    # Default to "not cumulative" so the return below is always defined.
    cumulative_energy = False
    if len(differenced_series) == 0:
        warnings.warn("The energy time series has a length of zero and "
                      "cannot be run.")
    else:
        # If over X percent of the data is increasing (set via the
        # pct_increase_threshold), then assume that the column is cumulative.
        # Differences down to -0.5 are still counted as increasing.
        differenced_series_positive_mask = (differenced_series >= -.5)
        # mean() of a boolean mask avoids a KeyError when no value is True.
        pct_over_zero = differenced_series_positive_mask.mean() * 100
        if pct_over_zero >= pct_increase_threshold:
            energy_series = energy_series.diff()
            cumulative_energy = True
    return energy_series, cumulative_energy
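To illustrate the check, here is a minimal sketch on synthetic data. The helper name `is_cumulative`, its `tolerance` parameter, and the sample values are assumptions made up for this example; none of them exist in PVAnalytics. It applies the same rule as the function above: difference the series, then see what share of the steps is at or above the tolerance.

```python
import pandas as pd


def is_cumulative(energy_series, pct_increase_threshold=95, tolerance=-0.5):
    """Hypothetical helper: True if enough steps are non-decreasing."""
    steps = energy_series.diff().dropna()
    if steps.empty:
        # Nothing to compare, so do not claim the series is cumulative.
        return False
    # Share of steps at or above the tolerance, as a percentage.
    pct_increasing = (steps >= tolerance).mean() * 100
    return bool(pct_increasing >= pct_increase_threshold)


# Synthetic data (made up): interval energy over a day-like profile,
# and the cumulative meter reading built from it.
interval_energy = pd.Series([0.0, 1.2, 3.4, 5.1, 4.8, 2.0, 0.3, 0.0])
cumulative_energy = interval_energy.cumsum()

cumulative_flag = is_cumulative(cumulative_energy)
interval_flag = is_cumulative(interval_energy)

# Differencing the cumulative stream recovers the interval values;
# the first element becomes NaN because it has no predecessor.
recovered = cumulative_energy.diff()

print(cumulative_flag, interval_flag)
```

Note that differencing leaves a NaN in the first position, so anything consuming the corrected stream would need to drop or fill that value.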
I'd like to adapt this and add it into PVAnalytics. @cwhanse and @kanderso-nrel what do you think?