
feat(automl): add backtesting charts to AutoML time series inference notebook#132

Merged
openshift-merge-bot[bot] merged 7 commits into
opendatahub-io:mainfrom
LukaszCmielowski:automl_ts_notebook
Jun 17, 2026

Conversation

@LukaszCmielowski

@LukaszCmielowski LukaszCmielowski commented Jun 16, 2026


Description of your changes:


The time series training pipeline already writes metrics/back_testing.json, but the generated inference notebook only printed raw JSON — no charts.

This PR adds offline matplotlib visualization aligned with the AutoGluon backtesting tutorial:

  • back_testing_charts.py — shared plotting module (render_back_testing_charts())
    • Per-window scores: Cutoff …: MASE = … print lines + table (from existing per_window_metrics)
    • Overall summary: mean eval_metric across validation windows (cross-series aggregate, no JSON schema change)
    • Best/worst series forecast panels with P10–P90 band and gray cutoff vlines
    • Lazy matplotlib imports (same pattern as ROC/PR cells in tabular notebooks)
  • timeseries_notebook.ipynb — new “Back-testing charts” section; helpers injected at notebook generation via notebook_backtest_charts_source()
  • autogluon_timeseries_models_training — wires chart source into notebook placeholder substitution
  • back_testing.py — cutoff in per-window metrics; P10/P90 quantile bounds in forecast data
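As a rough illustration of the per-window score lines and the overall summary described above, here is a minimal sketch. The record shape (`cutoff` and `MASE` keys), the sample values, and the helper names `format_window_lines` and `overall_score` are assumptions for illustration only; the real `per_window_metrics` entries in `metrics/back_testing.json` may be shaped differently.

```python
from statistics import mean

# Hypothetical per-window records; the actual keys and values written by the
# training pipeline may differ.
per_window_metrics = [
    {"cutoff": "2024-01-08", "MASE": 0.82},
    {"cutoff": "2024-01-15", "MASE": 0.95},
    {"cutoff": "2024-01-22", "MASE": 0.78},
]


def format_window_lines(windows, metric="MASE"):
    """Return one 'Cutoff <timestamp>: <metric> = <value>' line per window."""
    return [f"Cutoff {w['cutoff']}: {metric} = {w[metric]:.4f}" for w in windows]


def overall_score(windows, metric="MASE"):
    """Mean of the metric across validation windows."""
    return mean(w[metric] for w in windows)


for line in format_window_lines(per_window_metrics):
    print(line)
print(f"Mean MASE across windows: {overall_score(per_window_metrics):.4f}")
```

Because the summary is derived from records that already exist per window, a calculation like this needs no extra field in the JSON file.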

No new runtime dependencies. Works from precomputed artifacts only (no TimeSeriesPredictor required in the notebook).

Checklist:

Pre-Submission Checklist

Additional Checklist Items for New or Updated Components/Pipelines

  • metadata.yaml includes fresh lastVerified timestamp
  • All required files are present and complete
  • OWNERS file lists appropriate maintainers
  • README provides clear documentation with usage examples
  • Component follows snake_case naming convention
  • No security vulnerabilities in dependencies
  • Containerfile included if using a custom base image

Summary by CodeRabbit


  • New Features

    • Back-testing metrics and charts are now embedded in generated notebooks, including per-window forecast comparisons and prediction intervals.
    • Notebook charts render automatically when the back-testing data and render helpers are available, with improved navigation and guidance.
  • Bug Fixes

    • Improved quantile bound selection to better match P10/P90-style behavior.
    • Quantile rows now include explicit lower/upper quantile fields; per-window metrics now include the evaluation cutoff.
  • Documentation

    • Updated back-testing README guidance and verification timestamps.
  • Tests

    • Expanded unit and chart-rendering tests for quantile bounds, cutoff handling, chart/table helpers, and notebook helper source validity.

Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor
Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor
@coderabbitai

coderabbitai Bot commented Jun 16, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 02dab0fb-55ee-4100-8803-948dfd0cb6b3

📥 Commits

Reviewing files that changed from the base of the PR and between 2daaf22 and 1a608e6.

📒 Files selected for processing (5)
  • components/training/automl/shared/back_testing.py
  • components/training/automl/shared/back_testing_charts.py
  • components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb
  • components/training/automl/shared/tests/test_back_testing.py
  • components/training/automl/shared/tests/test_back_testing_charts.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • components/training/automl/shared/tests/test_back_testing.py
  • components/training/automl/shared/back_testing_charts.py
  • components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb
  • components/training/automl/shared/tests/test_back_testing_charts.py

📝 Walkthrough


  • back_testing.py refines quantile column selection to prefer P10/P90 bands (0.1/0.9) over prior thresholds, adds lower_quantile/upper_quantile fields to forecast rows, and includes evaluation cutoff values in per-window metrics.
  • back_testing_charts.py (new module, 435 lines) provides metric normalization, DataFrame conversion, tabular reporting, Matplotlib-based forecast plotting with optional quantile intervals, and performer selection. It exports the entry-point functions render_back_testing_metrics, render_back_testing_forecast_charts, and render_back_testing_charts, plus notebook_backtest_charts_source(), which extracts the module's own source for notebook embedding.
  • The timeseries notebook template gains "Back-testing metrics" and "Back-testing charts" sections with a <REPLACE_BACKTEST_PLOT_HELPERS> placeholder and conditional rendering logic.
  • component.py imports notebook_backtest_charts_source and substitutes the placeholder at notebook-generation time.
  • Tests verify quantile selection, metric aggregation, text output formatting, DataFrame operations, matplotlib rendering, performer ID resolution, and source code syntax validity.
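The placeholder substitution mentioned in the walkthrough could look roughly like the sketch below. It is not the code in component.py: the function name `substitute_placeholder`, the tiny in-memory template, and the stand-in helper source are all hypothetical. Only the placeholder string and the use of `str.replace()` come from this PR's discussion.

```python
import ast
import copy

PLACEHOLDER = "<REPLACE_BACKTEST_PLOT_HELPERS>"


def substitute_placeholder(notebook, helper_source):
    """Return a copy of the notebook dict with the placeholder replaced in code cells."""
    ast.parse(helper_source)  # fail early if the helper source is not valid Python
    result = copy.deepcopy(notebook)
    for cell in result.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # Notebook cell sources may be a single string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        cell["source"] = text.replace(PLACEHOLDER, helper_source)
    return result


# Minimal stand-ins for the notebook template and the injected helper source.
template = {
    "cells": [
        {"cell_type": "markdown", "source": "## Back-testing charts"},
        {
            "cell_type": "code",
            "source": ["<REPLACE_BACKTEST_PLOT_HELPERS>\n", "print('ready')\n"],
        },
    ]
}
helpers = "def render_back_testing_charts(data):\n    return len(data)\n"

generated = substitute_placeholder(template, helpers)
```

Replacing on the parsed cell sources rather than on the serialized .ipynb text avoids having to JSON-escape the injected source, which is one reason a sketch like this works on the dict.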

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

🚥 Pre-merge checks | ✅ 10
✅ Passed checks (10 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(automl): add backtesting charts to AutoML time series inference notebook' directly and accurately describes the main change: introducing backtesting chart visualization functionality to the inference notebook. |
| Description check | ✅ Passed | The description comprehensively addresses the changes across all modified files, explains the rationale and design decisions (lazy matplotlib imports, no new dependencies), and all pre-submission checklist items are marked complete with justifications provided. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Contribution Quality And Spam Detection | ✅ Passed | PR follows repository's own contribution template, makes substantial multi-file architectural changes with 348 lines of comprehensive tests (17 test methods), implements lazy matplotlib imports cor... |
| No Hardcoded Secrets | ✅ Passed | No hardcoded secrets detected. AWS credentials correctly use os.environ[] at runtime; no API keys, tokens, passwords, or base64 encoded credentials found in code or config files. |
| No Weak Cryptography | ✅ Passed | PR contains no weak cryptography: no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB, no custom crypto, no insecure secret comparisons. Changes are visualization-focused (matplotlib charts, quantile bounds, met... |
| No Injection Vectors | ✅ Passed | No injection vectors detected. PR uses safe json.load() deserialization, f-strings with dict values only, no eval/exec/sql/shell injection, and trusted source code injection via str.replace(). |
| No Privileged Containers | ✅ Passed | PR contains no Dockerfiles, Containerfiles, Kubernetes manifests, or Helm templates. Changes are limited to Python modules, Jupyter notebook, Markdown, and metadata.yaml—no container security confi... |
| No Sensitive Data In Logs | ✅ Passed | All logging statements log operational metrics (row counts, stage names, model names) and exceptions from AutoGluon training. No logging statements expose passwords, tokens, API keys, PII, session... |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.



@LukaszCmielowski LukaszCmielowski changed the title feat: plot backtesting charts in generated notebook feat(automl): add backtesting charts to AutoML time series inference notebook Jun 16, 2026

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@components/training/automl/shared/back_testing_charts.py`:
- Around line 16-21: Add error handling to the `_matplotlib()` function to
gracefully handle missing matplotlib dependencies. Wrap the import statements
for matplotlib.dates and matplotlib.pyplot in a try/except block that catches
ImportError, and when caught, raise a new ImportError with a clear message
indicating that matplotlib is required for chart rendering and should be
installed via the requirements.txt file. This aligns with the existing error
handling pattern used in other functions like `_present_frame()` for optional
imports.

In `@components/training/automl/shared/back_testing.py`:
- Around line 156-159: The _closest() function in the code independently selects
the closest quantile level to both 0.1 and 0.9 targets. When only one quantile
column exists, this results in both the lower and upper bounds being selected as
the same value, causing inconsistency. Modify the logic to select the lower
quantile first by calling _closest(0.1), then for the upper quantile, select
from the remaining levels (excluding the already-selected lower quantile) before
calling _closest(0.9). This ensures that when multiple quantiles are available,
lower and upper bounds are always distinct, and when only one exists, you can
handle it as a special case or raise an appropriate error.
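One possible shape for the distinct-bounds selection suggested above is sketched here. The function name `select_quantile_bounds` and the choice to return `None` for a single level are assumptions; the actual `_closest()` helper in back_testing.py is not shown in this thread and may handle the special case differently.

```python
def select_quantile_bounds(levels, lower_target=0.1, upper_target=0.9):
    """Pick distinct lower/upper quantile levels closest to the targets.

    Returns (lower, upper), or None when fewer than two distinct levels exist.
    """
    unique = sorted(set(levels))
    if len(unique) < 2:
        return None  # a single level cannot form an interval
    lower = min(unique, key=lambda q: abs(q - lower_target))
    # Exclude the already-selected lower level before choosing the upper one.
    remaining = [q for q in unique if q != lower]
    upper = min(remaining, key=lambda q: abs(q - upper_target))
    # Keep the pair ordered even for unusual level sets.
    return (lower, upper) if lower < upper else (upper, lower)


print(select_quantile_bounds([0.1, 0.5, 0.9]))
print(select_quantile_bounds([0.5]))
```

Returning `None` leaves it to the caller to skip the interval band; raising an error instead would be an equally reasonable reading of the comment.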

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: ac28aaa1-bcea-4472-b67a-0e67d96f6a95

📥 Commits

Reviewing files that changed from the base of the PR and between bb0a9e3 and 368325d.

📒 Files selected for processing (8)
  • components/training/automl/autogluon_timeseries_models_training/README.md
  • components/training/automl/autogluon_timeseries_models_training/component.py
  • components/training/automl/autogluon_timeseries_models_training/metadata.yaml
  • components/training/automl/shared/back_testing.py
  • components/training/automl/shared/back_testing_charts.py
  • components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb
  • components/training/automl/shared/tests/test_back_testing.py
  • components/training/automl/shared/tests/test_back_testing_charts.py

Comment thread components/training/automl/shared/back_testing_charts.py
Comment thread components/training/automl/shared/back_testing.py Outdated
Comment thread components/training/automl/shared/back_testing_charts.py Outdated
Comment thread components/training/automl/shared/back_testing_charts.py Outdated
Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
components/training/automl/shared/back_testing_charts.py (3)

287-290: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing error handling for malformed cutoff_start timestamp.

pd.to_datetime(cutoff_start) will raise ValueError on invalid timestamps. If cutoff_start comes from untrusted back_testing.json, wrap in try/except or validate input.
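A minimal sketch of the "skip the cutoff line if it cannot be parsed" option follows. To stay dependency-free it uses the standard library's `datetime.fromisoformat` as a stand-in for `pd.to_datetime`, so it accepts fewer formats than pandas does. The helper names `parse_cutoff` and `cutoffs_to_draw` are hypothetical and do not appear in back_testing_charts.py.

```python
from datetime import datetime


def parse_cutoff(value):
    """Return a datetime for an ISO-format cutoff string, or None if unparseable."""
    try:
        return datetime.fromisoformat(value)
    except (TypeError, ValueError):
        # TypeError covers non-string input; ValueError covers malformed strings.
        return None


def cutoffs_to_draw(raw_cutoffs):
    """Keep only the parseable cutoffs so one bad value does not abort the chart."""
    parsed = (parse_cutoff(c) for c in raw_cutoffs)
    return [c for c in parsed if c is not None]


print(cutoffs_to_draw(["2024-01-08T00:00:00", "not-a-date", None]))
```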

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/training/automl/shared/back_testing_charts.py` around lines 287 -
290, The _draw_cutoff function does not handle invalid timestamp formats in the
cutoff_start parameter, which will cause pd.to_datetime(cutoff_start) to raise a
ValueError if the input is malformed. Wrap the pd.to_datetime(cutoff_start) call
in a try/except block to gracefully handle ValueError exceptions, and either log
a warning and return early, or skip drawing the cutoff line if the timestamp
cannot be parsed. This ensures the function is resilient to invalid timestamps
from untrusted sources like back_testing.json.

81-87: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing validation for untrusted forecast_data structure.

pd.DataFrame(forecast_data) and pd.to_datetime(frame["timestamp"]) will raise exceptions on malformed input. If forecast_data comes from untrusted sources (e.g., user-uploaded back_testing.json), add validation or wrap in try/except to prevent crashes.
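If upstream validation is preferred over try/except, a check along these lines could run before the DataFrame is built. This is only a sketch: `validate_forecast_rows` is a hypothetical helper, and `REQUIRED_KEYS` lists just `timestamp` because that is the only column this comment names. The real schema likely requires more fields.

```python
REQUIRED_KEYS = ("timestamp",)  # assumed minimum; the real schema may require more


def validate_forecast_rows(forecast_data):
    """Raise ValueError with a readable message if rows are not usable for plotting."""
    if not isinstance(forecast_data, list) or not forecast_data:
        raise ValueError("forecast_data must be a non-empty list of row dicts")
    for index, row in enumerate(forecast_data):
        if not isinstance(row, dict):
            raise ValueError(f"row {index} is {type(row).__name__}, expected dict")
        missing = [key for key in REQUIRED_KEYS if key not in row]
        if missing:
            raise ValueError(f"row {index} is missing keys: {', '.join(missing)}")
    return forecast_data


rows = [{"timestamp": "2024-01-08", "mean": 1.0}]
validate_forecast_rows(rows)  # passes silently for well-formed input
```

A single informative `ValueError` raised here means callers only need to handle one exception type instead of whatever pandas happens to raise.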

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/training/automl/shared/back_testing_charts.py` around lines 81 -
87, The forecast_data_to_frame function lacks error handling for malformed input
from potentially untrusted sources like user-uploaded back_testing.json files.
Add validation or wrap the pd.DataFrame(forecast_data) call and the
pd.to_datetime(frame["timestamp"]) call in try/except blocks to gracefully
handle cases where the input structure is invalid, the required columns are
missing, or the timestamp cannot be parsed. Provide meaningful error handling
that either returns a valid empty DataFrame, logs the error details, or raises a
more informative exception rather than allowing the raw pandas exceptions to
propagate.

371-379: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Narrow exception handling may miss timestamp and structure errors.

The try/except only catches ValueError from _draw_forecast(). Malformed timestamps or invalid data structures will raise TypeError, KeyError, or pandas exceptions. Either broaden the exception type or add upstream validation in forecast_data_to_frame().
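The broader except clause could be sketched as below, with the drawing function passed in so the example stays self-contained. `render_each` and `fake_draw` are hypothetical names; the real call site wraps `_draw_forecast()` and may report failures differently.

```python
def render_each(series_ids, draw):
    """Call draw(series_id) for each series; collect failures instead of stopping."""
    rendered, skipped = [], {}
    for series_id in series_ids:
        try:
            draw(series_id)
        except (ValueError, TypeError, KeyError) as exc:
            # Record why the panel was skipped and continue with the next series.
            skipped[series_id] = f"{type(exc).__name__}: {exc}"
            continue
        rendered.append(series_id)
    return rendered, skipped


def fake_draw(series_id):
    """Stand-in for a plotting call; fails for one series to show the skip path."""
    if series_id == "series_b":
        raise KeyError("timestamp")  # simulates a row without the expected column


rendered, skipped = render_each(["series_a", "series_b", "series_c"], fake_draw)
print(rendered, skipped)
```

Catching a named tuple of exception types keeps unexpected errors, such as a programming bug that raises `AttributeError`, visible rather than silently swallowed.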

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/training/automl/shared/back_testing_charts.py` around lines 371 -
379, The exception handling around the _draw_forecast() call is too narrow,
catching only ValueError while malformed timestamps or invalid data structures
will raise TypeError, KeyError, or pandas exceptions. Broaden the except clause
to catch multiple exception types (ValueError, TypeError, KeyError, and pandas
exceptions like pandas.errors.ParserError) or add upstream validation in
forecast_data_to_frame() to ensure the data is properly validated before being
passed to _draw_forecast(), preventing these errors from occurring in the first
place. Choose whichever approach aligns with your error handling strategy.
♻️ Duplicate comments (1)
components/training/automl/shared/back_testing_charts.py (1)

16-21: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Missing ImportError handling remains unaddressed.

Past review flagged the lack of try/except coverage in _matplotlib(). If matplotlib is missing, render_back_testing_charts() will crash at line 344. Add graceful fallback or wrap the call site.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/training/automl/shared/back_testing_charts.py` around lines 16 -
21, The `_matplotlib()` function lacks error handling for missing matplotlib
dependencies, causing `render_back_testing_charts()` to crash if matplotlib is
not installed. Add a try/except block in the `_matplotlib()` function to catch
ImportError when importing matplotlib.dates and matplotlib.pyplot, and either
raise a more informative error or return None to indicate failure.
Alternatively, wrap the call to `_matplotlib()` within
`render_back_testing_charts()` with try/except handling to gracefully handle the
missing dependency and skip chart rendering or provide a user-friendly error
message.
🧹 Nitpick comments (1)
components/training/automl/shared/tests/test_back_testing_charts.py (1)

268-315: 💤 Low value

Test validates fixed date format that strips hour information.

Line 309 asserts DateFormatter is used, which validates the "%m-%d" format on line 177 of back_testing_charts.py. Past review flagged this format as problematic for hourly datasets. The test correctly verifies the implementation but locks in potentially incorrect behavior. Consider updating the test to verify default date formatting instead.
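To make the concern concrete, this small standard-library sketch shows how a fixed "%m-%d" pattern collapses hourly timestamps into one repeated label. The sample timestamps and the alternative pattern are illustrative only.

```python
from datetime import datetime, timedelta

start = datetime(2024, 1, 8, 0, 0)
hourly = [start + timedelta(hours=h) for h in range(6)]

# Fixed day-only format: every hourly tick renders as the same label.
day_only = [ts.strftime("%m-%d") for ts in hourly]
# Format that keeps the hour: each tick gets a distinct label.
with_hour = [ts.strftime("%m-%d %H:%M") for ts in hourly]

print(sorted(set(day_only)))  # ['01-08']
print(len(set(with_hour)))    # 6
```

Matplotlib's `AutoDateFormatter` and `ConciseDateFormatter` (both in `matplotlib.dates`) adapt the label format to the visible range, which is one way to read "default date formatting" here.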

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@components/training/automl/shared/tests/test_back_testing_charts.py` around
lines 268 - 315, The test_plot_timeseries_forecasts_styles_date_axis test is
validating a fixed date format ("%m-%d") that strips hour information, which is
problematic for hourly datasets. Update the assertion on line 309 that checks
for DateFormatter (the isinstance check for axis.xaxis.get_major_formatter()) to
instead verify that default date formatting is used, which would appropriately
preserve time information for hourly datasets rather than locking in the
potentially incorrect fixed format behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@components/training/automl/shared/back_testing_charts.py`:
- Around line 183-184: The _target_column_name function attempts to access the
first column of a DataFrame without validating that columns exist, which will
raise an IndexError if the DataFrame is empty. Add a validation check at the
beginning of the _target_column_name function to ensure data.columns is not
empty, and either raise a descriptive ValueError or return a default value if no
columns are present, before attempting to access data.columns[0].

In
`@components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb`:
- Around line 283-289: The code on line 289 calls
render_back_testing_forecast_charts(back_testing) without verifying that the
back_testing variable exists. Add a check for "back_testing" in globals() to the
conditional logic. If back_testing does not exist (i.e., the Back-testing
metrics cell was skipped or failed), print a message indicating the variable is
not available instead of attempting to use it and raising a NameError.
- Around line 257-262: Add JSON structure validation at both locations where
back_testing.json is loaded. At
components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb
lines 257-262 (anchor site with render_back_testing_metrics call) and lines
433-438 (sibling site in plot cell), validate the loaded JSON dictionary against
the expected schema before passing it to rendering functions. You can either add
a validation function that checks for required keys and types once before each
render call, or wrap each json.load and subsequent render call in a try/except
block to catch and gracefully handle malformed data.
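The two notebook guards described above might be combined roughly as follows. The helper names `load_back_testing` and `render_if_available` are hypothetical, and the top-level check (a JSON object) is an assumed minimum rather than the real schema.

```python
import json


def load_back_testing(path):
    """Return the parsed back-testing payload, or None with a printed reason."""
    try:
        with open(path, encoding="utf-8") as handle:
            payload = json.load(handle)
    except (OSError, json.JSONDecodeError) as exc:
        print(f"Skipping back-testing output: cannot read {path} ({exc})")
        return None
    if not isinstance(payload, dict):
        print("Skipping back-testing output: expected a JSON object at the top level")
        return None
    return payload


def render_if_available(namespace, renderer_name="render_back_testing_forecast_charts"):
    """Render only when both the data and the helper exist in the given namespace."""
    data = namespace.get("back_testing")
    renderer = namespace.get(renderer_name)
    if data is None:
        print("back_testing is not available; run the Back-testing metrics cell first.")
        return False
    if not callable(renderer):
        print(f"{renderer_name} is not defined; chart helpers were not injected.")
        return False
    renderer(data)
    return True
```

In a notebook cell this would be called as `render_if_available(globals())`, so a skipped or failed metrics cell produces a message instead of a NameError.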


ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 36c82ebf-04d1-4929-8830-435b328031f3

📥 Commits

Reviewing files that changed from the base of the PR and between 368325d and 2daaf22.

📒 Files selected for processing (3)
  • components/training/automl/shared/back_testing_charts.py
  • components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb
  • components/training/automl/shared/tests/test_back_testing_charts.py

Comment thread components/training/automl/shared/back_testing_charts.py
Comment thread components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb Outdated
Comment thread components/training/automl/shared/notebook_templates/timeseries_notebook.ipynb Outdated
Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor
Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor
Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor
@LukaszCmielowski

Author

/ok-to-test

Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor
@LukaszCmielowski

Author

/ok-to-test

@DorotaDR DorotaDR left a comment


/approve

@openshift-ci openshift-ci Bot added the lgtm label Jun 17, 2026
@openshift-ci

openshift-ci Bot commented Jun 17, 2026


[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: DorotaDR

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot Bot merged commit 84febff into opendatahub-io:main Jun 17, 2026
21 of 23 checks passed
@LukaszCmielowski LukaszCmielowski deleted the automl_ts_notebook branch June 17, 2026 11:25
LukaszCmielowski added a commit to LukaszCmielowski/pipelines-components that referenced this pull request Jun 17, 2026
Resolved conflict in timeseries_data_loader/component.py:
- Kept enhanced sample_rows logic with ISO timestamp conversion (from PR opendatahub-io#132)
- Kept display_name metadata at start of context (from this PR)
- Kept write_outputs status tracking (from PR opendatahub-io#132)

Merged changes:
- PR opendatahub-io#132: AutoML timeseries notebook backtesting charts
- PR opendatahub-io#138: ai4rag 0.6.4 and ogx-client 1.1.0 updates

Signed-off-by: Lukasz Cmielowski <lcmielow@redhat.com>
Assisted-by: Cursor