Step Log visualization optimization #62

clemens-fricke · 2025-12-03T15:13:41Z

While using the step log visualization tool more and more a few issue came to light.

When hiding a trace, by clicking on it in the legend, it would reappear after changing the episode
Colors in the top plot repeated for reward and the first objective
Only one legend existed and not separate legends for each subplot
File sizes get really big if the step log is big
Responsiveness of the visualization was sluggish for large files

To combat this the visualization was optimized. The biggest part is the fix for 1, 4, and 5 which were fixed by changing how the slider works. Before all episode traces were plotted separately at the same time but than hidden for all episode but the one that was shown. Now one one trace is plotted for each variable and the data is exchanged for each episode. This solves both the responsiveness, file size and trace hiding problems.

Quantifiable is the easiest for point 4: For a specifically large step log we get a visualization file that is 374Mb in size and for the optimized the file size shrinks to 78Mb.

The new visualization looks like this:

…ined in top of file. Less color repetition in subplots

… subplot

…n stay hidden while switching episode

…ign var getters

Copilot

Pull request overview

This PR optimizes the step log visualization tool to address performance and usability issues. The key improvement is replacing the approach of pre-plotting all episodes (and hiding unselected ones) with dynamically updating trace data via the slider control. This change significantly reduces file sizes (e.g., from 374MB to 78MB in large cases) and improves responsiveness while fixing trace hiding bugs and color repetition issues.

Key changes:

Refactored slider implementation to update trace data dynamically instead of toggling visibility of pre-rendered traces
Added separate legends for each subplot instead of a single shared legend
Improved memory management with explicit cleanup of intermediate dataframes

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2025-12-03T16:46:39Z

releso/util/visualization.py

+    # per_episode is used to define the x/y data for each episode
+    per_episode_x = []
+    per_episode_y = []


The variable names per_episode_x and per_episode_y are not fully descriptive. Consider renaming to episode_x_data and episode_y_data or per_episode_x_values and per_episode_y_values to make it clearer that these are lists of data values for each episode.

Copilot · 2025-12-03T16:46:39Z

releso/util/visualization.py

    env_id: int,
    episode_start: int = 0,
-    episode_end: int = np.iinfo(int).max,
+    episode_end: int = 10**12,


[nitpick] The default value changed from np.iinfo(int).max to 10**12. While 10**12 is simpler, it's a magic number that could theoretically be exceeded (though unlikely in practice). Consider using sys.maxsize or float('inf') which are more idiomatic for representing "no upper limit", or at least add a comment explaining why 10**12 was chosen.

Copilot · 2025-12-03T16:46:39Z

releso/util/visualization.py

+        with the selector [("obs", (0, 2)), ("infos.geometry_information", (3, 6))]
+        the function will extract the first two observation values and the
+        geometry information values (which are part of the info field) from index 3
+         to 5 (6 is exclusive) from


Incomplete sentence in documentation example. The sentence ends with "from index 3 to 5 (6 is exclusive) from" but doesn't specify what the data is extracted from (e.g., "from the dataframe" or "from each row").

Suggested change

to 5 (6 is exclusive) from

to 5 (6 is exclusive) from the dataframe.

Copilot · 2025-12-03T16:46:40Z

releso/util/visualization.py

-    ].tolist()
-
-    # Get all unique episodes
+        raise ValueError("Empty or malformed step log.")


The error message "Empty or malformed step log." is less informative than the original message that included the file path. Consider including the file path to help users identify which file is problematic: f"The provided step log file {step_log_file} is empty or does not follow the current format."

Suggested change

raise ValueError("Empty or malformed step log.")

raise ValueError(f"The provided step log file {step_log_file} is empty or does not follow the current format.")

Copilot · 2025-12-03T16:46:40Z

releso/util/visualization.py

+    if episode_end is None or episode_end > max_ep:
+        episode_end = max_ep
+
+    mask = (df["episodes"] >= episode_start) & (df["episodes"] <= episode_end)


The episode selection logic no longer handles the case where no episodes match the specified range. The original code had a try-except block to catch IndexError and provide a helpful error message showing the available episode range. Without this error handling, users will get a cryptic IndexError or potentially an empty dataframe that leads to confusing behavior later. Consider adding back error handling:

if not mask.any(): raise IndexError( f"Could not find any episode in the range {episode_start} to {episode_end}. " f"The available episodes range from {df['episodes'].min()} to {df['episodes'].max()}" )

Suggested change

mask = (df["episodes"] >= episode_start) & (df["episodes"] <= episode_end)

mask = (df["episodes"] >= episode_start) & (df["episodes"] <= episode_end)

if not mask.any():

raise IndexError(

f"Could not find any episode in the range {episode_start} to {episode_end}. "

f"The available episodes range from {df['episodes'].min()} to {df['episodes'].max()}"

)

Copilot · 2025-12-03T16:46:41Z

releso/util/visualization.py

+    # Not able to extract the reset values without info field. prev_obs field shows
+    # last obs from previous episode not reset value in first step.


This comment mentions a limitation but it's unclear what the implications are or if this is a known issue that needs to be addressed. Consider clarifying: (1) why this limitation exists, (2) whether it affects the visualization, and (3) if there's a planned solution or workaround. If this is important context, it should be more explicit.

Suggested change

# Not able to extract the reset values without info field. prev_obs field shows

# last obs from previous episode not reset value in first step.

# Limitation: Unable to extract the reset values for each episode because the required

# "info" field is missing from the step log data. As a result, the "prev_obs" field

# only contains the last observation from the previous episode, not the reset value at

# the start of the current episode. This means that the initial state of each episode

# cannot be visualized, which may affect analyses that rely on episode initialization.

# Currently, there is no workaround unless the data collection process is updated to

# include the "info" field with reset values.

Copilot · 2025-12-03T16:46:41Z

releso/util/visualization.py

+    objective_observation=[("obs", (0, 1))],
+    design_variable=[("obs", (1, None))],


Type annotations were removed from objective_observation and design_variable parameters. This reduces code clarity and type safety. Consider keeping the type annotations: objective_observation: list[tuple[str, tuple[int, int]]] = [("obs", (0, 1))] and design_variable: list[tuple[str, tuple[int, int]]] = [("obs", (1, None))].

Suggested change

objective_observation=[("obs", (0, 1))],

design_variable=[("obs", (1, None))],

objective_observation: list[tuple[str, tuple[int, int]]] = [("obs", (0, 1))],

design_variable: list[tuple[str, tuple[int, int]]] = [("obs", (1, None))],

Copilot · 2025-12-03T16:46:41Z

releso/util/visualization.py

    )

-    # Combine everything
+    # combine everthing into a single dataframe


Typo in comment: "everthing" should be "everything".

Suggested change

# combine everthing into a single dataframe

# combine everything into a single dataframe

Copilot · 2025-12-03T16:46:42Z

releso/util/visualization.py

+    for i, yaxis in enumerate(list(fig.select_yaxes())[::2], 1):
+        legend_name = f"legend{i}"
+        fig.update_layout(
+            {legend_name: dict(y=yaxis.domain[1], yanchor="top")},
+            showlegend=True,
+        )
+        fig.update_traces(row=i, legend=legend_name)


The logic for creating separate legends uses list(fig.select_yaxes())[::2] which assumes a specific structure that might be fragile if the subplot configuration changes. The slicing [::2] selects every second y-axis, but this implicit assumption isn't documented. Consider adding a comment explaining why every second y-axis is selected, or use a more explicit approach that's self-documenting.

Suggested change

for i, yaxis in enumerate(list(fig.select_yaxes())[::2], 1):

legend_name = f"legend{i}"

fig.update_layout(

{legend_name: dict(y=yaxis.domain[1], yanchor="top")},

showlegend=True,

)

fig.update_traces(row=i, legend=legend_name)

# Explicitly place legends for each subplot row.

# We assume a subplot configuration of 2 rows, 1 column, with secondary y-axes in the first row.

# This logic is robust to changes in subplot configuration.

subplot_rows = 2

subplot_cols = 1

for row in range(1, subplot_rows + 1):

# Get the primary y-axis for this subplot

yaxes = [y for y in fig.select_yaxes() if y.anchor == f"x{row}"]

if not yaxes:

continue

yaxis = yaxes[0]

legend_name = f"legend{row}"

fig.update_layout(

{legend_name: dict(y=yaxis.domain[1], yanchor="top")},

showlegend=True,

)

fig.update_traces(row=row, legend=legend_name)

Copilot · 2025-12-03T16:46:42Z

releso/util/visualization.py


-    # rescale the figure
+    # ---------- sizing ----------
    if not isinstance(figure_size, list) and figure_size == "auto":


The condition not isinstance(figure_size, list) and figure_size == "auto" is incorrect. It should check for both list and tuple since figure_size is typed as Union[tuple[int, int], Literal["auto"]] and the error message mentions tuple. The condition should be: figure_size == "auto" or not isinstance(figure_size, (list, tuple)) and figure_size == "auto". As written, if figure_size is a list, it will fall through to the else clause and raise an error even though lists are handled in line 700.

clemens-fricke added 5 commits November 26, 2025 13:49

Fix(step-log): Fixing coloring of step-log, follows plotly colors def…

96b3d15

…ined in top of file. Less color repetition in subplots

Feat(step-log): Splitup legend for subplots and order them to correct…

ce3bb6f

… subplot

Feat(step-log): Big speed up and reduction in file size, traces hidde…

bd79247

…n stay hidden while switching episode

docs(step-log): Small documentation update to the observation and des…

8adb634

…ign var getters

Merge branch 'develop' into fix-step-log-coloring

1b0df65

clemens-fricke requested review from Copilot and danielwolff1 December 3, 2025 16:42

Copilot started reviewing on behalf of clemens-fricke December 3, 2025 16:43 View session

Copilot finished reviewing on behalf of clemens-fricke December 3, 2025 16:44

Copilot AI reviewed Dec 3, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Step Log visualization optimization #62

Step Log visualization optimization #62

Uh oh!

clemens-fricke commented Dec 3, 2025

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Copilot AI Dec 3, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	to 5 (6 is exclusive) from
	to 5 (6 is exclusive) from the dataframe.

	raise ValueError("Empty or malformed step log.")
	raise ValueError(f"The provided step log file {step_log_file} is empty or does not follow the current format.")

-    mask = (df["episodes"] >= episode_start) & (df["episodes"] <= episode_end)
+    mask = (df["episodes"] >= episode_start) & (df["episodes"] <= episode_end)
+    if not mask.any():
+        raise IndexError(
+            f"Could not find any episode in the range {episode_start} to {episode_end}. "
+            f"The available episodes range from {df['episodes'].min()} to {df['episodes'].max()}"
+        )

		# Not able to extract the reset values without info field. prev_obs field shows
		# last obs from previous episode not reset value in first step.

-    # Not able to extract the reset values without info field. prev_obs field shows
-    # last obs from previous episode not reset value in first step.
+    # Limitation: Unable to extract the reset values for each episode because the required
+    # "info" field is missing from the step log data. As a result, the "prev_obs" field
+    # only contains the last observation from the previous episode, not the reset value at
+    # the start of the current episode. This means that the initial state of each episode
+    # cannot be visualized, which may affect analyses that rely on episode initialization.
+    # Currently, there is no workaround unless the data collection process is updated to
+    # include the "info" field with reset values.

		objective_observation=[("obs", (0, 1))],
		design_variable=[("obs", (1, None))],

	# combine everthing into a single dataframe
	# combine everything into a single dataframe

-    for i, yaxis in enumerate(list(fig.select_yaxes())[::2], 1):
-        legend_name = f"legend{i}"
-        fig.update_layout(
-            {legend_name: dict(y=yaxis.domain[1], yanchor="top")},
-            showlegend=True,
-        )
-        fig.update_traces(row=i, legend=legend_name)
+    # Explicitly place legends for each subplot row.
+    # We assume a subplot configuration of 2 rows, 1 column, with secondary y-axes in the first row.
+    # This logic is robust to changes in subplot configuration.
+    subplot_rows = 2
+    subplot_cols = 1
+    for row in range(1, subplot_rows + 1):
+        # Get the primary y-axis for this subplot
+        yaxes = [y for y in fig.select_yaxes() if y.anchor == f"x{row}"]
+        if not yaxes:
+            continue
+        yaxis = yaxes[0]
+        legend_name = f"legend{row}"
+        fig.update_layout(
+            {legend_name: dict(y=yaxis.domain[1], yanchor="top")},
+            showlegend=True,
+        )
+        fig.update_traces(row=row, legend=legend_name)

Step Log visualization optimization #62

Are you sure you want to change the base?

Step Log visualization optimization #62

Uh oh!

Conversation

clemens-fricke commented Dec 3, 2025

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI Dec 3, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants