
Commit 0553a31

Merge pull request #7 from gperdrizet/dev
Added progress plot option to single climber
2 parents ebc1ffb + 8d27df0 commit 0553a31

9 files changed, +190 -29 lines changed


CITATION.cff

Lines changed: 1 addition & 1 deletion
@@ -21,5 +21,5 @@ keywords:
   - data-science
   - Python
 license: GPL-3.0
-version: 0.1.7
+version: 0.1.10
 date-released: 2025-11-14

docs/source/advanced.rst

Lines changed: 56 additions & 9 deletions
@@ -81,22 +81,69 @@ Replicate Noise Tuning
 - **Medium noise (0.3-0.7)**: General purpose exploration
 - **High noise (0.7-1.5)**: When you need very diverse starting points

-Checkpoint Strategies
----------------------
+Checkpointing
+-------------

-For Very Long Runs
-~~~~~~~~~~~~~~~~~~
+For long optimizations, save intermediate progress:

 .. code-block:: python

-   # Save every 10 minutes for 24-hour runs
    climber = HillClimber(
        data=data,
-       objective_func=objective,
-       max_time=1440, # 24 hours
-       checkpoint_file='long_run.pkl',
-       save_interval=600 # 10 minutes
+       objective_func=my_objective,
+       max_time=60,
+       checkpoint_file='optimization.pkl',
+       save_interval=300 # Save every 5 minutes
+   )
+
+   result = climber.climb()
+
+Resume from a checkpoint:
+
+.. code-block:: python
+
+   resumed = HillClimber.resume_from_checkpoint(
+       checkpoint_file='optimization.pkl',
+       objective_func=my_objective,
+       new_max_time=30 # Continue for 30 more minutes
    )
+
+   result = resumed.climb()
+
+Progress Monitoring
+-------------------
+
+Live Progress Plots
+~~~~~~~~~~~~~~~~~~~
+
+Monitor optimization progress in real-time with automatic plotting:
+
+.. code-block:: python
+
+   climber = HillClimber(
+       data=data,
+       objective_func=my_objective,
+       max_time=60,
+       plot_progress=5 # Plot every 5 minutes
+   )
+
+   result = climber.climb()
+
+This is particularly useful for:
+
+- Long-running optimizations (>10 minutes)
+- Interactive Jupyter notebooks
+- Debugging objective functions
+- Monitoring convergence behavior
+
+**Important Notes**:
+
+- Only works with ``climb()`` method (single-process mode)
+- Does **not** work with ``climb_parallel()`` because worker processes don't
+  report intermediate results
+- If no steps are accepted between plot intervals, displays time information
+  instead of plotting
+- In Jupyter notebooks, each plot replaces the previous one for clean output

 Performance Optimization
 ------------------------

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
 project = 'Hill Climber'
 copyright = '2025, Hill Climber Contributors'
 author = 'Hill Climber Contributors'
-release = '0.1.7'
+release = '0.1.10'

 # -- General configuration ---------------------------------------------------
 extensions = [

docs/source/installation.rst

Lines changed: 10 additions & 10 deletions
@@ -33,6 +33,16 @@ To explore the examples, modify the code, or contribute:
 2. Open in GitHub Codespaces
 3. The development environment will be configured automatically

+Verify Installation
+-------------------
+
+Test that the installation was successful:
+
+.. code-block:: python
+
+   import hill_climber
+   print(f"Hill Climber {hill_climber.__version__} successfully installed!")
+
 **Option 2: Local Development**

 1. Clone or fork the repository:
@@ -48,16 +58,6 @@ To explore the examples, modify the code, or contribute:

       pip install -e .

-Verify Installation
--------------------
-
-Test that the installation was successful:
-
-.. code-block:: python
-
-   from hill_climber import HillClimber
-   print("Hill Climber successfully installed!")
-
 Running Tests
 -------------

docs/source/quickstart.rst

Lines changed: 20 additions & 0 deletions
@@ -42,6 +42,26 @@ Here's a simple example that creates a dataset with high Pearson correlation:
    # View results
    print(f"Final correlation: {result['Pearson correlation']:.3f}")

+Monitoring Progress
+-------------------
+
+For longer runs, monitor progress with live plots:
+
+.. code-block:: python
+
+   climber = HillClimber(
+       data=data,
+       objective_func=objective_high_correlation,
+       max_time=30,
+       mode='maximize',
+       plot_progress=5 # Plot every 5 minutes
+   )
+
+   result = climber.climb()
+
+.. note::
+   Progress plotting only works with ``climb()`` (not ``climb_parallel()``).
+
 Understanding the Results
 --------------------------

docs/source/user_guide.rst

Lines changed: 10 additions & 0 deletions
@@ -72,6 +72,16 @@ Hyperparameters
     Amount of uniform noise to add when creating replicate starting points.
     Only used in ``climb_parallel()``.

+**plot_progress** (default: None)
+    Interval in minutes for plotting optimization progress during a run.
+    When set, creates scatter plots showing the current best solution at
+    regular intervals. For example, ``plot_progress=5`` plots every 5 minutes.
+
+    .. note::
+       This option only works in single-process mode (``climb()``). It does not
+       work with parallel mode (``climb_parallel()``) because results from worker
+       processes are not collected until the end of the run.
+
 Boundary Handling
 -----------------

hill_climber/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@
     - plotting_functions: Visualization utilities
 """

-__version__ = '0.1.7'
+__version__ = '0.1.10'
 __author__ = 'gperdrizet'

 from .optimizer import HillClimber

hill_climber/optimizer.py

Lines changed: 90 additions & 6 deletions
@@ -6,6 +6,7 @@
 import time
 import os
 from multiprocessing import Pool, cpu_count
+import matplotlib.pyplot as plt

 from .climber_functions import perturb_vectors, calculate_objective
 from .plotting_functions import plot_input_data, plot_results as plot_results_func
@@ -45,7 +46,8 @@ def __init__(
         mode='maximize',
         target_value=None,
         checkpoint_file=None,
-        save_interval=60
+        save_interval=60,
+        plot_progress=None
     ):
         """Initialize HillClimber.

@@ -65,6 +67,8 @@ def __init__(
             target_value: Target objective value for target mode (default: None)
             checkpoint_file: Path to save/load checkpoints (default: None)
             save_interval: Seconds between checkpoint saves (default: 60)
+            plot_progress: Plot results every N minutes during optimization.
+                If None (default), no plots are drawn during optimization.

         Raises:
             ValueError: If mode is invalid or target_value missing for target mode
@@ -94,6 +98,7 @@ def __init__(
         self.step_size = step_size
         self.perturb_fraction = perturb_fraction
         self.temperature = temperature
+
         # Convert user-provided cooling_rate to multiplicative factor
         # User specifies 1 - multiplicative_rate, we store the multiplicative rate
         self.cooling_rate = 1 - cooling_rate
@@ -102,6 +107,7 @@ def __init__(
         self.target_value = target_value
         self.checkpoint_file = checkpoint_file
         self.save_interval = save_interval
+        self.plot_progress = plot_progress

         # These will be set during climb
         self.best_data = None
@@ -115,6 +121,7 @@ def __init__(
         self.temp = temperature
         self.start_time = None
         self.last_save_time = None
+        self.last_plot_time = None


     def save_checkpoint(self, force=False):
@@ -162,6 +169,7 @@ def save_checkpoint(self, force=False):

         # Create checkpoint directory if needed
         checkpoint_dir = os.path.dirname(self.checkpoint_file)
+
         if checkpoint_dir and not os.path.exists(checkpoint_dir):
             os.makedirs(checkpoint_dir)
@@ -172,6 +180,74 @@ def save_checkpoint(self, force=False):
         print(f"Checkpoint saved: {self.checkpoint_file}")


+    def plot_progress_check(self, force=False):
+        """Plot optimization progress if plot_progress interval has elapsed.
+
+        Args:
+            force: Plot even if plot_progress interval hasn't elapsed (default: False)
+        """
+
+        if self.plot_progress is None:
+            return
+
+        if self.start_time is None:
+            return
+
+        current_time = time.time()
+
+        if not force and self.last_plot_time is not None:
+            if (current_time - self.last_plot_time) / 60 < self.plot_progress:
+                return
+
+        # Clear any existing plots
+        plt.close('all')
+
+        # Clear output in Jupyter notebooks to replace previous plot
+        try:
+            from IPython.display import clear_output
+            clear_output(wait=True)
+
+        except ImportError:
+            # Not in IPython/Jupyter environment
+            pass
+
+        # Create a result structure for single climb
+        best_data_output = (
+            pd.DataFrame(self.best_data, columns=self.columns)
+            if self.is_dataframe else self.best_data
+        )
+
+        # Format as expected by plot_results (single replicate)
+        results = {
+            'input_data': self.data,
+            'results': [(self.data, best_data_output, pd.DataFrame(self.steps))]
+        }
+
+        # Plot current progress
+        elapsed_min = (current_time - self.start_time) / 60
+        last_elapsed_min = (self.last_plot_time - self.start_time) / 60 if self.last_plot_time else 0
+
+        # Format elapsed time based on duration
+        def format_elapsed(minutes):
+            if minutes < 60:
+                return f"{int(minutes)} minutes"
+            else:
+                hours = minutes / 60
+                return f"{hours:.1f} hours"
+
+        # Check if there are any steps to plot
+        if len(self.steps['Step']) == 0:
+            print(f"\nNo accepted steps since last progress update")
+            print(f"Last progress update: {format_elapsed(last_elapsed_min)}")
+            print(f"Current time: {format_elapsed(elapsed_min)}")
+
+        else:
+            print(f"\nPlotting progress at {format_elapsed(elapsed_min)}...")
+            plot_results_func(results, plot_type='scatter')
+
+        self.last_plot_time = current_time
+
+
     def load_checkpoint(self, checkpoint_file):
         """Load optimization state from checkpoint file.
@@ -359,10 +435,16 @@ def climb(self):

             # Save checkpoint periodically
             self.save_checkpoint()
+
+            # Plot progress periodically
+            self.plot_progress_check()

         # Save final checkpoint
         self.save_checkpoint(force=True)

+        # Plot final results
+        self.plot_progress_check(force=True)
+
         # Convert back to DataFrame if input was DataFrame
         best_data_output = (
             pd.DataFrame(self.best_data, columns=self.columns)
@@ -457,7 +539,7 @@ def climb_parallel(self, replicates=4, initial_noise=0.0, output_file=None,
                 data_rep, self.objective_func, self.max_time, self.step_size,
                 self.perturb_fraction, self.temperature, self.cooling_rate,
                 self.mode, self.target_value, self.is_dataframe, self.columns,
-                checkpoint_file, self.save_interval
+                checkpoint_file, self.save_interval, None # Disable plot_progress for parallel
             ))

         # Execute in parallel
@@ -509,7 +591,8 @@ def climb_parallel(self, replicates=4, initial_noise=0.0, output_file=None,
             print(f"Results saved to: {output_file}")

         return results
-
+
+
     def plot_input(self, plot_type='scatter'):
         """Plot the input data distribution.

@@ -588,15 +671,15 @@ def _climb_wrapper(args):
     Args:
         args: Tuple of (data_numpy, objective_func, max_time, step_size,
               perturb_fraction, temperature, cooling_rate, mode, target_value,
-              is_dataframe, columns, checkpoint_file, save_interval)
+              is_dataframe, columns, checkpoint_file, save_interval, plot_progress)

     Returns:
         Result from climb(): (best_data, steps_df)
     """

     (data_numpy, objective_func, max_time, step_size, perturb_fraction,
      temperature, cooling_rate, mode, target_value, is_dataframe, columns,
-     checkpoint_file, save_interval) = args
+     checkpoint_file, save_interval, plot_progress) = args

     # Reconstruct original data format for HillClimber
     data_input = (
@@ -615,7 +698,8 @@ def _climb_wrapper(args):
         mode=mode,
         target_value=target_value,
         checkpoint_file=checkpoint_file,
-        save_interval=save_interval
+        save_interval=save_interval,
+        plot_progress=plot_progress
     )

     return climber.climb()
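
The interval check that `plot_progress_check` performs can be exercised on its own. The following is a minimal sketch, not code from this commit: `IntervalGate` and its injectable `clock` argument are hypothetical names introduced here so the timing can be tested without waiting, while `format_elapsed` mirrors the nested helper shown in the diff. As in the diff, a `None` interval disables the action even when `force=True`, and the comparison is done in minutes.

```python
import time


def format_elapsed(minutes):
    """Render a duration given in minutes, as the nested helper in the diff does."""
    if minutes < 60:
        return f"{int(minutes)} minutes"
    return f"{minutes / 60:.1f} hours"


class IntervalGate:
    """Hypothetical helper: let an action run at most once per interval (minutes)."""

    def __init__(self, interval_min, clock=time.time):
        self.interval_min = interval_min  # None disables the gate entirely
        self.clock = clock                # injectable so tests can fake time
        self.last_fired = None            # timestamp (seconds) of the last run

    def ready(self, force=False):
        """Return True and record the current time if the action should run now."""
        if self.interval_min is None:
            return False

        now = self.clock()

        # Skip unless forced or the interval (converted to minutes) has elapsed
        if not force and self.last_fired is not None:
            if (now - self.last_fired) / 60 < self.interval_min:
                return False

        self.last_fired = now
        return True
```

In this sketch the first call to `ready()` returns `True` because `last_fired` starts as `None`; whether `climb()` initializes `last_plot_time` before entering its loop is not visible in the diff. Note also that the docstrings in this commit give `save_interval` in seconds and `plot_progress` in minutes, so `save_interval=300` and `plot_progress=5` describe the same five-minute cadence.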

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"

 [project]
 name = "parallel-hill-climber"
-version = "0.1.7"
+version = "0.1.10"
 authors = [
     {name = "gperdrizet", email = "[email protected]"},
 ]
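
This commit bumps the version string in four files (CITATION.cff, docs/source/conf.py, hill_climber/__init__.py, pyproject.toml). A minimal sketch for checking that an installed copy reports matching values, assuming the distribution name `parallel-hill-climber` and the module name `hill_climber` shown in the diff; `version_report` is a hypothetical helper and not part of the package.

```python
from importlib import import_module
from importlib.metadata import PackageNotFoundError, version


def version_report(dist_name, module_name):
    """Return (distribution version, module __version__); None where unavailable."""
    try:
        # Version recorded in the installed distribution's metadata (from pyproject.toml)
        dist_version = version(dist_name)
    except PackageNotFoundError:
        dist_version = None

    try:
        # Version string defined in the package's __init__.py, if any
        module_version = getattr(import_module(module_name), "__version__", None)
    except ImportError:
        module_version = None

    return dist_version, module_version


if __name__ == "__main__":
    print(version_report("parallel-hill-climber", "hill_climber"))
```

If the two values differ, one of the version strings was probably missed during a release bump or a stale install is on the path.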
