Division by zero in HoeffdingAdaptiveTreeClassifier with small memory estimation period #1717

@jsvobo

Description

river version:
river 0.22.0 (installed with pip)
Python version:
python 3.10
Operating system:
Debian

Describe the bug

When the memory estimation period is small (this can be forced by setting memory_estimate_period=1) and ADWINBoostingClassifier uses a large number of models (I had 50), we can randomly run into this error:

File ~/<my env>/lib/python3.10/site-packages/river/ensemble/boosting.py:176, in ADWINBoostingClassifier.learn_one(self, x, y, **kwargs)
    174 for i, model in enumerate(self):
    175     for _ in range(utils.random.poisson(1, self._rng)):
--> [176]  model.learn_one(x, y, **kwargs)
    178     if model.predict_one(x) == y:
    179         self.correct_weight[i] += lambda_poisson

File ~/<my env>/lib/python3.10/site-packages/river/tree/hoeffding_adaptive_tree_classifier.py:234, in HoeffdingAdaptiveTreeClassifier.learn_one(self, x, y, w)
    231 self._root.learn_one(x, y, w=w, tree=self)
    233 if self._train_weight_seen_by_model % self.memory_estimate_period == 0:
--> [234]   self._estimate_model_size()

File ~/<my env>/lib/python3.10/site-packages/river/tree/hoeffding_tree.py:300, in HoeffdingTree._estimate_model_size(self)
    298         total_inactive_size += calculate_object_size(leaf)
    299 if total_active_size > 0:
--> [300]  self._active_leaf_size_estimate = total_active_size / self._n_active_leaves
    301 if total_inactive_size > 0:
    302     self._inactive_leaf_size_estimate = total_inactive_size / self._n_inactive_leaves

Code snippet:

from river import ensemble, tree

tree_Hoeff = tree.HoeffdingAdaptiveTreeClassifier(grace_period=100, max_size=10, max_depth=15, memory_estimate_period=10000)
model = ensemble.ADWINBoostingClassifier(model=tree_Hoeff, n_models=50, seed=seed)

After this initialization, I called learn_one() repeatedly on a dataset that I know works with other models.

Comment:

We should guard this line: "self._active_leaf_size_estimate = total_active_size / self._n_active_leaves" (line 300 in "river/tree/hoeffding_tree.py") by first checking whether there are any active leaves.
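
A minimal sketch of the kind of guard I have in mind. This is not river's actual code: the helper name `mean_leaf_size` and its `previous` argument are hypothetical, and the sketch assumes that keeping the last estimate is acceptable when no leaves are counted.

```python
def mean_leaf_size(total_size: float, n_leaves: int, previous: float) -> float:
    """Average size per leaf, guarded against a zero leaf count.

    Hypothetical helper for illustration only; it is not part of river.
    """
    # Divide only when there is a size to average AND at least one leaf.
    if total_size > 0 and n_leaves > 0:
        return total_size / n_leaves
    # Otherwise keep the previous estimate instead of raising ZeroDivisionError.
    return previous


# A non-zero total with a zero leaf counter is the case from the traceback.
estimate = mean_leaf_size(total_size=128.0, n_leaves=0, previous=64.0)
```

Inside `_estimate_model_size` the equivalent change would presumably be to extend the existing `if total_active_size > 0:` condition with a check on `self._n_active_leaves`, and likewise for the inactive leaves.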
