Hi, I am looking at your hierarchical tokenizer and have a question about how you calculate res['numeric_count'] per property.
The relevant code is "weight = 1.0 / (num_subjects * total_events)" (line 100) and "res['numeric_count'] += len(v['numeric_samples']) / weight" (line 151). Dividing by weight is the same as multiplying by its reciprocal, so line 151 is effectively res['numeric_count'] += len(v['numeric_samples']) * num_subjects * total_events. Should it be res['numeric_count'] += len(v['numeric_samples']) * weight instead? My understanding is that the number of quantile bins each property gets depends on the total number of numeric values, and that this is why the count is meant to be divided by (num_subjects * total_events).
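
To make the difference concrete, here is a small sketch of the two variants. The values of num_subjects and total_events and the sample list are made up for illustration and are not taken from your code or data; only the two formulas come from lines 100 and 151.

```python
# Hypothetical values, chosen only to illustrate the arithmetic.
num_subjects = 8
total_events = 64
numeric_samples = [0.1, 0.2, 0.3, 0.4]  # stand-in for v['numeric_samples']

# Line 100 as quoted above.
weight = 1.0 / (num_subjects * total_events)

# Line 151 as currently written: dividing by weight scales the count UP
# by a factor of num_subjects * total_events.
divided = len(numeric_samples) / weight

# The alternative I am asking about: multiplying by weight scales the
# count DOWN by the same factor.
multiplied = len(numeric_samples) * weight

print("len / weight:", divided)
print("len * weight:", multiplied)
```

With these numbers the two variants differ by a factor of (num_subjects * total_events) squared, so I would expect the choice to matter for how the quantile bins are allocated.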