Description
Hi, I am using the partial_fit function to perform incremental learning with BERTopic. When I tried to save the BERTopic model using safetensors, I got the following error: KeyError: 'tokenizer'. The error is raised in bertopic/_save_utils.py: while recreating the CountVectorizer, the function deletes parameters from cv_params that don't actually exist.
I tried to save the model with: model.save('some_directory', serialization="safetensors", save_ctfidf=True)
Here is the traceback I got:
/python3.9/site-packages/bertopic/_save_utils.py in save_ctfidf_config(model, path)
293 # Recreate CountVectorizer
294 cv_params = model.vectorizer_model.get_params()
--> 295 del cv_params["tokenizer"], cv_params["preprocessor"], cv_params["dtype"]
296 if not isinstance(cv_params["analyzer"], str):
297 del cv_params["analyzer"]
KeyError: 'tokenizer'
I have run model.vectorizer_model.get_params() myself, and it returns only two parameters: {'decay': 0.05, 'delete_min_df': None}.
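For what it's worth, the failure mode can be reproduced outside BERTopic with a plain dict. This is just a sketch assuming cv_params is the two-key dict returned by get_params() above; it also shows a defensive alternative (dict.pop with a default) that would avoid the KeyError:

```python
# Minimal reproduction, independent of BERTopic: the online vectorizer's
# get_params() reports only these two keys, so the del in _save_utils.py fails.
cv_params = {"decay": 0.05, "delete_min_df": None}

try:
    # This mirrors line 295 of _save_utils.py in the traceback above.
    del cv_params["tokenizer"], cv_params["preprocessor"], cv_params["dtype"]
except KeyError as e:
    print(f"KeyError: {e}")  # KeyError: 'tokenizer'

# A defensive alternative: pop with a default never raises,
# whether or not the key is present.
for key in ("tokenizer", "preprocessor", "dtype"):
    cv_params.pop(key, None)

print(cv_params)  # {'decay': 0.05, 'delete_min_df': None}
```

Since the first del target is missing, the whole statement raises before deleting anything, and the dict is left unchanged.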
Is there anything I've done wrong? Thank you!