[BUG] Raise Error when can't deserialize configuration json from server #4471
base: main
Reviewer Checklist
Please leverage this checklist to ensure your code review is thorough before approving:
Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
c11b7ad to cd63d71
This PR modifies error handling in configuration deserialization by raising ValueError instead of using warnings and empty objects. It changes how undeserializable configurations are handled, returning None values for spann and hnsw configurations instead of empty objects. This summary was automatically generated by @propel-code-bot
b40998d to af51006
chromadb/types.py
Outdated
configuration = CollectionConfiguration(
    hnsw=None,
    spann=None,
    embedding_function=None,
)
[DataTypeCheck]
The fallback value for configuration in from_json() now constructs CollectionConfiguration(hnsw=None, spann=None, embedding_function=None). Confirm that this matches the expected constructor signature and field types. Add type annotations to CollectionConfiguration that make clear None is valid for these fields, or document the acceptable values.
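One way to make the None case explicit is sketched below. It assumes the configuration behaves like a typed mapping; CollectionConfigurationSketch and empty_configuration are hypothetical names used only for illustration and are not chromadb's actual definitions.

```python
from typing import Any, Dict, Optional, TypedDict


class CollectionConfigurationSketch(TypedDict):
    """Hypothetical stand-in for CollectionConfiguration (not chromadb's real type)."""

    # Optional[...] documents that None means "this index is not configured".
    hnsw: Optional[Dict[str, Any]]
    spann: Optional[Dict[str, Any]]
    embedding_function: Optional[Any]


def empty_configuration() -> CollectionConfigurationSketch:
    """Build the all-None fallback value discussed in the review comment."""
    return CollectionConfigurationSketch(
        hnsw=None,
        spann=None,
        embedding_function=None,
    )
```

With annotations like these, a type checker accepts the all-None fallback and flags any caller that uses a field without first checking for None.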
af51006 to 318d249
@@ -8,16 +8,13 @@
from uuid import UUID
from enum import Enum
from pydantic import BaseModel
import warnings
[CriticalError]
The warnings module is used in this file now (see usage in chromadb/api/collection_configuration.py), but it is not imported. Add the import at the top of the file to prevent runtime errors.
warnings.warn(
    f"Embedding function {ef_config['name']} not found. Add @register_embedding_function decorator to the class definition.",
    stacklevel=2,
)
ef = None
[BestPractice]
Changing from ValueError to warnings.warn and ef = None when an embedding function is not found might mask critical configuration errors. If an embedding function is specified in the configuration but cannot be located, that typically indicates a misconfiguration, and the system should not proceed as if everything were normal. A collection could otherwise be created or used without the intended embedding capabilities, with the issue only becoming apparent later. Raising an error ensures such issues are addressed immediately. It is also more consistent with the stricter error handling for configuration loading adopted elsewhere in these changes (e.g., in chromadb/types.py).
Consider restoring the behavior of raising an error:
- warnings.warn(
-     f"Embedding function {ef_config['name']} not found. Add @register_embedding_function decorator to the class definition.",
-     stacklevel=2,
- )
- ef = None
+ raise ValueError(
+     f"Embedding function {ef_config['name']} not found. Add @register_embedding_function decorator to the class definition."
+ )
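The difference between the two behaviors can be seen in a small, self-contained sketch. The registry, the register_embedding_function_sketch decorator, and DummyEmbedding below are hypothetical stand-ins and may not match how chromadb registers embedding functions; only the error message is taken from the diff above.

```python
from typing import Any, Dict

# Hypothetical name -> class registry, for illustration only.
_REGISTRY: Dict[str, type] = {}


def register_embedding_function_sketch(cls: type) -> type:
    """Register a class under the value returned by its name() method."""
    _REGISTRY[cls.name()] = cls
    return cls


def resolve_embedding_function(ef_config: Dict[str, Any]) -> type:
    """Return the registered class, or raise instead of silently yielding None."""
    name = ef_config["name"]
    if name not in _REGISTRY:
        raise ValueError(
            f"Embedding function {name} not found. "
            "Add @register_embedding_function decorator to the class definition."
        )
    return _REGISTRY[name]


@register_embedding_function_sketch
class DummyEmbedding:
    """Minimal registered class used to exercise the lookup."""

    @staticmethod
    def name() -> str:
        return "dummy"
```

With the raising variant, a misspelled or unregistered name fails at load time, rather than producing a collection whose embedding function is None.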
318d249 to 7bb55b9
try:
    return load_collection_configuration_from_json(self.configuration_json)
except Exception as e:
    warnings.warn(
        f"Server does not respond with configuration_json. Please update server: {e}",
        DeprecationWarning,
        stacklevel=2,
    )
    return CollectionConfiguration(
        hnsw=HNSWConfiguration(),
        spann=SpannConfiguration(),
        embedding_function=None,
    raise ValueError(
[CriticalError]
Instead of returning a default configuration on a deserialization error, the code now raises ValueError. Ensure that all consumers of this method are prepared to handle this exception. If it goes unhandled, it may cause crashes during deserialization.
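A consumer that wants to survive a bad payload would need to catch the exception explicitly. The sketch below illustrates that pattern under stated assumptions: load_configuration_sketch is a hypothetical stand-in for load_collection_configuration_from_json, and describe_collection is an invented caller.

```python
from typing import Any, Dict, Optional


def load_configuration_sketch(
    configuration_json: Optional[Dict[str, Any]],
) -> Dict[str, Any]:
    """Hypothetical loader: raises ValueError when the payload is unusable."""
    if not isinstance(configuration_json, dict):
        raise ValueError(
            f"Could not deserialize configuration_json: {configuration_json!r}"
        )
    return configuration_json


def describe_collection(configuration_json: Optional[Dict[str, Any]]) -> str:
    """Invented consumer that handles the new exception instead of crashing."""
    try:
        config = load_configuration_sketch(configuration_json)
    except ValueError as exc:
        # Surface the failure to the caller in a controlled way.
        return f"configuration unavailable: {exc}"
    return f"configuration keys: {sorted(config)}"
```

Whether a given call site should catch the error or let it propagate depends on whether it can do anything useful without a configuration.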
try:
    configuration_json = json_map.get("configuration_json", None)
[DataTypeCheck]
Calls to load_collection_configuration_from_json may now receive None for configuration_json. Before calling the loader, consider explicitly checking that configuration_json is present so the error message is clearer.
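A minimal sketch of such a presence check follows. The helper name extract_configuration_json and the exact error wording are assumptions for illustration; only the json_map.get("configuration_json", None) lookup comes from the diff.

```python
from typing import Any, Dict


def extract_configuration_json(json_map: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: fail early with a specific message when the key is absent."""
    configuration_json = json_map.get("configuration_json", None)
    if configuration_json is None:
        # Distinguish "missing from the response" from "present but malformed".
        raise ValueError(
            "Server response does not include configuration_json. "
            "Please update the server."
        )
    return configuration_json
```

Separating the missing-key case from a malformed payload lets the loader's own error describe only real deserialization failures.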
Description of changes
This PR fixes a bug where a configuration that could not be deserialized only produced a warning and left the user with an empty config. An error is now raised instead.
Test plan
How are these changes tested?
pytest for Python, yarn test for JS, cargo test for Rust
Documentation Changes
Are all docstrings for user-facing APIs updated if required? Do we need to make documentation changes in the docs section?