I'm unsure whether this classifies as a bug or a feature request, but the Document equality method does a direct dict comparison between two documents.
I think this is suboptimal when a score value is set on the Document: two Documents can match in every other respect while their scores differ only slightly due to floating-point imprecision.
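To illustrate the problem, here is a minimal sketch. The `Doc` dataclass below is a hypothetical stand-in that mimics a dict-based equality check; it is not Haystack's actual `Document` class:

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class Doc:
    """Stand-in for a Document whose __eq__ compares all fields, score included."""
    id: str
    content: str
    score: Optional[float] = None

    def __eq__(self, other):
        # Direct dict comparison, analogous to comparing the documents' dict forms.
        return asdict(self) == asdict(other)

# Same id and content, but the scores differ by float imprecision:
a = Doc(id="42", content="hello", score=0.1 + 0.2)  # 0.30000000000000004
b = Doc(id="42", content="hello", score=0.3)
print(a == b)  # False, even though the documents are "the same"
```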
I think the score should be excluded from this comparison: the score is not part of the document itself but an artifact of the retrieval process.
This would also be a breaking change, which is not easy to handle. I'd have to think about how we could follow the deprecation policy for this kind of change.
We'd also need to decide on the tolerance, and that could be a huge debate in itself. 😅
And what about the embedding? Should we compare embeddings with a tolerance too?
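If a tolerance were adopted, the score and embedding comparisons could look something like the sketch below. The helper names and the `rel_tol` value are made up for illustration, not an agreed-upon API:

```python
import math

def scores_close(s1, s2, rel_tol=1e-9):
    # Both scores unset counts as equal; one set and one unset does not.
    if s1 is None or s2 is None:
        return s1 is s2
    return math.isclose(s1, s2, rel_tol=rel_tol)

def embeddings_close(e1, e2, rel_tol=1e-9):
    # Element-wise tolerant comparison of two embedding vectors.
    if e1 is None or e2 is None:
        return e1 is e2
    return len(e1) == len(e2) and all(
        math.isclose(x, y, rel_tol=rel_tol) for x, y in zip(e1, e2)
    )

print(scores_close(0.1 + 0.2, 0.3))          # True
print(embeddings_close([0.1 + 0.2], [0.3]))  # True
```

Note that any default tolerance is a judgment call; what counts as "close enough" depends on where the floats came from.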
I think it would be better to leave the current implementation as is; I don't see many benefits in changing it. Also, as @davidsbatista says, most of the time a Document won't have a score set. If one needs to compare Documents by score, I'd expect them to do it explicitly rather than just with doc1 == doc2.
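A caller who wants content equality regardless of score can already express that explicitly. Here is a sketch using plain dicts to stand in for a document's dict form; the helper name is invented:

```python
def same_except_score(d1: dict, d2: dict) -> bool:
    """Compare two document dicts while ignoring the retrieval score."""
    strip = lambda d: {k: v for k, v in d.items() if k != "score"}
    return strip(d1) == strip(d2)

doc1 = {"id": "42", "content": "hello", "score": 0.30000000000000004}
doc2 = {"id": "42", "content": "hello", "score": 0.3}
print(doc1 == doc2)                   # False: scores differ
print(same_except_score(doc1, doc2))  # True: content matches
```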
Just for reference, the current implementation comes from #6323; before that we were only comparing the id field.
In the context of Information Retrieval, my point is that a document's content and the score derived from a retrieval process are two completely distinct aspects, and it seems that Sebastian needs to compare retrieved documents by content.
To me this feels misleading, since I'd normally say these two Documents should be considered the same.