How can I actually use this library to detect duplicate documents?
The `insert` function only accepts integers, so I suppose I should implement MinHash myself (is that true?) and then use your simhash library to detect near-duplicates.
How can I connect this to the simhash-db you provided? Does that need something that users have to implement themselves?
Sorry for being a noob in the near-duplicate detection business 😄 and thank you for sharing your work with us 👍
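
To make the question concrete, this is roughly what I imagine I would have to write myself before calling the library. It is a minimal pure-Python sketch, not this library's API: every name in it (`tokenize`, `shingles`, `hash64`, `simhash64`, `hamming_distance`) is a hypothetical helper of my own, and the 3-word shingles and BLAKE2b hashing are assumptions. Note that it computes a simhash-style 64-bit fingerprint by per-bit majority vote rather than MinHash, since a single integer per document appears to be what is needed.

```python
import hashlib
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def shingles(tokens, width=3):
    """Return overlapping word n-grams; short inputs yield a single shingle."""
    if len(tokens) <= width:
        return [" ".join(tokens)]
    return [" ".join(tokens[i:i + width]) for i in range(len(tokens) - width + 1)]


def hash64(feature):
    """Map a string feature to a 64-bit unsigned integer (BLAKE2b is an assumption)."""
    digest = hashlib.blake2b(feature.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def simhash64(text, width=3):
    """Compute a 64-bit simhash-style fingerprint by per-bit majority vote."""
    counts = [0] * 64
    for feature in shingles(tokenize(text), width):
        h = hash64(feature)
        for bit in range(64):
            # Each feature votes +1 for a set bit and -1 for a clear bit.
            counts[bit] += 1 if (h >> bit) & 1 else -1
    fingerprint = 0
    for bit in range(64):
        if counts[bit] > 0:
            fingerprint |= 1 << bit
    return fingerprint


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


doc_a = simhash64("The quick brown fox jumps over the lazy dog")
doc_b = simhash64("The quick brown fox jumped over the lazy dog")
print(hamming_distance(doc_a, doc_b))
```

Are integers like `doc_a` and `doc_b` what I should pass to `insert`, with two documents treated as near-duplicates when their fingerprints differ in only a few bits?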