Nonunique returns the already seen elements of sequence.
Guarding the seen_add call can improve performance when there is a high ratio of duplicates.
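For readers following along, here is a minimal sketch of what a guarded seen_add call might look like. It is an illustration under assumptions, not necessarily the code in this PR: the body below is reconstructed from the commit messages, and the real function may take extra parameters (for example a key function) that are not shown here.

```python
def nonunique(seq):
    """Yield every element of seq that has already been seen earlier.

    Sketch only: the signature and body are assumptions based on the
    commit messages, not a copy of the merged implementation.
    """
    seen = set()
    seen_add = seen.add  # bind the method once, outside the loop
    for item in seq:
        if item in seen:
            # Duplicate: report it and skip the redundant set insertion.
            yield item
        else:
            # First occurrence: this is the only path that calls seen_add.
            seen_add(item)


print(list(nonunique([1, 2, 1, 3, 2, 1])))  # [1, 2, 1]
```

The unguarded variant would call seen_add(item) on every iteration, including for items already in the set. With the guard, duplicates cost one membership test and no insertion, which is where the saving would come from when most items are repeats.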
|
@eriknw Can I get your thoughts on this? |
|
Thanks @groutr! Everything here looks reasonable and good. I'm curious: do you have a use case for this? And sorry for my delay. This year has been, uh, a little crazy. |
|
I'm sure that I had a better use case when I created the PR that I cannot recall now. One use case that currently comes to mind: when I'm asking "is this distinct", many times I'm really meaning to ask "why isn't this distinct"? If |
|
Yeah, that sounds reasonable. |
|
@eriknw which name do you find easier to remember? |
|
I think I prefer the name |
|
I think this is ready. What do you think @eriknw? |
itertoolz.unique yields the never-before-seen elements of a sequence. nonunique is the complement, yielding the already seen elements of a sequence. This is incredibly useful for finding duplicates in a sequence.
This isn't really a new feature to itertoolz, but instead exposes an already existing feature.
isdistinct already had this logic, but instead of returning True/False, nonunique yields the already seen elements as they are encountered. This PR simply moves the logic into its own function.
ping: @eriknw
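To make the relationship concrete, here is a rough sketch of how isdistinct could be expressed on top of a nonunique generator. Both function bodies and the _MISSING sentinel are assumptions written for illustration; the actual toolz implementations may differ (for instance, they may special-case inputs other than plain iterables).

```python
_MISSING = object()  # hypothetical sentinel, so a duplicated None is still detected


def nonunique(seq):
    """Sketch: yield elements of seq that were already seen earlier."""
    seen = set()
    for item in seq:
        if item in seen:
            yield item
        else:
            seen.add(item)


def isdistinct(seq):
    """Sketch: seq is distinct exactly when nonunique yields nothing."""
    # next() stops at the first duplicate, so this short-circuits.
    return next(nonunique(seq), _MISSING) is _MISSING


words = ["a", "b", "a", "c", "b"]
print(isdistinct(words))       # False
print(list(nonunique(words)))  # ['a', 'b']
```

The second call answers the "why isn't this distinct?" question directly: it reports which elements repeated, in the order the repeats were encountered.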