Description
Motivation
Recently, I've run into the problem of extracting the unique values of a vector (of any integer, real or complex type, or possibly even character). Consider for instance the following vector: x = [1, 2, 3, 3, 4]. What I need is a function taking x as input and returning the vector y = [1, 2, 3, 4] as output. The interface for a real-valued vector could be as simple as:
pure function unique(x, sorted) result(y)
    real(dp), intent(in) :: x(:)
    !! Array whose unique values need to be extracted.
    logical(lk), optional, intent(in) :: sorted
    !! Whether the output vector needs to be sorted or not (default .false. ?)
    real(dp), allocatable :: y(:)
    !! Vector containing only the unique values from x.
end function unique
The output vector could be sorted or not, depending on the user's choice. I know that there is no Fortran intrinsic function for that purpose, but I am not sure whether something like this is already available in stdlib. If it is, could anyone point me to the correct function?
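To make the request concrete, here is a minimal sketch of how the interface above could behave. It is only an illustration under stated assumptions, not a proposed stdlib implementation: the module name unique_sketch is hypothetical, dp is taken from iso_fortran_env instead of stdlib_kinds, a default logical replaces logical(lk), duplicates are detected by exact equality with a simple O(n**2) scan, and the optional sort is a plain insertion sort.

```fortran
module unique_sketch
   ! Hypothetical module name, used for illustration only (not part of stdlib).
   use, intrinsic :: iso_fortran_env, only: dp => real64
   implicit none
   private
   public :: unique

contains

   pure function unique(x, sorted) result(y)
      real(dp), intent(in) :: x(:)
      !! Array whose unique values need to be extracted.
      logical, optional, intent(in) :: sorted
      !! Whether the output is sorted (assumed default: .false.).
      real(dp), allocatable :: y(:)
      !! Vector containing only the unique values from x.
      real(dp) :: buffer(size(x)), tmp
      logical :: do_sort
      integer :: i, j, n

      do_sort = .false.
      if (present(sorted)) do_sort = sorted

      ! Keep the first occurrence of each value (exact equality, O(n**2)).
      n = 0
      do i = 1, size(x)
         if (.not. any(buffer(1:n) == x(i))) then
            n = n + 1
            buffer(n) = x(i)
         end if
      end do

      ! Optional in-place insertion sort of the retained values.
      if (do_sort) then
         do i = 2, n
            tmp = buffer(i)
            j = i - 1
            do while (j >= 1)
               if (buffer(j) <= tmp) exit
               buffer(j + 1) = buffer(j)
               j = j - 1
            end do
            buffer(j + 1) = tmp
         end do
      end if

      y = buffer(1:n)
   end function unique

end module unique_sketch
```

With this sketch, unique([1.0_dp, 2.0_dp, 3.0_dp, 3.0_dp, 4.0_dp]) returns the four values 1, 2, 3, 4 in their order of first appearance. A real implementation would probably sort first and then drop adjacent duplicates to avoid the quadratic scan, and would need to decide how to treat NaN values, which never compare equal.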
Prior Art
- In Matlab, there is the unique function whose description is available here.
- Python has the set function, which takes a list as input and returns only the unique elements of this list.
- Numpy has np.unique whose description is available here.
- @jacobwilliams provides an integer-based implementation on his blog (here).
Additional Information
Both Matlab's and Numpy's implementations cover a relatively large set of cases (1D arrays, multidimensional arrays, different types, etc.) and return values (the unique elements, the corresponding indices, the indices needed to reconstruct the original array from this unique set, etc.).
I don't know whether all of these cases need to be covered (at least as a starting point). I would recommend starting with the simplest one (i.e. a 1D input vector and an output vector containing its unique elements), as this is probably the most common situation where a unique function is needed. That would include integer, real, complex and character 1D arrays.
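As a rough illustration of how a single name could cover several types, the sketch below puts an integer and a character version behind one generic interface. Everything here is an assumption made for the example: the module name unique_generic_sketch and the specific procedure names are hypothetical, only the unsorted case is shown, and the real and complex versions would follow the same pattern. Note that a sorted option for complex input would first require agreeing on an ordering convention, since complex numbers have no natural order.

```fortran
module unique_generic_sketch
   ! Hypothetical module, for illustration only (not part of stdlib).
   implicit none
   private
   public :: unique

   ! One generic name dispatching on the type of the input vector.
   interface unique
      module procedure unique_int
      module procedure unique_char
   end interface unique

contains

   pure function unique_int(x) result(y)
      integer, intent(in) :: x(:)
      integer, allocatable :: y(:)
      integer :: buffer(size(x))
      integer :: i, n

      ! Keep the first occurrence of each value.
      n = 0
      do i = 1, size(x)
         if (.not. any(buffer(1:n) == x(i))) then
            n = n + 1
            buffer(n) = x(i)
         end if
      end do
      y = buffer(1:n)
   end function unique_int

   pure function unique_char(x) result(y)
      character(len=*), intent(in) :: x(:)
      character(len=len(x)), allocatable :: y(:)
      character(len=len(x)) :: buffer(size(x))
      integer :: i, n

      ! Same scan as above; elements share the length of the input strings.
      n = 0
      do i = 1, size(x)
         if (.not. any(buffer(1:n) == x(i))) then
            n = n + 1
            buffer(n) = x(i)
         end if
      end do
      y = buffer(1:n)
   end function unique_char

end module unique_generic_sketch
```

A caller would then write unique([1, 2, 3, 3, 4]) or unique(['a', 'b', 'a']) and let the compiler pick the matching specific procedure.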