Description
I've switched from the CV EMD method (https://github.com/egonSchiele/OpenCV/blob/master/modules/imgproc/src/emd.cpp) to the EMDL1 implementation because of the massive speed benefit promised by the paper it is based on.
At first it looked good, because I got a 1000x speedup on the same data (comparison of 20x32 matrices).
But then I noticed that the descriptions of the input parameters differ. The EMD description says:
'First signature, a \f$\texttt{size1}\times \texttt{dims}+1\f$ floating-point matrix. Each row stores the point weight followed by the point coordinates.'
while EMDL1 says:
'First signature, a single column floating-point matrix. Each row is the value of the histogram in each bin.'
So the same signature will be interpreted differently by the two functions, which I think is already a bug, or at least very unintuitive.
But given the implementation of EMDL1, it is clear that it is capable of working with 2- or 3-dimensional data.
A simple change in line 64 of https://github.com/opencv/opencv_contrib/blob/4.x/modules/shape/src/emdL1.cpp to determine the dimensionality correctly, like
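To make the difference between the two layouts concrete, here is a minimal sketch in plain C++ (no OpenCV types). The `Signature` alias and the `weightsOnly` helper are hypothetical names made up for illustration, not part of OpenCV. It assumes each row of the EMD-style signature is the weight followed by the coordinates, as the EMD description says, and that the row order already matches the bin order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an EMD-style signature:
// each row is {weight, coord_0, ..., coord_{dims-1}}.
using Signature = std::vector<std::vector<float>>;

// Hypothetical helper: keep only the weight column, which is the
// single-column layout that the EMDL1 description asks for.
// Assumption: the rows are already sorted in bin order.
std::vector<float> weightsOnly(const Signature& sig)
{
    std::vector<float> hist;
    hist.reserve(sig.size());
    for (const auto& row : sig) {
        assert(!row.empty());   // every row needs at least the weight
        hist.push_back(row[0]); // column 0 holds the weight
    }
    return hist;
}
```

With this layout, the coordinates are dropped entirely, so a function that only receives the weight column cannot recover the grid shape on its own.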
'''
if (!initBaseTrees((int) sig1.at<float>(sig1.rows - 1, 1) + 1, (int) sig1.at<float>(sig1.rows - 1, 2) + 1))
'''
instead of
'''
if(!initBaseTrees(sig1.rows, 1))
'''
does the trick for me, but I don't know whether it's OK to assume that the signature has the right format for reading out the last entry, or whether reading it out like that is efficient. I will still propose a PR once I find the time.
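On the question of whether the format can be assumed: one option would be to validate the signature before deriving the dimensions, instead of trusting the last row. The following is only a sketch in plain C++ without OpenCV types; `Signature` and `inferGridDims` are hypothetical names, and it assumes a 2D signature whose rows are {weight, coord1, coord2} with zero-based integer coordinates on a dense grid.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an EMD-style 2D signature:
// each row is {weight, coord1, coord2}.
using Signature = std::vector<std::vector<float>>;

// Hypothetical helper: derive the grid size from the largest coordinate
// in each dimension and check that the row count matches a dense grid.
// Returns false if the signature does not look like such a grid.
bool inferGridDims(const Signature& sig, int& dim1, int& dim2)
{
    if (sig.empty()) return false;
    int max1 = -1, max2 = -1;
    for (const auto& row : sig) {
        if (row.size() != 3) return false; // weight + two coordinates
        const float c1 = row[1], c2 = row[2];
        // Coordinates must be finite, non-negative whole numbers.
        if (!std::isfinite(c1) || !std::isfinite(c2)) return false;
        if (c1 < 0 || c2 < 0) return false;
        if (c1 != std::floor(c1) || c2 != std::floor(c2)) return false;
        max1 = std::max(max1, static_cast<int>(c1));
        max2 = std::max(max2, static_cast<int>(c2));
    }
    dim1 = max1 + 1;
    dim2 = max2 + 1;
    // Necessary (not sufficient) check: one row per grid cell.
    return static_cast<std::size_t>(dim1) * static_cast<std::size_t>(dim2)
           == sig.size();
}
```

Scanning all rows costs one extra pass over the signature, but it does not depend on the rows being sorted so that the last one holds the largest coordinates. The row count check does not catch duplicated cells, so a stricter version would also have to track which cells were seen.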
I would really appreciate some discussion about this.