-
Notifications
You must be signed in to change notification settings - Fork 1.3k
Adding LinSpace, LogSpace, and Arange prototypes #18200
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the good work you did to put together this PR. I proposed some changes which can be applied without much effort, or so I hope, and that will make the new functions (templates) even more useful.
Ah, and also, the equality of double precision floating point is not working too well because of the very meaning of equality of floating point ;-) |
Test Results0 tests 0 ✅ 0s ⏱️ Results for commit b5b5620. ♻️ This comment has been updated with latest results. |
Hi @dpiparo Thank you for the quick response and feedback! For this next commit, I’ve updated the functions as follows:
|
Dear @edfink234 thanks. I am reviewing the code. One thing which we will need is to provide a clean commit, properly formatted with all the changes - it is fine to have some "history" when crafting PRs but we try to keep the history of the repo as clean as possible. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job. Once the changes are addressed and the tests pass, we are ready to merge.
Thanks for the quick response @dpiparo! I will get to these hopefully today (Tuesday PST) if not at some point this week. It's almost 2 am where I am atm 😅... |
@dpiparo I made another commit to address the new comments and modified the tests a bit to hopefully have them pass. |
Thanks @edfink234 . Let's see how this goes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a quite useful contribution, thanks a lot! I have left some comments for discussion
math/vecops/inc/ROOT/RVec.hxx
Outdated
template <typename T1 = double, typename T2 = double, typename T3 = double, typename Common_t = std::conditional_t<std::is_floating_point_v<std::common_type_t<T1, T2, T3>>, std::common_type_t<T1, T2, T3>, double>> | ||
inline RVec<Common_t> Linspace(T1 start, T2 end, unsigned long long n = 128) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The template parameter T3
is not used anywhere in the template, so I guess it can be removed. Also, from what I understand the return type will always be an RVec of floating point type. I personally do not see the reason to not always just have RVec<double>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the benefit could be if the user wants float or some other floating point type, they have some options on how they call these functions (Linspace
, Logspace
, Arange
) to yield the desired floating point type. For example (after the changes from the most recent commit to this PR), to illustrate some of this flexibility below:
root [1] Linspace(1, 10, 10)
(ROOT::VecOps::RVec<double>) { 1.0000000, 2.0000000, 3.0000000, 4.0000000, 5.0000000, 6.0000000, 7.0000000, 8.0000000, 9.0000000, 10.000000 }
root [2] Linspace(1.0f, 10.0f, 10)
(ROOT::VecOps::RVec<float>) { 1.00000f, 2.00000f, 3.00000f, 4.00000f, 5.00000f, 6.00000f, 7.00000f, 8.00000f, 9.00000f, 10.0000f }
root [3] Linspace<float, float>(1, 10, 10)
(ROOT::VecOps::RVec<float>) { 1.00000f, 2.00000f, 3.00000f, 4.00000f, 5.00000f, 6.00000f, 7.00000f, 8.00000f, 9.00000f, 10.0000f }
root [4] Linspace<long double, long double>(1, 10, 10)
(ROOT::VecOps::RVec<long double>) { 1.0000000L, 2.0000000L, 3.0000000L, 4.0000000L, 5.0000000L, 6.0000000L, 7.0000000L, 8.0000000L, 9.0000000L, 10.000000L }
root [5] Logspace(1.0f, 5.0f, 5, 10.0f)
(ROOT::VecOps::RVec<float>) { 10.0000f, 100.000f, 1000.00f, 10000.0f, 100000.f }
root [6] Logspace<float, float, float>(1, 5, 20)
(ROOT::VecOps::RVec<float>) { 10.0000f, 16.2378f, 26.3665f, 42.8133f, 69.5193f, 112.884f, 183.298f, 297.635f, 483.293f, 784.760f, 1274.28f, 2069.14f, 3359.82f, 5455.60f, 8858.67f, 14384.5f, 23357.2f, 37926.9f, 61584.8f, 100000.f }
root [7] Arange<float, float, float>(1, 10, 3)
(ROOT::VecOps::RVec<float>) { 1.00000f, 4.00000f, 7.00000f }
root [8] Arange<long double, long double, long double>(3, 15, 3.1)
(ROOT::VecOps::RVec<long double>) { 3.0000000L, 6.1000000L, 9.2000000L, 12.300000L }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could get the desired functionality by simplifying this as follows:
template <typename T, typename Ret = double> Ret Linspace(T start, T end, unsigned long long n = 50, const bool endpoint=true) {
const long double step = static_cast<long double>(end - start) / (n - endpoint);
tmp.push_back(std::is_floating_point_v<Ret> ? start + i * step : std::floor(start + i * step));
}
math/vecops/inc/ROOT/RVec.hxx
Outdated
* @brief Produce RVec with N evenly-spaced entries from start to end inclusive. | ||
* | ||
* This function generates a vector of evenly spaced values, starting at @p start and ending at @p end, | ||
* with exactly @p n elements. The spacing is computed as | ||
* \f$\text{step} = \frac{\text{end} - \text{start}}{n-1}\f$, so that the vector includes both endpoints. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Throughout the ROOT codebase, the Doxygen token used as sequence beginner is \
rather than @
. See for example the docs of Construct
from above: \begin
, \tparam
, \p
etc. Please move all the @
instances to \
instead.
math/vecops/inc/ROOT/RVec.hxx
Outdated
template <typename T1 = double, typename T2 = double, typename T3 = double, typename Common_t = std::conditional_t<std::is_floating_point_v<std::common_type_t<T1, T2, T3>>, std::common_type_t<T1, T2, T3>, double>> | ||
inline RVec<Common_t> Logspace(T1 start, T2 end, unsigned long long n = 128, T3 base = 10.0) | ||
{ | ||
RVec<Common_t> temp; | ||
|
||
if (!n || (n > std::numeric_limits<long long>::max())) // Check for invalid or absurd n. | ||
{ | ||
return temp; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comments that I made for LinSpace
also apply here
Thanks a lot for this! I have some suggestions, since I think it would be good to mimick python numpy more:
|
I tried my best to address these, though for the second and third points, I don't know how to cleanly add a specialization without incurring an ambiguous compiler error wrt being unable to deduce the correct overload. |
Not sure, maybe it's easier to use just a ternary operator such as: double result = std::is_floating_point ? static_cast : floor ; |
1f49ed5
to
deed8c2
Compare
deed8c2
to
1a2bbfd
Compare
math/vecops/inc/ROOT/RVec.hxx
Outdated
* | ||
* \note If \p n is 1, the resulting vector will contain only the value \p start. | ||
* \note The check `if (!n || (n > std::numeric_limits<long long>::max()))` is used to ensure that: | ||
* - n is nonzero, and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* - n is nonzero, and | |
* - division by zero is avoided when calculating `step` |
{ | ||
RVec<Common_t> temp; | ||
|
||
if (!n || (n > std::numeric_limits<long long>::max())) // Check for invalid or absurd n. |
This comment was marked as outdated.
This comment was marked as outdated.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good idea, but I think we still have design decisions to make. I left some comments to invite further discussion
math/vecops/inc/ROOT/RVec.hxx
Outdated
template <typename T1 = double, typename T2 = double, typename Common_t = std::conditional_t<std::is_floating_point_v<std::common_type_t<T1, T2>>, std::common_type_t<T1, T2>, double>> | ||
inline RVec<Common_t> Linspace(T1 start, T2 end, unsigned long long n = 128, const bool endpoint = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My biggest concern for this PR are the complications added for the templates of all the three newly introduced functions. IMHO, I don't see why we should allow something like
auto arr = [Linspace/Logspace](1.f, 33,...);
I guess it should be more than enough to always have only one T
parameter for all functions, and then change the return type to simply be
RVec<T>
if T is a floating point typeRVec<double>
otherwise
Thoughts @dpiparo ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, we should also allow to return RVec as numpy does if you specify dtype=int
.
So two template parameters:
- T for start/end
- Ret = double (by default), but can be changed to int.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe having the same T for start and end would be a very useful simplification.
For what concerns returning RVec<int>
: what would be the use case? Can this be covered by casting a RVec<double>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For what concerns returning
RVec<int>
: what would be the use case? Can this be covered by casting aRVec<double>
?
double myVector[100];
double myDecimatedVector[10];
for (auto const [idx, val] : std::views::enumerate(ROOT::VecOps::Linspace<auto, int>(0, 90, 10)))
myDecimatedVector[idx] = myVector[val];
Yes, one could cast val
to an int instead, true. But I just thought it's more convenient to follow the same conventions as numpy that allows you to specify dtype=int. And working with integers is safer than working with floating points wrt precision issues.
See silly reproducer below, the result is not the same with floats as with ints.
#include "ROOT/RVec.hxx"
#include <iostream>
template <typename T, typename Ret = double> ROOT::VecOps::RVec<Ret> Linspace(const T start, const T end, const unsigned long long n = 128, const bool endpoint = true, long double *const retstep = nullptr)
{
ROOT::VecOps::RVec<Ret> temp;
long double step = std::is_floating_point_v<T> ? (end - start) / (n - endpoint) : static_cast<long double>(end - start) / (n - endpoint);
if (retstep)
*retstep = step;
temp.reserve(n);
temp.push_back(static_cast<Ret>(start));
for (unsigned long long i = 1; i < n; i++)
temp.push_back(std::is_floating_point_v<T> ? static_cast<Ret>(start + i * step) : std::floor(start + i * step));
return temp;
}
void linspace()
{
double myVector[100000000];
double myDecimatedVector[10];
// with C++23: for (auto const [idx, val] : std::views::enumerate(Linspace<auto, int>(0, 90, 10)))
auto idx = 0;
for (auto val : Linspace<int, int>(100000000-100, 100000000-10, 10)) {
myDecimatedVector[idx++] = myVector[val];
std::cout << val << std::endl;
}
idx = 0;
for (auto val : Linspace<int, float>(100000000-100, 100000000-10, 10)) {
myDecimatedVector[idx++] = myVector[int(val)];
std::cout << int(val) << std::endl;
}
}
The same might happen if I use doubles, just at a later index (much bigger arrays). Giving the user the security to use unsigned long ints if they want to is useful.
fyi @edfink234
8f8e31d
to
9edeb81
Compare
@dpiparo @vepadulano @ferdymercury I made two commits. The first one was changing |
Thanks a lot!! From my side: this would be the only change left to do:
Since step must be floating even if CommonT is int. And for high N, we avoid extrapolation errors by always choosing doubles (or long doubles) even if CommonT is float. |
math/vecops/inc/ROOT/RVec.hxx
Outdated
template <typename T = double, typename Common_t = std::conditional_t<std::is_floating_point_v<T>, T, double>> | ||
inline RVec<Common_t> Linspace(T start, T end, unsigned long long n = 128, const bool endpoint = true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my understanding this signature is still going against the documentation. It is clearly stated that
* The function is templated to allow for different arithmetic types. The deduced type \c Common_t is
* determined as follows: if \p T is a floating point type, that type is used;
* otherwise, the arithmetic is performed using \c double.
But that is not all, the user can still specify LinSpace<float, int>
. Now suddenly the common type is not the common type at all!
I still do not understand why the template needs to be anything more complicated than template <typename T>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Vincenzo for the feedback!
Now suddenly the common type is not the common type at all!
From my point of view:
Common_t
should be renamed toOutput
(orRet_t
return type) as I wrote in my suggested snippet. (and fixing step Adding LinSpace, LogSpace, and Arange prototypes #18200 (comment))- The doxygen comment is wrong and should be fixed accordingly
I still do not understand why the template needs to be anything more complicated than
template <typename T>
In my opinion, it's essential that the user can separately specify the auto-deduced input type and the (if-not-default) output type.
Example with your suggestion:
<typename Input> RVec<Input> Linspace(Input start, Input end, ...)
I write Linspace(3,5, 64)
So it's clear that I want double as returned values. But auto-deduction in this case will cast the return value to int's, hence returning nonsense. Of course, you could say it's the user fault, it should have specified Linspace(3.,5., 64)
, but that's not super-user-friendly. And it would go against the default behavior of np.linspace, so translatability of scripts to/from Python would be risked. Since
np.linspace(3,5, 64) == [3. , 3.03174603, 3.06349206, ...
unless you force it to be ints via dtype=int
.
Now you would argue, ok, let's always force that the return values are doubles even if inputs arguments are passed as ints.
<typename Input> RVec<double> Linspace(Input start, Input end, ...)
But that strategy would invalidate this other valid use case:
Linspace(1,10,10)
where here I really want integers returned, not doubles, so that I can do for-range based loops using integers. And that would be a mismatch: some users could say why can I do in python dtype=int and not in C++?
So to me, the best compromise is really:
<typename Input, typename Output = double> RVec<Output> Linspace(Input start, Input end, ...)
math/vecops/inc/ROOT/RVec.hxx
Outdated
template <typename T = double, typename Common_t = std::conditional_t<std::is_floating_point_v<T>, T, double>> | ||
inline RVec<Common_t> Logspace(T start, T end, unsigned long long n = 128, const bool endpoint = true, T base = 10.0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar comment about the template here
math/vecops/inc/ROOT/RVec.hxx
Outdated
template <typename T = double, typename Common_t = std::conditional_t<std::is_floating_point_v<T>, T, double>> | ||
inline RVec<Common_t> Arange(T start, T end, T step) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again similar comment about the template. Why should a user be able to do Arange<float, int>
?
…tep computation
Hi @dpiparo @vepadulano @ferdymercury I added the changes suggested by @vepadulano and @ferdymercury. I'm not sure how I feel about the suggestion of @ferdymercury for step if I understand the need for all these casts and what the floor does exactly, but ok, I changed it and checked that it yields the expected output. |
math/vecops/inc/ROOT/RVec.hxx
Outdated
template <typename T = double, typename Ret_t = std::conditional_t<std::is_floating_point_v<T>, T, double>> | ||
inline RVec<Ret_t> Arange(T start, T end, T step) | ||
{ | ||
unsigned long long n = std::ceil(static_cast<Ret_t>(end-start)/static_cast<Ret_t>(step)); // Ensure floating-point division. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsigned long long n = std::ceil(static_cast<Ret_t>(end-start)/static_cast<Ret_t>(step)); // Ensure floating-point division. | |
unsigned long long n = std::ceil(static_cast<Ret_t>(end-start)/static_cast<long double>(step)); // Ensure floating-point division. |
Otherwise floating point division is not ensured if user sets Ret=int
math/vecops/inc/ROOT/RVec.hxx
Outdated
static_cast<long double>(end - start) / (n - endpoint); | ||
|
||
RVec<Ret_t> temp(n); | ||
temp[0] = static_cast<Ret_t>(start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
temp[0] = static_cast<Ret_t>(start); | |
temp[0] = std::is_floating_point_v<Ret_t> ? static_cast<Ret_t>(start) : std::floor(start); |
Needed for consistency, since static_cast rounds towards zero whereas std::floor rounds towards negative.
Or move this line inside the if-else clause and start at i=0 the loop.
math/vecops/inc/ROOT/RVec.hxx
Outdated
(static_cast<Ret_t>(end_c - start_c) / static_cast<Ret_t>(n - endpoint)) : | ||
static_cast<long double>(end - start) / (n - endpoint); | ||
|
||
temp[0] = static_cast<Ret_t>(std::pow(base_c, start_c)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above. Or start loop at i=0
math/vecops/inc/ROOT/RVec.hxx
Outdated
|
||
RVec<Ret_t> temp(n); | ||
|
||
Ret_t start_c = static_cast<Ret_t>(start); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, or move inside loop.
math/vecops/inc/ROOT/RVec.hxx
Outdated
|
||
long double step = std::is_floating_point_v<Ret_t> ? | ||
(static_cast<Ret_t>(end) - static_cast<Ret_t>(start)) / static_cast<Ret_t>(n - endpoint) : | ||
static_cast<long double>(end - start) / (n - endpoint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_cast<long double>(end - start) / (n - endpoint); | |
end >= start ? static_cast<long double>(end - start) / (n - endpoint) : (static_cast<long double>(end)- start)) / (n - endpoint); |
This is needed to avoid an underflow if user calls Linspace(5UL, 1UL);
Same change needed below.
math/vecops/inc/ROOT/RVec.hxx
Outdated
{ | ||
for (unsigned long long i = 1; i < n; i++) | ||
{ | ||
temp[i] = static_cast<Ret_t>(start) + static_cast<Ret_t>(i) * step; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it can be simplified as follows?
temp[i] = static_cast<Ret_t>(start) + static_cast<Ret_t>(i) * step; | |
temp[i] = static_cast<Ret_t>(start + i * step); |
same in all below.
math/vecops/inc/ROOT/RVec.hxx
Outdated
|
||
long double step = std::is_floating_point_v<Ret_t> ? | ||
(static_cast<Ret_t>(end_c - start_c) / static_cast<Ret_t>(n - endpoint)) : | ||
static_cast<long double>(end - start) / (n - endpoint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static_cast<long double>(end - start) / (n - endpoint); | |
end >= start ? static_cast<long double>(end - start) / (n - endpoint) : (static_cast<long double>(end)- start)) / (n - endpoint); |
math/vecops/inc/ROOT/RVec.hxx
Outdated
long double step_c = std::is_floating_point_v<Ret_t> ? | ||
static_cast<Ret_t>(step) : | ||
static_cast<long double>(step); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
long double step_c = std::is_floating_point_v<Ret_t> ? | |
static_cast<Ret_t>(step) : | |
static_cast<long double>(step); | |
long double step_c = step; |
I would suggest casting to Ret_t only in the final result, not in the intermediate variables.
Ret_t end_c = static_cast<Ret_t>(end); | ||
Ret_t base_c = static_cast<Ret_t>(base); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just cast to ret when you store things in temp vector, not the intermediate variables.
math/vecops/inc/ROOT/RVec.hxx
Outdated
} | ||
|
||
long double step = std::is_floating_point_v<Ret_t> ? | ||
(static_cast<Ret_t>(end) - static_cast<Ret_t>(start)) / static_cast<Ret_t>(n - endpoint) : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(static_cast<Ret_t>(end) - static_cast<Ret_t>(start)) / static_cast<Ret_t>(n - endpoint) : | |
(end - start) / static_cast<long double>(n - endpoint) : |
no need for Ret cast of intermediate variables?
I just pushed the new commit with these changes! |
This Pull request:
Changes or fixes:
Adds
linSpace
,logSpace
, andarange
functions toROOT::VecOps
namespaceChecklist:
Fixes #17855