Description
While working on PR #234 to support complex fields, @cwsmith and I discussed the fact that the get/set(Scalar,Vector,Matrix) API functions are not type safe: a Scalar/Vector/Matrix/Packed field can be passed to any of these functions as a Field*, which is then down-cast to the appropriate type. This conversion is well-defined only when the passed-in Field* actually points to a field of the appropriate sub-class, and thus with the appropriate 'Value' type.
But static_cast doesn't check that the down-cast is appropriate; dynamic_cast does perform this check (and would work in this case because all of the Field sub-classes are polymorphic), so that would be the appropriate change to make. However, dynamic_cast uses RTTI to ensure the result is actually a pointer to the correct type, and that adds overhead to these functions, which are definitely tight-loop functions. I assume that is why static_cast is what is currently used (though the internal call to getNodeValue is still polymorphic, so these functions can never be inlined or highly optimized; they will always require at least that one dynamic function dispatch).
In order to not add any call overhead to these functions, I will be adding a PCU_DEBUG_ASSERT against the value of field->getValueType(). This makes things slightly safer without adding any overhead to optimized runs of any downstream codes (not that we ever expect this to be fast).
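To make the trade-off concrete, here is a minimal sketch of the two checking strategies. The Field, ScalarField and VectorField types below are simplified, hypothetical stand-ins rather than the real classes, and plain assert stands in for PCU_DEBUG_ASSERT (assuming it behaves like an assert that is compiled out of optimized builds).

```cpp
#include <cassert>

// Hypothetical stand-ins for the real field classes; names and members are assumptions.
enum ValueType { SCALAR, VECTOR };

struct Field {
  virtual ~Field() {}
  virtual int getValueType() const = 0;
};

struct ScalarField : Field {
  int getValueType() const override { return SCALAR; }
  double value = 0.0;
};

struct VectorField : Field {
  int getValueType() const override { return VECTOR; }
  double value[3] = {0.0, 0.0, 0.0};
};

// Checked on every call: yields nullptr on a type mismatch, at the cost of an RTTI lookup.
inline ScalarField* checkedDowncast(Field* f) {
  return dynamic_cast<ScalarField*>(f);
}

// Checked only when NDEBUG is not defined; assert stands in for PCU_DEBUG_ASSERT here.
inline ScalarField* debugCheckedDowncast(Field* f) {
  assert(f->getValueType() == SCALAR);
  return static_cast<ScalarField*>(f);
}
```

With the debug-only variant, a mismatched field aborts in a debug build and is silently down-cast in an optimized one, which is the compromise described above.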
Now the real weirdness and the real reason for the issue:
All of these getter/setter functions call FieldOf<T>::get/setNodeValue(), which takes a pointer to the 'value' of the field: for SCALAR and PACKED this is just double, but for VECTOR and MATRIX it is Vector3 and Matrix3x3 respectively. Inside this function the T* argument is cast with reinterpret_cast<double*> (or const double* for setting). This means the operation being performed is:
Vector3* value; // or Matrix3x3*, given as input
double* cmps = reinterpret_cast<double*>(value);
So we're casting a pointer to an object to a pointer to a double. The Vector3 and Matrix3x3 classes both ultimately inherit from can::Array. In the case of Vector3, it makes some sense that the address of the object is also (apparently in general) the address of the array of doubles inside the object. In the Matrix3x3 case the array contains 3 Vector3 objects, each of which I imagine, when laid out in memory, also just consists of the length-3 double array it contains, so this makes some sense too.
However, casting a pointer to an object to a pointer to the 'first' data member of that object is not guaranteed by the standard in general (the guarantee only exists for standard-layout classes), and otherwise the result is undefined behavior. I'm certain the reason we've gotten away with this for so long is that it is the most sensible way for the compiler to put together the actual memory layout of the class, but we never check that this is what we get, so we're making a huge assumption.
While these classes do have an inheritance hierarchy, they are not polymorphic, so there is no function lookup table. But when there is a function lookup table, guess where the pointer to it goes for each object of that class? Right at the start of the object in memory (technically this is implementation-defined, as is the address of the first data member). So if we were to ever edit the can::Array, mth::Vector, or mth::Matrix classes and introduce a single virtual function (which we never will, but nonetheless), this portion of the API instantly breaks.
Further (as noted), this violates encapsulation, as we're directly accessing the underlying protected data of the class. Granted, in this case it is read-only, which is slightly more forgivable, but it is still an issue.
The only fix I can think of for this issue is to add an explicit cast operator to the Vector class:
explicit operator double*() { return &elems[0]; }
This takes care of that issue. The Matrix class is a bigger problem since it is a container of multiple Vectors... I don't really have an elegant solution for this one.
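A minimal sketch of the cast-operator idea for the Vector case, assuming a simplified, hypothetical Array base with a protected elems member (the real can::Array and mth::Vector may be laid out differently). The const overload and the sumComponents helper are additions for illustration only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified stand-in for can::Array with protected storage.
template <class T, std::size_t N>
class Array {
 public:
  T& operator[](std::size_t i) { return elems[i]; }
  T const& operator[](std::size_t i) const { return elems[i]; }

 protected:
  T elems[N];
};

// Hypothetical stand-in for the Vector class with the proposed operators.
class Vector3 : public Array<double, 3> {
 public:
  // The class itself hands out the component pointer, so callers no
  // longer depend on where the compiler places elems inside the object.
  explicit operator double*() { return &elems[0]; }
  explicit operator double const*() const { return &elems[0]; }
};

// Illustrative caller: replaces reinterpret_cast with the explicit conversion.
inline double sumComponents(Vector3 const& v) {
  double const* c = static_cast<double const*>(v);
  return c[0] + c[1] + c[2];
}
```

Because the operators are explicit, a Vector3 never decays to a pointer silently; callers have to spell out the static_cast, which keeps the conversion visible at the call site.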