Replies: 3 comments 1 reply
Looking at the code, it appears that the 0, 1 case optimization for variables is not also done for dynamic parameters. Do you think you could modify that azmul.hpp to do the zero / one optimization for dynamic parameters (and make a corresponding pull request)? If so, please first make sure that you can run the check_all.sh script on your system. This may require some setup, like installing xrst.
Make sure you use the master branch and can run bin/check_all.sh before making any changes.
I made PR #232. Note that I describe there that it is lacking tests and so likely isn't ready to merge. It does appear to me that the actual code tests ran, but I am not sure. I also ran
Hi @bradbell, we have had CppAD successfully deployed with nimble for a while now. Thanks for making this possible.
In `azmul.hpp`, lines 191-199 (the case where neither argument is `variable` and at least one is `dynamic`), would it work to catch cases when `x` or `y` is `constant` and exactly 0 or 1 and then avoid putting a `zmul_dyn` operator in the vector of `dynamic` parameter operators used by `new_dynamic`? In the preceding cases where `x` or `y` is `variable`, there is careful catching of cases with the other argument a constant `0` or `1` to avoid recording unnecessary operations. However, am I following correctly that this doesn't happen for the case where `x` and `y` are one `dynamic` and one `constant`?

The same question applies to `add.hpp` (and possibly others?).

I very well may not be following the internal details sufficiently.
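To make the idea concrete, here is a toy sketch of the kind of folding I have in mind. Everything in it (`Param`, `ToyRecorder`, `count_ops_after_identities`) is hypothetical and is not CppAD's actual code or API; it also ignores non-finite values, so a constant 0 times anything is simply treated as a constant 0.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy types for illustration only; these are not CppAD's classes.
struct Param {
    bool   is_dynamic;  // false means the parameter is a constant
    double value;       // current numerical value
};

struct ToyRecorder {
    std::size_t n_dyn_ops = 0;  // number of dynamic operators recorded so far

    // True when p is a constant whose value is exactly v.
    static bool is_const(const Param& p, double v) {
        return !p.is_dynamic && p.value == v;
    }

    // Multiplication with 0/1 folding.
    // Simplification (assumption): non-finite values are ignored.
    Param mul(const Param& x, const Param& y) {
        if (is_const(x, 0.0) || is_const(y, 0.0)) return {false, 0.0};
        if (is_const(x, 1.0)) return y;  // 1 * y: reuse y, record nothing
        if (is_const(y, 1.0)) return x;  // x * 1: reuse x, record nothing
        if (!x.is_dynamic && !y.is_dynamic) return {false, x.value * y.value};
        ++n_dyn_ops;                     // genuinely dynamic result
        return {true, x.value * y.value};
    }

    // Addition with 0 folding.
    Param add(const Param& x, const Param& y) {
        if (is_const(x, 0.0)) return y;  // 0 + y: reuse y, record nothing
        if (is_const(y, 0.0)) return x;  // x + 0: reuse x, record nothing
        if (!x.is_dynamic && !y.is_dynamic) return {false, x.value + y.value};
        ++n_dyn_ops;
        return {true, x.value + y.value};
    }
};

// Applies n rounds of "1 * a + 0" to a dynamic parameter and
// returns how many dynamic operators ended up being recorded.
std::size_t count_ops_after_identities(std::size_t n) {
    ToyRecorder rec;
    Param a{true, 3.0};
    const Param one{false, 1.0};
    const Param zero{false, 0.0};
    for (std::size_t i = 0; i < n; ++i)
        a = rec.add(rec.mul(one, a), zero);
    assert(a.is_dynamic && a.value == 3.0);
    return rec.n_dyn_ops;
}
```

With this folding, chains of identity multiplications and additions on a dynamic parameter record no operators at all, while a product of a dynamic parameter with any other constant still records one.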
Here is a summary of the use case that led me to this. (I can provide code but will try to explain.) Consider taping calculation of a quadratic form `y = x^T A x` where `x` is an n-by-1 matrix and `A` is an n-by-n matrix. `y` is a scalar. Everything is taped with your scalar multiplication and addition operators. Then consider obtaining the Hessian of this with respect to all elements of `x` (i.e., the correct answer is `2A`) by triple taping: (i) tape `y = x^T A x`; (ii) tape calculation of the Jacobian of (i) by playing (i) forward 0 and reverse 1; and (iii) tape calculation of the Hessian of (i) by playing (ii) forward 0 (once) and then reverse 1, n times, once each with a single element of w being 1 instead of 0. Finally, `optimize()` the third tape.

In this case, if elements of `A` are all `constant`, the result is ideal: The `optimize()`d third tape has successfully obtained the result as simply twice the elements of `A`, already calculated as constants, and virtually no computations are needed. (Before `optimize()`, the third tape still includes the forward 0 calculations, but optimization successfully detects that none of that is needed for the Hessian elements, which all result from multiplication and addition of only `constant` arguments.)

However, if elements of `A` are all `dynamic` (for all three levels of taping), then the `new_dynamic` operations in the third tape get very large at a fast rate in relation to the size of `A`. And they are all (or almost all) multiplications by 1 or additions of 0. With `n=500`, creating the third tape (before optimizing it) looks like it takes about 10 GB. Playing that tape takes at least an additional 10 GB. (And that is before optimizing it, which seems to shorten the `new_dynamic` operations somewhat but does not eliminate identity multiplications or additions.)

What I mean by using `dynamic` elements for all three levels of taping is that each tape includes a dynamic vector that is used as the elements of `A`. When recording one tape by playing another tape, the playing is preceded by a call to `new_dynamic` so that the dynamic vector for the new tape is used as the updated dynamic values of the tape that will be played.

I was surprised that `constant` vs `dynamic` would give such a huge difference in memory performance. Is there a way to make recording of `dynamic` operations catch cases where efficiency can be gained? Above, I tried to see where that might be done but might be off track.

I can definitely work around this but wonder if it is feasible and/or sensible to improve.
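For reference, the Hessian of the quadratic form described above follows from a short calculation. The final step assumes $A$ is symmetric, which is my reading of the stated answer of $2A$ rather than something spelled out above:

$$
\begin{aligned}
y &= x^T A x = \sum_{i,j} A_{ij}\, x_i x_j, \\
\frac{\partial y}{\partial x_k} &= \sum_{j} A_{kj}\, x_j + \sum_{i} A_{ik}\, x_i = \big( (A + A^T)\, x \big)_k, \\
\frac{\partial^2 y}{\partial x_k\, \partial x_l} &= A_{kl} + A_{lk}, \qquad \text{so } H = A + A^T = 2A \text{ when } A = A^T .
\end{aligned}
$$

The Hessian does not depend on `x`, so in principle the third tape only needs to combine elements of `A`, whether those are constant or dynamic.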
Thank you.
Perry