Hi Andy & everyone contributing and maintaining dplR /dplPy.
I repeatedly am confronted with the unnecessary behavior of rwi.stats.running to not correctly calculate rbar when you detrend with difference=T.
That the detrending via division throws you an error when the estimated line goes below zero and reverts to detrending via the mean is great. But when people say difference=T they probably have already converted their values (log- or power-transformed) or they work with stable isotopes. In the case of log-transformed RW or d13C we automatically work with negative values, and it would be great if there was an option in detrend() to say: "I know what I am doing, fit "spline" curve anyways, even when fit goes negative". I am aware that adding a simple line in your code transposing the data by +100 is a very effective workaround, but sometimes I am asking myself if there couldn't be a fix for that. :-) The workaround is probably easier than changing detrend.series() with the multiple detrending options.
Anyways, I digress, the "dirty dog" part is actually not the reason why I opened this issue. Why is the rbar and hence eps and snr calculation affected by close-to-zero mean values? That does not make sense.
tmp <- normalize1(rwi, n, prewhiten)
if(!all(tmp$idx.good)) {
warning("after prewhitening, 'rwi' contains column(s) without at least four observations",
call.=FALSE)
cat(gettext("note that there is no error checking on column lengths if filtering is not performed\n",
domain="R-dplR"))
}
After some step-by-step testing I found normalize1() to be the "culprit". This function needs to be changed. At least, I do not understand why it does what it does. Why do anything, if I do not want to prewhiten the data? Why divide the data by their mean at all? Even for the pre-whitening this steps seems unnecessary. You can do the ar-modeling at any mean. And for the rbar calculation the mean doesn't matter. Am I missing something here, or why was this implemented? I would suggest deleting the lines that say "divide by time series mean" in normalize1(). Seems like an easy fix. What do you think?
BTW: This "set mean to 1" functionality in the detrend.series "Ar" method created also the funny constant offset of Anderegg's 2015 Figure S10, where their legacy baseline is weirdly at ~-0.01 and not 0. because each normally detrended time series actually has a usually a mean of something around 0.99-something (there is variation around the means, but by and large they are slightly <1), but the mean of the prewhitened time series is always exactly 1.
In the detrend.series() function the following lines
if (difference) {
resids$Ar <- Ar - mean(Ar, na.rm = TRUE)
}
else {
resids$Ar <- Ar/mean(Ar, na.rm = TRUE)
}
could be changed to
if (difference) {
resids$Ar <- Ar - mean(y2, na.rm = TRUE)
}
else {
resids$Ar <- Ar/mean(y2, na.rm = TRUE)
}
Just an idea.
Hi Andy & everyone contributing and maintaining dplR /dplPy.
I repeatedly am confronted with the unnecessary behavior of rwi.stats.running to not correctly calculate rbar when you detrend with
difference=T.That the detrending via division throws you an error when the estimated line goes below zero and reverts to detrending via the mean is great. But when people say
difference=Tthey probably have already converted their values (log- or power-transformed) or they work with stable isotopes. In the case of log-transformed RW or d13C we automatically work with negative values, and it would be great if there was an option indetrend()to say: "I know what I am doing, fit "spline" curve anyways, even when fit goes negative". I am aware that adding a simple line in your code transposing the data by +100 is a very effective workaround, but sometimes I am asking myself if there couldn't be a fix for that. :-) The workaround is probably easier than changingdetrend.series()with the multiple detrending options.Anyways, I digress, the "dirty dog" part is actually not the reason why I opened this issue. Why is the rbar and hence eps and snr calculation affected by close-to-zero mean values? That does not make sense.
tmp <- normalize1(rwi, n, prewhiten)
if(!all(tmp$idx.good)) {
warning("after prewhitening, 'rwi' contains column(s) without at least four observations",
call.=FALSE)
cat(gettext("note that there is no error checking on column lengths if filtering is not performed\n",
domain="R-dplR"))
}
After some step-by-step testing I found
normalize1()to be the "culprit". This function needs to be changed. At least, I do not understand why it does what it does. Why do anything, if I do not want to prewhiten the data? Why divide the data by their mean at all? Even for the pre-whitening this steps seems unnecessary. You can do the ar-modeling at any mean. And for the rbar calculation the mean doesn't matter. Am I missing something here, or why was this implemented? I would suggest deleting the lines that say "divide by time series mean" innormalize1(). Seems like an easy fix. What do you think?BTW: This "set mean to 1" functionality in the detrend.series "Ar" method created also the funny constant offset of Anderegg's 2015 Figure S10, where their legacy baseline is weirdly at ~-0.01 and not 0. because each normally detrended time series actually has a usually a mean of something around 0.99-something (there is variation around the means, but by and large they are slightly <1), but the mean of the prewhitened time series is always exactly 1.
In the
detrend.series()function the following linescould be changed to
Just an idea.