Skip to content

some more thoughts for the future (dplR v2.x) #22

@sklesse

Description

@sklesse

Hi Andy & everyone contributing and maintaining dplR /dplPy.

I repeatedly am confronted with the unnecessary behavior of rwi.stats.running to not correctly calculate rbar when you detrend with difference=T.

That the detrending via division throws you an error when the estimated line goes below zero and reverts to detrending via the mean is great. But when people say difference=T they probably have already converted their values (log- or power-transformed) or they work with stable isotopes. In the case of log-transformed RW or d13C we automatically work with negative values, and it would be great if there was an option in detrend() to say: "I know what I am doing, fit "spline" curve anyways, even when fit goes negative". I am aware that adding a simple line in your code transposing the data by +100 is a very effective workaround, but sometimes I am asking myself if there couldn't be a fix for that. :-) The workaround is probably easier than changing detrend.series() with the multiple detrending options.

Anyways, I digress, the "dirty dog" part is actually not the reason why I opened this issue. Why is the rbar and hence eps and snr calculation affected by close-to-zero mean values? That does not make sense.

tmp <- normalize1(rwi, n, prewhiten)
if(!all(tmp$idx.good)) {
warning("after prewhitening, 'rwi' contains column(s) without at least four observations",
call.=FALSE)
cat(gettext("note that there is no error checking on column lengths if filtering is not performed\n",
domain="R-dplR"))
}

After some step-by-step testing I found normalize1() to be the "culprit". This function needs to be changed. At least, I do not understand why it does what it does. Why do anything, if I do not want to prewhiten the data? Why divide the data by their mean at all? Even for the pre-whitening this steps seems unnecessary. You can do the ar-modeling at any mean. And for the rbar calculation the mean doesn't matter. Am I missing something here, or why was this implemented? I would suggest deleting the lines that say "divide by time series mean" in normalize1(). Seems like an easy fix. What do you think?

BTW: This "set mean to 1" functionality in the detrend.series "Ar" method created also the funny constant offset of Anderegg's 2015 Figure S10, where their legacy baseline is weirdly at ~-0.01 and not 0. because each normally detrended time series actually has a usually a mean of something around 0.99-something (there is variation around the means, but by and large they are slightly <1), but the mean of the prewhitened time series is always exactly 1.

In the detrend.series() function the following lines

if (difference) {
     resids$Ar <- Ar - mean(Ar, na.rm = TRUE)
   }
   else {
     resids$Ar <- Ar/mean(Ar, na.rm = TRUE)
   }

could be changed to

if (difference) {
     resids$Ar <- Ar - mean(y2, na.rm = TRUE) 
   }
   else {
     resids$Ar <- Ar/mean(y2, na.rm = TRUE)
   }

Just an idea.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions