@seabbs and I have been thinking about ways to better handle the delay from infection to case observation. We have a current approach that's only meh and two additional ideas that probably don't work, and we're stuck.
What we currently do
- We have a fitted model that takes a user-requested vector of dates.
- The model expects to be provided dates that correspond to cases, but the user-specified dates correspond to an infection-based measure. So we map the user-requested infection-corresponding dates to the model-expected case-corresponding dates by adding a user-specified mean delay.
- Then we simulate draws of the expected posterior incident cases out of the fitted model.
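The date-shifting step above can be sketched as follows. This is an illustration, not our actual code; the function name `infection_to_case_dates` and the example dates are made up:

```python
from datetime import date, timedelta

def infection_to_case_dates(infection_dates, mean_delay_days):
    """Map user-requested infection dates to the case dates the fitted
    model expects, by adding the user-specified mean delay.
    (Hypothetical helper, for illustration only.)"""
    shift = timedelta(days=mean_delay_days)
    return [d + shift for d in infection_dates]

requested = [date(2024, 1, 1), date(2024, 1, 15)]
print(infection_to_case_dates(requested, mean_delay_days=5))
# → [datetime.date(2024, 1, 6), datetime.date(2024, 1, 20)]
```

The point of the issue is that this shift carries only the mean of the delay, discarding its spread.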
This has the obvious problem of not accounting for the uncertainty induced by the delay convolution. It's been a hard problem to solve because we don't have a latent process; we're just fitting to cases.
Option 1: signal regression
mgcv has a "linear functional terms" feature that enables signal regression approaches where the weighted sum of lagged terms enters the model as a single term, playing the role of the delay convolution.
Hallelujah! Except the convolution happens on the scale of the linear predictor, not the mean response scale. And we're stuck.
One bad workaround would be to switch to an identity link, but that feels like an unpleasant road to walk down: out of the "poor delay handling" frying pan and into the "negative cases" fire. It would also give a static variance over time rather than letting the variance be a function of the mean.
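A toy numerical illustration of why the link-scale convolution is the wrong object: with a log link, a weighted sum on the linear predictor is `exp(Σ_d w_d · log μ_{t−d})`, a weighted *geometric* mean of the lagged means, not the arithmetic convolution we want. The incidence curve and delay weights below are made up:

```python
import numpy as np

mu = np.exp(np.linspace(0.0, 2.0, 30))    # expected infections, mean scale (made up)
w = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # assumed delay pmf, sums to 1

# What we want: the convolution on the mean (response) scale
mean_scale = np.convolve(mu, w)[: len(mu)]

# What a weighted sum on the linear predictor gives under a log link:
# exp(convolve(log mu, w)) -- a weighted geometric mean of lagged means
link_scale = np.exp(np.convolve(np.log(mu), w)[: len(mu)])

# By AM-GM, once the full delay window is in play, the link-scale
# version systematically undershoots the true convolved mean:
print((mean_scale[4:] >= link_scale[4:]).all())  # → True
```

So the two quantities agree only in degenerate cases (flat incidence), which is why the identity-link escape hatch is tempting despite its problems.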
Option 2: Variance propagation à la density surface models
Inspired by the work here, we could implement their approach from section 3. In that work they have spatial data and some kind of "detection function" giving the likelihood of observation.

They smuggle this in through a random effect via the Hessian, in a way I don't get
(from these slides)
As far as I can tell, this would involve doing something clever with a Taylor series and the gamma distribution of the delay? I'm a little worried that (1) we would lose something in the tail with the linearization of the convolution from the Taylor series approximation, and (2) I don't totally understand the implementation. I think this is where the magic happens but it's a little scary.
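My reading of the Taylor-series idea, as a minimal sketch: linearise the convolution in the uncertain delay parameter (a finite-difference Jacobian stands in for the analytic derivative), then push the parameter variance through that linearisation, delta-method style. The issue mentions a gamma delay; I use a discretised lognormal here purely to avoid a SciPy dependency — the mechanics are identical. All parameter values are made up:

```python
import math
import numpy as np

def delay_pmf(meanlog, sdlog, max_d=15):
    """Discretised lognormal delay distribution (stand-in for the gamma
    in the issue, chosen only to keep this sketch dependency-free)."""
    def cdf(x):
        if x <= 0:
            return 0.0
        return 0.5 * (1 + math.erf((math.log(x) - meanlog) / (sdlog * math.sqrt(2))))
    edges = [cdf(d) for d in range(max_d + 2)]
    pmf = np.diff(edges)
    return pmf / pmf.sum()

def convolved_mean(infections, meanlog, sdlog):
    """Expected cases = infections convolved with the delay pmf."""
    w = delay_pmf(meanlog, sdlog)
    return np.convolve(infections, w)[: len(infections)]

# Delta method: first-order Taylor expansion of the convolution in the
# delay parameter, then Var(cases) ≈ J² · Var(parameter).
infections = np.exp(np.linspace(0.0, 2.0, 40))   # made-up incidence
m0, s0, var_m = 1.5, 0.5, 0.01                   # delay params + Var(meanlog), made up
eps = 1e-4
jac = (convolved_mean(infections, m0 + eps, s0)
       - convolved_mean(infections, m0 - eps, s0)) / (2 * eps)
extra_var = jac ** 2 * var_m   # per-day variance induced by delay uncertainty
```

Worry (1) from above shows up directly here: the linearisation is local, so tail behaviour of the delay distribution is captured only to first order.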
Option 3: Put a tensor product smooth on the lag
Skip knowing the delay and instead throw more splines at the wall until you get something out. This feels dangerous and like a bad idea in our situation but see here because it's cool: https://ecogambler.netlify.app/blog/distributed-lags-mgcv/
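For context, the data prep behind that distributed-lag trick: mgcv's `te()` accepts matrix covariates, so each observation carries a whole vector of lagged values plus a matching matrix of lag indices. A sketch of building the lag matrix (the Python here only shows the data shape; the actual smooth would be fit in R with mgcv):

```python
import numpy as np

def lag_matrix(x, max_lag):
    """Matrix whose column j holds x lagged by j steps (edge-padded),
    the kind of matrix covariate a distributed-lag te(date, lag)
    tensor product smooth consumes."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    L = np.empty((n, max_lag + 1))
    for j in range(max_lag + 1):
        L[:, j] = np.concatenate([np.full(j, x[0]), x[: n - j]])
    return L

cases = np.arange(10.0)
X = lag_matrix(cases, max_lag=3)  # rows: time; columns: lags 0..3
print(X[5])  # → [5. 4. 3. 2.]
```

The smooth then gets to decide how much weight each lag carries at each date, which is exactly the "skip knowing the delay" part — and why it feels dangerous when we think we actually do know something about the delay.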
More spaghetti to throw at the wall
Write a custom mgcv family that implements the convolution. It looks like we'd need a bunch of higher-order derivatives of the convolution.
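A toy check of the kind of chain-rule bookkeeping those derivatives involve (mgcv extended families need derivatives of the log-likelihood with respect to the linear predictor up to high order; this only shows first order, and all numbers are made up). With a Poisson log-likelihood and a convolved mean μ_t = Σ_d w_d · exp(η_{t−d}), the derivative with respect to η_s picks up a factor w_{t−s} · exp(η_s) from the convolution:

```python
import math

w = [0.3, 0.5, 0.2]            # assumed delay weights
eta = [0.1, 0.4, -0.2, 0.3]    # linear predictor values (made up)
y_t, t, s = 4.0, 3, 2          # observation, time index, component index

def mu(eta):
    # convolved mean: sum over delays of w[d] * exp(eta[t - d])
    return sum(w[d] * math.exp(eta[t - d]) for d in range(len(w)))

def ll(eta):
    m = mu(eta)
    return y_t * math.log(m) - m   # Poisson log-lik, up to a constant

# Analytic first derivative wrt eta_s via the chain rule through mu:
analytic = (y_t / mu(eta) - 1) * w[t - s] * math.exp(eta[s])

# Finite-difference check
eps = 1e-6
bumped = eta.copy(); bumped[s] += eps
fd = (ll(bumped) - ll(eta)) / eps
print(abs(analytic - fd) < 1e-5)  # → True
```

Every extra derivative order multiplies the bookkeeping, since each one threads through the convolution again — hence "a bunch of higher-order derivatives".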