Handling delays better? #88

@zsusswein

@seabbs and I have been thinking about ways to better handle the delay from infection to case observation. We have a current thing that's only meh and two additional things that probably don't work and we're stuck on.

What we currently do

We have a fitted model $\hat f$ that we evaluate at a user-requested vector of dates $\tilde t$, giving $\hat f(\tilde t)$. We ask the user for the dates at which they want to see $R_t$.

The model expects to be provided dates that correspond to cases, but the user-specified dates correspond to an infection-based measure. So we map the user-requested infection-corresponding dates to the model-expected case-corresponding dates by adding a user-specified mean delay $\delta$.

Then we simulate posterior draws of expected incident cases from $\hat f(\tilde t + \delta)$ and convolve them with the user-supplied generation interval (GI) PMF to get $R_t$.
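A minimal sketch of that pipeline for a single posterior draw, assuming integer day indices and assuming the GI convolution is the usual renewal-equation ratio $R_t = I_t / \sum_s w_s I_{t-s}$. The function names and the toy inputs are hypothetical, not the package's API:

```python
def renewal_rt(cases, gi_pmf):
    """R_t = I_t / sum_{s>=1} w_s * I_{t-s}, where gi_pmf[s-1] is the weight for lag s.

    Returns None for days without a full GI window of history or with a zero denominator.
    """
    rt = []
    for t in range(len(cases)):
        # Weighted sum of past incidence over the generation interval.
        denom = sum(
            w * cases[t - s]
            for s, w in enumerate(gi_pmf, start=1)
            if t - s >= 0
        )
        rt.append(cases[t] / denom if t >= len(gi_pmf) and denom > 0 else None)
    return rt


def rt_at_infection_dates(case_draw, gi_pmf, infection_days, mean_delay):
    """Map each requested infection day to a case day by adding the mean delay."""
    rt = renewal_rt(case_draw, gi_pmf)
    return {day: rt[day + mean_delay] for day in infection_days}


# Toy draw: cases double daily and the whole GI mass sits at lag 1.
example = rt_at_infection_dates([1, 2, 4, 8, 16, 32], [1.0], [0, 1], mean_delay=2)
```

Note that the delay only enters through the index shift `day + mean_delay`, so every draw uses the same fixed delay.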

This has the obvious problem of ignoring the uncertainty induced by the delay convolution: shifting by a single mean delay treats every infection as if it were observed exactly $\delta$ days later. It's been a hard problem to solve because we don't have a latent process; we're just fitting to cases.
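To illustrate what the mean shift drops, here is a toy comparison between shifting an infection curve by the mean delay and convolving it with a full delay PMF. The infection series and the PMF are made up for illustration, and the helper names are hypothetical:

```python
def convolve_delay(infections, delay_pmf):
    """Expected cases C_t = sum_d p_d * I_{t-d}; delay_pmf[d] is P(delay = d days)."""
    n = len(infections)
    cases = [0.0] * n
    for t in range(n):
        for d, p in enumerate(delay_pmf):
            if t - d >= 0:
                cases[t] += p * infections[t - d]
    return cases


def shift_by_mean(infections, mean_delay):
    """Move every infection forward by exactly mean_delay days."""
    return [
        float(infections[t - mean_delay]) if t - mean_delay >= 0 else 0.0
        for t in range(len(infections))
    ]


infections = [0, 0, 100, 0, 0, 0, 0, 0]  # a single spike on day 2
delay_pmf = [0.0, 0.25, 0.5, 0.25]       # mean delay of 2 days

convolved = convolve_delay(infections, delay_pmf)  # spike smeared over days 3 to 5
shifted = shift_by_mean(infections, 2)             # spike moved intact to day 4
```

Both series have the same total and the same centre, but only the convolved one carries the spread of the delay distribution.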

Option 1: signal regression

mgcv has a "linear functional terms" feature that enables signal regression approaches of the form

$$g(\mu_i) = \ldots + \sum_j L_{ij} f(x_{ij}) + \ldots$$

where the sum of the $f(x_{ij})$, weighted by $L_{ij}$, is part of the linear predictor.

Hallelujah! Except the convolution happens on the scale of the linear predictor, not the mean response scale. And we're stuck.
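To spell out the mismatch, assume a log link and take $L_{ij}$ to be delay PMF weights (an assumption about how we would use the feature, not something mgcv prescribes). What we want and what the linear functional term gives are:

$$\text{wanted: } \mu_i = \sum_j L_{ij} \exp\{f(x_{ij})\}, \qquad \text{available: } \log \mu_i = \sum_j L_{ij} f(x_{ij}) \iff \mu_i = \prod_j \exp\{f(x_{ij})\}^{L_{ij}}$$

So the available form is a weighted geometric mean of $\exp\{f\}$ rather than a weighted arithmetic mean. If the weights are non-negative and sum to one, Jensen's inequality says the geometric version is never larger than the arithmetic one, and the two agree only when $f$ is constant over the delay window.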

One bad workaround would be to switch to an identity link, but that feels like an unpleasant road to walk down. Out of the "poor delay handling" frying pan and into the "negative cases" fire. It would also give a variance that is constant over time, rather than a function of the mean.

Option 2: Variance propagation a la density surface models

Inspired by the work here, we could implement their approach from section 3. In that work they have spatial data and some kind of "detection function" giving the probability of observation.

[Image: Screenshot 2024-11-25 at 4 00 44 PM]

They smuggle this in through a random effect via the Hessian, in a way I don't get:

[Image: Screenshot 2024-11-25 at 4 02 53 PM]

(from these slides)

As far as I can tell, this would involve trying to do something clever with a Taylor series and the gamma distribution of the delay? I'm a little worried that (1) we would lose something in the tail when the Taylor series approximation linearizes the convolution and (2) I don't fully understand the implementation. I think this is where the magic happens but it's a little scary.
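One hedged reading of the general idea, as a generic first-order (delta-method) sketch that may not match the paper's implementation. The notation is assumed, not taken from the thread: $\theta$ are the delay parameters (e.g. gamma shape and rate), $\hat\theta$ is their estimate with covariance $V_\theta$ (the inverse Hessian from the delay fit), and $\mu_t(\theta)$ is the expected case count given the delay:

$$\log \mu_t(\theta) \approx \log \mu_t(\hat\theta) + \kappa_t^\top (\theta - \hat\theta), \qquad \kappa_t = \left.\frac{\partial \log \mu_t(\theta)}{\partial \theta}\right|_{\theta = \hat\theta}, \qquad \theta - \hat\theta \sim \mathcal{N}(0, V_\theta)$$

The term $\kappa_t^\top (\theta - \hat\theta)$ then looks like a random effect with known design rows $\kappa_t$ and known covariance $V_\theta$, which is one way delay uncertainty could enter the linear predictor. The approximation is only as good as the linearization near $\hat\theta$, which is the tail concern in (1).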

Option 3: Put a tensor product smooth on the lag

Skip knowing the delay and instead throw more splines at the wall until you get something out. This feels dangerous and like a bad idea in our situation but see here because it's cool: https://ecogambler.netlify.app/blog/distributed-lags-mgcv/

More spaghetti to throw at the wall

Write a custom mgcv family that implements the convolution. Looks like we'd need a bunch of higher-order derivatives of a convolution.
