Skip to content

Async KV Store Persister #1470

Open
Listed in
Open
@TheBlueMatt

Description

@TheBlueMatt

Sensei ran into an issue where ldk-net-tokio (or some other async context) calls into ChannelManager, which handles some message and then goes to update the ChannelMonitor. Because Sensei is using the new KVStorePersister interface to simplify its persistence it ends up in a sync context in an tokio worker thread thread and then has to block on a future to talk to a (possibly remote) database. It then returns control to tokio by blocking on the future, but at the time we're still holding the channel and ChainMonitor locks, causing most other future execution to block.

While tokio should probably be willing to poll the IO driver in the block_on call from Sensei, even if it did it could still decide to run a different future which ends up blocking on the mutex already held and we're back to a deadlock.

Really if you're storing a monitor update via some async future you should use the TemporaryFailure return to do the monitor update asynchronously instead of blocking on IO with locks held, but we don't have good utilities for doing so today.

We need to (a) create a AsyncKVStorePersister (it has to either be runtime-specific because it needs the ability to spawn background tasks, as an alternative we could make the user provide a spawn method which we can call), and (b) adapt the background processor to using it for async manager/graph/score persistence.

Activity

added this to the 0.0.108 milestone on May 8, 2022
TheBlueMatt

TheBlueMatt commented on May 16, 2022

@TheBlueMatt
CollaboratorAuthor

Oops missed that this is really just a specific instance of #1367, though one that we can fix easily.

tnull

tnull commented on May 17, 2022

@tnull
Contributor

I'd be interested in picking this up soon to get some exposure to the Tokio/Persist side of things.

TheBlueMatt

TheBlueMatt commented on May 17, 2022

@TheBlueMatt
CollaboratorAuthor

I think this is going to snowball into about 3-4 PRs and a large cleanup of the ChannelMonitorUpdateErr stuff, so I'd prefer to tackle this one if that's okay. There's a few places where ChannelMonitorUpdateErr::TemporaryFailure requires a persistence before returning, which won't work with a fully-asyc persister, so that'll need updating first.

tnull

tnull commented on May 17, 2022

@tnull
Contributor

I think this is going to snowball into about 3-4 PRs and a large cleanup of the ChannelMonitorUpdateErr stuff, so I'd prefer to tackle this one if that's okay.

Alright, go ahead!

modified the milestones: 0.0.109, 0.0.110 on Jun 22, 2022
modified the milestones: 0.0.110, 0.0.111 on Jul 14, 2022
G8XSU

G8XSU commented on Aug 10, 2022

@G8XSU
Contributor

Doubt:
I am assuming async KVStore will return a future.
So basically we move out of persistence blocking but we don't proceed through critical section of any channel state change(open/send/or anything) until future is completed right ?

TheBlueMatt

TheBlueMatt commented on Aug 10, 2022

@TheBlueMatt
CollaboratorAuthor

Yep, basically. We actually already support this, but its a bit unsafe - monitor update functions can return Err(ChannelMonitorUpdateRes::TemporaryFailure) which puts a channel on ice and doesn't transmit channel state update functions until the monitor update completes. The intent here is to use that feature to implement the future, but we need to first make that feature a bit more robust.

TheBlueMatt

TheBlueMatt commented on Aug 20, 2022

@TheBlueMatt
CollaboratorAuthor

Once we do this, we should also update the docs on the updatestatus's in-progress state variant. #1106 (comment)

modified the milestones: 0.0.111, 0.1 on Aug 22, 2022
TheBlueMatt

TheBlueMatt commented on Aug 22, 2022

@TheBlueMatt
CollaboratorAuthor

This has turned into a very large project :(. #1678 is the first step, but this is basically the 0.1 milestone, at this point, so I'm just gonna tag it as such.

TheBlueMatt

TheBlueMatt commented on Oct 13, 2024

@TheBlueMatt
CollaboratorAuthor

While we intend to ship fixes for most known async persist issues in 0.1, it doesn't make sense to hold 0.1 for the async persistence keystore API as well. Instead we should target this in the following release.

modified the milestones: 0.1, 0.1.1 on Oct 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

Status

No status

Relationships

None yet

    Development

    No branches or pull requests

      Participants

      @tnull@TheBlueMatt@G8XSU@jkczyz

      Issue actions

        Async KV Store Persister · Issue #1470 · lightningdevkit/rust-lightning