@shrima-cf

These spans will be useful during latency investigations for Durable Objects

Previous context - #5631

@shrima-cf shrima-cf requested review from a team as code owners December 9, 2025 19:50
@shrima-cf shrima-cf force-pushed the shrima/STOR-3398-3 branch 2 times, most recently from e7436b0 to 71c15e3 Compare December 9, 2025 21:18

@justin-mp justin-mp left a comment

I think this is trending in the right direction.

Where do we capture the spans for SQLite operations?

  // before the gate is unlocked.
  Lock addRef() {
-   return Lock(*gate);
+   return Lock(*gate, nullptr);

Do we know how often this is called? It would be nice to preserve the span if we can.

  auto& context = IoContext::current();
  auto userSpan = context.makeUserTraceSpan("durable_object_storage_sync"_kjc);
- KJ_IF_SOME(p, cache->onNoPendingFlush()) {
+ KJ_IF_SOME(p, cache->onNoPendingFlush(context.getCurrentTraceSpan())) {

Why is this better to do than context.makeTraceSpan like we do in DurableObjectStorage::getCurrentBookmark()?

  void shutdown(kj::Maybe<const kj::Exception&> maybeException) override;
  kj::OneOf<CancelAlarmHandler, RunAlarmHandler> armAlarmHandler(
-     kj::Date scheduledTime, bool noCache = false, kj::StringPtr actorId = "") override;
+     kj::Date scheduledTime, bool noCache, kj::StringPtr actorId, SpanParent parentSpan) override;

You can keep the optional parameters by putting the parentSpan before them. Unless we always provide the optional parameters, it's probably easier to keep the previous interface.
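
A minimal sketch of the reordering suggested here, assuming stand-in types: `SpanParent`, `ArmResult`, and the `long` timestamp below are hypothetical placeholders so the example compiles on its own, not the real workerd declarations. The required `parentSpan` is placed before the defaulted parameters, so existing call sites that rely on the defaults only need to add one argument.

```cpp
#include <string>
#include <string_view>
#include <utility>

// Hypothetical stand-in for the real SpanParent type.
struct SpanParent {
  std::string name;
};

// Hypothetical result type, used only to make the arguments observable.
struct ArmResult {
  std::string parent;
  bool noCache;
  std::string actorId;
};

// parentSpan is required and sits before the optional parameters,
// so noCache and actorId can keep their default values.
ArmResult armAlarmHandler(long scheduledTime, SpanParent parentSpan,
                          bool noCache = false, std::string_view actorId = "") {
  (void)scheduledTime;  // unused in this sketch
  return ArmResult{std::move(parentSpan.name), noCache, std::string(actorId)};
}
```

With this ordering, a caller can write `armAlarmHandler(time, span)` and still get `noCache = false` and an empty `actorId`.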


// Trace span for the current commit operation. Captured from the first write
// that triggers a commit, used for the output gate lock hold trace.
SpanParent currentCommitSpan = nullptr;

What if the first write is allowUnconfirmed? At that point, we don't actually lock the output gate.

Comment on lines +287 to +291
// Reset the commit span after the commit completes
auto resetSpan = kj::defer([this]() {
  currentCommitSpan = nullptr;
  hasCommitSpan = false;
});

New commits can start before the previous commit has finished. The right thing to do is to kj::mv the currentCommitSpan into commitImpl and reset it immediately.
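
A minimal sketch of that move-and-reset pattern, using only the standard library. `SpanParent`, `InFlightCommit`, and `Committer` are hypothetical stand-ins, and `std::exchange` plays the role that `kj::mv` plus an immediate reset would play in the real code. Each in-flight commit owns its own span, so a write that arrives while a commit is still running starts a fresh capture instead of being wiped by a deferred reset.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in: an empty name means "no span captured".
struct SpanParent {
  std::string name;
};

// Hypothetical handle for a commit that has started but not yet finished.
struct InFlightCommit {
  SpanParent span;
};

class Committer {
 public:
  // Record the span of the first write in the pending batch.
  void noteWrite(SpanParent span) {
    if (currentCommitSpan.name.empty()) {
      currentCommitSpan = std::move(span);
    }
  }

  // Move the span into the commit and clear the member in one step,
  // instead of resetting it when the commit completes.
  InFlightCommit startCommit() {
    return commitImpl(std::exchange(currentCommitSpan, SpanParent{}));
  }

  const SpanParent& pending() const { return currentCommitSpan; }

 private:
  InFlightCommit commitImpl(SpanParent span) {
    return InFlightCommit{std::move(span)};
  }

  SpanParent currentCommitSpan;
};
```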

Comment on lines +549 to +553
// Capture trace span from the alarm handler for the commit batch.
if (!hasCommitSpan) {
  currentCommitSpan = kj::mv(deferredAlarmSpan);
  hasCommitSpan = true;
}

I don't quite understand why we're capturing the span here?

Comment on lines +641 to +645
// Capture trace span from the first write in this commit batch.
if (!hasCommitSpan) {
  currentCommitSpan = kj::mv(traceSpan);
  hasCommitSpan = true;
}

I think a better pattern would be to capture the currentCommitSpan on every write and then use whatever is the current capture when you actually do lockWhile. That way, if you have an allowUnconfirmed write, you won't capture that one, which won't actually wait for the output gate. (As an optimization, you could stop overwriting the currentCommitSpan once you send it to the lockWhile.)

Also, this really should be a method that we call rather than duplicating it in every method.
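
One possible reading of this suggestion, sketched with standard-library stand-ins. `SpanParent`, `CommitSpanTracker`, and its method names are hypothetical and not part of the PR. Every write path calls the same helper, which overwrites the captured span. The point that actually locks the output gate takes whatever span is current at that moment, and later writes stop overwriting until the next batch starts.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in: an empty name means "no span captured".
struct SpanParent {
  std::string name;
};

class CommitSpanTracker {
 public:
  // Single helper for all write paths, replacing the duplicated
  // "if (!hasCommitSpan)" blocks. The latest write wins until the
  // span has been handed to the gate lock.
  void captureCommitSpan(SpanParent span) {
    if (!sentToLock) {
      currentCommitSpan = std::move(span);
    }
  }

  // Called where the output gate is actually locked (the lockWhile
  // call in the real code). Returns the most recent capture.
  SpanParent takeSpanForLock() {
    sentToLock = true;
    return std::exchange(currentCommitSpan, SpanParent{});
  }

  // Called when a new batch begins, so captures resume.
  void commitStarted() { sentToLock = false; }

  const SpanParent& current() const { return currentCommitSpan; }

 private:
  SpanParent currentCommitSpan;
  bool sentToLock = false;
};
```

In this sketch an unconfirmed write followed by a confirmed write leaves the confirmed write's span in place when the lock is taken, which is the behavior the comment asks for.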

Comment on lines +229 to +231
// Trace span for the deferred alarm deletion, captured from armAlarmHandler and used when
// the alarm is actually deleted.
SpanParent deferredAlarmSpan = nullptr;

Why do we need this one separate from the currentCommitSpan?

Comment on lines +3219 to +3223
// Capture the first span for use at commit time
if (!hasCommitSpan) {
  commitSpan = kj::mv(traceSpan);
  hasCommitSpan = true;
}

See my comment in the sqlite file. I think the same thing applies here, at least when it comes to upgrading the output gate.
