Conversation

@pblazej
Contributor

@pblazej pblazej commented Nov 18, 2025

Resolves #844

  • Handles offerId (the symptom)
  • Fixes the double offer on ICE restart (the root cause)
  • Fixes a deadlock, probably introduced by the strict ordering in Fix serial runner cancellation #804 (that fix is still valid, though)
  • Handles the leaveAction param instead of the legacy canReconnect

@github-actions

github-actions bot commented Nov 18, 2025

⚠️ This PR does not contain any files in the .changes directory.

@pblazej pblazej mentioned this pull request Nov 19, 2025
@pblazej pblazej force-pushed the blaze/reconnect-errors branch 10 times, most recently from a68f1d4 to 11ee92a Compare November 25, 2025 12:38
}

if signalingState == .haveLocalOffer, iceRestart, let sd = remoteDescription {
_reNegotiate = false // Clear flag to prevent double offer
Contributor Author

@pblazej pblazej Nov 25, 2025

The JS implementation, around https://github.com/livekit/client-sdk-js/blob/65e0bfe38ba594bbfe73de60ce72dbc7b96be3a2/src/room/PCTransport.ts#L271:

if (this._pc && this._pc.signalingState === 'have-local-offer') {
    const currentSD = this._pc.remoteDescription;
    if (options?.iceRestart && currentSD) {
        // 1. Rollback: Sets remote description
        await this._pc.setRemoteDescription(currentSD);
        // 2. DOES NOT set this.renegotiate = true
        // 3. Falls through to create offer (line 291)
    } else {
        // 1. Defer: Sets renegotiate = true
        this.renegotiate = true;
        // 2. Returns immediately (skips offer creation)
        return;
    }
}

vs Swift

if signalingState == .haveLocalOffer {
    if !(iceRestart && remoteDescription != nil) {
        // 1. Defer: Sets _reNegotiate = true
        _reNegotiate = true
        // 2. Returns immediately
        return
    }
    
    // Else: ICE Restart path falls through...
}

// ... (offer ID increment) ...

if signalingState == .haveLocalOffer, iceRestart, let sd = remoteDescription {
    // 1. Force clear _reNegotiate (Matches JS not setting it)
    _reNegotiate = false  
    // 2. Rollback: Sets remote description
    try await set(remoteDescription: sd)
    // 3. Creates new offer immediately
    return try await _negotiateSequence()
}

@pblazej pblazej marked this pull request as ready for review November 25, 2025 13:40
try await startReconnect(reason: .websocket)
} catch {
log("Failed calling startReconnect, error: \(error)", .error)
Task {
Contributor Author

@pblazej pblazej Nov 25, 2025

SerialRunnerActor (inside SignalClient._delegate)
│
├─> [Task 1] didUpdateConnectionState
│   └─> await startReconnect()  ← blocking the actor
│       └─> waiting for offer...
│
└─> [Task 2] didReceiveOffer
    └─> Can't enter because actor is busy with Task 1
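The diagram above can be reproduced with a small sketch. Everything here is hypothetical and not taken from the SDK: `SerialRunner` stands in for a strictly ordered runner, `EventLog` only records what happened, and an `AsyncStream` plays the role of the incoming offer. If the first block awaited the offer inline, the second block (the only one that delivers it) could never start. Moving the wait into its own `Task`, as the diff does, lets the first block return so the second can run.

```swift
// Hypothetical stand-in for a strictly ordered runner: a block starts
// only after the previously submitted block has finished.
actor SerialRunner {
    private var previous: Task<Void, Never>?

    func run(_ block: @escaping @Sendable () async -> Void) async {
        let prev = previous
        let task = Task {
            _ = await prev?.value // strict ordering: wait for the earlier block
            await block()
        }
        previous = task
        await task.value
    }
}

// Hypothetical helper that records events in the order they occur.
actor EventLog {
    private(set) var events: [String] = []
    func add(_ event: String) { events.append(event) }
}

func demo() async -> [String] {
    let runner = SerialRunner()
    let log = EventLog()
    let (offers, offerInput) = AsyncStream<String>.makeStream()
    let (done, doneInput) = AsyncStream<Void>.makeStream()

    // [Task 1] Connection state update. The wait for the offer runs in a
    // separate Task, so this block returns immediately and frees the runner.
    await runner.run {
        await log.add("state update: reconnect started")
        Task {
            for await offer in offers {
                await log.add("reconnect finished with \(offer)")
                break
            }
            doneInput.finish()
        }
    }

    // [Task 2] Offer received. It can enter because Task 1 already returned.
    await runner.run {
        await log.add("offer received")
        offerInput.yield("offer")
    }

    // Wait until the detached reconnect has consumed the offer.
    for await _ in done {}
    return await log.events
}
```

Calling `await demo()` returns the three events in order: the state update, the offer, then the finished reconnect. Replacing the inner `Task { ... }` with an inline `for await` would leave the first `runner.run` call suspended forever, which is the situation shown in the diagram.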

@pblazej
Contributor Author

pblazej commented Nov 25, 2025

@hiroshihorie this ain't trivial, I'd appreciate it if you gave it a spin in ✈️ mode

@pblazej pblazej force-pushed the blaze/reconnect-errors branch from 6384397 to 199d70b Compare November 25, 2025 14:03
// force .full for next reconnect
_state.mutate { $0.nextReconnectMode = .full }
// Abort current connection attempt
await signalClient.cleanUp(withError: error)
Contributor Author

This is more or less equivalent to JS:

        case LeaveRequest_Action.RECONNECT:
          this.fullReconnectOnNext = true;
          // reconnect immediately instead of waiting for next attempt
          this.handleDisconnect(leaveReconnect);

Contributor

suggestion (if-minor): maybe this is a good moment to adopt the .action of the leave request as well here? (backwards compatible of course in case it's unset/0).

We don't actually need a full reconnect on every leave request. But if it makes more sense as a follow up that also sounds good.

Contributor Author

@pblazej pblazej Nov 28, 2025

Good point, moved to the action param here 👍

Re: backwards compatibility, I dropped the legacy canReconnect param, which mirrors JS; an unknown action is a no-op.
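For illustration, a minimal sketch of what dispatching on the action could look like. The type names, raw values, and the disconnect and resume branches are assumptions modeled on the JS `LeaveRequest_Action` enum, not code from this PR. Only the reconnect branch (force a full reconnect) and the no-op for unknown values come from the discussion above.

```swift
// Hypothetical mirror of the leave request's action values (raw values assumed).
enum LeaveAction: Int {
    case disconnect = 0
    case resume = 1
    case reconnect = 2
}

// Hypothetical description of what the client does next.
enum LeaveOutcome: Equatable {
    case disconnect
    case reconnect(forceFull: Bool)
    case ignore
}

func outcome(forRawAction raw: Int) -> LeaveOutcome {
    // Unknown actions are a no-op, matching the behavior described above.
    guard let action = LeaveAction(rawValue: raw) else { return .ignore }
    switch action {
    case .disconnect:
        return .disconnect // assumed: the server asked the client to leave
    case .resume:
        return .reconnect(forceFull: false) // assumed: a quick reconnect is enough
    case .reconnect:
        return .reconnect(forceFull: true) // force .full for the next reconnect
    }
}
```

Keeping the decision in a pure function like this makes the "not every leave request needs a full reconnect" rule easy to check without a live connection.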

Contributor

@lukasIO lukasIO left a comment

looks good to me in general!

Development

Successfully merging this pull request may close these issues.

reconnect error
