Feature: Support ICE Restart#48

Draft
KW-M wants to merge 34 commits into ThaUnknown:master from KW-M:master

Conversation

@KW-M KW-M commented Oct 29, 2024

What is the purpose of this pull request? (put an "X" next to item)

[ ] Documentation update
[ ] Bug fix
[X] New feature
[ ] Other, please explain:

Feat: Add support for automatic or manual ICE Restart
As @t-mullen explained: "When network conditions change or the current route is congested, it's possible to restart ICE and gather/select a new candidate pair. Apparently this happens very frequently with cellular networks or with certain types of NATs." (feross#579) This also allows automatic reconnection when either peer moves between Wi-Fi networks or cell towers.

  • All tests pass on macOS in Firefox, Brave, Safari & Chromium, and on Windows in Edge & Firefox, for full.js & index.js.
  • Works on my website when switching between Wi-Fi & mobile networks on Android (Chrome, Brave and Firefox) and iOS (Safari).

What changes did you make? (Give an overview)

Add two new configuration options:

  • iceRestartEnabled: false, "onFailure", or "onDisconnect" - whether and when automatic ICE restart is triggered (defaults to "onFailure" unless trickle is disabled).
  • iceFailureRecoveryTimeout: number - milliseconds to wait for an ICE restart to complete after the ICE connection state reaches "failed".

Also added JSDoc typing for the new Peer() configuration options to help with editor autocomplete.

Add the method

  • peer.restartIce() to manually trigger ICE Restart between peers.
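
As a rough illustration of how these options might be passed and defaulted, the sketch below mirrors the rule described above ("onFailure" unless trickle is disabled). `resolveIceRestartEnabled` is a hypothetical helper and not part of this PR, the fallback to `false` when trickle is off is an assumption, and the 10000 ms timeout is an arbitrary example value.

```javascript
// Hypothetical helper (not part of the PR): resolves the effective
// iceRestartEnabled value from a Peer options object.
const ALLOWED_VALUES = [false, 'onFailure', 'onDisconnect']

function resolveIceRestartEnabled (opts = {}) {
  if (opts.iceRestartEnabled !== undefined) {
    if (!ALLOWED_VALUES.includes(opts.iceRestartEnabled)) {
      throw new TypeError('iceRestartEnabled must be false, "onFailure" or "onDisconnect"')
    }
    return opts.iceRestartEnabled
  }
  // Assumption: with trickle disabled, automatic restart falls back to off.
  return opts.trickle === false ? false : 'onFailure'
}

// Example options object as it might be passed to new Peer(...).
const peerOptions = {
  initiator: true,
  iceRestartEnabled: 'onDisconnect',
  iceFailureRecoveryTimeout: 10000 // ms; arbitrary example value
}
```

With these options the peer would attempt a restart as soon as the ICE state becomes "disconnected"; calling peer.restartIce() remains available for manual triggering.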

Which issue (if any) does this pull request address?
feross#579

Is there anything you'd like reviewers to focus on?

  • I know this looks like a lot of commits, but the final changes are not that large; the count is inflated by earlier PRs.
  • The changes to full.js & lite.js are identical (the only difference is the JSDoc typedef comment).
  • I wanted to credit the original authors of the pull requests that attempted to implement ICE restarts back when simple-peer was maintained by feross, so I manually "merged" their PRs before further fleshing out the API & connection details. Those PRs added a lot of commits that are not reflected in the final changes.
    Based on: "Implemented ICE restart [Demonstration Only For Feedback]" (feross/simple-peer#715) and "(Attempt to) Implement ICE Restarts" (feross/simple-peer#771).

Thank you so much for maintaining SimplePeer! I really appreciate it.

feross and others added 30 commits June 1, 2020 20:32
lsp-format-buffer
…re and rewrite to combine the best parts of each approach.
…imeout option apply to manual ice restart as well.
…eout - This allows any calls to restartIce() after ice failure to continue uninterrupted.
…g candidates after ice reconnection.

add 'reconnect' event to better signal this event.
rename _iceComplete flag to _iceGatheringComplete to better reflect its purpose.
KW-M added 2 commits October 26, 2024 17:11
@8749236

8749236 commented Nov 12, 2024

Just noticed this when I was checking if simple-peer has recovery strategy.

The intent is nice; however, IMHO this only benefits quick work. The recovery strategy often needs to be customized depending on the scenario: some apps may need to prompt the user, some may need exponential backoff, and some may need to recover in a way that is not anticipated.

I think adding a task (promise) that resolves when connected, or rejects when the connection attempt fails, is better and makes the code simpler.

Then the user can await it and implement whatever strategy they want: either call restartIce with a fixed delay (which is still quick and easy, and can go in the README), or use their own strategy.

Something like:

var peer = new SimplePeer();
// Wire events
do {
    var connected = await peer.connected.then(function() {
        // do stuffs
        return true;
    }).catch(error => {
        console.warn("Something went wrong, reconnecting", error);
        peer.restartIce(); // Creates new "connected" promise and restart
        return false;
    });
} while(!connected);
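
The `peer.connected` promise above is a proposal, not an existing property. A minimal sketch of how a similar one-shot promise could be derived from events today is shown below. It assumes the peer exposes `once()` and `off()` like a Node.js EventEmitter and emits 'connect', 'error' and 'close'; `whenConnected` is a hypothetical helper name.

```javascript
// Sketch: derive a one-shot "connected" promise from a peer's events.
// Assumes `peer` behaves like a Node.js EventEmitter (once/off) and emits
// 'connect', 'error' and 'close'.
function whenConnected (peer) {
  return new Promise((resolve, reject) => {
    // Remove all three listeners as soon as any one of them fires.
    const cleanup = () => {
      peer.off('connect', onConnect)
      peer.off('error', onError)
      peer.off('close', onClose)
    }
    const onConnect = () => { cleanup(); resolve(true) }
    const onError = (err) => { cleanup(); reject(err) }
    const onClose = () => { cleanup(); reject(new Error('peer closed before connecting')) }

    peer.once('connect', onConnect)
    peer.once('error', onError)
    peer.once('close', onClose)
  })
}
```

Because the promise settles only once, a retry loop would call `whenConnected(peer)` again after each restart attempt rather than reusing the old promise.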

User can also easily do something more sophisticated like:

var attempts = 0;
var connected = false;
var peer = new SimplePeer();
// Wire events
do {
    try {
        await peer.connected;
        connected = true; // exit the loop once the promise resolves
    }
    catch (error) {
        var delay = calculate_backoff(++attempts);
        if(attempts % 3 == 0) {
            switch(await prompt_user()) {
                case "troubleshoot":
                    await run_troubleshooter();
                    await upload_diagnostics();
                    delay = 0;
                    break;
                case "alternative":
                    start_legacy_voip();
                    return; // pretend this is inside a function
            }
        }
        await sleep(delay);
        peer.restartIce();
    }
} while(!connected);
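
The loop above relies on `calculate_backoff` and `sleep`, which are left undefined. One possible sketch of those two helpers follows; the 1-second base and 30-second cap are assumptions chosen for illustration, not values from the discussion.

```javascript
// Hypothetical helpers for the retry loop; base and cap are assumed values.
function calculate_backoff (attempt, baseMs = 1000, maxMs = 30000) {
  // attempt is 1-based: 1s, 2s, 4s, ... doubled each time and capped at maxMs.
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1))
}

// Resolves after roughly `ms` milliseconds.
function sleep (ms) {
  return new Promise(resolve => setTimeout(resolve, ms))
}
```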

@8749236

8749236 commented Nov 12, 2024

Instead of a promise, an alternative approach that is consistent with the existing sample code is to add an event like recover and let the user handle it, similar to signal.

The simplest implementation will be:

var peer = new SimplePeer();
// wire events and stuffs
peer.on("recover", function() {
    // Or simply this.restart(): since simple-peer insulates the ICE exchange
    // from the user, the user probably doesn't know what ICE is.
    this.restartIce();
});

My point is that if a recovery mechanism is built in, people will later ask for different flavors of it to be implemented, regardless of whether other users need them.

If it doesn't exist, then users still have to build their own recovery mechanism.

@KW-M
Author

KW-M commented Nov 12, 2024

Thanks for the feedback!

At the moment, this PR supports custom reconnection schemes like this:

  1. Disable the automatic ICE restart option: iceRestartEnabled: false
  2. Watch the event peer.on('iceStateChange', (iceConnectionState, iceGatheringState) => { })
    • This exposes the same states as the native browser oniceconnectionstatechange and onicegatheringstatechange events for the p2p network pair. You would probably check for an iceConnectionState of "disconnected" or "failed" and then perform an ICE restart or whatever else you want (see the MDN docs).
  3. Call peer.restartIce() when desired.
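
The three steps above can be sketched roughly as follows. The 'iceStateChange' event and `restartIce()` are the ones described in this PR; `shouldRestartIce` and `wireManualRestart` are hypothetical helper names, and restarting immediately on both states is just one possible policy.

```javascript
// States on which this sketch attempts a manual restart (step 2).
const RESTART_STATES = new Set(['disconnected', 'failed'])

function shouldRestartIce (iceConnectionState) {
  return RESTART_STATES.has(iceConnectionState)
}

// Hypothetical wiring helper: `peer` is expected to have been created with
// iceRestartEnabled: false (step 1) and to expose on() and restartIce().
function wireManualRestart (peer) {
  peer.on('iceStateChange', (iceConnectionState, iceGatheringState) => {
    if (shouldRestartIce(iceConnectionState)) {
      peer.restartIce() // step 3
    }
  })
}
```

A custom scheme would replace the body of the handler, for example by adding a delay, a retry counter, or a user prompt before calling `restartIce()`.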

Steps 1 & 2 could be better explained as I don't think the native events exposed by SimplePeer are well documented.
For step 1, I was thinking of adding another value like "manual" that would still wait for the iceFailureRecoveryTimeout after the "failed" ice state is reached without actually doing any ice reconnection. Would that be sufficient for your use cases?

My thought was that this is flexible enough to implement any desired reconnection scheme and fits better with SimplePeer's event-based model. If I add a "recover" event, when to fire it would be somewhat subjective (is it better to fire it on "disconnected", on "failed", or on something else?). In my experience with current browsers, "disconnected" is usually better if signaling is not a bottleneck, but it still takes longer than necessary to detect the problem. Watching for multiple dropped video frames or packets might be a better way, but that relies on users of SimplePeer sending lots of data over the wire.

@8749236

8749236 commented Nov 12, 2024

I'm okay with the manual option as long as it is easy to understand and well documented in README.md.

I would like to elaborate on the recover event further.

My motivation behind the recover event is that, from simple-peer's perspective, we know when the user calls the destroy method.

If the user calls that method => the intention is to disconnect.
If the user never calls that method => the intention is to remain connected.
=> Thus I believe a recover event is more suitable.

That's my two cents.

As for the part you mentioned: I understand the ICE exchange, but users do not necessarily, since the SimplePeer lib is supposed to be simple (pun intended).

The lib hides all the details except trickle ICE candidates, since that is kind of an edge case that requires the user to know what is going on underneath.

Also, from the pseudocode above, I believe recover is the simplest solution available for users. The user only needs a simple mental model: if the recover event occurs, I need to reconnect by calling restart or by using a DIY custom strategy.

Then you can put reconnect logic with a 5-second delay in the sample code in README.md.

peer.on("recover", function() {
    setTimeout(() => this.restartIce(), 5000);
});

Most users will start by copy-pasting sample code.

Also, if a user requires an even more sophisticated recovery strategy that reacts to each native event, they will already have detailed knowledge of WebRTC and will write their own WebRTC wrapper.

@KW-M
Author

KW-M commented Nov 12, 2024

I get the idea of having a simple "recover" event. I'll think about how to add it.

The easiest thing would be to trigger the "recover" event when the iceConnectionState becomes "failed"; however, it often takes up to 30 seconds to reach the "failed" state.

I could trigger the "recover" event when the iceConnectionState becomes "disconnected"; however, the connection may quickly become "connected" again on its own without calling restartIce(), which might surprise users.

Which do you think is better?

@8749236

8749236 commented Nov 12, 2024

My opinion is to go with the 80/20 rule: firing recover at the failed state will work for the majority of use cases.

Also, the WebRTC spec has this:

Performing an ICE restart is recommended when iceConnectionState transitions to "failed". An application may additionally choose to listen for the iceConnectionState transition to "disconnected" and then use other sources of information (such as using getStats to measure if the number of bytes sent or received over the next couple of seconds increases) to determine whether an ICE restart is advisable.

If I understand correctly, the recommended point to perform a restart or recovery is the failed state. Reconnecting at the disconnected state is meant for a more sophisticated recovery strategy that may or may not decide to reconnect.
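
The getStats-based check that the spec mentions for the "disconnected" state could be sketched as below. This is only an illustration: the helper names are hypothetical, summing the `bytesSent` and `bytesReceived` counters of all "candidate-pair" entries is one possible way to measure traffic, and the decision rule (no increase means a restart looks advisable) is an assumption.

```javascript
// Sums bytesSent + bytesReceived over all "candidate-pair" entries of a
// stats report (an RTCStatsReport or any Map-like object with values()).
function totalPairBytes (report) {
  let total = 0
  for (const stat of report.values()) {
    if (stat.type === 'candidate-pair') {
      total += (stat.bytesSent || 0) + (stat.bytesReceived || 0)
    }
  }
  return total
}

// Returns true when no bytes moved between two snapshots, i.e. an ICE
// restart looks advisable after a "disconnected" transition.
function restartLooksAdvisable (before, after) {
  return totalPairBytes(after) <= totalPairBytes(before)
}
```

In a browser, the two snapshots would come from calling getStats() on the underlying RTCPeerConnection a couple of seconds apart.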

The spec also mentions this for the failed state:

The "failed" and "completed" states require an indication that there are no additional remote candidates. This can be indicated by calling addIceCandidate with a candidate value whose candidate property is set to an empty string or by canTrickleIceCandidates being set to false.

The long delay before entering the failed state that you observed might be related to this.

If you have a setup that reproduces it, you can test by calling addIceCandidate with the input mentioned in the spec to signal that there are no more candidates, and see whether that speeds up the transition (it might be browser dependent).

@KW-M KW-M marked this pull request as draft December 23, 2024 19:56
@evoactivity

@KW-M Is there much more to do on this before merging? I'm looking to add this to my app but would prefer not to have to depend on the PR in my package.json.

@KW-M
Author

KW-M commented Mar 14, 2025

@evoactivity Not much; it mostly needs a code tidy. There was another PR for TypeScript support that I was hoping would be merged before this one, but now both the other PR's author and I are busy with other things...

Currently all unit tests pass, but I haven't had the time to test it thoroughly with my application. If you can test it in your use case I'd appreciate it!

The main issue is that if the signaling server connection loses messages when the network is interrupted (and a network interruption is usually when an ICE restart should occur), the restart won't work, because the other peer won't receive all the required signaling messages. Nothing in this PR fixes an unreliable signaling connection, so you'll need to make sure your signaling mechanism is robust for this to work.

The promise API is supposed to let you account for that and only trigger an ICE restart when signaling is available again, but it needs some external mechanism to ensure the signaling messages get through.
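
One possible shape for such an external mechanism is a small outbox that buffers signaling messages while the channel is down and flushes them in order once it is back. The sketch below is an assumption for illustration only: `SignalOutbox` is a hypothetical class, `send` stands for whatever function the application uses to deliver a message, and messages already lost in flight would still need acknowledgements on top of this.

```javascript
// Hypothetical outbox: queues signaling messages while the channel is
// offline and sends them in order when it comes back online.
class SignalOutbox {
  constructor (send) {
    this.send = send // application-provided delivery function
    this.online = false
    this.queue = []
  }

  // Send immediately when online, otherwise keep the message for later.
  push (message) {
    if (this.online) this.send(message)
    else this.queue.push(message)
  }

  // Call with true/false as the signaling channel connects or drops.
  setOnline (online) {
    this.online = online
    while (this.online && this.queue.length > 0) {
      this.send(this.queue.shift())
    }
  }
}
```

The peer's 'signal' events would be routed through `outbox.push(data)`, so that offers and candidates produced by an ICE restart during an outage are delivered once signaling recovers.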

@KW-M
Author

KW-M commented Mar 14, 2025

I should say that the full.js file has the latest changes, and I haven't finished copying those changes over to lite.js and index.js.

6 participants