Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(rust): add timer to close secure channel #5200

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

Alef-gabriel
Copy link
Contributor

Current behavior

Currently, when to establish a secure channel, two workers are created: decrypt and encrypt. However, when the initiator shuts down its connection without explicitly removing the workers, they remain active on the responder side. This can lead to issues such as performance degradation or memory consumption over time.

Proposed changes

A timer to shut down the workers when a message is not received

@Alef-gabriel
Copy link
Contributor Author

@davide-baldo

@davide-baldo davide-baldo self-requested a review June 29, 2023 09:55
@davide-baldo
Copy link
Member

I like it! The direction is excellent!

The only note I have is that using a boolean is somewhat fragile because you don't have any guarantee about when a task is actually run. For example, if some piece of the code misbehaves and blocks an asynchronous thread, every task that is using that thread will be suspended until the piece of code returns. In this case, the secure channel may be idle for the duration of the suspension (which could be minutes) and not detect it.

The most common solution would be to store the current timestamp (less performant) instead of storing the boolean. But being a simple timeout, even if we are not super precise, I think it's ok, the goal is to clean up resources, and it's doing that.

@Alef-gabriel
Copy link
Contributor Author

Alef-gabriel commented Jun 30, 2023

Hi @davide-baldo !
I'm not sure if it's suitable, but the security channel takes twice as long to close with timeout.

@Alef-gabriel Alef-gabriel requested a review from davide-baldo July 1, 2023 11:10
@Alef-gabriel Alef-gabriel marked this pull request as ready for review July 4, 2023 12:49
@Alef-gabriel Alef-gabriel requested a review from a team as a code owner July 4, 2023 12:49
@etorreborre etorreborre force-pushed the Alef-gabriel/timer-to-close-secure-channel branch from 8848ca3 to 0c9d3b7 Compare July 10, 2023 13:01
since this is just a boolean, not the timeout itself
Also, moved the code resetting the boolean to a better place
Copy link
Member

@SanjoDeundiak SanjoDeundiak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice approach. Left few comments. Also, could you also please clone the idle flag to the Encryptor worker? Because currently using the encryption side of the Secure Channel doesn't reset the flag, therefore if we only use one-side communtication, the Secure Channel will be shut down soon :)

@@ -1106,7 +1106,7 @@ async fn should_stop_encryptor__and__decryptor__in__secure_channel(
let local_info = IdentitySecureChannelLocalInfo::find_info(msg.local_message())?;
assert_eq!(local_info.their_identity_id(), alice.identifier());
assert_eq!("Hello, Bob!", msg.body());

dbg!(ctx.list_workers().await?);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it makes sense to remove dbg! after the test is passing?

let bob = identities_creation.create_identity().await?;
let alice = identities_creation.create_identity().await?;

let bob_trust_policy = TrustIdentifierPolicy::new(alice.identifier());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not a needed detail for that test. Might just use SecureChannelListenerOptions::new(). But it's also OK as it is


let msg = child_ctx.receive::<String>().await?;

let local_info = IdentitySecureChannelLocalInfo::find_info(msg.local_message())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these 2 lines are not needed here


let mut additional_workers = ctx.list_workers().await?;
additional_workers.retain(|w| !initial_workers.contains(w));
let works_count = additional_workers.len();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we please, instead of relying on the number of workers, instead check if the specific encryptors and decryptors are on the list of workers? I think there are potentially a lot of workers on the system that can start/stop even though we don't do anything, which can make this test incorrect in the future.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is also potentially the reason why should_stop_encryptor__and__decryptor__per__timeout is failing on the CI

// start a background thread to observe the connection activity
// If a message is received before maximum_idle_time has passed then the connection is
// active. Otherwise, we close the secure channel
spawn(async move {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will shut down the channel in the range of [maximum_idle_time, 2 * maximum_idle_time) from the last message. Not quite precise, but it should be fine. Could you please mention that detail in the doc-comment?
Also, mentioning that Secure Channels are shut down due to inactivity is worth mentioning somewhere in the doc-comments of the public API. Could you please add that?

self_clone
.stop_secure_channel(&ctx_clone, &addr)
.await
.unwrap();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's please handle the error gracefully, we generally want to avoid panics

@@ -99,6 +98,7 @@ impl Worker for IdentityChannelListener {
self.options.trust_context.clone(),
None,
None,
None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason that we don't use idle flag here?

@SanjoDeundiak
Copy link
Member

Hey @Alef-gabriel , do you plan to get back to this PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants