Merged
12 changes: 12 additions & 0 deletions src/main/java/hudson/plugins/ec2/EC2RetentionStrategy.java
@@ -103,6 +103,7 @@ public long check(EC2Computer c) {
        long currentTime = this.clock.millis();

        if (currentTime > nextCheckAfter) {
            attemptReconnectIfOffline(c);
Contributor Author

So after rebasing this PR to master, I discovered that the offline check does not run if the EC2 agent is set to never be terminated, i.e. idleTerminationMinutes=0.
I moved the call here to guarantee that the offline check always runs.

Member

> after rebasing this PR to master

Please just use git pull inside a PR branch and avoid force-pushing, which breaks incremental review in most cases. (Unlikely to matter for a PR with such a short diff, but a good habit.)

Contributor Author

Bad choice of words on my part. I meant after I clicked the update PR button. But yes, noted.

Member

Do we want to do this even if DISABLED, in which case AFAICT the other behaviors of the strategy are a no-op?

Contributor Author

I can see it both ways. My opinion is that this is more about handling network instability than about what to do with a node when it is idle.
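
To make the placement argument in this thread concrete, here is a minimal, self-contained sketch. It is not the plugin's code: `RetentionSketch`, `FakeComputer`, and the `reconnectFirst` flag are hypothetical names invented for illustration, and the early return in `internalCheck` for `idleTerminationMinutes == 0` is an assumption based on the behavior described above. It only shows why a reconnect attempt placed behind that early return would never run for a never-terminated agent, while one placed in `check` before `internalCheck` would.

```java
// Illustrative sketch only; none of these types are the real ec2-plugin classes.
class RetentionSketch {
    // Hypothetical stand-in for EC2Computer; models only the calls used in the diff.
    static class FakeComputer {
        boolean offline;
        boolean connecting;
        int connectCalls;

        boolean isOffline() { return offline; }
        boolean isConnecting() { return connecting; }

        // Same shape as connect(false) in the diff; here it only counts calls.
        void connect(boolean forceReconnect) {
            connectCalls++;
            connecting = true;
        }
    }

    static final long CHECK_INTERVAL_MINUTES = 1;
    private final int idleTerminationMinutes;

    RetentionSketch(int idleTerminationMinutes) {
        this.idleTerminationMinutes = idleTerminationMinutes;
    }

    // Reconnect only when the agent is offline and no connection attempt is in flight.
    void attemptReconnectIfOffline(FakeComputer computer) {
        if (computer.isOffline() && !computer.isConnecting()) {
            computer.connect(false);
        }
    }

    // ASSUMPTION: the idle logic returns early when the agent is never terminated.
    long internalCheck(FakeComputer computer, boolean reconnectInside) {
        if (idleTerminationMinutes == 0) {
            return CHECK_INTERVAL_MINUTES;
        }
        if (reconnectInside) {
            attemptReconnectIfOffline(computer);
        }
        return CHECK_INTERVAL_MINUTES;
    }

    // reconnectFirst == true models the placement chosen in this PR.
    long check(FakeComputer computer, boolean reconnectFirst) {
        if (reconnectFirst) {
            attemptReconnectIfOffline(computer);
        }
        return internalCheck(computer, !reconnectFirst);
    }

    public static void main(String[] args) {
        FakeComputer inside = new FakeComputer();
        inside.offline = true;
        new RetentionSketch(0).check(inside, false);

        FakeComputer first = new FakeComputer();
        first.offline = true;
        new RetentionSketch(0).check(first, true);

        System.out.println("reconnect inside internalCheck: " + inside.connectCalls + " connect call(s)");
        System.out.println("reconnect before internalCheck: " + first.connectCalls + " connect call(s)");
    }
}
```

Under these assumptions, the agent with `idleTerminationMinutes=0` gets a reconnect attempt only in the second arrangement. The sketch does not settle the DISABLED question raised above; it only illustrates the early-return issue.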

            long intervalMins = internalCheck(c);
            nextCheckAfter = currentTime + TimeUnit.MINUTES.toMillis(intervalMins);
            return intervalMins;
@@ -248,6 +249,17 @@ private long internalCheck(EC2Computer computer) {
        return CHECK_INTERVAL_MINUTES;
    }

    private void attemptReconnectIfOffline(EC2Computer computer) {
        if (computer.isOffline()) {
            LOGGER.warning("EC2Computer " + computer.getName() + " is offline");
            if (!computer.isConnecting()) {
                // Keep retrying connection to agent until the job times out
                LOGGER.warning("Attempting to reconnect EC2Computer " + computer.getName());
                computer.connect(false);
            }
        }
    }

    /*
     * Checks if there are any items in the queue that are waiting for this node explicitly.
     * This prevents a node from being taken offline while there are Ivy/Maven Modules waiting to build.