feat(script): Resume of container for the Docker task runner #11964
base: develop
logger.trace("Container created: {}", exec.getId());
// evaluate resume (task property overrides plugin configuration if set)
Boolean resumeProp = runContext.render(this.resume).as(Boolean.class).orElse(null);
Optional<Boolean> resumeConfig = runContext.pluginConfiguration(RESUME_ENABLED_CONFIG);
We didn't offer a plugin configuration for resume because it's a task property, so it can already be enabled globally via task defaults.
We never offer both a plugin config and a task property.
Can you remove the ability to configure it via plugin configuration?
if (logger.isDebugEnabled()) {
    logger.debug("Resuming existing container {}", containerId);
}
Suggested change:
- if (logger.isDebugEnabled()) {
-     logger.debug("Resuming existing container {}", containerId);
- }
+ logger.info("Resuming existing container {}", containerId);
Await.until(ended::get);

if (exitCode != 0) {
    if (needVolume && FileHandlingStrategy.VOLUME.equals(strategy) && filesVolumeName != null) {
Why did you remove the check on filesVolumeName?
if (needVolume && FileHandlingStrategy.VOLUME.equals(strategy) && filesVolumeName != null) {
downloadOutputFiles(exec.getId(), dockerClient, runContext, taskCommands);
if (needVolume && FileHandlingStrategy.VOLUME.equals(strategy) && filesVolumeName != null) {
// For newly created containers, original condition holds; for resumed ones, filesVolumeName is null
In this case, for a resumed container, we would never arrive here.
I think you should set filesVolumeName by inspecting the resumed container; that should be doable.
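As a rough illustration of that idea, here is a minimal, self-contained sketch of recovering a volume name from a resumed container's mount list. Everything in it is an assumption for illustration: `MountInfo` is a hypothetical, simplified stand-in for the mount entries in a container inspect result, `findFilesVolumeName` is a made-up helper, and matching on the working directory is only one possible way to recognize the files volume. It is not the PR's implementation.

```java
import java.util.List;
import java.util.Optional;

public class ResumedVolumeLookup {
    /** Hypothetical, simplified view of one mount entry from a container inspect result. */
    record MountInfo(String name, String destination) {}

    /**
     * Returns the name of the named volume mounted at workingDir, if any.
     * Entries with a null or blank name (assumed here to be bind mounts) are skipped.
     */
    static Optional<String> findFilesVolumeName(List<MountInfo> mounts, String workingDir) {
        return mounts.stream()
            .filter(m -> m.name() != null && !m.name().isBlank())
            .filter(m -> workingDir.equals(m.destination()))
            .map(MountInfo::name)
            .findFirst();
    }

    public static void main(String[] args) {
        // Made-up mount data: one unnamed bind mount and one named volume.
        List<MountInfo> mounts = List.of(
            new MountInfo(null, "/var/run/docker.sock"),
            new MountInfo("files-volume-123", "/tmp/wd")
        );
        System.out.println(findFilesVolumeName(mounts, "/tmp/wd").orElse("<none>"));
    }
}
```

In the runner itself, the mount list would presumably come from inspecting the resumed container through the Docker client, and the recovered name could then be assigned to filesVolumeName so that the existing `filesVolumeName != null` check keeps working for resumed containers.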
"echo \"::{\\\"outputs\\\":{\\\"msg\\\":\\\"Token\\\"}}::\" && sleep 1"
)));

var first = dockerCreate.run(runContext, createCommands, Collections.emptyList());
Run will end the container, so I'm not sure the test does what you think it does.
For testing, a single test that checks that resuming works would be enough. The idea is to start the task runner in a thread, interrupt the thread once the container has started, start another one, and check that the same container is used.
@Test
void killAfterResume() throws Exception {
    var runContext = runContext(this.runContextFactory);
    var commands = initScriptCommands(runContext);
    Mockito.when(commands.getCommands()).thenReturn(Property.ofValue(
        ScriptService.scriptCommands(List.of("/bin/sh", "-c"), Collections.emptyList(), List.of("sleep 50"))
    ));
    var taskRunner = ((Kubernetes) taskRunner()).toBuilder().delete(Property.ofValue(false)).resume(Property.ofValue(true)).build();
    Thread initialPodThread = new Thread(throwRunnable(() -> taskRunner.run(runContext, commands, Collections.emptyList())));
    initialPodThread.start();
    try (var client = PodService.client(runContext, taskRunner.getConfig())) {
        String labelSelector = "kestra.io/taskrun-id=" + ((Map<String, Object>) runContext.getVariables().get("taskrun")).get("id");
        Await.until(() -> {
            PodList existingPods;
            try {
                existingPods = client.pods().inNamespace(runContext.render(taskRunner.getNamespace()).as(String.class).orElseThrow()).list(new ListOptionsBuilder().withLabelSelector(labelSelector).build());
            } catch (IllegalVariableEvaluationException e) {
                throw new RuntimeException(e);
            }
            return !existingPods.getItems().isEmpty();
        });
        initialPodThread.interrupt();
        Map<String, Object> taskRunProps = new HashMap<>((Map<String, Object>) runContext.getVariables().get("taskrun"));
        RunContext anotherRunContext = runContext(this.runContextFactory, Map.of("taskrun", taskRunProps));
        var anotherTaskRunner = ((Kubernetes) taskRunner()).toBuilder().delete(Property.ofValue(false)).resume(Property.ofValue(true)).build();
        List<LogEntry> logs = new CopyOnWriteArrayList<>();
        Flux<LogEntry> receive = TestsUtils.receive(logQueue, (logEntry) -> {
            logs.add(logEntry.getLeft());
        });
        Thread resumePodThread = new Thread(throwRunnable(() -> anotherTaskRunner.run(anotherRunContext, commands, Collections.emptyList())));
        resumePodThread.start();
        TestsUtils.awaitLog(logs, logEntry -> logEntry.getMessage().contains("resumed from an already running pod"));
        receive.blockLast();
        anotherTaskRunner.kill();
        resumePodThread.interrupt();
        try {
            PodList existingPods = client.pods().inNamespace(runContext.render(taskRunner.getNamespace()).as(String.class).orElseThrow()).list(new ListOptionsBuilder().withLabelSelector(labelSelector).build());
            assertThat(existingPods.getItems().isEmpty(), is(true));
        } catch (IllegalVariableEvaluationException e) {
            throw new RuntimeException(e);
        }
    }
}
Thanks for the review, I will check and fix tonight 🙂
What changes are being made and why?
closes #4129
How the changes have been QAed?
Steps to reproduce:
Alternative with devcontainers:
./gradlew runLocal
docker stop <devcontainerid>
./gradlew runLocal