Description
Hey,
I upgraded to the recently released v0.8.0 to use the new kubernetes-novolume mode.
While it's working fine overall, I noticed that each Step in a Job is delayed by ~30s because of a hash mismatch in the copied files.
Here are some example logs from the Initialize containers step:
##[debug]Evaluating condition for step: 'Initialize containers'
##[debug]Evaluating: success()
##[debug]Evaluating success:
##[debug]=> true
##[debug]Result: true
##[debug]Starting: Initialize containers
##[debug]Register post job cleanup for stopping/deleting containers.
Run '/home/runner/k8s-novolume/index.js'
##[debug]/home/runner/externals/node20/bin/node /home/runner/k8s-novolume/index.js
##[debug]Job pod created, waiting for it to come online <workflow-pod-name>
##[debug]Copying /home/runner/_work to pod <workflow-pod-name> at /__w
(node:98) [DEP0005] DeprecationWarning: Buffer() is deprecated due to security and usability issues. Please use the Buffer.alloc(), Buffer.allocUnsafe(), or Buffer.from() methods instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]internalExecOutput response: {"metadata":{},"status":"Success"}
##[debug]The hash of the directory does not match the expected value; want='742f6770882c57760c85f1c1fd1d8f781d52b04751482098954181e9d1cd8e35' got='a7c551c3c067391a05bd1cd314c0144eb04bd3f5901f7cac48598e167b77acc7'
##[debug]Job pod is ready for traffic
##[debug]execPodStep response: {"metadata":{},"status":"Failure","message":"command terminated with non-zero exit code: command terminated with exit code 1","reason":"NonZeroExitCode","details":{"causes":[{"reason":"ExitCode","message":"1"}]}}
##[debug]{"message":"command terminated with non-zero exit code: command terminated with exit code 1","details":{"causes":[{"reason":"ExitCode","message":"1"}]}}
##[debug]Setting isAlpine to false
##[debug]Finishing: Initialize containers
I searched through the code and found a retry loop of 15 attempts with a 1s delay between each that tries to get a matching hash.
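For context, here is a rough sketch in shell of how I understand that check to work. This is not the hook's actual code, and I'm only assuming the hash is a SHA-256 over the file listing based on the 64-character hex digests in the logs:

# Sketch only, not the hook's actual TypeScript: recompute the hash of the
# copied directory in the workflow Pod up to 15 times, sleeping 1s between
# attempts, until it matches the hash computed locally on the runner.
# Assumption: the hash is a SHA-256 over the stat listing.
local_hash=$(cd /home/runner/_work && \
  find . -not -path '*/_runner_hook_responses*' -exec stat -c '%b %n' {} \; | sha256sum)
attempts=0
while [ "$attempts" -lt 15 ]; do
  remote_hash=$(kubectl exec <workflow-pod-name> -- /bin/sh -c \
    "cd /__w && find . -not -path '*/_runner_hook_responses*' -exec stat -c '%b %n' {} \; | sha256sum")
  [ "$remote_hash" = "$local_hash" ] && break
  attempts=$((attempts + 1))
  sleep 1
done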
To debug this, I manually ran the command used to list the files in each Pod and compared the output:
/bin/sh -c "find . -not -path '*/_runner_hook_responses*' -exec stat -c '%b %n' {} \\;"
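For reference, this is roughly how I compared the two sides (the pod name and the output file paths are just placeholders):

# On the runner Pod (source of the copy):
cd /home/runner/_work
find . -not -path '*/_runner_hook_responses*' -exec stat -c '%b %n' {} \; | sort > /tmp/runner.list

# On the workflow Pod (target of the copy), via kubectl exec:
kubectl exec <workflow-pod-name> -- /bin/sh -c \
  "cd /__w && find . -not -path '*/_runner_hook_responses*' -exec stat -c '%b %n' {} \;" | sort > /tmp/workflow.list

# Any difference here explains the hash mismatch:
diff /tmp/runner.list /tmp/workflow.list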
What I noticed is that, each time, a file like ./_temp/b584baa0-b98c-11f0-aca5-611cac01a403.sh was present on the runner Pod with content like this:
#!/bin/sh -l
set -e
rm "$0" # remove script after running
<calling-some-bash-script>
So it looks like there is a temporary shell script that immediately deletes itself after running and ends up included in the local hash but not in the exec hash.
I was able to reproduce this multiple times in a row, and it happens on every copy, so it doesn't seem to be a race condition.
Is there a misconfiguration on my end that causes this, or is it a bug?
It's more of an inconvenience than a major issue, but I would appreciate some assistance here.
Thanks!