Multi-job support (using job start/end hooks), optional CloudWatch logging, demo workflows, module/repo rename #2
ryan-williams wants to merge 0 commit into main from
jder left a comment:
Awesome stuff, this seems great. I didn't read the bash script super carefully but generally LGTM. Thanks!
🐑 maybe next time we're in here, it would be nice to swap to a snapshot testing library rather than the many individual asserts on template outputs
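To illustrate the snapshot idea, here is a minimal hand-rolled sketch, not any particular snapshot library's API. The helper name `assert_matches_snapshot`, the `update` flag, and the file layout are all hypothetical; a real library would likely handle updates and diffs more gracefully.

```python
from pathlib import Path


def assert_matches_snapshot(actual: str, snapshot_path: Path, update: bool = False) -> None:
    """Compare rendered template output against a stored snapshot file.

    Hypothetical helper: on first run (or when update=True) the snapshot is
    written; on later runs the output must match it exactly.
    """
    if update or not snapshot_path.exists():
        snapshot_path.parent.mkdir(parents=True, exist_ok=True)
        snapshot_path.write_text(actual)
        return
    expected = snapshot_path.read_text()
    assert actual == expected, f"rendered output differs from {snapshot_path}"
```

A single call such as `assert_matches_snapshot(rendered, Path("snapshots/userdata.txt"))` would then stand in for a series of per-line asserts on the same template output.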
src/ec2_gha/__main__.py (outdated)
    check_required(env, required)
    # The timeout is infallible
    - timeout = int(os.environ["INPUT_GH_TIMEOUT"])
    + timeout = int(os.environ["INPUT_MAX_INSTANCE_LIFETIME"])
I think this timeout (passed to DeployInstance below) has different semantics from the previous GH_TIMEOUT. Was your thinking that we don't need a timeout for how long to wait for the instance to come up which is separate from the timeout for the instance lifetime?
Good catch. Added runner_registration_timeout, essentially replacing the old gh_timeout, and used it here.
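A rough sketch of the distinction discussed here: one value bounds how long to wait for the runner to register, the other bounds the instance's total lifetime. The function name `read_timeouts` is hypothetical, and the `INPUT_RUNNER_REGISTRATION_TIMEOUT` variable name is an assumption based on how GitHub Actions exposes inputs as upper-cased `INPUT_*` environment variables; the actual code in the PR may differ.

```python
import os
from typing import Mapping, Tuple


def read_timeouts(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Parse the two separate timeouts from action inputs (hypothetical helper).

    Returns (registration_timeout, max_instance_lifetime) as integers.
    """
    # Assumed name: bounds the wait for the runner to come up and register.
    registration_timeout = int(env["INPUT_RUNNER_REGISTRATION_TIMEOUT"])
    # From the diff above: bounds how long the instance may live overall.
    max_instance_lifetime = int(env["INPUT_MAX_INSTANCE_LIFETIME"])
    return registration_timeout, max_instance_lifetime
```

Keeping the two values separate means the wait for the instance to come up can be given only the registration timeout, while the lifetime limit is enforced independently.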
    # Determine the default user based on the home directory
    if [ "$homedir" = "/home/ubuntu" ]; then
      DEFAULT_USER="ubuntu"
    elif [ "$homedir" = "/home/ec2-user" ]; then
      DEFAULT_USER="ec2-user"
    else
      DEFAULT_USER="root"
    fi
Perhaps DEFAULT_USER=$$(stat -c "%U" "$homedir")?
Removed workflows not in test branch:
- demo-00-minimal.yml
- demo-gpu-dbg.yml

Added placeholders for workflows from test branch:
- demo-gpu-minimal.yml
- demo-gpu.yml
- demo-multi-instance.yml
- demo-multi-job.yml

All placeholder workflows display a message that they're being developed in the #2 / rw/hooks branch to avoid confusion.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
ACTIONS_RUNNER_HOOK_JOB_* hooks for job tracking (instead of polling)
.github/workflows/demo*; see demos.yml runs)
ec2-gha (and Python module to ec2_gha)
runner.yml now lives in this repo, and wraps action.yml from this repo)
Examples
demos#18
mamba#15
(these failures are expected, part of demonstrating issues with pip install mamba_ssm at different versions)
Ocean_Emulator#308
CloudWatch console
EC2 instances console (showing many terminat{ed,ing} instances