-
Notifications
You must be signed in to change notification settings - Fork 7.1k
[core] Replace SHA-1 with SHA-256 for internal hash operations #60242
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request correctly replaces SHA-1 with SHA-256 for internal hashing operations, which is a great security improvement. The changes are mostly straightforward. I've added a few comments to suggest explicitly specifying the encoding as 'utf-8' when calling .encode() on strings before hashing. This ensures consistent behavior across different platforms and avoids potential issues with default encodings.
| sha1 = hashlib.sha1() | ||
| sha1.update(str(filepath.relative_to(relative_path)).encode()) | ||
| sha256 = hashlib.sha256() | ||
| sha256.update(str(filepath.relative_to(relative_path)).encode()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | ||
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency and to avoid platform-dependent behavior, it's better to explicitly specify the encoding as utf-8 when calling .encode().
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() | |
| ssh_control_hash = hashlib.sha256(cluster_name.encode("utf-8")).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode("utf-8")).hexdigest() |
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | ||
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency and to avoid platform-dependent behavior, it's better to explicitly specify the encoding as utf-8 when calling .encode().
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() | |
| ssh_control_hash = hashlib.sha256(cluster_name.encode("utf-8")).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode("utf-8")).hexdigest() |
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | ||
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency and to avoid platform-dependent behavior, it's better to explicitly specify the encoding as utf-8 when calling .encode().
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() | |
| ssh_control_hash = hashlib.sha256(cluster_name.encode("utf-8")).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode("utf-8")).hexdigest() |
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | ||
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency and to avoid platform-dependent behavior, it's better to explicitly specify the encoding as utf-8 when calling .encode().
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() | |
| ssh_control_hash = hashlib.sha256(cluster_name.encode("utf-8")).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode("utf-8")).hexdigest() |
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | ||
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency and to avoid platform-dependent behavior, it's better to explicitly specify the encoding as utf-8 when calling .encode().
| ssh_control_hash = hashlib.sha256(cluster_name.encode()).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode()).hexdigest() | |
| ssh_control_hash = hashlib.sha256(cluster_name.encode("utf-8")).hexdigest() | |
| ssh_user_hash = hashlib.sha256(getuser().encode("utf-8")).hexdigest() |
| if not group_name: | ||
| raise ValueError("group_name is None.") | ||
| hexlified_name = hashlib.sha1(group_name.encode()).hexdigest() | ||
| hexlified_name = hashlib.sha256(group_name.encode()).hexdigest() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Pure refactoring. The file isn't used anywhere. Unused code adds to maintenance burden. Signed-off-by: irabbani <[email protected]> Signed-off-by: Purushotham Pushpavanth <[email protected]>
This commit addresses security concerns by migrating all internal hash operations from SHA-1 to SHA-256 across the Ray codebase. Fixes ray-project#53435 Signed-off-by: Purushotham Pushpavanth <[email protected]>
## Description When the autoscaler commits to draining a node, immediately update `ClusterResourceManager` so the scheduler sees the node as unavailable. ### Why There's a race condition where work can be scheduled to nodes that are committed for draining: 1. Autoscaler calls `HandleDrainNode()` -> `SetNodeDraining()` updates `GcsNodeManager::draining_nodes_`. 2. Scheduler checks `ClusterResourceManager::IsNodeDraining()` which reads from `NodeResources::is_draining`. 3. But `is_draining` is only updated when the raylet broadcasts via RaySyncer. During this window, scheduler sees `is_draining=false` and schedules work to the draining node. To fix this, `GcsResourceManager` now registers a listener with `GcsNodeManager` that immediately updates `ClusterResourceManager` when a node is set to draining. This closes the race window where work could be scheduled to draining nodes. --------- Signed-off-by: Sagar Sumit <[email protected]> Co-authored-by: Zac Policzer <[email protected]> Signed-off-by: Purushotham Pushpavanth <[email protected]>
Signed-off-by: Purushotham Pushpavanth <[email protected]>
edoakes
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The C++ changes look irrelevant here -- maybe a bad merge?
I believe some of these hashes are used in the runtime_env directory paths. I'd be concerned that extending their length could cause problems in some environments. I don't think this change in those areas is worth the risk of breakage.
This commit reverts runtime_env hash functions back to SHA-1 to address reviewer concerns about filesystem path length limitations. Runtime environment hashes are used in: - Directory paths: /tmp/ray/.../pip/<hash> - GCS URIs: gcs://_ray_pkg_<hash>.zip - Conda env names: ray-<hash> SHA-256 would increase hash length from 40 to 64 characters (+24 chars), which could cause issues on: - Windows systems (PATH_MAX = 260 characters) - Network filesystems with shorter path limits - Legacy systems with path restrictions Additionally, changing these hashes would invalidate all existing cached runtime environments, causing unnecessary rebuilds. For content-addressable hashing (non-cryptographic use), SHA-1 provides sufficient collision resistance. SHA-256 is retained for security-critical operations like SSH control paths, config hashing, and CloudWatch configs. Files reverted to SHA-1: - python/ray/_private/runtime_env/pip.py - python/ray/_private/runtime_env/uv.py - python/ray/_private/runtime_env/conda.py - python/ray/_private/runtime_env/conda_utils.py - python/ray/_private/runtime_env/packaging.py Related to ray-project#53435 Signed-off-by: Purushotham Pushpavanth <[email protected]>
|
@edoakes Thank you for the feedback!
You are right about the bad merge, fixed it by pulling master again.
Great catch! I've reverted the 5 runtime_env files back to SHA-1 to avoid the path length issues you identified. SHA-256 would add +24 characters to filesystem paths, which could definitely cause problems on Windows and network filesystems, plus it would invalidate all existing cached environments. |
Signed-off-by: Purushotham Pushpavanth <[email protected]>
Why are these changes needed?
This PR replaces SHA-1 with SHA-256 across Ray's internal hash operations to address security concerns in #53435 with a conservative approach for runtime environment paths. SHA-1 has known collision vulnerabilities and is deprecated - switching to SHA-256 improves security and FIPS compliance.
Related issue number
Fixes #53435
SHA-256 Migration (15 files)
SHA-1 Retained (5 files)
Runtime environment path generation kept SHA-1 to avoid:
For content-addressable hashing (non-cryptographic), SHA-1 provides sufficient collision resistance.
Testing