Conversation

Delaunay (Collaborator)

No description provided.

Delaunay and others added 3 commits September 10, 2025 12:01
* Add job push button

* Add status helper to avoid rsync when cache is good
@Delaunay force-pushed the realtime_tracking branch 3 times, most recently from 515cfaf to c88db6d on September 11, 2025 18:48
Base automatically changed from staging to master on September 16, 2025 15:42
@Delaunay changed the base branch from master to staging on September 16, 2025 15:57
@Delaunay force-pushed the realtime_tracking branch 3 times, most recently from c13b037 to c7919ef on September 17, 2025 18:57
@Delaunay force-pushed the realtime_tracking branch 8 times, most recently from 953b4f0 to 5cb60da on September 23, 2025 21:42
@Delaunay force-pushed the realtime_tracking branch 8 times, most recently from 9070279 to 417457d on October 1, 2025 16:32
@Delaunay force-pushed the realtime_tracking branch 2 times, most recently from 8fb880e to 0abd286 on October 1, 2025 17:08
Base automatically changed from staging to master on October 2, 2025 18:18
nonlocal process_registry

if process_registry.get(hostname) is None:
    proc = subprocess.Popen(cmd)

Check failure: Code scanning / CodeQL

Uncontrolled command line (Critical)

This command line depends on a user-provided value.

Copilot Autofix (AI), 16 days ago

To fix this problem:

  • Only allow well-formed hostnames to be passed to the SSH command, and reject or sanitize anything else.
  • Validate the value against an allowlist before it is used in reverse_ssh_tunnel and, subsequently, in any subprocess call.
  • The recommended approach is a regular expression that restricts hostnames to valid DNS names or IP addresses, rejecting whitespace, control characters, and shell metacharacters.
  • The check should be added in the /api/metric/<string:hostname> route, and ideally also in reverse_ssh_tunnel for defense in depth.
  • This requires adding import re for the regex validation.

The required edits are:

  • Add import re.
  • At the start of open_reverse_ssh, validate the hostname with a regex that only allows DNS hostnames or IP addresses.
  • If validation fails, return a 400 Bad Request error response.

Suggested changeset 1
milabench/web/realtime.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/milabench/web/realtime.py b/milabench/web/realtime.py
--- a/milabench/web/realtime.py
+++ b/milabench/web/realtime.py
@@ -1,6 +1,7 @@
 import os
 import requests
 import subprocess
+import re
 from threading import Thread, Lock, Event
 import json
 
@@ -71,6 +72,12 @@
 
     @app.route('/api/metric/<string:hostname>')
     def open_reverse_ssh(hostname: str):
+        # Validate hostname: allow only DNS hostnames or IPv4/IPv6 addresses, no spaces or metacharacters
+        hostname_regex = r"^(?!-)[A-Za-z0-9.-]{1,253}(?<!-)$"
+        ipv4_regex = r"^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$"
+        ipv6_regex = r"^[0-9a-fA-F:]+$"
+        if not (re.match(hostname_regex, hostname) or re.match(ipv4_regex, hostname) or re.match(ipv6_regex, hostname)):
+            return {"status": "error", "message": "invalid hostname"}, 400
         cmd = reverse_ssh_tunnel(hostname)
 
         nonlocal process_registry
EOF
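
For a quick sanity check of the proposed validation, the three patterns can be exercised outside the route. The sketch below copies the regexes verbatim from the patch; is_valid_host and the example hostnames are purely illustrative and not part of the codebase.

import re

# Patterns copied from the suggested patch; note the IPv6 pattern is intentionally loose.
hostname_regex = r"^(?!-)[A-Za-z0-9.-]{1,253}(?<!-)$"
ipv4_regex = r"^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$"
ipv6_regex = r"^[0-9a-fA-F:]+$"

def is_valid_host(value: str) -> bool:
    """Return True if value looks like a DNS hostname or an IPv4/IPv6 address."""
    return bool(
        re.match(hostname_regex, value)
        or re.match(ipv4_regex, value)
        or re.match(ipv6_regex, value)
    )

# Values that should pass validation
assert is_valid_host("login.example.org")
assert is_valid_host("10.0.0.1")
assert is_valid_host("::1")

# Whitespace and shell metacharacters should be rejected before reaching subprocess
assert not is_valid_host("host; rm -rf /")
assert not is_valid_host("$(whoami)")
assert not is_valid_host("host name")

The hostname pattern is a shell-safety filter rather than a strict hostname validator (it would accept a lone ".", for example), which is enough to address the command-injection concern raised by the alert.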
subprocess.check_call(scp_cmd, shell=True)
# rsync the job to remote
result = local_command("rsync", "-az", new_local_dir + "/", f"{SLURM_HOST}:{new_remote_dir}")
if not result['success']:

Check failure: Code scanning / CodeQL

Uncontrolled data used in path expression (High)

This path depends on a user-provided value.

Copilot Autofix (AI), 16 days ago

To address the "Uncontrolled data used in path expression" issue, the code should validate and normalize the constructed path so that it cannot be manipulated to escape the intended cache directory. Specifically:

  • After constructing the full path used in the file write operation (f"{new_local_dir}/cmd.sh"), normalize it with os.path.normpath.
  • Check that the normalized path starts with the intended root (JOBRUNNER_LOCAL_CACHE).
  • If it does not, abort the operation (e.g., raise an exception or return an error response).
  • Only open the file after the path has been validated.

This ensures that no constructed path can escape the intended job cache directory, even if a crafted jr_job_id attempts directory traversal.

No new imports are needed; os is already imported.

Locate the code block starting at line 1060 and add the normalization and validation before the with open(...) call.


Suggested changeset 1
milabench/web/slurm.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/milabench/web/slurm.py b/milabench/web/slurm.py
--- a/milabench/web/slurm.py
+++ b/milabench/web/slurm.py
@@ -1057,7 +1057,12 @@
 
         new_cmd = "'" + old_cmd.replace(old_jr_job_id, new_jr_job_id) + "'"
 
-        with open(f"{new_local_dir}/cmd.sh", "w") as fp:
+        output_path = os.path.normpath(f"{new_local_dir}/cmd.sh")
+        # Ensure path is within JOBRUNNER_LOCAL_CACHE
+        cache_root = os.path.abspath(JOBRUNNER_LOCAL_CACHE)
+        if not output_path.startswith(cache_root):
+            return jsonify({'error': 'Invalid job id: path outside of cache root'}), 400
+        with open(output_path, "w") as fp:
             fp.write(new_cmd[1:-1])
             fp.flush()
 
EOF
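
The containment check can also be exercised outside the Flask handler. The sketch below is illustrative only: resolve_job_file is a hypothetical helper, CACHE_ROOT is a made-up path standing in for JOBRUNNER_LOCAL_CACHE, and the check appends os.sep to the prefix, a small hardening over the patch's plain startswith so that a sibling directory merely sharing the prefix cannot slip through.

import os

# Made-up cache root; in the patch this role is played by JOBRUNNER_LOCAL_CACHE.
CACHE_ROOT = os.path.abspath("/tmp/jobrunner-cache")

def resolve_job_file(jr_job_id: str, filename: str = "cmd.sh") -> str:
    """Build the job file path and refuse anything that escapes the cache root."""
    candidate = os.path.normpath(os.path.join(CACHE_ROOT, jr_job_id, filename))
    if not candidate.startswith(CACHE_ROOT + os.sep):
        raise ValueError(f"invalid job id {jr_job_id!r}: path escapes the cache root")
    return candidate

print(resolve_job_file("job-1234"))        # -> /tmp/jobrunner-cache/job-1234/cmd.sh

try:
    resolve_job_file("../../etc/cron.d")   # traversal attempt is rejected
except ValueError as exc:
    print(exc)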
return jsonify({'error': result['stderr']}), 500

@app.route('/api/slurm/jobs/<jr_job_id>/earlysync/<job_id>')
def api_early_sync(jr_job_id, job_id):

Check warning: Code scanning / CodeQL

Information exposure through an exception (Medium)

Stack trace information flows to this location and may be exposed to an external user.

Copilot Autofix (AI), 16 days ago

To fix the information exposure issue, avoid returning the raw exception message in API responses visible to the end user. Instead, log the exception (including its stack trace) on the server side using proper logging, and return only a generic error message to the client. The adjustment belongs in local_command: replace the user-visible stderr: str(e) field with a fixed, friendly message (e.g., "An internal error occurred" or just "Command failed") and log the actual error for server diagnostics.

  • Edit local_command in milabench/web/slurm.py, changing the return value in the exception handler.
  • Add proper logging for the exception, including the stack trace, preferably using the standard logging library.
  • All usages of local_command (as visible in the provided snippet) will then propagate the generic message, reducing the information leak.

Suggested changeset 1
milabench/web/slurm.py

Autofix patch
Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/milabench/web/slurm.py b/milabench/web/slurm.py
--- a/milabench/web/slurm.py
+++ b/milabench/web/slurm.py
@@ -12,6 +12,7 @@
 import uuid
 from functools import wraps
 import traceback
+import logging
 import time
 import threading
 from filelock import FileLock, Timeout
@@ -743,12 +744,13 @@
             'returncode': result.returncode
         }
     except Exception as e:
+        import logging
         import traceback
-        traceback.print_exc()
+        logging.error("Exception in local_command: %s", traceback.format_exc())
         return {
             'success': False,
             'stdout': '',
-            'stderr': str(e),
+            'stderr': 'An internal error occurred.',  # Do not expose details
             'returncode': -1
         }
 
EOF
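
The logging pattern can be seen in isolation below; run_command is a hypothetical stand-in for local_command, kept minimal. logger.exception records the full stack trace in the server logs on its own, so the explicit traceback.format_exc() is not strictly needed, and only a generic message reaches the caller. (In the patch above, the module-level import logging also makes the extra import inside the except block redundant, though harmless.)

import logging
import subprocess

logger = logging.getLogger(__name__)

def run_command(*args):
    """Illustrative wrapper mirroring the patched error handling in local_command."""
    try:
        result = subprocess.run(args, capture_output=True, text=True)
        return {
            "success": result.returncode == 0,
            "stdout": result.stdout,
            "stderr": result.stderr,
            "returncode": result.returncode,
        }
    except Exception:
        # Full traceback goes to the server logs only; the caller sees a generic message.
        logger.exception("Exception while running command: %s", args)
        return {
            "success": False,
            "stdout": "",
            "stderr": "An internal error occurred.",
            "returncode": -1,
        }

# Example: a nonexistent binary triggers the generic error path
print(run_command("definitely-not-a-real-binary")["stderr"])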
@@ -12,6 +12,7 @@
import uuid
from functools import wraps
import traceback
import logging
import time
import threading
from filelock import FileLock, Timeout
@@ -743,12 +744,13 @@
'returncode': result.returncode
}
except Exception as e:
import logging
import traceback
traceback.print_exc()
logging.error("Exception in local_command: %s", traceback.format_exc())
return {
'success': False,
'stdout': '',
'stderr': str(e),
'stderr': 'An internal error occurred.', # Do not expose details
'returncode': -1
}

Copilot is powered by AI and may make mistakes. Always verify output.
@Delaunay changed the base branch from master to staging on October 15, 2025 15:39