security: fix path traversal in BaseTuner directory/project_name/tuner_id#1058
security: fix path traversal in BaseTuner directory/project_name/tuner_id#1058dfgvaetyj3456356-hash wants to merge 2 commits into
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
There was a problem hiding this comment.
Code Review
This pull request introduces path traversal validation checks for the directory, project name, and tuner ID parameters in the base tuner class, alongside corresponding unit tests. The code reviewer identified two security vulnerabilities in these new checks: first, an absolute path bypass in project_name could allow arbitrary directory deletion when overwrite=True is set; second, checking only os.path.sep in tuner_id on Windows can be bypassed using forward slashes (/). Both comments are valid and should be kept.
| def _validate_project_path(directory, project_name): | ||
| """Validates that directory and project_name do not contain path traversal. | ||
|
|
||
| Raises: | ||
| ValueError: If path traversal sequences are detected. | ||
| """ | ||
| for segment in (str(directory), str(project_name)): | ||
| if ".." in segment: | ||
| raise ValueError( | ||
| f"Path traversal is not allowed. Received: {segment!r}" | ||
| ) |
There was a problem hiding this comment.
Security Vulnerability: Absolute Path Bypass in project_name\n\nWhile checking for ".." in segment prevents directory traversal using relative path segments, it does not prevent project_name from being an absolute path (e.g., /etc/passwd or C:\\Windows).\n\nIn Python, os.path.join(directory, project_name) will discard the directory prefix entirely if project_name is an absolute path. When overwrite=True is set, this can lead to arbitrary directory deletion or unauthorized file access outside the intended workspace.\n\nTo resolve this, we should explicitly validate that project_name is neither an absolute path nor a drive-relative path.
def _validate_project_path(directory, project_name):\n """Validates that directory and project_name do not contain path traversal.\n\n Raises:\n ValueError: If path traversal sequences are detected.\n """\n for segment in (str(directory), str(project_name)):\n if ".." in segment:\n raise ValueError(\n f"Path traversal is not allowed. Received: {segment!r}"\n )\n proj_str = str(project_name)\n if (\n os.path.isabs(proj_str)\n or os.path.splitdrive(proj_str)[0]\n or proj_str.startswith(("/", "\\"))\n ):\n raise ValueError(\n f"project_name cannot be an absolute or drive-relative path. "\n f"Received: {project_name!r}"\n )There was a problem hiding this comment.
Addressed in the current PR head dbc21de. _validate_project_path() now rejects POSIX absolute paths, Windows rooted paths, and Windows drive-prefixed paths via ntpath, with regression coverage for C:\Windows, C:Windows, and \Windows project names.
| def _validate_tuner_id(tuner_id): | ||
| """Validates that tuner_id does not contain path traversal sequences. | ||
|
|
||
| Raises: | ||
| ValueError: If path traversal sequences or path separators are detected. | ||
| """ | ||
| tuner_id_str = str(tuner_id) | ||
| if ".." in tuner_id_str or os.path.sep in tuner_id_str: | ||
| raise ValueError( | ||
| f"tuner_id cannot contain path separators or traversal sequences. " | ||
| f"Received: {tuner_id_str!r}" | ||
| ) |
There was a problem hiding this comment.
Security Vulnerability: Windows Path Separator Bypass in tuner_id\n\nOn Windows, os.path.sep is \\. Checking os.path.sep in tuner_id_str will not detect forward slashes (/), which are still treated as path separators by many Windows APIs and Python's path manipulation functions.\n\nAn attacker could supply a tuner_id containing / (e.g., subdir/tuner_id) to bypass this check on Windows, leading to path traversal or arbitrary file creation.\n\nTo fix this, we should explicitly check for both / and \\ path separators regardless of the host operating system.
def _validate_tuner_id(tuner_id):\n """Validates that tuner_id does not contain path traversal sequences.\n\n Raises:\n ValueError: If path traversal sequences or path separators are detected.\n """\n tuner_id_str = str(tuner_id)\n if (\n ".." in tuner_id_str\n or "/" in tuner_id_str\n or "\\" in tuner_id_str\n ):\n raise ValueError(\n f"tuner_id cannot contain path separators or traversal sequences. "\n f"Received: {tuner_id_str!r}"\n )There was a problem hiding this comment.
Addressed in the current PR head dbc21de. _validate_tuner_id() checks both / and \ explicitly, and the PR includes regression coverage for a forward-slash KERASTUNER_TUNER_ID.
4d0ebe2 to
e3894a9
Compare
…r_id Fixes a path traversal vulnerability where user-supplied directory, project_name, and tuner_id parameters were joined into filesystem paths without validation. An attacker could set directory='../evil' or project_name='../../etc' to escape the intended project directory. When combined with overwrite=True, this could lead to arbitrary directory deletion. The fix adds _validate_project_path() and _validate_tuner_id() static methods that reject path traversal sequences (..) and path separators in tuner_id. Tests cover directory traversal, project_name traversal, and tuner_id traversal with environment variable input.
e3894a9 to
aa40aa9
Compare
Summary
Fixes a path traversal vulnerability in
BaseTunerwhere user-supplieddirectory,project_name, andtuner_idparameters were joined into filesystem paths without validation.Impact
An attacker controlling these parameters could escape the intended project directory. When combined with
overwrite=True, this leads to arbitrary directory deletion (CWE-22 / CWE-552).Changes
_validate_project_path()to reject..indirectoryandproject_name_validate_tuner_id()to reject..and path separators intuner_idVulnerability Details
Before
After