Skip to content

Conversation

@littleEast7
Copy link
Collaborator

NO JIRA

@littleEast7 littleEast7 requested a review from Copilot December 29, 2025 08:04
@littleEast7 littleEast7 self-assigned this Dec 29, 2025
@littleEast7 littleEast7 added the enhancement New feature or request label Dec 29, 2025
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors the UFS (Underlying File System) manager by extracting logging utilities into a dedicated module, improving code organization, and making several bug fixes and enhancements to the fallback mechanism.

Key changes:

  • Extracted logging utilities (TagAdapter, TagFilter, setup_logger) from utils.py to a new dedicated log.py module
  • Added parameters_adapter function to handle filesystem-specific parameter transformations
  • Refactored client initialization logic to reduce duplication and improve clarity
  • Improved method naming in client/core.py for better code readability

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
alluxiofs/client/log.py New module containing extracted logging utilities (TagAdapter, TagFilter, setup_logger)
alluxiofs/client/utils.py Removed logging utilities and added parameters_adapter function for filesystem-specific parameter handling
alluxiofs/client/ufs_manager.py Updated imports to use new log module, added path normalization logic, and integrated convert_ufs_info_to
alluxiofs/core.py Refactored AlluxioClient initialization, improved fallback logic with VAR_KEYWORD parameter handling, added duplicate exception handler, and fixed AlluxioFile initialization
alluxiofs/client/core.py Renamed methods for clarity, added S3_RANGE_URL_FORMAT constant usage, and created new _send_get_range_message_and_write_to_file method
alluxiofs/client/const.py Added S3_RANGE_URL_FORMAT constant
alluxiofs/client/prefetch_policy.py Updated imports to use new log module
alluxiofs/client/loadbalance.py Updated imports to use new log module
alluxiofs/client/cache.py Updated imports to use new log module
tests/local_cache/test_utils.py Updated test imports and mock patches to reflect log module extraction, improved test assertion for HTTP errors
tests/local_cache/test_ufs_manager.py Updated mock patches to reflect module rename from transfer to ufs_manager
config_template.yaml Enhanced documentation with detailed log tag descriptions and examples
Comments suppressed due to low confidence (1)

alluxiofs/core.py:669

  • The code assumes 'alluxio' is not None when accessing 'alluxio.fallback_logger' and 'alluxio.alluxio'. However, on line 650, there's a check 'if alluxio:' suggesting that alluxio can be None. If alluxio is None, lines 658-669 will raise AttributeError when trying to access alluxio.fallback_logger and alluxio.alluxio. Add a null check or ensure alluxio is never None.
        self.fallback_logger = alluxio.fallback_logger
        if alluxio.alluxio and alluxio.alluxio.config.local_cache_enabled:
            block_size = (
                int(alluxio.alluxio.config.local_cache_block_size_mb)
                * 1024
                * 1024
            )
            self.prefetch_policy = get_prefetch_policy(
                alluxio.alluxio.config, block_size
            )
        else:
            self.prefetch_policy = None

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 1529 to 1533
error=f"Error when reading file {alluxio_path} with offset {offset} and length {length},"
f" error: {e}",
)
)

Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message formatting is broken across two lines (1529-1530) without proper concatenation. This will result in a string that includes a line break and indentation in the middle. The f-string should be written on a single line or use parentheses for proper line continuation.

Suggested change
error=f"Error when reading file {alluxio_path} with offset {offset} and length {length},"
f" error: {e}",
)
)
error=f"Error when reading file {alluxio_path} with offset {offset} and length {length}, error: {e}",
)
)

Copilot uses AI. Check for mistakes.
Comment on lines +1480 to +1481
def _send_get_range_message_and_return_bytes(
self, worker_host, worker_port, alluxio_path, offset, length
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name '_send_get_range_message_and_return_bytes' is verbose and inconsistent with the original naming pattern. The old name '_all_file_range_generator' was misleading (it doesn't generate anything), but the new name could be simplified to something like '_read_range_bytes' or '_fetch_range' for better readability and consistency with similar methods.

Copilot uses AI. Check for mistakes.
Comment on lines +1535 to 1536
def _send_get_range_message_and_return_bytes_alluxiocommon(
self, worker_host, worker_http_port, file_path, offset, length
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name '_send_get_range_message_and_return_bytes_alluxiocommon' is excessively verbose. Consider simplifying to something like '_read_range_bytes_common' or '_fetch_range_common' for better readability.

Copilot uses AI. Check for mistakes.
Comment on lines 721 to 725
if (
self.fs.fallback_to_ufs_enabled
and self.f is None
or not self.fs.fallback_to_ufs_enabled
):
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition logic here is incorrect. The expression self.fs.fallback_to_ufs_enabled and self.f is None or not self.fs.fallback_to_ufs_enabled evaluates incorrectly due to operator precedence. It should be: (self.fs.fallback_to_ufs_enabled and self.f is None) or (not self.fs.fallback_to_ufs_enabled). Currently, due to operator precedence, this is evaluated as (self.fs.fallback_to_ufs_enabled and self.f is None) or (not self.fs.fallback_to_ufs_enabled), but the lack of explicit parentheses makes this confusing and error-prone. Additionally, the logic appears backwards - if fallback is enabled and we have a valid file handle (self.f is not None), we should use it, not raise an exception.

Suggested change
if (
self.fs.fallback_to_ufs_enabled
and self.f is None
or not self.fs.fallback_to_ufs_enabled
):
# Re-raise the original exception if fallback is disabled, or if
# fallback is enabled but opening the UFS file failed.
if (not self.fs.fallback_to_ufs_enabled) or self.f is None:

Copilot uses AI. Check for mistakes.
Comment on lines 43 to 44
adapted_kwargs["src_path"] = final_kwargs.pop("path1")
adapted_kwargs["target_path"] = final_kwargs.pop("path2")
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using 'pop' on 'final_kwargs' when the key might not exist ('mode' may not always be present) could raise a KeyError. The default value 'overwrite' is provided, which is good, but when using 'pop' on 'path1' and 'path2' on lines 43-44, there's no default value, which will raise KeyError if these keys don't exist. Consider using 'get' with defaults or adding existence checks.

Suggested change
adapted_kwargs["src_path"] = final_kwargs.pop("path1")
adapted_kwargs["target_path"] = final_kwargs.pop("path2")
if "path1" in final_kwargs and "path2" in final_kwargs:
adapted_kwargs["src_path"] = final_kwargs.pop("path1")
adapted_kwargs["target_path"] = final_kwargs.pop("path2")

Copilot uses AI. Check for mistakes.
Comment on lines 392 to 399
except Exception as e:
self.fallback_logger.error(
f"Fallback to UFS failed for {method_name}"
)
# [Critical] Use 'from e' to preserve the original exception stack trace
raise RuntimeError(
f"Fallback to UFS failed for {method_name}"
) from e
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate exception handler detected. Lines 384-391 and 392-399 are identical except handlers. The second except block (392-399) will never be reached because the first one (384-391) catches all exceptions. This is unreachable code that should be removed.

Copilot uses AI. Check for mistakes.
Comment on lines +1507 to +1508
def _send_get_range_message_and_write_to_file(
self, f, worker_host, worker_port, alluxio_path, offset, length
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method name '_send_get_range_message_and_write_to_file' is verbose. Consider simplifying to something like '_fetch_range_to_file' or '_write_range_to_file' for better readability and consistency with similar methods.

Copilot uses AI. Check for mistakes.
Comment on lines +48 to +55
headers = {}
mode = adapted_kwargs.pop("mode", "overwrite")
if mode == "overwrite":
# Default behavior, explicitly allowing overwrite (optional, usually not needed)
headers["x-oss-forbid-overwrite"] = "false"
else: # Or whatever logic you use for "don't overwrite"
# This will cause the upload to fail if the file exists
headers["x-oss-forbid-overwrite"] = "true"
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'headers' dictionary is created but never used. After setting the headers based on the mode, they should be passed to the appropriate OSS API call or stored in adapted_kwargs. Currently, this code has no effect.

Copilot uses AI. Check for mistakes.
# log_dir: Log directory
# Purpose: Directory to store log files. If not set, logs might be output to console or default location.
# Default Value: null
# Default Value: null,meaning no specific log directory is set,only console output
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chinese punctuation (full-width comma ',') is used instead of standard ASCII comma. Replace ',' with ',' for consistency with English text.

Suggested change
# Default Value: nullmeaning no specific log directory is setonly console output
# Default Value: null, meaning no specific log directory is set, only console output

Copilot uses AI. Check for mistakes.
Comment on lines 1502 to 1503
error=f"Error when reading file {alluxio_path} with offset {offset} and length {length},"
f" error: {e}",
Copy link

Copilot AI Dec 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message formatting is broken across two lines (1502-1503) without proper concatenation. This will result in a string that includes a line break and indentation in the middle. The f-string should be written on a single line or use parentheses for proper line continuation.

Copilot uses AI. Check for mistakes.
@littleEast7 littleEast7 merged commit b9277a1 into fsspec:main Dec 30, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant