Skip to content

Performance Analysis of rename Operations: Why is Rename so Time-Consuming with Large File Counts? #1204

@tianyu524-art

Description

@tianyu524-art

I created a new directory and conducted a time-consuming test to compare the latency of three different scenarios:
Scenarios:

  1. Empty Directory: Create file A, then rename it to B.
  2. Existing Files (200): Randomly create 200 files in the directory, then create file AA and rename it to BB.
  3. Post-Deletion: Delete all files in the current directory, then create file AAA and rename it to BBB.

Test Results:

  • Scenario 1 (Empty Dir): A -> B
    • Average: 64.772 μs | Median: 55.800 μs
  • Scenario 2 (200 Files): AA -> BB
    • Average: 4000.512 μs | Median: 3696.000 μs
  • Scenario 3 (Cleared Dir): AAA -> BBB
    • Average: 5647.116 μs | Median: 5332.300 μs

Comparisons:

  • Scenario 2 is ~61.8x slower than Scenario 1.
  • Scenario 3 is ~87.2x slower than Scenario 1.
  • Scenario 3 is ~1.41x slower than Scenario 2.

Due to the accumulation of historical metadata, Scenario 3 deteriorated significantly. I have the following questions regarding the implementation:

  1. In lfs_rawrename, during lfs_dir_commit, the tag {LFS_MKTAG(LFS_FROM_MOVE, newid, lfs_tag_id(oldtag)), &oldcwd} uses LFS_FROM_MOVE. This appears to be extremely time-consuming in lfs_dir_traverse, exhibiting $O(N^2)$ complexity. Why can't the information be located directly via the ID and then updated to achieve $O(N)$ complexity?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions