ls parser speed improvements, reworked #4408

Open: wants to merge 18 commits into base: master

@PaulWay (Contributor) commented Mar 31, 2025

All Pull Requests:

Check all that apply:

  • [Y] Have you followed the guidelines in our Contributing document, including the instructions about commit messages?
  • [Y] No Sensitive Data in this change?
  • [N] Is this PR to correct an issue?
  • [Y] Is this PR an enhancement?

Complete Description of Additions/Changes:

This updates my previous ls parser speed improvements work, reworked after the removal of 'raw_entry' by @xiangce.

Main speed improvements are:

  • Once the format of the directory has been found, use the same function to parse all lines in that directory.
  • In parsing subfunctions, put data directly into the dirent's dictionary.
  • Only parse name -> link if the dirent is actually a link.
  • Handle files first when adding to the files/dirs/specials lists.
  • Ignore obviously malformed lines.
  • Faster, easier-to-read logic for partial directory listings where the directory name is not the first line.
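
A minimal sketch of the first few points, for illustration only. It is not the code in this PR: `parse_directory`, `detect_parser`, `_parse_plain`, `_parse_selinux` and the dictionary keys are hypothetical names. It assumes only two line layouts (`ls -l`, and `ls -lZ` with the SELinux context in the fifth field) and does not handle device major,minor fields.

```python
def _parse_plain(entry, line):
    """perms links owner group size month day time name"""
    parts = line.split(None, 8)
    entry["links"] = int(parts[1])
    entry["owner"], entry["group"] = parts[2], parts[3]
    entry["size"] = int(parts[4])
    entry["date"] = " ".join(parts[5:8])
    return parts[8]


def _parse_selinux(entry, line):
    """perms links owner group context size month day time name"""
    parts = line.split(None, 9)
    entry["links"] = int(parts[1])
    entry["owner"], entry["group"] = parts[2], parts[3]
    entry["se_context"] = parts[4]
    entry["size"] = int(parts[5])
    entry["date"] = " ".join(parts[6:9])
    return parts[9]


def detect_parser(line):
    """Pick the line parser once, from the first entry line of a directory."""
    fields = line.split(None, 5)
    # Assumption: a SELinux context looks like user:role:type:level.
    if len(fields) > 4 and fields[4].count(":") >= 2:
        return _parse_selinux
    return _parse_plain


def parse_directory(lines):
    """Parse the entry lines of one directory with a single parser function."""
    entries = {}
    parser = None
    for line in lines:
        # Ignore obviously malformed lines (this also skips "total N").
        if not line or line[0] not in "-dlbcps":
            continue
        if parser is None:
            parser = detect_parser(line)
        entry = {"type": line[0], "perms": line[1:10]}
        try:
            # The parser writes straight into the entry's dictionary.
            name = parser(entry, line)
        except (IndexError, ValueError):
            continue
        # Only split name -> link when the entry really is a link.
        if entry["type"] == "l" and " -> " in name:
            name, entry["link"] = name.split(" -> ", 1)
        entry["name"] = name
        entries[name] = entry
    return entries
```

The format check runs once per directory rather than once per line, which is the main saving described above.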

Readability improvements are:

  • Separate functions for name handling, SELinux context handling, and handling of the major,minor or size field and the date.
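
As a rough illustration of what one such helper might look like (a sketch, not the function in this PR; `parse_size_or_device` and its keys are hypothetical), the field after the group is either a size or, for device files, a `major, minor` pair:

```python
def parse_size_or_device(entry, fields, index):
    """Store size, or major/minor for a device, directly in `entry`.

    Returns the index of the field that follows (the start of the date).
    Assumes device numbers are written as "major, minor" with a space.
    """
    field = fields[index]
    if field.endswith(","):
        entry["major"] = int(field[:-1])
        entry["minor"] = int(fields[index + 1])
        return index + 2
    entry["size"] = int(field)
    return index + 1
```

Returning the next index lets the caller read the date from whichever position follows, without re-checking the entry type.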

PaulWay added 14 commits March 31, 2025 14:53
- Avoid re-splitting the line if it's already been split.
- Handle a few failure modes in parsing links and size on '?' and other bad
  input.
- Each directory is all one 'mode' - normal, or SELinux on RHEL 6/7 or 8+.
  Instead of trying to detect the mode of each line, detect the mode once and
  then use that for all lines in this directory.  Difficult to extrapolate
  that to all listings in a parse - let's see how this goes for now.
- Simplify how we pick up directory names and store previously processed
  directories.
- Add a few comments to make it easier to see what's being parsed where.

Signed-off-by: Paul Wayper <[email protected]>
@PaulWay requested a review from xiangce March 31, 2025 05:05
@PaulWay self-assigned this Mar 31, 2025
@PaulWay requested a review from chenlizhong March 31, 2025 05:05
@codecov-commenter commented Mar 31, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 77.51%. Comparing base (05eb089) to head (c942700).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4408      +/-   ##
==========================================
+ Coverage   77.50%   77.51%   +0.01%     
==========================================
  Files         744      744              
  Lines       41574    41571       -3     
  Branches     6667     6665       -2     
==========================================
+ Hits        32220    32223       +3     
+ Misses       8319     8315       -4     
+ Partials     1035     1033       -2     
Flag Coverage Δ
unittests 77.50% <100.00%> (+0.01%) ⬆️


@xiangce added the "BREAK Rules" label (The change breaks the test of some rules.) Apr 16, 2025
Labels: BREAK Rules (The change breaks the test of some rules.), enhancement