gh-130167: Improve speed of difflib.IS_LINE_JUNK by replacing re #130170

Open
wants to merge 10 commits into
base: main
Changes from 7 commits
10 changes: 5 additions & 5 deletions Lib/difflib.py
@@ -1038,11 +1038,9 @@ def _qformat(self, aline, bline, atags, btags):
 # remaining is that perhaps it was really the case that " volatile"
 # was inserted after "private". I can live with that <wink>.

-import re
-
-def IS_LINE_JUNK(line, pat=re.compile(r"\s*(?:#\s*)?$").match):
+def IS_LINE_JUNK(line, pat=None):
     r"""
-    Return True for ignorable line: iff `line` is blank or contains a single '#'.
+    Return True for ignorable line: if and only if `line` is blank or contains a single '#'.

     Examples:

@@ -1054,6 +1052,9 @@ def IS_LINE_JUNK(line, pat=re.compile(r"\s*(?:#\s*)?$").match):
     False
     """

+    if pat is None:
+        stripped = line.strip()
+        return stripped == '' or stripped == '#'
     return pat(line) is not None

 def IS_CHARACTER_JUNK(ch, ws=" \t"):
@@ -2027,7 +2028,6 @@ def make_table(self,fromlines,tolines,fromdesc='',todesc='',context=False,
                      replace('\1','</span>'). \
                      replace('\t','&nbsp;')

-del re

 def restore(delta, which):
     r"""
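
As a rough way to sanity-check the change above, the sketch below compares the removed regex-based check with the new `str.strip()`-based check on a handful of inputs. The helper names `is_line_junk_regex`, `is_line_junk_strip`, `SAMPLES`, and `compare` are hypothetical and exist only for this illustration; they are not part of `difflib` or of this PR, and the sample list is not exhaustive.

```python
import re

# The matcher that the old default argument compiled (taken from the removed line).
_old_match = re.compile(r"\s*(?:#\s*)?$").match


def is_line_junk_regex(line):
    """Hypothetical stand-in for the old behaviour: regex match on the whole line."""
    return _old_match(line) is not None


def is_line_junk_strip(line):
    """Hypothetical stand-in for the new behaviour when no `pat` is passed."""
    stripped = line.strip()
    return stripped == '' or stripped == '#'


# A small, non-exhaustive set of inputs: blank lines, lone '#', and non-junk lines.
SAMPLES = ["\n", "  #   \n", "hello\n", "", "#", " # comment\n", "##\n", "\t\n", "# #\n"]


def compare(samples=SAMPLES):
    """Return (line, old_result, new_result) for each sample line."""
    return [(s, is_line_junk_regex(s), is_line_junk_strip(s)) for s in samples]


if __name__ == "__main__":
    for line, old, new in compare():
        print(f"{line!r:16} regex={old!s:5} strip={new!s:5}")
```

On these samples the two checks should agree. Note that the patched function still honours a caller-supplied `pat`, so only the default path changes; timing the two helpers with `timeit` would be one way to check the claimed speedup locally.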
@@ -0,0 +1,2 @@
+Improve speed of :func:`difflib.IS_LINE_JUNK` by replacing :mod:`re` with
Member

Using re or built-in string methods is an implementation detail. This does not concern users and is not of interest to them.

Contributor Author

@donBarbos donBarbos Feb 17, 2025

@serhiy-storchaka OK, so should I remove the NEWS.d entry or replace it with something else? I just used the news entries from PRs related to the issue on improving import time as a sample.

+built-in string methods.