Atom: drink up the string soup (rebase of #1497 + more)#1594
Closed
mattst88 wants to merge 10 commits into
Closed
Conversation
59bed7f to
59cb02b
Compare
Contributor
|
thanks matt for picking up on this - I've been very short on time lately, sorry. In the long term we should probably get rid of all For the public record, I approve of reusing and modifying my commits for this. |
385f57b to
f788603
Compare
thesamesam
reviewed
Jun 23, 2026
thesamesam
left a comment
Member
There was a problem hiding this comment.
Some small comments from a first pass. Thanks a lot to you and Jannik for doing this.
Please make sure you're daily driving it if you're not already (I recommend this in general for Portage patches), it's invaluable testing.
| else: | ||
| cp_exists = False | ||
| if atom.package and not atom.cp.startswith("null/"): | ||
| if atom.package and atom.category != "null": |
Member
There was a problem hiding this comment.
I'm not asking you to fix this now, but this null crap really needs to go in the future.
0628c37 to
9e16de7
Compare
thesamesam
reviewed
Jun 23, 2026
[mattst88] Fix call sites that relied on Atom being a str subclass: - wrap str method calls (lstrip, indexing, join, concatenation) with str() in BlockerDB, glsa, actions, depgraph, bintree, binrepo/config, repository/config - use key=str in emaint/world sorted() call instead of adding ordering methods to Atom - add TYPE_CHECKING guard and lazyimport block in versions.py Co-authored-by: Matt Turner <mattst88@gentoo.org> Signed-off-by: Matt Turner <mattst88@gentoo.org>
Convert @type/@rtype docstring annotations to PEP 3107 function annotations in dep/__init__.py. Also replace getattr()-based duck typing in dep_getslot() and dep_getrepo() with explicit isinstance() checks now that Atom has a defined interface rather than being a str subclass with arbitrarily-set attributes. In paren_enclose(), replace getattr(x, "unevaluated_atom", x) with an isinstance guard. Signed-off-by: Matt Turner <mattst88@gentoo.org>
…erator Signed-off-by: Matt Turner <mattst88@gentoo.org>
Replace the post-construction __setattr__ hack for _orig_atom with a proper orig_atom keyword argument. The slot is now always initialized in __init__ (to None for ordinary atoms), evaluate_conditionals() propagates it, and depgraph.py accesses it via getattr() to remain safe for SonameAtom and other non-Atom types in mixed-type loops. Signed-off-by: Matt Turner <mattst88@gentoo.org>
Add Atom.with_cp(cp) to replace the str(atom).replace(atom.cp, cp, 1) pattern, and update three call sites in _sets/base.py and depgraph.py. Signed-off-by: Matt Turner <mattst88@gentoo.org>
Defer construction of the without_use Atom until first access. For atoms with USE deps the __init__ previously always built a second full Atom eagerly; with the lazy property it is computed on demand and cached in the slot. For atoms without USE deps the result is still self, at zero cost. Benchmarks on USE-dep atoms: ~40% faster construction, ~22% less memory. Signed-off-by: Matt Turner <mattst88@gentoo.org>
_pkg_str(cpv) called catpkgsplit() which fully re-parsed the atom string in __init__. Since the atom regex already extracts both cpv and cp as separate groups, the version is simply cpv[len(cp)+1:] -- no second parse needed. Store _cpv as a plain string and compute _version directly. Update best_match_to_list's cmp_cpv to use cpv_getversion() as a fallback for plain strings that lack a .version attribute. Construction time (µs): Before → After =sys-apps/portage-3.0.50 4.5 → 2.8 (-38%) >=dev-libs/glib-2.68:2 4.9 → 3.2 (-35%) >=dev-python/requests-2.28[ssl] 6.3 → 4.6 (-27%) Memory (bytes): =sys-apps/portage-3.0.50 1665 → 487 (-71%) >=dev-libs/glib-2.68:2 1690 → 518 (-69%) >=dev-python/requests-2.28[ssl] 3321 → 2137 (-36%) Signed-off-by: Matt Turner <mattst88@gentoo.org>
Add __repr__ returning Atom('=cat/pkg-1.0') for readable output in
debuggers and the REPL.
Add is_versioned property (self._version is not None) to replace the
atom.cp != atom.cpv string comparison pattern at call sites. Avoids a
full string equality check in favor of a single None check.
hyperfine 'python bench.py' (250k is_versioned checks):
cp != cpv (before): 89.2 ms ± 3.8 ms
is_versioned (after): 83.7 ms ± 3.4 ms (-6%)
Signed-off-by: Matt Turner <mattst88@gentoo.org>
__eq__ previously called str() on both operands, invoking __str__ even
when we can read _string directly. Compare _string to _string for
Atom-to-Atom and _string to the str directly.
category used split("/")[0] which allocates a list; partition("/")[0]
returns the first element without the list allocation.
hyperfine (500k × (a==a, a==a_unequal, a==str, a.category)):
before: 420.7 ms ± 12.3 ms
after: 286.5 ms ± 4.8 ms (-32%)
Signed-off-by: Matt Turner <mattst88@gentoo.org>
vardb.match() calls match_from_list(), which uses only cp, cpv, use, operator, etc. -- none of which differ between an evaluated Atom and a freshly constructed Atom(str(evaluated_atom)). Pass the atom directly. Update the comment to drop the now-misleading Atom(str(atom)) reference. Signed-off-by: Matt Turner <mattst88@gentoo.org>
thesamesam
approved these changes
Jun 24, 2026
Contributor
Author
|
Merged! |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.