Member

@tomasfratrik tomasfratrik commented Jul 7, 2025

Introduce actors to detect the presence of third-party
Python modules installed for the target Python. Those modules could
interfere with the upgrade process or cause issues after rebooting
into the target system.

Scanner (scanthirdpartytargetpythonmodules):
- Identifies the target Python interpreter
- Queries the target Python's sys.path to determine where it searches
  for modules
- Recursively scans these directories for Python files (.py, .so, .pyc)
- Cross-references found files against the RPM database to determine
  ownership and categorize them
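
The scanner steps above can be sketched roughly as follows. This is a minimal illustration, not the actor's actual code: `find_python_files` and `split_by_rpm_ownership` are hypothetical helper names, and the set of RPM-owned paths is passed in as a plain set rather than being queried from the RPM database.

```python
from pathlib import Path

# File suffixes named in the scanner description.
PYTHON_SUFFIXES = {'.py', '.so', '.pyc'}


def find_python_files(search_paths):
    """Recursively collect Python-related files under the given directories."""
    found = set()
    for raw_path in search_paths:
        base = Path(raw_path)
        if not base.is_dir():
            # sys.path can contain zip archives or paths that do not exist.
            continue
        for candidate in base.rglob('*'):
            if candidate.is_file() and candidate.suffix in PYTHON_SUFFIXES:
                found.add(str(candidate))
    return sorted(found)


def split_by_rpm_ownership(files, rpm_owned_files):
    """Split files into (owned by some RPM, not owned by any RPM).

    rpm_owned_files is an assumed, precomputed set of paths owned by RPMs.
    """
    owned = [f for f in files if f in rpm_owned_files]
    unowned = [f for f in files if f not in rpm_owned_files]
    return owned, unowned
```

Files that no RPM owns are third-party by definition in this sketch; files owned by an RPM would still need a further check of whether that RPM comes from the distribution.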

Checker (checkthirdpartytargetpythonmodules) creates a high severity
report to inform users about the findings. The full list of findings is
written to the logs; the report contains a shortened version.

Jira: RHEL-71882

github-actions bot commented Jul 7, 2025

Thank you for contributing to the Leapp project!

Please note that every PR needs to comply with the Leapp Guidelines and must pass all tests in order to be mergeable.
If you want to request a review or rebuild a package in copr, you can use following commands as a comment:

  • review please @oamg/developers to notify leapp developers of the review request
  • /packit copr-build to submit a public copr build using packit

Packit will automatically schedule regression tests for this PR's build and latest upstream leapp build.
However, here are additional useful commands for packit:

  • /packit test to re-run manually the default tests
  • /packit retest-failed to re-run failed tests manually
  • /packit test oamg/leapp#42 to run tests with leapp builds for the leapp PR#42 (default is latest upstream - main - build)

Note that first time contributors cannot run tests automatically - they need to be started by a reviewer.

It is possible to schedule specific on-demand tests as well. Currently 2 test sets are supported, beaker-minimal and kernel-rt, both can be used to be run on all upgrade paths or just a couple of specific ones.
To launch on-demand tests with packit:

  • /packit test --labels kernel-rt to schedule kernel-rt tests set for all upgrade paths
  • /packit test --labels beaker-minimal-8.10to9.4,kernel-rt-8.10to9.4 to schedule kernel-rt and beaker-minimal test sets for 8.10->9.4 upgrade path

See other labels for particular jobs defined in the .packit.yaml file.

Please open ticket in case you experience technical problem with the CI. (RH internal only)

Note: In case there are problems with tests not being triggered automatically on new PR/commit or pending for a long time, please contact leapp-infra.

@pirat89 pirat89 marked this pull request as draft July 7, 2025 14:57
@tomasfratrik tomasfratrik force-pushed the 3rd-party-components-fix branch from 3c38570 to c7ae623 Compare July 8, 2025 10:19
@tomasfratrik tomasfratrik force-pushed the 3rd-party-components-fix branch from c7ae623 to c4c7af3 Compare July 17, 2025 15:43
@tomasfratrik
Member Author

Note that the current solution does not yet prevent a broken system after reboot.
It only introduces the concept of an actor (not yet finished and without tests).

@tomasfratrik tomasfratrik force-pushed the 3rd-party-components-fix branch 2 times, most recently from 90c9fdc to 21709b0 Compare July 17, 2025 15:47
@tomasfratrik tomasfratrik force-pushed the 3rd-party-components-fix branch 2 times, most recently from 282f582 to 05d4ca9 Compare August 6, 2025 10:01
@tomasfratrik tomasfratrik changed the title from "[WIP] Change python3 to platform-python" to "[WIP] Inform user of 3rd party python modules that can break system after reboot" Aug 6, 2025
@tomasfratrik tomasfratrik force-pushed the 3rd-party-components-fix branch from 05d4ca9 to b70c6ec Compare August 6, 2025 10:06
@pirat89 pirat89 added this to the 8.10/9.8 milestone Aug 12, 2025
@pirat89 pirat89 added the enhancement (New feature or request) and report (Any reports have been added / removed / changed in the PR) labels Aug 12, 2025
@karolinku
Contributor

/packit copr-build

@karolinku karolinku force-pushed the 3rd-party-components-fix branch 2 times, most recently from 64448bb to e8a3486 Compare October 21, 2025 14:09
@karolinku karolinku changed the title from "[WIP] Inform user of 3rd party python modules that can break system after reboot" to "Add upgrade inhibitor if 3rd party python modules detected" Oct 21, 2025
Member

@pirat89 pirat89 left a comment

It still needs some changes. See my comments for more details.

"""
topic = SystemInfoTopic

is_third_party_module_present = fields.Boolean(default=False)
Member

It's a question whether listing the third-party Python modules has value, but currently they are not listed in the model yet.

third_party_rpm_names = fields.List(fields.String(), default=[])
"""
List of names of RPMs that own third-party Python modules. Empty list if no modules found.
"""
Member

Possibly we could have the list of such third-party Python modules for the target Python version listed here as well.

@karolinku karolinku force-pushed the 3rd-party-components-fix branch 8 times, most recently from a16a8bd to 2c934cd Compare October 31, 2025 15:50
@karolinku karolinku force-pushed the 3rd-party-components-fix branch 2 times, most recently from 50aefbd to 1352064 Compare November 3, 2025 13:23
if third_party_rpms:
    api.current_logger().info(
        'Complete list of third-party RPM packages:\n{}'.format(
            '\n'.join(' - {}'.format(rpm) for rpm in third_party_rpms)
Member

Please use FMT_LIST_SEPARATOR = '\n - ' here as well; the same applies to all the other logs. I also suggest using the _formatted_list_output function, so that future refactoring will be easier.

@karolinku karolinku marked this pull request as ready for review November 4, 2025 12:19
@karolinku karolinku force-pushed the 3rd-party-components-fix branch 2 times, most recently from 029f602 to e07c47c Compare November 5, 2025 12:21
@karolinku karolinku force-pushed the 3rd-party-components-fix branch from e07c47c to 34a5d75 Compare November 5, 2025 12:27
@karolinku karolinku changed the title from "Add upgrade inhibitor if 3rd party python modules detected" to "Add detection for third-party target Python modules" Nov 5, 2025
@karolinku karolinku requested a review from pirat89 November 5, 2025 12:29
Member

@MichalHe MichalHe left a comment

The solution looks good logic-wise 👍. Probably the biggest issue is that the code emits warnings into the upgrade log about something that is considered normal and should not have the potential to negatively impact the upgrade process.

reporting.Summary(
    'Third-party target Python modules may interfere with '
    'the upgrade process or cause unexpected behavior after the upgrade.\n\n'
    'Non-distribution RPM packages detected:{}\n\n'
Member

What if there are no third party rpms? The output would contain only:

Non-distribution RPM packages detected:

Non-distribution modules detected (list possibly incomplete): ...

Which I think will be confusing for the user.

Contributor

This report is created only if any modules or RPMs are found; however, you are right that only one of the two may be present.
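
One possible way to avoid an empty heading is to assemble the summary from only the sections that have content. The sketch below is an assumption about how that could look, not the PR's code: `build_summary` is a hypothetical helper, and the section texts are copied from the snippet under review.

```python
FMT_LIST_SEPARATOR = '\n - '


def build_summary(third_party_rpms, third_party_modules):
    """Build the report summary, skipping sections with no findings."""
    parts = [
        'Third-party target Python modules may interfere with '
        'the upgrade process or cause unexpected behavior after the upgrade.'
    ]
    if third_party_rpms:
        parts.append('Non-distribution RPM packages detected:{}'.format(
            ''.join(FMT_LIST_SEPARATOR + rpm for rpm in third_party_rpms)
        ))
    if third_party_modules:
        parts.append('Non-distribution modules detected (list possibly incomplete):{}'.format(
            ''.join(FMT_LIST_SEPARATOR + module for module in third_party_modules)
        ))
    return '\n\n'.join(parts)
```

With an empty RPM list, the "Non-distribution RPM packages detected:" heading is simply omitted, so the user never sees a heading followed by nothing.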

Comment on lines +24 to +25
result = run([python_interpreter, '-c', 'import sys, json; print(json.dumps(sys.path))'])['stdout']
return json.loads(result)
Member

You are frequently ad-hoc constructing a Path object and then throwing it away (when resolving symlinks, when using rglob). Why not construct it once, and use it where required?

Suggested change
- result = run([python_interpreter, '-c', 'import sys, json; print(json.dumps(sys.path))'])['stdout']
- return json.loads(result)
+ result = run([python_interpreter, '-c', 'import sys, json; print(json.dumps(sys.path))'])['stdout']
+ raw_paths = json.loads(result)
+ paths = [Path(raw_path) for raw_path in raw_paths]
+ return paths
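
For readers who want to try the idea outside of leapp, here is a self-contained sketch of the same approach. It is an approximation under stated assumptions: the standard library's `subprocess.run` stands in for leapp's `run` helper, and `get_python_sys_paths` is a hypothetical function name.

```python
import json
import subprocess
import sys
from pathlib import Path


def get_python_sys_paths(python_interpreter):
    """Ask the given interpreter for its sys.path and return Path objects."""
    result = subprocess.run(
        [python_interpreter, '-c', 'import sys, json; print(json.dumps(sys.path))'],
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
    raw_paths = json.loads(result.stdout)
    # Build each Path once so that later symlink resolution and rglob calls can reuse it.
    return [Path(raw_path) for raw_path in raw_paths]


# Example: query the interpreter that runs this script.
paths = get_python_sys_paths(sys.executable)
```

Returning `Path` objects from the one function that produces the paths keeps the conversion in a single place, which is the point of the suggestion.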

api.current_logger().warning(
    'Found Python files from non-distribution RPM package: {}'.format(rpm_name)
)
third_party_files.extend([str(f) for f in files])
Member

Files should already be a list of strings, right?

Suggested change
- third_party_files.extend([str(f) for f in files])
+ third_party_files.extend(files)


rpms_to_check, third_party_unowned_files = scan_python_files(system_paths[1:], rpm_files)

third_party_rpms, rpm_owned_files = identify_unsigned_rpms(rpms_to_check)
Member

The name is confusing, I was not sure whether these files come from signed or unsigned RPMs (= whether they are problematic or not)

Suggested change
- third_party_rpms, rpm_owned_files = identify_unsigned_rpms(rpms_to_check)
+ third_party_rpms, unsigned_rpm_files = identify_unsigned_rpms(rpms_to_check)
