Implement lcov coverage collection for libvirt#4326
Implement lcov coverage collection for libvirt#4326chunfuwen merged 1 commit intoavocado-framework:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
The pull request introduces a new function collect_lcov_coverage to gather code coverage data using lcov. However, this function contains a command injection vulnerability due to unsafe construction of a shell command with unquoted parameters, which should be addressed by properly escaping the parameters using shlex.quote(). Additionally, I've proposed refactoring a nested conditional block to improve readability and using f-strings for consistency with modern Python practices.
| collect_cmd = ( | ||
| "lcov --capture " | ||
| "--directory %s " | ||
| "--output-file %s " | ||
| "--test-name %s " | ||
| "--rc lcov_branch_coverage=1" % (build_dir, tracefile, test_name) | ||
| ) | ||
|
|
||
| try: | ||
| a_process.system(collect_cmd, shell=True) |
There was a problem hiding this comment.
The collect_lcov_coverage function is vulnerable to command injection. The collect_cmd is constructed using string interpolation with parameters build_dir, tracefile, and test_name, and then executed with shell=True. If any of these parameters contain shell metacharacters (e.g., ;, $(...), `...`), an attacker could execute arbitrary commands on the system. Constructing shell commands with string formatting in this manner is unsafe. To remediate this, use shlex.quote() to escape all variables before they are interpolated into the command string, or avoid using shell=True and pass the command as a list of arguments to prevent shell interpretation.
import shlex
collect_cmd = (
"lcov --capture "
"--directory %s "
"--output-file %s "
"--test-name %s "
"--rc lcov_branch_coverage=1" % (shlex.quote(build_dir),
shlex.quote(tracefile),
shlex.quote(test_name))
)
try:
a_process.system(collect_cmd, shell=True)| ) | ||
|
|
||
| try: | ||
| a_process.system(collect_cmd, shell=True) |
| return None | ||
|
|
||
| os.makedirs(output_dir, exist_ok=True) | ||
| tracefile = os.path.join(output_dir, "coverage_%s.info" % test_name) |
| if os.path.exists(tracefile): | ||
| file_size = os.path.getsize(tracefile) | ||
| if file_size > 0: | ||
| LOG.info( | ||
| "%s coverage tracefile saved: %s (%d bytes)", | ||
| component.upper(), | ||
| tracefile, | ||
| file_size, | ||
| ) | ||
| return tracefile | ||
| else: | ||
| LOG.warning("%s coverage file is empty, removing: %s", | ||
| component, tracefile) | ||
| os.unlink(tracefile) | ||
| return None | ||
| else: | ||
| LOG.warning("%s coverage file was not created: %s", component, tracefile) | ||
| return None |
There was a problem hiding this comment.
The nested if/else block can be flattened to improve readability. You can check for the failure conditions first and return early. This is often called a "guard clause" pattern.
if not os.path.exists(tracefile):
LOG.warning("%s coverage file was not created: %s", component, tracefile)
return None
file_size = os.path.getsize(tracefile)
if file_size > 0:
LOG.info(
"%s coverage tracefile saved: %s (%d bytes)",
component.upper(),
tracefile,
file_size,
)
return tracefile
LOG.warning("%s coverage file is empty, removing: %s",
component, tracefile)
os.unlink(tracefile)
return None632bcd3 to
85f9af4
Compare
954016e to
ccd2c5c
Compare
|
@YongxueHong Please help to review this one as a high priority task. Thanks! |
|
|
||
| # Libvirt code coverage configuration | ||
| # Enable/disable gcov for libvirt, default: disable | ||
| gcov_libvirt = no |
There was a problem hiding this comment.
Whether collect libvirt or qemu code coverage depends on parameters dynamically passed in by Kar, therefore needs to confirm with Kar members whether this parameter value can be override by parameters passed in by Kar
There was a problem hiding this comment.
Kar and avocado can set different variants values in command line. Please check the test results in description.
| # Coverage output format: lcov or html | ||
| gcov_libvirt_format = lcov | ||
| # Libvirt build directory where .gcda files are generated | ||
| gcov_libvirt_builddir = /var/tmp/libvirt |
There was a problem hiding this comment.
if we use libvirt virtcov packages, the value for this needs to be /builddir/build/BUILD
There was a problem hiding this comment.
Yes, we need to get the proper path after the package installation, so I just put a fake dir here. Users need to specify the correct one during their testings.
| "--directory %s " | ||
| "--output-file %s " | ||
| "--test-name %s " | ||
| "--rc lcov_branch_coverage=1" % (build_dir, tracefile, test_name) |
There was a problem hiding this comment.
suggest lcov_branch_coverage is configured. Since previous experience show branch level coverage will add up 30% more additional data ,and extend this case running time
| try: | ||
| path.find_command("lcov") | ||
| except path.CmdNotFoundError: | ||
| LOG.warning("lcov package not installed, cannot collect QEMU coverage") |
There was a problem hiding this comment.
For lcov , we needs to use https://github.com/LuyaoHuang/lcov --rhel version since it fix the issue that test case can not contain special letters. So here we needs to consider this point
There was a problem hiding this comment.
Yes, that's right. I think it's better to put lcov installation part in ci, similar place with libvirt/qemu coverage package installation, so I just checked lcov command here.
| return None | ||
|
|
||
|
|
||
| def collect_html_coverage(build_dir, output_dir, component, cmd_opts="--html"): |
There was a problem hiding this comment.
I suggest moving the collect_lcov_coverage and collect_html_coverage to a new utility module(e.g: virttest/utils_coverage)
From my understanding of both functions, I think both mainly handle the data of the coverage as a utility for the upper layer, but the test_setup package is responsible for setting up the essential environment for running tests.
Please let me know your thoughts if I misunderstood. Thanks.
There was a problem hiding this comment.
As I know, test_set is used for pre/post test, lcov collection will be executed in postprocess in env_process.py, so I think it should be in the current location. what do you think?
There was a problem hiding this comment.
Yeah, most of the modules of the package test_setup are used for the pre/post process, but there are some cases where using it within their own test cases. For example:
https://github.com/autotest/tp-libvirt/blob/ffe5bdb31268dd04a38a6df59df4687629d269e7/libvirt/tests/src/migration/migration_numa/migration_numatune.py#L66-L67
https://github.com/autotest/tp-qemu/blob/22f974662556eb283722d230c99406d908c010fc/qemu/tests/hugepage_specify_node.py#L70-L72
Structurally, there is no hard coupling between test_setup and collect_html_coverage; here, we just want to need it to deal with the coverage data in the postprocess.
Theoretically, as long as I provide the argument build_dir, output_dir, component, and cmd_opts, I can call this function anywhere, right? For example, if the user generated their own coverage data, and then the user could use this function to collect, it means it can work not only in the pre/post process, but also in the test case if there is such a requirement in the future. This flexibility allows users to collect coverage data in the post-process or even mid-test if a specific requirement arises.
Plus, treating this as a standalone utility module doesn't change your current implementation, but it does make the architecture cleaner, more extensible, and easier to maintain for future development."
Those are just my personal thoughts, but I’d love to hear more perspectives. @chunfuwen @nickzhq, what do you think? If everyone else is in agreement with your point, I’m happy to move forward with it. Thanks!
There was a problem hiding this comment.
@YongxueHong I appreciate your suggestion for moving 'collect_lcov_coverage' and 'collect_gcovr_coverage' to a new utils_coverage.py file to promote modularity. However, I think their current location in virttest/test_setup/gcov.py is appropriate for the following reasons:
- Since the coverage flag is not used in the release, I don't believe it's necessary for a test. Please let me know me if you are aware of any test that uses it.
- The gcov.py file currently have all functionality related to gcov. This includes the 'ResetGCov' class, which handles resetting gcov counters as part of the test setup, and the 'collect_*_coverage functions', which specifically process gcov output files (.gcda) using lcov and gcovr. Keeping them together ensures that all gcov-related logic is self-contained and easily discoverable within a single module.
- Both resetting and collecting gcov data are integral parts of a complete gcov-based coverage workflow within the testing environment. Separating the collection functions would break this natural grouping and could make the overall gcov management logic harder to follow.
- While lcov and gcovr are general tools, their usage here is specifically for GCOV. Naming a file utils_coverage.py might imply it contains utilities for any type of coverage. We still have coverage.py in avocado.
- I'm trying to determine if clear guidelines exist for using test_setup/ and other utils_*.py files, specifically to check if my current approach aligns with any rules. Can you please share them to me if you have one?
There was a problem hiding this comment.
@YongxueHong I appreciate your suggestion for moving 'collect_lcov_coverage' and 'collect_gcovr_coverage' to a new utils_coverage.py file to promote modularity. However, I think their current location in virttest/test_setup/gcov.py is appropriate for the following reasons:
- The gcov.py file currently have all functionality related to gcov. This includes the 'ResetGCov' class, which handles resetting gcov counters as part of the test setup, and the 'collect_*_coverage functions', which specifically process gcov output files (.gcda) using lcov and gcovr. Keeping them together ensures that all gcov-related logic is self-contained and easily discoverable within a single module.
Hi @Yingshun
I agree with your opinion about the description of the function collect_*_coverage functions that process the output of the gcov and lcov, but I think those functions not only serve for the test_setup process, which you mentioned the ResetGCov test_setup, but also elsewhere as long as it provided the gcov output files and the user want to process the files if needed in the future.
- Both resetting and collecting gcov data are integral parts of a complete gcov-based coverage workflow within the testing environment. Separating the collection functions would break this natural grouping and could make the overall gcov management logic harder to follow.
Following up on your comment: should we move the gcov data collection into the cleanup(self) method? Since setup(self) already handles resetting the gcov data, placing the collection in cleanup would create a more symmetric and intuitive gcov-based coverage workflow. This ensures that the setup and teardown operations remain balanced and logically grouped.
- While lcov and gcovr are general tools, their usage here is specifically for GCOV. Naming a file utils_coverage.py might imply it contains utilities for any type of coverage.
Yeah, if we name the utils_coverage.py, it means that it can work for any type of coverage, but we can add the corresponding functions to this module if there is other type supportment in the future. And this design principle does make sense.
We still have coverage.py in avocado.
There is no conflict between the aovocado and avocado-vt since they have their own namespace, refer to aovocado.utils.disk vs virttest.utils_disk.
- I'm trying to determine if clear guidelines exist for using test_setup/ and other utils_*.py files, specifically to check if my current approach aligns with any rules. Can you please share them to me if you have one?
Unfortunately, I assume there are no such guidelines and docs for developing the test_setup and utils_*, instead, it depends on the understanding of each maintainer.
Hi @Yingshun
While we have different perspectives on this specific implementation, I don't believe it's a critical issue for the framework since it doesn't introduce a risk that can't be managed later if needed. Your approach is a solid solution, so I’m happy to approve it and move forward to keep the project on track.
Thanks for sharing your thoughts and working through this with me!
5b12fce to
2b80c58
Compare
700013f to
ab28cb6
Compare
1. Add lcov coverage collection for both libvirt and qemu 2. Move gcovr collection from env_process to gcov.py Signed-off-by: Yingshun Cui <yicui@redhat.com>
|
merge it since it is urgently needes. |
Test results: