-
Notifications
You must be signed in to change notification settings - Fork 23
Chore/scan tunning parameters #174
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
0e16a3e to
1bd454d
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
1bd454d to
c12833c
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
c12833c to
92ed9a1
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
92ed9a1 to
ebe0ef3
Compare
|
Warning Rate limit exceeded@agustingroh has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 21 minutes and 59 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (12)
WalkthroughAdded snippet-tuning configuration, CLI flags, and settings-schema entries; introduced ScanSettingsBuilder to merge CLI, file_snippet, and root settings; propagated a Changes
Sequence Diagram(s)sequenceDiagram
participant User as User (CLI)
participant CLI as cli.py
participant Settings as ScanossSettings (file/snippet)
participant Builder as ScanSettingsBuilder
participant Scanner as Scanner
participant API as ScanossApi
participant HTTP as HTTP
User->>CLI: run scan with flags (e.g. --min-snippet-hits, --ranking)
CLI->>Settings: load scanoss.json (root + file_snippet)
CLI->>Builder: instantiate with scanoss_settings and CLI values
Note over Builder: Merge precedence: file_snippet > root > CLI
Builder-->>CLI: return merged scan_settings
CLI->>Scanner: Scanner(scanoss_settings, min_snippet_hits, ...)
Scanner->>API: instantiate ScanossApi(..., min_snippet_hits, ranking, ...)
API->>API: _build_scan_settings_header() -> base64(JSON of non-default tunings)
API->>HTTP: scan() with header "scanoss-settings: <base64-payload>"
HTTP-->>API: response
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Areas to focus review on:
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/scanoss/scanossapi.py (1)
58-80: Scan settings header logic is sound; update docs to match behaviourFunctionally this looks good:
- Tuning fields are stored on the instance and
_build_scan_settings_header()emits a base64-encoded JSON blob only when values are “meaningful” (!= 0for snippet counts,!= 'unset'for booleans,!= -1for ranking threshold).scan()attaches the header asscanoss-settings, so the extra metadata rides with each request.A few minor doc cleanups would help:
__init__docstring still says the new params default toNone, buthonour_file_extsandrankingactually default to'unset'._build_scan_settings_headerdocstring referencesx-scanoss-scan-settings; the actual header key isscanoss-settings.- Consider clarifying in the docstring that
0and-1are treated as “defer to server configuration” and intentionally omitted from the header.No changes to runtime logic are needed.
Also applies to: 90-95, 173-180, 293-326
src/scanoss/scanner.py (2)
147-171: Skip-headers merge logic is correct; fix_merge_cli_with_settingsdocstringThe new logic to merge
skip_headersandskip_headers_limit:settings_skip_headers = file_snippet_settings.get('skip_headers') settings_skip_headers_limit = file_snippet_settings.get('skip_headers_limit') skip_headers = Scanner._merge_cli_with_settings(skip_headers, settings_skip_headers) skip_headers_limit = Scanner._merge_cli_with_settings(skip_headers_limit, settings_skip_headers_limit)combined with:
if settings_value is not None: return settings_value return cli_valuecorrectly gives precedence to values from
scanoss.jsonover CLI defaults, which is sensible.However, the
_merge_cli_with_settingsdocstring claims “Merged value with CLI taking priority over settings”, which is the opposite of what the function actually does (and what the code above relies on). I’d update the “Returns:” description to clearly state that settings take priority over CLI values.Also applies to: 248-260
173-238:post_processorcreation condition uses the wrong variableHere:
scan_settings = (ScanSettingsBuilder(scanoss_settings) .with_proxy(proxy) .with_url(url) .with_ignore_cert_errors(ignore_cert_errors) .with_min_snippet_hits(min_snippet_hits) .with_min_snippet_lines(min_snippet_lines) .with_honour_file_exts(honour_file_exts) .with_ranking(ranking) .with_ranking_threshold(ranking_threshold) ) ... self.post_processor = ( ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None )
scan_settingsis always aScanSettingsBuilderinstance, so this condition is effectively always true, even whenscanoss_settingsisNone. That’s a behavioural change from the usual “only post-process when we actually have settings” pattern and risks constructingScanPostProcessorwith aNonesettings object.If
ScanPostProcessorassumes a non-Nonesettings object, this could lead to unexpected errors; even if it handlesNone, the condition is misleading.Suggest switching the condition to check
scanoss_settingsinstead of the builder:Proposed fix
- self.post_processor = ( - ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None - ) + self.post_processor = ( + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) + if scanoss_settings + else None + )
🧹 Nitpick comments (4)
tests/test_scan_settings_builder.py (1)
33-359: Good coverage of ScanSettingsBuilder behaviourThe tests exercise initialization, helper methods, all
with_*methods, priority rules, and chaining in a realistic way usingtests/data/scanoss.json. This gives solid confidence in the builder logic.If you want to harden things further, consider adding a couple of tests for sentinel values (
min_snippet_hits == 0,ranking_threshold == -1) to reflect the semantics used when building the API header (i.e., “defer to server config” cases).src/scanoss/cli.py (1)
190-222: Snippet-tuning CLI options wired correctly; consider tightening validationThe new
--min-snippet-hits,--min-snippet-lines,--ranking,--ranking-threshold, and--honour-file-extsoptions are consistent with howScannerandScanSettingsBuilderconsume them, including the “0 / -1 mean defer to server config” semantics.Two optional polish items:
--ranking-threshold: you document-1..99but don’t enforce it; a customtypeor validator would catch out-of-range values early.--honour-file-exts: help text is clear here, but its JSON-schema description currently says “Ignores file extensions”; aligning those descriptions would avoid confusion between positive/negative wording.docs/source/_static/scanoss-settings-schema.json (1)
142-260: Schema additions align with code; tweakhonour_file_extsdescriptionThe new
settings.proxy,settings.http_config,settings.file_snippet,settings.hpfm, andsettings.containerblocks line up well with the new getters and builder logic in code (same field names and types, includingranking_*,min_snippet_*,honour_file_exts, skip header controls, etc.).One small documentation inconsistency:
file_snippet.properties.honour_file_exts.descriptionsays “Ignores file extensions…”, but the field name and CLI option (--honour-file-exts) are phrased positively (“honour file extensions”).To avoid confusion, it would be clearer if this description matched the positive semantics (or explicitly stated whether
truemeans “honour” vs “ignore”).src/scanoss/scanner.py (1)
342-346: Dependency auto-enable via settings: confirm intended behaviour
is_dependency_scan()now returnsTrueeither when the dependency bit is set inscan_optionsor whensettings.file_snippet.dependency_analysisis true:if self.scan_options & ScanType.SCAN_DEPENDENCIES.value: return True file_snippet_settings = self.scanoss_settings.get_file_snippet_settings() if self.scanoss_settings else {} return file_snippet_settings.get('dependency_analysis', False)This means a settings file can implicitly turn on dependency scanning even if the user didn’t pass
--dependencies/--dependencies-only. That’s reasonable, but it is a change in behaviour.If this is intentional, it might be worth documenting somewhere that
dependency_analysisinscanoss.jsonacts as a default “enable deps scan” toggle, overridden only by explicitly disabling dependency bits inscan_options.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)docs/source/_static/scanoss-settings-schema.json(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
ScanSettingsBuilder(31-263)_str_to_bool(220-226)_merge_with_priority(203-209)_merge_cli_with_settings(212-216)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
ScanossSettings(76-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (4)
ScanossSettings(76-433)load_json_file(103-138)set_file_type(140-152)set_scan_type(154-161)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (6)
ScanossSettings(76-433)get_file_snippet_settings(339-345)get_file_snippet_proxy(419-425)get_proxy(403-409)get_http_config(411-417)get_file_snippet_http_config(427-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
🪛 markdownlint-cli2 (0.18.1)
CHANGELOG.md
776-776: Link and image reference definitions should be needed
Duplicate link or image reference definition: "1.42.0"
(MD053, link-image-reference-definitions)
🔇 Additional comments (7)
src/scanoss/__init__.py (1)
25-25: LGTM!Version bump to 1.43.0 appropriately reflects the new scan tuning parameters and
ScanSettingsBuilderadditions.scanoss.json (1)
18-21: LGTM!Adding
scanoss-winnowing.pyto the BOM includes is appropriate since it's a declared dependency used for fast winnowing functionality.CHANGELOG.md (1)
11-20: LGTM!The changelog entry for version 1.43.0 clearly documents the new scan engine tuning parameters and
ScanSettingsBuilderfunctionality.CLIENT_HELP.md (1)
262-304: LGTM!Comprehensive documentation for the new snippet tuning options. The examples are clear, and the note about using
=syntax for negative values (line 292) is a helpful detail for users.tests/data/scanoss.json (1)
28-51: LGTM!The test data structure effectively exercises the nested configuration hierarchy with distinct values at root and
file_snippetlevels (e.g., different proxy hosts and http_config settings), which enables proper testing of theScanSettingsBuildermerge priority logic.src/scanoss/cli.py (1)
1410-1437: scan / wfp / folder- settings wiring looks consistent*The refactor to use
scanoss_settings(instead of the oldscan_settings) inwfp(),scan(),folder_hashing_scan(), andfolder_hash()is coherent:
- Settings files are loaded once into
ScanossSettings, with legacy vs new file types handled inscan().- The same
scanoss_settingsinstance is threaded intoScanner,ScannerHFH, andFolderHasher.- New snippet-tuning CLI options are only passed to
Scannerfor thescancommand, which is appropriate given they affect server-side scanning, not fingerprint-only flows.No functional issues stand out in these call sites.
Also applies to: 1518-1532, 1597-1641, 2716-2831
src/scanoss/scanoss_settings.py (1)
339-433: New settings accessors are straightforward and consistentThe added getters for
file_snippet, proxy, andhttp_config(includingmin_snippet_*, ranking,honour_file_exts, skip headers, etc.) correctly mirror the underlying data structure and provide clear defaults (None/0/Falseas documented). They integrate cleanly withScanSettingsBuilderandScanner.
ebe0ef3 to
59b57ba
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
CHANGELOG.md (1)
775-776: Fix duplicate link reference definition.The duplicate link reference issue flagged in the previous review remains unresolved. Line 776 uses
[1.42.0]as the link label, but it should be[1.43.0]to match the version and prevent Markdown rendering issues.🔎 Apply this diff to fix the link label:
[1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.41.1...v1.42.0 -[1.42.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0 +[1.43.0]: https://github.com/scanoss/scanoss.py/compare/v1.42.0...v1.43.0
🧹 Nitpick comments (2)
src/scanoss/cli.py (1)
190-222: Fix inconsistent spacing in choices lists.Lines 206 and 219 have inconsistent spacing in the
choicesparameter:
- Line 206:
choices=['unset' ,'true', 'false'](space before comma)- Line 219:
choices=['unset','true', 'false'](no space after first comma)🔎 Apply this diff for consistent formatting:
p_scan.add_argument( '--ranking', type=str, - choices=['unset' ,'true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Enable or disable ranking (optional - default: server configuration)', ) p_scan.add_argument( '--honour-file-exts', type=str, - choices=['unset','true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Honour file extensions during scanning. When not set, defers to server configuration (optional)', )src/scanoss/scanossapi.py (1)
75-79: Fix spacing before comma in parameter default.Line 77 has a space before the comma:
honour_file_exts = 'unset' ,🔎 Apply this diff:
- honour_file_exts = 'unset' , + honour_file_exts='unset',
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- CLIENT_HELP.md
🚧 Files skipped from review as they are similar to previous changes (3)
- src/scanoss/scan_settings_builder.py
- tests/data/scanoss.json
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/scanoss/cli.py (2)
src/scanoss/scanoss_settings.py (4)
ScanossSettings(76-433)load_json_file(103-138)set_file_type(140-152)set_scan_type(154-161)src/scanoss/inspection/utils/file_utils.py (1)
load_json_file(29-44)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
src/scanoss/__init__.py (1)
25-25: LGTM!Version bump is consistent with the new features introduced in this PR.
scanoss.json (1)
18-21: LGTM!Adding the scanoss-winnowing.py library to the BOM is appropriate for dependency tracking.
src/scanoss/scanoss_settings.py (1)
339-433: LGTM!The new getter methods provide clean access to file_snippet settings with consistent patterns, appropriate defaults, and clear documentation.
src/scanoss/scanner.py (1)
173-182: TheScanSettingsBuilderimplementation does not match the stated priority order. The builder implements file_snippet > root settings > CLI, where CLI arguments are the lowest priority (used as fallback). This is the opposite of the PR objectives which require CLI > file_snippet > root settings. The priority order logic must be reversed in the merge methods to ensure CLI arguments take precedence when provided.Likely an incorrect or invalid review comment.
59b57ba to
0fb75e9
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
src/scanoss/scanner.py (1)
147-157: Priority order contradicts PR objectives.Lines 153-155 use
_merge_cli_with_settingswhich implements "settings > cli" priority (as seen in lines 258-260). However, the PR objectives explicitly state: "priority: CLI > file_snippet > root settings".This means CLI arguments should override settings file values, but the current implementation does the opposite for
skip_headersandskip_headers_limit.🔎 Apply this diff to fix the priority order:
@staticmethod def _merge_cli_with_settings(cli_value, settings_value): - """Merge CLI value with settings value (two-level priority: settings > cli). + """Merge CLI value with settings value (two-level priority: cli > settings). Args: cli_value: Value from CLI argument settings_value: Value from scanoss.json file_snippet settings Returns: - Merged value with CLI taking priority over settings + Merged value with CLI taking priority over settings file """ + if cli_value is not None: + return cli_value if settings_value is not None: return settings_value - return cli_value + return None
🧹 Nitpick comments (3)
src/scanoss/scanossapi.py (1)
75-79: Consider type enforcement for boolean parameters.The
honour_file_extsandrankingparameters useUnion[bool, str, None]to support both boolean values and the 'unset' sentinel string. While theScanSettingsBuildercorrectly converts string inputs to booleans before passing them toScanossApi, direct instantiation ofScanossApicould lead to strings like 'true'/'false' being serialized as JSON strings rather than booleans.Consider adding runtime validation in
_build_scan_settings_header()to ensure these values are booleans when included:🔎 Suggested fix
# honour_file_exts: None = unset, don't send if self.honour_file_exts is not None and self.honour_file_exts != 'unset': - settings['honour_file_exts'] = self.honour_file_exts + settings['honour_file_exts'] = bool(self.honour_file_exts) if not isinstance(self.honour_file_exts, bool) else self.honour_file_exts # ranking: None = unset, don't send if self.ranking is not None and self.ranking != 'unset': - settings['ranking_enabled'] = self.ranking + settings['ranking_enabled'] = bool(self.ranking) if not isinstance(self.ranking, bool) else self.rankingtests/test_scan_settings_builder.py (1)
36-38: Consider usingsetUpClassfor shared fixtures.The
scan_settingsfixture is created at class definition time. While this works, usingsetUpClassis the conventional approach for shared test fixtures in unittest:🔎 Suggested refactor
class TestScanSettingsBuilder(unittest.TestCase): """Tests for the ScanSettingsBuilder class.""" - script_dir = os.path.dirname(os.path.abspath(__file__)) - scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve() - scan_settings = ScanossSettings(filepath=scan_settings_path) + @classmethod + def setUpClass(cls): + script_dir = os.path.dirname(os.path.abspath(__file__)) + cls.scan_settings_path = Path(script_dir, 'data', 'scanoss.json').resolve() + cls.scan_settings = ScanossSettings(filepath=str(cls.scan_settings_path))src/scanoss/cli.py (1)
190-222: Fix inconsistent spacing in choices lists.Lines 206 and 219 have inconsistent spacing in the
choicesparameter:
- Line 206:
choices=['unset' ,'true', 'false']has a space before the comma after 'unset'- Line 219:
choices=['unset','true', 'false']is missing a space after 'unset' comma🔎 Apply this diff to fix the spacing:
p_scan.add_argument( '--ranking', type=str, - choices=['unset' ,'true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Enable or disable ranking (optional - default: server configuration)', ) p_scan.add_argument( '--ranking-threshold', type=int, default=None, help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)', ) p_scan.add_argument( '--honour-file-exts', type=str, - choices=['unset','true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Honour file extensions during scanning. When not set, defers to server configuration (optional)', )
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- scanoss.json
- CHANGELOG.md
- src/scanoss/scan_settings_builder.py
- src/scanoss/init.py
🧰 Additional context used
🧬 Code graph analysis (3)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug(58-63)src/scanoss/spdxlite.py (1)
print_debug(61-66)
tests/test_scan_settings_builder.py (3)
src/scanoss/scan_settings_builder.py (12)
ScanSettingsBuilder(31-266)_str_to_bool(223-229)_merge_with_priority(203-209)_merge_cli_with_settings(212-219)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
ScanossSettings(76-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
src/scanoss/scanner.py (2)
src/scanoss/scan_settings_builder.py (10)
ScanSettingsBuilder(31-266)_merge_cli_with_settings(212-219)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings(339-345)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (14)
tests/data/scanoss.json (1)
29-51: LGTM!The test data file is well-structured with a clear hierarchy between root settings and file_snippet settings, enabling comprehensive testing of the priority-based configuration merging logic.
src/scanoss/scanoss_settings.py (1)
339-433: LGTM!The new accessor methods for file_snippet settings are well-implemented with:
- Consistent patterns across all getters
- Proper Optional return types
- Clear docstrings documenting expected return values
- Clean separation between root-level and file_snippet-level configuration access
src/scanoss/scanossapi.py (2)
176-179: LGTM!The integration of the
scanoss-settingsheader into the scan method is clean and correct. The header is only added when meaningful settings are configured.
293-326: LGTM!The
_build_scan_settings_headermethod correctly:
- Excludes sentinel values (0, 'unset', -1, None) from the serialized payload
- Uses base64 encoding for safe header transmission
- Logs the settings for debugging before encoding
- Returns None when no settings need to be sent
CLIENT_HELP.md (1)
262-304: LGTM!The documentation for the new snippet tuning options is comprehensive and well-structured:
- Clear explanations of each flag's purpose
- Practical usage examples
- Helpful note about
=syntax for negative threshold values- Example showing how to combine multiple options
tests/test_scan_settings_builder.py (2)
44-117: LGTM!Excellent test coverage for:
- Initialization with None and with settings object
- Static helper methods (
_str_to_bool,_merge_with_priority,_merge_cli_with_settings)- Edge cases like all-None inputs and case-insensitive boolean parsing
119-358: LGTM!Comprehensive test coverage for all
with_*methods including:
- CLI-only scenarios (no settings file)
- Settings file values taking priority over CLI
- Method chaining verification
- All tuning parameters: proxy, url, ignore_cert_errors, min_snippet_hits, min_snippet_lines, honour_file_exts, ranking, ranking_threshold
src/scanoss/cli.py (3)
1410-1418: LGTM!Settings loading is correctly refactored to use
scanoss_settingsinstead ofscan_settings, maintaining consistency with the rest of the codebase.
1518-1535: LGTM!Settings loading logic correctly handles different scenarios (identify, ignore, and standard settings) while maintaining backward compatibility.
1631-1641: LGTM!New snippet tuning parameters are correctly wired from CLI arguments to the Scanner constructor, enabling configuration of snippet matching behavior.
src/scanoss/scanner.py (4)
110-117: LGTM!Constructor signature correctly updated to accept the new snippet tuning parameters and renamed
scanoss_settingsfor clarity. The new parameters enable fine-grained control over snippet matching behavior.
186-206: LGTM!The
ScanossApiis correctly initialized with merged settings fromScanSettingsBuilder, properly wiring through the new snippet tuning parameters.
337-345: LGTM!The
is_dependency_scanmethod correctly falls back to checkingdependency_analysisin thefile_snippetsettings when the explicit scan option is not set, providing flexible configuration.
173-184: Correct the stated priority order in review comment.The review comment incorrectly states the PR objectives. The ScanSettingsBuilder class documentation specifies priority as: file_snippet section > settings section > CLI arguments, and the
_merge_with_priorityimplementation correctly follows this priority order. The code is working as designed; the review comment's stated objectives are reversed.Likely an incorrect or invalid review comment.
0fb75e9 to
c62cf42
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
src/scanoss/cli.py (1)
1511-1516: Fix typo in conflicting options error message.The error message currently says
--skip-file-settings, but the actual flag is--skip-settings-file. Updating the string will avoid confusion for users trying to fix their invocation.Suggested diff
- if args.settings and args.skip_settings_file: - print_stderr('ERROR: Cannot specify both --settings and --skip-file-settings options.') + if args.settings and args.skip_settings_file: + print_stderr('ERROR: Cannot specify both --settings and --skip-settings-file options.')src/scanoss/scanpostprocessor.py (2)
112-121: Resolve PLW2901 by avoiding reassignment of loop variable in _load_component_info.
resultis used as the loop variable and then reassigned, which triggersPLW2901and breaks lint. Use a separate local name for the normalized result.Suggested diff
- for _, result in self.results.items(): - result = result[0] if isinstance(result, list) else result - purls = result.get('purl', []) - for purl in purls: - self.component_info_map[purl] = result + for _, result in self.results.items(): + normalized_result = result[0] if isinstance(result, list) else result + purls = normalized_result.get('purl', []) + for purl in purls: + self.component_info_map[purl] = normalized_result
159-164: Resolve PLW2901 in _replace_purls by not overwritingresult.Same pattern here: the loop variable
resultis reassigned, triggering the lint error. Use a separate variable to hold the normalized result before passing it downstream.Suggested diff
- for result_path, result in self.results.items(): - result = result[0] if isinstance(result, list) else result - should_replace, to_replace_with_purl = self._should_replace_result(result_path, result, to_replace_entries) - if should_replace: - self.results[result_path] = [self._update_replaced_result(result, to_replace_with_purl)] + for result_path, result in self.results.items(): + normalized_result = result[0] if isinstance(result, list) else result + should_replace, to_replace_with_purl = self._should_replace_result( + result_path, + normalized_result, + to_replace_entries, + ) + if should_replace: + self.results[result_path] = [self._update_replaced_result(normalized_result, to_replace_with_purl)]src/scanoss/scanner.py (1)
174-238: Fix post-processor initialisation to avoid None scanoss_settings crash.
ScanSettingsBuilder(scanoss_settings)is always truthy, soscan_settingsis truthy even whenscanoss_settingsisNone. As a result:
self.post_processoris created withscanoss_settings=None.- Later,
post_process()callsself.scanoss_settings.is_legacy(), which will raiseAttributeErrorwhen no settings file is used (e.g.--skip-settings-file).You should gate the post-processor on
scanoss_settingsitself, not on the merged builder object.Suggested diff
- scan_settings = (ScanSettingsBuilder(scanoss_settings) + scan_settings = ( + ScanSettingsBuilder(scanoss_settings) .with_proxy(proxy) .with_url(url) .with_ignore_cert_errors(ignore_cert_errors) .with_min_snippet_hits(min_snippet_hits) .with_min_snippet_lines(min_snippet_lines) .with_honour_file_exts(honour_file_exts) .with_ranking(ranking) .with_ranking_threshold(ranking_threshold)) @@ - self.post_processor = ( - ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None - ) + self.post_processor = ( + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) + if scanoss_settings + else None + )
♻️ Duplicate comments (3)
src/scanoss/scan_settings_builder.py (1)
101-120: Docstring priority description doesn’t match implementation.
with_ignore_cert_errors’s docstring says “CLI True > file_snippet > settings > False”, but the code uses_merge_with_priority(cli, file_snippet, root)which choosesfile_snippetfirst, then root, then CLI. This matches the class-level description (“file_snippet > settings > CLI”), not the method docstring. Please update the docstring to reflect the actual precedence.src/scanoss/scanner.py (2)
147-171: Skip-headers merge logic is fine; doc for helper should emphasise “settings > CLI”.The
_merge_cli_with_settingshelper is used here so thatfile_snippetsettings override CLI values forskip_headersandskip_headers_limit, with CLI acting as a fallback when the settings file doesn’t specify them. That behaviour is consistent and useful.However, the helper’s docstring later in the file still contains contradictory text (it both says “settings > cli” and “CLI taking priority over settings”). Consider cleaning that up to state clearly that settings take precedence over CLI for callers like this.
248-260: Clarify _merge_cli_with_settings docstring to match behaviour.
_merge_cli_with_settingsreturnssettings_valuewhen it is notNone, otherwise it falls back tocli_value, so the effective priority is settings > CLI. The docstring currently mixes both “settings > cli” and “CLI taking priority over settings”, which is contradictory and can mislead future maintainers. It would be better to describe it unambiguously as “settings override CLI; CLI is only used as a fallback when settings are unset.”
🧹 Nitpick comments (3)
src/scanoss/scanossapi.py (2)
58-95: Align constructor docstrings with actual default values.The
__init__docstrings still describehonour_file_exts,ranking, andranking_thresholdas defaulting toNone, but the function signature uses'unset'(for the booleans) andNonefor the threshold. It would be clearer to document the actual sentinel defaults and what they mean (e.g.'unset'→ “do not send; defer to server config”,Nonevs-1forranking_threshold).
293-326: Header construction is sound; consider updating the docstring header name.The
_build_scan_settings_headerlogic around sentinels (0,'unset',-1) and base64-encoded JSON looks correct and matches the CLI help. The docstring still refers to anx-scanoss-scan-settingsheader, whilescan()actually sendsscanoss-settings; updating the wording would avoid confusion.src/scanoss/scan_settings_builder.py (1)
63-67: Tighten type hints for honour_file_exts and ranking.
self.honour_file_extsandself.rankingare annotated asOptional[any], which refers to the builtin function rather than a meaningful type. Given the usage, something likeOptional[bool | str](to coverTrue/Falseand'unset') would be clearer for type-checkers and readers.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)src/scanoss/scanpostprocessor.py(4 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
- CHANGELOG.md
- scanoss.json
- CLIENT_HELP.md
- tests/test_scan_settings_builder.py
- src/scanoss/init.py
🧰 Additional context used
🧬 Code graph analysis (4)
src/scanoss/scan_settings_builder.py (2)
src/scanoss/scanoss_settings.py (5)
get_file_snippet_settings(339-345)get_file_snippet_proxy(419-425)get_proxy(403-409)get_http_config(411-417)get_file_snippet_http_config(427-433)src/scanoss/scanner.py (1)
_merge_cli_with_settings(249-260)
src/scanoss/scanpostprocessor.py (1)
src/scanoss/scanoss_settings.py (3)
is_legacy(315-317)get_bom_remove(210-218)get_bom_replace(220-228)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (4)
ScanossSettings(76-433)load_json_file(103-138)set_file_type(140-152)set_scan_type(154-161)
src/scanoss/scanner.py (4)
src/scanoss/scan_settings_builder.py (10)
ScanSettingsBuilder(31-266)_merge_cli_with_settings(212-219)with_proxy(69-83)with_url(85-99)with_ignore_cert_errors(101-119)with_min_snippet_hits(121-134)with_min_snippet_lines(136-149)with_honour_file_exts(151-167)with_ranking(169-184)with_ranking_threshold(186-199)src/scanoss/scanoss_settings.py (1)
get_file_snippet_settings(339-345)src/scanoss/scanossapi.py (1)
ScanossApi(52-341)src/scanoss/scanpostprocessor.py (1)
ScanPostProcessor(76-289)
🪛 GitHub Actions: Lint
src/scanoss/scanpostprocessor.py
[error] 117-119: PLW2901: 'for' loop variable 'result' overwritten by assignment target. CI: make lint failed (exit code non-zero).
[error] 159-161: PLW2901: 'for' loop variable 'result' overwritten by assignment target. CI: make lint failed (exit code non-zero).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (4)
tests/data/scanoss.json (1)
28-50: Test settings structure matches code expectations.The new
settings→file_snippetstructure (proxy/http_config plus snippet and ranking fields) aligns with the new getters inScanossSettingsand the merge logic inScanSettingsBuilder. Looks good for exercising the new path.src/scanoss/cli.py (1)
190-222: Snippet tuning flags are well-wired.The new
--min-snippet-hits,--min-snippet-lines,--ranking,--ranking-threshold, and--honour-file-extsoptions are plumbed cleanly through toScannerand ultimately intoScanSettingsBuilder/ScanossApi. Semantics (0 / -1 / 'unset' as “defer to server config”) match the help text and header-building logic.src/scanoss/scanoss_settings.py (1)
339-433: New settings accessors are consistent with schema and callers.The added getters for
file_snippetsettings, skip-headers values, and root/file_snippet proxy/http_config cleanly wrap the underlying JSON structure and match the keys used inScanSettingsBuilderand the testscanoss.json. No issues spotted here.src/scanoss/scanner.py (1)
342-346: dependency_analysis fallback is a nice extension.Using
file_snippet_settings.get('dependency_analysis', False)as a fallback whenSCAN_DEPENDENCIESis not set lets the settings file enable dependency scanning without extra CLI flags. This is a reasonable extension of behaviour and fits the overall configuration model.
c62cf42 to
9fb4a2b
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
src/scanoss/scanner.py (2)
236-238: Condition checks wrong variable - still unfixed from past review.The condition checks
scan_settings(the builder object, which is always truthy after construction) but passesscanoss_settingstoScanPostProcessor. This means the post-processor will be created even whenscanoss_settingsisNone.🔎 Apply this diff to fix the condition:
self.post_processor = ( - ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scan_settings else None + ScanPostProcessor(scanoss_settings, debug=debug, trace=trace, quiet=quiet) if scanoss_settings else None )
248-260: Priority order still contradicts PR objectives and has internal inconsistency.The PR objectives state: "priority: CLI > file_snippet > root settings", meaning CLI should override settings file values. However:
- The implementation (lines 258-260) gives
settings_valuepriority overcli_value- The docstring is internally inconsistent: line 250 says "settings > cli" but line 256 says "CLI taking priority"
This affects
skip_headersandskip_headers_limitoptions at lines 153-155.🔎 Apply this diff to fix the priority order:
@staticmethod def _merge_cli_with_settings(cli_value, settings_value): - """Merge CLI value with settings value (two-level priority: settings > cli). + """Merge CLI value with settings value (two-level priority: cli > settings). Args: cli_value: Value from CLI argument settings_value: Value from scanoss.json file_snippet settings Returns: - Merged value with CLI taking priority over settings + Merged value with CLI taking priority over settings file """ - if settings_value is not None: - return settings_value - return cli_value + if cli_value is not None: + return cli_value + if settings_value is not None: + return settings_value + return None
🧹 Nitpick comments (1)
src/scanoss/cli.py (1)
190-222: Minor formatting inconsistencies in choices lists.Lines 206 and 219 have inconsistent spacing in the
choicesarrays:
- Line 206:
['unset' ,'true', 'false']- space before comma- Line 219:
['unset','true', 'false']- missing space after 'unset'While this doesn't affect functionality, it's inconsistent with Python style conventions.
🔎 Apply this diff to fix formatting:
p_scan.add_argument( '--ranking', type=str, - choices=['unset' ,'true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Enable or disable ranking (optional - default: server configuration)', ) p_scan.add_argument( '--ranking-threshold', type=int, default=None, help='Ranking threshold value. Valid range: -1 to 99. A value of -1 defers to server configuration (optional)', ) p_scan.add_argument( '--honour-file-exts', type=str, - choices=['unset','true', 'false'], + choices=['unset', 'true', 'false'], default='unset', help='Honour file extensions during scanning. When not set, defers to server configuration (optional)', )
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (12)
CHANGELOG.md(2 hunks)CLIENT_HELP.md(1 hunks)scanoss.json(1 hunks)src/scanoss/__init__.py(1 hunks)src/scanoss/cli.py(5 hunks)src/scanoss/scan_settings_builder.py(1 hunks)src/scanoss/scanner.py(9 hunks)src/scanoss/scanoss_settings.py(1 hunks)src/scanoss/scanossapi.py(6 hunks)src/scanoss/scanpostprocessor.py(5 hunks)tests/data/scanoss.json(1 hunks)tests/test_scan_settings_builder.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (7)
- src/scanoss/scan_settings_builder.py
- tests/data/scanoss.json
- CHANGELOG.md
- CLIENT_HELP.md
- src/scanoss/scanoss_settings.py
- src/scanoss/scanpostprocessor.py
- tests/test_scan_settings_builder.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/scanoss/scanossapi.py (2)
src/scanoss/scanossbase.py (1)
print_debug(58-63)src/scanoss/spdxlite.py (1)
print_debug(61-66)
src/scanoss/cli.py (1)
src/scanoss/scanoss_settings.py (3)
ScanossSettings(76-433)set_file_type(140-152)set_scan_type(154-161)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (11)
src/scanoss/__init__.py (1)
25-25: LGTM!Version bump to 1.43.0 aligns with the new scan engine tuning parameters feature.
scanoss.json (1)
18-21: LGTM!Adding the
scanoss-winnowing.pypackage tobom.includealigns with the optional fast winnowing dependency used in the codebase.src/scanoss/cli.py (3)
1410-1436: LGTM!The
wfpfunction correctly usesscanoss_settingsconsistently throughout, and properly loads and passes settings to the Scanner.
1631-1641: LGTM!The new tuning parameters (
min_snippet_hits,min_snippet_lines,ranking,ranking_threshold,honour_file_exts) are correctly wired from CLI arguments to the Scanner constructor.
2821-2830: LGTM!The
get_scanoss_settings_from_argshelper function provides a clean, reusable pattern for loading scan settings from CLI arguments, maintaining consistency with other command flows.src/scanoss/scanossapi.py (2)
293-326: LGTM!The
_build_scan_settings_headermethod correctly builds a base64-encoded JSON payload for scan tuning settings. The logic properly excludes default/unset values from the payload.Note: The double space issue flagged in a past review appears to be resolved in this version.
176-179: LGTM!The scan settings header is correctly integrated into the request flow, only adding the
scanoss-settingsheader when tuning parameters are configured.src/scanoss/scanner.py (4)
147-156: LGTM!The settings merging logic for
skip_headersandskip_headers_limitcorrectly retrieves values fromfile_snippet_settingsand merges them with CLI values. The debug output helps trace the final merged values.
172-184: LGTM!Clean use of the builder pattern to construct merged scan settings, properly chaining all tuning parameters from CLI and settings file sources.
344-345: LGTM!The fallback to
file_snippet_settings.get('dependency_analysis', False)correctly extends dependency scan detection to honor settings file configuration when not explicitly set via scan options.
409-421: LGTM!The
FileFiltersinstantiation correctly passesself.scanoss_settingsfor consistent settings propagation throughout the scanning workflow.
9fb4a2b to
c402c34
Compare
SCANOSS SCAN Completed 🚀
View more details on SCANOSS Action Summary |
What's Changed
Added
--min-snippet-hits- Minimum snippet hits required (0 defers to server config)--min-snippet-lines- Minimum snippet lines required (0 defers to server config)--ranking- Enable/disable result ranking (unset/true/false)--ranking-threshold- Ranking threshold value (-1 to 99, -1 defers to server config)--honour-file-exts- Honour file extensions during matching (unset/true/false)file_snippetsection to scanoss.json settings schema for configuring tuning parametersScanSettingsBuilderclass for merging CLI and settings file configurations with priority: CLI > file_snippet > root settingsSummary by CodeRabbit
New Features
Configuration
Documentation
Tests
Other
✏️ Tip: You can customize this high-level summary in your review settings.