adding a script that calls aipcc index and parses hashes for a given …#76
nsingla wants to merge 1 commit.
…list of arches

Signed-off-by: Nelesh Singla <117123879+nsingla@users.noreply.github.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
The full list of commands accepted by this bot can be found here.
Details
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
📝 Walkthrough
This PR consolidates requirement generation from inline Makefile shell commands into a dedicated Python script. The new …
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Security observations
🚥 Pre-merge checks | ✅ 2 | ❌ 2
❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 5
🧹 Nitpick comments (1)
.github/workflows/sync-requirements.yml (1)
30-31: ⚡ Quick win
Split the failure branch to satisfy workflow lint limits.
Line 31 exceeds the configured YAML line-length limit reported by static analysis.
Proposed fix
- git diff --exit-code requirements.txt requirements-build.txt \
-   || { echo ""; echo "Lockfiles are out of sync."; echo "Run 'make requirements' and commit the result."; exit 1; }
+ git diff --exit-code requirements.txt requirements-build.txt || {
+   echo ""
+   echo "Lockfiles are out of sync."
+   echo "Run 'make requirements' and commit the result."
+   exit 1
+ }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/sync-requirements.yml around lines 30 - 31, The long one-liner that runs git diff and an inline failure block (the command starting with "git diff --exit-code requirements.txt requirements-build.txt || { echo \"\"; echo \"Lockfiles are out of sync.\"; echo \"Run 'make requirements' and commit the result.\"; exit 1; }") exceeds YAML line-length limits; replace it with a short multiline shell block: run the git diff command on its own, then use an if/then/else or explicit check of its exit status to echo the three messages and exit nonzero in the else branch (for example: run git diff --exit-code ...; if [ $? -ne 0 ]; then echo ""; echo "Lockfiles are out of sync."; echo "Run 'make requirements' and commit the result."; exit 1; fi) so each line is short and the workflow linter is satisfied.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/sync-requirements.yml:
- Around line 6-10: The workflow's pull_request path filter is missing Makefile,
so add "Makefile" to the paths array under the paths: key in the
sync-requirements.yml diff (i.e., update the existing paths list that contains
pyproject.toml, requirements.txt, requirements-build.txt,
scripts/compile_requirements.py) so changes to Makefile also trigger the job;
ensure the entry is exactly "Makefile" (case-sensitive) and commit the updated
YAML.
In `@Makefile`:
- Line 110: The Makefile invokes the system python directly ("python
scripts/compile_requirements.py"), which can produce inconsistent environments;
update that line to call the project-managed runtime instead (for example
replace "python" with the Makefile Python variable used elsewhere like
"$(PYTHON)" or "$(VENV_PYTHON)", or use the project tool wrapper such as "poetry
run python scripts/compile_requirements.py") so lockfile generation runs in the
same managed environment as the rest of the build.
In `@scripts/compile_requirements.py`:
- Around line 148-151: The current code writes requirements without hashes when
fetch_hashes_from_index returns empty (the block that checks "if not hashes" and
appends f"{name}=={version}{marker_part}"), which breaks integrity checks;
change this to fail fast by raising an exception or exiting with a non-zero
status unless an explicit opt-in CLI flag (e.g., --allow-missing-hashes) is
provided; if the flag is set, still record the package into a separate
"missing-hashes" report file or log entry for manual review instead of silently
writing an un-hashed requirement, and update the CLI parsing and relevant
function (e.g., compile_requirements / the caller that invokes
fetch_hashes_from_index) to support and document the new flag.
- Around line 25-31: The _parse_build_requires function currently uses a fragile
regex; replace it with proper TOML parsing using tomllib (or tomli as a backport
fallback): read the pyproject.toml text, parse it with tomllib.loads (or
tomli.loads if tomllib import fails), then extract requires =
parsed["build-system"]["requires"], validate presence and type, and return that
list (raising SystemExit with the existing message if the keys are missing);
ensure you do not perform manual quote stripping and that the function returns a
list[str] of the parsed entries.
- Around line 110-112: The except Exception block that swallows all errors
around the index fetch (the block that logs "WARNING: could not fetch index for
{name}: {e}" and returns []) should be replaced with granular handlers: catch
HTTP 404 responses specifically (treat as "no hashes" and return [] for the
package name), catch other HTTP/transport errors
(requests.exceptions.RequestException / urllib.error.URLError / TimeoutError)
and JSON/decoding errors separately and log the full traceback (use
traceback.format_exc()) instead of a simple message, and for unexpected critical
errors either re-raise after logging or allow the caller to handle them; update
the handler(s) around the fetch logic that reference the variable name and the
log(...) call and add an import for traceback as needed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Central YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 6cdbc020-72de-4233-a9f3-b31c40c34361
📒 Files selected for processing (3)
.github/workflows/sync-requirements.yml
Makefile
scripts/compile_requirements.py
paths:
  - pyproject.toml
  - uv.lock
  - requirements.txt
  - requirements-build.txt
  - scripts/compile_requirements.py
Include Makefile in PR path filters for this check.
Lockfile generation behavior can change via Makefile; without it in pull_request.paths, this job may not run for relevant changes.
Proposed fix
 pull_request:
   paths:
     - pyproject.toml
     - requirements.txt
     - requirements-build.txt
     - scripts/compile_requirements.py
+    - Makefile
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-paths:
-  - pyproject.toml
-  - uv.lock
-  - requirements.txt
-  - requirements-build.txt
-  - scripts/compile_requirements.py
+paths:
+  - pyproject.toml
+  - requirements.txt
+  - requirements-build.txt
+  - scripts/compile_requirements.py
+  - Makefile
printf 'setuptools\nwheel\n' | uv pip compile --generate-hashes --no-header --no-annotate \
    --python-version 3.12 \
    --index-url $(AIPCC_INDEX_URL) - >> requirements-build.txt
python scripts/compile_requirements.py
Use the project-managed runtime for lockfile generation.
Line 110 invokes python directly, which can resolve to a different interpreter/environment than the rest of this Makefile and produce inconsistent lockfiles.
Proposed fix
requirements:
- python scripts/compile_requirements.py
+ $(UVRUN) python scripts/compile_requirements.py
📝 Committable suggestion
- python scripts/compile_requirements.py
+ $(UVRUN) python scripts/compile_requirements.py
def _parse_build_requires() -> list[str]:
    """Extract [build-system] requires from pyproject.toml."""
    content = (REPO_ROOT / "pyproject.toml").read_text()
    match = re.search(r"\[build-system\].*?requires\s*=\s*\[(.*?)\]", content, re.DOTALL)
    if not match:
        raise SystemExit("Could not find [build-system] requires in pyproject.toml")
    return [dep.strip().strip("\"'") for dep in match.group(1).split(",") if dep.strip().strip("\"'")]
Regex-based TOML parsing will break on valid syntax. (CWE-1286: Improper Validation of Syntactic Correctness)
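The regex pattern r"\[build-system\].*?requires\s*=\s*\[(.*?)\]" with re.DOTALL, combined with the comma split on line 31, will misparse valid TOML such as:
- Version specifiers that contain commas (e.g. "setuptools>=61,<70"), which split(",") breaks into separate entries
- Entries that contain a closing bracket (e.g. extras such as "pkg[extra]"), which end the non-greedy match early
- Comments within the requires array, which end up inside the parsed entries
- Other tables between [build-system] and requires, where .*? can match a requires key that belongs to a different table
Manual quote stripping on line 31 doesn't handle escaped characters.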
Use tomllib (Python 3.11+) or tomli (backport) for robust parsing.
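To make the concern concrete, here is a minimal sketch that runs both approaches on an in-memory sample. The sample TOML and the helper names regex_requires and toml_requires are hypothetical and exist only for illustration; regex_requires mirrors the pattern quoted above, and the tomli fallback is assumed to be installed on Python versions before 3.11.

```python
import re

try:
    import tomllib  # standard library from Python 3.11
except ImportError:  # assumption: the tomli backport is installed on older versions
    import tomli as tomllib

# Hypothetical pyproject.toml content, used only for illustration.
SAMPLE = '''
[build-system]
requires = [
    "setuptools>=61,<70",  # version range
    "wheel",
]
build-backend = "setuptools.build_meta"
'''


def regex_requires(content: str) -> list[str]:
    """Mimic the regex-and-split approach quoted in this comment."""
    match = re.search(r"\[build-system\].*?requires\s*=\s*\[(.*?)\]", content, re.DOTALL)
    if not match:
        raise SystemExit("Could not find [build-system] requires in pyproject.toml")
    return [d.strip().strip("\"'") for d in match.group(1).split(",") if d.strip().strip("\"'")]


def toml_requires(content: str) -> list[str]:
    """Parse with a real TOML parser and validate presence and type."""
    requires = tomllib.loads(content).get("build-system", {}).get("requires")
    if not isinstance(requires, list) or not all(isinstance(r, str) for r in requires):
        raise SystemExit("Could not find [build-system] requires in pyproject.toml")
    return requires


# The regex variant splits the version range and drags the comment along;
# the TOML parser returns the two requirement strings unchanged.
print(regex_requires(SAMPLE))
print(toml_requires(SAMPLE))
```

The type check in toml_requires is a design choice of this sketch, not something the script is known to do; it keeps a malformed requires value from flowing into later steps.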
📝 Proposed fix using tomllib
+try:
+    import tomllib
+except ImportError:
+    import tomli as tomllib
+
 def _parse_build_requires() -> list[str]:
     """Extract [build-system] requires from pyproject.toml."""
-    content = (REPO_ROOT / "pyproject.toml").read_text()
-    match = re.search(r"\[build-system\].*?requires\s*=\s*\[(.*?)\]", content, re.DOTALL)
-    if not match:
+    with open(REPO_ROOT / "pyproject.toml", "rb") as f:
+        data = tomllib.load(f)
+    requires = data.get("build-system", {}).get("requires")
+    if not requires:
         raise SystemExit("Could not find [build-system] requires in pyproject.toml")
-    return [dep.strip().strip("\"'") for dep in match.group(1).split(",") if dep.strip().strip("\"'")]
+    return requires
📝 Committable suggestion
-def _parse_build_requires() -> list[str]:
-    """Extract [build-system] requires from pyproject.toml."""
-    content = (REPO_ROOT / "pyproject.toml").read_text()
-    match = re.search(r"\[build-system\].*?requires\s*=\s*\[(.*?)\]", content, re.DOTALL)
-    if not match:
-        raise SystemExit("Could not find [build-system] requires in pyproject.toml")
-    return [dep.strip().strip("\"'") for dep in match.group(1).split(",") if dep.strip().strip("\"'")]
+try:
+    import tomllib
+except ImportError:
+    import tomli as tomllib
+def _parse_build_requires() -> list[str]:
+    """Extract [build-system] requires from pyproject.toml."""
+    with open(REPO_ROOT / "pyproject.toml", "rb") as f:
+        data = tomllib.load(f)
+    requires = data.get("build-system", {}).get("requires")
+    if not requires:
+        raise SystemExit("Could not find [build-system] requires in pyproject.toml")
+    return requires
    except Exception as e:
        log(f" WARNING: could not fetch index for {name}: {e}")
        return []
Broad exception handler hides all errors.
Catching Exception on line 110 silently suppresses:
- Network failures (transient vs. persistent)
- HTTP 404 (package not in index)
- HTTP 403/401 (auth issues)
- Decoding errors
- Timeout errors
This makes debugging failures difficult and treats all errors as "package has no hashes."
Catch specific exceptions or at minimum log the full traceback for non-404 errors.
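One possible shape for the granular handling, sketched under assumptions: the function name fetch_index_html and its opener and log parameters are hypothetical, introduced so the example can run without network access. Unlike the script's current behaviour, this variant treats only HTTP 404 as "package not in index" and lets every other failure propagate to the caller.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_index_html(
    url: str,
    name: str,
    opener: Callable = urllib.request.urlopen,  # injectable so the sketch can be exercised offline
    log: Callable[[str], None] = print,
) -> Optional[str]:
    """Return the index page HTML, or None when the index answers HTTP 404."""
    try:
        with opener(url, timeout=30) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as e:
        # HTTPError subclasses URLError, so it has to be handled first.
        if e.code == 404:
            log(f"WARNING: package {name} not found in index")
            return None
        log(f"ERROR: HTTP {e.code} fetching {name}: {e}")
        raise
    except (urllib.error.URLError, TimeoutError, UnicodeDecodeError) as e:
        # Transport, timeout, and decoding problems are not "no hashes"; surface them.
        log(f"ERROR: could not fetch index for {name}: {e}")
        raise
```

Whether non-404 errors should abort the run or be retried is a policy decision for the script's authors; the sketch only shows how to keep the cases distinguishable.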
🔍 Proposed fix with granular error handling
+    from urllib.error import HTTPError, URLError
+
     try:
         with urllib.request.urlopen(url, timeout=30) as resp:
             html = resp.read().decode()
-    except Exception as e:
+    except HTTPError as e:
+        if e.code == 404:
+            log(f" WARNING: package {name} not found in index")
+        else:
+            log(f" ERROR: HTTP {e.code} fetching {name}: {e}")
+        return []
+    except (URLError, OSError) as e:
         log(f" WARNING: could not fetch index for {name}: {e}")
         return []
        if not hashes:
            log(f" WARNING: no matching wheels found for {name}=={version}")
            lines.append(f"{name}=={version}{marker_part}")
            continue
Writing requirements without hashes defeats integrity verification. (CWE-494: Download of Code Without Integrity Check)
When fetch_hashes_from_index returns no hashes (network error, missing wheels, etc.), line 150 writes the requirement without --hash=... entries. That entry pins only the version, so an installer that does not enforce hashes for every requirement would accept any artifact published for that version without checksum verification, which opens the door to supply-chain attacks.
Either:
- Fail hard when hashes are missing for critical packages
- Generate a separate "missing-hashes" report that requires manual review
- Document that lockfiles without full hashes must not be used in production
🔒 Proposed fix: fail on missing hashes
         if not hashes:
             log(f" WARNING: no matching wheels found for {name}=={version}")
-            lines.append(f"{name}=={version}{marker_part}")
-            continue
+            raise SystemExit(f"CRITICAL: Cannot generate secure lockfile without hashes for {name}=={version}")
Or add a --allow-missing-hashes flag if some packages legitimately have no matching wheels.
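A rough sketch of how the opt-in flag could look, with everything beyond the flag name assumed: requirement_line, build_parser, and the missing list are hypothetical helpers, and the hashes are assumed to be bare sha256 hex digests, which may not match what the script actually collects.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """CLI with the opt-in escape hatch suggested in this comment."""
    parser = argparse.ArgumentParser(description="Compile hashed requirements (sketch)")
    parser.add_argument(
        "--allow-missing-hashes",
        action="store_true",
        help="record packages without hashes in a report instead of failing",
    )
    return parser


def requirement_line(
    name: str,
    version: str,
    marker_part: str,
    hashes: list[str],
    *,
    allow_missing_hashes: bool = False,
    missing: Optional[list[str]] = None,
) -> Optional[str]:
    """Format one lockfile entry; never emit an un-hashed pin."""
    if not hashes:
        if not allow_missing_hashes:
            raise SystemExit(f"Cannot generate secure lockfile without hashes for {name}=={version}")
        if missing is not None:
            missing.append(f"{name}=={version}")  # collected for a separate manual-review report
        return None
    # Assumption: each hash is a bare sha256 hex digest.
    hash_part = " ".join(f"--hash=sha256:{h}" for h in sorted(hashes))
    return f"{name}=={version}{marker_part} {hash_part}"
```

With the flag set, the caller would skip None results and write the missing list to a report file, so an un-hashed requirement never reaches the lockfile silently.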
📝 Committable suggestion
-        if not hashes:
-            log(f" WARNING: no matching wheels found for {name}=={version}")
-            lines.append(f"{name}=={version}{marker_part}")
-            continue
+        if not hashes:
+            log(f" WARNING: no matching wheels found for {name}=={version}")
+            raise SystemExit(f"CRITICAL: Cannot generate secure lockfile without hashes for {name}=={version}")
…list of arches
Description of your changes: