
Add transaction safety and update comments from code review#2295

Merged
mrrajan merged 2 commits into guacsec:main from mrrajan:fix/code-review-expanded-license-transaction
Mar 18, 2026

Conversation

@mrrajan
Contributor

@mrrajan mrrajan commented Mar 18, 2026

This PR includes:

  • Add transaction safety to the expanded license population
  • Replace the LEFT JOIN + NULL approach with a NOT EXISTS approach in non_sbom_query

These changes are already included in the backport PR (Backport 2287 to release/0.4.z #2294) on the release branch, following the Sourcery review comments.

Summary by Sourcery

Ensure SBOM ingestion and expanded license population run within database transactions and optimize license lookup queries for unlinked licenses.

Bug Fixes:

  • Prevent partial expanded license writes by wrapping population in an explicit database transaction.

Enhancements:

  • Require transaction-capable database connections throughout SBOM ingestion paths (CycloneDX, SPDX, and component creation).
  • Document behavioral differences between migration-time and ingestion-time expanded license backfilling logic.
  • Replace LEFT JOIN + NULL filtering with a NOT EXISTS subquery to more efficiently find licenses not associated with any SBOM.
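
To make the query change concrete, here is a minimal sketch in plain Rust over in-memory collections. It is not the SeaORM query from this PR: `License`, `unlinked_left_join`, and `unlinked_not_exists` are hypothetical names, and `links` stands in for the license ids referenced from sbom_license_expanded. It only models the two formulations to show that they select the same rows.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a row of the license table.
#[derive(Debug, Clone, PartialEq)]
pub struct License {
    pub id: u32,
    pub text: String,
}

/// Models LEFT JOIN + IS NULL: pair each license with every matching link row
/// (or with `None` when nothing matches), then keep only the `None` rows.
pub fn unlinked_left_join(licenses: &[License], links: &[u32]) -> Vec<String> {
    let mut out = Vec::new();
    for lic in licenses {
        // All link rows that join to this license.
        let mut joined: Vec<Option<u32>> = links
            .iter()
            .filter(|&&id| id == lic.id)
            .map(|&id| Some(id))
            .collect();
        // LEFT JOIN keeps the left row even without a match.
        if joined.is_empty() {
            joined.push(None);
        }
        for row in joined {
            if row.is_none() {
                out.push(lic.text.clone());
            }
        }
    }
    // DISTINCT over the license text.
    out.sort();
    out.dedup();
    out
}

/// Models NOT EXISTS: a single membership probe per license, with no
/// intermediate joined rows to build and filter.
pub fn unlinked_not_exists(licenses: &[License], links: &[u32]) -> Vec<String> {
    let linked: HashSet<u32> = links.iter().copied().collect();
    let mut out: Vec<String> = licenses
        .iter()
        .filter(|lic| !linked.contains(&lic.id))
        .map(|lic| lic.text.clone())
        .collect();
    out.sort();
    out.dedup();
    out
}
```

Both functions return the same set of license texts; the second avoids materializing one joined row per link, which is the intuition behind preferring the anti-join form. How a real database plans either query depends on the engine and its statistics.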

@sourcery-ai
Contributor

sourcery-ai bot commented Mar 18, 2026

Reviewer's Guide

Introduces transaction-scoped SBOM ingestion and adjusts license queries for better performance and correctness, including making expanded license population transactional and replacing a LEFT JOIN + NULL filter with a NOT EXISTS subquery.

Sequence diagram for transactional expanded_license population

sequenceDiagram
    participant Caller
    participant Db as DbConnection
    participant Func as populate_expanded_license
    participant Tx as Transaction

    Caller->>Func: populate_expanded_license(sbom_id, Db)
    activate Func
    Func->>Db: begin()
    activate Db
    Db-->>Func: Tx
    deactivate Db
    activate Tx

    Func->>Tx: execute(INSERT INTO expanded_license ...)
    Tx-->>Func: Result

    Func->>Tx: execute(WITH license_expansions AS (...) INSERT INTO sbom_license_expanded ...)
    Tx-->>Func: Result

    Func->>Tx: commit()
    Tx-->>Func: Ok
    deactivate Tx

    Func-->>Caller: Ok
    deactivate Func
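
The all-or-nothing behavior in the diagram above can be sketched with a toy in-memory transaction. This is only an illustration under stated assumptions: `Tables`, `Tx`, and `populate_sketch` are hypothetical names, the code is synchronous, and it does not use SeaORM. Writes are staged on a private copy and reach the shared state only on commit.

```rust
/// Hypothetical in-memory stand-in for the two tables written during population.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Tables {
    pub expanded_license: Vec<String>,
    pub sbom_license_expanded: Vec<(u32, String)>,
}

/// Toy transaction: writes go to `staged` and reach `target` only on commit.
pub struct Tx<'a> {
    target: &'a mut Tables,
    staged: Tables,
}

impl<'a> Tx<'a> {
    pub fn begin(target: &'a mut Tables) -> Self {
        let staged = target.clone();
        Tx { target, staged }
    }

    /// Publish the staged writes. Dropping a `Tx` without calling this
    /// discards them, which models a rollback.
    pub fn commit(self) {
        let Tx { target, staged } = self;
        *target = staged;
    }
}

/// Two-step population: dictionary insert, then junction insert.
/// `fail_second_step` simulates an error between the two steps.
pub fn populate_sketch(
    db: &mut Tables,
    sbom_id: u32,
    licenses: &[&str],
    fail_second_step: bool,
) -> Result<(), String> {
    let mut tx = Tx::begin(db);

    // Step 1: insert into the dictionary table, skipping existing entries.
    for l in licenses {
        if !tx.staged.expanded_license.iter().any(|e| e.as_str() == *l) {
            tx.staged.expanded_license.push(l.to_string());
        }
    }

    // Step 2: insert into the junction table.
    if fail_second_step {
        // `tx` is dropped here, so the step 1 writes never reach `db`.
        return Err("junction insert failed".to_string());
    }
    for l in licenses {
        tx.staged.sbom_license_expanded.push((sbom_id, l.to_string()));
    }

    tx.commit();
    Ok(())
}
```

Calling `populate_sketch` with `fail_second_step` set to true leaves `db` exactly as it was, which is the partial-write case the transaction is meant to prevent.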

Sequence diagram for transactional SBOM ingestion (CycloneDX/SPDX)

sequenceDiagram
    actor IngestJob
    participant SbomContext
    participant Db as DbConnection
    participant Creator
    participant ComponentCreator

    IngestJob->>SbomContext: ingest_cyclonedx or ingest_spdx(sbom, Db)
    activate SbomContext
    SbomContext->>Db: preflight checks, lookups

    SbomContext->>Creator: new Creator(sbom_id, components)
    Creator-->>SbomContext: Creator

    SbomContext->>Creator: create(Db, processors)
    activate Creator
    Creator->>ComponentCreator: new ComponentCreator(sbom_id, components.len())
    ComponentCreator-->>Creator: ComponentCreator

    Creator->>ComponentCreator: create(Db)
    activate ComponentCreator
    ComponentCreator->>Db: Licenses.create(Db)  (requires TransactionTrait)
    ComponentCreator->>Db: Purls.create(Db)     (requires TransactionTrait)
    ComponentCreator->>Db: Cpes.create(Db)      (requires TransactionTrait)
    ComponentCreator-->>Creator: Ok
    deactivate ComponentCreator

    Creator-->>SbomContext: Ok
    deactivate Creator

    SbomContext-->>IngestJob: Ok
    deactivate SbomContext

File-Level Changes

Change: Make expanded license population transactional and document differences from the migration backfill.
Details:
  • Refactor populate_expanded_license to accept a generic connection type that supports both ConnectionTrait and TransactionTrait.
  • Wrap the expanded license dictionary and junction table inserts in an explicit transaction, using the transaction handle for SQL execution and database backend resolution.
  • Add detailed rustdoc explaining how the ingestion-time expanded license logic differs from the migration backfill in m0002120_normalize_expanded_license.
Files: modules/ingestor/src/graph/sbom/common/expanded_license.rs

Change: Optimize and clarify query for licenses not yet linked to any SBOM using NOT EXISTS instead of LEFT JOIN + IS NULL.
Details:
  • Replace the LEFT JOIN-based approach on package license and sbom_license_expanded tables with a NOT EXISTS anti-join subquery against sbom_license_expanded.
  • Document that the query is intended to find licenses not yet linked to any SBOM, including pre-loaded SPDX and older SBOM licenses, and that NOT EXISTS yields better query plans on large tables.
  • Simplify the non_sbom_query to a distinct select over license text with a NOT EXISTS filter on the anti-join subquery.
Files: modules/fundamental/src/license/service/mod.rs

Change: Require transaction-capable database handles for SBOM ingestion paths to support transactional operations like expanded license population.
Details:
  • Update ingest_cyclonedx to be generic over a connection type that implements both ConnectionTrait and TransactionTrait, preparing it to run transactional work during ingestion.
  • Update the Creator::create method to require a transaction-capable connection and propagate that requirement through its signature.
  • Update ComponentCreator::create and SbomContext::ingest_spdx to similarly require ConnectionTrait + TransactionTrait so that downstream calls (including license creation/expansion) can participate in the same transaction.
Files: modules/ingestor/src/graph/sbom/cyclonedx.rs, modules/ingestor/src/graph/sbom/spdx.rs


@mrrajan mrrajan force-pushed the fix/code-review-expanded-license-transaction branch from cefa398 to 730c345 Compare March 18, 2026 08:40
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 1 issue, and left some high level feedback:

  • The revised populate_expanded_license always starts and commits its own transaction; if this is ever called inside a broader ingestion transaction, this can lead to nested-transaction/rollback surprises—consider taking an existing &Transaction (or a generic ConnectionTrait that may already be a transaction) and not unconditionally calling begin() inside the function.
  • In non_sbom_query, the NOT EXISTS subquery can be inlined directly into the .filter(Expr::exists(...).not()) call (rather than pre-building exists_subquery), which would make the correlation on license::Column::Id more obvious and the query easier to follow.
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="modules/ingestor/src/graph/sbom/common/expanded_license.rs" line_range="31-34" />
<code_context>
+///
+/// Both use the same core logic (expand_license_expression_with_mappings + md5 hash
+/// matching) but optimize for their different use cases.
+pub async fn populate_expanded_license<C>(sbom_id: Uuid, db: &C) -> Result<(), DbErr>
+where
+    C: ConnectionTrait + TransactionTrait,
+{
+    let txn = db.begin().await?;
</code_context>
<issue_to_address>
**suggestion:** Narrow the generic bound to only what this function actually needs.

Within this function, `db` is only used to start a transaction (`db.begin().await?`); all subsequent SQL calls use `txn` (which implements `ConnectionTrait`). So the function only requires `C: TransactionTrait`, not `C: ConnectionTrait`. Narrowing the bound better reflects the actual requirement and keeps the API more flexible.

```suggestion
pub async fn populate_expanded_license<C>(sbom_id: Uuid, db: &C) -> Result<(), DbErr>
where
    C: TransactionTrait,
{
```
</issue_to_address>
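
The point about narrowing the bound can be illustrated with a small sketch. Everything here is an assumption made for illustration: the two traits are simplified, synchronous stand-ins defined locally (SeaORM's real traits are async and have different signatures), and `ToyTxn`, `BeginOnly`, and `populate_sketch` are hypothetical names.

```rust
/// Hypothetical stand-in: something that can run SQL.
pub trait ConnectionTrait {
    fn execute(&mut self, sql: &str) -> Result<(), String>;
}

/// Hypothetical stand-in: something that can open a transaction.
pub trait TransactionTrait {
    type Txn: ConnectionTrait;
    fn begin(&self) -> Result<Self::Txn, String>;
}

/// Toy transaction handle that records the statements it was asked to run.
pub struct ToyTxn {
    pub log: Vec<String>,
}

impl ConnectionTrait for ToyTxn {
    fn execute(&mut self, sql: &str) -> Result<(), String> {
        self.log.push(sql.to_string());
        Ok(())
    }
}

/// A handle that can only start transactions. It deliberately does NOT
/// implement `ConnectionTrait`.
pub struct BeginOnly;

impl TransactionTrait for BeginOnly {
    type Txn = ToyTxn;
    fn begin(&self) -> Result<ToyTxn, String> {
        Ok(ToyTxn { log: Vec::new() })
    }
}

/// `db` is used only to call `begin()`; every statement runs on `txn`.
/// `TransactionTrait` is therefore the whole requirement on `C`.
/// The handle is returned (instead of committed) so the caller can inspect it.
pub fn populate_sketch<C: TransactionTrait>(db: &C) -> Result<C::Txn, String> {
    let mut txn = db.begin()?;
    txn.execute("INSERT INTO expanded_license ...")?;
    txn.execute("INSERT INTO sbom_license_expanded ...")?;
    Ok(txn)
}
```

With the narrower bound, `populate_sketch(&BeginOnly)` compiles even though `BeginOnly` cannot execute SQL itself; adding `ConnectionTrait` to the bound would reject it without the function body needing that capability.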


where
    C: ConnectionTrait + TransactionTrait,
{
    let txn = db.begin().await?;
Contributor

Not sure why we need this. The function normally gets a transaction anyway. Creating a transaction here would actually create a savepoint. But why would a nested transaction be required here?
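
To illustrate the savepoint behavior described in the comment above, here is a toy model in plain Rust. It is a sketch under stated assumptions, not SeaORM code: `ToyTx` and its methods are hypothetical, and a nested begin is modelled as pushing a savepoint onto an already open transaction.

```rust
/// Toy transaction that records pending rows and supports savepoints.
#[derive(Debug, Default)]
pub struct ToyTx {
    rows: Vec<&'static str>,
    savepoints: Vec<usize>,
}

impl ToyTx {
    pub fn insert(&mut self, row: &'static str) {
        self.rows.push(row);
    }

    /// Nested "begin": remember how many rows were pending at this point.
    pub fn savepoint(&mut self) {
        self.savepoints.push(self.rows.len());
    }

    /// Nested "commit": release the savepoint. The rows stay pending in the
    /// outer transaction; nothing becomes durable yet.
    pub fn release(&mut self) {
        self.savepoints.pop();
    }

    /// Nested "rollback": drop only the rows added since the savepoint.
    pub fn rollback_to_savepoint(&mut self) {
        if let Some(len) = self.savepoints.pop() {
            self.rows.truncate(len);
        }
    }

    /// Outer commit: return the rows that would become durable.
    pub fn commit(self) -> Vec<&'static str> {
        self.rows
    }

    /// Outer rollback: discard everything, including rows whose savepoint
    /// had already been released.
    pub fn rollback(mut self) -> Vec<&'static str> {
        self.rows.clear();
        self.rows
    }
}
```

In this model an inner commit adds no durability of its own: if the outer transaction rolls back, the inner writes go with it. That is why an extra begin() inside a function that already receives a transaction buys little unless partial rollback is actually wanted.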

Contributor Author

Agreed, I will remove the transaction trait.

@codecov

codecov bot commented Mar 18, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 67.68%. Comparing base (43685c7) to head (a13fd51).
⚠️ Report is 4 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2295      +/-   ##
==========================================
- Coverage   67.97%   67.68%   -0.29%     
==========================================
  Files         430      433       +3     
  Lines       24731    24852     +121     
  Branches    24731    24852     +121     
==========================================
+ Hits        16810    16822      +12     
- Misses       6999     7103     +104     
- Partials      922      927       +5     


Signed-off-by: mrrajan <86094767+mrrajan@users.noreply.github.com.>
@mrrajan mrrajan force-pushed the fix/code-review-expanded-license-transaction branch 2 times, most recently from 60945c9 to b7b3142 Compare March 18, 2026 10:58
Signed-off-by: mrrajan <86094767+mrrajan@users.noreply.github.com.>
@mrrajan mrrajan force-pushed the fix/code-review-expanded-license-transaction branch from b7b3142 to a13fd51 Compare March 18, 2026 11:04
@mrrajan mrrajan added this pull request to the merge queue Mar 18, 2026
Merged via the queue into guacsec:main with commit 2abb2a4 Mar 18, 2026
5 of 6 checks passed
@mrrajan mrrajan deleted the fix/code-review-expanded-license-transaction branch March 18, 2026 12:04
@github-project-automation github-project-automation bot moved this to Done in Trustify Mar 18, 2026

Labels

None yet

Projects

Status: Done


2 participants