@agustingroh (Contributor) commented Dec 19, 2025

What's Changed

Changed

  • Used write and read streams to save scan responses
  • Include requirement in component crypto algorithm response

Summary by CodeRabbit

  • New Features

    • Streaming-based scan output to handle very large scans with lower memory use.
    • Cryptography results now include a requirement field for clearer grouping.
  • Bug Fixes / Reliability

    • Reduced scan buffering for more prompt processing and improved resource cleanup.
    • HTML report now rejects extremely large inputs with guidance to use JSON.
  • Chores

    • Bumped version to 0.29.0 and added streaming JSON runtime dependency.


@coderabbitai (bot) commented Dec 19, 2025

Walkthrough

Replaces in-memory result aggregation with NDJSON write streams and a converter; adds CLI streaming JSON transform using stream-json; introduces requirement for cryptography component grouping; reduces scanner buffer size; bumps package to 0.29.0 and adds stream-json deps.

Changes

Cohort / File(s) | Summary
Release & deps (CHANGELOG.md, package.json) | Added Unreleased and [0.29.0] release notes; bumped version to 0.29.0; added runtime dependency stream-json and devDependency @types/stream-json.
Scanner streaming & buffering (src/sdk/scanner/Scanner.ts, src/sdk/scanner/ScannerCfg.ts) | Replaced in-memory result assembly with NDJSON write streams (wfpWriteStream, resultWriteStream), added convertNDJSONToJSON, stream lifecycle helpers (initializeWriteStreams, closeWriteStreams), wrote per-entry deobfuscation/post-processing into streaming path, and lowered MAX_RESPONSES_IN_BUFFER from 300 to 20.
CLI streaming integration (src/cli/commands/scan.ts) | Added streamAndTransformResults using stream-json to parse and transform results incrementally for JSON output; retained HTML path with large-file guard; updated cryptography extraction to work with streamed input.
Cryptography result grouping (src/sdk/Clients/Cryptography/ICryptographyClient.ts, src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts) | Added requirement: string to component algorithm data; changed result collector keying to use purl@requirement, updated getOrCreateResult signature to accept requirement, and adjusted collection logic to pass and store requirement.
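
The write-NDJSON-then-convert flow summarized above can be sketched roughly as follows. This is only an illustration, not the PR's convertNDJSONToJSON: the names ndjsonToJson and toJsonChunks, the one-object-per-line layout, and the .tmp suffix are all assumptions made for the example.

```typescript
import * as fs from 'fs';
import * as readline from 'readline';
import { Readable } from 'stream';
import { pipeline } from 'stream/promises';

// Hypothetical helper (not the PR's code). Assumes every NDJSON line is a JSON
// object such as {"src/a.c": [...]} and merges all lines into one JSON object.
async function* toJsonChunks(ndjsonPath: string): AsyncGenerator<string> {
  const lines = readline.createInterface({
    input: fs.createReadStream(ndjsonPath, { encoding: 'utf8' }),
    crlfDelay: Infinity,
  });
  let prefix = '{';
  for await (const line of lines) {
    if (line.trim() === '') continue; // tolerate blank lines
    const entry = JSON.parse(line) as Record<string, unknown>;
    for (const [key, value] of Object.entries(entry)) {
      yield `${prefix}${JSON.stringify(key)}:${JSON.stringify(value)}`;
      prefix = ',';
    }
  }
  yield prefix === '{' ? '{}' : '}'; // empty input still yields valid JSON
}

async function ndjsonToJson(ndjsonPath: string, outPath: string): Promise<void> {
  const tmpPath = `${outPath}.tmp`; // assumed temp-file naming
  // pipeline() forwards errors and respects backpressure on the write side.
  await pipeline(Readable.from(toJsonChunks(ndjsonPath)), fs.createWriteStream(tmpPath));
  // Swap the finished file into place only after it has been fully written.
  fs.renameSync(tmpPath, outPath);
}
```

Only one line is held in memory at a time, and the rename keeps a half-written file from ever appearing under the final name. Cleanup of the temp file on failure is left out of this sketch.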

Sequence Diagram

sequenceDiagram
    participant CLI as CLI
    participant Scanner as Scanner
    participant ND as NDJSON Streams
    participant Converter as NDJSON→JSON Converter
    participant Parser as stream-json Parser
    participant Crypto as Cryptography logic
    participant Output as File/Stdout

    CLI->>Scanner: start scan (format)
    Scanner->>ND: open wfp & result write streams
    loop emit results
        Scanner->>ND: write NDJSON line per result
    end
    Scanner->>Scanner: close write streams
    Scanner->>Converter: convertNDJSONToJSON()
    Converter->>Converter: read NDJSON, assemble final JSON

    alt format == JSON
        CLI->>Parser: streamAndTransformResults(input JSON)
        Parser->>Crypto: emit component objects (includes requirement)
        Parser->>Output: stream transformed JSON to file/stdout
    else format == HTML
        CLI->>Converter: load full JSON into memory (guard large files)
        CLI->>Output: generate HTML report from in-memory JSON
    end

    Output-->>CLI: finished

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review focus:
    • src/sdk/scanner/Scanner.ts: stream lifecycle, NDJSON formatting, convertNDJSONToJSON atomic replacement, abort/error paths.
    • src/cli/commands/scan.ts: stream-json pipeline correctness and memory/IO handling for HTML vs JSON paths.
    • src/sdk/Cryptography/Helper/ResultCollector/...: keying change to requirement, updated signatures, and callers.
    • src/sdk/scanner/ScannerCfg.ts: effect of reducing buffer threshold on throughput/backpressure.

Poem

🐇 I hop through streams where NDJSON flows,
nibbling lines as the output grows.
A quiet requirement tucked in each roll,
we group, we stream, we save every soul.
Cheers to 0.29.0—light feet, full bowl!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The PR title 'Chore/stream scan responses' directly aligns with the main changes: implementing streaming-based processing for scan responses and adding requirement field to cryptography responses.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (1)

48-55: Inconsistent requirement handling between algorithm and hint collection.

collectAlgorithmResults skips components with a falsy requirement (line 50), but collectHintResults (line 63) processes all components regardless of requirement value. This could lead to data inconsistency where hints are collected for components that have no corresponding algorithm entries due to missing requirements.

Consider applying the same guard to collectHintResults, or document why the asymmetric behavior is intentional.

🔎 Proposed fix to add consistent guard
   public collectHintResults(hintResults: HintsInRangeResponse):void {
     hintResults.components.forEach((c) => {
+      if (c.requirement) {
         const result = this.getOrCreateResult(c.purl,c.version,c.requirement);
         result.hints = c.hints;
+      }
     });
   }
🧹 Nitpick comments (5)
src/cli/commands/scan.ts (3)

306-319: Duplicate streaming pipeline code should be extracted to a helper.

The streaming logic to load scanner results appears twice with nearly identical code (lines 306-319 and 333-346). Extract this into a reusable helper function to improve maintainability.

🔎 Proposed helper function
async function loadScannerResultsFromStream(resultPath: string): Promise<Record<string, any>> {
  return new Promise((resolve, reject) => {
    const pipeline = fs.createReadStream(resultPath)
      .pipe(parser())
      .pipe(streamObject());

    const scannerResults: Record<string, any> = {};

    pipeline.on('data', (data: { key: string; value: any }) => {
      scannerResults[data.key] = data.value;
    });

    pipeline.on('end', () => resolve(scannerResults));
    pipeline.on('error', reject);
  });
}

Then replace both occurrences:

const scannerData = await loadScannerResultsFromStream(scannerResultPath);

Also applies to: 333-346


62-65: OS-specific EOL may cause inconsistent JSON output across platforms.

Using EOL from os module means the JSON output will have \r\n on Windows and \n on Unix. This could cause issues if files are shared across platforms or compared for equality. Consider using \n consistently for JSON output.

🔎 Proposed fix
+const JSON_EOL = '\n'; // Use consistent line endings for JSON output

 const indentLines = (jsonStr: string, spaces: number): string => {
   const indent = ' '.repeat(spaces);
-  return jsonStr.split(EOL).map((line, idx) => idx === 0 ? line : indent + line).join(EOL);
+  return jsonStr.split('\n').map((line, idx) => idx === 0 ? line : indent + line).join(JSON_EOL);
 };

And replace all EOL usages in streamAndTransformResults with JSON_EOL.


111-116: Pipeline error may leave partial output file.

When a pipeline error occurs, the write stream is destroyed but the partial output file may remain on disk. Consider unlinking the partial file on error.

🔎 Proposed fix
     pipeline.on('error', (error) => {
       if (outputPath && writeStream !== process.stdout) {
         (writeStream as fs.WriteStream).destroy();
+        fs.unlink(outputPath, () => {}); // Best-effort cleanup
       }
       reject(error);
     });
src/sdk/scanner/Scanner.ts (2)

607-638: Consider using Promise.all for cleaner stream closing.

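The current implementation works, but the reject parameter is unused. Note that on recent Node.js versions the stream.end() callback can be invoked with an error, so ignoring it is only safe when stream errors are handled elsewhere. Either way, the closing logic can be simplified with Promise.all.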

🔎 Proposed simplified implementation
-  private closeWriteStreams(): Promise<void> {
-    return new Promise((resolve, reject) => {
-      let wfpClosed = false;
-      let resultClosed = false;
-
-      const checkBothClosed = () => {
-        if (wfpClosed && resultClosed) {
-          resolve();
-        }
-      };
-
-      if (this.wfpWriteStream) {
-        this.wfpWriteStream.end(() => {
-          wfpClosed = true;
-          checkBothClosed();
-        });
-      } else {
-        wfpClosed = true;
-      }
-
-      if (this.resultWriteStream) {
-        this.resultWriteStream.end(() => {
-          resultClosed = true;
-          checkBothClosed();
-        });
-      } else {
-        resultClosed = true;
-      }
-
-      checkBothClosed();
-    });
+  private closeWriteStreams(): Promise<void> {
+    const closeStream = (stream: fs.WriteStream | undefined): Promise<void> =>
+      stream ? new Promise(resolve => stream.end(resolve)) : Promise.resolve();
+
+    return Promise.all([
+      closeStream(this.wfpWriteStream),
+      closeStream(this.resultWriteStream)
+    ]).then(() => {});
   }

602-605: Consider adding error listeners to write streams.

The write streams are created without error listeners. If a write fails (e.g., disk full), the error would be unhandled and could crash the process or silently fail.

🔎 Proposed fix to add error listeners
   private initializeWriteStreams() {
     this.wfpWriteStream = fs.createWriteStream(this.wfpFilePath, { flags: 'a' });
     this.resultWriteStream = fs.createWriteStream(this.resultFilePath, { flags: 'a' });
+
+    this.wfpWriteStream.on('error', (err) => {
+      logger.error(`[ SCANNER ]: WFP write stream error: ${err.message}`);
+      this.errorHandler(err, ScannerEvents.MODULE_WINNOWER);
+    });
+
+    this.resultWriteStream.on('error', (err) => {
+      logger.error(`[ SCANNER ]: Result write stream error: ${err.message}`);
+      this.errorHandler(err, ScannerEvents.MODULE_DISPATCHER);
+    });
   }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c905f71 and ba228a9.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • CHANGELOG.md (2 hunks)
  • package.json (3 hunks)
  • src/cli/commands/scan.ts (5 hunks)
  • src/sdk/Clients/Cryptography/ICryptographyClient.ts (1 hunks)
  • src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (4 hunks)
  • src/sdk/scanner/Scanner.ts (9 hunks)
  • src/sdk/scanner/ScannerCfg.ts (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/cli/commands/scan.ts (3)
src/sdk/Clients/Dependency/IDependencyClient.ts (1)
  • DependencyResponse (32-35)
src/sdk/Clients/Grpc/scanoss/api/dependencies/v2/scanoss-dependencies_pb.d.ts (1)
  • DependencyResponse (83-102)
src/sdk/Cryptography/CryptographyTypes.ts (2)
  • LocalCryptography (65-69)
  • CryptographyResponse (75-80)
src/sdk/scanner/Scanner.ts (3)
src/index.ts (1)
  • logger (64-64)
src/sdk/scanner/ScannerTypes.ts (1)
  • ScannerResults (70-70)
src/sdk/scanner/ScannnerResultPostProcessor/rules/rule-factory.ts (1)
  • ScannerResultsRuleFactory (6-22)
🔇 Additional comments (8)
src/sdk/Clients/Cryptography/ICryptographyClient.ts (1)

23-26: LGTM!

The requirement field addition to ComponentAlgorithm aligns with the existing ComponentHintResponse interface pattern (line 40), maintaining consistency across component-related types.

src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (1)

30-31: Verify that keying by requirement instead of version is intentional.

The key changed from ${purl}@${version} to ${purl}@${requirement}. If the same purl has different versions but the same requirement (e.g., ^1.0.0), they will collide and the later entry will overwrite algorithms/hints from the earlier one.
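
The collision described here can be reproduced with a small stand-in for the collector. The CryptoResult shape and this getOrCreateResult are simplified assumptions for illustration; the real class has more fields and may behave differently in detail.

```typescript
// Simplified, hypothetical model of the collector's map and lookup.
interface CryptoResult {
  purl: string;
  version: string;
  requirement: string;
  algorithms: string[];
}

const results = new Map<string, CryptoResult>();

function getOrCreateResult(purl: string, version: string, requirement: string): CryptoResult {
  const key = `${purl}@${requirement}`; // keyed by requirement, not version
  let result = results.get(key);
  if (!result) {
    result = { purl, version, requirement, algorithms: [] };
    results.set(key, result);
  }
  return result;
}

// Two different versions resolved from the same requirement string.
getOrCreateResult('pkg:npm/foo', '1.0.0', '^1.0.0').algorithms = ['md5'];
getOrCreateResult('pkg:npm/foo', '1.2.0', '^1.0.0').algorithms = ['sha256'];

const entry = results.get('pkg:npm/foo@^1.0.0') as CryptoResult;
console.log(results.size);     // 1
console.log(entry.version);    // 1.0.0
console.log(entry.algorithms); // [ 'sha256' ]
```

In this model the second call reuses the first entry, so the stored version stays at 1.0.0 while the algorithms come from 1.2.0. Whether that merge is acceptable depends on what the requirement is meant to identify.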

src/cli/commands/scan.ts (1)

42-51: LGTM - Well-structured streaming implementation.

The streamAndTransformResults function provides a clean streaming approach for JSON output that avoids holding the entire scanner result in memory during transformation.

package.json (1)

40-40: No security issues detected for stream-json ^1.9.1.

The dependency addition aligns with streaming implementation and has no known vulnerabilities.

src/sdk/scanner/ScannerCfg.ts (1)

30-30: LGTM!

Reducing the buffer size from 300 to 20 aligns well with the new streaming approach, minimizing peak memory usage at the cost of more frequent (but efficient) stream writes.

src/sdk/scanner/Scanner.ts (3)

85-87: LGTM!

The new write stream properties properly support the streaming architecture for both WFP and result files.


708-715: LGTM!

Using end() for cleanup ensures buffered data is flushed before closing, which is appropriate for abort scenarios. Not awaiting completion is acceptable here since it's an abort path.


471-479: LGTM!

The finish sequence correctly ensures the buffer is flushed, streams are closed, and then NDJSON is converted to JSON. This properly coordinates the streaming pipeline.

@agustingroh agustingroh force-pushed the chore/stream-scan-responses branch 2 times, most recently from dc6fa26 to 98f759e Compare December 22, 2025 10:44

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (2)

30-42: Inconsistent null/undefined handling between collectAlgorithmResults and collectHintResults for the requirement parameter.

collectAlgorithmResults (line 50) guards against missing requirement with if (c.requirement), but collectHintResults (line 63) does not. If c.requirement is undefined or null in collectHintResults, the key becomes ${purl}@undefined or ${purl}@null, which could cause data integrity issues or unexpected collisions.

Additionally, if the requirement string contains the @ character, it could lead to key collisions (e.g., npm/pkg + requirement foo@bar produces the same key as npm/pkg@foo + requirement bar). Consider either escaping the requirement value or using a different delimiter for the map key.
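
Both key problems mentioned above can be seen in a few lines. The tupleKey function is a hypothetical alternative shown for comparison only; it is not part of the PR.

```typescript
// Current-style key: plain concatenation with '@' as the delimiter.
const concatKey = (purl: string, requirement?: string): string => `${purl}@${requirement}`;

// Hypothetical alternative: serialize both parts so the boundary stays unambiguous.
const tupleKey = (purl: string, requirement: string): string =>
  JSON.stringify([purl, requirement]);

console.log(concatKey('npm/pkg', 'foo@bar') === concatKey('npm/pkg@foo', 'bar')); // true
console.log(tupleKey('npm/pkg', 'foo@bar') === tupleKey('npm/pkg@foo', 'bar'));   // false
console.log(concatKey('npm/pkg', undefined)); // npm/pkg@undefined
```

A serialized tuple costs a few extra characters per key but removes the need to reason about which characters may appear in a purl or a requirement.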


48-55: Guard check in collectAlgorithmResults is inconsistent with collectHintResults.

The collectAlgorithmResults method guards against falsy requirement values with if (c.requirement), but collectHintResults (lines 61-65) lacks this guard and uses c.requirement directly on every component. Although requirement is typed as required (string, not string?), the inconsistency means algorithms may be skipped for components with a falsy requirement while hints are always collected. The result can be incomplete or misaligned output in which a component has hints but no recorded algorithms.

Consider:

  1. Applying the same guard to collectHintResults for consistency, or
  2. Removing the guard from collectAlgorithmResults if requirement is guaranteed to be present, or
  3. Documenting why the two methods handle requirement differently.
🧹 Nitpick comments (2)
src/sdk/scanner/Scanner.ts (2)

556-567: Handle potential ENOENT in cleanup.

fs.rmSync(tempFilePath) will throw if the file doesn't exist (e.g., if both streams error in quick succession). Consider using { force: true } to silently ignore missing files.

🔎 Proposed fix
       readStream.on('error', (error) => {
         writeStream.destroy();
-        fs.rmSync(tempFilePath);
+        fs.rmSync(tempFilePath, { force: true });
         reject(error);
       });

       writeStream.on('error', (error) => {
         rl.close();
         readStream.destroy();
-        fs.rmSync(tempFilePath);
+        fs.rmSync(tempFilePath, { force: true });
         reject(error);
       });
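
The effect of the force option can be checked in isolation. The helper name removeTempFile is an assumption used only for this sketch.

```typescript
import * as fs from 'fs';

// Hypothetical helper: best-effort cleanup that is safe to call more than once.
function removeTempFile(tempFilePath: string): void {
  // force: true makes rmSync ignore a path that no longer exists.
  fs.rmSync(tempFilePath, { force: true });
}

fs.writeFileSync('gr_cleanup_demo.tmp', 'partial output');
removeTempFile('gr_cleanup_demo.tmp'); // removes the file
removeTempFile('gr_cleanup_demo.tmp'); // second call is a no-op instead of throwing ENOENT
console.log(fs.existsSync('gr_cleanup_demo.tmp')); // false
```

This matters when both error handlers can fire for the same failure, since whichever runs second would otherwise throw from inside an event handler.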

609-640: Consider error handling in stream closure.

The stream.end() callback may receive an error argument that's currently ignored. While uncommon, stream close errors could indicate data wasn't fully flushed.

🔎 Optional improvement with error propagation
   private closeWriteStreams(): Promise<void> {
-    return new Promise((resolve, reject) => {
-      let wfpClosed = false;
-      let resultClosed = false;
-
-      const checkBothClosed = () => {
-        if (wfpClosed && resultClosed) {
-          resolve();
-        }
-      };
-
-      if (this.wfpWriteStream) {
-        this.wfpWriteStream.end(() => {
-          wfpClosed = true;
-          checkBothClosed();
-        });
-      } else {
-        wfpClosed = true;
-      }
-
-      if (this.resultWriteStream) {
-        this.resultWriteStream.end(() => {
-          resultClosed = true;
-          checkBothClosed();
-        });
-      } else {
-        resultClosed = true;
-      }
-
-      checkBothClosed();
-    });
+    const closeStream = (stream: fs.WriteStream | undefined): Promise<void> => {
+      if (!stream) return Promise.resolve();
+      return new Promise((resolve, reject) => {
+        stream.end((err?: Error) => {
+          if (err) reject(err);
+          else resolve();
+        });
+      });
+    };
+    return Promise.all([
+      closeStream(this.wfpWriteStream),
+      closeStream(this.resultWriteStream)
+    ]).then(() => {});
   }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc6fa26 and 98f759e.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • CHANGELOG.md
  • package.json
  • src/cli/commands/scan.ts
  • src/sdk/Clients/Cryptography/ICryptographyClient.ts
  • src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts
  • src/sdk/scanner/Scanner.ts
  • src/sdk/scanner/ScannerCfg.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
src/cli/commands/scan.ts (3)
src/sdk/Clients/Dependency/IDependencyClient.ts (1)
  • DependencyResponse (32-35)
src/sdk/Clients/Grpc/scanoss/api/dependencies/v2/scanoss-dependencies_pb.d.ts (1)
  • DependencyResponse (83-102)
src/sdk/Cryptography/CryptographyTypes.ts (2)
  • LocalCryptography (65-69)
  • CryptographyResponse (75-80)
src/sdk/scanner/Scanner.ts (3)
src/index.ts (1)
  • logger (64-64)
src/sdk/scanner/ScannerTypes.ts (1)
  • ScannerResults (70-70)
src/sdk/scanner/ScannnerResultPostProcessor/rules/rule-factory.ts (1)
  • ScannerResultsRuleFactory (6-22)
🔇 Additional comments (10)
src/sdk/scanner/ScannerCfg.ts (1)

30-30: LGTM - buffer reduction aligns with streaming approach.

Reducing from 300 to 20 responses makes sense for the new streaming NDJSON output. More frequent flushes reduce memory pressure at the cost of additional I/O, which is the expected trade-off for streaming workflows.
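
The trade-off can be pictured with a threshold-based buffer. ResponseBuffer below is a hypothetical stand-in, not the Scanner's implementation; only the threshold value 20 comes from the PR.

```typescript
// Hypothetical buffer: collects responses and hands them to a flush callback
// once maxItems is reached.
class ResponseBuffer<T> {
  private items: T[] = [];
  private readonly maxItems: number;
  private readonly flush: (batch: T[]) => void;

  constructor(maxItems: number, flush: (batch: T[]) => void) {
    this.maxItems = maxItems;
    this.flush = flush;
  }

  add(item: T): void {
    this.items.push(item);
    if (this.items.length >= this.maxItems) this.drain();
  }

  // Flush whatever is left, for example at the end of a scan.
  drain(): void {
    if (this.items.length === 0) return;
    const batch = this.items;
    this.items = [];
    this.flush(batch);
  }
}

const batchSizes: number[] = [];
const buffer = new ResponseBuffer<string>(20, (batch) => batchSizes.push(batch.length));
for (let i = 0; i < 45; i++) buffer.add(`response-${i}`);
buffer.drain();
console.log(batchSizes); // [ 20, 20, 5 ]
```

With a threshold of 300 the same 45 responses would stay in memory until the final drain; with 20 they leave in three smaller batches.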

src/sdk/scanner/Scanner.ts (3)

85-88: LGTM - stream member declarations.

The write stream members are properly typed and will be managed through the new lifecycle methods.


646-672: LGTM - NDJSON streaming output.

The streaming approach with deobfuscation and rule application is well-structured. While write() doesn't handle backpressure explicitly, the reduced buffer size (20 responses) keeps individual batch writes small.
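
If explicit backpressure handling were wanted later, one common pattern is to wait for the 'drain' event whenever write() returns false. The function writeLines is a hypothetical sketch of that pattern and is not taken from the PR.

```typescript
import { once } from 'events';
import { Writable } from 'stream';

// Hypothetical helper: writes one NDJSON line per entry and pauses while the
// stream's internal buffer is full.
async function writeLines(stream: Writable, lines: string[]): Promise<void> {
  for (const line of lines) {
    const canContinue = stream.write(`${line}\n`);
    if (!canContinue) {
      // write() returned false: wait until the buffer has been flushed.
      await once(stream, 'drain');
    }
  }
}
```

A call such as `await writeLines(resultWriteStream, entries.map((e) => JSON.stringify(e)))` would then keep at most one buffer's worth of pending data in memory, at the cost of making the write path asynchronous.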


710-717: LGTM - abort cleanup.

Using end() for cleanup during abort is appropriate, allowing buffered data to flush before closing.

src/cli/commands/scan.ts (4)

29-31: LGTM - streaming imports.

The stream-json imports enable efficient streaming JSON parsing without loading the entire file into memory.


302-326: LGTM - streaming component data loading.

The streaming approach for loading scanner results when building the component list is memory-efficient. This addresses the earlier concern about unconditionally loading scanner results into memory.


330-343: LGTM - HTML format size safeguard.

The 2GB limit check is a sensible safeguard before loading the entire file into memory. The error message clearly guides users to use JSON format for large scans.
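
A guard of this kind might look like the sketch below. The constant name, the function name, the exact byte value, and the error wording are assumptions; only the idea of a 2GB check with a pointer to JSON output comes from the review.

```typescript
import * as fs from 'fs';

// Assumed limit of 2 GiB; the constant in the PR may be defined differently.
const MAX_HTML_INPUT_BYTES = 2 * 1024 * 1024 * 1024;

// Hypothetical guard: throws before the caller tries to load the file into memory.
function assertHtmlInputSize(resultPath: string, maxBytes: number = MAX_HTML_INPUT_BYTES): void {
  const { size } = fs.statSync(resultPath);
  if (size > maxBytes) {
    throw new Error(
      `Result file is ${size} bytes, above the ${maxBytes}-byte limit for HTML reports. ` +
        'Use the JSON format for scans of this size.'
    );
  }
}
```

Checking the size with statSync costs one filesystem call and avoids reading any of the file before the decision is made.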


366-375: LGTM - JSON streaming output.

The JSON format path now uses streaming transformation, avoiding the need to load the entire scanner results file into memory.

src/sdk/Clients/Cryptography/ICryptographyClient.ts (1)

23-26: LGTM - requirement field addition.

The requirement field enables proper grouping of components by their version requirement (e.g., purl@requirement), aligning with the cryptography result collection changes.

Verify that all implementations populate the requirement field:

#!/bin/bash
# Search for implementations that create ComponentAlgorithm or ComponentHintResponse objects
rg -n "ComponentAlgorithm|ComponentHintResponse" --type ts -A 5 | head -100

Also applies to: 38-41

package.json (1)

3-3: LGTM - version bump and streaming dependencies align well.

The latest version of stream-json is 1.9.1, and the dependency constraint ^1.9.1 is appropriate. With 380 dependents on npm, stream-json is a well-established micro-library for creating custom JSON processing pipelines with minimal dependencies and memory footprint. The @types/stream-json version 1.7.8 is current and used by 17 other projects. The version bump to 0.29.0 appropriately reflects the new streaming functionality.

@agustingroh agustingroh force-pushed the chore/stream-scan-responses branch from 98f759e to e4f9653 Compare December 22, 2025 10:56

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (1)
src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (1)

61-66: Critical: Missing guard for c.requirement causes inconsistent behavior.

Unlike collectAlgorithmResults (line 50), this method does not check whether c.requirement exists before using it. If a component lacks a requirement field, the key at line 31 will become ${purl}@undefined, leading to:

  • Unexpected map keys and potential data corruption
  • Inconsistent behavior between algorithm and hint collection

Apply the same guard pattern used in collectAlgorithmResults:

🔎 Proposed fix
 public collectHintResults(hintResults: HintsInRangeResponse):void {
   hintResults.components.forEach((c) => {
+    if (c.requirement) {
       const result = this.getOrCreateResult(c.purl,c.version,c.requirement);
       result.hints = c.hints;
+    }
   });
 }
🧹 Nitpick comments (1)
src/cli/commands/scan.ts (1)

112-124: Missing error handler for stdout writes could leave promise hanging.

When writing to stdout (no outputPath), the code doesn't attach an error handler to writeStream. If stdout encounters an error (e.g., broken pipe when piped to a closed process), the promise may never reject.

🔎 Proposed fix
     pipeline.on('error', (error) => {
       if (outputPath && writeStream !== process.stdout) {
         (writeStream as fs.WriteStream).destroy();
       }
       reject(error);
     });

-    if (outputPath) {
-      writeStream.on('error', (error) => {
-        pipeline.destroy();
-        reject(error);
-      });
-    }
+    writeStream.on('error', (error) => {
+      pipeline.destroy();
+      if (outputPath && writeStream !== process.stdout) {
+        (writeStream as fs.WriteStream).destroy();
+      }
+      reject(error);
+    });
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 98f759e and e4f9653.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • CHANGELOG.md
  • package.json
  • src/cli/commands/scan.ts
  • src/sdk/Clients/Cryptography/ICryptographyClient.ts
  • src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts
  • src/sdk/scanner/Scanner.ts
  • src/sdk/scanner/ScannerCfg.ts
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/sdk/scanner/ScannerCfg.ts
  • package.json
  • src/sdk/Clients/Cryptography/ICryptographyClient.ts
  • CHANGELOG.md
🧰 Additional context used
🧬 Code graph analysis (2)
src/cli/commands/scan.ts (3)
src/sdk/Clients/Dependency/IDependencyClient.ts (1)
  • DependencyResponse (32-35)
src/sdk/Clients/Grpc/scanoss/api/dependencies/v2/scanoss-dependencies_pb.d.ts (1)
  • DependencyResponse (83-102)
src/sdk/Cryptography/CryptographyTypes.ts (2)
  • LocalCryptography (65-69)
  • CryptographyResponse (75-80)
src/sdk/scanner/Scanner.ts (3)
src/index.ts (1)
  • logger (64-64)
src/sdk/scanner/ScannerTypes.ts (1)
  • ScannerResults (70-70)
src/sdk/scanner/ScannnerResultPostProcessor/rules/rule-factory.ts (1)
  • ScannerResultsRuleFactory (6-22)
🔇 Additional comments (14)
src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (1)

48-55: Good defensive guard for undefined requirement.

The guard at line 50 properly prevents processing components without a requirement field, avoiding the creation of invalid map keys.

src/cli/commands/scan.ts (5)

29-31: LGTM!

Imports are correctly added for streaming JSON parsing and platform-specific line endings.


62-66: LGTM - addressed previous feedback.

The indentLines helper now correctly splits on '\n' and joins with EOL for platform consistency, addressing the previous EOL mismatch concern.


330-344: Good implementation of format-specific memory management.

The 2GB file size check before loading for HTML format is a sensible guard. The streaming path for JSON format avoids memory issues for large files. This addresses the previous feedback about deferring scanner result loading.


367-376: LGTM!

The JSON format path now uses streaming transformation, avoiding loading the entire scanner result file into memory.


306-319: Missing error handler on streaming pipeline.

If the read stream or parser encounters an error, the promise will hang indefinitely since no 'error' event handler is attached to pipeline.

🔎 Proposed fix
       const pipeline = fs.createReadStream(scannerResultPath)
         .pipe(parser())
         .pipe(streamObject());

       const scannerResults: any = {};

       pipeline.on('data', (data: { key: string; value: any }) => {
         scannerResults[data.key] = data.value;
       });

       pipeline.on('end', () => resolve(scannerResults));
-      pipeline.on('error', reject);
+      pipeline.on('error', (error) => {
+        reject(new Error(`Failed to parse scanner results: ${error.message}`));
+      });

Likely an incorrect or invalid review comment.

src/sdk/scanner/Scanner.ts (8)

85-87: LGTM!

New write stream fields properly declared as private class members.


533-535: Addressed: Error logging now in place.

Previous review comment about silent error handling has been addressed - parse errors are now logged via this.reportLog.


604-640: LGTM!

Stream lifecycle management is well implemented. The closeWriteStreams method correctly handles cases where streams may not be initialized and uses a proper completion pattern.


659-673: LGTM - Streaming NDJSON writes.

Good use of streaming writes with NDJSON format. The rule processing chain correctly reassigns processedResponse before iterating to write entries.


710-717: LGTM!

Good addition of stream cleanup in the abort path to prevent resource leaks.


474-478: LGTM!

Correct sequencing - streams are closed before the NDJSON conversion reads from the file.


413-414: LGTM!

Useful debug logging for monitoring buffer behavior.


588-602: LGTM!

Output files are properly initialized as empty before write streams are created with append mode.

@agustingroh agustingroh force-pushed the chore/stream-scan-responses branch from e4f9653 to d9a2517 Compare December 22, 2025 11:07

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (1)

61-66: Missing guard for c.requirement in collectHintResults - inconsistent with collectAlgorithmResults.

The collectAlgorithmResults method (line 50) checks if (c.requirement) before processing, but collectHintResults does not. While ComponentHintResponse.requirement is typed as required string, the guard in collectAlgorithmResults suggests the runtime data may not always include this field. For consistency and defensive coding, apply the same guard pattern here.

This was flagged in a previous review but appears unaddressed.

🔎 Proposed fix
 public collectHintResults(hintResults: HintsInRangeResponse):void {
   hintResults.components.forEach((c) => {
+    if (c.requirement) {
       const result = this.getOrCreateResult(c.purl,c.version,c.requirement);
       result.hints = c.hints;
+    }
   });
 }
🧹 Nitpick comments (2)
src/cli/commands/scan.ts (1)

62-66: Mixed line endings could cause inconsistent output.

Splitting on '\n' but joining with EOL creates mixed line endings when JSON.stringify output (which uses \n) is joined with platform-specific EOL (which is \r\n on Windows). The output will have \n within JSON values but \r\n between lines.

Consider using consistent line endings throughout:

🔎 Proposed fix for consistent line endings
     // Helper to indent JSON output
-    // Note: JSON.stringify always uses \n, so we split on \n but join with EOL for platform consistency
     const indentLines = (jsonStr: string, spaces: number): string => {
       const indent = ' '.repeat(spaces);
-      return jsonStr.split('\n').map((line, idx) => idx === 0 ? line : indent + line).join(EOL);
+      return jsonStr.split('\n').map((line, idx) => idx === 0 ? line : indent + line).join('\n');
     };

Then replace all EOL usage with '\n' for consistent JSON output, or normalize the entire output at the end.
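
One way to check which terminators the helper itself emits is to run it with both choices. The eol parameter below is an assumption added for the comparison; the helper in the PR takes only the string and the indent width.

```typescript
// Variant of indentLines with the terminator passed in explicitly.
const indentLines = (jsonStr: string, spaces: number, eol: string = '\n'): string => {
  const indent = ' '.repeat(spaces);
  return jsonStr
    .split('\n') // JSON.stringify separates lines with '\n' only
    .map((line, idx) => (idx === 0 ? line : indent + line))
    .join(eol);
};

// The newline inside the string value is escaped by JSON.stringify as the two
// characters backslash and n, so it is not a line break in the output.
const pretty = JSON.stringify({ id: 'file', note: 'a\nb' }, null, 2);
const unix = indentLines(pretty, 2);
const windows = indentLines(pretty, 2, '\r\n');

console.log(unix.includes('\r'));                         // false
console.log(windows.split('\r\n').length);                // 4
console.log(windows.replace(/\r\n/g, '').includes('\n')); // false
```

Under these assumptions the helper's own output uses a single terminator, so any mixing would have to come from other writes in the same file that use a different one. Picking one terminator for the whole output, as suggested above, avoids having to track that.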

src/sdk/scanner/Scanner.ts (1)

526-532: Potential memory concern with large individual results.

Line 530 calls JSON.stringify(result, null, 2).replace(/\n/g, '\n ') which creates multiple string copies for each result entry. For files with very large result objects, this could consume significant memory. Consider whether individual entries are expected to be large.

This is acceptable if individual file results are expected to remain reasonably sized, but worth noting for future optimization if large results become common.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4f9653 and d9a2517.

⛔ Files ignored due to path filters (1)
  • package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (7)
  • CHANGELOG.md
  • package.json
  • src/cli/commands/scan.ts
  • src/sdk/Clients/Cryptography/ICryptographyClient.ts
  • src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts
  • src/sdk/scanner/Scanner.ts
  • src/sdk/scanner/ScannerCfg.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/sdk/scanner/ScannerCfg.ts
  • package.json
🧰 Additional context used
🧬 Code graph analysis (2)
src/cli/commands/scan.ts (4)
src/sdk/Clients/Dependency/IDependencyClient.ts (1)
  • DependencyResponse (32-35)
src/sdk/Cryptography/CryptographyTypes.ts (2)
  • LocalCryptography (65-69)
  • CryptographyResponse (75-80)
src/sdk/Logger/ScanossLogger.ts (1)
  • error (19-21)
src/sdk/Logger/Logger.ts (1)
  • error (33-37)
src/sdk/scanner/Scanner.ts (3)
src/index.ts (1)
  • logger (64-64)
src/sdk/scanner/ScannerTypes.ts (1)
  • ScannerResults (70-70)
src/sdk/scanner/ScannnerResultPostProcessor/rules/rule-factory.ts (1)
  • ScannerResultsRuleFactory (6-22)
🔇 Additional comments (13)
src/sdk/Clients/Cryptography/ICryptographyClient.ts (1)

23-26: LGTM!

The addition of requirement to ComponentAlgorithm creates symmetry with ComponentHintResponse (lines 38-41) and aligns with the updated result collection keying in ComponentCryptographyResultColletor.

CHANGELOG.md (2)

5-11: LGTM!

The changelog entries accurately document the streaming changes and the requirement field addition. The previous issues with the empty list item and compare URL have been addressed.


199-199: Verified: Compare URL is now correct.

The compare URL correctly references v0.28.1 as the base version for the 0.29.0 release.

src/cli/commands/scan.ts (4)

42-126: Well-structured streaming transformation.

The streamAndTransformResults function correctly streams scanner results while constructing the output structure. Good error handling with proper cleanup of write streams on pipeline errors.


303-327: Streaming for cryptography component list extraction.

Good use of streaming to extract component data for cryptography scanning when the API key is provided. This avoids loading the entire file into memory for the main JSON output path while still supporting the cryptography feature.


330-344: Sensible 2GB limit for HTML format.

Good safeguard to prevent OOM errors for HTML generation, with a clear error message guiding users to use JSON format instead.
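For illustration, a size guard of this kind could look like the sketch below. The constant, the function name, and the error wording are assumptions, and "2GB" is read here as 2 GiB; the PR's actual values may differ.

```typescript
// Assumption: "2GB" is interpreted as 2 GiB; the real constant may differ.
const MAX_HTML_INPUT_BYTES = 2 * 1024 * 1024 * 1024;

// Hypothetical guard: reject inputs that are too large to load in memory
// for HTML rendering and point the user at the JSON format instead.
function assertHtmlInputSize(sizeBytes: number): void {
  if (sizeBytes > MAX_HTML_INPUT_BYTES) {
    const sizeGb = (sizeBytes / (1024 * 1024 * 1024)).toFixed(2);
    throw new Error(
      `Scan result is ${sizeGb} GB, which is too large for the HTML report. ` +
        'Use the JSON format instead.'
    );
  }
}
```

In Node.js the size would typically come from `fs.statSync(path).size`, checked before the file is read.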


367-376: JSON output now fully streams.

The JSON format path correctly uses streamAndTransformResults to avoid loading the entire scanner result file into memory.

src/sdk/Cryptography/Helper/ResultCollector/Component/ComponentCryptographyResultColletor.ts (1)

30-31: Verify keying strategy change from version to requirement.

The map key changed from ${purl}@${version} to ${purl}@${requirement}. Components with the same purl and requirement but different version values will now share the same result entry, potentially overwriting each other. Confirm this is the intended behavior for the cryptography result aggregation.
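The collision can be reproduced with a small stand-alone sketch. The class name and the result shape are hypothetical; only the `getOrCreateResult(purl, version, requirement)` signature and the `${purl}@${requirement}` key come from the change under review.

```typescript
// Hypothetical result shape, for illustration only.
interface ComponentResult {
  purl: string;
  version: string;
  requirement: string;
  algorithms: string[];
}

class ResultCollectorSketch {
  private results = new Map<string, ComponentResult>();

  getOrCreateResult(purl: string, version: string, requirement: string): ComponentResult {
    // Key uses requirement, so version no longer distinguishes entries.
    const key = `${purl}@${requirement}`;
    let result = this.results.get(key);
    if (!result) {
      result = { purl, version, requirement, algorithms: [] };
      this.results.set(key, result);
    }
    return result;
  }

  size(): number {
    return this.results.size;
  }
}

const collector = new ResultCollectorSketch();
const first = collector.getOrCreateResult('pkg:npm/example', '1.0.0', '^1.0.0');
first.algorithms = ['aes'];
// Same purl and requirement, different version: the existing entry is reused.
const second = collector.getOrCreateResult('pkg:npm/example', '1.2.0', '^1.0.0');
second.algorithms = ['md5']; // replaces the algorithms recorded for 1.0.0
```

Both calls return the same object, so the entry keeps the first version seen and the second assignment replaces the first set of algorithms.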

src/sdk/scanner/Scanner.ts (5)

85-87: Good addition of write stream fields for resource management.

Adding dedicated write stream fields enables proper lifecycle management and cleanup, which is essential for the streaming approach.


501-577: Well-implemented NDJSON to JSON conversion with proper error handling.

The streaming conversion handles the NDJSON format correctly. Previous review concerns have been addressed:

  • Parse errors are now logged (line 534) instead of silently skipped
  • fs.rmSync calls are wrapped in try/catch (lines 558-563, 569-573) to prevent masking the original error
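As a rough illustration of the conversion logic (not the PR's actual `convertNDJSONToJSON`), the sketch below merges NDJSON lines into one JSON object while reporting parse errors instead of dropping them silently. The function name, the callbacks, and the assumption that each line holds one object keyed by file path are all hypothetical.

```typescript
// Assumption: each NDJSON line is one object such as {"path/to/file": [...]}.
// `write` stands in for a write stream; `logError` stands in for the logger.
function convertNdjsonLines(
  lines: string[],
  write: (chunk: string) => void,
  logError: (message: string) => void
): void {
  write('{');
  let first = true;
  for (const line of lines) {
    if (line.trim() === '') continue; // ignore blank lines
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch (e) {
      // Report the bad line and keep going rather than aborting.
      logError(`Skipping malformed NDJSON line: ${String(e)}`);
      continue;
    }
    if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
      logError('Skipping NDJSON line that is not a JSON object');
      continue;
    }
    const entry = parsed as Record<string, unknown>;
    for (const key of Object.keys(entry)) {
      // Pretty-print the value and indent it under the top-level object.
      const value = JSON.stringify(entry[key], null, 2).replace(/\n/g, '\n  ');
      write(`${first ? '' : ','}\n  ${JSON.stringify(key)}: ${value}`);
      first = false;
    }
  }
  write(first ? '}' : '\n}');
}
```

Only one line is held in memory at a time. In a file-based version the lines could come from `readline.createInterface` over `fs.createReadStream`, with `write` bound to a write stream.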

612-648: Robust stream lifecycle management.

The initializeWriteStreams and closeWriteStreams methods properly manage stream resources. closeWriteStreams correctly waits for both streams to finish before resolving.
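A sketch of the "wait for every stream to finish" pattern, assuming Node.js `Writable` streams. The helper names and signatures here are hypothetical and do not reproduce the Scanner's own methods.

```typescript
import { Writable } from 'node:stream';

// End one stream and resolve once its 'finish' event has fired.
// A null or already finished stream resolves immediately.
function endStream(stream: Writable | null): Promise<void> {
  return new Promise<void>((resolve, reject) => {
    if (!stream || stream.writableFinished) {
      resolve();
      return;
    }
    stream.once('error', reject);
    stream.once('finish', () => resolve());
    stream.end();
  });
}

// Resolve only after all streams have flushed and finished.
async function closeWriteStreams(streams: Array<Writable | null>): Promise<void> {
  await Promise.all(streams.map(endStream));
}
```

Waiting on `finish` rather than returning right after `end()` matters when the next step reads the files back, as the NDJSON to JSON conversion does.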


650-681: NDJSON output with proper processing pipeline.

The appendOutputFiles method correctly:

  1. Writes WFP content via stream
  2. Applies deobfuscation when configured
  3. Applies settings rules when present
  4. Writes each result as a separate NDJSON line

This maintains the streaming benefit while preserving existing functionality.
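The four steps above can be sketched roughly as follows. All names, types, and callbacks are hypothetical stand-ins for the Scanner's streams, deobfuscation map, and settings rules.

```typescript
// Hypothetical shapes: a response maps file paths to result entries.
type ScanResponse = Record<string, unknown[]>;

interface OutputSinks {
  writeWfp: (chunk: string) => void;    // stands in for the WFP write stream
  writeResult: (chunk: string) => void; // stands in for the result write stream
}

interface ProcessingOptions {
  deobfuscate?: (path: string) => string;               // restores original paths
  applyRules?: (response: ScanResponse) => ScanResponse; // settings-based rules
}

function appendOutput(
  wfp: string,
  response: ScanResponse,
  sinks: OutputSinks,
  options: ProcessingOptions = {}
): void {
  // 1. Write WFP content.
  sinks.writeWfp(wfp);

  // 2. Deobfuscate keys when configured.
  let processed: ScanResponse = response;
  const deobfuscate = options.deobfuscate;
  if (deobfuscate) {
    const restored: ScanResponse = {};
    for (const key of Object.keys(processed)) {
      restored[deobfuscate(key)] = processed[key];
    }
    processed = restored;
  }

  // 3. Apply settings rules when present.
  if (options.applyRules) {
    processed = options.applyRules(processed);
  }

  // 4. Emit one NDJSON line per file entry.
  for (const key of Object.keys(processed)) {
    sinks.writeResult(JSON.stringify({ [key]: processed[key] }) + '\n');
  }
}
```

Because each entry is serialized and written on its own line, no step needs the full result set in memory.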


718-725: Proper stream cleanup in abort path.

Calling end() on write streams during abort prevents resource leaks. This is important for error scenarios and manual scan termination.

@agustingroh agustingroh merged commit 88c3d9b into main Dec 22, 2025
4 checks passed
@agustingroh agustingroh deleted the chore/stream-scan-responses branch December 22, 2025 12:16