Skip to content

Commit 1cfedb3

Browse files
kotakanbeUbuntuclaude
authored
refactor: audit-driven sweep — remove ADR-0004 dead code, consolidate duplication, split god files (#440)
* chore: format all Go files with goimports gofmt -l flagged 16 files with alignment/import-ordering violations introduced across recent feature PRs (audit finding mech-1). No logic changes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor: remove ADR-0004 dead-code cascade (~1500 LOC) ADR-0004 merged the old analyze command into scan, but batch.go and its entire dependency cone were never deleted. This commit removes all of it. Deleted files: - internal/interfaces/cli/batch.go (1058 lines, 850 dead) - internal/interfaces/cli/batch_test.go, batch_processing_test.go - internal/interfaces/cli/option_predicates.go, option_predicates_test.go - internal/interfaces/cli/purl_helpers_test.go (purlHasVersion etc.) - internal/application/fetch_service.go (FetchService — zero live callers) - internal/infrastructure/export/csv/license.go + license_test.go - internal/common/result_processor.go + result_processor_test.go - internal/common/collections/ (map_utils.go — only imported by batch.go) - internal/common/purl/versionless.go + versionless_test.go Extracted from batch.go into new input_helpers.go (live helpers scan.go uses): - ProcessingOptions struct (slimmed: removed OnlyReviewNeeded, OnlyEOL, LicenseCSVPath, Ecosystem — no live readers after batch.go deletion) - truncateDescription, randomSample, validateLineRange - categorizeInputs, categorizeFileLines Minimal edits to remaining files: - application/analysis_service.go: removed WriteLicenseCSV - common/errors.go: removed NewConfigError, NewInsufficientPermissionsError, IsInsufficientPermissionsError - domain/config/value_objects.go: removed Service/Source/Validator interfaces, GetDefaultApp, GetDefaultDepsDev, deleted config field from ConfigService - infrastructure/config/loader.go: removed GetConfig, Reload methods - infrastructure/links/builder.go: removed three dead delegators - domain/analysis/summary.go: removed FirstSentence, paragraphBreak, sentenceBreak - domain/analysis/models.go: removed dead Score struct - integration/populate_project.go: fixed stale comment referencing FirstSentence All gates pass: go build ./..., go test ./..., go vet ./..., gofmt -l . Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs: drop stale references to symbols deleted in the dead-code sweep Code-review findings on phase 1: NewFetchServiceFromConfig mentions in ADR-0018 and two integration option godocs, a FirstSentence reference in NormalizeSummary's godoc, and anachronistic batch wording on ProcessingOptions. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(scan): delegate RunFromPURLs to RunFromPURLsWithActions Replaces the ~33-line duplicate body of RunFromPURLs with a one-line delegation to RunFromPURLsWithActions with a zero ActionsConfig. A zero ActionsConfig skips Phase B (Actions discovery) and is behaviorally identical to the old implementation, eliminating the risk of future drift. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(analysis): use purl.HasVersion in IsVersionResolved (INTENDED BEHAVIOR FIX) Replaces strings.Contains(a.EffectivePURL, "@") with purl.HasVersion which uses the PURL parser. The naive strings.Contains check misclassifies npm scoped packages like pkg:npm/@scope/name as versioned because "@" appears in the namespace segment, not as the version delimiter. Also removes the now-unused "strings" import. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(application): unify batch pipeline, add AnalysisSource interface and EOL evaluator seam Step 1 (dup-1/ddd-2): extract enrichAndAssess method containing all of Phase 1 (EOL evaluation + registry-fallback + repo-URL-fallback), Phase 2 (enrichers), and Phase 3 (composite assessment). ProcessBatchPURLs and ProcessBatchGitHubURLs now delegate to this single shared implementation after their fetch call. Unifies the drifted Phase-3 debug message to "composite_assessment_failed" with snake_case event naming. INTENDED BEHAVIOR FIX (err-2): registry_fallback_resolved and repo_url_fallback_resolved demoted from slog.Info to slog.Debug to prevent log flooding on large batch runs. Step 2 (ddd-1 seam): introduce AnalysisSource interface (AnalyzeFromPURLs + AnalyzeFromGitHubURLs; GitHubClient deliberately excluded per architect decision). Change AnalysisService.integrationService field from *integration.IntegrationService to AnalysisSource — existing concrete callers remain compile-compatible. Add unexported eolBatchEvaluator interface and newEOLEvaluator factory field, defaulted to today's per-call construction in both constructors so per-call cache semantics are preserved byte-for-byte. enrichAndAssess calls s.newEOLEvaluator() once per invocation. GitHubClient() accessor updated to use unexported capability assertion so test fakes that implement AnalysisSource are never required to import the concrete *github.Client type. Step 3 (ddd-1 tests): rewrite TestRegistryFallback_ErrorClearedWhenEOLResolved and TestRepoURLFallback_ErrorClearedWhenRepoResolved to drive the REAL ProcessBatchPURLs through fakeAnalysisSource + fakeEOLEvaluator instead of duplicating production logic inline. Each test also asserts that newEOLEvaluator is called exactly once per ProcessBatchPURLs invocation, guarding the deliberate per-call evaluator construction semantics. Step 4 (ddd-3): add GitHubRateLimitSummary() delegation method on AnalysisService that delegates to GitHubClient().RateLimitSummary() with nil safety. Replace as.GitHubClient().RateLimitSummary() in internal/interfaces/cli/scan.go with as.GitHubRateLimitSummary() so the interfaces layer consumes only application API and never touches the concrete *github.Client value directly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(render): remove omitempty from meaningful zero-value JSON fields (INTENDED BEHAVIOR FIX) scan_render.go: removes omitempty from enrichedJSONEntry.Archived (bool). diet_render.go: removes omitempty from has_vulnerabilities, vulnerability_count, max_cvss_score, and overall_score. All five fields are populated unconditionally in the rendering code, so a false/0 value is a meaningful result ("not archived", "no vulnerabilities"), not an absent one. Consumers can now distinguish false from absent. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(facade): introduce EvaluationService interface and fix NewEvaluatorFromService Introduces a narrow EvaluationService interface in pkg/uzomuzo covering the three methods Evaluator delegates to (ProcessBatchPURLs, ProcessBatchGitHubURLs, WriteScoreCardCSV). Changes Evaluator.service field and NewEvaluatorFromService parameter from *application.AnalysisService to EvaluationService, making the constructor callable by external modules (it was previously structurally unusable because *application.AnalysisService cannot be imported outside the module). *application.AnalysisService still satisfies the interface. Also removes three duplicate trailing godoc summary lines from EvaluatePURLs, EvaluateGitHubRepos, and ExportCSV that were corrupting go doc output. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(integration): unify license-enricher fan-out into dispatchLicenseJobs Add licenseCoord and licenseLogEvents types and a shared dispatchLicenseJobs helper that implements the bounded semaphore(10) fan-out pipeline, ctx.Done-aware dispatch loop, rate-limit-aware WARN switching, and applyManifestLicenses application — all of which were duplicated between enrichLicenseFromManifest and enrichLicenseFromClearlyDefined. Each enricher is now a thin job-collection wrapper that builds the jobs map (the genuinely divergent part) and delegates to dispatchLicenseJobs with a closure over its tier-specific client and an events struct. The Maven tier omits hit/miss/noChange event names (empty string = skip), preserving the intentional telemetry asymmetry. Reduces ~50 lines of structurally identical fan-out code and establishes a single canonical concurrency block. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(config): add godoc comment to NormalizeLifecycleConfig NormalizeLifecycleConfig was the only exported identifier in internal/domain/** and pkg/uzomuzo/** without a godoc comment. The comment accurately describes that it fills zero fields from DefaultValues.Lifecycle and is safe to call multiple times. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(integration): use structured PURL parser in generateVersionedPURL INTENDED BEHAVIOR FIX: replace the naive strings.Contains(basePURL, "@") + strings.Split(basePURL, "@") logic with purl.WithVersion, which uses packageurl.FromString internally. Before: strings.Split("pkg:npm/@angular/core", "@") yields ["pkg:npm/", "angular/core"], so baseWithoutVersion becomes "pkg:npm/" — a corrupted base that produces "pkg:npm/@1.5.0" instead of "pkg:npm/%40angular/core@1.5.0". After: packageurl.FromString correctly identifies namespace="@angular", name="core", and purl.WithVersion sets the Version field and serializes back with percent-encoding: "pkg:npm/%40angular/core@15.0.0". Also: empty version now returns the base PURL without a trailing "@" (e.g. "pkg:npm/express" not "pkg:npm/express@") because an empty version component is not emitted by packageurl-go's serializer; this matches the PURL spec. Invalid PURL inputs still fall back to fmt.Sprintf to preserve observable behaviour. Test cases updated and extended to pin: scoped npm package without corruption, version replacement on scoped package, Maven PURL with namespace, malformed PURL fallback, and empty-version semantics. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(eolevaluator): replace hand-rolled PURL parsers with purl package (dup-5) Add pinning table tests for parseComposerFromPURL, parseNuGetIDFromPURL, and parseMavenFromPURL on canonical inputs before replacing their bodies. Replace the three ad-hoc string-slicing parser bodies with purl-based implementations: each now accepts a *purl.ParsedPURL (already parsed by the caller) and delegates to Namespace()/Name()/Version(). For NuGet, the namespace is reconstructed as "Namespace/Name" when present to preserve malformed-input behavior. This eliminates the double-parse at each apply* call site. Update fuzz targets and pinning tests to use the new ParsedPURL-based signatures. Note: packageurl-go normalizes composer vendor/name to lowercase (Packagist canonical form) — this is a behavior correction toward registry-canonical values. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(httpclient): extract waitWithContext, fix timer leaks, add 408 retry cons-2: Replace the three remaining time.After usages (custom-retry, network-error, 5xx paths) with waitWithContext — a shared helper using time.NewTimer + Stop/drain so the timer goroutine is released immediately when the context fires. The 429 path was already correct; unify it onto the same helper. cons-7 INTENDED BEHAVIOR FIX: Classify HTTP 408 Request Timeout as a transient condition and retry it under the RetryOn5xx flag (no new config, comment explains). Add table test case mirroring the existing 5xx test. dup-8: Add RegistryRetryConfig() constructor returning the shared fast-fail policy (2 retries / 400ms base / 2s max) used by all registry metadata clients. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(registry): replace RetryConfig literals with RegistryRetryConfig() dup-8: Replace all 7 identical RetryConfig{MaxRetries:2, ...} literals across crates, pypi, npmjs, rubygems, and packagist with the new RegistryRetryConfig() constructor. Align three drifted SetHTTPClient implementations (npmjs, rubygems, packagist) from DefaultRetryConfig() to RegistryRetryConfig() so injected test clients exercise the same retry policy as production. SetHTTPClient has no production callers (test seam only); this is a test-policy parity fix listed under behavior_changes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * feat(ttlcache): add generic TTL cache and migrate pypi/crates caches dup-3: Create internal/common/ttlcache — a generic, stdlib-only (sync, time) Cache[V any] with Get/Set/SetTTL replicating the exact semantics of the four hand-written copies it replaces (TTL<=0 disables reads/writes; expiry checked on read; no background eviction). Migrate four copies: - crates.Client: cache field -> ttlcache.Cache[*VersionInfo] - pypi.Client: two caches -> Cache[*ProjectInfo] + Cache[*VersionInfo] - pypi wheel import-name cache -> Cache[[]string] (drops the sync.Once lazy-init; zero-value usability replaces it) Caller-side nil-skip guards preserved at call sites. SetCacheTTL fans the same TTL to all caches as before (single policy per client). Add table-driven unit tests covering hit, miss, expiry, TTL<=0 disabled, and zero-value usability. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(maven,nuget): use errors.Is for io.EOF and u.Hostname() for hostname checks err-4: maven/client.go pomProperties.UnmarshalXML — change direct err == io.EOF comparison to errors.Is(err, io.EOF) for consistency with the rest of the codebase and correctness with wrapped errors. cons-6: Switch hostname equality/suffix comparisons from u.Host to u.Hostname() in nuget/client.go (aka.ms, .microsoft.com, fu redirects, empty-host guard) and maven/client.go (dev.java.net, java.net). u.Host includes the port component, which silently breaks comparisons for URLs with explicit ports. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(eolevaluator): extract shared yanked-version rule core (dup-7) applyPyPIYanked and applyCargoYanked shared an identical guard chain, PURL parse, name/version extraction with StableVersion fallback, and evidence-append block — differing only in ecosystem, name lowercasing, fetch closure, source string, confidence, and log event. Extract applyRegistryYanked as the shared unexported method parameterized by those six values. The two rules become ~12-line wrappers that supply closures over their respective clients. Evidence Source/Summary (including PyPI YankedReason suffix) and reference URLs are preserved byte-for-byte. evaluator_pypi_yanked_test.go and evaluator_cargo_yanked_test.go pass unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * docs(clearlydefined): add best-effort comment to non-OK body drain err-5: The io.CopyN discard on the non-OK status path was missing the same explanatory comment that the adjacent 404 branch already carries. No logic change — add '// best-effort drain before close' for consistency and to satisfy the project's silent-discard rule. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(purl): rename ParsedPURL.GetEcosystem/GetPackageName to Ecosystem/PackageName Go style discourages Get prefixes on accessors; the type's other accessors (Name, Version, Namespace) already follow the convention (audit api-3). Internal-only type — no public API impact. 46 call sites across 17 files. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(crates,pypi): drop always-true nil guards before cache writes staticcheck SA4031: the guarded values are constructed two lines above and can never be nil. The nil-skip behavior the guards meant to preserve lives in call paths where the value can actually be nil. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix: align ttlcache doc with actual nil-storage behavior; always serialize overall_score Phase-2 review findings: the ttlcache package doc claimed nil values are not stored, but Set deliberately stores them and the PyPI wheel import cache relies on that for negative caching. scan_render kept omitempty on overall_score while removing it from archived (populated under the same condition) and from diet_render's score fields — a zero score is data. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(github): replace string-matching isRateLimitError with common.IsRateLimitError Delete the private Client.isRateLimitError (err.Error() string matching) and replace both call sites in fetchRepositoryStatesBatch and githubWorker with common.IsRateLimitError, which uses errors.As against the typed ScorecardError wrapping that httpclient already applies to 429 responses. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(scan): implement three skipped RunFromParser tests; drop duplicate policy test - Add minimalService helper (NewAnalysisService(nil)) — sufficient for the three guards that exit RunFromParser before ProcessBatchPURLs is ever called - TestRunFromParser_NilParser: remove t.Skip, assert error contains "parser is nil" - TestRunFromParser_ParserError: remove t.Skip, assert non-nil error via mockParser - TestRunFromParser_EmptyDeps: remove t.Skip, assert nil error + empty Entries - Delete TestParseFailPolicy_Integration: strict subset of the comprehensive domain-level TestParseFailPolicy in internal/domain/scan/policy_test.go; tests no application-layer wiring Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(depsdev): extract collectBounded generic fan-out helper and migrate all six batch sites Add parallel.go with a generic bounded concurrency helper that uses the ctx-aware semaphore-acquisition pattern mandated by the project's learned concurrency rule (acquire before spawning, select on ctx.Done()). Migrate all six hand-rolled fan-out scaffolds to use it: - GetLatestReleasesForPURLs: stores error-bearing ReleaseInfo (ok=true) - fetchPackageInfoBatch: drops errored items (ok=false) - fetchReleaseInfoBatch: drops errored items (ok=false) - FetchDependentCountBatch: normalizes to canonical key before storing - FetchAdvisoriesBatch: deduplication kept in caller, stores by advisory ID - FetchDependenciesBatch: preserves version-fallback logic in closure Per-site key normalization, skip-on-error decisions, and progress DEBUG logs moved into the fetch closures. Divergent semantics between GetLatestReleasesForPURLs and fetchReleaseInfoBatch are preserved exactly. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(eoltext): decompose DetectLifecycle and hoist per-call regex compilation - Hoist all regexp.MustCompile calls that were inside DetectLifecycle to package-level vars (rxReadmeSuccessor, rxShortMessageExplicit, rxShortMessageSuccessor), mirroring the existing rxReadmeExplicit / contextualEOLPatterns style. This eliminates per-call regex compilation on a hot path (batch README/description scanning) with zero behavior change. - Extract per-source switch into sourceConfig(opts LifecycleDetectOpts) returning (text, detectorConfig, bool), isolating source-specific config from detection dispatch. - Promote the findSentence closure to a named package function findSentenceContaining(full, phrase string) string. - Extract the ~75-line post-match suppression block (sentence negatives, ambiguous-strong token checks, successor extraction, self-reference suppression, date backfill) into postProcessDetection(res, text, cfg, opts) DetectionResult. - DetectLifecycle reduced to ~25 lines of dispatch. All existing tests (detector_test.go, detector_extras_test.go, fuzz_test.go) pass unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(integration): move GitHub-URL flow out of service.go into github_url.go Extract AnalyzeFromGitHubURL, fetchAndValidateGitHubAnalysis, validateRepoURLMatch, buildGitHubOnlyAnalysis, githubURLToPURL, inferPURLFromLanguages, parseGitHubURL, generateVersionedPURL, extractPackageManagersFromManifests, mapPackageManagerToEcosystem, generatePURLsFromPackageManagers, and generatePURLForEcosystem (~430 lines) into a new github_url.go sibling, matching the per-flow file convention already established by github_url_batch.go, purl_batch.go, and the populate_*/enrich_*/resolve_vanity.go files. service.go now retains only the struct definition, options, constructor, FetchAnalysis/FetchAnalysisWithGitHub, and shared converters (createPackageFromPURL, buildVersionDetail, classifyAdvisory, extractScorecardChecks). Zero signature changes; all methods remain on *IntegrationService; all tests pass unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(nuget): split client.go by concern and flatten GetDeprecation Split the 799-line nuget/client.go into four focused files: - client.go: Client struct, NewClient, Set* methods, cacheEntry, remember - deprecation.go: DeprecationInfo, GetDeprecation, registration* wire types, extractFirstDeprecation, scrapeDeprecationFromNuGetHTML - repo_url.go: GetRepoURL, extractRepoURLFromLeaves, normalizeRepoURL, resolveRepoURLHeuristics, followRedirect, scrapeFirstGitHubFromHTML - service_index.go: ServiceIndex, getRegistrationCandidates, discoverRegistrationBases Also extract the inner page-fetch loop from the 128-line GetDeprecation into deprecationFromIndex(ctx, reg, id) to flatten the nested control flow. Types are placed before their first user per project convention. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(diet): release treesitter Analyzer C-heap resources in tests Add t.Cleanup(analyzer.Close) immediately after every treesitter.NewAnalyzer() call in e2e_test.go (4 sites: runDiet helper, TestE2E_DietCLIFlags, TestE2E_DietSourceValidation, TestE2E_DietStdinSBOM) and coupling_integration_test.go (2 sites: table-driven loop subtest and the standalone "unused dependency alongside used dependency" subtest). Using t.Cleanup (rather than defer) ensures the handles are released at subtest/test exit scope, not at helper-function return scope. Closes [test-2] from the refactor audit sweep. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(diet): pin exact HealthRisk values in computeHealthSignals tests Replace range assertions (>= 0.85, <= 0.6, >= 0.85) with exact-value comparisons (± 0.001) in TestComputeHealthSignals_{EOL,Active,Archived}. Derived from the current formula in service.go: EOL: base=0.9 (EOLConfirmed) + (1-5/10)*0.1 = 0.95 Active: base=0.5 (LabelReviewNeeded) + (1-8/10)*0.1 = 0.52 Archived: base=0.5 → max(0.5,0.85)=0.85 via IsArchived → max(0.85,0.6)=0.85 via DaysSince>365 + (1-5/10)*0.1 = 0.90 Added formula comments so future changes to computeHealthSignals prompt updating the pins. Also adds the math import required by math.Abs. Closes [test-3] from the refactor audit sweep. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(diet): replace wantNonZero pattern with exact values in scoring tests Remove the wantNonZero bool field and its conditional assertion branches from TestNormalizeCouplingEffort_ZeroCounts and TestNormalizeCouplingEffort_BlankImportSideEffect. Pin exact expected values derived from the current formula in scoring.go: 0.4*logistic(f,5) + 0.4*logistic(c,20) + 0.2*logistic(a,10) where logistic(x,m) = 1/(1+exp(-4/m*(x-m))) Computed values (tolerance ±0.001): files=1, calls=0, api=0: 0.026458 (0.4*0.039166 + 0.4*0.017986 + 0.2*0.017986) files=2, calls=5, api=3: 0.063704 (0.4*0.083173 + 0.4*0.047426 + 0.2*0.057324) files=1, calls=2, api=0: 0.029902 (0.4*0.039166 + 0.4*0.026597 + 0.2*0.017986) NOTE: The audit computed 0.01567 for case 1 — that value reflects only the file term (0.4*logistic(1,5)), ignoring the non-zero callScore and apiScore terms that the code also evaluates when ImportFileCount>0. The correct value from the actual formula is 0.026458. Formula comments point at scoring.go so future formula changes prompt updating the pins. Closes [test-4] from the refactor audit sweep. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(github): split client.go by responsibility into 4 new files Mechanical same-package split of the 1057-line client.go into focused files: - batch_states.go: repoResult type, FetchRepositoryStates, fetchRepositoryStatesBatch, githubWorker - ratelimit.go: resetRateLimitAggregation, recordRateLimit, snapshotRateLimit, RateLimitSummary, formatResetLocal - license_enrich.go: enrichProjectLicenseFromGitHub (pairs with client_license_enrich_test.go) - commit_stats.go: processCommitHistory, getLatestHumanCommit, getLatestCommit client.go retains Client/NewClient, FetchBasicRepositoryInfo, FetchDetailedRepositoryInfo, FetchRepoLanguages, executeGraphQLQuery, graphqlEndpoint, normalizeRepoURL, and repo-info helpers (MaxTopics, collectTopics, primaryLanguageName, forkSourceFromRepoInfo), reordered with exported functions preceding unexported ones per project convention. No signature changes; full test suite passes unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(cli): split boxdraw.go into frame, sections, and advisory files boxdraw.go (346 LOC) now contains only the box-drawing primitives, wrap helpers, title/verdict helpers, and the two entry-point orchestrators. boxdraw_sections.go (664 LOC) holds all ten writeBox* section renderers plus packageEcoName — the audit-entry display logic layer. boxdraw_advisory.go (126 LOC) holds the four advisory formatting helpers (sortedAdvisoryBlock, formatAdvisoryLines, advisoryCountText, formatTransitiveAdvisoryLines) that are co-used by Releases rendering. No signature changes; boxdraw_test.go compiles and passes unchanged. truncateDescription remains in input_helpers.go (phase 1). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(diet): extract import-path heuristics to import_paths.go Move stateless import-path helpers and ecosystem data tables out of service.go into a new same-package file import_paths.go: - isWorkspaceDep, filterWorkspaceDeps - buildImportPaths - pypiPrefixes, buildPyPIImportPaths, isPythonIdentifierSafe, isPythonDottedIdentifierSafe - mavenPackageOverrides, mavenRuntimeDeps, buildMavenImportPaths, isJavaPackageSafe, isJavaDottedPackageSafe service.go retains only parsePURLParts, the Service methods (Run, buildEntries, resolveUnmatchedPyPI), and computeHealthSignals. No caller changes — same-package visibility is preserved. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(treesitter): decompose AnalyzeCoupling into three helpers Extract three unexported same-package helpers from the 189-line AnalyzeCoupling method, with no behavior change: - buildImportToPURL: builds the lowercase importPath→[]PURL reverse map (previously inlined at the top of AnalyzeCoupling) - (a *Analyzer) analyzeFile: holds the per-file parse body previously inlined in the WalkDir closure; the closure now keeps only dir-skip, ctx.Err, and filepath.Rel plumbing then delegates - buildCouplingResults: converts accumulators to CouplingAnalysis including the dot/blank-import baseline rules (previously the result-assembly loop at the end of AnalyzeCoupling) Parser and cursor lifecycles are preserved exactly: parser is shared across files, cursor is created/closed per file inside analyzeFile. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * refactor(depsdev): split depsdev.go into per-API-flow files (struct-1) Mechanical same-package split of depsdev.go (1567 LOC) into six per-flow files that mirror the existing test layout: - release.go: GetLatestReleasesForPURLs, fetchLatestRelease, fetchReleaseInfoBatch - batch_details.go: GetDetailsForPURLs, fetchBatchPURLs, resolveRepoURLsBatch, repoURLFromRelatedProjects, isGemSystem, fetchPackageInfoBatch, countSuspiciousMavenPURLs, fetchProjectsBatch, buildFinalResults, buildBasicResult, buildCompleteResult, convertRepoURLToProjectKey, normalizeRepoURLForProject - package_info.go: fetchPackageInfo, fetchPURLRaw, tryMavenSearchFallback, stripTrailingVersion - dependents.go: FetchDependentCount, FetchDependentCountBatch - licenses.go: GetPackageVersionLicenses, collectVersionLicenses - advisory.go: FetchAdvisory, FetchAdvisoriesBatch depsdev.go retains: errPURLNotFound, DepsDevClient, NewDepsDevClient, With* options, purlpkgToParsed, ExtractRepositoryURLFromLinks, truncateString. No signature changes. go test ./internal/infrastructure/depsdev/... passes unchanged. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * fix(github): restore emoji stripped from progress output during the file split The struct-3 split was labeled a pure mechanical move but silently dropped the leading emoji from the six user-facing GraphQL progress fmt.Printf lines and from three doc-comment headers (phase-3 review finding). Output is now byte-identical to pre-split. The remaining intentional delta from the split is doc-comment list indentation normalized to godoc style. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com> * test(cli): port input_helpers unit tests dropped in ADR-0004 dead-code cascade The ADR-0004 cleanup (commit 1462ff5) extracted randomSample, categorizeInputs, and categorizeFileLines from the deleted batch.go into the new input_helpers.go as live helpers (scan.go calls all three), but their unit tests in batch_test.go were deleted without being ported. This left the live helpers at 0.0% coverage — a "Port Tests When Replacing Services" regression. The categorizeFileLines unrecognized-line threshold (0.5) is a boundary heuristic that previously had explicit regression coverage. Restores the three table-driven tests verbatim (they reference no removed ProcessingOptions fields). Coverage after: randomSample 100%, categorizeInputs 100%, categorizeFileLines 83.9%. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(depsdev): log max_workers, not unused batch_size, after collectBounded GetLatestReleasesForPURLs logged "batch_size", c.config.BatchSize, but the collectBounded migration dropped BatchSize chunking — the function fans the full PURL list out under a fixed maxWorkers=10 cap, so BatchSize no longer governs anything here. Logging it is misleading telemetry. Switch the field to max_workers (the value that actually bounds concurrency) and reword the stale "Collect all batches into a flat list" comment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(nuget): wrap deprecationFromIndex error with package id context GetDeprecation propagated the deprecationFromIndex error with a bare `return nil, false, err`, inconsistent with every other return in the same function (all use %w) and dropping the package id. Wrap with the id so failures are attributable; the phrase avoids repeating the inner "NuGet ..." prefix per the redundant-context rule. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(cli): pin always-serialized zero-value JSON fields in scan/diet output The audit sweep intentionally removed `omitempty` from scan `archived` / `overall_score` and diet `has_vulnerabilities` / `vulnerability_count` / `max_cvss_score` / `overall_score` (zero values are data; absent must differ from false/zero), but no test locked in the new shape. Add raw-bytes key-presence assertions (struct unmarshalling cannot observe key presence) so a future re-introduction of omitempty is caught. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(cli): rune-safe truncateDescription and accurate line-range error Two Copilot-review fixes in input_helpers.go (live helpers moved from batch.go): - truncateDescription sliced by byte length, which can split a multi-byte UTF-8 rune and emit invalid output for non-ASCII repository descriptions (rendered via boxdraw_sections.go). Truncate by rune instead. - validateLineRange rejected only negative values but the message said "must be positive"; 0 is a valid sentinel meaning "no range filtering". Reword to "non-negative" to match the documented behavior. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(nuget): hoist GitHub URL scrape regex to package scope scrapeFirstGitHubFromHTML compiled its GitHub-URL regex on every call; the HTML-scrape fallback can run once per package. Compile it once at package scope (githubURLPattern), consistent with the eoltext regex hoisting in this sweep. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(uzomuzo): use local Analysis alias in public EvaluationService The EvaluationService interface (and the Evaluator methods) referenced *domain.Analysis via the internal/domain/analysis path. pkg/uzomuzo already re-exports it as the local alias `Analysis` (types.go), so the public surface should use *Analysis — external implementers cannot name the internal path. Behavior-identical (type alias); the now-unused domain import is dropped. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix: resolve review-until-clean Phase A drift (stale docs, dead var, dup package doc) Phase A local-review fixes for PR #440 (no logic change; all verified by build/vet/test/lint): - nuget/deprecation.go: remove dead `decodeErrors` counter + its `_ =` suppressor (tracked-for-future-metrics state that was never consumed). - docs/adr/0018: the CD fan-out no longer uses its own `maxClearlyDefinedConcurrency` semaphore — update to the shared `maxLicenseFetchConcurrency` dispatcher (`license_dispatch.go`). - Drop stale `FetchService` references left by this PR's deletion of fetch_service.go (CLAUDE.md, .github/agents/{architect,planner}.agent.md, populate_manifest_license_test.go comment). - diet/import_paths.go: demote the duplicate package-doc comment to a file note (blank line so Go does not treat it as a second package doc). - github/batch_states.go: replace stream-of-consciousness comment with a concise WHY. - boxdraw_test.go: drop the deleted `--export-license-csv` flag from a comment. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(cli): make --line-range error hint precise about 0-as-unset Copilot flagged "0 disables range filtering" as misleading: a single 0 does not disable filtering — it only leaves that one bound unset (start 0 = from first line, end 0 = to EOF; both 0 = no filtering). Reword the non-negative-guard hint accordingly. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(ttlcache): fix duplicated word in Cache godoc ("Set SetTTL") Copilot flagged the doc comment "Set SetTTL to a positive duration" as a duplicated word that surfaces in GoDoc. Reword to "Call SetTTL with a positive duration to enable caching." Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * docs(uzomuzo): fix stale "domain Analysis" in EvaluationService godoc The ProcessBatchPURLs interface method now returns *Analysis (the exported pkg/uzomuzo alias), but its doc comment still said "domain Analysis" — leftover drift from switching the signature off the internal type. Reword to reference the exported type so external implementers see the right type. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(uzomuzo): reject nil/typed-nil service in NewEvaluatorFromService This PR changed NewEvaluatorFromService's parameter from the concrete *application.AnalysisService to the EvaluationService interface, which introduced a typed-nil footgun: a caller passing `var s *MySvc; ...(s)` hands over a non-nil interface wrapping a nil pointer, so the failure only surfaces as an opaque nil-pointer panic on the first Evaluate* call. Fail fast at construction with a clear message (isNilService catches both untyped nil and typed-nil pointers via reflect). A missing required dependency is a programmer error; the constructor returns no error, so a descriptive panic is the idiomatic guard. Add a table test covering untyped-nil, typed-nil, and valid-service paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * test(ttlcache): de-flake TestCache_Expiry timing under parallel CI load The t.Parallel test used TTL=10ms with a 20ms post-TTL sleep, a margin too tight when the scheduler pauses the goroutine under CI load. Raise the TTL to 50ms and sleep 5x past it (named const), keeping the test fast (~250ms) while removing the nondeterminism Copilot flagged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(depsdev): harden collectBounded worker clamp and FetchAdvisory ID normalization Two defensive fixes on the new/refactored depsdev fan-out code: - collectBounded: clamp maxWorkers to >=1. A value <=0 would build an unbuffered (deadlocking) or invalid (panicking) semaphore. All current callers pass a positive constant; the clamp keeps the generic helper safe for future reuse. - FetchAdvisory: TrimSpace the advisory ID and reject empty, matching FetchAdvisoriesBatch. Prevents whitespace-padded cache keys and invalid requests when the singular entry point is called directly. No-op for the valid IDs all current callers pass (parity preserved). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(uzomuzo): use reflect.Pointer instead of deprecated reflect.Ptr CI golangci-lint (version: latest) flagged `reflect.Ptr` via govet's modernize check ("Constant reflect.Ptr should be inlined"). reflect.Ptr is the deprecated alias for reflect.Pointer (Go 1.18+). Swap to reflect.Pointer in isNilService; no behavior change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Ubuntu <ubuntu@NOTE13759.future.co.jp> Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
1 parent 9d8c6b6 commit 1cfedb3

119 files changed

Lines changed: 6418 additions & 9230 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

.github/agents/architect.agent.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -60,7 +60,7 @@ uzomuzo/
6060
licenses/ # License resolution (spdx_generated.go)
6161
eolresult/ # EOL status types
6262
application/ # Use case orchestration
63-
analysis_service.go # AnalysisService, FetchService
63+
analysis_service.go # AnalysisService
6464
infrastructure/ # External integrations, parallel processing
6565
depsdev/ # deps.dev API client
6666
github/ # GitHub GraphQL/REST client

.github/agents/planner.agent.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -27,7 +27,7 @@ You are an expert planning specialist for this DDD-based Go CLI project, focused
2727
### 2. Codebase Review
2828
- Analyze existing DDD layer structure:
2929
- `internal/domain/` — Pure business logic, entities, value objects
30-
- `internal/application/` — Use case orchestration (AnalysisService, FetchService)
30+
- `internal/application/` — Use case orchestration (AnalysisService)
3131
- `internal/infrastructure/` — External APIs, parallel processing
3232
- `internal/interfaces/cli/` — CLI entry points (thin orchestration)
3333
- `pkg/uzomuzo/` — Public Go library facade

CLAUDE.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -28,7 +28,7 @@ Team uses **Go 1.26.1**. `go.mod` `go 1.25.0` is the dependency minimum — do n
2828
`Interfaces → Application → Domain ← Infrastructure`. See `.claude/rules/ddd-architecture.md`.
2929

3030
- **domain/** — Pure business logic. Core: `Analysis`, `ResolvedLicense`, `EOLStatus`, `AssessmentResult`
31-
- **application/** — Use case orchestration. `AnalysisService` / `FetchService`. Supports `AnalysisEnricher` hook
31+
- **application/** — Use case orchestration. `AnalysisService`. Supports `AnalysisEnricher` hook
3232
- **infrastructure/** — API clients (depsdev, github, eolevaluator), integration, CSV export
3333
- **interfaces/cli/** — CLI entry points. No concurrent logic
3434
- **pkg/uzomuzo/** — Public library facade

cmd/uzomuzo-diet/e2e_test.go

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -76,6 +76,7 @@ func runDiet(t *testing.T, format string) string {
7676

7777
graphAnalyzer := depgraph.NewAnalyzer()
7878
sourceAnalyzer := treesitter.NewAnalyzer()
79+
t.Cleanup(sourceAnalyzer.Close)
7980

8081
opts := cli.DietOptions{
8182
SBOMPath: testSBOMPath,
@@ -254,6 +255,7 @@ func TestE2E_DietCLIFlags(t *testing.T) {
254255

255256
graphAnalyzer := depgraph.NewAnalyzer()
256257
sourceAnalyzer := treesitter.NewAnalyzer()
258+
t.Cleanup(sourceAnalyzer.Close)
257259

258260
app := &urfcli.Command{
259261
Name: "uzomuzo-diet",
@@ -301,6 +303,7 @@ func TestE2E_DietSourceValidation(t *testing.T) {
301303

302304
graphAnalyzer := depgraph.NewAnalyzer()
303305
sourceAnalyzer := treesitter.NewAnalyzer()
306+
t.Cleanup(sourceAnalyzer.Close)
304307

305308
// --source pointing to a file should fail
306309
opts := cli.DietOptions{
@@ -388,6 +391,7 @@ func TestE2E_DietStdinSBOM(t *testing.T) {
388391

389392
graphAnalyzer := depgraph.NewAnalyzer()
390393
sourceAnalyzer := treesitter.NewAnalyzer()
394+
t.Cleanup(sourceAnalyzer.Close)
391395
runErr := cli.RunDiet(context.Background(), cfg, opts, graphAnalyzer, sourceAnalyzer, nil)
392396

393397
os.Stdout = oldStdout

cmd/uzomuzo/main.go

Lines changed: 0 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -56,4 +56,3 @@ func main() {
5656
os.Exit(1)
5757
}
5858
}
59-

cmd/uzomuzo/main_test.go

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -52,12 +52,6 @@ func TestBuildScanOptions(t *testing.T) {
5252
name: "zero/default flags",
5353
args: nil,
5454
check: func(t *testing.T, opts cli.ScanOptions) {
55-
if opts.OnlyReviewNeeded {
56-
t.Error("OnlyReviewNeeded should be false")
57-
}
58-
if opts.OnlyEOL {
59-
t.Error("OnlyEOL should be false")
60-
}
6155
if opts.Format != "" {
6256
t.Errorf("Format should be empty, got %q", opts.Format)
6357
}

docs/adr/0018-clearlydefined-integration.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -105,9 +105,9 @@ Mirrors the existing maven package's shape (`infrastructure/maven/client.go` + `
105105

106106
`internal/infrastructure/integration/populate_clearlydefined_license.go` houses `enrichLicenseFromClearlyDefined`, called from `purl_batch.go` immediately after `enrichLicenseFromManifest`. The method:
107107

108-
- Skips when the injected `cdClient` is nil (CD is opt-in for library users; the application-layer factories `NewAnalysisServiceFromConfig` and `NewFetchServiceFromConfig` wire it eagerly).
108+
- Skips when the injected `cdClient` is nil (CD is opt-in for library users; the application-layer factory `NewAnalysisServiceFromConfig` wires it eagerly).
109109
- Reuses the `needsManifestLicense` predicate so eligibility stays consistent with the manifest tier.
110-
- Fans out fetches under its own semaphore (`maxClearlyDefinedConcurrency = 10`) so the budget does not contend with Maven Central.
110+
- Fans out fetches under the shared `maxLicenseFetchConcurrency = 10` semaphore (`license_dispatch.go::dispatchLicenseJobs`, the same dispatcher the manifest tier uses) so the combined license-fetch budget is bounded.
111111
- Deduplicates by `(ecosystem, namespace, name, version)` so identical coordinates issue exactly one CD call per batch.
112112
- Applies CD results via the same `applyManifestLicenses` helper as the manifest tier.
113113

0 commit comments

Comments
 (0)