Add --neo4j-url CLI option to configure the Neo4j connection URL#22
Open
mscottford wants to merge 30 commits intovladbatushkov:mainfrom
Open
Add --neo4j-url CLI option to configure the Neo4j connection URL#22mscottford wants to merge 30 commits intovladbatushkov:mainfrom
mscottford wants to merge 30 commits intovladbatushkov:mainfrom
Conversation
- Targets net10.0 (up from net9.0) - Buildalyzer/Workspaces 7.1.0 → 8.0.0 - Microsoft.CodeAnalysis/CSharp 4.13.0 → 5.3.0 (Analyzers pin removed) - Neo4j.Driver 5.27.0 → 6.0.0 - System.CommandLine 2.0.0-beta4 → 2.0.5 (first stable release) - Dockerfile base image updated to dotnet/sdk:10.0-alpine - Neo4j image updated to 2026.03.1 with corrected env var names - docker-compose.yml: removed obsolete version attribute, replaced hardcoded Windows volume path with relative ./SystemUnderTest path - Code updated for breaking API changes in System.CommandLine 2.0 (SetAction, Options.Add, Aliases.Add, Required, AllowMultipleArgumentsPerToken) and Neo4j.Driver 6.0 (IDriver.CloseAsync removed, use await using) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…from analysis When AddToWorkspace(workspace, addProjectReferences: true) is called for a project, Buildalyzer eagerly adds its referenced projects into the workspace. The previous deduplication check silently dropped those projects on their own loop iteration, causing them to receive no CONTAINS triple and no code analysis. Adds a regression test using a two-project fixture solution where ProjectA references ProjectB — ensuring both appear in the analysis context regardless of workspace insertion order. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Stage 1: Build projects in parallel using .AsParallel() in GetAnalysisContext; each p.Build() spawns an isolated MSBuild process so there is no shared state. AdhocWorkspace mutations remain sequential (not thread-safe). - Stage 3: Analyze and insert all projects concurrently using Task.WhenAll with Task.Run so CPU-bound syntax tree walking runs across thread-pool threads. The delete step is extracted to DbManager.DeleteData and run once upfront before the parallel loop to avoid races. - Stage 4: A SemaphoreSlim(4) gates concurrent Neo4j sessions to avoid exhausting the connection pool on large solutions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The workspace population step (AddToWorkspace) is the slowest sequential phase on large solutions and previously emitted no output, causing a silent pause between "Building projects - finished." and "done." Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…l to complete Previously the pipeline had two hard barriers: all projects had to finish building before workspace loading began, and all projects had to finish loading before analysis began. The Build stage now launches all MSBuild design-time builds concurrently via Task.WhenEach. As each build completes it flows immediately into the Load stage (sequential AdhocWorkspace mutation) and then yields a (Project, IAnalyzerResult) pair to the caller. The Analyze and Insert stages pick up each pair as it arrives via IAsyncEnumerable, so analysis of early- finishing projects begins while later projects are still building and loading. The solution-order sort is removed — AddToWorkspace(addProjectReferences:true) handles any insertion order correctly via the transitive-reference path introduced in the previous fix. GetAnalysisContext is updated to async Task<AnalysisContext> to consume the stream; the regression test is updated accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AddToWorkspace(addProjectReferences: true) calls analyzer.Build() internally for any referenced project not yet in the workspace. When Build and Load were interleaved via Task.WhenEach, in-flight build tasks raced with AddToWorkspace's internal builds for the same project, causing Buildalyzer's environment detection to fail with InvalidOperationException: Could not find build environment. Fix: wait for all builds to complete (Task.WhenAll) before starting the Load stage. Build → Load streaming is lost, but Load → Analyze streaming is retained: each project is yielded as soon as it is loaded, so analysis begins immediately without waiting for all projects to finish loading. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously all projects were built in parallel with no limit, spawning one dotnet process per project simultaneously. A SemaphoreSlim(Environment.ProcessorCount) now gates the Build stage so at most one build per logical core runs at a time. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
string.GetHashCode() is randomized per-process in .NET Core and later (hash randomization was introduced to prevent hash-flooding DoS attacks). Using it as the Neo4j pk meant every run produced a different pk for the same logical node, defeating MERGE and silently creating duplicate nodes instead of updating existing ones. Symptom: projects shared between two solutions (e.g. via a git submodule) appeared as two separate Project nodes in the graph rather than one node with CONTAINS edges from both Solution nodes. Fix: replace GetHashCode() with MD5 in all SetPrimaryKey overrides. MD5 is stable across processes and runtimes and is appropriate here since this is a content-addressing use case, not a security one. Note: existing databases will have stale pks. Re-analyze with --delete to clear and rebuild the graph with deterministic pks. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…nged Introduces a --cache option (defaulting to the platform application data folder) that stores a .binlog and a .deps sidecar per project. On subsequent runs, source file and project file timestamps are compared against the binlog; if nothing has changed the binlog is replayed instead of re-running MSBuild. Transitive dependency invalidation is derived from the .deps sidecars written by previous builds, so MSBuild's own evaluation is used rather than our own project file parsing. Adds tests verifying that cached (cold and warm) builds yield the same results as uncached builds. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Two bugs prevented project folder nodes from being correctly captured in
the graph for solution-based analysis:
1. AddToWorkspace returns null for project types Roslyn does not support
(e.g. F# or native projects). The null was yielded into Analyze where
Project.Name access threw NullReferenceException, silently dropping the
entire task and all its triples. Unsupported project types are now
logged and skipped cleanly.
2. Project folder nodes (e.g. FolderNode("SomeProject")) were created and
connected to their ProjectNode via INCLUDED_IN, but had no edge pointing
up to the solution's root folder. They appeared in the graph as orphans
unreachable from the solution folder via INCLUDED_IN traversal. A new
TripleIncludedIn(projectRootNode, solutionRootNode) triple now connects
each project folder to the solution's root folder.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Without a uniqueness constraint on the pk property, each MERGE operation performs a full label scan — performance degrades linearly as the database grows. EnsureIndexes creates a constraint (and its implicit B-tree index) for all 8 node labels before any data operations run. The constraint is idempotent (IF NOT EXISTS), so re-running is always safe. Also removes the per-triple Console.WriteLine calls from InsertData that were producing excessive noise in the output. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces IAnalysisProgress as an event bus between Analyzer and the terminal UI, removing all AnsiConsole/Console calls from Analyzer.cs. SpectreConsoleProgress implements the interface using AnsiConsole.Live with a custom IRenderable that sorts in-flight tasks by most-recently- updated so stage changes always float to the top of the visible area. Skipped projects (unsupported types such as Android/MAUI) are now removed from the live display immediately. The summary line is pinned near the bottom of the terminal via padding lines computed from the terminal height. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CreateTypeNode() returns null for non-class/non-interface type kinds (structs, delegates, enums, unresolvable types). Without the null guard, new TripleOfType(classNode, null) creates a triple whose NodeB is null, which throws NullReferenceException during triple.ToString() in the grouping phase. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Accepts any number of IAnalysisProgress instances and fans out every notification to all of them. WrapAsync builds a nested chain so the first implementation is the outermost wrapper, ensuring its UI context (e.g. Spectre.Console live display) is active when inner implementations run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Writes one timestamped, machine-readable line per pipeline event so that
build failures, cache hits, stage timings, and grouping errors can be
diagnosed after a run without re-running or relying solely on the live
console output.
Format: {ISO-8601-UTC} {EVENT} {key=value ...}
String values are double-quoted; numbers and booleans are unquoted.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Direct AnsiConsole.MarkupLine calls from concurrent project tasks
interleaved with ctx.Refresh() in the ticker, corrupting Spectre.Console's
cursor position and crashing the ticker. Route all console writes through
a ConcurrentQueue<Action> that the ticker drains before each Refresh(), so
all output originates from a single thread.
Also fixes a FormatException in OnGroupingError where AnsiConsole.Write
treated the triple JSON string as a format template, causing { characters
to throw. Changed to AnsiConsole.Write(new Text(...)) which bypasses
format processing.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously every skipped project showed [yellow]Skipped[/] regardless of why it was skipped. Now each distinct outcome has its own label: Timeout — build timed out (yellow) Unreadable — binlog exists but could not be parsed (yellow) Failed — MSBuild reported a build failure (red) Skipped — other reasons such as unsupported project type (yellow) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Introduces an Options record to replace the positional constructor parameters, making it easier to add new options without breaking call sites. Adds BuildLogDirectory (defaults to the platform data folder's logs/ subdirectory) and NoCache (forces all projects to rebuild even when a valid cached binlog exists). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When reading from a cached binlog, manager.Analyze could return null (e.g. corrupted or truncated binlog from a previous interrupted build). Without the null check the project silently received no triples. The deps sidecar was also never updated on a cache hit, so a project whose build previously timed out kept carrying stale dependency information into subsequent staleness-propagation passes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The MsBuildPipeLogger can silently drop events when the host MSBuild version doesn't match the SDK's, leaving p.Build() returning null even after a successful build. Switching to manager.Analyze(binlogPath) reads the binlog written natively by MSBuild, which is version-independent. The child process closes its stdout pipe before the BinaryLogger finishes flushing the binlog to disk. TryAnalyzeBinlogAsync retries once after 500ms to cover the window where the file exists but hasn't been fully committed, eliminating an intermittent false "build failed" report. BuildWithTimeoutAsync enforces a 5-minute limit so a hung build (e.g. an Android project waiting on SDK tools absent from the environment) doesn't block the rest of the pipeline indefinitely. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously both cases reported "build failed", making it impossible to tell from the log whether MSBuild actually failed or whether the binlog was written but then couldn't be parsed. Now reports: "build failed" — no binlog file produced after build "build log could not be read" — binlog exists but manager.Analyze yields no result Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds a buildLogDirectory parameter to StreamProjectsAsync and passes it through to CreateBuildOptions, which adds /fileLoggerParameters to the MSBuild invocation so each project gets a plain-text build log alongside the binary log. The directory is created on demand if it doesn't exist. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When noCache is true, marks every project as needing a rebuild regardless of whether its cached binlog is still fresh. This lets users force a clean build pass to refresh stale cache entries without deleting the cache directory manually. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Exposes the NoCache and BuildLogDirectory settings added to AnalyzerConfig as first-class CLI flags. Also refactors BuildKnowledgeGraph to accept an AnalyzerConfig.Options record instead of six positional parameters, making it easier to thread new options through without changing every call site. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Threads the two new AnalyzerConfig properties through the Analyze → StreamProjectsAsync call so that --no-cache and --build-log-dir actually take effect during a run. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every analysis run now writes a structured, timestamped log to the build log directory alongside the per-project MSBuild logs. The log captures build starts, cache hits, stage transitions, grouping errors, and final outcomes so failures can be diagnosed post-run without re-running or relying on the live terminal output. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Previously the bolt connection URL was hardcoded to neo4j://localhost:7687 in DbManager.cs and the HTTP healthcheck URL was hardcoded to http://localhost:7474/. Both are now derived from a single --neo4j-url / -u option, defaulting to the same values when omitted so existing usage is unchanged. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This was referenced Apr 15, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
neo4j://localhost:7687bolt URL fromDbManager.cshttp://localhost:7474/healthcheck URL fromHealthcheck.cs--neo4j-url/-uCLI option; defaults toneo4j://localhost:7687when omitted so existing usage is unchangedTest plan
--neo4j-urlflag — behaviour is identical to before--neo4j-url neo4j://remotehost:7687and confirm the driver connects toremotehost--helpoutput shows the new-u/--neo4j-urloptionDependencies
This PR builds on the following open PRs in the stack:
🤖 Generated with Claude Code