-
Notifications
You must be signed in to change notification settings - Fork 1.4k
[IBuildEngine callbacks] Stage 1: Packet infrastructure + IsRunningMultipleNodes #13149
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[IBuildEngine callbacks] Stage 1: Packet infrastructure + IsRunningMultipleNodes #13149
Conversation
Implements callback infrastructure for TaskHost IBuildEngine callbacks: - Add packet type enum values for TaskHost callbacks (0x20-0x27 range) - Create ITaskHostCallbackPacket interface for request/response correlation - Add TaskHostQueryRequest/Response packets for property queries - Implement IsRunningMultipleNodes forwarding in OutOfProcTaskHostNode - Handle query requests in TaskHostTask (parent side) - Add integration tests for IsRunningMultipleNodes callback This enables tasks running in TaskHost to correctly query whether the build is running with multiple nodes, which is needed for multithreaded mode (-mt) support. Partial fix for dotnet#12863
3c57f6f to
684a479
Compare
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Stage 1 of enabling IBuildEngine callback support for out-of-proc TaskHost execution by introducing request/response packet infrastructure and implementing IBuildEngine2.IsRunningMultipleNodes via a TaskHost→parent query/response roundtrip.
Changes:
- Added TaskHost callback packet types and correlation interface (
ITaskHostCallbackPacket) plus query request/response packets. - Implemented
IsRunningMultipleNodesforwarding from TaskHost to parent (OutOfProcTaskHostNode↔TaskHostTask). - Added unit + integration tests validating packet serialization and TaskHost callback behavior.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Shared/TaskHostQueryResponse.cs | Adds response packet for TaskHost query callbacks. |
| src/Shared/TaskHostQueryRequest.cs | Adds request packet and query enum for simple callback queries. |
| src/Shared/ITaskHostCallbackPacket.cs | Defines request/response correlation contract via RequestId. |
| src/Shared/INodePacket.cs | Reserves and defines new NodePacketType values for TaskHost callbacks. |
| src/Samples/TaskHostCallback/TestIsRunningMultipleNodesTask.cs | Sample task exercising IBuildEngine2.IsRunningMultipleNodes in TaskHost. |
| src/Samples/TaskHostCallback/TestIsRunningMultipleNodes.proj | Sample project wiring task via TaskHostFactory. |
| src/Samples/TaskHostCallback/TaskHostCallback.csproj | Builds the sample TaskHost callback task assembly. |
| src/MSBuild/Resources/xlf/Strings.zh-Hant.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.zh-Hans.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.tr.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.ru.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.pt-BR.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.pl.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.ko.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.ja.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.it.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.fr.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.es.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.de.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/xlf/Strings.cs.xlf | Adds localized entry for new TaskHost connection-loss string. |
| src/MSBuild/Resources/Strings.resx | Adds new TaskHost connection-loss resource string. |
| src/MSBuild/OutOfProcTaskHostNode.cs | Implements callback request plumbing + forwards IsRunningMultipleNodes to parent. |
| src/MSBuild/MSBuild.csproj | Includes new Shared callback packet source files in MSBuild build. |
| src/Build/Microsoft.Build.csproj | Includes new Shared callback packet source files in Microsoft.Build build. |
| src/Build/Instance/TaskFactories/TaskHostTask.cs | Handles TaskHost query requests and replies with forwarded results. |
| src/Build.UnitTests/BackEnd/TaskHostQueryPacket_Tests.cs | Adds roundtrip serialization tests for query request/response packets. |
| src/Build.UnitTests/BackEnd/TaskHostFactory_Tests.cs | Adds integration test verifying IsRunningMultipleNodes works from TaskHost. |
| src/Build.UnitTests/BackEnd/IsRunningMultipleNodesTask.cs | Adds test task used by integration test to call IsRunningMultipleNodes. |
| // Wait for either: response arrives, task cancelled, or connection lost | ||
| // No timeout - callbacks like BuildProjectFile can legitimately take hours | ||
| WaitHandle[] waitHandles = [responseEvent, _taskCancelledEvent]; | ||
|
|
||
| while (true) | ||
| { | ||
| int signaledIndex = WaitHandle.WaitAny(waitHandles, millisecondsTimeout: 1000); | ||
|
|
||
| if (signaledIndex == 0) | ||
| { | ||
| // Response received | ||
| break; | ||
| } | ||
| else if (signaledIndex == 1) | ||
| { | ||
| // Task cancelled | ||
| throw new BuildAbortedException(); | ||
| } | ||
|
|
||
| // Timeout - check connection status (no WaitHandle available for this) | ||
| if (_nodeEndpoint.LinkStatus != LinkStatus.Active) | ||
| { | ||
| throw new InvalidOperationException( | ||
| ResourceUtilities.FormatResourceStringStripCodeAndKeyword("TaskHostCallbackConnectionLost")); | ||
| } | ||
| } |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 1-second WaitAny polling loop will wake up blocked task threads repeatedly even for long-running callbacks (potentially hours), which can add unnecessary CPU wakeups at scale. Consider introducing a wait handle/signaling path for link failure (e.g., set from OnLinkStatusChanged) so the wait can be indefinite without polling.
| // Wait for either: response arrives, task cancelled, or connection lost | |
| // No timeout - callbacks like BuildProjectFile can legitimately take hours | |
| WaitHandle[] waitHandles = [responseEvent, _taskCancelledEvent]; | |
| while (true) | |
| { | |
| int signaledIndex = WaitHandle.WaitAny(waitHandles, millisecondsTimeout: 1000); | |
| if (signaledIndex == 0) | |
| { | |
| // Response received | |
| break; | |
| } | |
| else if (signaledIndex == 1) | |
| { | |
| // Task cancelled | |
| throw new BuildAbortedException(); | |
| } | |
| // Timeout - check connection status (no WaitHandle available for this) | |
| if (_nodeEndpoint.LinkStatus != LinkStatus.Active) | |
| { | |
| throw new InvalidOperationException( | |
| ResourceUtilities.FormatResourceStringStripCodeAndKeyword("TaskHostCallbackConnectionLost")); | |
| } | |
| } | |
| // Wait for either: response arrives or task cancelled. | |
| // Indefinite wait to avoid periodic wakeups for long-running callbacks. | |
| WaitHandle[] waitHandles = [responseEvent, _taskCancelledEvent]; | |
| int signaledIndex = WaitHandle.WaitAny(waitHandles); | |
| if (signaledIndex == 0) | |
| { | |
| // Response received | |
| } | |
| else if (signaledIndex == 1) | |
| { | |
| // Task cancelled | |
| throw new BuildAbortedException(); | |
| } | |
| else | |
| { | |
| // Defensive: unexpected result from WaitAny | |
| throw new InvalidOperationException( | |
| $"Unexpected wait result in {nameof(OutOfProcTaskHostNode)} callback: index={signaledIndex}"); | |
| } |
| if (_nodeEndpoint.LinkStatus != LinkStatus.Active) | ||
| { | ||
| throw new InvalidOperationException( | ||
| ResourceUtilities.FormatResourceStringStripCodeAndKeyword("TaskHostCallbackConnectionLost")); | ||
| } |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR description states that connection loss cancels pending callbacks with TaskCanceledException, but this implementation throws InvalidOperationException on link loss (and there’s no centralized cancellation of all pending requests). Either update the PR description or adjust the implementation to match the documented behavior.
| <data name="TaskHostCallbackConnectionLost" xml:space="preserve"> | ||
| <value>MSB5025: TaskHost lost connection to parent process during callback. The build may have been cancelled.</value> | ||
| <comment>{StrBegin="MSB5025: "}</comment> |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The new TaskHostCallbackConnectionLost resource uses error code MSB5025, but MSB5025 is already used elsewhere (e.g., src/Shared/Resources/Strings.shared.resx for solution filter JSON errors). Please assign an unused MSB number to avoid duplicate error codes (and update corresponding localized xlf entries accordingly).
| // Use ManualResetEvent to bridge TaskCompletionSource to WaitHandle | ||
| using var responseEvent = new ManualResetEvent(false); | ||
| var tcs = new TaskCompletionSource<INodePacket>(TaskCreationOptions.RunContinuationsAsynchronously); | ||
| tcs.Task.ContinueWith(_ => responseEvent.Set(), TaskContinuationOptions.ExecuteSynchronously); | ||
| _pendingCallbackRequests[requestId] = tcs; |
Copilot
AI
Jan 29, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SendCallbackRequestAndWaitForResponse allocates a ManualResetEvent and a Task continuation just to get a WaitHandle. This adds kernel handle + continuation overhead per callback. Consider waiting on the Task’s built-in AsyncWaitHandle (via IAsyncResult) or restructuring to avoid the extra event/continuation.
Context
TaskHost (OutOfProcTaskHostNode) doesn't support IBuildEngine callbacks - tasks that call
IsRunningMultipleNodes, BuildProjectFile, RequestCores, etc. fail with MSB5022 or
NotImplementedException. This blocks multithreaded mode (-mt) where non-thread-safe tasks
are forced to run in TaskHost.
This PR is Stage 1 of implementing callback support. It establishes the infrastructure and
implements IsRunningMultipleNodes as a proof-of-concept for the pattern.
Partial fix for #12863
Changes Made
Packet Infrastructure:
correlation
- Add TaskHostQueryRequest/Response packets for property queries
OutOfProcTaskHostNode (TaskHost side):
until response
- Add HandleCallbackResponse() - routes incoming responses to waiting TCS by RequestId
- Implement IsRunningMultipleNodes to forward query to parent instead of logging error
TaskHostTask (Parent side):
Testing
Notes
Callback Flow:
sequenceDiagram participant Task as Task Thread<br/>(TaskHost) participant Main as Main Thread<br/>(TaskHost) participant Parent as TaskHostTask<br/>(Parent Process) Task->>Task: BuildEngine.IsRunningMultipleNodes Task->>Task: Create TaskHostQueryRequest(requestId) Task->>Task: Store TCS in _pendingCallbackRequests[requestId] Task->>Parent: SendData(request) Task-->>Task: Block on TCS.Task Parent->>Parent: HandleQueryRequest(request) Parent->>Parent: result = _buildEngine.IsRunningMultipleNodes Parent->>Main: SendData(TaskHostQueryResponse) Main->>Main: HandleCallbackResponse(response) Main->>Main: _pendingCallbackRequests[requestId].SetResult(response) Task-->>Task: TCS unblocks Task->>Task: return response.BoolResultThread Model:
Error Handling:
Next Stages: