feat: add MCP node management and firmware update jobs#1465
feat: add MCP node management and firmware update jobs#1465DTTerastar wants to merge 1 commit intomainfrom
Conversation
📝 WalkthroughWalkthroughA new FirmwareUpdateJobService is introduced to manage firmware update operations with URL validation, job tracking, and async OTA execution. The service is registered in the DI container, integrated into FirmwareController and McpResources, and tested with unit tests for URL trust validation. Changes
Sequence Diagram(s)sequenceDiagram
actor Client
participant FirmwareController
participant FirmwareUpdateJobService
participant Node
participant HTTP as HTTP (Firmware Source)
Client->>FirmwareController: Start firmware update
FirmwareController->>FirmwareUpdateJobService: Start(nodeId, url)
rect rgba(100, 150, 200, 0.5)
FirmwareUpdateJobService->>FirmwareUpdateJobService: Validate URL (trusted host?)
FirmwareUpdateJobService->>FirmwareUpdateJobService: Check no active update for node
end
FirmwareUpdateJobService->>FirmwareUpdateJobService: Create FirmwareUpdateJob (Queued)
FirmwareUpdateJobService-->>FirmwareController: Job created
FirmwareController-->>Client: JobId
par Async Execution
FirmwareUpdateJobService->>FirmwareUpdateJobService: UpdateJob status to Running
FirmwareUpdateJobService->>HTTP: GetFirmware(url)
HTTP-->>FirmwareUpdateJobService: Binary/ZIP payload
FirmwareUpdateJobService->>FirmwareUpdateJobService: Extract firmware from ZIP if needed
FirmwareUpdateJobService->>Node: Toggle Arduino mode ON
FirmwareUpdateJobService->>Node: OTA update (TCP to node IP)
Node-->>FirmwareUpdateJobService: Update progress/complete
FirmwareUpdateJobService->>FirmwareUpdateJobService: Log progress with timestamp
FirmwareUpdateJobService->>Node: Toggle Arduino mode OFF
rect rgba(150, 200, 150, 0.5)
FirmwareUpdateJobService->>FirmwareUpdateJobService: UpdateJob status (Succeeded/Failed/Cancelled)
FirmwareUpdateJobService->>FirmwareUpdateJobService: Cleanup resources and active mapping
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested labels
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Fix all issues with AI agents
In `@src/Services/FirmwareUpdateJobService.cs`:
- Around line 46-52: The IsTrustedFirmwareUrl method currently does a
case-sensitive prefix check which can incorrectly reject or accept URLs due to
RFC 3986 case-insensitive scheme/host; update the check to perform a
case-insensitive comparison by using url.StartsWith(prefix,
StringComparison.OrdinalIgnoreCase) for each entry in TrustedFirmwarePrefixes
(or normalize the url to a canonical case before checking) so scheme and host
mismatches are handled correctly; ensure this change is applied where
TrustedFirmwarePrefixes.Any(url.StartsWith) is used to locate the logic in
IsTrustedFirmwareUrl.
- Around line 139-145: The job.CompletedAt timestamp is being set twice: once
inside UpdateJob when transitioning to terminal statuses and again
unconditionally in the finally block; remove the redundant assignment in the
finally block (delete the line "job.CompletedAt = DateTime.UtcNow" in the
finally) so UpdateJob remains the single authoritative setter for terminal
states, while keeping the remaining cleanup calls (_tokens.TryRemove(...),
cts?.Dispose(), _activeJobByNode.TryRemove(...)) intact; ensure no other code
expects the finally block to set CompletedAt.
- Around line 196-207: The method currently constructs ZipArchive with default
leaveOpen causing ms1 to be disposed when zipArchive is disposed, which can
return a closed stream for empty zips; change the ZipArchive creation to
preserve ms1 by calling new ZipArchive(ms1, ZipArchiveMode.Read, leaveOpen:
true) (i.e., replace using var zipArchive = new ZipArchive(ms1) with using var
zipArchive = new ZipArchive(ms1, ZipArchiveMode.Read, leaveOpen: true)) so ms1
remains usable when returned; keep existing logic that creates and returns ms2
for the first entry.
- Around line 54-87: The Start method has a TOCTOU race where ContainsKey +
assignment on _activeJobByNode can let two concurrent starts for the same node
succeed; replace that pattern with an atomic TryAdd on _activeJobByNode using
the generated jobId (call _activeJobByNode.TryAdd(nodeId, jobId)) to fail fast
if another job exists, and if subsequent steps (_jobs.TryAdd or _tokens[jobId] =
cts) fail, roll back by removing the entry from _activeJobByNode (and any
partially added entries in _jobs/_tokens) so no leaks remain; update Start to
use TryAdd for the check-and-set and perform cleanup on any intermediate
failure, leaving ExecuteAsync unchanged.
In `@src/Services/McpResources.cs`:
- Around line 162-170: RequestNodeUpdateTool currently forwards a non-null url
directly to _nsd.Update allowing arbitrary URLs; before calling _nsd.Update in
RequestNodeUpdateTool validate the url by calling
_firmwareUpdateJobs.IsTrustedFirmwareUrl(url) (same logic used by
StartFirmwareUpdateTool) and return an error/throw if the url is not trusted,
otherwise proceed to call _nsd.Update(nodeId, url) and return the success JSON;
ensure you reference RequestNodeUpdateTool, _nsd.Update, and
_firmwareUpdateJobs.IsTrustedFirmwareUrl when making the change.
🧹 Nitpick comments (8)
tests/ESPresense.Companion.Tests/Services/FirmwareUpdateJobServiceTests.cs (3)
7-7: Consider adding the[TestFixture]attribute.NUnit 3.x will discover the tests without it, but the attribute is conventional and ensures NUnit analyzers don't flag the class.
+[TestFixture] public class FirmwareUpdateJobServiceTests
9-23:HttpClientis not disposed after each test.The
CreateService()helper creates a newHttpClientper call without disposal. While this won't cause issues in these specific tests (none reach HTTP), it's good hygiene to either share a single instance or dispose viaIDisposable/ setup-teardown.
43-52: Consider expanding test coverage for additionalStartpaths.Currently only the untrusted-URL error path is tested. Key scenarios to cover:
Startwith empty/nullnodeId→"nodeId is required"Startwith empty/nullurl→"url is required"Startwith a valid URL for a node that already has an active job → duplicate prevention- Successful job creation (verifying returned
jobis non-null and fields are set)src/Controllers/FirmwareController.cs (1)
117-152:GetFirmwareis duplicated between the controller andFirmwareUpdateJobService.
FirmwareController.GetFirmware(lines 117–152) andFirmwareUpdateJobService.GetFirmware(lines 177–208 in the service) contain nearly identical download-and-unzip logic. Consider extracting this into a shared helper to avoid divergence over time.src/Services/FirmwareUpdateJobService.cs (2)
85-85: Fire-and-forgetTask.Run— no graceful shutdown support.The spawned task is never tracked, so there's no way to await in-flight firmware updates during application shutdown. If the host shuts down mid-OTA, the node could be left in Arduino/bootloader mode (line 111) without the
finallyblock reliably executing.Consider tracking running tasks (e.g., via a
ConcurrentBag<Task>) and draining them inIHostApplicationLifetime.ApplicationStopping, or implementingIHostedServicefor coordinated shutdown.
211-228: Data classes: consider separating into their own files.
FirmwareUpdateJob,FirmwareUpdateLogEntry, andFirmwareUpdateJobStatusare public types colocated in the service file. For discoverability and consistency with the project's file-per-type convention, consider moving them to aModels/directory.src/Services/McpResources.cs (2)
142-151:SetNodeSettingsToolaccepts arbitrary settings without validation.Any MCP client can overwrite node settings with any values. While this may be acceptable given the trust model, consider whether certain fields (e.g., sensitive network configs) should be restricted or at least logged at warning level for auditability.
133-140: Consider null handling forGetNodeSettingsTool.If
_nsd.Get(nodeId)returnsnullfor an unknown node, the tool returns"null"as JSON. Consider returning a structured error response (consistent with other tools'{ ok: false, error: ... }pattern) for better MCP client ergonomics.
| public bool IsTrustedFirmwareUrl(string url) | ||
| { | ||
| if (string.IsNullOrWhiteSpace(url)) | ||
| return false; | ||
|
|
||
| return TrustedFirmwarePrefixes.Any(url.StartsWith); | ||
| } |
There was a problem hiding this comment.
URL prefix check is case-sensitive — scheme/host bypass possible.
string.StartsWith with current culture is case-sensitive for ASCII. A URL like HTTPS://GitHub.com/ESPresense/... (valid per RFC 3986 for scheme and host) would be rejected. Consider using StringComparison.OrdinalIgnoreCase or normalizing the URL first.
Proposed fix
- return TrustedFirmwarePrefixes.Any(url.StartsWith);
+ return TrustedFirmwarePrefixes.Any(prefix => url.StartsWith(prefix, StringComparison.OrdinalIgnoreCase));🤖 Prompt for AI Agents
In `@src/Services/FirmwareUpdateJobService.cs` around lines 46 - 52, The
IsTrustedFirmwareUrl method currently does a case-sensitive prefix check which
can incorrectly reject or accept URLs due to RFC 3986 case-insensitive
scheme/host; update the check to perform a case-insensitive comparison by using
url.StartsWith(prefix, StringComparison.OrdinalIgnoreCase) for each entry in
TrustedFirmwarePrefixes (or normalize the url to a canonical case before
checking) so scheme and host mismatches are handled correctly; ensure this
change is applied where TrustedFirmwarePrefixes.Any(url.StartsWith) is used to
locate the logic in IsTrustedFirmwareUrl.
| public (FirmwareUpdateJob? Job, string? Error) Start(string nodeId, string url) | ||
| { | ||
| if (string.IsNullOrWhiteSpace(nodeId)) | ||
| return (null, "nodeId is required"); | ||
|
|
||
| if (string.IsNullOrWhiteSpace(url)) | ||
| return (null, "url is required"); | ||
|
|
||
| if (!IsTrustedFirmwareUrl(url)) | ||
| return (null, "Only ESPresense GitHub URLs are allowed"); | ||
|
|
||
| if (_activeJobByNode.ContainsKey(nodeId)) | ||
| return (null, $"A firmware update is already running for node '{nodeId}'"); | ||
|
|
||
| var jobId = Guid.NewGuid().ToString("N"); | ||
| var job = new FirmwareUpdateJob | ||
| { | ||
| JobId = jobId, | ||
| NodeId = nodeId, | ||
| Url = url, | ||
| Status = FirmwareUpdateJobStatus.Queued, | ||
| CreatedAt = DateTime.UtcNow | ||
| }; | ||
|
|
||
| if (!_jobs.TryAdd(jobId, job)) | ||
| return (null, "Unable to create update job"); | ||
|
|
||
| _activeJobByNode[nodeId] = jobId; | ||
| var cts = new CancellationTokenSource(); | ||
| _tokens[jobId] = cts; | ||
|
|
||
| _ = Task.Run(() => ExecuteAsync(job, cts.Token), CancellationToken.None); | ||
| return (job, null); | ||
| } |
There was a problem hiding this comment.
Race condition (TOCTOU): concurrent Start calls for the same node can both pass the guard.
ContainsKey on line 65 and _activeJobByNode[nodeId] = jobId on line 81 are not atomic. Two concurrent calls with the same nodeId can both pass the check and start duplicate jobs.
Use TryAdd for an atomic check-and-set:
🐛 Proposed fix
- if (_activeJobByNode.ContainsKey(nodeId))
- return (null, $"A firmware update is already running for node '{nodeId}'");
-
var jobId = Guid.NewGuid().ToString("N");
var job = new FirmwareUpdateJob
{
@@
if (!_jobs.TryAdd(jobId, job))
return (null, "Unable to create update job");
- _activeJobByNode[nodeId] = jobId;
+ if (!_activeJobByNode.TryAdd(nodeId, jobId))
+ {
+ _jobs.TryRemove(jobId, out _);
+ return (null, $"A firmware update is already running for node '{nodeId}'");
+ }
+
var cts = new CancellationTokenSource();🤖 Prompt for AI Agents
In `@src/Services/FirmwareUpdateJobService.cs` around lines 54 - 87, The Start
method has a TOCTOU race where ContainsKey + assignment on _activeJobByNode can
let two concurrent starts for the same node succeed; replace that pattern with
an atomic TryAdd on _activeJobByNode using the generated jobId (call
_activeJobByNode.TryAdd(nodeId, jobId)) to fail fast if another job exists, and
if subsequent steps (_jobs.TryAdd or _tokens[jobId] = cts) fail, roll back by
removing the entry from _activeJobByNode (and any partially added entries in
_jobs/_tokens) so no leaks remain; update Start to use TryAdd for the
check-and-set and perform cleanup on any intermediate failure, leaving
ExecuteAsync unchanged.
| finally | ||
| { | ||
| job.CompletedAt = DateTime.UtcNow; | ||
| _tokens.TryRemove(job.JobId, out var cts); | ||
| cts?.Dispose(); | ||
| _activeJobByNode.TryRemove(job.NodeId, out _); | ||
| } |
There was a problem hiding this comment.
CompletedAt is written twice for terminal states.
UpdateJob (line 156) sets CompletedAt for terminal statuses, and then the finally block (line 141) unconditionally overwrites it. Remove one of the two assignments to avoid the subtle inconsistency.
Proposed fix — remove the redundant assignment in `finally`
finally
{
- job.CompletedAt = DateTime.UtcNow;
_tokens.TryRemove(job.JobId, out var cts);
cts?.Dispose();
_activeJobByNode.TryRemove(job.NodeId, out _);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| finally | |
| { | |
| job.CompletedAt = DateTime.UtcNow; | |
| _tokens.TryRemove(job.JobId, out var cts); | |
| cts?.Dispose(); | |
| _activeJobByNode.TryRemove(job.NodeId, out _); | |
| } | |
| finally | |
| { | |
| _tokens.TryRemove(job.JobId, out var cts); | |
| cts?.Dispose(); | |
| _activeJobByNode.TryRemove(job.NodeId, out _); | |
| } |
🤖 Prompt for AI Agents
In `@src/Services/FirmwareUpdateJobService.cs` around lines 139 - 145, The
job.CompletedAt timestamp is being set twice: once inside UpdateJob when
transitioning to terminal statuses and again unconditionally in the finally
block; remove the redundant assignment in the finally block (delete the line
"job.CompletedAt = DateTime.UtcNow" in the finally) so UpdateJob remains the
single authoritative setter for terminal states, while keeping the remaining
cleanup calls (_tokens.TryRemove(...), cts?.Dispose(),
_activeJobByNode.TryRemove(...)) intact; ensure no other code expects the
finally block to set CompletedAt.
| using var zipArchive = new ZipArchive(ms1); | ||
| foreach (var entry in zipArchive.Entries) | ||
| { | ||
| var ms2 = new MemoryStream(); | ||
| await using var entryStream = entry.Open(); | ||
| await entryStream.CopyToAsync(ms2, ct); | ||
| ms2.Position = 0; | ||
| return ms2; | ||
| } | ||
|
|
||
| ms1.Position = 0; | ||
| return ms1; |
There was a problem hiding this comment.
Empty-zip edge case: caller receives a disposed MemoryStream.
If the downloaded content is detected as a zip but contains zero entries, zipArchive disposes ms1 (since leaveOpen defaults to false) when exiting method scope, and the caller receives a closed stream. This is an unlikely scenario but will cause ObjectDisposedException at read time.
Pass leaveOpen: true to preserve ms1:
Proposed fix
- using var zipArchive = new ZipArchive(ms1);
+ using var zipArchive = new ZipArchive(ms1, ZipArchiveMode.Read, leaveOpen: true);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| using var zipArchive = new ZipArchive(ms1); | |
| foreach (var entry in zipArchive.Entries) | |
| { | |
| var ms2 = new MemoryStream(); | |
| await using var entryStream = entry.Open(); | |
| await entryStream.CopyToAsync(ms2, ct); | |
| ms2.Position = 0; | |
| return ms2; | |
| } | |
| ms1.Position = 0; | |
| return ms1; | |
| using var zipArchive = new ZipArchive(ms1, ZipArchiveMode.Read, leaveOpen: true); | |
| foreach (var entry in zipArchive.Entries) | |
| { | |
| var ms2 = new MemoryStream(); | |
| await using var entryStream = entry.Open(); | |
| await entryStream.CopyToAsync(ms2, ct); | |
| ms2.Position = 0; | |
| return ms2; | |
| } | |
| ms1.Position = 0; | |
| return ms1; |
🤖 Prompt for AI Agents
In `@src/Services/FirmwareUpdateJobService.cs` around lines 196 - 207, The method
currently constructs ZipArchive with default leaveOpen causing ms1 to be
disposed when zipArchive is disposed, which can return a closed stream for empty
zips; change the ZipArchive creation to preserve ms1 by calling new
ZipArchive(ms1, ZipArchiveMode.Read, leaveOpen: true) (i.e., replace using var
zipArchive = new ZipArchive(ms1) with using var zipArchive = new ZipArchive(ms1,
ZipArchiveMode.Read, leaveOpen: true)) so ms1 remains usable when returned; keep
existing logic that creates and returns ms2 for the first entry.
| [McpServerTool(Name = "request_node_update")] | ||
| [Description("Request node self-update with optional firmware URL")] | ||
| public async Task<string> RequestNodeUpdateTool( | ||
| [Description("Node identifier")] string nodeId, | ||
| [Description("Optional firmware URL")] string? url = null) | ||
| { | ||
| await _nsd.Update(nodeId, url); | ||
| return JsonSerializer.Serialize(new { ok = true, nodeId, url }, _jsonOptions); | ||
| } |
There was a problem hiding this comment.
Missing URL trust validation in RequestNodeUpdateTool.
When a non-null url is provided, it is passed directly to _nsd.Update(nodeId, url) without calling _firmwareUpdateJobs.IsTrustedFirmwareUrl(url). This bypasses the URL allowlist enforced everywhere else (controller, StartFirmwareUpdateTool), allowing an MCP client to instruct a node to self-update from an arbitrary URL.
🛡️ Proposed fix
public async Task<string> RequestNodeUpdateTool(
[Description("Node identifier")] string nodeId,
[Description("Optional firmware URL")] string? url = null)
{
+ if (url != null && !_firmwareUpdateJobs.IsTrustedFirmwareUrl(url))
+ return JsonSerializer.Serialize(new { ok = false, error = "Only ESPresense GitHub URLs are allowed" }, _jsonOptions);
+
await _nsd.Update(nodeId, url);
return JsonSerializer.Serialize(new { ok = true, nodeId, url }, _jsonOptions);
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| [McpServerTool(Name = "request_node_update")] | |
| [Description("Request node self-update with optional firmware URL")] | |
| public async Task<string> RequestNodeUpdateTool( | |
| [Description("Node identifier")] string nodeId, | |
| [Description("Optional firmware URL")] string? url = null) | |
| { | |
| await _nsd.Update(nodeId, url); | |
| return JsonSerializer.Serialize(new { ok = true, nodeId, url }, _jsonOptions); | |
| } | |
| [McpServerTool(Name = "request_node_update")] | |
| [Description("Request node self-update with optional firmware URL")] | |
| public async Task<string> RequestNodeUpdateTool( | |
| [Description("Node identifier")] string nodeId, | |
| [Description("Optional firmware URL")] string? url = null) | |
| { | |
| if (url != null && !_firmwareUpdateJobs.IsTrustedFirmwareUrl(url)) | |
| return JsonSerializer.Serialize(new { ok = false, error = "Only ESPresense GitHub URLs are allowed" }, _jsonOptions); | |
| await _nsd.Update(nodeId, url); | |
| return JsonSerializer.Serialize(new { ok = true, nodeId, url }, _jsonOptions); | |
| } |
🤖 Prompt for AI Agents
In `@src/Services/McpResources.cs` around lines 162 - 170, RequestNodeUpdateTool
currently forwards a non-null url directly to _nsd.Update allowing arbitrary
URLs; before calling _nsd.Update in RequestNodeUpdateTool validate the url by
calling _firmwareUpdateJobs.IsTrustedFirmwareUrl(url) (same logic used by
StartFirmwareUpdateTool) and return an error/throw if the url is not trusted,
otherwise proceed to call _nsd.Update(nodeId, url) and return the success JSON;
ensure you reference RequestNodeUpdateTool, _nsd.Update, and
_firmwareUpdateJobs.IsTrustedFirmwareUrl when making the change.
Summary by CodeRabbit
Release Notes
New Features