
feat: instrument telemetry for dev command #1223

Open
Hweinstock wants to merge 2 commits into
aws:mainfrom
Hweinstock:feat/dev-telemetry

Conversation

Contributor

@Hweinstock Hweinstock commented May 13, 2026

Description

Add telemetry instrumentation to the dev command for all execution paths (invoke, exec, server modes).

Schema changes:

  • Expanded Action enum: added 'exec'
  • Added UiMode enum: 'browser' | 'terminal'
  • Added 'agui' to Protocol enum
  • Added ui_mode field to DevAttrs

Attributes emitted: action, ui_mode, has_stream, protocol, invoke_count
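For readers skimming the schema changes, the additions can be pictured roughly as below. This is a sketch under assumptions, not the repository's schema file: only 'exec', 'browser', 'terminal', 'agui', and the five attribute names come from this description. The other enum values ('invoke', 'server', 'http'), the field types, and the `isDevAttrs` helper are hypothetical.

```typescript
// Illustrative shapes only; the real schema may use a validation library.
const ACTIONS = ['invoke', 'exec', 'server'] as const; // 'exec' is new; the others are assumed
const UI_MODES = ['browser', 'terminal'] as const;     // new UiMode enum
const PROTOCOLS = ['http', 'agui'] as const;           // 'agui' is new; 'http' is a placeholder

type Action = (typeof ACTIONS)[number];
type UiMode = (typeof UI_MODES)[number];
type Protocol = (typeof PROTOCOLS)[number];

interface DevAttrs {
  action: Action;
  ui_mode: UiMode;
  has_stream: boolean;  // assumed to be a boolean
  protocol: Protocol;
  invoke_count: number; // assumed to be a number
}

function oneOf(allowed: readonly string[], value: unknown): boolean {
  return typeof value === 'string' && allowed.indexOf(value) !== -1;
}

// Hypothetical runtime guard that rejects values outside the enums above.
function isDevAttrs(value: unknown): value is DevAttrs {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as { [key: string]: unknown };
  return (
    oneOf(ACTIONS, v['action']) &&
    oneOf(UI_MODES, v['ui_mode']) &&
    typeof v['has_stream'] === 'boolean' &&
    oneOf(PROTOCOLS, v['protocol']) &&
    typeof v['invoke_count'] === 'number'
  );
}
```

A guard like this would reject an event whose ui_mode is anything other than 'browser' or 'terminal', which is the point of making UiMode an enum rather than a free-form string.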

Note: because the dev server runs indefinitely, we emit telemetry eagerly, before the server starts. The alternative is to refactor runWebUI to return a result that distinguishes a cancellation (Ctrl+C), which would count as a success, from a crash, which would count as a failure. That is a large refactor and is left out of scope.

The effect is that dev success metrics in browser mode answer "was the user able to launch the browser" rather than "did the browser launch successfully".
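The ordering described in the note can be sketched as follows. This is a minimal illustration, assuming a hypothetical `emit` callback and `startServer` function; it does not use the project's telemetry client, whose API is not shown here.

```typescript
// Hypothetical event shape, for illustration only.
type Outcome = { command: string; success: boolean };

async function runWithEagerTelemetry(
  emit: (outcome: Outcome) => void,
  startServer: () => Promise<void>,
): Promise<void> {
  // Record the outcome up front: startServer() is expected to block
  // until the process is killed, so nothing after the await would run.
  emit({ command: 'dev', success: true });
  await startServer();
}
```

The trade-off is visible in the sketch: anything that goes wrong inside `startServer` after the emit is not reflected in the recorded outcome.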

Related Issue

Closes #

Documentation PR

N/A

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Other (please describe):

Testing

How have you tested the change?

  • I ran npm run test:unit and npm run test:integ
  • I ran npm run typecheck
  • I ran npm run lint
  • If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Additional testing:

  • E2E verified with tarball: telemetry JSONL correctly emitted for both success and failure paths
  • Verified user-facing error messages preserved ("Dev server not running on port X")
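A check on the emitted JSONL could look roughly like the sketch below. The record fields (`command`, `success`) and the `parseJsonl` helper are assumptions for illustration; the actual telemetry record format is not shown in this PR description.

```typescript
// Hypothetical record shape; the real telemetry records may differ.
interface TelemetryRecord {
  command: string;
  success: boolean;
}

// Parse JSON Lines text: one JSON object per non-empty line.
function parseJsonl(text: string): TelemetryRecord[] {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as TelemetryRecord);
}

// Example input with one success record and one failure record.
const sample = [
  '{"command":"dev","success":true}',
  '{"command":"dev","success":false}',
  '',
].join('\n');

const records = parseJsonl(sample);
const failures = records.filter((r) => !r.success);
```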

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@github-actions github-actions Bot added the size/m PR size: M label May 13, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label May 13, 2026
@github-actions
Contributor

Package Tarball

aws-agentcore-0.13.1.tgz

How to install

npm install https://github.com/aws/agentcore-cli/releases/download/pr-1223-tarball/aws-agentcore-0.13.1.tgz

@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from 94547e7 to fa268ad Compare May 13, 2026 15:23
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 13, 2026

@agentcore-cli-automation agentcore-cli-automation left a comment


Telemetry instrumentation looks good overall, but a few concerns about the SIGINT/exit handling for the long-running server modes that I think need to be resolved before merging. The invoke and exec paths look clean.

Comment thread src/cli/commands/dev/command.tsx Outdated
Comment thread src/cli/commands/dev/command.tsx Outdated
Comment thread src/cli/commands/dev/command.tsx
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from fa268ad to b9c1ec1 Compare May 13, 2026 15:50
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from b9c1ec1 to a3eadee Compare May 13, 2026 16:26
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from a3eadee to 160dbbc Compare May 13, 2026 16:30
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from 160dbbc to 5c1043c Compare May 13, 2026 16:31
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@Hweinstock Hweinstock closed this May 13, 2026
@Hweinstock Hweinstock reopened this May 13, 2026
@github-actions github-actions Bot added size/m PR size: M agentcore-harness-reviewing AgentCore Harness review in progress and removed size/m PR size: M labels May 13, 2026
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label May 13, 2026

@agentcore-cli-automation agentcore-cli-automation left a comment


All four serious issues raised in earlier review comments appear to be addressed in the latest commit (53c5d65):

  1. --logs SIGINT cleanup race: resolve() is no longer called from the SIGINT handler at lines 407–411; the promise now only resolves via onExit, so SIGTERM (with the 2s SIGKILL fallback in DevServer.kill) has time to drive the child to exit before process.exit(0) runs.
  2. Browser-mode SIGINT race: the duplicate process.once('SIGINT', ...) { resolve() } was removed (lines 472–493). The code now just awaits runBrowserMode and includes an inline comment explicitly calling out the limitation that normal shutdown telemetry requires a follow-up refactor of runWebUI. Reasonable interim state.
  3. TUI process.exit(0) from onBack: explicit collector?.stop(); process.exit(0) was added after the wrapper at lines 463–464, giving the --no-browser path a deterministic exit point.
  4. --logs failure path silently exits: replaced with if (!devResult.success) throw devResult.error; process.exit(0); at lines 416–417, matching the pattern used by the other paths and surfacing the error via the outer catch.
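The shutdown ordering in item 1 can be sketched as below. This is an illustration under assumptions, not the code in command.tsx or DevServer.kill: `ChildLike`, `Schedule`, and `stopChild` are hypothetical names, and the timer is injectable only so the behavior is easy to exercise. The key property is that the returned promise resolves only from the exit listener, never from the signal handler itself.

```typescript
type Signal = 'SIGTERM' | 'SIGKILL';

// Hypothetical minimal view of a child process; not the project's DevServer API.
interface ChildLike {
  kill(signal: Signal): void;
  onExit(listener: () => void): void;
}

// Schedules fn after ms and returns a function that cancels it.
type Schedule = (fn: () => void, ms: number) => () => void;

const realSchedule: Schedule = (fn, ms) => {
  const handle = setTimeout(fn, ms);
  return () => clearTimeout(handle);
};

function stopChild(
  child: ChildLike,
  graceMs: number = 2000,
  schedule: Schedule = realSchedule,
): Promise<void> {
  return new Promise<void>((resolve) => {
    let exited = false;
    let cancelFallback: () => void = () => {};

    // The only path that resolves: the child has actually exited.
    child.onExit(() => {
      exited = true;
      cancelFallback();
      resolve();
    });

    // Ask politely first, then escalate if the grace period elapses.
    child.kill('SIGTERM');
    if (!exited) {
      cancelFallback = schedule(() => {
        if (!exited) child.kill('SIGKILL');
      }, graceMs);
    }
  });
}
```

A SIGINT handler built on this would call `stopChild` and let the main flow await the returned promise before exiting, so the process cannot exit while the child is still shutting down.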

The integ test exercises both success and failure telemetry through a real audit dir (no mocking of telemetry internals), which is good. Schema additions (exec action, UiMode, agui protocol, ui_mode field) and unit tests look correct.

LGTM.

@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 13, 2026
@Hweinstock Hweinstock marked this pull request as ready for review May 13, 2026 17:08
@Hweinstock Hweinstock requested a review from a team May 13, 2026 17:08
@Hweinstock Hweinstock marked this pull request as draft May 13, 2026 17:08
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from 95de36f to a74d5b1 Compare May 13, 2026 17:25
@github-actions github-actions Bot removed the size/m PR size: M label May 13, 2026
@github-actions github-actions Bot added the size/m PR size: M label May 13, 2026
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from a74d5b1 to 56f64f6 Compare May 13, 2026 17:29
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from 56f64f6 to 3905aee Compare May 13, 2026 17:35
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
…metry eagerly

Error classification:
- ConnectionError for connection-refused (new class in lib/errors/types.ts)
- ValidationError for invalid user input (missing --tool, bad JSON, unknown command)
- ResourceNotFoundError for missing container runtime

Browser mode telemetry:
- Emit telemetry eagerly via TelemetryClientAccessor before the blocking
  runBrowserMode call (which never returns). Do not copy this pattern —
  prefer withCommandRunTelemetry for commands that return.
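
The error classification in this commit message could be pictured roughly as follows. The three class names come from the commit message, and the "Dev server not running on port X" wording comes from the PR description. Everything else is an assumption for illustration: the `classifyDevError` helper, the class constructors, and the mapping from `ECONNREFUSED`, `ENOENT`, and `SyntaxError` are hypothetical and are not taken from lib/errors/types.ts.

```typescript
// Hypothetical stand-ins for the classes named in the commit message.
class ConnectionError extends Error {
  constructor(message: string) { super(message); this.name = 'ConnectionError'; }
}
class ValidationError extends Error {
  constructor(message: string) { super(message); this.name = 'ValidationError'; }
}
class ResourceNotFoundError extends Error {
  constructor(message: string) { super(message); this.name = 'ResourceNotFoundError'; }
}

// Map low-level failures onto the categories above (assumed mapping).
function classifyDevError(err: unknown, port: number): Error {
  const code =
    typeof err === 'object' && err !== null ? (err as { code?: unknown }).code : undefined;
  if (code === 'ECONNREFUSED') {
    return new ConnectionError('Dev server not running on port ' + String(port));
  }
  if (code === 'ENOENT') {
    // Assumption: a missing container runtime surfaces as ENOENT on spawn.
    return new ResourceNotFoundError('Container runtime not found');
  }
  if (err instanceof SyntaxError) {
    return new ValidationError('Invalid JSON input: ' + err.message);
  }
  return err instanceof Error ? err : new Error(String(err));
}
```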
@Hweinstock Hweinstock force-pushed the feat/dev-telemetry branch from 3905aee to 88b2a30 Compare May 13, 2026 17:41
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 13, 2026
@Hweinstock Hweinstock deployed to e2e-testing May 13, 2026 17:42 — with GitHub Actions Active
@Hweinstock Hweinstock marked this pull request as ready for review May 13, 2026 17:49