Skip to content

docs: add telemetry instrumentation guide#1197

Merged
Hweinstock merged 1 commit into
aws:mainfrom
Hweinstock:docs/telemetry-readme
May 11, 2026
Merged

docs: add telemetry instrumentation guide#1197
Hweinstock merged 1 commit into
aws:mainfrom
Hweinstock:docs/telemetry-readme

Conversation

@Hweinstock
Copy link
Copy Markdown
Contributor

@Hweinstock Hweinstock commented May 11, 2026

Description

Add telemetry instrumentation documentation to help contributors understand how to add metrics to new features and ensure reviewers check for telemetry coverage.

  • src/cli/telemetry/README.md — guide for adding new metrics to features
  • AGENTS.md — note that new features must include telemetry instrumentation
  • .github/harness/prompts/review.md — reviewer checklist item for telemetry

Related Issue

Closes #

Documentation PR

N/A — this PR is itself a documentation change.

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Other (please describe):

Testing

How have you tested the change?

  • I ran npm run test:unit and npm run test:integ
  • I ran npm run typecheck
  • I ran npm run lint
  • If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.

@github-actions github-actions Bot added the size/m PR size: M label May 11, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label May 11, 2026
@github-actions
Copy link
Copy Markdown
Contributor

Package Tarball

aws-agentcore-0.13.1.tgz

How to install

npm install https://github.com/aws/agentcore-cli/releases/download/pr-1197-tarball/aws-agentcore-0.13.1.tgz

Copy link
Copy Markdown

@agentcore-cli-automation agentcore-cli-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up — having a central telemetry guide is great. One blocker though: the docs reference a helper withAddTelemetry (and a corresponding file tui/hooks/useCreateCredential.ts) that don't exist anywhere in the repo. Following the guide as written will result in an import error.

Options to resolve:

  1. Replace every withAddTelemetry reference with withCommandRunTelemetry, which is the helper actually used by TUI add-flows today (e.g. src/cli/tui/screens/identity/useCreateIdentity.ts, src/cli/tui/hooks/useCreateEvaluator.ts, useCreateMcp.ts, useCreateMemory.ts, etc.), and update the file path in the add.credential concrete example to src/cli/tui/screens/identity/useCreateIdentity.ts.
  2. Actually add a withAddTelemetry helper to src/cli/telemetry/cli-command-run.ts and migrate existing call sites, if the intent is that add-flows should have a distinct helper.

Option 1 is the smaller change and matches the current codebase. See inline comments for the specific spots.

Comment thread src/cli/telemetry/README.md Outdated
Comment thread src/cli/telemetry/README.md Outdated
Comment thread src/cli/telemetry/README.md Outdated
@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 11, 2026
@Hweinstock Hweinstock force-pushed the docs/telemetry-readme branch from 39920b0 to 58e966a Compare May 11, 2026 17:26
@github-actions github-actions Bot added size/s PR size: S and removed size/m PR size: M labels May 11, 2026
@Hweinstock Hweinstock force-pushed the docs/telemetry-readme branch from 58e966a to 4ef958c Compare May 11, 2026 17:31
@github-actions github-actions Bot added size/m PR size: M and removed size/s PR size: S labels May 11, 2026
@aws aws deleted a comment from agentcore-cli-automation May 11, 2026
@Hweinstock Hweinstock closed this May 11, 2026
@Hweinstock Hweinstock reopened this May 11, 2026
@github-actions github-actions Bot added size/m PR size: M agentcore-harness-reviewing AgentCore Harness review in progress and removed size/m PR size: M labels May 11, 2026
Copy link
Copy Markdown

@agentcore-cli-automation agentcore-cli-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified the README against the current codebase:

  • withCommandRunTelemetry and runCliCommand signatures in src/cli/telemetry/cli-command-run.ts match what's documented.
  • safeSchema, standardize, resilientParse, NoAttrs, and COMMAND_SCHEMAS all exist in src/cli/telemetry/schemas/ as described.
  • The remove.gateway example matches actual usage in src/cli/primitives/GatewayPrimitive.ts:265.
  • The earlier withAddTelemetry / useCreateCredential.ts issues flagged in the previous review have been addressed — those references no longer appear in the PR.

The AGENTS.md and review.md additions correctly point at the new guide. LGTM.

@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 11, 2026
Comment thread src/cli/telemetry/README.md
@Hweinstock Hweinstock marked this pull request as ready for review May 11, 2026 17:43
@Hweinstock Hweinstock requested a review from a team May 11, 2026 17:43
@Hweinstock Hweinstock merged commit 340878c into aws:main May 11, 2026
45 of 47 checks passed
notgitika added a commit that referenced this pull request May 13, 2026
Cherry-picks non-breaking changes from main:
- docs: add telemetry instrumentation guide (#1197)
- chore: replace PAT tokens with GitHub App token (#1198)
- fix: add batch eval, recommendation, CloudWatch Logs permissions (#1113)
- feat(evaluator): add kmsKeyArn support (#994)
- chore: replace github.token with GitHub App token (#1210)
- fix: sync-preview workflow rewrite (#1078)

Skipped (requires dedicated effort due to Result type refactor):
- refactor: unify result types with Result<T, E> (#1125)
- feat: instrument telemetry for create/deploy (#1202, #1206)
- feat: record command attrs on failure (#1204)
notgitika added a commit that referenced this pull request May 13, 2026
Cherry-picks non-breaking changes from main:
- docs: add telemetry instrumentation guide (#1197)
- chore: replace PAT tokens with GitHub App token (#1198)
- fix: add batch eval, recommendation, CloudWatch Logs permissions (#1113)
- feat(evaluator): add kmsKeyArn support (#994)
- chore: replace github.token with GitHub App token (#1210)
- fix: sync-preview workflow rewrite (#1078)

Skipped (requires dedicated effort due to Result type refactor):
- refactor: unify result types with Result<T, E> (#1125)
- feat: instrument telemetry for create/deploy (#1202, #1206)
- feat: record command attrs on failure (#1204)
notgitika added a commit that referenced this pull request May 13, 2026
Cherry-picks non-breaking changes from main:
- docs: add telemetry instrumentation guide (#1197)
- chore: replace PAT tokens with GitHub App token (#1198)
- fix: add batch eval, recommendation, CloudWatch Logs permissions (#1113)
- feat(evaluator): add kmsKeyArn support (#994)
- chore: replace github.token with GitHub App token (#1210)
- fix: sync-preview workflow rewrite (#1078)

Skipped (requires dedicated effort due to Result type refactor):
- refactor: unify result types with Result<T, E> (#1125)
- feat: instrument telemetry for create/deploy (#1202, #1206)
- feat: record command attrs on failure (#1204)
tejaskash pushed a commit that referenced this pull request May 13, 2026
* docs: add telemetry instrumentation guide (#1197)

* chore: replace PAT tokens with GitHub App token (#1198)

Replace secrets.PAT_TOKEN and secrets.AUTOMATION_ACCOUNT_PAT_TOKEN with
short-lived tokens generated by the agentcore-devx-automation GitHub App
(ID: 3637953) via actions/create-github-app-token@v1.

This improves security by using ephemeral tokens scoped to the
installation rather than long-lived personal access tokens.

Requires adding repo variable APP_ID=3637953 and repo secret
APP_PRIVATE_KEY with the app's RSA private key.

* fix: add batch eval, recommendation, and CloudWatch Logs write permissions to docs (#1113)

* feat: instrument telemetry for create command (CLI + TUI) (#1202)

Add telemetry recording to the create command in both CLI and TUI paths:
- CLI: wrap handleCreateCLI with runCliCommand to emit CreateAttrs on success/failure
- TUI: wrap useCreateFlow's run() with withCommandRunTelemetry
- Add telemetry assertions to existing integration tests (frameworks + edge cases)

* chore: replace all github.token/GITHUB_TOKEN with GitHub App token

Replaces all occurrences of \${{ github.token }} and \${{ secrets.GITHUB_TOKEN }}
across .github/workflows/ with a per-job GitHub App token generated via
actions/create-github-app-token@v1 using vars.APP_ID and secrets.APP_PRIVATE_KEY.

* fix: bump versions to resolve security audit failure

* revert: keep pr-title.yml using GITHUB_TOKEN (read-only access sufficient)

* feat(evaluator): add kmsKeyArn support for custom evaluator (#994)

* feat(evaluator): Add kmsKeyArn support for custom evaluator

* fix: sync package-lock.json with package.json

The lock file was out of sync after dependency bumps on main were merged,
causing npm ci to fail in CI.

* fix: revert unrelated dep bumps and fix formatting

Reverts @opentelemetry/exporter-metrics-otlp-http ^0.217.0 back to
^0.214.0 and secretlint ^13.0.0 back to ^12.2.0 — these were
accidentally included in the feature commit from unmerged dependabot PRs
and introduce high-severity protobufjs vulnerabilities.

Restores fast-xml-parser and @aws-sdk/xml-builder overrides that were
also inadvertently removed.

Fixes Prettier formatting on agentcore-project.ts import lines.

* fix: sync package-lock.json with updated dependencies

---------

Co-authored-by: notgitika <gitijh@gmail.com>

* refactor: unify result types with discriminated Result<T, E> union (#1125)

* refactor: unify result types with discriminated Result<T, E> union

Introduce a shared Result<T, E> type (inspired by Rust's Result) that
replaces ad-hoc { success: boolean; error?: string } patterns across
the codebase.

Key changes:
- Add src/lib/types.ts with Result<T, E> discriminated union type
- Add toError() helper in src/cli/errors.ts for catch blocks
- Migrate all command, operation, and primitive result types to Result<T>
- Error field is now Error (not string) on the failure branch
- Data fields only exist on the success branch (proper narrowing)
- Update all consumers to narrow before accessing branch-specific fields
- Update test assertions to match new Error objects and add narrowing

* docs: update AGENTS.md and telemetry README to reflect Result<T, E> type

* feat: record command attrs on telemetry failure via fallbackAttrs (#1204)

* feat: record command attrs on telemetry failure via fallbackAttrs

Add optional fallbackAttrs parameter to client.withCommandRun so
command-specific attributes are recorded even when the callback throws.

- client.ts: accept fallbackAttrs, use on failure instead of {}
- client.ts: run resilientParse on all non-empty attrs (not just success)
- cli-command-run.ts: withCommandRunTelemetry passes attrs as fallbackAttrs
- cli-command-run.ts: runCliCommand accepts optional knownAttrs param
- command.tsx: extract knownAttrs upfront, pass to runCliCommand
- client.test.ts: add unit tests for fallbackAttrs behavior
- create-edge-cases.test.ts: assert attrs present on failure entry

* chore: rebase onto mainline

* fix: sync-preview pushes directly on clean merge, PRs only on conflict (#1078)

* fix: sync-preview workflow restores version instead of ignoring files

Instead of keeping preview's entire package.json/package-lock.json
(which discards new deps, scripts, etc. from main), accept main's
content and surgically restore only the version field to preview's
value after merge.

* fix: push directly to preview on clean merge via GitHub App bypass

Use agentcore-devx-automation app token to bypass branch protection
and push directly when the merge is clean (or only version conflicts).
Only creates a PR when there are real conflicts in other files.

* chore: use app-slug instead of app-id for token generation

* fix: address review feedback on sync-preview workflow

- Pass PREVIEW_VERSION via env var instead of string interpolation in
  node -e scripts (safer against special chars)
- Make git add of package-lock.json conditional on file existence to
  match the earlier -f guard
- Replace loose title search for dedup with headRefName prefix filter
  to avoid false positives from unrelated PRs
- Clarify why package.json/package-lock.json are special-cased (preview
  carries a different version string that needs preserving)

* fix: restore preview-owned files after sync merge

Adds a step to restore schemas/agentcore.schema.v1.json and CHANGELOG.md
to preview's versions after merging main. These files are auto-generated
during preview releases — schema-check CI rejects direct modifications
to schemas/, and CHANGELOG.md tracks preview releases separately.

* fix: use app-id instead of app-slug for GitHub App token

Aligns with the pattern in PR #1210 and ci-failure-issue.yml.

* feat: instrument telemetry for deploy command (CLI + TUI) (#1206)

* chore: sync safe commits from main into preview

Cherry-picks non-breaking changes from main:
- docs: add telemetry instrumentation guide (#1197)
- chore: replace PAT tokens with GitHub App token (#1198)
- fix: add batch eval, recommendation, CloudWatch Logs permissions (#1113)
- feat(evaluator): add kmsKeyArn support (#994)
- chore: replace github.token with GitHub App token (#1210)
- fix: sync-preview workflow rewrite (#1078)

Skipped (requires dedicated effort due to Result type refactor):
- refactor: unify result types with Result<T, E> (#1125)
- feat: instrument telemetry for create/deploy (#1202, #1206)
- feat: record command attrs on failure (#1204)

* chore: merge main into preview with Result refactor

Brings in all remaining main commits including:
- refactor: unify result types with Result<T, E> (#1125)
- feat: instrument telemetry for deploy command (#1206)
- feat: record command attrs on failure (#1204)
- feat: instrument telemetry for create command (#1202)

Adapts preview's deploy flow to use OperationResult with string errors
for compatibility with the telemetry layer.

---------

Co-authored-by: Hweinstock <42325418+Hweinstock@users.noreply.github.com>
Co-authored-by: Aidan Daly <99039782+aidandaly24@users.noreply.github.com>
Co-authored-by: Gitika <53349492+notgitika@users.noreply.github.com>
Co-authored-by: Aidan Daly <aidandal@amazon.com>
Co-authored-by: Harrison Weinstock <hkobew@amazon.com>
Co-authored-by: aws-aditya21 <saivenki@amazon.com>
Co-authored-by: notgitika <gitijh@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/m PR size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants