feat: add automated tool-usage collection via Yarn plugin and agent hooks (Phase 2) #28815

Merged
NicolasMassart merged 28 commits into main from MCWP-513-po-c-phase-2-automated-collection on Apr 21, 2026
@NicolasMassart NicolasMassart commented Apr 14, 2026

Description

Phase 2 of the automated tool-usage collection infrastructure (MCWP-513), building on the core DB write infrastructure from Phase 1.

What changed:

  • Yarn Berry plugin (plugin-usage-tracking.cjs): automatically records start/end/interrupted events for every yarn <script> execution via the wrapScriptExecution hook, so no changes to package.json script entries are required. It fires and forgets a detached tsx subprocess so it never blocks the terminal. Exit code 129 (SIGHUP, which is how Yarn terminates the child on Ctrl+C) is recorded as interrupted with success=NULL to distinguish abandoned sessions from genuine failures.
  • Claude Code PreToolUse hook (.claude/skills/pr-changelog/SKILL.md): frontmatter hook fires automatically on the first tool call after the skill loads — zero tokens, agent-invisible.
  • Cursor beforeReadFile hook (.cursor/hooks.json + cursor-hook-skill-tracking.ts): fires automatically when Cursor reads any .agents/skills/*/SKILL.md — zero tokens, agent-invisible. Uses execFileSync with an explicit argument array to avoid shell injection. Guards file_path type before extracting the skill name.
  • CI guard + developer opt-out (TOOL_USAGE_COLLECTION_OPT_IN=false): all three collection paths (Yarn plugin, Claude hook, Cursor hook) skip silently when CI is set or when TOOL_USAGE_COLLECTION_OPT_IN=false is exported in the developer's shell profile. The Yarn plugin returns an empty factory; the shell commands use [ -z "$CI" ] && [ "$TOOL_USAGE_COLLECTION_OPT_IN" != "false" ] && guards. The Cursor hook uses || echo '{"permission":"allow"}' as a fallback so Cursor always receives an explicit JSON permission response, even when the guards short-circuit.
  • Developer documentation (scripts/tooling/README.md, AGENTS.md, docs/readme/development-process.md): documents all three collection paths, skip conditions, and the architecture with a Mermaid diagram; clarifies that data is recorded locally only, never sent outside.
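
For reviewers, here is a minimal sketch of the exit-code handling described in the Yarn plugin bullet above. The names `classifyExit`, `ScriptOutcome`, and `SIGHUP_EXIT_CODE` are hypothetical and not taken from plugin-usage-tracking.cjs. Treating every other non-zero code as a failed `end` event is an assumption inferred from the "genuine failures" wording.

```typescript
// Illustrative only: hypothetical names, not the plugin's actual code.
type EventType = "end" | "interrupted";

interface ScriptOutcome {
  eventType: EventType;
  // true/false for a script that finished, null when the run was abandoned
  success: boolean | null;
}

// 128 + signal number 1 (SIGHUP): the code the PR description treats as "interrupted".
const SIGHUP_EXIT_CODE = 129;

function classifyExit(exitCode: number): ScriptOutcome {
  if (exitCode === SIGHUP_EXIT_CODE) {
    // Abandoned session: success stays NULL so it is not counted as a failure.
    return { eventType: "interrupted", success: null };
  }
  // Assumption: any other exit code is a completed run, successful only when 0.
  return { eventType: "end", success: exitCode === 0 };
}
```

Keeping `success` nullable rather than forcing it to false lets later queries separate cancelled runs from real failures.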

Changelog

CHANGELOG entry: null

Related issues

Fixes MCWP-513

Manual testing steps

Prerequisites: branch checked out, yarn install done, sqlite3 available (macOS: pre-installed; Linux: sudo apt install sqlite3 or brew install sqlite3), Claude Code installed and configured with this repo as the working directory (required for Path 2 only — https://docs.anthropic.com/en/docs/claude-code)

Feature: automated tool-usage collection (Phase 2)

  Background:
    Given the branch is checked out and dependencies are installed with `yarn install`
    And sqlite3 is available in the terminal
      # macOS: pre-installed. Linux: sudo apt install sqlite3 or brew install sqlite3
    And Claude Code is installed and configured with this repo as the working directory
      # required for Path 2 only — https://docs.anthropic.com/en/docs/claude-code
    And the Cursor `beforeReadFile` hook is active
      # required for Path 3 only — hooks.json is committed; Cursor loads it automatically for trusted workspaces

  # ── Path 1: Yarn Berry plugin ──────────────────────────────────────────────

  Scenario: Yarn plugin records start and end events for a successful script run
    Given no recent events exist in ~/.tool-usage-collection/events.db for tool_name "yarn:setup:expo"
      # check: sqlite3 ~/.tool-usage-collection/events.db "SELECT * FROM events WHERE tool_name='yarn:setup:expo' ORDER BY created_at DESC LIMIT 3;"
      # clean: sqlite3 ~/.tool-usage-collection/events.db "DELETE FROM events WHERE tool_name='yarn:setup:expo';"
    When user runs `yarn setup:expo`
    Then 2 new rows appear in the events table with tool_name = "yarn:setup:expo" and tool_type = "yarn_script"
      # verify: sqlite3 ~/.tool-usage-collection/events.db "SELECT event_type, tool_type, success, duration_ms FROM events WHERE tool_name='yarn:setup:expo' ORDER BY created_at DESC LIMIT 2;"
      # expected output:
      #   end|yarn_script|1|<ms>
      #   start|yarn_script||
    And one row has event_type = "start"
    And one row has event_type = "end" with success = 1 and duration_ms > 0

  Scenario: Yarn plugin records an interrupted event (not end) when a script is cancelled with Ctrl+C
    Given no recent events exist for tool_name "yarn:test:unit"
      # check: sqlite3 ~/.tool-usage-collection/events.db "SELECT * FROM events WHERE tool_name='yarn:test:unit' ORDER BY created_at DESC LIMIT 3;"
      # clean: sqlite3 ~/.tool-usage-collection/events.db "DELETE FROM events WHERE tool_name='yarn:test:unit';"
    When user runs `yarn test:unit` and interrupts it with Ctrl+C before completion
      # Note: Yarn terminates the child via SIGHUP (exit code 129) on Ctrl+C — the plugin
      # detects this and writes event_type='interrupted' instead of 'end', with success=NULL
    Then 2 rows exist for tool_name = "yarn:test:unit"
      # verify: sqlite3 ~/.tool-usage-collection/events.db "SELECT event_type, success, duration_ms FROM events WHERE tool_name='yarn:test:unit' ORDER BY created_at DESC LIMIT 2;"
      # expected output (duration_ms varies, success is always empty for interrupted):
      #   interrupted||<ms>
      #   start||
    And one row has event_type = "start" and one row has event_type = "interrupted"
    And the "interrupted" row has success = NULL and duration_ms > 0

  Scenario: Yarn plugin skips collection when CI env var is set
    Given CI=true is set in the environment
    When user runs any `yarn` script
    Then no new rows are written to ~/.tool-usage-collection/events.db
      # The plugin returns an empty factory immediately — no hooks registered, no subprocess spawned

  Scenario: Yarn plugin skips collection when opted out
    Given TOOL_USAGE_COLLECTION_OPT_IN=false is exported in the shell
    When user runs any `yarn` script
    Then no new rows are written to ~/.tool-usage-collection/events.db

  # ── Path 2: Claude Code skill (PreToolUse hook) ──────────────────────────

  Scenario: Claude Code PreToolUse hook fires automatically when the pr-changelog skill is invoked
    Given no recent events exist for tool_name "skill:pr-changelog" and agent_vendor "claude"
      # check: sqlite3 ~/.tool-usage-collection/events.db "SELECT * FROM events WHERE tool_name='skill:pr-changelog' AND agent_vendor='claude' ORDER BY created_at DESC LIMIT 3;"
      # clean: sqlite3 ~/.tool-usage-collection/events.db "DELETE FROM events WHERE tool_name='skill:pr-changelog' AND agent_vendor='claude';"
    When user starts a Claude Code session in this repo and asks "write a changelog entry for this branch"
      # The PreToolUse hook in .claude/skills/pr-changelog/SKILL.md fires automatically
      # on the first tool call after the skill is loaded (once: true) — the agent does not need to call anything explicitly
    Then 1 new row exists with tool_name = "skill:pr-changelog", tool_type = "skill", event_type = "start", agent_vendor = "claude"
      # verify: sqlite3 ~/.tool-usage-collection/events.db "SELECT tool_name, tool_type, event_type, agent_vendor FROM events WHERE tool_name='skill:pr-changelog' AND agent_vendor='claude' ORDER BY created_at DESC LIMIT 1;"
      # expected output:
      #   skill:pr-changelog|skill|start|claude

  # ── Path 3: Cursor beforeReadFile hook ────────────────────────────────────

  Scenario: Cursor beforeReadFile hook records a start event when a skill file is read
    Given no recent events exist for tool_name "skill:pr-changelog" and agent_vendor "cursor"
      # check: sqlite3 ~/.tool-usage-collection/events.db "SELECT * FROM events WHERE tool_name='skill:pr-changelog' AND agent_vendor='cursor' ORDER BY created_at DESC LIMIT 3;"
      # clean: sqlite3 ~/.tool-usage-collection/events.db "DELETE FROM events WHERE tool_name='skill:pr-changelog' AND agent_vendor='cursor';"
    When user opens a Cursor chat in this repo and asks "write a changelog entry for this branch"
      # Cursor reads .agents/skills/pr-changelog/SKILL.md — the beforeReadFile hook fires automatically
      # cursor-hook-skill-tracking.ts extracts "pr-changelog" from the path and calls tool-usage-collection.ts
      # The agent does not need to call anything explicitly — zero tokens
    Then 1 new row exists with tool_name = "skill:pr-changelog", tool_type = "skill", event_type = "start", agent_vendor = "cursor"
      # verify: sqlite3 ~/.tool-usage-collection/events.db "SELECT tool_name, tool_type, event_type, agent_vendor FROM events WHERE tool_name='skill:pr-changelog' AND agent_vendor='cursor' ORDER BY created_at DESC LIMIT 1;"
      # expected output:
      #   skill:pr-changelog|skill|start|cursor
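
The Path 3 steps above rely on extracting the skill name from the path Cursor is about to read. The following is a rough sketch of that step under stated assumptions: the hook payload is taken to be an object carrying a `file_path` field (the field named in the description), and `extractSkillName`, `toToolName`, and the regular expression are hypothetical, not copied from cursor-hook-skill-tracking.ts.

```typescript
// Illustrative only: the real cursor-hook-skill-tracking.ts may differ.
// Matches ".agents/skills/<name>/SKILL.md" at the end of a forward-slash path.
const SKILL_PATH = /(?:^|\/)\.agents\/skills\/([^\/]+)\/SKILL\.md$/;

function extractSkillName(payload: unknown): string | null {
  if (typeof payload !== "object" || payload === null) return null;
  const filePath = (payload as { file_path?: unknown }).file_path;
  // Guard the type before touching the value, as the PR description mentions.
  if (typeof filePath !== "string") return null;
  const match = SKILL_PATH.exec(filePath);
  return match?.[1] ?? null;
}

// Builds the tool_name value seen in the expected output, e.g. "skill:pr-changelog".
function toToolName(skill: string): string {
  return `skill:${skill}`;
}
```

Returning null for anything that is not a skill file lets the hook fall through to its plain "allow" response without recording an event.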

Screenshots/Recordings

Before

N/A

After

N/A

Pre-merge author checklist

Pre-merge reviewer checklist

  • I've manually tested the PR (e.g. pull and build branch, run the app, test code being changed).
  • I confirm that this PR addresses all acceptance criteria described in the ticket it closes and includes the necessary testing evidence such as recordings and/or screenshots.

Note

Medium Risk
Touches developer build tooling by hooking into every yarn script execution and spawning detached subprocesses; while guarded for CI/opt-out, misconfigurations could impact local dev workflows.

Overview
Adds automated local developer tool-usage collection across Yarn scripts and AI agent skills, writing start/end/interrupted events to ~/.tool-usage-collection/events.db with opt-out via TOOL_USAGE_COLLECTION_OPT_IN=false and automatic CI disable.

Introduces a Yarn Berry plugin that wraps yarn <script> runs and fire-and-forgets a detached tsx subprocess for tracking, plus new Cursor and Claude hooks to record skill usage; updates the SQLite schema/CLI/tests to support the new interrupted event type and expands documentation for setup/inspection.
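
The skip conditions summarised in this overview (CI set, or TOOL_USAGE_COLLECTION_OPT_IN=false) can be expressed as one predicate. This is a sketch that mirrors the shell guard `[ -z "$CI" ] && [ "$TOOL_USAGE_COLLECTION_OPT_IN" != "false" ]` quoted in the description. `shouldCollect` and `Env` are hypothetical names, and whether the Yarn plugin treats an empty `CI` value the same way as the shell guard is an assumption.

```typescript
// Illustrative only: hypothetical helper mirroring the shell guard in the description.
type Env = Record<string, string | undefined>;

function shouldCollect(env: Env): boolean {
  // [ -z "$CI" ] is true when CI is unset or empty, so any non-empty value counts as CI.
  const inCi = (env["CI"] ?? "") !== "";
  // Only the literal string "false" opts out; unset or any other value keeps collection on.
  const optedOut = env["TOOL_USAGE_COLLECTION_OPT_IN"] === "false";
  return !inCi && !optedOut;
}
```

In a Node script this would typically be called as `shouldCollect(process.env)`. Taking the environment as a parameter keeps the check easy to unit test.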

Reviewed by Cursor Bugbot for commit 0877b05.

NicolasMassart and others added 10 commits April 13, 2026 11:55
This commit introduces a new SQLite database setup for tracking tool usage events. It includes the following changes:
- Added `db.ts` for database initialization and schema creation.
- Implemented `trackEvent` function in `events.ts` to log events with relevant metadata.
- Created test files for both database operations and event tracking to ensure functionality and reliability.
- Added a CLI script (`tool-usage-collection.ts`) for collecting tool usage data via command-line arguments.

Additionally, updated `package.json` and `yarn.lock` to include necessary dependencies for SQLite and TypeScript support.

CHANGELOG entry: Added SQLite database handling for tool usage tracking.
…rgument parsing in tool-usage-collection.ts to enforce value requirements for flags.
- Introduced a new skill for creating and removing git worktrees, including detailed usage instructions.
- Added a plugin for tracking script execution in Yarn, logging events to a local SQLite database.
- Updated package.json to include the new @modelcontextprotocol/sdk dependency.
- Enhanced event tracking functionality in the tooling scripts, ensuring accurate logging of tool usage events.
- Created configuration files for the new skill and tracking server, improving project tooling capabilities.
@NicolasMassart NicolasMassart self-assigned this Apr 14, 2026
@github-actions

CLA Signature Action: All authors have signed the CLA. You may need to manually re-run the blocking PR check if it doesn't pass in a few minutes.

@metamaskbot metamaskbot added the team-mobile-platform Mobile Platform team label Apr 14, 2026
…lag values and improve process.exit handling. Added tests for --help and missing flag values.
@NicolasMassart NicolasMassart added skip-e2e skip E2E test jobs area-devex Issues and PRs focused on developer experience labels Apr 14, 2026
NicolasMassart and others added 5 commits April 14, 2026 17:13
…CWP-513-po-c-phase-2-automated-collection

# Conflicts:
#	scripts/tooling/events.ts
- Added a system to automatically record usage of Yarn scripts, Claude Code skills, and Cursor skills to a local SQLite database.
- Introduced new hooks in Yarn and Claude Code for tracking events, including a `beforeReadFile` hook in Cursor.
- Updated the database schema to include an 'interrupted' event type for better tracking of script execution.
- Created a new README for tooling usage collection, detailing the architecture and usage instructions.
- Enhanced tests to validate the new event type and ensure proper functionality of the tracking system.
@github-actions github-actions Bot added size-L and removed size-M labels Apr 15, 2026
@NicolasMassart NicolasMassart changed the title feat: add Yarn plugin and MCP server for automated tool-usage collection (Phase 2) feat: add automated tool-usage collection via Yarn plugin, MCP server, and agent hooks (Phase 2) Apr 15, 2026
Base automatically changed from MCWP-512-po-c-phase-1-core-write-infrastructure to main April 15, 2026 15:41
# Conflicts:
#	scripts/tooling/db.test.ts
#	scripts/tooling/db.ts
#	scripts/tooling/events.test.ts
#	scripts/tooling/events.ts
#	scripts/tooling/tool-usage-collection.test.ts
#	scripts/tooling/tool-usage-collection.ts
#	yarn.lock
@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.20%. Comparing base (15470f4) to head (9e1c2c0).
⚠️ Report is 53 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #28815      +/-   ##
==========================================
- Coverage   82.20%   82.20%   -0.01%     
==========================================
  Files        5017     5023       +6     
  Lines      131659   131903     +244     
  Branches    29381    29445      +64     
==========================================
+ Hits       108228   108428     +200     
- Misses      16083    16110      +27     
- Partials     7348     7365      +17     


…tests

This update simplifies the cursor hook logic by removing the `mustSkip` function, which was previously used to determine if tracking should be skipped based on environment variables. The handling of CI and opt-out conditions is now managed directly in the shell command within `.cursor/hooks.json`. Additionally, tests related to these conditions have been removed to reflect the updated logic, ensuring a cleaner and more efficient implementation.
@github-actions github-actions Bot added risk-low Low testing needed · Low bug introduction risk and removed risk-low Low testing needed · Low bug introduction risk labels Apr 17, 2026
Cal-L previously approved these changes Apr 21, 2026
@Cal-L Cal-L left a comment

Lgtm

@NicolasMassart NicolasMassart added this pull request to the merge queue Apr 21, 2026
@github-project-automation github-project-automation Bot moved this from Needs dev review to Review finalised - Ready to be merged in PR review queue Apr 21, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to a conflict with the base branch Apr 21, 2026
@NicolasMassart

@copilot resolve the merge conflicts in this pull request

Co-authored-by: NicolasMassart <4677568+NicolasMassart@users.noreply.github.com>
auto-merge was automatically disabled April 21, 2026 11:27

Head branch was pushed to by a user without write access

Copilot AI commented Apr 21, 2026

@copilot resolve the merge conflicts in this pull request

Resolved by merging main into this branch and fixing the conflict in scripts/tooling/README.md. Commit: f86ebe3.

Remove yarn command from event inspection instructions.
@github-actions github-actions Bot added risk-low Low testing needed · Low bug introduction risk and removed risk-low Low testing needed · Low bug introduction risk labels Apr 21, 2026
@github-actions

🔍 Smart E2E Test Selection

  • Selected E2E tags: None (no tests recommended)
  • Selected Performance tags: None (no tests recommended)
  • Risk Level: low
  • AI Confidence: 98%

E2E Test Selection:
All 16 changed files are developer tooling and documentation changes with zero impact on the MetaMask Mobile app:

  1. scripts/tooling/*.ts - New TypeScript scripts for tracking AI agent tool/skill usage (Cursor hooks, event tracking, DB utilities, CLI tool). These are developer experience tools that run locally and have no connection to the app code.
  2. scripts/tooling/*.test.ts - Unit tests for the above tooling scripts.
  3. .cursor/hooks.json - Cursor IDE hook configuration for the skill tracking feature.
  4. .yarn/plugins/plugin-usage-tracking.cjs - Yarn plugin for usage tracking.
  5. .yarnrc.yml - Yarn configuration update.
  6. AGENTS.md - Documentation for AI coding agents.
  7. docs/readme/development-process.md - Development process documentation.
  8. scripts/tooling/README.md - README for the tooling scripts.
  9. .claude/skills/pr-changelog/SKILL.md - Claude skill documentation.
  10. yarn.lock - Dependency lock file update (for the new yarn plugin).

None of these files touch app source code, E2E test infrastructure, CI/CD workflows, controllers, Engine, navigation, UI components, or any wallet functionality. No E2E tests are needed.

Performance Test Selection:
No performance-relevant changes. All changes are developer tooling scripts, documentation, and configuration files that have no impact on app rendering, data loading, state management, or any user-facing functionality.

@Cal-L Cal-L left a comment

Lgtm

@NicolasMassart NicolasMassart added this pull request to the merge queue Apr 21, 2026
Merged via the queue into main with commit c47ffad Apr 21, 2026
65 checks passed
@NicolasMassart NicolasMassart deleted the MCWP-513-po-c-phase-2-automated-collection branch April 21, 2026 15:51
@github-project-automation github-project-automation Bot moved this from Review finalised - Ready to be merged to Merged, Closed or Archived in PR review queue Apr 21, 2026
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 21, 2026
@metamaskbotv2 metamaskbotv2 Bot added the release-7.75.0 Issue or pull request that will be included in release 7.75.0 label Apr 21, 2026
Labels

area-devex Issues and PRs focused on developer experience release-7.75.0 Issue or pull request that will be included in release 7.75.0 risk-low Low testing needed · Low bug introduction risk size-L skip-e2e skip E2E test jobs team-mobile-platform Mobile Platform team

Projects

Archived in project

5 participants