Skip to content

V2.0.0 release#6530

Open
odilitime wants to merge 245 commits intodevelopfrom
v2.0.0
Open

V2.0.0 release#6530
odilitime wants to merge 245 commits intodevelopfrom
v2.0.0

Conversation

@odilitime
Copy link
Collaborator

@odilitime odilitime commented Feb 27, 2026

Make sure develop has the best v2.0.0


Note

High Risk
This PR substantially rewires CI, multi-language testing, and release/publish/deploy pipelines (NPM/PyPI/crates.io/TEE), so misconfiguration could break releases or unintentionally expose/omit artifacts and secrets.

Overview
Reworks CI and automation for the v2.0.0 branch. The main ci.yaml now builds first and runs expanded test coverage across TypeScript core (test:core), TypeScript interop, Python (core + interop), and Rust, plus adds a prompt secret scan and increases timeouts.

Replaces and adds multiple GitHub workflows. Removes older dedicated workflows (CLI tests, Cypress client tests, core-package tests, plugin-sql tests, Tauri CI/release, README translation, news updater) and introduces new ones including docs-ci.yml (Claude-driven docs link/quality fixes), multi-lang-tests.yaml (Rust/Python/plugin-sql/WASM/interop matrix), supply-chain.yaml (SBOM + vulnerability scan), and new release workflows for Python (release-python.yaml), Rust (release-rust.yaml), and ComputerUse crates.

Updates release and deployment behavior. release.yaml is extended to build Rust/WASM artifacts and to temporarily replace/restore workspace:* dependency references during publish; tee-build-deploy.yml adds concurrency controls and tightens env/secrets handling.

Repository hygiene/tooling cleanup. Adds .biomeignore, expands .gitignore for Rust/Python/generated/test artifacts, removes Cursor submodule/rules and various editor/ignore files, tweaks issue templates and dependency-bot configs, and standardizes several workflow configs (checkout versions, quoting, CodeQL triggers).

Written by Cursor Bugbot for commit 0d6f8b2. This will update automatically on new commits. Configure here.

lalalune and others added 30 commits January 8, 2026 23:05
Adds a set of source-oriented docs covering core concepts, architecture, plugins, interop, deployment, and API reference.
Expanded CORE_CONCEPTS.md with explanations of the core concepts of worlds, rooms, and entities, including typical usage patterns and example references.
Added a section explaining the difference between ElizaOS's internal room concept and the platform-specific channel metadata, including details on room IDs, channel types, and channel IDs.
Removes references to a non-existent interop schema file and corrects malformed provider result bullets.
hanzlamateen and others added 19 commits February 5, 2026 21:51
- Removed unnecessary comments and improved logging for the inference server availability.
- Updated the training logic to clarify the use of the private _train_model API, with a note to monitor for a public alternative.
- Added '.art' to examples/art/.gitignore to ensure art files are ignored in the project.
Bridge the TypeScript milaidy agent with all Python benchmark runners
via an HTTP benchmark server, enabling head-to-head comparison between
the purpose-built Python ElizaOS agent and the milaidy TS agent.

TypeScript side:
- benchmark/plugin.ts: MILAIDY_BENCHMARK provider + BENCHMARK_ACTION
  action with multi-benchmark messageHandlerTemplate
- benchmark/server.ts: HTTP server wrapping full milaidy runtime
  with /health, /message, /reset endpoints on port 3939

Python side (benchmarks/milaidy-adapter/):
- MilaidyClient: HTTP client for the benchmark server
- MilaidyServerManager: subprocess lifecycle management
- MilaidyAgentHarness: AgentBench adapter
- MilaidyTauAgent: Tau-bench adapter
- MilaidyMind2WebAgent: Mind2Web adapter
- make_milaidy_llm_query: Context-bench adapter

Wired into all benchmark CLIs:
- AgentBench: --milaidy flag
- Tau-bench: --model-provider milaidy
- Mind2Web: --provider milaidy
- Context-bench: --provider milaidy
- Registry: agent="milaidy" in extra config

Co-authored-by: Cursor <cursoragent@cursor.com>
Add plugins, packages, examples, benchmarks, eliza-cloud-v2,
airboardgame, and supporting config files. Excludes vendor/example
repositories (dnd5e, rlm, rlm-minimal, the-org, rs-sdk,
openclaw-classic) via .gitignore.

Also updates .gitignore with:
- Vendor directory exclusions
- Database file patterns (*.db, *.sqlite, etc.)
- Cache directory exclusions
- Removes stale benchmark result files

Co-authored-by: Cursor <cursoragent@cursor.com>
Add plugin-rolodex, plugin-trust, plugin-cron, plugin-moltbook,
plugin-commands, plugin-edge-tts, plugin-gmail-watch, plugin-auto-trader
Python module, and more. Add dnd-vtt example, benchmark updates, and
various plugin/package improvements. Remove airboardgame directory.

Updated .gitignore to also exclude examples/runescape2004 (rs-sdk
content clone) and plugins/plugin-social-alpha (vendor code), nested
skills cache directories, and untrack catalog.json cache file.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add Python plugin-auto-trader actions and strategies
- Add Python plugin-blooio types and constants
- Add plugin-elizacloud Python tests
- Add plugin-plugin-manager lifecycle tests
- Add EntityResolutionService to plugin-rolodex
- Update plugin-github, plugin-knowledge, plugin-twitch, plugin-farcaster tests
- Remove stale skills cache file
- Update .gitignore

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add remaining auto-trader Python actions (analyze_performance, configure_strategy, execute_live_trade, get_market_analysis)
- Add blooio Python plugin init, actions, and providers
- Update farcaster and rolodex service tests

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Add auto-trader Python tests (actions, provider, service)
- Add blooio Python action tests
- Add elizacloud Rust cloud_api and services
- Add prose Python test scaffolding
- Add rolodex utility tests
- Update farcaster TypeScript providers
- Update knowledge Rust plugin and providers

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Add elizacloud Rust actions and backup/bridge services
- Add prose Python service tests
- Add tee Python provider tests
- Update farcaster, knowledge, rolodex tests
- Update milaidy e2e test and Python runtime

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix action handler parameter access: convert proto ActionParameters
  to plain dicts before calling handlers (fixes `get` AttributeError)
- Fix schema_def -> schema compatibility in bootstrap/basic_capabilities
  action parameter formatting (handles both Pydantic and proto objects)
- Fix uninitialized `valid` variable in process_actions when action
  has no parameters
- Add safe getattr() for `examples` field on ActionParameter proto
- Add traceback to message service error logging for debugging

Co-authored-by: Cursor <cursoragent@cursor.com>
…namic2

Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#	bun.lock
#	docs/API_REFERENCE.md
#	docs/ARCHITECTURE.md
#	docs/CORE_CONCEPTS.md
#	docs/DEPLOYMENT_GUIDE.md
#	docs/INTEROP_GUIDE.md
#	docs/PLUGIN_DEVELOPMENT.md
#	examples/_plugin/typescript/biome.json
#	examples/_plugin/typescript/src/__tests__/cypress/support/commands.ts
#	examples/_plugin/typescript/src/__tests__/cypress/support/component.ts
#	examples/browser-extension/safari/Chat with Webpage/Chat with Webpage.xcodeproj/project.pbxproj
#	examples/browser-extension/safari/Chat with Webpage/Shared (App)/ViewController.swift
#	examples/browser-extension/safari/Chat with Webpage/Shared (Extension)/SafariWebExtensionHandler.swift
#	examples/browser-extension/safari/Chat with Webpage/iOS (App)/AppDelegate.swift
#	examples/browser-extension/safari/Chat with Webpage/iOS (App)/SceneDelegate.swift
#	examples/browser-extension/safari/Chat with Webpage/macOS (App)/AppDelegate.swift
#	examples/code/package.json
#	examples/code/src/App.tsx
#	examples/code/src/components/ChatPane.tsx
#	examples/code/src/components/TaskPane.tsx
#	examples/code/src/ink.d.ts
#	examples/code/src/lib/sub-agents/sweagent-sub-agent.ts
#	examples/polyagent/.env.example
#	examples/polyagent/apps/web/next.config.ts
#	examples/polyagent/apps/web/package.json
#	examples/polyagent/apps/web/src/app/agents/[agentId]/public/page.tsx
#	examples/polyagent/apps/web/src/app/agents/page.tsx
#	examples/polyagent/apps/web/src/app/api/agents/[agentId]/autonomy/route.ts
#	examples/polyagent/apps/web/src/app/api/agents/[agentId]/fund/route.ts
#	examples/polyagent/apps/web/src/app/api/agents/[agentId]/positions/route.ts
#	examples/polyagent/apps/web/src/app/api/agents/discover/route.ts
#	examples/polyagent/apps/web/src/app/api/agents/public/[agentId]/route.ts
#	examples/polyagent/apps/web/src/app/api/agents/public/route.ts
#	examples/polyagent/apps/web/src/app/api/cron/agent-tick/route.ts
#	examples/polyagent/apps/web/src/app/api/dashboard/route.ts
#	examples/polyagent/apps/web/src/app/api/notifications/route.ts
#	examples/polyagent/apps/web/src/app/api/users/delete-account/route.ts
#	examples/polyagent/apps/web/src/app/api/users/signup/route.ts
#	examples/polyagent/apps/web/src/app/page.tsx
#	examples/polyagent/apps/web/src/components/auth/UserMenu.tsx
#	examples/polyagent/apps/web/src/components/shared/BottomNav.tsx
#	examples/polyagent/apps/web/src/components/shared/Sidebar.tsx
#	examples/polyagent/apps/web/tsconfig.json
#	examples/polyagent/docker-compose.yml
#	examples/polyagent/package.json
#	examples/polyagent/packages/agents/package.json
#	examples/polyagent/packages/agents/src/identity/AgentIdentityService.ts
#	examples/polyagent/packages/agents/src/identity/AgentWalletService.ts
#	examples/polyagent/packages/agents/src/plugins/plugin-trajectory-logger/src/export.ts
#	examples/polyagent/packages/agents/src/runtime/AgentRuntimeManager.ts
#	examples/polyagent/packages/agents/src/services/AgentPnLService.ts
#	examples/polyagent/packages/agents/src/services/AgentService.ts
#	examples/polyagent/packages/agents/src/services/agent-registry.service.ts
#	examples/polyagent/packages/agents/src/types/index.ts
#	examples/polyagent/packages/agents/tsconfig.json
#	examples/polyagent/packages/api/package.json
#	examples/polyagent/packages/api/src/cache/cached-database-service.ts
#	examples/polyagent/packages/api/src/redis/client.ts
#	examples/polyagent/packages/api/src/services/points-service.ts
#	examples/polyagent/packages/api/src/storage/s3-client.ts
#	examples/polyagent/packages/api/src/users/user-lookup.ts
#	examples/polyagent/packages/api/tsconfig.json
#	examples/polyagent/packages/db/drizzle.config.cjs
#	examples/polyagent/packages/db/drizzle.config.ts
#	examples/polyagent/packages/db/package.json
#	examples/polyagent/packages/db/src/client.ts
#	examples/polyagent/packages/db/src/database-service.ts
#	examples/polyagent/packages/db/src/index.ts
#	examples/polyagent/packages/db/src/model-types.ts
#	examples/polyagent/packages/shared/src/types/monitoring.ts
#	examples/polyagent/packages/shared/src/validation/schemas/feedback.ts
#	examples/polyagent/scripts/pre-dev/pre-dev-local.ts
#	examples/polyagent/tsconfig.json
#	examples/polymarket/typescript/tui.tsx
#	examples/vrm/src/App.tsx
#	packages/elizaos/examples-manifest.json
#	packages/prompts/specs/actions/plugins.generated.json
#	packages/python/elizaos/runtime.py
#	packages/python/elizaos/services/message_service.py
#	packages/python/elizaos/types/model.py
#	packages/rust/Cargo.lock
#	packages/typescript/src/autonomy/service.ts
#	packages/typescript/src/bootstrap/autonomy/service.ts
#	packages/typescript/src/generated/action-docs.ts
#	plugins/plugin-agent-orchestrator/typescript/biome.json
#	plugins/plugin-auto-trader/typescript/src/strategies/__tests__/LLMStrategy.test.ts
#	plugins/plugin-browser/typescript/package.json
#	plugins/plugin-computeruse/typescript/package.json
#	plugins/plugin-copilot-proxy/typescript/tsconfig.build.json
#	plugins/plugin-form/package.json
#	plugins/plugin-form/typescript/biome.json
#	plugins/plugin-form/typescript/build.ts
#	plugins/plugin-form/typescript/package.json
#	plugins/plugin-form/typescript/src/index.ts
#	plugins/plugin-form/typescript/src/service.ts
#	plugins/plugin-form/typescript/src/types.ts
#	plugins/plugin-form/typescript/src/validation.ts
#	plugins/plugin-form/typescript/tsconfig.build.json
#	plugins/plugin-form/typescript/tsconfig.json
#	plugins/plugin-minecraft/typescript/package.json
#	plugins/plugin-n8n/typescript/package.json
#	plugins/plugin-polymarket/typescript/biome.json
#	plugins/plugin-solana/typescript/package.json
#	plugins/plugin-todo/typescript/package.json
#	plugins/plugin-twilio/tsup.config.ts
#	plugins/plugin-twilio/typescript/src/__tests__/actions/makeCall.test.ts
#	plugins/plugin-twilio/typescript/src/__tests__/actions/sendSms.test.ts
#	plugins/plugin-twilio/typescript/src/__tests__/index.test.ts
#	plugins/plugin-twilio/typescript/src/__tests__/providers/callState.test.ts
#	plugins/plugin-twilio/typescript/src/__tests__/providers/conversationHistory.test.ts
#	plugins/plugin-twilio/typescript/src/index.ts
#	plugins/plugin-twilio/typescript/src/tests.ts
#	plugins/plugin-whatsapp/package.json
#	plugins/plugin-whatsapp/typescript/src/accounts.ts
#	plugins/plugin-whatsapp/typescript/src/actions/index.ts
#	plugins/plugin-whatsapp/typescript/src/actions/sendMessage.test.ts
#	plugins/plugin-whatsapp/typescript/src/actions/sendMessage.ts
#	plugins/plugin-whatsapp/typescript/src/actions/sendReaction.ts
#	plugins/plugin-whatsapp/typescript/src/config.ts
#	plugins/plugin-whatsapp/typescript/src/normalize.ts
#	plugins/plugin-whatsapp/typescript/src/service.ts
#	tsconfig.json
chore(examples-art): v2 update dependencies, training pipeline, and tests for 2048 game
* fix: resolve build errors and dependency issues for clean installs

- Add missing deps to @elizaos/core (drizzle-orm, markdown-it, undici, yaml, sharp)
- Fix import paths in autonomy/service.ts and testing/index.ts
- Add node types to plugin-anthropic tsconfig to fix uuid type resolution
- Fix dependency declarations in several plugins (dangling refs, version ranges)
- Upgrade @pixi/react to v8 to match pixi.js v8 in dnd-vtt example

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat: add fix-workspace-deps script for local dev / commit workflow

Manages @elizaos/* dependency references across the monorepo:
  bun run fix-deps          — set workspace:* for local dev
  bun run fix-deps:restore  — restore versioned refs before committing
  bun run fix-deps:check    — CI check for leaked workspace:* refs

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(example-chat): add Ollama and local-ai support with smart provider detection

Explicit env vars (API keys, local URLs) take priority over auto-detected
local servers, so users with cloud keys aren't silently overridden by a
running Ollama instance.

Co-authored-by: Cursor <cursoragent@cursor.com>

* examples: fix in cursor

Iteration 2

prr-fix:prrc_kwdomt5cis6ncqbe
prr-fix:prrc_kwdomt5cis6ncw_i
prr-fix:prrc_kwdomt5cis6ncqbp
prr-fix:prrc_kwdomt5cis6ncqbu
prr-fix:prrc_kwdomt5cis6ncq8w
prr-fix:prrc_kwdomt5cis6ncqbi
prr-fix:prrc_kwdomt5cis6ncw_e
prr-fix:prrc_kwdomt5cis6ncqbl

* examples: fix in cursor

Iteration 3

prr-fix:prrc_kwdomt5cis6ncw_b

* scripts: fix security issue

Iteration 1

prr-fix:prrc_kwdomt5cis6p36mf
prr-fix:prrc_kwdomt5cis6p36mm
prr-fix:prrc_kwdomt5cis6p4fro
prr-fix:prrc_kwdomt5cis6p4fsz

* scripts: fix mode scope

Iteration 1

prr-fix:prrc_kwdomt5cis6ncqbl
prr-fix:prrc_kwdomt5cis6ncqbe
prr-fix:prrc_kwdomt5cis6ncw_i

* examples: update chat

Iteration 3

prr-fix:prrc_kwdomt5cis6ncqbi

* examples: fix in cursor

Iteration 5

prr-fix:prrc_kwdomt5cis6ncqbp
prr-fix:prrc_kwdomt5cis6ncw_e

* packages: update service

Iteration 8

prr-fix:prrc_kwdomt5cis6ncqbr

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* package.json: IMPORTANT: Use `bun` instead of `node` for script execution

Iteration 1

prr-fix:prrc_kwdomt5cis6p4or_

* examples: ### Review comment accidentally committed in JSON file

Iteration 1

prr-fix:prrc_kwdomt5cis6p4ydk

* examples: ### Missing plugin-local-ai dependency in chat package.json

Iteration 1

prr-fix:prrc_kwdomt5cis6p4ydm

* packages: ### Removed `@bufbuild/protobuf` dependency still imported in source

Iteration 1

prr-fix:prrc_kwdomt5cis6p4ydp

* packages: ### AI review tool comments accidentally committed to source

Iteration 1

prr-fix:prrc_kwdomt5cis6p4yds

* scripts: fix mode scope

Iteration 2

prr-fix:prrc_kwdomt5cis6ncqbe
prr-fix:prrc_kwdomt5cis6ncw_i

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* packages: update service

Iteration 1

prr-fix:ic-3956059219-1

* examples: CRITICAL: Invalid JSON syntax

Iteration 1

prr-fix:prrc_kwdomt5cis6p4ugh

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Copilot AI review requested due to automatic review settings February 27, 2026 02:47
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 27, 2026

Important

Review skipped

Too many files!

This PR contains 300 files, which is 150 over the limit of 150.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 4a343d3b-ad39-48ea-bd08-f41cbf9b624d

📥 Commits

Reviewing files that changed from the base of the PR and between 91dceb1 and 0d6f8b2.

📒 Files selected for processing (300)
  • .biomeignore
  • .cursorignore
  • .cursorrules
  • .dockerignore
  • .github/ISSUE_TEMPLATE/bug_report.md
  • .github/ISSUE_TEMPLATE/feature_request.md
  • .github/dependabot.yml
  • .github/renovate-preset.json
  • .github/workflows/README.md
  • .github/workflows/ci.yaml
  • .github/workflows/claude-code-review.yml
  • .github/workflows/claude-security-review.yml
  • .github/workflows/claude.yml
  • .github/workflows/cli-tests.yml
  • .github/workflows/client-cypress-tests.yml
  • .github/workflows/codeql.yml
  • .github/workflows/core-package-tests.yaml
  • .github/workflows/docs-ci.yml
  • .github/workflows/generate-readme-translations.yml
  • .github/workflows/image.yaml
  • .github/workflows/jsdoc-automation.yml
  • .github/workflows/multi-lang-tests.yaml
  • .github/workflows/plugin-sql-tests.yaml
  • .github/workflows/pr.yaml
  • .github/workflows/release-computeruse-crates.yaml
  • .github/workflows/release-python.yaml
  • .github/workflows/release-rust.yaml
  • .github/workflows/release.yaml
  • .github/workflows/supply-chain.yaml
  • .github/workflows/tauri-ci.yml
  • .github/workflows/tauri-release.yml
  • .github/workflows/tee-build-deploy.yml
  • .github/workflows/update-news.yml
  • .github/workflows/weekly-maintenance.yml
  • .gitignore
  • .gitmodules
  • .husky/pre-commit
  • .npmrc
  • .prettierignore
  • .prr/lessons.md
  • .vscode/launch.json
  • .vscode/settings.json
  • AGENTS.md
  • BENCHMARK_RESULTS.md
  • CHANGELOG.md
  • CLAUDE.md
  • DATABASE_API_CHANGELOG.md
  • DATABASE_API_README.md
  • DOCUMENTATION_COMPLETE.md
  • Dockerfile
  • README.md
  • benchmark_results/bfcl/bfcl_best_results.json
  • benchmarks/__init__.py
  • benchmarks/agentbench/README.md
  • benchmarks/agentbench/RESEARCH.md
  • benchmarks/agentbench/elizaos_agentbench/__init__.py
  • benchmarks/agentbench/elizaos_agentbench/adapters/__init__.py
  • benchmarks/agentbench/elizaos_agentbench/adapters/base.py
  • benchmarks/agentbench/elizaos_agentbench/adapters/db_adapter.py
  • benchmarks/agentbench/elizaos_agentbench/adapters/kg_adapter.py
  • benchmarks/agentbench/elizaos_agentbench/adapters/lateral_thinking_adapter.py
  • benchmarks/agentbench/elizaos_agentbench/adapters/os_adapter.py
  • benchmarks/agentbench/elizaos_agentbench/adapters/webshop_adapter.py
  • benchmarks/agentbench/elizaos_agentbench/benchmark_actions.py
  • benchmarks/agentbench/elizaos_agentbench/cli.py
  • benchmarks/agentbench/elizaos_agentbench/eliza_harness.py
  • benchmarks/agentbench/elizaos_agentbench/mock_runtime.py
  • benchmarks/agentbench/elizaos_agentbench/runner.py
  • benchmarks/agentbench/elizaos_agentbench/tests/__init__.py
  • benchmarks/agentbench/elizaos_agentbench/tests/test_adapters.py
  • benchmarks/agentbench/elizaos_agentbench/tests/test_runner.py
  • benchmarks/agentbench/elizaos_agentbench/tests/test_smart_mock_runtime.py
  • benchmarks/agentbench/elizaos_agentbench/tests/test_types.py
  • benchmarks/agentbench/elizaos_agentbench/trajectory_integration.py
  • benchmarks/agentbench/elizaos_agentbench/types.py
  • benchmarks/agentbench/pyproject.toml
  • benchmarks/agentbench/run_benchmark.py
  • benchmarks/agentbench/smart_smoke/agentbench-detailed.json
  • benchmarks/agentbench/smart_smoke/agentbench-report.md
  • benchmarks/agentbench/smart_smoke/agentbench-results.json
  • benchmarks/bench_cli_types.py
  • benchmarks/bfcl/__init__.py
  • benchmarks/bfcl/__main__.py
  • benchmarks/bfcl/agent.py
  • benchmarks/bfcl/dataset.py
  • benchmarks/bfcl/evaluators/__init__.py
  • benchmarks/bfcl/evaluators/ast_evaluator.py
  • benchmarks/bfcl/evaluators/exec_evaluator.py
  • benchmarks/bfcl/evaluators/relevance_evaluator.py
  • benchmarks/bfcl/metrics.py
  • benchmarks/bfcl/models.py
  • benchmarks/bfcl/parser.py
  • benchmarks/bfcl/plugin.py
  • benchmarks/bfcl/reporting.py
  • benchmarks/bfcl/runner.py
  • benchmarks/bfcl/scripts/run_benchmark.py
  • benchmarks/bfcl/scripts/test_integration.py
  • benchmarks/bfcl/tests/__init__.py
  • benchmarks/bfcl/tests/conftest.py
  • benchmarks/bfcl/tests/test_evaluators.py
  • benchmarks/bfcl/tests/test_parser.py
  • benchmarks/bfcl/tests/test_runner.py
  • benchmarks/bfcl/types.py
  • benchmarks/context-bench/README.md
  • benchmarks/context-bench/RESEARCH.md
  • benchmarks/context-bench/elizaos_context_bench/__init__.py
  • benchmarks/context-bench/elizaos_context_bench/eliza_plugin.py
  • benchmarks/context-bench/elizaos_context_bench/embeddings.py
  • benchmarks/context-bench/elizaos_context_bench/evaluators/__init__.py
  • benchmarks/context-bench/elizaos_context_bench/evaluators/position.py
  • benchmarks/context-bench/elizaos_context_bench/evaluators/retrieval.py
  • benchmarks/context-bench/elizaos_context_bench/generator.py
  • benchmarks/context-bench/elizaos_context_bench/providers/__init__.py
  • benchmarks/context-bench/elizaos_context_bench/providers/context.py
  • benchmarks/context-bench/elizaos_context_bench/reporting.py
  • benchmarks/context-bench/elizaos_context_bench/runner.py
  • benchmarks/context-bench/elizaos_context_bench/suites/__init__.py
  • benchmarks/context-bench/elizaos_context_bench/suites/multihop.py
  • benchmarks/context-bench/elizaos_context_bench/suites/niah.py
  • benchmarks/context-bench/elizaos_context_bench/trajectory_integration.py
  • benchmarks/context-bench/elizaos_context_bench/types.py
  • benchmarks/context-bench/pyproject.toml
  • benchmarks/context-bench/run_benchmark.py
  • benchmarks/context-bench/tests/__init__.py
  • benchmarks/context-bench/tests/conftest.py
  • benchmarks/context-bench/tests/test_evaluators.py
  • benchmarks/context-bench/tests/test_generator.py
  • benchmarks/context-bench/tests/test_reporting.py
  • benchmarks/context-bench/tests/test_runner.py
  • benchmarks/context-bench/tests/test_types.py
  • benchmarks/context-bench/tests/test_validation.py
  • benchmarks/gaia/README.md
  • benchmarks/gaia/benchmark_results/gaia/detailed_results_20260112_000542.json
  • benchmarks/gaia/benchmark_results/gaia/sample/MODEL_COMPARISON.md
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/BENCHMARK_RESULTS.md
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/BENCHMARK_RESULTS_20260112_003715.md
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/BENCHMARK_RESULTS_20260112_004346.md
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/gaia-detailed-results_20260112_003715.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/gaia-detailed-results_20260112_004346.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/gaia-results-latest.json
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/gaia-results_20260112_003715.json
  • benchmarks/gaia/benchmark_results/gaia/sample/anthropic_claude-3-5-haiku-20241022/gaia-results_20260112_004346.json
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/BENCHMARK_RESULTS.md
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/BENCHMARK_RESULTS_20260112_003437.md
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/BENCHMARK_RESULTS_20260112_003511.md
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/BENCHMARK_RESULTS_20260112_004311.md
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/BENCHMARK_RESULTS_20260112_004516.md
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/BENCHMARK_RESULTS_20260112_112951.md
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-detailed-results_20260112_003437.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-detailed-results_20260112_003511.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-detailed-results_20260112_004311.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-detailed-results_20260112_004516.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-detailed-results_20260112_112951.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-results-latest.json
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-results_20260112_003437.json
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-results_20260112_003511.json
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-results_20260112_004311.json
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-results_20260112_004516.json
  • benchmarks/gaia/benchmark_results/gaia/sample/groq_llama-3.1-8b-instant/gaia-results_20260112_112951.json
  • benchmarks/gaia/benchmark_results/gaia/sample/model_comparison.json
  • benchmarks/gaia/benchmark_results/gaia/sample/openai_gpt-4o-mini/BENCHMARK_RESULTS.md
  • benchmarks/gaia/benchmark_results/gaia/sample/openai_gpt-4o-mini/BENCHMARK_RESULTS_20260112_003540.md
  • benchmarks/gaia/benchmark_results/gaia/sample/openai_gpt-4o-mini/gaia-detailed-results_20260112_003540.jsonl
  • benchmarks/gaia/benchmark_results/gaia/sample/openai_gpt-4o-mini/gaia-results-latest.json
  • benchmarks/gaia/benchmark_results/gaia/sample/openai_gpt-4o-mini/gaia-results_20260112_003540.json
  • benchmarks/gaia/benchmark_results/gaia/simulation_results.json
  • benchmarks/gaia/elizaos_gaia/__init__.py
  • benchmarks/gaia/elizaos_gaia/__main__.py
  • benchmarks/gaia/elizaos_gaia/agent.py
  • benchmarks/gaia/elizaos_gaia/cli.py
  • benchmarks/gaia/elizaos_gaia/dataset.py
  • benchmarks/gaia/elizaos_gaia/evaluator.py
  • benchmarks/gaia/elizaos_gaia/inmemory_adapter.py
  • benchmarks/gaia/elizaos_gaia/metrics.py
  • benchmarks/gaia/elizaos_gaia/plugin.py
  • benchmarks/gaia/elizaos_gaia/providers.py
  • benchmarks/gaia/elizaos_gaia/runner.py
  • benchmarks/gaia/elizaos_gaia/tools/__init__.py
  • benchmarks/gaia/elizaos_gaia/tools/calculator.py
  • benchmarks/gaia/elizaos_gaia/tools/code_executor.py
  • benchmarks/gaia/elizaos_gaia/tools/file_processor.py
  • benchmarks/gaia/elizaos_gaia/tools/web_browser.py
  • benchmarks/gaia/elizaos_gaia/tools/web_search.py
  • benchmarks/gaia/elizaos_gaia/types.py
  • benchmarks/gaia/pyproject.toml
  • benchmarks/gaia/refs/main
  • benchmarks/gaia/tests/__init__.py
  • benchmarks/gaia/tests/test_calculator.py
  • benchmarks/gaia/tests/test_evaluator.py
  • benchmarks/gaia/tests/test_integration.py
  • benchmarks/gaia/tests/test_metrics.py
  • benchmarks/gaia/tests/test_types.py
  • benchmarks/gaia/tests/test_validation.py
  • benchmarks/milaidy-adapter/milaidy_adapter/__init__.py
  • benchmarks/milaidy-adapter/milaidy_adapter/agentbench.py
  • benchmarks/milaidy-adapter/milaidy_adapter/client.py
  • benchmarks/milaidy-adapter/milaidy_adapter/context_bench.py
  • benchmarks/milaidy-adapter/milaidy_adapter/mind2web.py
  • benchmarks/milaidy-adapter/milaidy_adapter/server_manager.py
  • benchmarks/milaidy-adapter/milaidy_adapter/tau_bench.py
  • benchmarks/milaidy-adapter/pyproject.toml
  • benchmarks/mind2web/README.md
  • benchmarks/mind2web/__init__.py
  • benchmarks/mind2web/__main__.py
  • benchmarks/mind2web/cli.py
  • benchmarks/mind2web/dataset.py
  • benchmarks/mind2web/eliza_agent.py
  • benchmarks/mind2web/evaluator.py
  • benchmarks/mind2web/pyproject.toml
  • benchmarks/mind2web/runner.py
  • benchmarks/mind2web/tests/__init__.py
  • benchmarks/mind2web/tests/test_integration.py
  • benchmarks/mind2web/types.py
  • benchmarks/mint/__init__.py
  • benchmarks/mint/agent.py
  • benchmarks/mint/dataset.py
  • benchmarks/mint/evaluator.py
  • benchmarks/mint/executor.py
  • benchmarks/mint/feedback.py
  • benchmarks/mint/metrics.py
  • benchmarks/mint/reporting.py
  • benchmarks/mint/run_benchmark.py
  • benchmarks/mint/runner.py
  • benchmarks/mint/tests/__init__.py
  • benchmarks/mint/tests/conftest.py
  • benchmarks/mint/tests/test_dataset.py
  • benchmarks/mint/tests/test_evaluator.py
  • benchmarks/mint/tests/test_executor.py
  • benchmarks/mint/tests/test_runner.py
  • benchmarks/mint/tests/test_types.py
  • benchmarks/mint/tests/test_validation.py
  • benchmarks/mint/trajectory_logger.py
  • benchmarks/mint/types.py
  • benchmarks/realm/__init__.py
  • benchmarks/realm/__main__.py
  • benchmarks/realm/adapters.py
  • benchmarks/realm/agent.py
  • benchmarks/realm/cli.py
  • benchmarks/realm/dataset.py
  • benchmarks/realm/evaluator.py
  • benchmarks/realm/plugin/__init__.py
  • benchmarks/realm/plugin/actions.py
  • benchmarks/realm/plugin/providers.py
  • benchmarks/realm/runner.py
  • benchmarks/realm/tests/__init__.py
  • benchmarks/realm/tests/test_canonical_agent.py
  • benchmarks/realm/tests/test_dataset_selection.py
  • benchmarks/realm/tests/test_env_loader.py
  • benchmarks/realm/tests/test_plan_parsing.py
  • benchmarks/realm/tests/test_runner_report_validation.py
  • benchmarks/realm/tests/test_smoke.py
  • benchmarks/realm/types.py
  • benchmarks/registry.py
  • benchmarks/rlm-bench/README.md
  • benchmarks/rlm-bench/elizaos_rlm_bench/__init__.py
  • benchmarks/rlm-bench/elizaos_rlm_bench/evaluator.py
  • benchmarks/rlm-bench/elizaos_rlm_bench/generator.py
  • benchmarks/rlm-bench/elizaos_rlm_bench/reporting.py
  • benchmarks/rlm-bench/elizaos_rlm_bench/runner.py
  • benchmarks/rlm-bench/elizaos_rlm_bench/tests/__init__.py
  • benchmarks/rlm-bench/elizaos_rlm_bench/tests/test_benchmark.py
  • benchmarks/rlm-bench/elizaos_rlm_bench/types.py
  • benchmarks/rlm-bench/pyproject.toml
  • benchmarks/rlm-bench/run_benchmark.py
  • benchmarks/swe_bench/RESEARCH.md
  • benchmarks/swe_bench/__init__.py
  • benchmarks/swe_bench/__main__.py
  • benchmarks/swe_bench/agent.py
  • benchmarks/swe_bench/benchmark_results/.gitkeep
  • benchmarks/swe_bench/benchmark_results/swe-bench-lite-baseline.json
  • benchmarks/swe_bench/benchmark_results/swe-bench-lite-baseline.md
  • benchmarks/swe_bench/character.py
  • benchmarks/swe_bench/cli.py
  • benchmarks/swe_bench/dataset.py
  • benchmarks/swe_bench/evaluator.py
  • benchmarks/swe_bench/plugin.py
  • benchmarks/swe_bench/providers.py
  • benchmarks/swe_bench/pyproject.toml
  • benchmarks/swe_bench/repo_manager.py
  • benchmarks/swe_bench/runner.py
  • benchmarks/swe_bench/tests/__init__.py
  • benchmarks/swe_bench/tests/test_agent.py
  • benchmarks/swe_bench/tests/test_character.py
  • benchmarks/swe_bench/tests/test_dataset.py
  • benchmarks/swe_bench/tests/test_evaluator.py
  • benchmarks/swe_bench/tests/test_providers.py
  • benchmarks/swe_bench/tests/test_repo_manager.py
  • benchmarks/swe_bench/tests/test_types.py
  • benchmarks/swe_bench/tools.py
  • benchmarks/swe_bench/trajectory_service.py
  • benchmarks/swe_bench/types.py
  • benchmarks/tau-bench/README.md
  • benchmarks/tau-bench/elizaos_tau_bench/__init__.py
  • benchmarks/tau-bench/elizaos_tau_bench/__main__.py
  • benchmarks/tau-bench/elizaos_tau_bench/agent.py
  • benchmarks/tau-bench/elizaos_tau_bench/cli.py
  • benchmarks/tau-bench/elizaos_tau_bench/constants.py
  • benchmarks/tau-bench/elizaos_tau_bench/dataset.py
  • benchmarks/tau-bench/elizaos_tau_bench/eliza_agent.py
  • benchmarks/tau-bench/elizaos_tau_bench/environments/__init__.py

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch v2.0.0

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link

greptile-apps bot commented Feb 27, 2026

Too many files changed for review. (3000 files found, 100 file limit)

echo "changes=true" >> $GITHUB_OUTPUT
else
echo "changes=false" >> $GITHUB_OUTPUT
fi
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs fixes never reach pull request

High Severity

create-pr checks for local changes after a fresh actions/checkout, but previous jobs only git commit inside their own runners and never push or pass artifacts. That leaves no changes in the final job, so docs-ci silently skips creating the documentation fix PR.

Additional Locations (2)

Fix in Cursor Fix in Web

- name: Run TypeScript interop tests
working-directory: packages/interop/typescript
run: |
npx vitest run || true
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interop test failures are silently ignored

Medium Severity

Both interop test commands end with || true, so failures in vitest or Python pytest do not fail the interop-tests job. The workflow can report success while cross-language interop is broken.

Additional Locations (1)

Fix in Cursor Fix in Web

cargo build --target wasm32-unknown-unknown --release --features wasm || {
echo "⚠️ WASM build with wasm feature failed, trying without..."
cargo build --target wasm32-unknown-unknown --release || true
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WASM build failures do not fail CI

Medium Severity

The wasm-build step falls back to a second cargo build, but that command is also followed by || true. If both builds fail, the step still succeeds, so CI can pass without any valid WASM artifact.

Fix in Cursor Fix in Web

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Introduces new benchmarking capabilities and updates CI/release automation as part of the v2.0.0 push.

Changes:

  • Added BFCL benchmark modules (plugin factory, evaluators, metrics, CLI entrypoint).
  • Added AgentBench Python package (CLI, adapters, deterministic mock runtime) with tests and sample outputs.
  • Refactored GitHub Actions workflows (multi-language testing, releases, supply-chain, docs CI) and removed several legacy repo files.

Reviewed changes

Copilot reviewed 87 out of 13610 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
benchmarks/bfcl/plugin.py Adds ElizaOS plugin factory/helpers for BFCL function definitions
benchmarks/bfcl/metrics.py Adds BFCL metrics calculator with scoring, latency stats, and error analysis
benchmarks/bfcl/evaluators/*.py Adds AST/execution/relevance evaluators for BFCL
benchmarks/bfcl/__main__.py Adds BFCL CLI runner and model listing/info commands
benchmarks/bfcl/__init__.py Exposes BFCL public API surface
benchmarks/bench_cli_types.py Adds shared JSON/benchmark CLI typing helpers
benchmarks/agentbench/** Adds AgentBench package, CLI, tests, docs, and sample artifacts
.github/workflows/* Updates/introduces workflows for CI, releases, docs, supply-chain scanning
Dockerfile (deleted) Removes root Docker image build definition (impacts image workflows)
.github/dependabot.yml Adds Dependabot config (currently invalid as written)
README.md Updates documentation/commands and branding text
.npmrc Changes npm registry auth configuration
Comments suppressed due to low confidence (1)

.github/workflows/image.yaml:79

  • This workflow builds from the repo root context and will default to ./Dockerfile, but the PR deletes the root Dockerfile. As-is, docker/build-push-action will fail at runtime. Either restore a root Dockerfile, or update the workflow to point at the new Dockerfile path via file: (and adjust triggers/paths accordingly).
      - name: Build and push Docker image
        id: push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +227 to +271
for result in results:
if result.ast_match:
continue

# Categorize the error
if result.error:
if "timeout" in result.error.lower():
error_counts["timeout"] += 1
elif "type" in result.error.lower():
error_counts["type_error"] += 1
else:
error_counts["other"] += 1
continue

details = result.details
if not details:
error_counts["other"] += 1
continue

mismatch_reason = details.get("mismatch_reason", "")
if mismatch_reason == "count_mismatch":
pred_count = int(details.get("predicted_count", 0) or 0)
exp_count = int(details.get("expected_count", 0) or 0)
if pred_count < exp_count:
error_counts["missing_call"] += 1
else:
error_counts["extra_call"] += 1
else:
mismatches = details.get("mismatches", [])
if isinstance(mismatches, list) and mismatches:
for mismatch in mismatches:
if "name" in str(mismatch).lower():
error_counts["name_mismatch"] += 1
break
elif "arg" in str(mismatch).lower():
error_counts["argument_mismatch"] += 1
break
else:
error_counts["other"] += 1

if not result.relevance_correct:
error_counts["relevance_error"] += 1

if not result.exec_success and result.ast_match:
error_counts["execution_error"] += 1
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The early if result.ast_match: continue makes the later if not result.exec_success and result.ast_match: branch unreachable, so "execution_error" never increments. Consider handling execution failures for AST-matching cases before the continue (or removing that early continue and instead branching explicitly for AST-match vs AST-mismatch cases).

Copilot uses AI. Check for mistakes.
Comment on lines +90 to +93
if self.require_explicit_decline and response_text:
return self._has_decline_indicator(response_text)

# No calls made = correct for irrelevant case
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When require_explicit_decline=True but response_text is missing/empty, this currently returns True (i.e., “correct”) even though the evaluator is configured to require an explicit decline. Consider returning False when require_explicit_decline is enabled and response_text is not provided, or enforce response_text as required under that mode.

Suggested change
if self.require_explicit_decline and response_text:
return self._has_decline_indicator(response_text)
# No calls made = correct for irrelevant case
if self.require_explicit_decline:
# When explicit decline is required but no response text is provided,
# treat this as incorrect because we cannot verify the decline.
if not response_text:
return False
return self._has_decline_indicator(response_text)
# No calls made = correct for irrelevant case when explicit decline is not required

Copilot uses AI. Check for mistakes.
Comment on lines +94 to +102
type_mapping = {
"string": "string",
"str": "string",
"integer": "number",
"int": "number",
"number": "number",
"float": "number",
"boolean": "boolean",
"bool": "boolean",
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In JSON Schema, "integer" is distinct from "number". Mapping BFCL integer/int to "number" can incorrectly allow non-integers and may change downstream validation behavior. Map "integer"/"int" to "integer" instead.

Copilot uses AI. Check for mistakes.
Comment on lines +272 to +279
# Calculator mock
async def calculate(expression: str) -> dict[str, object]:
# Simple eval for basic math (in production, use a safe parser)
try:
result = eval(expression, {"__builtins__": {}}) # noqa: S307
return {"expression": expression, "result": result}
except Exception:
return {"expression": expression, "error": "Invalid expression"}
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eval() is still exploitable even with {"__builtins__": {}} (Python object graph tricks can escape). Since this code can be reached by untrusted benchmark inputs, replace eval with a safe expression evaluator (e.g., ast.parse + a restricted node whitelist + operator dispatch, or a dedicated safe-math library) to avoid code execution risk.

Copilot uses AI. Check for mistakes.
updates:
- package-ecosystem: '' # See documentation for possible values
directory: '/' # Location of package manifests
- package-ecosystem: "" # See documentation for possible values
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

package-ecosystem cannot be empty; Dependabot will reject this config. Set it to a valid ecosystem (npm, pip, github-actions, etc.) and add additional entries as needed for the monorepo (e.g., separate updates for /, /packages/python, /benchmarks/agentbench, and .github/workflows).

Suggested change
- package-ecosystem: "" # See documentation for possible values
- package-ecosystem: "github-actions" # See documentation for possible values

Copilot uses AI. Check for mistakes.
from benchmarks.bfcl.metrics import MetricsCalculator
from benchmarks.bfcl.reporting import BFCLReporter, print_results

__version__ = "1.0.0"
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The BFCL package version is hardcoded to 1.0.0 while this PR is labeled “V2.0.0”. If BFCL is intended to ship as part of the v2.0.0 release, consider aligning this version string (or deriving it from a single authoritative version source) to avoid confusing downstream consumers.

Suggested change
__version__ = "1.0.0"
__version__ = "2.0.0"

Copilot uses AI. Check for mistakes.
@odilitime odilitime changed the title V2.0.0 V2.0.0 release Feb 27, 2026
@odilitime odilitime added the 2.x V3 label Feb 27, 2026
* fix: resolve build errors and dependency issues for clean installs

- Add missing deps to @elizaos/core (drizzle-orm, markdown-it, undici, yaml, sharp)
- Fix import paths in autonomy/service.ts and testing/index.ts
- Add node types to plugin-anthropic tsconfig to fix uuid type resolution
- Fix dependency declarations in several plugins (dangling refs, version ranges)
- Upgrade @pixi/react to v8 to match pixi.js v8 in dnd-vtt example

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat: add fix-workspace-deps script for local dev / commit workflow

Manages @elizaos/* dependency references across the monorepo:
  bun run fix-deps          — set workspace:* for local dev
  bun run fix-deps:restore  — restore versioned refs before committing
  bun run fix-deps:check    — CI check for leaked workspace:* refs

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(example-chat): add Ollama and local-ai support with smart provider detection

Explicit env vars (API keys, local URLs) take priority over auto-detected
local servers, so users with cloud keys aren't silently overridden by a
running Ollama instance.

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(plugin-sql): raw SQL expression indexes, array constraints, MySQL JSON translation

- Schema builders: support complex SQL expression indexes (e.g. ((metadata->>'type')))
  and multi-column expression indexes; use sql.raw() for PG and dialect translation for MySQL
- buildTable constraints callback now returns array (Drizzle non-deprecated API)
- DialectAdapter: add optional translateExpression(expr) for PG→MySQL JSON operator translation
- MySQL adapter: translate ->'', ->, ? to JSON_UNQUOTE(JSON_EXTRACT), JSON_EXTRACT, JSON_CONTAINS_PATH
- Apply translation to expression indexes and check constraints when building MySQL tables
- Fix DialectAdapter.createTable type to (table) => any[] in core and plugin-sql types
- Use Object.values in constraint loops to avoid unused key variables

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(db): align Rust host with TS DB API, migrate plugins to runtime memory API

Rust host:
- Add get_memories_by_ids (batch get) with default impl on DatabaseAdapter/UnifiedDatabaseAdapter
- Add UpdateMemoryItem as partial update (id + optional content, metadata, etc.); update_memories takes &[UpdateMemoryItem]
- Add create_memory_with_unique; create_memory delegates with unique=None
- Integration test MockDatabaseAdapter: implement update_memories(UpdateMemoryItem), add CreateMemoryItem import

Plugins (TS/Python):
- plugin-memory, plugin-planning: migrate from getMemoryManager/get_memory_manager to runtime createMemory/getMemories/updateMemory/deleteMemory (create_memory, get_memories, update_memory, delete_memory in Python)

Tests/fixes:
- message_service: fix test_should_respond_ignore_short_circuits by setting VALIDATION_LEVEL=trusted so mock IGNORE response is accepted
- platform.rs: fix define_platform_trait doctests (import macro, use trait Name [] { } syntax)

Misc: tsconfig/package.json/build updates across examples and plugins.
Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(plugin-sql): RLS entity context for transaction, queryEntities, upsertComponents, patchComponent, upsertMemories

- Add optional entityContext/options.entityContext to the five methods in core
  (IDatabaseAdapter, DatabaseAdapter, InMemoryAdapter) and runtime pass-through.
- Plugin-sql base: when entityContext set, run inside withEntityContext so
  Postgres RLS applies; shared createProxyWithDb for both transaction branches.
- Other adapters (LocalDB, InMemoryDB, MySQL base): accept and ignore entityContext.
- MySQL base queryEntities: destructure entityContext out before calling store.
- Unit tests (pg/pglite adapter, entity-context paths); integration test for
  adapter.queryEntities with entityContext under RLS.
- README section 'Entity context and RLS' with WHYs; CHANGELOG; code comments (WHY).

Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(plugin-sql): align RLS isolation with v1 patterns and remove legacy migration

Replace application_name-based server context with parameterized set_config('app.server_id', ...)
for SQL injection safety. Rename withEntityContext to withIsolationContext across all adapters.
Remove legacy v1.6.4→v1.6.5 migration script (migrateToEntityRLS) since v2.0.0 starts with
a fresh snake_case schema.

* fix(plugin-sql): fix failing TypeScript tests and adapter/store behavior

- Align test imports to use ./tables (no schema module)
- Update entity/crud tests for batch adapter API (createEntities returns UUID[])
- Add singular helpers in base adapter (createAgent, getAgent, createWorld, etc.)
- Fix relationship store: tags column as text[] (ARRAY[]::text[]), surface update errors
- Fix memory store: preserve content/metadata on partial updates
- Cache/cascade tests: ensure agent exists, relax cascade assertions when no FKs
- base-comprehensive: expect UUID from createComponent, filter memories by id
- runtime-migrator: allow fkCount >= 0 when FKs not created
- agent.test: use toEqual for array bio, conditional cascade verification
- rls-entity.test: fix describe callback closing (}, not });) for parse

Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(core): fix failing tests and adapter/e2e/plugin-browser

Test adapters (bootstrap + main test-utils):
- Add agents map, upsertAgents, createRoomParticipants; createEntities return UUID[]
- getAgentsByIds/createAgents/upsertAgents use agents map; cleanupTestRuntime accepts undefined

Bootstrap/runtime:
- actions.test: use updateParticipantUserState (not setParticipantUserState)
- runtime.test: expect adapter.initialize (not init), mock initialize sets adapterReady

Secrets/policy/onboarding:
- Telegram token tests: use 35-char token after colon in secrets-validation and onboarding-cli
- tool-policy: empty allow list => allow all (expect true)
- onboarding-state: getUnconfiguredRequired use value == null for null/undefined

ensure-agent-exists/agent-uuid:
- Add upsertAgentsMock; tests assert on upsertAgents; createAgents/createEntities return UUID[]
- getAgentsByIds mock twice for create flow; add Entity import

real-runtime mock:
- Add agents map, upsertAgents, createEntities return UUID[]; add upsertRooms for ensureRoomExists

ApprovalService:
- stop() resolves pending approvals with cancelled: true before clearing

autonomy.test: add getRoomsForParticipants to runtime mock

e2e: vitest.config.ts exclude e2e/** so Playwright specs not run by vitest

plugin-browser-imports: skip plugins without index.browser.ts; add plugin-elizacloud to allowlist
Co-authored-by: Cursor <cursoragent@cursor.com>

* benchmarks: fix issues in benchmarks (ast_evaluator.py)

Iteration 1

prr-fix:prrc_kwdomt5cis6nz4u5

* benchmarks: fix by restructuring (metrics.py)

Iteration 1

prr-fix:prrc_kwdomt5cis6nz4vz
prr-fix:prrc_kwdomt5cis6nz4vz

* benchmarks: add or fix tests (exec_evaluator.py)

Iteration 1

prr-fix:prrc_kwdomt5cis6nz4v_

* benchmarks: improve the path (README.md)

Iteration 1

prr-fix:prrc_kwdomt5cis6nz4wi

* examples: Potential issue | 🟡 Minor (pyproject.toml)

Iteration 1

prr-fix:prrc_kwdomt5cis6n0bo6

* benchmarks: ❌ CRITICAL: Benchmark Bug (UNFIXED × 3 reviews)

Iteration 1

prr-fix:ic-3943501401-1

* misc: ⚠️ IMPORTANT: Documentation Return Type Inconsistencies

Iteration 1

prr-fix:ic-3943501401-2

* plugins: 💡 SUGGESTIONS: Type Safety (base.ts)

Iteration 1

prr-fix:ic-3943501401-3

* misc: Internal developer paths leaked into committed documentation

Iteration 2

prr-fix:prrc_kwdomt5cis6oahj4

* benchmarks: Class renamed to lowercase breaking Python naming conventions

Iteration 2

prr-fix:prrc_kwdomt5cis6pq1bq

* examples: Potential issue | 🟡 Minor (capacitor.config.ts)

Iteration 3

prr-fix:prrc_kwdomt5cis6n0bo0

* examples: Potential issue | 🟡 Minor (sub-agents.smoke.test.ts)

Iteration 3

prr-fix:prrc_kwdomt5cis6n0bo8

* misc: fix issues in DATABASE_API_README (DATABASE_API_README.md)

Iteration 3

prr-fix:prrc_kwdomt5cis6rawu4
prr-fix:prrc_kwdomt5cis6rawvt
prr-fix:ic-3988466184-2

* benchmarks: ❌ CRITICAL: Benchmark Bug (UNFIXED × 4 reviews)

Iteration 7

prr-fix:ic-3988466184-1

* misc: add or fix tests (rls-entity.test.ts)

Iteration 8

prr-fix:ic-3988466184-4
prr-fix:ic-3988480870-2

* plugin-sql: 💡 Type Safety: Multiple `as any` casts (~30 instances)

Iteration 11

prr-fix:ic-3988466184-3

* misc: add type safety (BENCHMARK_RESULTS.md)

Iteration 12

prr-fix:ic-3912797702-0

* packages: IMPORTANT: Interface/Documentation mismatch (database.ts)

Iteration 15

prr-fix:prrc_kwdomt5cis6ra9_p

* misc: ⚠️ IMPORTANT: Interface/Documentation Mismatch

Iteration 15

prr-fix:ic-3988520491-2

* misc: Stale documentation claims completeness with non-existent files

Iteration 16

prr-fix:prrc_kwdomt5cis6ra_s9

* misc: ⚠️ IMPORTANT: Interface/Documentation Mismatch

Iteration 17

prr-fix:ic-3988502505-2

* misc: ✅ What's Good (rls-entity.test.ts)

Iteration 17

prr-fix:ic-3988569610-3

* packages: Interface inconsistency (database.ts)

Iteration 18

prr-fix:prrc_kwdomt5cis6rbi2e

* misc: improve this table (DATABASE_API_README.md)

Iteration 21

prr-fix:prrc_kwdomt5cis6rbi3o
prr-fix:prrc_kwdomt5cis6rbi73
prr-fix:prrc_kwdomt5cis6rbi8y

* benchmarks: consolidate duplicate logic

Changes:
- reporting.py: ### ❌ CRITICAL: Benchmark Bug (UNFIXED × 3 reviews) `benchmarks/bfcl/reportin...
- reporting.py: ### ❌ CRITICAL: Benchmark Bug (UNFIXED × 4 reviews) `benchmarks/bfcl/reportin...
- reporting.py: ### ❌ CRITICAL: Benchmark Bug (UNFIXED × 5 reviews) `benchmarks/bfcl/reportin...
- reporting.py: CRITICAL: enumerate loop variable bug (flagged 5+ times in prior reviews) `ra...

* benchmarks: fix issues in benchmarks (reporting.py)

Iteration 1

prr-fix:prrc_kwdomt5cis6rbnhg
prr-fix:ic-3988598541-1

* plugins: 💡 Type Safety: 66 `as any` casts (participant.store.ts)

Iteration 2

prr-fix:ic-3988598541-2

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* misc: fix in cursor (DOCUMENTATION_COMPLETE.md)

Changes:
- DOCUMENTATION_COMPLETE.md: ### Internal developer paths leaked into committed documentation Low Severity...
- DOCUMENTATION_COMPLETE.md: ### Stale documentation claims completeness with non-existent files Low Sever...

* benchmarks: fix issues in benchmarks (reporting.py)

Iteration 2

prr-fix:prrc_kwdomt5cis6rbnhg
prr-fix:prrc_kwdomt5cis6rt3ca
prr-fix:ic-3994084967-1

* misc: refactor for clarity (DATABASE_API_README.md)

Iteration 2

prr-fix:prrc_kwdomt5cis6rawvt
prr-fix:ic-3994084967-0

* benchmarks: add or fix tests

Changes:
- metrics.py: `executionerror` can never be incremented because the loop `continue`s whenev...
- metrics.py: `executionerror` can never be incremented because the loop `continue`s whenev...
- metrics.py: The percentile indexing is not computed consistently with common quantile def...
- exec_evaluator.py: Using `eval()` is still unsafe even with `builtins` removed; crafted inputs c...

* benchmarks: Module-level `safe_eval` function is unused dead code

Iteration 1

prr-fix:prrc_kwdomt5cis6rurmp

* misc: refactor for clarity (DOCUMENTATION_COMPLETE.md)

Iteration 2

prr-fix:ic-3994231842-0

* benchmarks: Redundant module-level imports duplicated inside function

Iteration 1

prr-fix:prrc_kwdomt5cis6ruyki

* misc: improve code quality

* benchmarks: Leaderboard fallback rank ignores elizaOS insertion offset

Iteration 1

prr-fix:prrc_kwdomt5cis6rv5f6

* misc: add tests for component.test (component.test.ts)

Iteration 13

prr-fix:ic-3995122811-0

* benchmarks: CRITICAL: enumerate loop variable reassignment bug

Iteration 18

prr-fix:prrc_kwdomt5cis6ra5rw

* benchmarks: Duplicate elizaOS row in leaderboard when score is lowest

Iteration 2

prr-fix:prrc_kwdomt5cis6rwxka

* benchmarks: ⚠️ Minor: Duplicate Fallback Code (reporting.py)

Iteration 3

prr-fix:ic-3995138905-1

* misc: add tests for component.test (component.test.ts)

Iteration 8

prr-fix:ic-3995138905-0

* misc: consolidate duplicate logic (reporting.py)

Iteration 8

prr-fix:ic-3995138905-3

* benchmarks: Reporting module replaced with invalid placeholder text

Iteration 13

prr-fix:prrc_kwdomt5cis6rwzyd

* stores: 💡 Minor: Type Safety (plugin.store.ts)

Iteration 13

prr-fix:ic-3995138905-2

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* benchmarks: CRITICAL: API Breaking Change (reporting.py)

Iteration 1

prr-fix:prrc_kwdomt5cis6rw3dp

* misc: add tests for component.test (component.test.ts)

Iteration 14

prr-fix:ic-3995841307-0

* misc: refactor for clarity (reporting.py)

Iteration 14

prr-fix:ic-3995841307-1
prr-fix:ic-3995841307-3

* misc: ❌ CRITICAL: `reporting.py` API Breaking Changes (reporting.py)

Iteration 15

prr-fix:ic-3995165126-1

* benchmarks: consolidate duplicate logic

Changes:
- reporting.py: CRITICAL: enumerate loop variable reassignment bug `rank += 1` has no effect ...
- reporting.py: CRITICAL: enumerate loop variable bug (flagged 5+ times in prior reviews) `ra...
- reporting.py: ### Leaderboard rank increment has no effect across iterations Medium Severit...
- reporting.py: CRITICAL: Bug still present `rank += 1` has no effect because `enumerate()` r...
- reporting.py: CRITICAL: Bug still present (was reverted in c262242) `rank += 1` has no eff...
- reporting.py: CRITICAL: Bug still present after 6+ review cycles `rank += 1` has no effect ...
- reporting.py: CRITICAL: Missing rank increment After appending the baseline entry, you need...
- reporting.py: ### Leaderboard fallback rank ignores elizaOS insertion offset Low Severity T...
- reporting.py: ### Duplicate elizaOS row in leaderboard when score is lowest Medium Severity...
- reporting.py: ### Reporting module replaced with invalid placeholder text High Severity The...
- reporting.py: CRITICAL: API Breaking Change `BFCLReporter.init` now takes no arguments, but...
- reporting.py: ### Rewritten `BFCLReporter` API breaks all existing callers High Severity Th...
- plugin.store.ts: ### 💡 Minor: Type Safety 8 `as any` casts remain in production stores (accep...
- reporting.py: CRITICAL: Type mismatch + rank off-by-one 1. Type mismatch: Callers pass `BFC...
- plugin.store.ts: ### 💡 Type Safety: 8 `as any` casts remain (non-blocking) ``` stores/plugin....

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* benchmarks: CRITICAL: Missing import + wrong attribute access

Iteration 1

prr-fix:prrc_kwdomt5cis6rzn_-

* benchmarks: consolidate duplicate logic (reporting.py)

Iteration 2

prr-fix:prrc_kwdomt5cis6ra5rw
prr-fix:prrc_kwdomt5cis6rw3sq
prr-fix:prrc_kwdomt5cis6r0hnd

* misc: add tests for DOCUMENTATION_COMPLETE (DOCUMENTATION_COMPLETE.md)

Iteration 2

prr-fix:ic-3996124848-5

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* benchmarks: CRITICAL: Async/argument mismatch with callers

Iteration 1

prr-fix:prrc_kwdomt5cis6r0rat

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* benchmarks: fix in cursor (reporting.py)

Iteration 1

prr-fix:prrc_kwdomt5cis6rw3sq
prr-fix:prrc_kwdomt5cis6r0r8m
prr-fix:prrc_kwdomt5cis6saz-c

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* stores: 💡 Minor: Type Safety (plugin.store.ts)

Iteration 1

prr-fix:ic-3995122811-1

* benchmarks: fix in cursor (reporting.py)

Iteration 1

prr-fix:prrc_kwdomt5cis6sbaxd
prr-fix:prrc_kwdomt5cis6sbaxf

* plugins: 💡 SUGGESTIONS: Type Safety (base.ts)

Iteration 1

prr-fix:ic-3943405190-3

* benchmarks: Reporter rewrite loses leaderboard ranking and reporting features

Iteration 3

prr-fix:prrc_kwdomt5cis6sbaxi

* plugins: add type safety (base.ts)

Iteration 3

prr-fix:ic-3924496309-3

* benchmarks: fix imports

Iteration 3

prr-fix:prrc_kwdomt5cis6rw3sq
prr-fix:prrc_kwdomt5cis6rzn_-
prr-fix:prrc_kwdomt5cis6r0gwr
prr-fix:prrc_kwdomt5cis6sbg98
prr-fix:ic-4008414039-1
prr-fix:ic-4008414039-2

* misc: add tests for component.test (component.test.ts)

Iteration 6

prr-fix:ic-4008414039-4

* benchmarks: consolidate duplicate logic

Changes:
- DOCUMENTATION_COMPLETE.md: ### Internal developer paths leaked into committed documentation Low Severity...
- reporting.py: CRITICAL: enumerate loop variable reassignment bug `rank += 1` has no effect ...
- reporting.py: CRITICAL: enumerate loop variable bug (flagged 5+ times in prior reviews) `ra...
- database.ts: IMPORTANT: Interface/Documentation mismatch These methods return `Promise`, b...
- reporting.py: ### Leaderboard rank increment has no effect across iterations Medium Severit...
- DOCUMENTATION_COMPLETE.md: ### Stale documentation claims completeness with non-existent files Low Sever...
- reporting.py: CRITICAL: Bug still present `rank += 1` has no effect because `enumerate()` r...
- database.ts: Interface inconsistency This returns `Promise` but the design principle at `D...
- reporting.py: CRITICAL: Bug still present (was reverted in c262242) `rank += 1` has no eff...
- reporting.py: CRITICAL: Bug still present after 6+ review cycles `rank += 1` has no effect ...
- reporting.py: CRITICAL: Missing rank increment After appending the baseline entry, you need...
- reporting.py: ### Leaderboard fallback rank ignores elizaOS insertion offset Low Severity T...
- reporting.py: ### Duplicate elizaOS row in leaderboard when score is lowest Medium Severity...
- reporting.py: ### Reporting module replaced with invalid placeholder text High Severity The...
- reporting.py: ### Rewritten `reporting.py` breaks all callers with incompatible API High Se...
- reporting.py: ### Leaderboard rank numbering starts at zero instead of one Low Severity The...
- reporting.py: CRITICAL: API Breaking Change `BFCLReporter.init` now takes no arguments, but...
- reporting.py: CRITICAL: Type mismatch + rank off-by-one 1. Type mismatch: Callers pass `BFC...
- reporting.py: ### Rewritten `BFCLReporter` API breaks all existing callers High Severity Th...
- reporting.py: CRITICAL: Type mismatch + rank off-by-one 1. Type mismatch: Callers pass `BFC...
- reporting.py: CRITICAL: Missing import + wrong attribute access 1. `BFCLBenchmarkResults` i...
- reporting.py: ### `printresults` accesses nonexistent `.metrics` on `BFCLResult` High Sever...
- reporting.py: ### `BFCLReporter` rewrite removes all reporting functionality used by caller...
- reporting.py: CRITICAL: Wrong attribute access `result` is a `BFCLResult` which does NOT ha...
- reporting.py: ### Reporter `generatereport` breaks callers expecting async with args High S...
- reporting.py: ### Hardcoded rank=1 ignores actual leaderboard position Medium Severity The ...
- reporting.py: CRITICAL: Async/argument mismatch with callers `runner.py:146` and `:485` cal...
- reporting.py: ### Reporter API mismatch causes runner crash on report generation High Sever...
- reporting.py: ### Reporting rewrite loses all file output and console summary Medium Severi...
- reporting.py: ### `printresults` calls async method without await or argument High Severity...
- reporting.py: ### Rewritten reporter drops all file output functionality Medium Severity Th...
- reporting.py: CRITICAL: Missing await `generatereport()` is async (line 21) but is called h...
- reporting.py: ### `asyncio.run()` crashes inside already-running event loop High Severity `...
- reporting.py: ### `BFCLConfig` dataclass lacks dict `.get()` method High Severity `BFCLRepo...
- reporting.py: ### Reporter rewrite loses leaderboard ranking and reporting features Medium ...
- reporting.py: ### ❌ CRITICAL: Benchmark Bug (UNFIXED × 3 reviews) `benchmarks/bfcl/reportin...
- reporting.py: ### ❌ CRITICAL: Benchmark Bug (UNFIXED × 4 reviews) `benchmarks/bfcl/reportin...
- reporting.py: ### ❌ CRITICAL: Benchmark Bug (UNFIXED × 5 reviews) `benchmarks/bfcl/reportin...
- reporting.py: ### ❌ CRITICAL: Benchmark Bug (Unfixed) `benchmarks/bfcl/reporting.py:394` — ...
- reporting.py: ### ❌ CRITICAL: Benchmark Bug Still Present `benchmarks/bfcl/reporting.py:397...
- DOCUMENTATION_COMPLETE.md: ### Summary This is a well-designed database refactor. Most previously flagge...
- DOCUMENTATION_COMPLETE.md: ### Summary This is a well-designed database refactor. Most previously flagge...
- reporting.py: ### ⚠️ Minor: Duplicate Fallback Code `benchmarks/bfcl/reporting.py:406-421` ...
- DOCUMENTATION_COMPLETE.md: ### ✅ What's Good - patchComponent tests now cover all 4 operations (set, pus...
- reporting.py: ### ❌ CRITICAL: `printresults` calls async function without await `benchmarks...
- reporting.py: ### `printresults` accesses non-existent attributes on `BFCLResult` High Seve...
- reporting.py: ### Report output directory never created before writing files Medium Severit...
- reporting.py: ### ❌ CRITICAL: `printresults` accesses non-existent attributes `benchmarks/b...
- reporting.py: ### ❌ CRITICAL: `asyncio.run()` called from async context `benchmarks/bfcl/re...

* misc: consolidate duplicate logic

Changes:
- reporting.py: ### Verdict Approve with minor changes. The duplicate fallback code in report...
- reporting.py: ### ❌ CRITICAL: `reporting.py` API Breaking Changes The `benchmarks/bfcl/repo...
- reporting.py: ### Verdict Request changes. Fix the `reporting.py` API breaking changes befo...
- reporting.py: ### ❌ CRITICAL: `reporting.py` API Breaking Change `benchmarks/bfcl/reporting...
- reporting.py: ### Verdict Approve with minor changes. The core database refactor is solid. ...
- reporting.py: ### ❌ CRITICAL: `reporting.py` Missing Import + Wrong Attributes `benchmarks/...
- reporting.py: ### Verdict Request changes. Fix the `reporting.py` import and attribute acce...
- reporting.py: ### ❌ CRITICAL: `reporting.py` API Breaking Changes
- reporting.py: 1. Wrong attribute access in `printresults` (`reporting.py:44-47`): ```python...
- reporting.py: ### Verdict Approve with minor changes. The core database refactor is solid. ...
- reporting.py: ### ❌ CRITICAL: `reporting.py` API Breaking Change `runner.py:146` and `:485`...
- reporting.py: ### Verdict Approve with minor changes. Fix the `reporting.py` async/argument...
- reporting.py: ### ❌ CRITICAL: `reporting.py` Async/Argument Mismatch `runner.py:146` and `:...
- reporting.py: ### Verdict Approve with minor changes. Fix the `reporting.py` async/argument...

* benchmarks: fix issues in benchmarks (ast_evaluator.py)

Changes:
- ast_evaluator.py: `isinstance` does not accept PEP-604 unions (`int | float`) as the second arg...

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

* benchmarks: fix in cursor

Iteration 1

prr-fix:prrc_kwdomt5cis6rw3sq
prr-fix:prrc_kwdomt5cis6sbaxi
prr-fix:prrc_kwdomt5cis6sbpck
prr-fix:prrc_kwdomt5cis6sbpcm
prr-fix:prrc_kwdomt5cis6sbtaj

* docs: add review dismissal comments

Explains reasoning for dismissed issues inline in code

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: standujar <s.andujar@proton.me>
@odilitime
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

@@ -1 +1 @@
node-linker=hoisted No newline at end of file
//registry.npmjs.org/:_authToken=${NPM_TOKEN} No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Committed .npmrc auth token breaks contributor installs

High Severity

The .npmrc file now only contains //registry.npmjs.org/:_authToken=${NPM_TOKEN} and is committed to the repository (not in .gitignore). The previously existing node-linker=hoisted line was removed. This causes two problems: (1) any contributor without NPM_TOKEN set will have npm send an empty/undefined auth token on every registry request, which can cause 401 Unauthorized errors during install; (2) committing registry auth configuration to version control is a security anti-pattern that can lead to accidental token exposure. This config belongs in CI environment setup or user-level ~/.npmrc, not in the project root.

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants