feat: implement container pooling and fix transport error handling#551
feat: implement container pooling and fix transport error handling#551
Conversation
- Fix YAML linting issues in GitHub workflow files - Remove trailing spaces in ci-ts.yaml - Add missing newlines at end of YAML files - Fix typo: settingsNavnavLeftButton → settingsNavLeftButton - Organize imports in containerPool.ts (external packages first) - Remove unused model variable in cleanup() method - Apply automated code formatting improvements
- Fix missing parentheses in tests/minapp/index.ts toString() call - Add error handling to gRPC exchange promise chain - Remove async modifier from globalsetup.ts (no await needed) - Add race condition protection for pool initialization - Document DEV_CERT_PRIVATE_KEY as development-only test key
- Fix trailing spaces in .github/dependabot.yml - Add missing await to async operations in tests/pullImageKillOld.ts - Wrap async calls in IIFE to prevent CI hanging on cleanup This should resolve the hanging CI build and address the remaining CodeRabbit linting issues.
- Add error tracking in transport layer via proxy interceptor - Create errors.ts with APDU status codes and error handling utilities - Propagate critical transport errors in waitUntilScreenIs, waitUntilScreenIsNot, and waitForText - Update getEvents to check for critical transport errors before making requests - Add comprehensive tests to verify fast-fail behavior Previously, when errors like 0x6984 (INVALID_DATA) or 0x6E00 (CLA_NOT_SUPPORTED) occurred, Zemu would continue waiting for timeouts instead of failing immediately. Now these errors are detected and cause immediate failure with clear error messages. Fixes issue where transport errors were absorbed causing unnecessary timeout delays.
The test was expecting 0x6984 (INVALID_DATA) but invalid CLA (0xFF) actually triggers 0x6E00 (CLA_NOT_SUPPORTED) which is the correct behavior.
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the ✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
- Remove unused imports in test files - Fix import organization in Zemu.ts - Fix formatting (newlines at end of files) - Apply Biome linter rules All tests passing after linter fixes.
There was a problem hiding this comment.
@jleni maybe this file is not needed anymore? depends on the failure being copied to the original file or not
There was a problem hiding this comment.
what about this one? Seems to be repeated too
| ;(async () => { | ||
| await Zemu.checkAndPullImage() | ||
| await Zemu.stopAllEmuContainers() | ||
| })() |
There was a problem hiding this comment.
What if we add then/catch here, and log if errors?
) * feat: modernize Zemu codebase to Zondax standards (#547) * modernize * refactor: migrate to biome and standardize code style across project * chore: update biome lint command and add fix script alias * style: standardize quotes to single quotes and fix formatting * refactor: replace forEach loops with for...of and remove unnecessary async keywords * refactor: replace forEach loops with for...of and remove unnecessary async declarations * ci: update workflow commands and disable coverage in ci-ts pipeline * temporarily disable docker * Modernize codebase (#548) * activate testing in ci * ci: remove frozen lockfile in CI workflow * feat: implement container pooling and fix transport error handling (#551) * feat: implement container pooling system for device emulation * fix: address CodeRabbit review comments - Fix YAML linting issues in GitHub workflow files - Remove trailing spaces in ci-ts.yaml - Add missing newlines at end of YAML files - Fix typo: settingsNavnavLeftButton → settingsNavLeftButton - Organize imports in containerPool.ts (external packages first) - Remove unused model variable in cleanup() method - Apply automated code formatting improvements * fix: address additional CodeRabbit review comments - Fix missing parentheses in tests/minapp/index.ts toString() call - Add error handling to gRPC exchange promise chain - Remove async modifier from globalsetup.ts (no await needed) - Add race condition protection for pool initialization - Document DEV_CERT_PRIVATE_KEY as development-only test key * fix: resolve remaining CI issues and CodeRabbit comments - Fix trailing spaces in .github/dependabot.yml - Add missing await to async operations in tests/pullImageKillOld.ts - Wrap async calls in IIFE to prevent CI hanging on cleanup This should resolve the hanging CI build and address the remaining CodeRabbit linting issues. * fix: handle transport errors to fail fast instead of timing out - Add error tracking in transport layer via proxy interceptor - Create errors.ts with APDU status codes and error handling utilities - Propagate critical transport errors in waitUntilScreenIs, waitUntilScreenIsNot, and waitForText - Update getEvents to check for critical transport errors before making requests - Add comprehensive tests to verify fast-fail behavior Previously, when errors like 0x6984 (INVALID_DATA) or 0x6E00 (CLA_NOT_SUPPORTED) occurred, Zemu would continue waiting for timeouts instead of failing immediately. Now these errors are detected and cause immediate failure with clear error messages. Fixes issue where transport errors were absorbed causing unnecessary timeout delays. * fix: update error handling test to expect correct error code The test was expecting 0x6984 (INVALID_DATA) but invalid CLA (0xFF) actually triggers 0x6E00 (CLA_NOT_SUPPORTED) which is the correct behavior. * fix: resolve linter issues - Remove unused imports in test files - Fix import organization in Zemu.ts - Fix formatting (newlines at end of files) - Apply Biome linter rules All tests passing after linter fixes. * feat: increase default container pool size to 2 for all devices Changed default pool configuration from 1 to 2 containers per device type (except nanos which was already 2). This provides better parallelism for test suites that run multiple tests concurrently. * fix: add explicit permissions to GitHub workflows Address security warnings by adding explicit GITHUB_TOKEN permissions: - ci-ts.yaml: contents: read (minimal permissions for CI) - publish.yml: contents: read, packages: write (for npm publishing) This follows GitHub security best practices to limit token permissions. * fix: address CodeRabbit review comments - Add missing newline at EOF in .github/workflows/publish.yml - Replace console.log with proper error handling in buttons.ts - Improve error handling in Zemu.ts close method with try-catch-finally - Add setScrambleKey support to transport proxy for comprehensive error tracking - Remove debugging console.log statements from tests - Remove skipped test that was only for exploration - Replace console.warn with process.stderr.write for cleaner logging * test: align metadata and comments to CLA_NOT_SUPPORTED (0x6E00) - Update test title to reference 0x6E00 (CLA_NOT_SUPPORTED) instead of 0x6984 - Update comment before transport.send to reference CLA_NOT_SUPPORTED - Update comment on thrown error to reference CLA_NOT_SUPPORTED This aligns test documentation with the actual error code (0x6E00) that the test assertion validates, removing confusion about which error is being tested. * fix: correct property name from containerPooling to disablePool in tests - Update tests/error-handling-simple.test.ts to use disablePool: true - Update tests/error-handling.test.ts to use disablePool: true - Update tests/error-handling-fixed.test.ts to use disablePool: true The property name now matches the IStartOptions interface definition which defines disablePool?: boolean, not containerPooling. * fix: resolve ButtonKind errors for non-touch devices - Create dummyButton constant to avoid calling getTouchElement for nanos/nanox/nanosp - Add device type checks before calling getTouchElement in Zemu.ts navigation - Fix TouchNavigation usage to check isTouchDevice before instantiation - Non-touch devices now use ActionKind.RightClick with dummy button for navigation - Remove hardcoded getTouchElement call in scheduleToNavElement function This resolves 'Unsupported ButtonKind: nanos, 1' and 'Unsupported ButtonKind: nanos, 2' errors when using ButtonKind.QuitAppButton and ButtonKind.SwipeContinueButton with Nano devices. * feat: implement automatic container cleanup to prevent port conflicts - Add cleanupStaleContainers() method to ContainerPool that runs before initialization - Use Docker API to find and remove stale zemu containers automatically - Enhanced global test setup with comprehensive process exit handlers - Added handlers for SIGINT, SIGTERM, beforeExit, uncaughtException, and unhandledRejection - Prevents port allocation conflicts by cleaning up leftover containers from previous runs - No more manual container cleanup required - the library manages this automatically This resolves port conflicts like 'Bind for 0.0.0.0:15000 failed: port is already allocated' that occurred when containers weren't properly cleaned up from interrupted test runs. * fix: add explicit support for prerelease tags while maintaining security - Keep secure numeric-only patterns (rejecting unsafe CodeRabbit suggestion v*.*.*) - Add explicit patterns for common prerelease formats: - v1.2.3-rc1, v1.2.3-rc2 (release candidates) - v1.2.3-beta.1, v1.2.3-beta.2 (beta releases) - v1.2.3-alpha.1, v1.2.3-alpha.2 (alpha releases) - Maintain backwards compatibility with existing patterns - General v[0-9]+.[0-9]+.[0-9]+-* pattern still covers other formats This ensures -rc1 support while preventing dangerous patterns like v*.*.* that could match malicious strings like vMALICIOUS.CODE.INJECTION-hack. * refactor: simplify tag patterns with clear comments - Use concise patterns with descriptive comments - Main releases: v1, v1.2, v1.2.3 - Prereleases: v1.2.3-rc1, v1.2.3-beta.1, etc. - Remove redundant explicit patterns - general v[0-9]+.[0-9]+.[0-9]+-* covers all prerelease formats - Maintain security by requiring numeric versions only * fix: improve error handling and code formatting in container pool initialization * adjust github token permission settings * feat: add comprehensive APDU error codes to errors module - Add BUSY (0x9001) for device busy state - Add EXECUTION_ERROR (0x6400) for execution failures - Add EMPTY_BUFFER (0x6982) for empty buffer conditions - Add OUTPUT_BUFFER_TOO_SMALL (0x6983) for buffer size issues - Add TX_NOT_INITIALIZED (0x6987) for uninitialized transactions - Add BAD_KEY_HANDLE (0x6A80) for invalid key handles - Add SIGN_VERIFY_ERROR (0x6F01) for signature verification failures - Update critical transport errors list with new error codes - Add human-readable messages for all new error codes * refactor: consolidate duplicate dummyButton definitions into single export - Export dummyButton from buttons.ts module - Remove duplicate definitions from Zemu.ts and actions.ts - Update imports to use the shared dummyButton export - Reduces code duplication and improves maintainability * fix: update GitHub workflow permissions for npm publishing - Change permissions from packages:write to id-token:write - Required for npm provenance and OIDC authentication - Addresses security recommendation from PR review * docs: update README with latest features and improvements - Update CI badge to point to new workflow (ci-ts.yaml) - Add support for Stax and Flex devices in overview - Update features list with container pooling and TypeScript support - Add Vitest to supported test frameworks - Include installation instructions with pnpm support - Add quick start code example - Highlight enhanced error handling capabilities
Summary
This PR implements container pooling for Zemu to improve performance and fixes transport error handling to fail fast instead of timing out.
Key Features
Container Pooling (feat: Add container pooling for 60-80% faster test execution #549)
ZEMU_DISABLE_POOL=trueTransport Error Handling
Changes
Container Pooling Implementation
src/containerPool.ts: New singleton class managing pooled containersTransport Error Handling
src/errors.ts: Define APDU status codes and error utilitiessrc/Zemu.ts:waitUntilScreenIs(),waitUntilScreenIsNot(), andwaitForText()to check for critical errorsgetEvents()to propagate transport errors instead of silently returning empty arrayConfiguration
ZEMU_POOL_NANOS: Number of Nano S containers (default: 2)ZEMU_POOL_NANOX: Number of Nano X containers (default: 1)ZEMU_POOL_NANOSP: Number of Nano S+ containers (default: 1)ZEMU_POOL_STAX: Number of Stax containers (default: 1)ZEMU_POOL_FLEX: Number of Flex containers (default: 1)ZEMU_DISABLE_POOL: Disable pooling entirely (default: false)Tests
Performance Impact
Migration Guide
No breaking changes - the pooling feature is enabled by default but can be disabled:
Testing
All tests pass including new test suites:
tests/error-handling.test.ts: Validates fast-fail error handlingtests/error-handling-fixed.test.ts: Comprehensive error propagation teststests/basic.*.test.ts: All existing tests continue to workRelated Issues
Checklist