
ci: stabilize integration tests with isolation, serial execution, chaos ordering, and comprehensive command coverage #110

Draft
Copilot wants to merge 5 commits into main from
copilot/stabilize-ci-test-suites

Contributor

Copilot AI commented Feb 27, 2026

  • Investigate flaky CI and identify root causes
  • Fix plugin-lifecycle.test.ts isolation
  • Serialize integration tests + chaos runner + CI changes
  • Merge main
  • Unit tests: tests/unit/list-table-formatting.test.ts
  • Integration tests: tests/integration/list-commands.test.ts (comprehensive CLI flag coverage)
  • Fix CI failures: wrong JSON field names (Key/'Process ID'/State not processDefinitionKey/etc.) in list-commands.test.ts before() and search assertions
  • Fix process.exit mock-with-throw in completion.test.ts → spawnSync with exit-code assertion
  • Fix process.exit mock-with-throw in profile-switching.test.ts → spawnSync with exit-code assertion
Original prompt

Start implementation

The user has attached the following file paths as relevant context:

  • .github/copilot-instructions.md
[Chronological Review: The conversation began with the user requesting an investigation into flaky GitHub Actions and the need to stabilize CI by analyzing past runs and ensuring test suites are isolated. The user then specified not to invest in fixing the `release.yaml` file as it is being handled in parallel. The most recent context involved the user working on the `package.json` file in the `c8ctl` repository.]

[Intent Mapping:

  1. "gh actions have become flaky. your goal is to make the CI more stable." - User wants to investigate CI failures and improve stability.
  2. "make sure, test suites are properly isolated." - User requests isolation of test suites.
  3. "introduce some chaos testing, e.g. by running suites in different orders." - User seeks to implement chaos testing to identify stability issues.
  4. "don't invest in fixing the release.yaml - this is taken care of in parallel." - User instructs not to focus on release.yaml as it is being addressed elsewhere.]

[Technical Inventory:

  • Repository: c8ctl
  • Current Branch: main
  • File in Focus: package.json
  • CI/CD Tool: GitHub Actions
  • Testing Strategy: Isolation of test suites and chaos testing.]

[Code Archaeology:

  • File: package.json
  • Purpose: Manages project dependencies and scripts for the c8ctl repository.
  • Current State: User is actively working on this file, but specific changes were not detailed in the conversation.]

[Progress Assessment:

  • Completed Tasks: User has identified the need for CI stability and isolation of test suites.
  • Partially Complete Work: Implementation of chaos testing and further investigation into CI failures is pending.
  • Validated Outcomes: No specific outcomes have been validated yet as the investigation is ongoing.]

[Context Validation: All critical information for continuation is captured, including user goals regarding CI stability and the specific files being worked on.]

[Recent Commands Analysis:

  • Last Agent Commands: No specific commands were executed by the agent; the focus was on user requests.
  • Tool Results Summary: No tool results were generated as the conversation revolved around user directives.
  • Pre-Summary State: The agent was focused on the user's request regarding CI stability and the package.json file.
  • Operation Context: The commands and requests were directly related to improving CI processes and ensuring test suite reliability.]
1. Conversation Overview:
  • Primary Objectives:
    • "gh actions have become flaky. your goal is to make the CI more stable."
    • "make sure, test suites are properly isolated."
    • "introduce some chaos testing, e.g. by running suites in different orders."
    • "don't invest in fixing the release.yaml - this is taken care of in parallel."
  • Session Context: The user is focused on improving the stability of CI processes in the `c8ctl` repository, specifically addressing flaky GitHub Actions and ensuring test suite isolation.
  • User Intent Evolution: The user initially sought to stabilize CI, then clarified not to focus on `release.yaml`, indicating a prioritization of tasks.
2. Technical Foundation:
  • Repository: c8ctl
  • Current Branch: main
  • File: package.json
  • Purpose: Manages dependencies and scripts for the project.
3. Codebase Status:
  • File Name: package.json
  • Purpose: Important for managing project dependencies and scripts.
  • Current State: User is currently working on this file.
  • Key Code Segments: Not specified in the conversation.
  • Dependencies: Related to the overall project setup.
4. Problem Resolution:
  • Issues Encountered: Flaky GitHub Actions causing CI instability.
  • Solutions Implemented: User has not yet implemented solutions but has outlined steps for investigation and chaos testing.
  • Debugging Context: Ongoing investigation into CI failures and test suite isolation.
  • Lessons Learned: Need for chaos testing to identify stability issues.
5. Progress Tracking:
  • Completed Tasks: Identification of CI issues and need for test suite isolation.
  • Partially Complete Work: Implementation of chaos testing and further investigation into CI failures is pending.
  • Validated Outcomes: No outcomes validated yet.
6. Active Work State:
  • Current Focus: Investigating CI stability and working on package.json.
  • Recent Context: User has been discussing CI improvements and isolation of test suites.
  • Working Code: No specific code snippets were discussed recently.
  • Immediate Context: User was addressing CI stability and test suite isolation.
7. Recent Operations:
  • Last Agent Commands: No specific commands executed; focus was on user requests.
  • Tool Results Summary: No tool results generated; conversation focused on user directives.
  • Pre-Summary State: User was actively discussing CI stability and working on package.json.
  • Operation Context: User's requests w...

Created from VS Code.



- Fix plugin-lifecycle.test.ts to use an isolated temp C8CTL_DATA_DIR,
  preventing global state pollution between concurrent test files
- Add --test-concurrency=1 to test:integration to serialize file execution,
  eliminating race conditions from concurrent integration test runs
- Add tests/utils/chaos-runner.mjs: seeded Fisher-Yates shuffle for
  test files, enabling reproducible chaos testing
- Add test:integration:chaos and test:unit:chaos npm scripts
- CI: fail-fast: false so all matrix jobs complete and failures are visible
- CI: use test:integration:chaos for integration tests so every run
  exercises a different file order, probing for ordering dependencies

Co-authored-by: vobu <6573426+vobu@users.noreply.github.com>
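
The seeded shuffle described in the commit message above could look roughly like the following. This is a minimal sketch and not the contents of tests/utils/chaos-runner.mjs: the mulberry32 generator, the function names `mulberry32` and `seededShuffle`, and the placeholder file names are all assumptions made for illustration.

```javascript
// Small deterministic PRNG (mulberry32). Returns a function that yields
// floats in [0, 1). The same seed always produces the same sequence.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Fisher-Yates shuffle driven by the seeded PRNG.
// Works on a copy, so the caller's array is left untouched.
function seededShuffle(items, seed) {
  const rand = mulberry32(seed);
  const out = [...items];
  for (let i = out.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1)); // 0 <= j <= i
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Placeholder file names, not the real test file list.
const files = ['a.test.ts', 'b.test.ts', 'c.test.ts', 'd.test.ts'];
const seed = 12345; // log the seed in CI so a failing order can be replayed
console.log(`chaos seed ${seed}:`, seededShuffle(files, seed).join(' '));
```

Printing the seed with every run is what makes the ordering reproducible: rerunning with the logged seed yields the same file order.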
Copilot AI changed the title from "[WIP] Investigate and stabilize flaky GitHub Actions CI" to "ci: stabilize integration tests with isolation, serial execution, and chaos ordering" Feb 27, 2026
Collaborator

vobu commented Feb 28, 2026

@copilot merge in current main. in there, you'll find tests/fixtures/mini-process.bpmn with process id mini-process-1. use that one to add integration tests for all commands that interact with a potentially large amount of data, such as c8 list. Look into tests/integration/pagination.test.ts for how to build a large number of BPMNs and how to deploy them correctly. To find missing integration tests for commands, look through all test files and make sure that

  • each command is tested
    • on unit testing level, with the implemented function
    • on integration test level, with calling the cli
  • take extra care on checking that all possible combinations for switches to commands are tested

Copilot AI and others added 2 commits February 28, 2026 17:47
- tests/unit/list-table-formatting.test.ts: 18 unit tests for Key-column
  fallback and field-formatting logic in process-definitions, user-tasks,
  incidents, and jobs list commands (mirrors process-instances.test.ts pattern)

- tests/integration/list-commands.test.ts: all list and get CLI commands
  tested via spawnSync with every flag combination:
  list pi  — --id / --state / --all / --sortBy+--asc/--desc / --limit
             / --between / --dateField / text header check
  list pd  — --sortBy+--asc/--desc / --limit / text header check
  list ut  — --state / --all / --sortBy+--asc/--desc / --limit / text headers
  list inc — --state / --processInstanceKey / --sortBy+--asc/--desc
             / --limit / text headers
  list jobs — --state / --type / --sortBy+--asc/--desc / --limit
              / --between (8.9+ skip) / text headers
  get pi   — plain + --variables
  get pd   — plain + --xml
  get inc  — plain
  search pi/ut/inc/jobs/variables — CLI surface + sort flags
  Function-level tests for listProcessDefinitions, getProcessDefinition,
  listUserTasks, listIncidents, getIncident, listJobs using console.log
  capture to verify actual output content

Co-authored-by: vobu <6573426+vobu@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 28, 2026

@copilot merge in current main. in there, you'll find tests/fixtures/mini-process.bpmn with process id mini-process-1. use that one to add integration tests for all commands th...

Done in commit 8afec43. Here's what was added:

tests/unit/list-table-formatting.test.ts — 18 unit tests covering the Key-column fallback logic and field-formatting rules for the four list commands that had no unit-level tests (process-definitions, user-tasks, incidents, jobs). Mirrors the existing process-instances.test.ts pattern.

tests/integration/list-commands.test.ts — comprehensive CLI tests (spawnSync with isolated C8CTL_DATA_DIR) for every flag combination across all list/get commands, using mini-process.bpmn as the data fixture:

| Command | Flags tested |
| --- | --- |
| list pi | --id, --state, --all, --sortBy+--asc/--desc, --limit, --between, --between --dateField, text-header check |
| list pd | --sortBy+--asc/--desc, --limit, text-header check |
| list ut | --state, --all, --sortBy+--asc/--desc, --limit, text-header check |
| list inc | --state, --processInstanceKey, --sortBy+--asc/--desc, --limit, text-header check |
| list jobs | --state, --type, --sortBy+--asc/--desc, --limit, --between (skipped on 8.8), text-header check |
| get pi | plain + --variables |
| get pd | plain + --xml |
| get inc | plain |
| search pi/ut/inc/jobs/variables | filter flags + --sortBy+--asc/--desc |

Also includes function-level tests for listProcessDefinitions, getProcessDefinition, listUserTasks, listIncidents, getIncident, and listJobs that capture console.log output to verify actual content (not just exit-code 0).
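
For illustration, the "spawnSync with isolated C8CTL_DATA_DIR" pattern mentioned above can be sketched as follows. This is an assumption-laden sketch rather than the code in list-commands.test.ts: the helper name `runIsolated` and the temp-dir prefix are hypothetical, and a child Node process stands in for the real CLI entry point, whose path is not shown in this conversation.

```javascript
const { spawnSync } = require('node:child_process');
const { mkdtempSync, rmSync } = require('node:fs');
const { tmpdir } = require('node:os');
const { join } = require('node:path');

// Run a child process with its own throwaway data directory so that no
// state leaks between test files. (Hypothetical helper for illustration.)
function runIsolated(args) {
  const dataDir = mkdtempSync(join(tmpdir(), 'c8ctl-test-'));
  try {
    const result = spawnSync(process.execPath, args, {
      encoding: 'utf8',
      // Only C8CTL_DATA_DIR is overridden; the rest of the environment is inherited.
      env: { ...process.env, C8CTL_DATA_DIR: dataDir },
    });
    return { dataDir, status: result.status, stdout: result.stdout, stderr: result.stderr };
  } finally {
    // Remove the directory even if spawning throws.
    rmSync(dataDir, { recursive: true, force: true });
  }
}

// Stand-in for the CLI: the child simply echoes the data dir it was given.
const run = runIsolated(['-e', 'process.stdout.write(process.env.C8CTL_DATA_DIR)']);
console.log('exit code:', run.status, 'saw isolated dir:', run.stdout === run.dataDir);
```

In a real suite the directory would more likely be created once in before() and removed in after(), so that several CLI calls in the same file share state with each other but not with other files.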

Copilot AI changed the title ci: stabilize integration tests with isolation, serial execution, and chaos ordering ci: stabilize integration tests with isolation, serial execution, chaos ordering, and comprehensive command coverage Feb 28, 2026
Collaborator

vobu commented Feb 28, 2026

@copilot check why gh actions fail and fix. further, tests should avoid mocking process.exit with a throw. Instead, they should either:
  • Use execFileSync/spawnSync to run the CLI as a subprocess and assert on the exit code, or
  • Use node:test mock API (t.mock.fn()) which integrates properly with the test runner
check all files for that pattern and improve
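
As a sketch of the second option, the node:test mock API can replace process.exit with a recording no-op instead of a throwing stub. `runCompletion` below is a hypothetical stand-in for the code under test, and the top-level `mock` export is used here only so the snippet runs standalone; inside a test callback, `t.mock.method` would do the same and restore automatically when the test ends.

```javascript
const { mock } = require('node:test');
const { strictEqual } = require('node:assert');

// Hypothetical function under test: reports an error and exits with
// code 1 when no shell argument is given.
function runCompletion(shell) {
  if (!shell) {
    console.error('Shell argument is required');
    process.exit(1);
    return; // only reached while process.exit is mocked
  }
  console.log(`completion script for ${shell}`);
}

// Replace process.exit with a no-op that records its calls.
const exitMock = mock.method(process, 'exit', () => {});
try {
  runCompletion(undefined);
  strictEqual(exitMock.mock.calls.length, 1);
  strictEqual(exitMock.mock.calls[0].arguments[0], 1);
} finally {
  mock.restoreAll(); // always put the real process.exit back
}
```

One caveat with this approach: because the mocked process.exit returns normally, any code after the exit call keeps running, so the code under test needs an explicit return (as above) or the assertions have to tolerate the extra execution. The subprocess option avoids that problem entirely.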

…ss.exit mock-with-throw pattern

Root cause of 26 CI failures:
- logger.table() in JSON mode serializes display column names as keys
  (Key, 'Process ID', 'Process Instance', State, Type, etc.) not the raw
  API field names (processDefinitionKey, processInstanceKey, state, type)
- before() was extracting items[0].processDefinitionKey etc. → all undefined
- search CLI assertions used it.state, it.type, it.processDefinitionId etc.
  → all undefined, causing assertion failures

Fixes in tests/integration/list-commands.test.ts:
- before(): use items[0].Key for all key extraction
- before(): use items[0]['Process Instance'] for processInstanceKey in incidents
- before(): use jobs[0].State (not j.state) for job state check
- before(): add assert.ok guards so setup failures are immediately visible
- search CLI tests: fix all field names to match table column names
- remove function-level test section (violates project guideline to only
  use CLI commands in integration tests; also fragile due to global
  logger mode state)

process.exit mock-with-throw pattern removed:
- tests/unit/completion.test.ts: replaced process.exit stub with spawnSync
  (runs CLI as real subprocess, asserts on status/stderr/stdout)
- tests/integration/profile-switching.test.ts: replaced process.exit mock
  with spawnSync-based test that asserts list pi exits with code 1

Co-authored-by: vobu <6573426+vobu@users.noreply.github.com>
Contributor Author

Copilot AI commented Feb 28, 2026

@copilot check why gh actions fail and fix. further, tests should avoid mocking process.exit with a throw. Instead, they should either:
Use execFileSync/spawnSync to run the CLI as ...

Fixed in commit a1962bf.

CI failures (26 failing tests) — root cause was wrong JSON field names throughout list-commands.test.ts:

logger.table() in JSON mode serialises the display column names as keys (Key, 'Process ID', 'Process Instance', State, Type, …) not the raw API field names (processDefinitionKey, processInstanceKey, state, type, …). The previous code extracted items[0].processDefinitionKey, items[0].processInstanceKey, it.state, it.type, etc. — all undefined at runtime.

Specific fixes:

  • before(): items[0].Key (not items[0].processDefinitionKey ?? items[0].key), items[0]['Process Instance'] (not items[0].processInstanceKey), jobs[0].State (not j.state), etc. Added assert.ok guards so setup failures surface immediately
  • Search CLI assertions: it['Process ID'], it.State, it.Type, it['Process Instance'] throughout
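
A small sketch of the corrected extraction, assuming the JSON output is an array of rows keyed by display column name as described above. The sample row and its values are made up for illustration; in the real test the string would come from the CLI's stdout.

```javascript
const { ok, strictEqual } = require('node:assert');

// Hypothetical JSON-mode output of a list command (values are placeholders).
const stdout = JSON.stringify([
  {
    Key: 'example-incident-key',
    'Process ID': 'mini-process-1',
    'Process Instance': 'example-instance-key',
    State: 'ACTIVE',
  },
]);

const items = JSON.parse(stdout);
// Guard first, so a broken setup fails here and not in a later test.
ok(Array.isArray(items) && items.length > 0, 'expected at least one row');

const row = items[0];
const incidentKey = row.Key; // display column name, not incidentKey/key
const processInstanceKey = row['Process Instance']; // needs bracket access
ok(incidentKey, 'Key column missing from JSON output');
ok(processInstanceKey, "'Process Instance' column missing from JSON output");

// The raw API field names are not present in the serialised rows.
strictEqual(row.processInstanceKey, undefined);
strictEqual(row.state, undefined);
```

The guards matter because a silently undefined key would otherwise be passed on as a CLI argument and only show up as an unrelated failure further down.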

The function-level test section (listProcessDefinitions({}), listIncidents({}), etc.) was also removed — those tests called internal functions directly, which violates the project guideline to use only CLI commands in integration tests, and they were fragile due to global logger-mode state.

process.exit mock-with-throw — removed from both files:

  • tests/unit/completion.test.ts: the two error-case tests (handles missing shell argument, handles unknown shell) now use spawnSync to run the CLI as a subprocess and assert on result.status, result.stderr, and result.stdout
  • tests/integration/profile-switching.test.ts: invalid profile causes connection error now uses spawnSync to switch profile and run list pi, asserting exit code 1 and a connection-failure message in stderr
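
The subprocess pattern used in both files can be sketched like this. It is an illustration only: a child Node process that prints an error and exits with code 1 stands in for the real CLI invocation, and the error text is a placeholder rather than the actual c8ctl message.

```javascript
const { spawnSync } = require('node:child_process');
const { strictEqual, match } = require('node:assert');

// Stand-in for the CLI: report a failure on stderr and exit with code 1.
const failingCli = "console.error('connection failed'); process.exit(1);";

const result = spawnSync(process.execPath, ['-e', failingCli], {
  encoding: 'utf8', // return stdout/stderr as strings instead of Buffers
});

// Assert on the real exit code and streams; nothing in this process is mocked.
strictEqual(result.status, 1);
match(result.stderr, /connection failed/);
strictEqual(result.stdout, '');
```

Because the exit happens in the child, the test process never has its own process.exit touched, which is what makes this safer than a throwing stub.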
