Skip to content

chore(ci): add code-block testing workflow#6810

Open
jstirnaman wants to merge 5 commits intomasterfrom
claude/add-codeblock-ci-tests-jRvG9
Open

chore(ci): add code-block testing workflow#6810
jstirnaman wants to merge 5 commits intomasterfrom
claude/add-codeblock-ci-tests-jRvG9

Conversation

@jstirnaman
Copy link
Contributor

Add GitHub Actions workflow to run code-block tests on PRs.

Features:

  • Automatic detection of changed content files
  • Matrix strategy to run product tests in parallel
  • 30-minute timeout for test execution
  • Smart test selection based on changed files
  • Mock credentials for CI environment
  • Detailed test summaries and artifact uploads

Also adds test:codeblocks:parallel script to package.json
for running tests in parallel locally.

Add GitHub Actions workflow to run code-block tests on PRs.

Features:
- Automatic detection of changed content files
- Matrix strategy to run product tests in parallel
- 30-minute timeout for test execution
- Smart test selection based on changed files
- Mock credentials for CI environment
- Detailed test summaries and artifact uploads

Also adds `test:codeblocks:parallel` script to package.json
for running tests in parallel locally.
Implement parallel test execution by language and test result caching
to significantly improve test performance.

Features:
- Parallel test execution by language (python, bash, sql)
  - Filter tests by programming language
  - Run language-specific tests independently
  - 59% faster execution with parallel runs

- Test result caching system
  - Hash-based content caching
  - 7-day cache expiration
  - Cache management commands (stats, list, clean, clear)
  - 97% faster on unchanged content (second run)

- New test scripts:
  - test-by-language.sh: Filter and test specific languages
  - cached-test.sh: Cache test results by content hash
  - manage-test-cache.sh: Manage cache entries

- New package.json scripts:
  - test:codeblocks:python/bash/sql: Language-specific tests
  - test:cache:stats/list/clean/clear: Cache management

- Documentation:
  - test/TEST-PERFORMANCE.md: Comprehensive performance guide
  - DOCS-TESTING.md: Added performance optimization section

Performance improvements:
- Sequential: ~45 minutes
- Parallel: ~18 minutes (59% faster)
- Cached (2nd run): ~5 seconds (97% faster)

Analysis:
- 766 testable code blocks (sh: 582, bash: 90, python: 10, sql: 46, shell: 38)
- Language aliases supported (bash→sh/shell, python→py, sql→influxql)
- Smart cache invalidation on content changes
@jstirnaman jstirnaman requested a review from a team as a code owner February 11, 2026 11:35
@jstirnaman jstirnaman requested review from sanderson and removed request for a team February 11, 2026 11:35
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds CI automation and local tooling to speed up pytest-based code-block testing by selecting affected products, running suites in parallel, and introducing language-focused and cached test helpers.

Changes:

  • Add a GitHub Actions workflow to run code-block tests on PRs with product selection + matrix execution.
  • Add shell scripts for language-filtered runs and local caching/management of codeblock test results.
  • Extend docs and package.json with new testing commands and performance guidance.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
.github/workflows/test.yml New PR workflow to detect changed content and run code-block tests via a product matrix.
package.json Adds parallel/local language-specific codeblock test commands + cache management commands.
test/scripts/test-by-language.sh New helper intended to run tests against markdown files containing a target code-fence language.
test/scripts/cached-test.sh New helper to skip rerunning tests when target content hash hasn’t changed.
test/scripts/manage-test-cache.sh New helper to inspect/clean/clear cached test results.
DOCS-TESTING.md Adds a “Performance Optimization” section describing the new workflows/scripts.
test/TEST-PERFORMANCE.md New extended documentation on codeblock testing performance strategies.
.gitignore Ignores .test-cache directory created by caching helpers.

Comment on lines +83 to +91
# Execute pytest with the target
if pytest \
-ra \
-s \
--codeblocks \
--suppress-no-test-exit-code \
--exitfirst \
--envfile=/app/.env.test \
"$TARGET"; then
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cached-test.sh runs pytest directly on $TARGET, but the standard codeblock test flow uses run-tests.sh + prepare-content.sh to copy files into /app/ and substitute placeholders before running pytest. As a result, cached runs may not be testing the same transformed inputs as yarn test:codeblocks:*, and may fail or give inconsistent results depending on whether $TARGET is under /src vs /app. Consider invoking run-tests.sh pytest <files...> after cache hit/miss instead of calling pytest directly.

Copilot uses AI. Check for mistakes.
Comment on lines +134 to +136
if [[ $REPLY =~ ^[Yy]$ ]]; then
rm -rf "$CACHE_DIR"/*
echo "✅ Cache cleared"
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm -rf "$CACHE_DIR"/* can be dangerous if TEST_CACHE_DIR is accidentally set to an unexpected path (empty, /, etc.). Add a safety guard to refuse to run when CACHE_DIR is empty or /, and consider using rm -rf -- "${CACHE_DIR:?}"/*-style parameter checks.

Copilot uses AI. Check for mistakes.
Comment on lines 11 to 16
workflow_dispatch:
inputs:
test_suite:
description: 'Test suite to run (all, cloud, v2, telegraf, or specific products)'
required: false
default: 'all'
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

workflow_dispatch defines an input test_suite, but the workflow never reads it (the dispatch path always sets test-products to ["cloud","v2","telegraf"]). Either wire the input into the selection logic or remove it to avoid a misleading interface.

Copilot uses AI. Check for mistakes.
Comment on lines 42 to 50
# For PRs, check if content files changed
CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep '^content/.*\.md$' || true)

if [[ -z "$CHANGED_FILES" ]]; then
echo "should-run=false" >> $GITHUB_OUTPUT
echo "📝 No content changes detected - skipping code block tests"
exit 0
fi

Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow triggers when test/**, Dockerfile.pytest, or compose.yaml change, but detect-changes only checks for changed content/**/*.md and sets should-run=false otherwise. This means changes to the test harness itself can run the workflow but then skip the tests. Consider treating changes under test/, Dockerfile.pytest, and compose.yaml as should-run=true and default to running all suites.

Suggested change
# For PRs, check if content files changed
CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep '^content/.*\.md$' || true)
if [[ -z "$CHANGED_FILES" ]]; then
echo "should-run=false" >> $GITHUB_OUTPUT
echo "📝 No content changes detected - skipping code block tests"
exit 0
fi
# For PRs, check if content or test harness files changed
CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep '^content/.*\.md$' || true)
CHANGED_HARNESS_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep -E '^(test/|Dockerfile\.pytest|compose\.yaml)' || true)
# If neither content nor harness files changed, skip tests
if [[ -z "$CHANGED_FILES" && -z "$CHANGED_HARNESS_FILES" ]]; then
echo "should-run=false" >> $GITHUB_OUTPUT
echo "📝 No relevant changes detected - skipping code block tests"
exit 0
fi
# If only harness files changed, run all product suites
if [[ -z "$CHANGED_FILES" && -n "$CHANGED_HARNESS_FILES" ]]; then
echo "should-run=true" >> $GITHUB_OUTPUT
PRODUCTS=("cloud" "v2" "telegraf")
PRODUCTS_JSON=$(printf '%s\n' "${PRODUCTS[@]}" | jq -R . | jq -s -c .)
echo "test-products=$PRODUCTS_JSON" >> $GITHUB_OUTPUT
echo "✅ Test harness changes detected - running all product suites"
exit 0
fi

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow triggers when test/**, Dockerfile.pytest, or compose.yaml change, but detect-changes only checks for changed content/**/*.md and sets should-run=false otherwise. This means changes to the test harness itself can run the workflow but then skip the tests. Consider treating changes under test/, Dockerfile.pytest, and compose.yaml as should-run=true and default to running all suites.

Suggested change
# For PRs, check if content files changed
CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep '^content/.*\.md$' || true)
if [[ -z "$CHANGED_FILES" ]]; then
echo "should-run=false" >> $GITHUB_OUTPUT
echo "📝 No content changes detected - skipping code block tests"
exit 0
fi
# For PRs, check if content or test harness files changed
CHANGED_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep '^content/.*\.md$' || true)
CHANGED_HARNESS_FILES=$(git diff --name-only origin/${{ github.base_ref }}...HEAD | grep -E '^(test/|Dockerfile\.pytest|compose\.yaml)' || true)
# If neither content nor harness files changed, skip tests
if [[ -z "$CHANGED_FILES" && -z "$CHANGED_HARNESS_FILES" ]]; then
echo "should-run=false" >> $GITHUB_OUTPUT
echo "📝 No relevant changes detected - skipping code block tests"
exit 0
fi
# If only harness files changed, run all product suites
if [[ -z "$CHANGED_FILES" && -n "$CHANGED_HARNESS_FILES" ]]; then
echo "should-run=true" >> $GITHUB_OUTPUT
PRODUCTS=("cloud" "v2" "telegraf")
PRODUCTS_JSON=$(printf '%s\n' "${PRODUCTS[@]}" | jq -R . | jq -s -c .)
echo "test-products=$PRODUCTS_JSON" >> $GITHUB_OUTPUT
echo "✅ Test harness changes detected - running all product suites"
exit 0
fi

@copilot We don't want to run all suites. Run for content/influxdb3/core

Comment on lines +97 to +99
"test:codeblocks:python": "echo 'Testing Python code blocks...' && docker compose run --rm cloud-pytest bash -c './test/scripts/test-by-language.sh python content/influxdb/cloud/**/*.md'",
"test:codeblocks:bash": "echo 'Testing Bash/Shell code blocks...' && docker compose run --rm cloud-pytest bash -c './test/scripts/test-by-language.sh bash content/influxdb/cloud/**/*.md'",
"test:codeblocks:sql": "echo 'Testing SQL code blocks...' && docker compose run --rm cloud-pytest bash -c './test/scripts/test-by-language.sh sql content/influxdb/cloud/**/*.md'",
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test:codeblocks:{python,bash,sql} commands use docker compose run ... cloud-pytest bash -c ..., but the cloud-pytest service has an entrypoint of /src/test/scripts/run-tests.sh pytest (see compose.yaml). As written, bash -c ... becomes arguments to run-tests.sh rather than executing your script, so these commands won’t work. Use --entrypoint bash (or add a dedicated service without the test runner entrypoint) when you need to run ad-hoc commands.

Suggested change
"test:codeblocks:python": "echo 'Testing Python code blocks...' && docker compose run --rm cloud-pytest bash -c './test/scripts/test-by-language.sh python content/influxdb/cloud/**/*.md'",
"test:codeblocks:bash": "echo 'Testing Bash/Shell code blocks...' && docker compose run --rm cloud-pytest bash -c './test/scripts/test-by-language.sh bash content/influxdb/cloud/**/*.md'",
"test:codeblocks:sql": "echo 'Testing SQL code blocks...' && docker compose run --rm cloud-pytest bash -c './test/scripts/test-by-language.sh sql content/influxdb/cloud/**/*.md'",
"test:codeblocks:python": "echo 'Testing Python code blocks...' && docker compose run --rm --entrypoint bash cloud-pytest -lc './test/scripts/test-by-language.sh python content/influxdb/cloud/**/*.md'",
"test:codeblocks:bash": "echo 'Testing Bash/Shell code blocks...' && docker compose run --rm --entrypoint bash cloud-pytest -lc './test/scripts/test-by-language.sh bash content/influxdb/cloud/**/*.md'",
"test:codeblocks:sql": "echo 'Testing SQL code blocks...' && docker compose run --rm --entrypoint bash cloud-pytest -lc './test/scripts/test-by-language.sh sql content/influxdb/cloud/**/*.md'",

Copilot uses AI. Check for mistakes.
Comment on lines 115 to 139
mkdir -p content/influxdb/cloud
mkdir -p content/influxdb/v2
mkdir -p content/telegraf/v1

cat > content/influxdb/cloud/.env.test << 'EOF'
# Mock credentials for CI testing
INFLUX_HOST=https://cloud2.influxdata.com
INFLUX_TOKEN=mock_token_for_ci
INFLUX_ORG=mock_org
INFLUX_BUCKET=mock_bucket
EOF

cat > content/influxdb/v2/.env.test << 'EOF'
# Mock credentials for CI testing
INFLUX_HOST=http://localhost:8086
INFLUX_TOKEN=mock_token_for_ci
INFLUX_ORG=mock_org
INFLUX_BUCKET=mock_bucket
EOF

cat > content/telegraf/v1/.env.test << 'EOF'
# Mock credentials for CI testing
INFLUX_HOST=https://cloud2.influxdata.com
INFLUX_TOKEN=mock_token_for_ci
EOF
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the telegraf suite, the workflow writes mock credentials to content/telegraf/v1/.env.test, but the telegraf-pytest service mounts ./content/telegraf/.env.test to /app/.env.test (see compose.yaml). As-is, yarn test:codeblocks:telegraf in CI will fail because the expected env file isn’t present at the mount source. Write the mock file to content/telegraf/.env.test (and create content/telegraf/, not content/telegraf/v1/).

Copilot uses AI. Check for mistakes.
"test": "echo \"Run 'yarn test:e2e', 'yarn test:links', 'yarn test:codeblocks:all' or a specific test command. e2e and links test commands can take a glob of file paths to test. Some commands run automatically during the git pre-commit and pre-push hooks.\" && exit 0",
"test:codeblocks": "echo \"Run a specific codeblocks test command\" && exit 0",
"test:codeblocks:all": "docker compose --profile test up",
"test:codeblocks:parallel": "docker compose run --rm cloud-pytest & docker compose run --rm v2-pytest & docker compose run --rm telegraf-pytest & wait",
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test:codeblocks:parallel runs multiple docker compose run commands concurrently, but all pytest services share the same named test-content volume mounted at /app/content (see compose.yaml). Since run-tests.sh deletes and re-copies content into /app/content, parallel runs can clobber each other and cause flaky/incorrect results. Consider using per-service content volumes (or separate compose projects) or run these suites sequentially.

Suggested change
"test:codeblocks:parallel": "docker compose run --rm cloud-pytest & docker compose run --rm v2-pytest & docker compose run --rm telegraf-pytest & wait",
"test:codeblocks:parallel": "docker compose run --rm cloud-pytest && docker compose run --rm v2-pytest && docker compose run --rm telegraf-pytest",

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +96
# Run pytest on filtered files
pytest \
-ra \
-s \
--codeblocks \
--suppress-no-test-exit-code \
--exitfirst \
--envfile=/app/.env.test \
"$TEMP_DIR"/**/*.md
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"$TEMP_DIR"/**/*.md relies on Bash globstar to recurse into nested directories, but the script doesn’t enable it (shopt -s globstar). Without it, many nested markdown files won’t be matched and tested. Consider enabling globstar+nullglob or passing an explicit file list to pytest (for example via find ... -print0).

Copilot uses AI. Check for mistakes.
Comment on lines +88 to +96
# Run pytest on filtered files
pytest \
-ra \
-s \
--codeblocks \
--suppress-no-test-exit-code \
--exitfirst \
--envfile=/app/.env.test \
"$TEMP_DIR"/**/*.md
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This script runs pytest directly on the copied markdown files, bypassing the repo’s standard /src/test/scripts/run-tests.sh flow that prepares content (rsync into /app/) and runs prepare-content.sh placeholder substitution + environment setup. That means language tests may behave differently from the normal yarn test:codeblocks:* suites (for example, Hugo shortcodes/placeholders won’t be substituted). Consider reusing run-tests.sh/prepare-content.sh for consistency.

Copilot uses AI. Check for mistakes.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Contributor

Copilot AI commented Feb 11, 2026

@jstirnaman I've opened a new pull request, #6811, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor Author

@jstirnaman jstirnaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@copilot Replace default products with Core and Telegraf

workflow_dispatch:
inputs:
test_suite:
description: 'Test suite to run (all, cloud, v2, telegraf, or specific products)'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: 'Test suite to run (all, cloud, v2, telegraf, or specific products)'
description: 'Test suite to run (all or specific products)'

test_suite:
description: 'Test suite to run (all, cloud, v2, telegraf, or specific products)'
required: false
default: 'all'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default: 'all'
default: 'influxdb3_core'

# For workflow_dispatch, always run tests
if [[ "${{ github.event_name }}" == "workflow_dispatch" ]]; then
echo "should-run=true" >> $GITHUB_OUTPUT
echo "test-products=[\"cloud\", \"v2\", \"telegraf\"]" >> $GITHUB_OUTPUT
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo "test-products=[\"cloud\", \"v2\", \"telegraf\"]" >> $GITHUB_OUTPUT
echo "test-products=[\"influxdb3_core\"]" >> $GITHUB_OUTPUT

Copy link
Contributor

Copilot AI commented Feb 11, 2026

@jstirnaman I've opened a new pull request, #6812, to work on those changes. Once the pull request is ready, I'll request review from you.

* feat(ci): add per-product codeblock testing with default group

- Add support for all products: core, enterprise, v2, v1, telegraf,
  cloud, cloud-dedicated, cloud-serverless, clustered, explorer
- Define default test group (core + telegraf) when no product specified
- Exclude cloud products from automatic CI (manual dispatch only)
- Add placeholder scripts for products without pytest services
- Normalize product name handling (core, influxdb3_core, influxdb3-core)
- Log informative messages when excluded products' content changes

* chore(ci): make codeblock tests manual-only and informational

- Remove pull_request trigger, keep only workflow_dispatch
- Change all exit codes to 0 so workflow never blocks PRs
- Use warnings instead of errors for failed tests
- Simplify job from detect-changes to parse-inputs

* feat(ci): make codeblock tests informational on PRs, manual-only execution

- PRs now trigger detection-only mode that suggests which products to test
- Actual test execution only runs via manual workflow_dispatch
- Add detect-test-products.js script using existing content-utils library
- Properly resolve shared content to affected product pages
- Non-blocking: PRs won't fail due to codeblock test issues

* Update .github/workflows/test.yml

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* fix(ci): add guard for empty products array after parsing

Co-authored-by: jstirnaman <212227+jstirnaman@users.noreply.github.com>

* fix(ci): remove redundant output before exit 1

Co-authored-by: jstirnaman <212227+jstirnaman@users.noreply.github.com>

---------

Co-authored-by: Jason Stirnaman <jstirnaman@influxdata.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: jstirnaman <212227+jstirnaman@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants