Skip to content

Refactor GitHub Actions workflows for build, documentation, and linting#2260

Open
Odyno wants to merge 1 commit into1technophile:developmentfrom
Odyno:feature/workflow-reorganizzation
Open

Refactor GitHub Actions workflows for build, documentation, and linting#2260
Odyno wants to merge 1 commit into1technophile:developmentfrom
Odyno:feature/workflow-reorganizzation

Conversation

@Odyno
Copy link
Contributor

@Odyno Odyno commented Dec 27, 2025

Description:

Based on the discussion on https://community.openmqttgateway.com/t/holiday-questions-d/4345, this PR refactors the CI/CD infrastructure to improve maintainability, testability, and local development experience while maintaining complete
feature parity with the existing pipeline.

Problem Statement

The current CI/CD setup creates friction for developers:

  • GitHub Actions workflows contain duplicated build/test logic
  • CI scripts cannot be reliably executed locally
  • Environment configurations scattered across multiple files
  • Documentation generation tightly coupled to GitHub Actions

Solution

Platform-agnostic CI/CD system with two complementary layers:

1. GitHub Actions Workflows (Reorganized)

  • Reusable workflow tasks: task-build.yml, task-docs.yml, task-lint.yml
  • Centralized environment metadata: environments.json
  • Updated main workflows: build.yml, build_and_docs_to_dev.yml, release.yml
  • Eliminates ~500 lines of duplicated YAML

2. Local-First Bash CI Scripts

  • Single dispatcher: ci.sh - Main entry point for all CI operations
  • Modular composable scripts:
    • ci_build.sh - Firmware compilation pipeline
    • ci_site.sh - Documentation generation with version management
    • ci_qa.sh - Code linting and formatting
    • ci_set_version.sh - Automated version tagging
    • ci_prepare_artifacts.sh - Artifact packaging
  • Centralized configuration: ci_00_config.sh (DRY principle)
  • Identical behavior: Same script works locally and in GitHub Actions

3. Enhanced Documentation Development Experience

The site generation is now fully self-contained and locally developa​ble using only Node.js and VuePress:

  • Pure Node.js toolchain
  • Runs site locally, preview in https and live manifest creation.
  • Some pages refactored with:
    • Landing page (docs/participate/README.md) with cross-linked navigation
    • Improved contribution workflows (development.md, quick_start.md)
    • Community engagement guides (community.md, support.md, adding-protocols.md)
    • Built-in examples and troubleshooting
  • Local version management - Development manifest generated locally without modifying Git-tracked files

Key Benefits

Aspect Before After
Local Testing ❌ Not possible ✅ Native ./ci.sh
Code Duplication 📈 ~500 lines YAML 📉 Reusable tasks
CI Portability 🔒 GitHub-only 🚀 Platform-agnostic
Debugging 🔍 Push→wait→check logs ✨ Run locally first
Maintenance 😰 Scattered logic 🎯 Single source of truth

Documentation

Complete reference materials included:

Future Work

Security hardening planned for follow-up PR:

  • SBOM (Software Bill of Materials) generation
  • Trivy security scanning
  • Dependency vulnerability reporting

Tested

  • Documentation generation validated
  • Local execution tested on Linux/macOS

Checklist

  • Pull request is done against the latest development branch
  • Only one feature/fix was added per PR and the code change compiles without warnings
  • I accept the DCO

@Odyno Odyno force-pushed the feature/workflow-reorganizzation branch from d34290f to f8779e1 Compare December 27, 2025 15:40
@Odyno Odyno marked this pull request as draft December 29, 2025 18:08
@Odyno Odyno force-pushed the feature/workflow-reorganizzation branch from 1134284 to 27c998b Compare December 29, 2025 19:36
@Odyno Odyno marked this pull request as ready for review December 29, 2025 19:37
@Odyno Odyno marked this pull request as draft December 29, 2025 22:47
"sonoff-basic",
"sonoff-basic-rfr3"
],
"metadata": {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is not used by the code I would remove the metadata, they are currently out of sync (the count does not match the qty of environments (83 in metadata 82 in the list)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree, it was one experiment not cleaned.

}

# Function to list available environments
list_environments() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

environment listing should come from environments.json and not the ini I suspect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me create a double option... one from environment.json and the other from real platform.ini... so we can use it in all cases and not only in CI

log_info "Setting PRODUCTION version: $version"

replace_version "$USER_CONFIG" "$version" || return 1
replace_version "$VERSION_JSON" "$version" || return 1
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VERSION_JSON does not look defined somewhere

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted this file in the end because I realized it was only useful for the website, and it made a small change to the code… a change that cannot be reverted and works only one time. This way of doing things is an anti‑pattern, because it stops local development of this pipeline and makes the code messy. I changed this part so the functionality still works, but without modifying files that are tracked by Git. I tried to work on this and in latest_version.json, latest_version_dev.json, board.md, web-install.md, and config.js. Please check out the proposal

log_info "Setting DEVELOPMENT version: $version"

replace_version "$USER_CONFIG" "$version" || return 1
replace_version "$VERSION_DEV_JSON" "$version" || return 1
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VERSION_DEV_JSON does not look defined somewhere

# Used by: All build scripts for centralized configuration

# Python Configuration
PYTHON_VERSION="3.13"
Copy link
Owner

@1technophile 1technophile Jan 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need all those variables ? If they are not used I would suggest to remove them.
They can mislead the developer in thinking that we have a centralized place for them but it is not the case

@@ -4,13 +4,9 @@ on: [push, pull_request]

jobs:
Copy link
Owner

@1technophile 1technophile Jan 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we centralize this in task lint and remove this file ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! I left it just because it was present

@Odyno
Copy link
Contributor Author

Odyno commented Jan 3, 2026

I'm very happy that you reviewed this MR!

Yes, in those 3 days I have done a small review of all the code of this MR (for this reason it is in draft) because I also tested the site and I discovered some errors.

Really happy to include the comment on the next commit we're I would like to provide a more bit steps to test the site locally ( do not worry always feature parity + add the ability to run all in localhost, also the generation of manifest of local develop version ...)

Hope you have another time to follow the update! Time to test

@Odyno Odyno force-pushed the feature/workflow-reorganizzation branch 2 times, most recently from 86af2df to aaee4e2 Compare January 11, 2026 23:55
@Odyno Odyno requested a review from 1technophile January 12, 2026 00:17
@Odyno Odyno marked this pull request as ready for review January 12, 2026 00:17
@Odyno
Copy link
Contributor Author

Odyno commented Jan 12, 2026

@1technophile (and @h2zero ) I have completed the full review and included all the comments. I hope you appreciate this work. While I understand it's not directly related to the core project, I believe it will help the community to try and test these features, which could be a significant boost for the open-source aspects of the project.

This PR is ready from my side. I have thoroughly tested every part of it, with the exception of the GitHub-specific repository options, which are beyond the scope of this implementation.

@Odyno Odyno force-pushed the feature/workflow-reorganizzation branch 2 times, most recently from 821086f to 36c5f0a Compare January 12, 2026 00:50
@1technophile
Copy link
Owner

@claude can you review this PR

@claude
Copy link

claude bot commented Jan 12, 2026

Claude encountered an error —— View job

Command failed: git fetch origin --depth=20 feature/workflow-reorganizzation

I'll analyze this and get back to you.

@1technophile
Copy link
Owner

PR #2260 Review: CI/CD Infrastructure Refactoring

Overview

Author: Odyno
Changes: 6,051 additions, 1,106 deletions across 53 files
Scope: Complete CI/CD infrastructure refactoring with bash scripts, modular workflows, and documentation overhaul


Strengths

1. Architecture Design

  • Excellent modularization with reusable workflow tasks (task-build.yml, task-docs.yml, task-lint.yml)
  • DRY principle applied - eliminates ~500 lines of duplicated YAML
  • Clear separation of concerns between main workflows and task workflows
  • Comprehensive documentation in .github/workflows/README.md with mermaid diagrams

2. Local Development Experience

  • Major improvement: CI scripts can now run locally via ./ci.sh
  • Unified interface for all CI operations (build, docs, lint, QA)
  • Platform-agnostic design works on Linux/macOS
  • Addresses a real pain point mentioned in the community forum

3. Documentation Quality

  • Exceptional documentation with 652-line workflow README
  • New contributor guide (docs/participate/quick_start.md) - 840 lines
  • CI scripts reference (scripts/CI_SCRIPTS.md) - 574 lines
  • Clear examples, troubleshooting, and architecture diagrams

4. Code Organization

  • Centralized configuration (ci_00_config.sh) reduces duplication
  • Modular script structure makes each component testable
  • environments.json provides single source of truth for build matrix

⚠️ Concerns & Risks

1. Migration from Python to JavaScript (High Risk)

DELETED: scripts/common_wu.py (114 lines)
DELETED: scripts/gen_wu.py (147 lines)
DELETED: scripts/generate_board_docs.py (88 lines)

ADDED: docsgen/common_wu.js (132 lines)
ADDED: docsgen/gen_wu.js (324 lines)
ADDED: docsgen/generate_board_docs.js (225 lines)

Concerns:

  • Complete rewrite with no direct migration testing mentioned
  • Different runtime behavior between Python and Node.js
  • No side-by-side comparison to verify functional equivalence
  • Potential for subtle bugs in manifest generation logic

Recommendation: Need test cases comparing old vs new output for identical inputs

2. Bash Script Complexity (Medium Risk)

The new bash scripts are extensive:

  • ci_build.sh: 336 lines
  • ci_build_firmware.sh: 289 lines
  • ci_prepare_artifacts.sh: 285 lines
  • ci_qa.sh: 409 lines
  • ci_site.sh: 362 lines

Concerns:

  • Bash is notoriously difficult to maintain at scale
  • Error handling in bash is tricky
  • Cross-platform compatibility (macOS/Linux differences)
  • No automated tests for the bash scripts themselves

Recommendations:

  • Add shellcheck linting to CI
  • Consider unit tests for critical functions
  • Document required bash version and dependencies

3. Testing Coverage (High Priority)

Missing:

  • No automated tests for the new bash CI scripts
  • No integration tests for workflow changes
  • No rollback plan if issues arise in production
  • "Tested locally on Linux/macOS" is insufficient for such extensive changes

What's needed:

  • Test matrix for different scenarios
  • Verification that all 83 environments still build correctly
  • Documentation generation comparison (old vs new)
  • Rollback procedure documentation

4. Security Considerations (Medium Risk)

# From ci.sh and other scripts:
- Direct file operations without input validation
- Path construction using user inputs
- No mention of secret handling in new scripts

Concerns:

  • CI scripts handle sensitive operations (file creation, artifact preparation)
  • No security review mentioned in PR description
  • SBOM and Trivy scanning deferred to "future work"

Recommendation: Security audit before merge, especially for artifact preparation scripts

5. Code Style Issues (Low Priority)

Looking at the diff preview:

# In main/*.cpp files - formatting changes mixed with infrastructure:
main/User_config.h: 1 addition, 1 deletion
main/blufi.cpp: 4 additions, 4 deletions
main/gatewayBT.cpp: 2 additions, 2 deletions

Concern: Unrelated formatting changes in firmware files should be in separate PR


🔍 Specific File Concerns

.github/workflows/build.yml

- Changed from inline build logic to workflow_call
- Reduced from 120 lines to 47 lines ✅

Good: Cleaner, more maintainable
Risk: Ensure all original build options preserved

scripts/ci_site.sh (362 lines)

This script handles documentation generation - critical path for releases.

Questions:

  • Does it handle version management correctly for both dev and release?
  • What happens if manifest generation fails mid-process?
  • Are temporary files cleaned up on error?

docsgen/*.js (New files)

Complete rewrite of documentation generation:

  • gen_wu.js: 324 lines of manifest generation logic
  • generate_board_docs.js: 225 lines

Critical: This needs extensive testing to ensure:

  • Correct manifest.json generation
  • Proper version tracking
  • No breaking changes to existing deployment process

📋 Recommendations Before Merge

High Priority ✋

  1. Add automated tests for at least the critical bash functions
  2. Run full CI/CD pipeline on a test branch with both old and new systems
  3. Compare outputs of old vs new documentation generation
  4. Document rollback procedure in case issues arise post-merge
  5. Separate firmware formatting changes into different PR

Medium Priority ⚡

  1. Add shellcheck to lint bash scripts
  2. Security review of artifact preparation scripts
  3. Version pin for new Node.js dependencies
  4. Test cross-platform compatibility more thoroughly
  5. Add error handling tests for edge cases

Low Priority 📝

  1. Consider adding bash script unit tests (using bats or similar)
  2. Document minimum bash version requirements
  3. Create migration guide for contributors familiar with old system

💬 Questions for Author

  1. Testing Strategy: How were the Python→JavaScript changes verified for equivalence?
  2. Rollback Plan: If this causes issues in production, what's the rollback strategy?
  3. Breaking Changes: Are there any behavior changes users should know about?
  4. Performance: How do the new bash scripts compare to old Python scripts in execution time?
  5. Dependencies: What are the complete Node.js dependency requirements now?

🎯 Overall Assessment

Architecture: 9/10 - Excellent design, well thought out
Documentation: 10/10 - Outstanding, comprehensive
Testing: 3/10 - Insufficient for changes of this magnitude
Risk Management: 4/10 - No rollback plan, limited validation mentioned
Code Quality: 7/10 - Good structure, but bash at scale is concerning

Recommendation: ⚠️ REQUEST CHANGES

This PR represents excellent architectural work with outstanding documentation, but the lack of comprehensive testing and migration validation makes it too risky to merge as-is for such a critical system component.

Suggested Path Forward:

  1. Add at least basic automated tests
  2. Run side-by-side comparison of old vs new documentation generation
  3. Test full release pipeline on test branch
  4. Document rollback procedure
  5. Then merge with confidence

🤖 Review generated with Claude Code

@Odyno
Copy link
Contributor Author

Odyno commented Jan 12, 2026

That's true, shellcheck and nodjs check will be part of next PR #2263
As for the js and sh tests: it's a bit strange to request a test for something that manages tests (the chicken and egg paradox)... and in any case, all the test cases/structures already implemented on #2252 relating to platform.io can at most be extended to the shell as well.

Copy link
Collaborator

@h2zero h2zero left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this, just a couple remarks.

2. Make your changes and follow the naming rules:
* New gateway: `ZgatewayXXX` where `XXX` is the protocol name.
* New sensor: `ZsensorYYY` where `YYY` is the sensor type.
* New actuator: `ZactuatorZZZ` where `ZZZ` is the actuator type.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we're here lets remove the "Z" prefix from these as that no longer is used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, but at the moment it is used in site generation, in board page generation for retrieving information about the modules present in the environment.

@@ -1,4 +1,5 @@

# Replaces BLE library with custom version during PlatformIO build
# Used by: Currently unused (utility script for BLE library replacement)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can delete this now as it is quite out of date.

@h2zero
Copy link
Collaborator

h2zero commented Jan 22, 2026

Details

Tested this today and encountered this error:

Generate Web Upload in dev mode
Error setting up environment: Missing firmware artifacts in generated/artifacts/firmware_build. Run "ci.sh build ..." first to populate this folder.
[ERROR] Documentation build failed

After running:

./scripts/ci.sh site --mode dev --url-prefix /dev/ --version edge --preview

It looks like the folder location of the firmware bins is incorrect as they are actually found in ./generated/site/firmware_build

Please disregard, I missed the ./scripts/ci_prepare_artifacts.sh step after building.

@Odyno
Copy link
Contributor Author

Odyno commented Jan 22, 2026

Details

Please disregard, I missed the ./scripts/ci_prepare_artifacts.sh step after building.

or use the option --deploy-ready on the phase ci.sh build ....

- Consolidated build logic into reusable workflows (`task-build.yml` and `task-docs.yml`) to reduce duplication across multiple workflows.
- Introduced `environments.json` to centralize the list of PlatformIO build environments, improving maintainability and clarity.
- Updated `build.yml` and `build_and_docs_to_dev.yml` to utilize the new reusable workflows and environment definitions.
- Enhanced `release.yml` to streamline the release process and integrate documentation generation.
- Created reusable linting workflow (`task-lint.yml`) to standardize code formatting checks across the repository.
- Simplified manual documentation workflow by leveraging the new reusable documentation workflow.
- Improved artifact management and retention policies across workflows.
- Updated dependencies and versions in workflows to ensure compatibility and performance.

CI/CD pipeline agnostic of Workflow Engine and integrated on github actions

- Implemented ci.sh for orchestrating the complete build pipeline.
- Created ci_00_config.sh for centralized configuration of build scripts.
- Created ci_build_firmware.sh for building firmware for specified PlatformIO environments.
- Created ci_prepare_artifacts.sh for preparing firmware artifacts for upload or deployment.
- Created ci_set_version.sh for updating version tags in firmware configuration files.
- Created ci_build.sh to orchestrate the complete build pipeline.
- Created ci_qa.sh for code linting and formatting checks using clang-format.
- Created ci_site.sh for building and deploying VuePress documentation with version management.
- Implemented checks for required tools and dependencies in the new scripts.
- Improved internal scripts for better error handling and logging.

UPDATE the web installer manifest generation and update documentation structure
- Enhanced ci_list-env.sh to list environments from a JSON file.
- Replaced  common_wu.py and gen_wu.py scripts with new npm scripts for site generation and previewing on docsgen/gen_wu.js
- Replaced  generate_board_docs.py with docsgen/generated_board_docs.js
- Added new npm scripts for integration of site generation on build phase.
- Created preview_site.js to serve locally generated site over HTTPS with improved error handling.
- Added new CI environments for CI builds in environments.json.
- Deleted lint.yml as part of workflow cleanup.
- Enhanced task-build.yml to include linting as a job and added support for specifying PlatformIO version.
- Improved task-docs.yml to handle versioning more effectively and added clean option.

Enhance documentation
- ADD CLEAR Mark of development version of site
- Updated README.md to include detailed workflow dependencies and relationships using mermaid diagrams.
- Improved development.md with a quick checklist for contributors and clarified the code style guide.
- Enhanced quick_start.md with tips for contributors and streamlined the workflow explanation.

LINT FIX
- Refined User_config.h for better formatting consistency.
- Adjusted blufi.cpp and gatewayBT.cpp for improved code readability and consistency in formatting.
- Updated gatewaySERIAL.cpp and mqttDiscovery.cpp to enhance logging error messages.
- Improved sensorDS1820.cpp for better logging of device information.
@Odyno Odyno force-pushed the feature/workflow-reorganizzation branch from 36c5f0a to 3ffbc94 Compare January 27, 2026 20:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants