
chore: Automate data model generation from upcoming CycloneDX 2.0 machine-readable specification #1417

@jkowalleck

Description


The current data models in this library are primarily handwritten and manually maintained. While functional, this approach introduces significant overhead in terms of development effort, code review, and long-term maintenance.

With the introduction of CycloneDX 2.0, the specification is evolving toward a modularized, machine-readable schema architecture. This creates an opportunity to fundamentally shift how data models are produced in this project: from manual implementation to schema-driven code generation.

This issue proposes aligning the library with that direction by adopting automated model generation based on the CycloneDX 2.0 modular JSON schemas.

Reference (work in progress):


Problem

  • Data models are manually implemented and maintained
  • High maintenance and review overhead
  • Repetitive and error-prone contributor work
  • Slower adoption of specification updates
  • Risk of inconsistencies with the official specification

Proposal

Introduce a schema-driven model generation pipeline based on the CycloneDX 2.0 modular JSON schemas.

Key aspects:

  • Use the official modular schema definitions as the single source of truth
  • Automatically generate JavaScript/TypeScript data models from these schemas
  • Support the modular structure (i.e., multiple interdependent schema files)
  • Integrate generation into the build and/or release workflow
  • Ensure generated models remain aligned with upstream specification changes

This would replace most handwritten models with deterministically generated code, reducing manual effort and improving consistency.
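Because the generated code is meant to be deterministic, a continuous-integration step could regenerate the models and compare them with what is checked in. The following is a minimal sketch of such a drift check, not an existing part of this library. The `GeneratedFiles` type and the `findDrift` function are hypothetical names, and reading the files from disk is left out on purpose.

```typescript
// Hypothetical shape: maps a file path to that file's content.
type GeneratedFiles = Record<string, string>

/**
 * Return the sorted list of paths whose committed content differs from
 * freshly regenerated content. Files that exist on only one side are
 * reported as well, since `undefined !== "some content"`.
 */
function findDrift (committed: GeneratedFiles, regenerated: GeneratedFiles): string[] {
  const allPaths = Object.keys({ ...committed, ...regenerated }).sort()
  return allPaths.filter(path => committed[path] !== regenerated[path])
}
```

An empty result means the committed models still match the generator output; any listed path signals that the schemas or the generator changed without the models being regenerated.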

Existing proof-of-concept implementations have already demonstrated feasibility. These should be revisited and consolidated into a production-ready approach.

Pipeline:

CycloneDX JSON Schema
        ↓
Preprocessing (if needed)
        ↓
Code Generation (tool to be evaluated)
        ↓
Post-processing (formatting, adjustments)
        ↓
Generated TypeScript Models
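
To make the stages above concrete, here is a toy sketch that targets TypeScript, the language named in the proposal. It is not a real generator: `ToySchema` covers only flat objects with primitive properties, and every name in it (`preprocess`, `generate`, `postprocess`, `runPipeline`) is an assumption made for illustration. A production pipeline would delegate the generation stage to one of the tools under evaluation.

```typescript
// Tiny, assumed subset of JSON Schema; real CycloneDX schemas are far richer.
interface ToySchema {
  title: string // assumed to be a valid TypeScript identifier
  type: 'object'
  properties: Record<string, { type: 'string' | 'number' | 'integer' | 'boolean' }>
  required?: string[]
}

/** Preprocessing: sort property names so output does not depend on key order. */
function preprocess (schema: ToySchema): ToySchema {
  const properties: ToySchema['properties'] = {}
  for (const key of Object.keys(schema.properties).sort()) {
    properties[key] = schema.properties[key]
  }
  return { ...schema, properties }
}

/** Code generation: emit one interface with one member per schema property. */
function generate (schema: ToySchema): string {
  const required = schema.required ?? []
  const members = Object.keys(schema.properties).map(name => {
    const jsonType = schema.properties[name].type
    const tsType = jsonType === 'integer' ? 'number' : jsonType
    const optional = required.indexOf(name) >= 0 ? '' : '?'
    // Quote the name so keys such as "bom-ref" stay valid TypeScript.
    return `  ${JSON.stringify(name)}${optional}: ${tsType}`
  })
  return `export interface ${schema.title} {\n${members.join('\n')}\n}`
}

/** Post-processing: add a banner and a trailing newline. */
function postprocess (code: string): string {
  return `// GENERATED FILE, do not edit by hand.\n${code}\n`
}

function runPipeline (schema: ToySchema): string {
  return postprocess(generate(preprocess(schema)))
}
```

Keeping each stage a pure function from input to output makes the whole pipeline reproducible, which matters if the generated files are committed and reviewed.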

Possible Tools / Libraries

The following tools may be evaluated for compatibility with modular JSON Schema processing and TypeScript generation:

  • quicktype (Apache 2.0)
    Generates TypeScript models from JSON Schema and other inputs

  • json-schema-to-typescript (MIT)
    Strong support for JSON Schema → TS interfaces (including unions and enums)

  • typebox (MIT)
    Schema-first approach with runtime validation and static typing

  • ajv + tooling (MIT)
    Useful for schema validation and potential type generation workflows

  • Custom generator layer (optional)
    May be required to properly resolve CycloneDX modular schema composition and cross-references
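
As an illustration of what such a custom layer might have to do, the sketch below resolves a `$ref` across several in-memory schema documents. It is a simplified assumption-laden example: `SchemaRegistry`, `followPointer`, and `resolveRef` are hypothetical names, references are matched by exact file name rather than by full URI resolution, and array indices and percent-encoding in JSON Pointers are not handled.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json }

// Hypothetical registry: schema file name -> parsed schema document.
type SchemaRegistry = Record<string, Json>

/** Follow a JSON Pointer such as "/$defs/hash" through nested objects. */
function followPointer (doc: Json, pointer: string): Json {
  let node: Json = doc
  for (const raw of pointer.split('/').slice(1)) {
    // Undo JSON Pointer escaping: "~1" means "/", "~0" means "~".
    const token = raw.replace(/~1/g, '/').replace(/~0/g, '~')
    if (node === null || typeof node !== 'object' || Array.isArray(node) || !(token in node)) {
      throw new Error(`unresolvable pointer: ${pointer}`)
    }
    node = node[token]
  }
  return node
}

/** Resolve "file#/pointer", or "#/pointer" relative to `currentFile`. */
function resolveRef (registry: SchemaRegistry, currentFile: string, ref: string): Json {
  const hashIndex = ref.indexOf('#')
  const file = hashIndex === -1 ? ref : ref.slice(0, hashIndex)
  const pointer = hashIndex === -1 ? '' : ref.slice(hashIndex + 1)
  const targetFile = file === '' ? currentFile : file
  if (!(targetFile in registry)) {
    throw new Error(`unknown schema: ${targetFile}`)
  }
  return followPointer(registry[targetFile], pointer)
}
```

Failing loudly on an unknown file or pointer is a deliberate choice here: a generator that silently skips a broken cross-reference would emit incomplete models.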


Expected Benefits

  • Significant reduction in manual maintenance effort
  • Guaranteed alignment with the official CycloneDX specification
  • Faster and safer adoption of future spec versions
  • Improved consistency and correctness of data models
  • Better contributor experience (less repetitive work)

Considerations / Open Questions

  • What is the exact structure and format of the CycloneDX 2.0 machine-readable specification (JSON Schema variants, bundling, refs)?
    • answer: JSON Schema
  • How should modular schemas be resolved and composed during generation?
  • Should generated code be committed to the repository or produced at build time?
    • decision: generated before build time, and committed to the repo
  • How should custom extensions or library-specific logic be layered on top of generated models?
  • What strategy ensures backward compatibility with CycloneDX 1.x?
    • easy path: breaking change, and only support CDX 2.0 from then on
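
On the question of layering library-specific logic over generated models, one possible pattern is to keep generated types free of behaviour and put handwritten code in a separate class that implements them. The sketch below only illustrates that idea; `GeneratedComponent` is a made-up shape, not the real CycloneDX 2.0 model, and `Component` and `displayName` are hypothetical names.

```typescript
// Would live in a generated file and be overwritten on every regeneration.
// Hypothetical shape for illustration only.
interface GeneratedComponent {
  name: string
  version?: string
}

// Would live in a handwritten file that regeneration never touches.
class Component implements GeneratedComponent {
  name: string
  version?: string

  constructor (data: GeneratedComponent) {
    this.name = data.name
    this.version = data.version
  }

  /** Library-specific convenience that the schema knows nothing about. */
  displayName (): string {
    return this.version === undefined ? this.name : `${this.name}@${this.version}`
  }
}
```

Because the class declares that it implements the generated interface, the compiler would flag the handwritten layer whenever a regenerated model adds a required property or changes a type.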

Additional Context

This effort aligns closely with the goals of CycloneDX 2.0 to provide a modular, tooling-friendly specification. Leveraging this architecture for automated model generation will significantly improve the long-term sustainability and scalability of this library.


Note: This issue serves as a meta-ticket to coordinate related subtasks, experiments, and implementation steps for introducing schema-driven model generation.
