meta-introspector/split-decls-rs


split-decls-rs: A Rust Overlay System for Declarative Package Patching

This project, centered around the split-decls-rs tool, is designed to create a "Rust Overlay" system analogous to Nix flake overlays or Debian package sets. The goal is to establish a centralized mechanism for maintaining and applying patches to external Rust modules—such as the rustc source—without directly altering the original upstream codebases. By leveraging build.rs as an orchestrator and procedural macros for AST transformation, the system enables declarative, reproducible modifications to a package set.

Core Vision: Rust Packages via Overlays

The system functions as a package maintenance layer where third-party Rust modules are ingested and transformed into a modular structure. This approach aims to solve common problems associated with modifying external dependencies, offering significant benefits:

  • Centralized Patch Management: All modifications are defined in a global configuration, allowing for consistent updates across different versions of upstream dependencies.
  • Decoupled Modifications: Original source files remain pristine; patches are applied dynamically during the build process, facilitating easier dependency upgrades. This avoids manual, mass editing of large volumes of code that require custom fixes or transformations.
  • Granular Code Units: By "splitting" declarations from a single lib.rs into individual files within a src/decls/ directory, the tool makes external code addressable for precise, semantic patching.
  • Maintainability: Easier to manage and update custom patches, as they are decoupled from the original source.
  • Reusability: Patches and transformations can be designed as modular macros, reusable across different projects or versions.
  • Traceability: Clear visibility into how and why external code is being modified.
  • Reduced Boilerplate: Automates complex code transformations that would otherwise require significant manual effort.

To visualize this, imagine the original Rust module is a pre-built Lego set. This tool disassembles that set into individual bricks (declarations), swaps out specific bricks with your custom-designed versions (patches) based on a master blueprint (the TOML config), and then assembles it into a new, improved model every time you run a build.

Functional Architecture

The split-decls-rs tool transforms a target crate's structure to allow for automated intervention during compilation.

1. Package Ingestion and Orchestration

The system can manage local crates or external repositories. The handle_git_overlay function lets the tool clone or update external source code (e.g., the rust-lang/rust repository) and check out a specific reference for patching. The build.rs script, which Cargo executes before the main compilation, acts as the primary orchestrator for:

  • Versioned Source Ingestion: Acquiring and preparing different versions of target Rust code for analysis.
  • Code Reflection: Initiating the process of lifting the target codebase into a high-level, computable meta-model.
  • Generating Metadata: Producing necessary metadata files (e.g., semantic hashes, call graphs) that guide the patching process.
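
As a rough illustration of the clone-or-update step, the sketch below plans the git invocations for one overlay source and then runs them. It is not the tool's actual handle_git_overlay; the function names, the use of `git fetch --tags`, and the decision to shell out to the git CLI are all assumptions made for this example.

```rust
use std::path::Path;
use std::process::Command;

/// Plan the git invocations needed to get `reference` of `url` into `dest`.
/// Returns argument lists (without the leading "git") so they can be logged
/// or inspected before anything is executed. Hypothetical helper.
pub fn git_overlay_args(url: &str, reference: &str, dest: &Path) -> Vec<Vec<String>> {
    let dest_str = dest.to_string_lossy().into_owned();
    let mut steps = Vec::new();
    if dest.join(".git").exists() {
        // Existing checkout: refresh it instead of cloning again.
        steps.push(vec![
            "-C".into(),
            dest_str.clone(),
            "fetch".into(),
            "--tags".into(),
            "origin".into(),
        ]);
    } else {
        steps.push(vec!["clone".into(), url.into(), dest_str.clone()]);
    }
    // In both cases, finish by checking out the requested reference.
    steps.push(vec!["-C".into(), dest_str, "checkout".into(), reference.into()]);
    steps
}

/// Run the planned steps in order, stopping at the first failing command.
pub fn run_git_overlay(url: &str, reference: &str, dest: &Path) -> std::io::Result<bool> {
    for args in git_overlay_args(url, reference, dest) {
        if !Command::new("git").args(&args).status()?.success() {
            return Ok(false);
        }
    }
    Ok(true)
}
```

Separating planning from execution keeps the ingestion logic easy to test without network access.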

2. The Transformation Workflow

For every package in the "overlay" set, split-decls-rs performs a series of structural transformations:

  • Backup: It renames the original src/lib.rs to src/oldlib.rs and the original build.rs to src/oldbuild.rs. If these files don't exist, empty placeholders are created. This preserves the original content for later use.
  • Library Injection: It generates a new, minimal src/lib.rs that acts as a gateway: it re-exports the prelude macros from introspector_decl2_macros and declares and re-exports the generated decls module (pub mod decls; pub use decls::*;).
  • Build-Time Synthesis: It generates a specialized build.rs for the target crate based on a template (src/buildrscontent.rst) or modular generator logic. This generated build.rs includes logic to read the content of src/oldlib.rs, process these declarations, and generate individual declaration files (.rs files) within the target crate's src/decls/ directory.
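
The backup and library-injection steps can be sketched with the standard library alone. This is a minimal illustration, not the tool's code: the function names are hypothetical, the skip-if-backup-exists guard is an assumption added so the sketch can be run twice without overwriting the original source, and the macro re-export line is omitted because its exact syntax is not given here.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Move `from` to `to`, or create an empty `to` when `from` is absent.
/// Assumption: if `to` already exists, a previous run made the backup,
/// so it is left untouched.
fn backup_or_placeholder(from: &Path, to: &Path) -> io::Result<()> {
    if to.exists() {
        Ok(())
    } else if from.exists() {
        fs::rename(from, to)
    } else {
        fs::write(to, "")
    }
}

/// Apply the backup and library-injection steps to one crate directory.
pub fn prepare_crate(crate_dir: &Path) -> io::Result<()> {
    let src = crate_dir.join("src");
    fs::create_dir_all(&src)?;
    backup_or_placeholder(&src.join("lib.rs"), &src.join("oldlib.rs"))?;
    backup_or_placeholder(&crate_dir.join("build.rs"), &src.join("oldbuild.rs"))?;
    // Gateway lib.rs: only the two lines quoted in the text are written;
    // the prelude macro re-export is left out of this sketch.
    fs::write(src.join("lib.rs"), "pub mod decls;\npub use decls::*;\n")
}
```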

3. Patch Application via build.rs

The generated build.rs script then drives each package's build. When Cargo runs it, before the crate itself is compiled, it performs the following:

  • AST Parsing: It reads the src/oldlib.rs content and parses it into an Abstract Syntax Tree using the syn crate.
  • Semantic Patching: It identifies patches relevant to the crate from the configuration and applies them to the AST. This includes name-based replacement of items (functions, structs, etc.) or the addition of entirely new items.
  • Modularization: It splits the patched AST into individual declaration files, injecting a custom prelude and common use statements into each. This ensures that each original declaration becomes a separate, addressable unit, making it amenable to granular reflection, analysis, and targeted patching by the broader "Rust Overlay" system.
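
To make the name-based patching rule concrete, here is a deliberately simplified model. The real tool works on a syn AST; this sketch reduces each top-level item to a name plus its source text so it can run with the standard library only. The Decl type and apply_patches function are hypothetical names for this example.

```rust
/// A top-level item reduced to its name and source text.
/// (Stand-in for a syn::Item; an assumption of this sketch.)
#[derive(Debug, Clone, PartialEq)]
pub struct Decl {
    pub name: String,
    pub source: String,
}

/// Replace items whose name matches a patch; append patches that
/// introduce a name not present in the original item list.
pub fn apply_patches(mut items: Vec<Decl>, patches: Vec<Decl>) -> Vec<Decl> {
    for patch in patches {
        match items.iter().position(|d| d.name == patch.name) {
            // Same name: the patch body replaces the original body.
            Some(i) => items[i].source = patch.source,
            // New name: the patch is added as an additional item.
            None => items.push(patch),
        }
    }
    items
}
```

Replacement keeps the original position of the item, so the order of declarations in the patched output stays stable.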

The generated build.rs performs these detailed steps:

  1. Dependency Tracking: Sets cargo:rerun-if-changed directives for build.rs itself, oldlib.rs, and oldbuild.rs to ensure the build script is re-executed if any of these files change.
  2. Read and Parse oldlib.rs: Reads the content of the src/oldlib.rs (the backed-up original lib.rs) and parses it into a syn::File (Abstract Syntax Tree).
  3. Collect use Statements: A custom UseStatementCollector visits the parsed AST to gather all top-level use statements. These are then aggregated and injected into each generated declaration file as common_uses.
  4. Extract and Split Declarations: Iterates through each top-level syn::Item in oldlib.rs.
    • For supported declaration types (functions, structs, enums, consts, statics, traits, impls, types, unions), it extracts the TokenStream of the declaration.
    • It skips top-level use statements (which are handled by the collector), macro invocations, and mod declarations.
    • For each extracted declaration, it creates a new Rust file within the target crate's src/decls/ directory (e.g., src/decls/{crate_name}_decls_{decl_name}.rs).
    • The content of each new declaration file includes:
      • The common_uses collected earlier.
      • A prelude!{} macro placeholder.
      • A #[decl_{module_name_ident}] attribute placeholder.
      • The TokenStream of the extracted declaration itself.
  5. Generate decl_module! Invocation: After processing all declarations, it collects the module names of all generated declaration files and uses them to construct an introspector_decl2_macros::decl_module! invocation. This invocation is written to src/decls/_decl_module_invocation.rs, which is then included by the new src/lib.rs to re-export all processed declarations.
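
The layout of a generated declaration file (step 4) might look like the sketch below. It only assembles text: the function names are hypothetical, mapping hyphens in the crate name to underscores is an assumption (module names must be valid identifiers), and it assumes the attribute placeholder uses the same identifier as the module name.

```rust
/// Module name for one split declaration, following the
/// `{crate_name}_decls_{decl_name}` pattern described in the text.
/// Assumption: hyphens in the crate name become underscores.
pub fn decl_module_name(crate_name: &str, decl_name: &str) -> String {
    format!("{}_decls_{}", crate_name.replace('-', "_"), decl_name)
}

/// Assemble the text of one declaration file in the documented order:
/// common uses, prelude placeholder, attribute placeholder, declaration.
pub fn render_decl_file(common_uses: &[&str], module_name: &str, decl_source: &str) -> String {
    let mut out = String::new();
    for use_stmt in common_uses {
        out.push_str(use_stmt);
        out.push('\n');
    }
    out.push_str("prelude!{}\n");
    // Assumption: the attribute ident is the module name itself.
    out.push_str(&format!("#[decl_{}]\n", module_name));
    out.push_str(decl_source);
    out.push('\n');
    out
}
```

The resulting string would be written to src/decls/ under a file named after the module, with an .rs extension.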

Key Enhancements & New Features

The split-decls-rs tool has undergone several key enhancements to improve its functionality and address dependency management at the workspace level:

  • Workspace Defaults for [workspace.package]: The tool now automatically populates missing fields in the root Cargo.toml's [workspace.package] section with a predefined set of default values (e.g., edition, version, authors, license, description, repository, homepage, keywords, rust-version, include, publish). This ensures consistency and reduces boilerplate for workspace configurations.
  • Explicit [workspace.lints] Handling: The tool ensures that a [workspace.lints] table is always present in the root Cargo.toml. If it's missing, an empty table is automatically inserted.
  • Automated [patch.crates-io] Generation: For every dependency defined in [workspace.dependencies] that specifies a path to a local submodule, the tool now automatically generates a corresponding [patch.crates-io.<crate-name>] entry in the root Cargo.toml. This mechanism simplifies the process of integrating patched versions of upstream crates, ensuring that all usages within the workspace correctly resolve to the local submodule.
  • Robust build.rs Dependency Management: The tool now intelligently configures the [build-dependencies] for generated build.rs scripts in target crates. This includes:
    • Ensuring syn is configured with the "full" and "visit" features.
    • Adding serde with the "derive" feature and toml as build dependencies. This resolves common build failures related to missing features or dependencies in the generated build scripts, making the overlay system more reliable.
  • Verbose Output Flag (--verbose): A new --verbose command-line flag has been introduced. When enabled, this flag triggers detailed debug output, providing deeper insights into the tool's internal processing and generated configurations.
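
The [patch.crates-io] generation rule can be illustrated as follows. This is a sketch under stated assumptions, not the tool's implementation: WorkspaceDep and render_patch_entries are hypothetical names, the output is plain text rather than an edited TOML document, and paths are assumed to use forward slashes and need no escaping.

```rust
/// One entry from [workspace.dependencies], reduced to what this
/// sketch needs. `path` is Some only for local submodule dependencies.
pub struct WorkspaceDep {
    pub name: String,
    pub path: Option<String>,
}

/// Emit one [patch.crates-io.<crate-name>] table per path dependency.
/// Dependencies without a local path are left to resolve from crates.io.
pub fn render_patch_entries(deps: &[WorkspaceDep]) -> String {
    let mut out = String::new();
    for dep in deps {
        if let Some(path) = &dep.path {
            out.push_str(&format!(
                "[patch.crates-io.{}]\npath = \"{}\"\n\n",
                dep.name, path
            ));
        }
    }
    out
}
```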

Exclusions

The transformation workflow is applied to every detected crate except split-decls-rs itself and crates under blacklisted directories such as submodules/rust/compiler or submodules/rust-analyzer.
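
A filter of this kind could be written as below. The two blacklist entries are the ones named above; the function name, the assumption that paths are relative to the workspace root, and the component-wise prefix match are choices made for this sketch.

```rust
use std::path::Path;

/// Directories whose crates are never transformed. The real list may
/// contain more entries; only these two are named in the text.
const BLACKLIST: [&str; 2] = ["submodules/rust/compiler", "submodules/rust-analyzer"];

/// Decide whether the transformation workflow should run for a crate.
/// `crate_dir` and `own_dir` are assumed to be relative to the workspace root.
pub fn should_process(crate_dir: &Path, own_dir: &Path) -> bool {
    // Path::starts_with compares whole path components, so
    // "submodules/rust-analyzer" does not match "submodules/rust".
    crate_dir != own_dir && !BLACKLIST.iter().any(|prefix| crate_dir.starts_with(prefix))
}
```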

Configuration and Package Definition

The behavior of the overlay is controlled via split-decls-rs.toml, which serves as the manifest for your "package set". This TOML file defines how patches are applied and custom code is injected:

  • patches: A map where you define specific .rs patch files (like my_test_patch.rs) to be applied to specific crates. These files contain the replacement or additive code.
  • string_replacements: Allows for raw text modifications to source code before it is parsed into an AST. This is useful for fixing syntax that might otherwise break the syn parser or for simple, non-semantic text substitutions.
  • custom_prelude_overlay: Defines code to be injected as a header into every generated declaration file across the package set. This ensures consistent use statements or common helper macros are available in all split declaration units.
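
One way to picture these three settings in memory is the struct below. SplitDeclsConfig is a name used by the project, but the field types and both methods are assumptions for illustration; the real structure is presumably deserialized from the TOML file and may differ.

```rust
use std::collections::BTreeMap;

/// In-memory view of split-decls-rs.toml (field types are assumed).
#[derive(Debug, Default)]
pub struct SplitDeclsConfig {
    /// Crate name -> patch files (.rs) applied to that crate.
    pub patches: BTreeMap<String, Vec<String>>,
    /// (from, to) pairs applied to raw source text before parsing.
    pub string_replacements: Vec<(String, String)>,
    /// Header injected into every generated declaration file.
    pub custom_prelude_overlay: String,
}

impl SplitDeclsConfig {
    /// Apply every raw text replacement, in order, to the source.
    pub fn apply_string_replacements(&self, source: &str) -> String {
        self.string_replacements
            .iter()
            .fold(source.to_string(), |text, (from, to)| text.replace(from.as_str(), to))
    }

    /// Prefix a declaration with the configured prelude overlay.
    pub fn with_prelude(&self, decl_source: &str) -> String {
        format!("{}\n{}", self.custom_prelude_overlay, decl_source)
    }
}
```

Because the replacements run on plain text, they take effect before the syn parser ever sees the source, which is what lets them repair input that would otherwise fail to parse.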

Future Plan: Per-Package Patch Overrides

To provide even greater flexibility and control, a future enhancement will introduce the ability to define patch overrides on a per-package basis. This will allow for fine-grained customization of how individual overlaid crates are configured and built, deviating from global defaults where necessary. The plan involves:

  1. Defining a Patch Override File Format: Introduce a structured format (e.g., TOML) for package_name.patch.toml files, which will reside alongside the target crate's Cargo.toml. This file will specify overrides for various build and dependency settings.
  2. Extending Configuration Structures: Enhance SplitDeclsConfig or introduce new companion structures to parse and represent these per-package overrides. This will include fields for custom [build-dependencies] (with features), specific [patch.crates-io] entries for the package, and potentially other Cargo.toml sections.
  3. Integrating Override Loading in process_crate: Modify the process_crate function to detect and load these per-package override files. The loaded configurations will then be merged with or take precedence over the global split-decls-rs.toml settings for that specific package.
  4. Adapting generate_new_cargotoml: Update generate_new_cargotoml to dynamically apply these per-package overrides when generating the target crate's Cargo.toml. This will allow for highly customized dependency resolution and feature enablement on a case-by-case basis.

This enhancement will empower users to tailor the overlay system's behavior precisely to the unique requirements of each external Rust module.
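
Since this feature is only planned, the following is a speculative sketch of one possible merge rule for [build-dependencies]: a per-package entry replaces the global entry of the same name, and everything else is inherited. The BuildDeps alias and merge_build_deps function are hypothetical, and the actual design may merge feature lists instead of replacing them.

```rust
use std::collections::BTreeMap;

/// Build dependency name -> enabled features (assumed representation).
pub type BuildDeps = BTreeMap<String, Vec<String>>;

/// Combine global defaults with per-package overrides.
/// Assumption: an override replaces the global entry wholesale.
pub fn merge_build_deps(global: &BuildDeps, overrides: &BuildDeps) -> BuildDeps {
    let mut merged = global.clone();
    for (name, features) in overrides {
        merged.insert(name.clone(), features.clone());
    }
    merged
}
```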

Conceptual Usage Example

A developer might interact with this system within a proc_macro context as follows:

use patch_build_rs_macros::{nix_rust_src, pure_reflect, semantic_diff};
// NOTE: `llm!` must also be in scope; which crate provides it is not specified here.

// Conceptual only: the enclosing function name is illustrative.
fn explain_regression() {
    // 1. Ingest stable and broken versions
    let v_stable = pure_reflect!(nix_rust_src!("1.82"));
    let v_broken = pure_reflect!(nix_rust_src!("1.83-broken-branch"));

    // 2. Calculate semantic diff vector
    let diff_vector = semantic_diff!(v_stable, v_broken);

    // 3. Generate a semantic change report
    let report = llm! {
        prompt: "Analyze this semantic diff vector and explain why the broken branch deviates from stable intent.",
        data: diff_vector
    };
}

About

my opinions on code structure, defined as build.rs and macros
