This project, centered around the split-decls-rs tool, is designed to create a "Rust Overlay" system analogous to Nix flake overlays or Debian package sets. The goal is to establish a centralized mechanism for maintaining and applying patches to external Rust modules—such as the rustc source—without directly altering the original upstream codebases. By leveraging build.rs as an orchestrator and procedural macros for AST transformation, the system enables declarative, reproducible modifications to a package set.
The system functions as a package maintenance layer where third-party Rust modules are ingested and transformed into a modular structure. This approach aims to solve common problems associated with modifying external dependencies, offering significant benefits:
- Centralized Patch Management: All modifications are defined in a global configuration, allowing for consistent updates across different versions of upstream dependencies.
- Decoupled Modifications: Original source files remain pristine; patches are applied dynamically during the build process, facilitating easier dependency upgrades. This avoids manual, mass editing of large volumes of code that require custom fixes or transformations.
- Granular Code Units: By "splitting" declarations from a single `lib.rs` into individual files within a `src/decls/` directory, the tool makes external code addressable for precise, semantic patching.
- Maintainability: Easier to manage and update custom patches, as they are decoupled from the original source.
- Reusability: Patches and transformations can be designed as modular macros, reusable across different projects or versions.
- Traceability: Clear visibility into how and why external code is being modified.
- Reduced Boilerplate: Automates complex code transformations that would otherwise require significant manual effort.
To visualize this, imagine the original Rust module is a pre-built Lego set. This tool disassembles that set into individual bricks (declarations), swaps out specific bricks with your custom-designed versions (patches) based on a master blueprint (the TOML config), and then assembles it into a new, improved model every time you run a build.
The split-decls-rs tool transforms a target crate's structure to allow for automated intervention during compilation.
The system can manage local crates or external repositories. The `handle_git_overlay` function allows the tool to clone or update specific versions of external source code (e.g., the `rust-lang/rust` repository) and check out specific references for patching. The `build.rs` script, which Cargo executes before the main compilation, acts as the primary orchestrator for:
- Versioned Source Ingestion: Acquiring and preparing different versions of target Rust code for analysis.
- Code Reflection: Initiating the process of lifting the target codebase into a high-level, computable meta-model.
- Generating Metadata: Producing necessary metadata files (e.g., semantic hashes, call graphs) that guide the patching process.
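The clone-or-update behavior described for `handle_git_overlay` can be sketched roughly as follows. This is a minimal illustration using only the standard library and the `git` command line, not the tool's actual implementation; the function names `plan_git_overlay` and `run_git_overlay` and their parameters are assumptions made for this example.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Converts a slice of string slices into owned argument strings.
fn to_args(args: &[&str]) -> Vec<String> {
    args.iter().map(|s| s.to_string()).collect()
}

/// Plans the git invocations for one overlay source (hypothetical helper):
/// clone when no checkout exists yet, otherwise fetch, then check out the
/// requested reference.
fn plan_git_overlay(
    repo_url: &str,
    dest: &str,
    git_ref: &str,
    already_cloned: bool,
) -> Vec<Vec<String>> {
    let mut plan = Vec::new();
    if already_cloned {
        plan.push(to_args(&["-C", dest, "fetch", "--tags", "origin"]));
    } else {
        plan.push(to_args(&["clone", repo_url, dest]));
    }
    plan.push(to_args(&["-C", dest, "checkout", git_ref]));
    plan
}

/// Runs the planned commands in order, stopping at the first failure.
fn run_git_overlay(repo_url: &str, dest: &str, git_ref: &str) -> io::Result<()> {
    // Assumption: a `.git` entry inside `dest` means the clone already exists.
    let already_cloned = Path::new(dest).join(".git").exists();
    for args in plan_git_overlay(repo_url, dest, git_ref, already_cloned) {
        let status = Command::new("git").args(&args).status()?;
        if !status.success() {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                format!("git {:?} failed", args),
            ));
        }
    }
    Ok(())
}
```

Separating the plan from its execution keeps the decision logic testable without network access; only `run_git_overlay` touches the file system and spawns processes.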
For every package in the "overlay" set, split-decls-rs performs a series of structural transformations:
- Backup: It renames the original `src/lib.rs` to `src/oldlib.rs` and the original `build.rs` to `src/oldbuild.rs`. If these files don't exist, empty placeholders are created. This preserves the original content for later use.
- Library Injection: It generates a new, minimal `src/lib.rs` that acts as a gateway, re-exporting a generated `decls` module and prelude macros. This new `lib.rs` primarily re-exports prelude macros from `introspector_decl2_macros` and declares/exports a `decls` module (`pub mod decls; pub use decls::*;`).
- Build-Time Synthesis: It generates a specialized `build.rs` for the target crate based on a template (`src/buildrscontent.rst`) or modular generator logic. This generated `build.rs` includes logic to read the content of `src/oldlib.rs`, process these declarations, and generate individual declaration files (`.rs` files) within the target crate's `src/decls/` directory.
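The backup step above could look something like the following sketch, which uses only `std::fs`. The helper names are hypothetical, and the guard that leaves an existing backup untouched is an assumption added here so that a second run does not overwrite the preserved original; the source does not say how the tool handles re-runs.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Moves `from` to `to`, or creates an empty `to` when `from` is missing.
fn backup_or_placeholder(from: &Path, to: &Path) -> io::Result<()> {
    // Assumption (not from the source): keep an existing backup as-is.
    if to.exists() {
        return Ok(());
    }
    if from.exists() {
        fs::rename(from, to)
    } else {
        fs::write(to, "")
    }
}

/// Backs up a crate's original `src/lib.rs` and `build.rs` before new
/// versions are generated in their place.
fn backup_crate_sources(crate_root: &Path) -> io::Result<()> {
    let src = crate_root.join("src");
    fs::create_dir_all(&src)?;
    backup_or_placeholder(&src.join("lib.rs"), &src.join("oldlib.rs"))?;
    backup_or_placeholder(&crate_root.join("build.rs"), &src.join("oldbuild.rs"))?;
    Ok(())
}
```

Calling `backup_crate_sources` on a crate that has a `src/lib.rs` but no `build.rs` leaves a populated `src/oldlib.rs` and an empty `src/oldbuild.rs`, matching the placeholder behavior described in the list.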
The generated build.rs script becomes the active agent for the "package's" build process. It performs the following at compile-time:
- AST Parsing: It reads the `src/oldlib.rs` content and parses it into an Abstract Syntax Tree using the `syn` crate.
- Semantic Patching: It identifies patches relevant to the crate from the configuration and applies them to the AST. This includes name-based replacement of items (functions, structs, etc.) or the addition of entirely new items.
- Modularization: It splits the patched AST into individual declaration files, injecting a custom prelude and common `use` statements into each. This ensures that each original declaration becomes a separate, addressable unit, making it amenable to granular reflection, analysis, and targeted patching by the broader "Rust Overlay" system.
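The name-based patching rule (replace an item when a patch has the same name, otherwise add the patch as a new item) can be illustrated with a simplified model. The real tool operates on `syn` AST items; the sketch below deliberately substitutes a plain `Item` struct holding a name and source text so that it stays dependency-free. Both `Item` and `apply_patches` are hypothetical names.

```rust
/// A top-level item reduced to its name and source text.
/// (Stand-in for an AST item; the actual tool works on `syn` types.)
#[derive(Debug, Clone, PartialEq)]
struct Item {
    name: String,
    source: String,
}

/// Applies patches by name: a patch whose name matches an existing item
/// replaces it in place; any other patch is appended as a new item.
fn apply_patches(mut items: Vec<Item>, patches: Vec<Item>) -> Vec<Item> {
    for patch in patches {
        match items.iter().position(|item| item.name == patch.name) {
            Some(index) => items[index] = patch,
            None => items.push(patch),
        }
    }
    items
}
```

Replacing in place preserves the original declaration order, which keeps the generated files stable between builds when only a patch body changes.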
The generated build.rs performs these detailed steps:
- Dependency Tracking: Sets `cargo:rerun-if-changed` directives for `build.rs` itself, `oldlib.rs`, and `oldbuild.rs` to ensure the build script is re-executed if any of these files change.
- Read and Parse `oldlib.rs`: Reads the content of the `src/oldlib.rs` (the backed-up original `lib.rs`) and parses it into a `syn::File` (Abstract Syntax Tree).
- Collect `use` Statements: A custom `UseStatementCollector` visits the parsed AST to gather all top-level `use` statements. These are then aggregated and injected into each generated declaration file as `common_uses`.
- Extract and Split Declarations: Iterates through each top-level `syn::Item` in `oldlib.rs`.
  - For supported declaration types (functions, structs, enums, consts, statics, traits, impls, types, unions), it extracts the `TokenStream` of the declaration.
  - It skips top-level `use` statements (which are handled by the collector), `macro` invocations, and `mod` declarations.
  - For each extracted declaration, it creates a new Rust file within the target crate's `src/decls/` directory (e.g., `src/decls/{crate_name}_decls_{decl_name}.rs`).
  - The content of each new declaration file includes:
    - The `common_uses` collected earlier.
    - A `prelude!{}` macro placeholder.
    - A `#[decl_{module_name_ident}]` attribute placeholder.
    - The `TokenStream` of the extracted declaration itself.
- Generate `decl_module!` Invocation: After processing all declarations, it collects the module names of all generated declaration files and uses them to construct an `introspector_decl2_macros::decl_module!` invocation. This invocation is written to `src/decls/_decl_module_invocation.rs`, which is then included by the new `src/lib.rs` to re-export all processed declarations.
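To make the layout of a generated declaration file concrete, here is a rough, string-based sketch of the dependency-tracking and file-assembly steps. It is not the generated `build.rs` itself: the function names are hypothetical, the `src/` prefixes on the tracked paths are assumed from where the backups are stored, the mapping of hyphens to underscores is an assumption, and so is the choice to use the module name as the `{module_name_ident}` part of the attribute.

```rust
/// Cargo directives asking for a rebuild when any tracked input changes.
fn rerun_directives() -> Vec<String> {
    ["build.rs", "src/oldlib.rs", "src/oldbuild.rs"]
        .iter()
        .map(|path| format!("cargo:rerun-if-changed={}", path))
        .collect()
}

/// Module name for one declaration, following `{crate_name}_decls_{decl_name}`.
fn decl_module_name(crate_name: &str, decl_name: &str) -> String {
    // Assumption: hyphens become underscores so the result is a valid identifier.
    format!("{}_decls_{}", crate_name.replace('-', "_"), decl_name)
}

/// Assembles the text of one generated declaration file in the order the
/// list above describes: common uses, prelude, attribute, declaration.
fn render_decl_file(common_uses: &[&str], module: &str, decl_source: &str) -> String {
    let mut out = String::new();
    for use_stmt in common_uses {
        out.push_str(use_stmt);
        out.push('\n');
    }
    out.push_str("prelude!{}\n");
    // Assumption: the attribute suffix is the module name itself.
    out.push_str(&format!("#[decl_{}]\n", module));
    out.push_str(decl_source);
    out.push('\n');
    out
}
```

In a real build script each string from `rerun_directives` would be printed to standard output with `println!`, and the rendered text would be written to `src/decls/<module>.rs`.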
The split-decls-rs tool has undergone several key enhancements to improve its functionality and address dependency management at the workspace level:
- Workspace Defaults for `[workspace.package]`: The tool now automatically populates missing fields in the root `Cargo.toml`'s `[workspace.package]` section with a predefined set of default values (e.g., `edition`, `version`, `authors`, `license`, `description`, `repository`, `homepage`, `keywords`, `rust-version`, `include`, `publish`). This ensures consistency and reduces boilerplate for workspace configurations.
- Explicit `[workspace.lints]` Handling: The tool ensures that a `[workspace.lints]` table is always present in the root `Cargo.toml`. If it's missing, an empty table is automatically inserted.
- Automated `[patch.crates-io]` Generation: For every dependency defined in `[workspace.dependencies]` that specifies a `path` to a local submodule, the tool now automatically generates a corresponding `[patch.crates-io.<crate-name>]` entry in the root `Cargo.toml`. This mechanism simplifies the process of integrating patched versions of upstream crates, ensuring that all usages within the workspace correctly resolve to the local submodule.
- Robust `build.rs` Dependency Management: The tool now intelligently configures the `[build-dependencies]` for generated `build.rs` scripts in target crates. This resolves common build failures related to missing features or dependencies in the generated build scripts, making the overlay system more reliable. It includes:
  - Ensuring `syn` is configured with the `"full"` and `"visit"` features.
  - Adding `serde` with the `"derive"` feature and `toml` as build dependencies.
- Verbose Output Flag (`--verbose`): A new `--verbose` command-line flag has been introduced. When enabled, this flag triggers detailed debug output, providing deeper insights into the tool's internal processing and generated configurations.
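As an illustration of the `[patch.crates-io]` generation rule, the following sketch renders one `[patch.crates-io.<crate-name>]` table for every workspace dependency that carries a `path`. It builds the TOML text by hand to stay dependency-free; the `WorkspaceDep` struct and `render_patch_entries` function are hypothetical, and a real implementation would more likely edit the manifest through a TOML library so that quoting and escaping are handled correctly.

```rust
/// A `[workspace.dependencies]` entry, reduced to the fields this sketch needs.
struct WorkspaceDep<'a> {
    name: &'a str,
    /// Local path of the dependency, if it points at a submodule.
    path: Option<&'a str>,
}

/// Renders a `[patch.crates-io.<crate-name>]` table for each dependency
/// that has a local `path`; dependencies without one are skipped.
fn render_patch_entries(deps: &[WorkspaceDep<'_>]) -> String {
    let mut out = String::new();
    for dep in deps {
        if let Some(path) = dep.path {
            // Assumption: `path` contains no characters that need TOML escaping.
            out.push_str(&format!(
                "[patch.crates-io.{}]\npath = \"{}\"\n\n",
                dep.name, path
            ));
        }
    }
    out
}
```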
For each detected crate (excluding itself and certain blacklisted directories like submodules/rust/compiler or submodules/rust-analyzer), the transformation workflow is applied.
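The exclusion rule might be expressed along these lines. This is only a sketch: the function name `should_process` is hypothetical, the blacklist holds just the two directories named above, and paths are assumed to be relative to the workspace root.

```rust
use std::path::Path;

/// Directories whose crates are never transformed (the two examples from the text).
const BLACKLISTED_DIRS: [&str; 2] = ["submodules/rust/compiler", "submodules/rust-analyzer"];

/// Returns true when the crate at `crate_dir` should go through the
/// transformation workflow. `own_dir` is the tool's own crate directory.
fn should_process(crate_dir: &Path, own_dir: &Path) -> bool {
    if crate_dir == own_dir {
        return false;
    }
    // `Path::starts_with` compares whole path components, so
    // `submodules/rust-analyzer-extras` is not treated as blacklisted.
    !BLACKLISTED_DIRS
        .iter()
        .any(|prefix| crate_dir.starts_with(prefix))
}
```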
The behavior of the overlay is controlled via split-decls-rs.toml, which serves as the manifest for your "package set". This TOML file defines how patches are applied and custom code is injected:
- `patches`: A map where you define specific `.rs` patch files (like `my_test_patch.rs`) to be applied to specific crates. These files contain the replacement or additive code.
- `string_replacements`: Allows for raw text modifications to source code before it is parsed into an AST. This is useful for fixing syntax that might otherwise break the `syn` parser or for simple, non-semantic text substitutions.
- `custom_prelude_overlay`: Defines code to be injected as a header into every generated declaration file across the package set. This ensures consistent `use` statements or common helper macros are available in all split declaration units.
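The `string_replacements` stage is the simplest of the three to picture: plain text substitution applied before any parsing happens. A minimal sketch, assuming replacements are ordered `(from, to)` pairs applied one after another (the function name and the pair representation are illustrative, not the tool's actual configuration schema):

```rust
/// Applies each `(from, to)` pair in order to the raw source text.
/// Later replacements see the output of earlier ones.
fn apply_string_replacements(source: &str, replacements: &[(&str, &str)]) -> String {
    let mut text = source.to_string();
    for (from, to) in replacements {
        text = text.replace(*from, to);
    }
    text
}
```

Because the pairs are applied sequentially, their order matters whenever one replacement produces text that another would match.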
To provide even greater flexibility and control, a future enhancement will introduce the ability to define patch overrides on a per-package basis. This will allow for fine-grained customization of how individual overlaid crates are configured and built, deviating from global defaults where necessary. The plan involves:
- Defining a Patch Override File Format: Introduce a structured format (e.g., TOML) for `package_name.patch.toml` files, which will reside alongside the target crate's `Cargo.toml`. This file will specify overrides for various build and dependency settings.
- Extending Configuration Structures: Enhance `SplitDeclsConfig` or introduce new companion structures to parse and represent these per-package overrides. This will include fields for custom `[build-dependencies]` (with features), specific `[patch.crates-io]` entries for the package, and potentially other `Cargo.toml` sections.
- Integrating Override Loading in `process_crate`: Modify the `process_crate` function to detect and load these per-package override files. The loaded configurations will then be merged with or take precedence over the global `split-decls-rs.toml` settings for that specific package.
- Adapting `generate_new_cargotoml`: Update `generate_new_cargotoml` to dynamically apply these per-package overrides when generating the target crate's `Cargo.toml`. This will allow for highly customized dependency resolution and feature enablement on a case-by-case basis.
This enhancement will empower users to tailor the overlay system's behavior precisely to the unique requirements of each external Rust module.
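Since this feature is still planned, the following is only one possible shape for the merge step, in which per-package settings take precedence over global ones on a per-dependency basis. The `BuildSettings` struct and `merge_overrides` function are hypothetical and cover only build dependencies and their features; the eventual design may merge differently.

```rust
use std::collections::BTreeMap;

/// Build settings reduced to what this sketch needs:
/// build dependency name -> enabled features.
#[derive(Debug, Clone, Default, PartialEq)]
struct BuildSettings {
    build_dependencies: BTreeMap<String, Vec<String>>,
}

/// Starts from the global settings and lets each per-package entry
/// replace the global entry of the same name (or add a new one).
fn merge_overrides(global: &BuildSettings, per_package: Option<&BuildSettings>) -> BuildSettings {
    let mut merged = global.clone();
    if let Some(overrides) = per_package {
        for (name, features) in &overrides.build_dependencies {
            merged
                .build_dependencies
                .insert(name.clone(), features.clone());
        }
    }
    merged
}
```

With this rule, a package override listing `syn` with only the `"full"` feature would replace the global feature list for `syn` rather than extend it; whether features should instead be unioned is a design decision the plan above leaves open.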
A developer might interact with this system within a proc_macro context as follows:
```rust
use patch_build_rs_macros::{nix_rust_src, pure_reflect, semantic_diff};

// 1. Ingest stable and broken versions
let v_stable = pure_reflect!(nix_rust_src!("1.82"));
let v_broken = pure_reflect!(nix_rust_src!("1.83-broken-branch"));

// 2. Calculate semantic diff vector
let diff_vector = semantic_diff!(v_stable, v_broken);

// 3. Generate a semantic change report
// (`llm!` is assumed to be in scope; its import is not shown here.)
let report = llm! {
    prompt: "Analyze this semantic diff vector and explain why the broken branch deviates from stable intent.",
    data: diff_vector
};
```