Augment build json messages for app-bundle post-processing

Hi

We are working on tools that produce final application artifacts from Rust applications, so they can be distributed through app stores and other application hubs. The following list of target formats gives a rough idea of the scope:

- Producing `Foobar.app` application bundles for macOS, code-signing with suitable Apple provisioning information and certificates. On top, creating `Foobar.pkg`, `Foobar.dmg`, or `Foobar.zip` product/image/archive builds for distribution. (+analogous approach for iOS)
- Producing `Foobar.msix` application bundles for Windows, signed and augmented with suitable metadata ready to distribute on the Windows Store.
- Producing `Foobar.apk` Android packages ready for execution on Android devices. On top, creating `Foobar.aab` Android bundles for distribution on the Play Store.
- Producing Flatpak repositories ready for distribution on Fedora-style Linux systems.
- Producing `Foobar.snap` for distribution on Ubuntu-style Linux systems.
- ...

## Design

Our strategy is to put Cargo at the center, and ideally make `cargo build` produce the artifacts the user desires. Since Cargo does not support this yet, we provide our own cargo subcommand, as is common with such extensions. We want `Cargo.toml` as the root-level configuration, possibly extended with `package.metadata` keys for whatever cannot be deduced from Cargo metadata. However, we want the process to deduce as much information as possible from the Cargo package, and only require metadata configuration if the user desires full control over specific parts of the build. We also want all of this to work on stable, without reliance on unstable features (yet the user is of course free to use them).

Whether Cargo ever gains capabilities to produce such artifacts is not important, but we still model the process to allow moving required steps/configuration piece by piece from the custom subcommand to Cargo, if possible. (e.g., make Cargo emit universal binaries on macOS, stripping this process from our build).

Application bundles differ greatly across platforms, but they all require some kind of application entry-point(s). Hence, our build process is rooted in a single Cargo package which must provide the entry-point. This package is built and bundled up with all the artifacts produced as part of the build. Unstable Cargo `bindeps` allow full freedom on splitting such builds into multiple executables, shared objects, or other artifacts, if desired.

The custom subcommand currently starts by parsing `cargo metadata` of the target package, building a dependency tree of involved crates, parsing relevant metadata of each involved crate, and thus has a pretty good overview of what will end up in the build. It then invokes `cargo rustc --lib --crate-type [..]` for each target-architecture required for the desired application bundle. It will collect all produced artifacts, possibly post-process matching binaries into fat/universal-binaries, and then package all data into the application bundle. Afterwards, it will produce the desired archive formats from the application bundle for distribution.
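
The per-architecture invocation described above could be sketched as follows. This is only an illustration, not the code of our subcommand: the function names are hypothetical, the target triples passed in are examples, and adding `--message-format=json` is an assumption about how the `compiler-artifact` messages get collected.

```rust
use std::process::Command;

/// Build the argument list for one per-target `cargo rustc` invocation.
/// (Hypothetical helper; the real subcommand may pass further flags.)
pub fn cargo_rustc_args(target: &str, crate_type: &str) -> Vec<String> {
    vec![
        "rustc".to_string(),
        "--lib".to_string(),
        "--crate-type".to_string(),
        crate_type.to_string(),
        "--target".to_string(),
        target.to_string(),
        // Assumption: JSON messages are requested so that
        // `compiler-artifact` entries can be read from stdout.
        "--message-format=json".to_string(),
    ]
}

/// Prepare (but do not run) one `cargo` command per target triple.
pub fn plan_builds(targets: &[&str], crate_type: &str) -> Vec<Command> {
    targets
        .iter()
        .map(|target| {
            let mut cmd = Command::new("cargo");
            cmd.args(cargo_rustc_args(target, crate_type));
            cmd
        })
        .collect()
}
```

For a universal macOS binary, for example, `plan_builds(&["aarch64-apple-darwin", "x86_64-apple-darwin"], "cdylib")` would yield two commands whose outputs are then merged in a post-processing step.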

## Struggles

The tool as described works fine and can produce suitable artifacts, and the design seems to work out. Yet, there are some things we have to solve via rather ugly heuristics, where we would love for Cargo to provide us more information:

1) **Artifact Types**: `compiler-artifact` messages tell us which files a build-target emitted, but they do not tell us what those files represent. We have to judge based on build-type, crate-type, and file-extension whether a given file is of interest (e.g., `*.rmeta` files are not of interest to us, but `*.so` files or split-debuginfo certainly are).
   
   This is particularly annoying for executables, since they often lack file-extensions and can even override the output filename via configuration (which is not reported in the metadata). We use the `executable: ...` key in the compiler-artifact message to improve the heuristics.

   Ideally, we would be able to tell Cargo which artifact-types we can make use of, and it would only report those, ideally even telling us the type of each artifact.

   Also note that bigger applications will often split into many shared-objects and executables, for reasons like faster ELF/PE load times, delayed distribution, or licensing. And it is very convenient to simply pick up all artifacts Cargo reports, rather than requiring users to configure builds for each part, possibly duplicating it for each supported bundle-type, making this process very fragile.

2) **Entrypoint**: The main entry-point of applications can be either in shared-objects, loaded by the system with some known symbol(s), or a standard executable with a known start address. It depends on how the system was designed. Yet, we want to support both such styles from a single code-base. The common approach is to provide a `src/lib.rs` plus `src/bin/runner.rs`, and the build-system builds the target suitable for the selected system.
   
   Unfortunately, `src/bin/runner.rs` cannot be guarded based on platform, so it has to be implemented for each system, even if it makes no sense on the system, thus leading to unnecessary stubs. Furthermore, it can be ambiguous which binary provides the entry-point, if the package has multiple ones. Lastly, it feels unnecessarily complex to have to pick a different build-target, if there is no technical requirement for it, and entry-point definitions end up in different files, depending on the target.
   
   While it is certainly great to allow users to create a different build-target for each platform, we did not see it as a suitable default. Hence, we instead always build `src/lib.rs` and expect it to provide the entry-points for all targets. We use `--crate-type {bin,cdylib}` to ensure the correct artifact is produced. This is especially nice if the application employs cross-platform frameworks, since you can now simply use their macros to generate the entry-points in `lib.rs` and be done. No need to create stub `src/bin/...`.
   
   Unfortunately, such executables are not reported as `executable: ...` in compiler-artifact messages, making them awkward to find. Furthermore, `cargo run` will not run them, even if no other binary build-target exists, nor does it support a `--lib` flag (and `cargo run` is just very convenient during development, even without bundling the application).
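
To make the *Artifact Types* problem more concrete, here is a minimal sketch of the kind of extension-based heuristic we mean. The `ArtifactKind` enum, the `classify` function, and the exact extension table are assumptions made up for this illustration; our actual heuristics also consider build-type and crate-type.

```rust
use std::path::Path;

/// Coarse artifact categories a bundler might care about (hypothetical).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ArtifactKind {
    Executable,
    SharedObject,
    DebugInfo,
    Ignored,
}

/// Guess what a file from `filenames` represents.
/// `executable` is the value of the `executable` key of the same
/// `compiler-artifact` message, if it was set.
pub fn classify(filename: &str, executable: Option<&str>) -> ArtifactKind {
    // The `executable` key is the only reliable hint for binaries,
    // since they often have no file extension at all.
    if executable == Some(filename) {
        return ArtifactKind::Executable;
    }
    let ext = Path::new(filename)
        .extension()
        .and_then(|e| e.to_str())
        .unwrap_or("");
    // Assumption: a small, incomplete extension table for illustration.
    match ext {
        "so" | "dylib" | "dll" => ArtifactKind::SharedObject,
        "pdb" | "dwp" | "dSYM" => ArtifactKind::DebugInfo,
        "exe" => ArtifactKind::Executable,
        // `*.rmeta`, `*.rlib`, and anything unknown is skipped.
        _ => ArtifactKind::Ignored,
    }
}
```

Note that an extension-less executable that is not reported via `executable: ...` falls through to `Ignored` here, which is exactly the case that arises when `src/lib.rs` is built with `--crate-type bin`.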

## Proposals

1) If Cargo would report `filetypes: []` alongside `filenames: []` in its `compiler-artifact` message, we would have a much easier time figuring out which files we are interested in, and it would certainly make false positives much less likely. Whether this would use short keys like `dll`, `so`, `exe`, `debuginfo`, `rmeta`, ..., or whether it reports mime-types, we do not mind.

2) If libraries built as `--crate-type bin` would be reported via the `executable: ...` key, we would be happy with the entry-point design. I implemented this as #13605. We would also love for `cargo run` to run executables built from libraries via one of the suggested approaches.
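
With the first proposal, consumer-side filtering could shrink to something like the sketch below. Everything here is hypothetical: Cargo has no `filetypes` field today, the short keys are only the ones floated above, and `select_by_type` is a made-up name. The sketch assumes that `filetypes[i]` describes `filenames[i]`.

```rust
/// Keep only the files whose (proposed, hypothetical) type key is wanted.
/// Assumes `filenames` and `filetypes` are parallel arrays taken from
/// one `compiler-artifact` message.
pub fn select_by_type<'a>(
    filenames: &[&'a str],
    filetypes: &[&str],
    wanted: &[&str],
) -> Vec<&'a str> {
    let mut selected = Vec::new();
    for (name, ty) in filenames.iter().zip(filetypes.iter()) {
        if wanted.iter().any(|w| w == ty) {
            selected.push(*name);
        }
    }
    selected
}
```

A bundler interested only in loadable code would then call `select_by_type(&filenames, &filetypes, &["so", "exe"])` and never have to look at file extensions.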

We maintain a longer list of things we would like to see improved in Cargo and Rust, but those can all be dealt with now. The issues listed here are the ones requiring rather unsatisfying workarounds.

Thanks
David
