|
| 1 | +The `agda2lambox` backend is a work in progress. Currently it already shows |
| 2 | +promising results, but should only be considered a proof of concept until |
| 3 | +further work is put into making it easier to use. |
| 4 | + |
| 5 | +This goal of this file is to document the project, in the hopes of making it |
| 6 | +easier for other people to pick it up and finish the work. It describes the goal |
| 7 | +of the backend, and details the implementation strategy. |
| 8 | +Some notes discuss yet unimplemented features, along with some suggestions on |
| 9 | +how to add those. |
| 10 | + |
| 11 | +## The `agda2lambox` project |
| 12 | + |
| 13 | +### Coq/Rocq extraction |
| 14 | + |
| 15 | +The Rocq proof assistant is well known for its extraction mechanism, which can |
| 16 | +turn any formalized development into a self-standing OCaml program. In the past |
| 17 | +10 years or so, several projects have raised the bar and formally verified part |
| 18 | +of the pipeline. |
| 19 | + |
| 20 | +In particular: |
| 21 | + |
| 22 | +- As part of the [MetaCoq] project, |
| 23 | + |
| 24 | + - The pipeline erasure transformation from PCUIC (Rocq's kernel theory) to λ□ |
| 25 | + has been mechanized. |
| 26 | + See: [Certified Erasure for Coq, in Coq](https://inria.hal.science/hal-04077552). |
| 27 | + |
| 28 | + - The translation from λ□ to OCaml (Malfunction, really) |
| 29 | + has been mechanized. |
| 30 | + See: [Verified Extraction from Coq to OCaml](https://dl.acm.org/doi/10.1145/3656379). |
| 31 | + |
| 32 | +- As part of the [CertiCoq] project, |
| 33 | + |
| 34 | + - A full extraction pipeline from Gallina (Rocq's surface language) to Clight (a subset |
| 35 | + of C) has been mechanized and verified. |
| 36 | + |
| 37 | + - CertiCoq has recently been extended with a verified pipeline to [WASM]. |
| 38 | + See: [certicoq-wasm](https://github.com/womeier/certicoqwasm). |
| 39 | + |
| 40 | +- As part of the [ConCert] project, |
| 41 | + |
| 42 | + - MetaCoq was extended with additional transformations. |
| 43 | + - A variation of λ□, annotated with type information, has been introduced. |
| 44 | + - There is support for even more targets using this intermediate language, |
| 45 | + such as Rust and Elm, even though correctness of the generated code is to |
| 46 | + our knowledge not verified. |
| 47 | + |
| 48 | +Common to all those contributions is the intermediate language λ□ (Lambda Box), |
| 49 | +first introduced in Pierre Letouzey's PhD thesis on the (at the time new) Rocq |
| 50 | +extraction mechanism. λ□ is now defined in Rocq inside the MetaCoq project, and |
| 51 | +both the CertiCoq and ConCert projects rely on MetaCoq to use λ□ as an input |
| 52 | +language. |
| 53 | + |
| 54 | +### The plan |
| 55 | + |
| 56 | +The plan, therefore, is to develop a new backend for Agda, that targets λ□ (and |
| 57 | +its type-annotated variant). The hope is to be able to latch onto the rest of |
| 58 | +the verified pipeline from the Rocq ecosystem, and down the line be able to |
| 59 | +compile Agda code to new targets such as WASM, Rust, Elm and even machine code. |
| 60 | + |
| 61 | +λ□ is on paper a very suitable target coming from Agda. |
| 62 | +It's also much easier to maintain this one backend than to implement one for |
| 63 | +each of those desired target languages. |
| 64 | +Although there is no plan to verify the backend, we directy benefit from the |
| 65 | +fact that the rest of the pipeline is (at least partially) verified. |
| 66 | + |
| 67 | +[MetaCoq]: https://metacoq.github.io/ |
| 68 | +[CertiCoq]: https://certicoq.org/ |
| 69 | +[WASM]: https://webassembly.org/ |
| 70 | + |
| 71 | +## The `agda2lambox` backend |
| 72 | + |
| 73 | +Now that the general context is out of the way, let's discuss the current |
| 74 | +implementation. |
| 75 | + |
| 76 | +### λ□ syntax |
| 77 | + |
| 78 | +In folder `LambdaBox`, we define the syntax of λ□. |
| 79 | +It's identical to the formalization that can be found in the MetaCoq project, |
| 80 | +so that it's trivial for the backend to generate Rocq files or forward λ□ programs |
| 81 | +to the rest of the pipeline. |
| 82 | + |
| 83 | +Of note: Since the typed variant of λ□ was added after the fact by the ConCert |
| 84 | +project, it introduces a completely separate notion of global environment in the |
| 85 | +MetaCoq formalization. The *only* significant difference with the global |
| 86 | +environement for λ□ is the inclusion of type annotations. |
| 87 | + |
| 88 | +For the backend, in order to avoid unnecessary duplication, we have a single |
| 89 | +notion of global environment, that *may* contain type information. |
| 90 | +It is defined in `LambdaBox.Env`. |
| 91 | +We do a tiny bit of type-trickery to ensure that type information is always |
| 92 | +there when we are targeting typed λ□, and never there when we target untyped λ□. |
| 93 | +So far this has proven quite useful to ensure by construction that we never do |
| 94 | +more work than necessary. |
| 95 | + |
| 96 | +### Backend pipeline |
| 97 | + |
| 98 | +Now let's dive into how the backend runs. |
| 99 | +Agda provides multiple hooks for backends to trigger and do some work on |
| 100 | +different modules. |
| 101 | + |
| 102 | +In our case, it's quite simple: `agda2lambox` kicks in on the *main* module the |
| 103 | +backend is running on. |
| 104 | + |
| 105 | +- We retrieve all definitions in this main module. *All* of those are compiled |
| 106 | + and will be available in the generated λ□ environment. Unless, of course, they |
| 107 | + are logical and erased during compilation. |
| 108 | + |
| 109 | +- When compiling these definitions, they may themselves require definitions |
| 110 | + coming from other modules. The backend transitively compiles all the required |
| 111 | + definitions (and only those), across modules. |
| 112 | + |
| 113 | + This is what the compilation monad `CompileM` (defined in |
| 114 | + `Agda2Lambox.Compile.Monad`) is for. It maintains a stack of definitions to |
| 115 | + compile, and a way to require new definitions to be compiled (`requireDef`). |
| 116 | + When we compile a function body, for example, we can then call `requireDef` on |
| 117 | + any name that occurs in the body, making sure that those definitions will in |
| 118 | + turn be compiled. |
| 119 | + |
| 120 | +- Once we have compiled all definitions and their transitive dependencies, |
| 121 | + we *topologically* sort them so that they are properly ordered in the |
| 122 | + generated environment. |
| 123 | + The compilation loop and topological sort routine is implemented in |
| 124 | + `Agda2Lambox.Compile.Monad.compileLoop`. |
| 125 | + |
| 126 | +- Once we have the environment, we convert it to either Rocq or an S-expression |
| 127 | + for later consumption by the rest of the pipeline. |
0 commit comments