Skip to content

Commit 2db699d

Browse files
committed
start documenting
1 parent f24c1cb commit 2db699d

File tree

1 file changed

+127
-0
lines changed

1 file changed

+127
-0
lines changed

HACKING.md

Lines changed: 127 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,127 @@
1+
The `agda2lambox` backend is a work in progress. Currently it already shows
2+
promising results, but should only be considered a proof of concept until
3+
further work is put into making it easier to use.
4+
5+
This goal of this file is to document the project, in the hopes of making it
6+
easier for other people to pick it up and finish the work. It describes the goal
7+
of the backend, and details the implementation strategy.
8+
Some notes discuss yet unimplemented features, along with some suggestions on
9+
how to add those.
10+
11+
## The `agda2lambox` project
12+
13+
### Coq/Rocq extraction
14+
15+
The Rocq proof assistant is well known for its extraction mechanism, which can
16+
turn any formalized development into a self-standing OCaml program. In the past
17+
10 years or so, several projects have raised the bar and formally verified part
18+
of the pipeline.
19+
20+
In particular:
21+
22+
- As part of the [MetaCoq] project,
23+
24+
- The pipeline erasure transformation from PCUIC (Rocq's kernel theory) to λ□
25+
has been mechanized.
26+
See: [Certified Erasure for Coq, in Coq](https://inria.hal.science/hal-04077552).
27+
28+
- The translation from λ□ to OCaml (Malfunction, really)
29+
has been mechanized.
30+
See: [Verified Extraction from Coq to OCaml](https://dl.acm.org/doi/10.1145/3656379).
31+
32+
- As part of the [CertiCoq] project,
33+
34+
- A full extraction pipeline from Gallina (Rocq's surface language) to Clight (a subset
35+
of C) has been mechanized and verified.
36+
37+
- CertiCoq has recently been extended with a verified pipeline to [WASM].
38+
See: [certicoq-wasm](https://github.com/womeier/certicoqwasm).
39+
40+
- As part of the [ConCert] project,
41+
42+
- MetaCoq was extended with additional transformations.
43+
- A variation of λ□, annotated with type information, has been introduced.
44+
- There is support for even more targets using this intermediate language,
45+
such as Rust and Elm, even though correctness of the generated code is to
46+
our knowledge not verified.
47+
48+
Common to all those contributions is the intermediate language λ□ (Lambda Box),
49+
first introduced in Pierre Letouzey's PhD thesis on the (at the time new) Rocq
50+
extraction mechanism. λ□ is now defined in Rocq inside the MetaCoq project, and
51+
both the CertiCoq and ConCert projects rely on MetaCoq to use λ□ as an input
52+
language.
53+
54+
### The plan
55+
56+
The plan, therefore, is to develop a new backend for Agda, that targets λ□ (and
57+
its type-annotated variant). The hope is to be able to latch onto the rest of
58+
the verified pipeline from the Rocq ecosystem, and down the line be able to
59+
compile Agda code to new targets such as WASM, Rust, Elm and even machine code.
60+
61+
λ□ is on paper a very suitable target coming from Agda.
62+
It's also much easier to maintain this one backend than to implement one for
63+
each of those desired target languages.
64+
Although there is no plan to verify the backend, we directy benefit from the
65+
fact that the rest of the pipeline is (at least partially) verified.
66+
67+
[MetaCoq]: https://metacoq.github.io/
68+
[CertiCoq]: https://certicoq.org/
69+
[WASM]: https://webassembly.org/
70+
71+
## The `agda2lambox` backend
72+
73+
Now that the general context is out of the way, let's discuss the current
74+
implementation.
75+
76+
### λ□ syntax
77+
78+
In folder `LambdaBox`, we define the syntax of λ□.
79+
It's identical to the formalization that can be found in the MetaCoq project,
80+
so that it's trivial for the backend to generate Rocq files or forward λ□ programs
81+
to the rest of the pipeline.
82+
83+
Of note: Since the typed variant of λ□ was added after the fact by the ConCert
84+
project, it introduces a completely separate notion of global environment in the
85+
MetaCoq formalization. The *only* significant difference with the global
86+
environement for λ□ is the inclusion of type annotations.
87+
88+
For the backend, in order to avoid unnecessary duplication, we have a single
89+
notion of global environment, that *may* contain type information.
90+
It is defined in `LambdaBox.Env`.
91+
We do a tiny bit of type-trickery to ensure that type information is always
92+
there when we are targeting typed λ□, and never there when we target untyped λ□.
93+
So far this has proven quite useful to ensure by construction that we never do
94+
more work than necessary.
95+
96+
### Backend pipeline
97+
98+
Now let's dive into how the backend runs.
99+
Agda provides multiple hooks for backends to trigger and do some work on
100+
different modules.
101+
102+
In our case, it's quite simple: `agda2lambox` kicks in on the *main* module the
103+
backend is running on.
104+
105+
- We retrieve all definitions in this main module. *All* of those are compiled
106+
and will be available in the generated λ□ environment. Unless, of course, they
107+
are logical and erased during compilation.
108+
109+
- When compiling these definitions, they may themselves require definitions
110+
coming from other modules. The backend transitively compiles all the required
111+
definitions (and only those), across modules.
112+
113+
This is what the compilation monad `CompileM` (defined in
114+
`Agda2Lambox.Compile.Monad`) is for. It maintains a stack of definitions to
115+
compile, and a way to require new definitions to be compiled (`requireDef`).
116+
When we compile a function body, for example, we can then call `requireDef` on
117+
any name that occurs in the body, making sure that those definitions will in
118+
turn be compiled.
119+
120+
- Once we have compiled all definitions and their transitive dependencies,
121+
we *topologically* sort them so that they are properly ordered in the
122+
generated environment.
123+
The compilation loop and topological sort routine is implemented in
124+
`Agda2Lambox.Compile.Monad.compileLoop`.
125+
126+
- Once we have the environment, we convert it to either Rocq or an S-expression
127+
for later consumption by the rest of the pipeline.

0 commit comments

Comments
 (0)