A question that was raised recently was "how would one run source code directly from a Task". For example, how would you run a Python script?
We know from other systems like Nix that this is actually a nontrivial case to make reproducible. It depends on the chip architecture, OS, available system libraries, and so on.
You can specify all of these by hash ahead of time. A generic task type "run this Python" could fall into one of the following strategies:
- If you don't care about reproducibility, set that expectation as a very loose/dangerous/attested effect
- Fully specify the expectations of the environment (x86 Linux, with these libraries installed, pinned down to the hash of each version)
- Run inside of an established Docker container (by CID) that has the relevant environment set up
- Feed the source to a Wasm Python interpreter
- Feed the source to a Wasm compiler
- Compile the Python to Wasm and execute it directly
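
To make the "fully specify the environment" strategy a bit more concrete, here is a minimal sketch of pinning an environment description by hash. Everything in it is an assumption for illustration: the field names (`arch`, `os`, `libraries`), the `libexample` library, and the use of SHA-256 over canonical JSON as a stand-in for a real content identifier.

```python
import hashlib
import json


def digest(data: bytes) -> str:
    """Hex SHA-256, used here as a stand-in for a real content identifier."""
    return hashlib.sha256(data).hexdigest()


def environment_digest(spec: dict) -> str:
    """Hash a canonical JSON encoding so key order does not affect the result."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return digest(canonical.encode("utf-8"))


# Hypothetical environment description; the library bytes are placeholders.
env = {
    "arch": "x86_64",
    "os": "linux",
    "libraries": {
        "libexample": digest(b"placeholder bytes for libexample"),
    },
}

# Same description, but the contents of one library differ.
patched = {**env, "libraries": {"libexample": digest(b"patched libexample")}}

# Any change to a pinned library changes the digest of the whole environment.
print(environment_digest(env) == environment_digest(patched))  # False
```

A task that declares this digest as its expected environment can be refused by a runner whose own environment hashes differently.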
The Docker container, Wasm interpreter, and Wasm compiler strategies are actually really interesting, because you don't even need a new kind of Task. The source can "just" be an argument, and you either feed the output into further steps (the compiler case) or get the result back directly.
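
As a rough sketch of "the source is just an argument": the task shape below (`run` plus `args`) and both runner identifiers are hypothetical, and the SHA-256 of canonical JSON only stands in for a real CID. The point is that the same generic task type covers both the interpreter and the compiler, with only the referenced runner changing.

```python
import hashlib
import json


def content_id(value) -> str:
    """Stand-in for a CID: SHA-256 over a canonical JSON encoding."""
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_task(runner: str, source: str) -> dict:
    """A generic task: `runner` names a Wasm module, the source is an ordinary argument."""
    return {"run": runner, "args": {"source": source}}


# Hypothetical identifiers for the two Wasm modules discussed above.
PY_INTERPRETER = content_id("hypothetical wasm python interpreter")
PY_COMPILER = content_id("hypothetical python-to-wasm compiler")

script = 'print("hello")'

interpret_task = make_task(PY_INTERPRETER, script)  # result comes back directly
compile_task = make_task(PY_COMPILER, script)       # output (Wasm) feeds later steps

# Same runner and same source always give the same task identity.
print(content_id(interpret_task) == content_id(make_task(PY_INTERPRETER, script)))  # True
# Swapping the runner changes the identity, but the source argument is untouched.
print(content_id(interpret_task) == content_id(compile_task))  # False
```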
The compiler strategy additionally gets you new Wasm that can be cached for future invocations, which is really helpful for (automatically) publishing reproducible packages to registries, a common workflow in e.g. GitHub Actions.
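
A minimal sketch of that caching idea, assuming the cache key is derived from the compiler's identity plus the hash of the source. `CompileCache`, `fake_compile`, and the compiler identifier are all hypothetical, and `fake_compile` is not a real compiler: it only prefixes the Wasm magic bytes so there is something to store.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class CompileCache:
    """Caches compiler output keyed by (compiler identity, source hash)."""

    def __init__(self, compiler_id: str, compile_fn):
        self.compiler_id = compiler_id
        self.compile_fn = compile_fn
        self.store = {}
        self.misses = 0  # how many times the compiler actually ran

    def key(self, source: str) -> str:
        source_hash = sha256_hex(source.encode("utf-8"))
        return sha256_hex((self.compiler_id + "\n" + source_hash).encode("utf-8"))

    def compile(self, source: str) -> bytes:
        k = self.key(source)
        if k not in self.store:
            self.misses += 1
            self.store[k] = self.compile_fn(source)
        return self.store[k]


def fake_compile(source: str) -> bytes:
    """Placeholder, NOT a real compiler: Wasm magic bytes followed by the source."""
    return b"\x00asm" + source.encode("utf-8")


cache = CompileCache("hypothetical-python-to-wasm-v1", fake_compile)
script = 'print("hello")'

first = cache.compile(script)
second = cache.compile(script)  # served from the cache, compiler not re-run

print(first == second, cache.misses)  # True 1
```

Keying on the compiler's identity as well as the source means that swapping in a different compiler naturally produces fresh cache entries instead of reusing stale output.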
The interpreter and compiler strategies are also kind of microkernel-y, because you could later swap in a different (faster, bugfixed, etc.) interpreter/compiler.
CC @simonwo