zj2zmips

NOTICE: 🛌 This project is taking a nap until August 2025.

NOTICE: This project is in motion. Since AWS popularized working backwards, I am no longer ashamed of writing full-fledged READMEs and the like before doing any real implementation. If I did it the other way around, my codebase would very likely get messy, and I would prioritize the wrong things on the way to a stable and productive state. Hence, what you are reading is an outline of what is going to happen in the future. This is a really exciting project for me; maybe that excitement is infectious.

This program is a compiler from ZeroJava source code to zMIPS assembly and is based on the TrustworthyComputing ZeroJava-compiler reference implementation. zMIPS assembly serves as input for the TrustworthyComputing Zilch zero-knowledge proof framework.

Why another zMIPS compiler?

First off, it's not really a compiler; in my opinion it's a transpiler, because zMIPS assembly is not executable and only an abstraction of bytecode. "Compiler" is a pretty big word. Getting from an abstract syntax tree (AST) to optimized zMIPS assembly is a big challenge for me, but it is not what I think of as compilation when I look at true bytecode compilers.

I found the dependency tree and build environment of the ZeroJava-compiler reference implementation to be rather fragile and wasn't even able to build it with JDK 21. Also, I'm not a fan of Java programs - not because of the language, but because of the ecosystem, which is highly opinionated and puts policy over mechanism. Basically, I feel forced to work a certain way, even though that is in no way necessary to get functional output. In addition, I consider ZeroJava-compiler to be a reference implementation, so I'm not expecting it to be production-grade.

The ZeroJava-compiler is considered the frontend of the Zilch framework. My long-term goal is to also rewrite the backend and port its CUDA kernels to OpenCL. That way the framework would be uniform and easily portable to different platforms, especially legacy hardware. I recently bought a 10-year-old Tesla K80 GPU at a very low price and was astounded by how powerful it is, even by today's standards. That cost-to-performance ratio inspired me to focus on legacy support. Moving GPGPU away from Nvidia's proprietary CUDA SDK avoids vendor lock-in and extends the framework's compatibility to a larger set of GPU hardware.

As a PoC, my stretch goal is to port the entire Zilch framework to the ESP32 platform. It's going to be (really) slow, probably not even possible (due to memory constraints when emulating the zMIPS architecture), but figuring this out is going to be fun anyway. I enjoy working with constrained execution environments.

By tackling these hardware constraints, I aim to broaden my understanding of cryptographic design principles. This is a personal exploration of cryptographic applications and of the workflow required to provide source code to someone who knows the ins and outs and can audit it from an academic standpoint. I have some ideas for applications in decentralized autonomous organizations (DAOs) floating around in my head, but I want to familiarize myself with the fundamentals first, so that I'm not dependent on hearsay.

Considerations

I considered writing down the ZeroJava grammar as EBNF and using flex and bison as parser generators, but decided on implementing an idiomatic recursive-descent parser from scratch instead, since integrating flex and bison into Rust and maintaining that integration seemed rather tedious. Also, it's always a good exercise to write a parser from scratch.
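
To illustrate what "recursive descent" means here, the following is a minimal sketch in Rust. It is not the ZeroJava grammar: it parses a toy arithmetic grammar, and the names `Expr` and `Parser` are made up for this example. The point is only the shape, with one method per grammar rule and each method calling the rules it depends on.

```rust
// Illustrative only: a toy arithmetic grammar, NOT the ZeroJava grammar.
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'

/// Hypothetical AST for the toy grammar.
#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Bin(Box<Expr>, char, Box<Expr>),
}

/// Hypothetical parser state: the input bytes and a cursor.
pub struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl<'a> Parser<'a> {
    pub fn new(src: &'a str) -> Self {
        Parser { src: src.as_bytes(), pos: 0 }
    }

    /// Parses a complete input and rejects trailing garbage.
    pub fn parse(mut self) -> Result<Expr, String> {
        let expr = self.expr()?;
        self.skip_ws();
        if self.pos != self.src.len() {
            return Err(format!("unexpected input at byte {}", self.pos));
        }
        Ok(expr)
    }

    fn skip_ws(&mut self) {
        while self.pos < self.src.len() && self.src[self.pos].is_ascii_whitespace() {
            self.pos += 1;
        }
    }

    /// Returns the next non-whitespace byte without consuming it.
    fn peek(&mut self) -> Option<u8> {
        self.skip_ws();
        self.src.get(self.pos).copied()
    }

    // One method per grammar rule; the loops build left-associative trees.
    fn expr(&mut self) -> Result<Expr, String> {
        let mut lhs = self.term()?;
        while let Some(op @ (b'+' | b'-')) = self.peek() {
            self.pos += 1;
            let rhs = self.term()?;
            lhs = Expr::Bin(Box::new(lhs), op as char, Box::new(rhs));
        }
        Ok(lhs)
    }

    fn term(&mut self) -> Result<Expr, String> {
        let mut lhs = self.factor()?;
        while let Some(op @ (b'*' | b'/')) = self.peek() {
            self.pos += 1;
            let rhs = self.factor()?;
            lhs = Expr::Bin(Box::new(lhs), op as char, Box::new(rhs));
        }
        Ok(lhs)
    }

    fn factor(&mut self) -> Result<Expr, String> {
        match self.peek() {
            Some(b'(') => {
                self.pos += 1;
                let inner = self.expr()?;
                if self.peek() != Some(b')') {
                    return Err(format!("expected ')' at byte {}", self.pos));
                }
                self.pos += 1;
                Ok(inner)
            }
            Some(c) if c.is_ascii_digit() => {
                let start = self.pos;
                while self.pos < self.src.len() && self.src[self.pos].is_ascii_digit() {
                    self.pos += 1;
                }
                // The slice contains only ASCII digits, so it is valid UTF-8.
                let text = std::str::from_utf8(&self.src[start..self.pos]).unwrap();
                text.parse::<i64>().map(Expr::Num).map_err(|e| e.to_string())
            }
            _ => Err(format!("expected number or '(' at byte {}", self.pos)),
        }
    }
}
```

Calling `Parser::new("1 + 2 * 3").parse()` yields a tree in which the multiplication is nested under the addition, because `expr` delegates to `term` before it looks for `+` or `-`. Operator precedence falls out of the call structure, which is part of why I find this style easier to maintain than generated parser tables.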

I am currently taking a closer look at the initial release of GCC (0.9) and some C99 compiler implementations to get an alternate perspective on how a simple, yet robust compiler design might be achieved, in comparison to what's considered modern nowadays.

Roadmap

This is the rough roadmap. I will implement most of it by hand, as this is a very nice exercise for me. I also see potential to greatly improve on the reference implementation and allow for further collaboration on zero-knowledge proof implementations.

set up an end-to-end build environment with continuous integration (CI)
implement ZeroJava tokenizer
implement ZeroJava abstract syntax tree (AST) parser
implement zMIPS intermediate representation (IR) generator
implement zMIPS assembly code generator
implement zMIPS IR optimizer
implement lightweight comparative fuzzing and benchmarking framework

Starting with CI first should make it easier to understand the workflow I am aiming for without having to resort to describing everything in detail right away. CI forces me to define a clear interface for automation, which coincidentally might help others in getting a mind map of my workflow. Also, it's nice not having to worry about distribution later on...
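
As a rough idea of what the tokenizer step of the roadmap could look like, here is a minimal sketch. The token set is an assumption made for illustration (identifiers, integer literals, single-character punctuation); the real ZeroJava lexicon is larger, and `Token` and `tokenize` are hypothetical names, not the project's API.

```rust
/// Hypothetical token set for illustration; not the full ZeroJava lexicon.
#[derive(Debug, PartialEq, Clone)]
pub enum Token {
    Ident(String),
    Int(i64),
    Punct(char),
}

/// Splits source text into tokens, skipping whitespace.
/// Returns an error message for any character it does not recognize.
pub fn tokenize(src: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = src.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c.is_ascii_digit() {
            // Collect a run of digits into one integer literal.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if !d.is_ascii_digit() {
                    break;
                }
                text.push(d);
                chars.next();
            }
            let value = text.parse::<i64>().map_err(|e| e.to_string())?;
            tokens.push(Token::Int(value));
        } else if c.is_ascii_alphabetic() || c == '_' {
            // Collect an identifier; keywords would be split out in a later step.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if !(d.is_ascii_alphanumeric() || d == '_') {
                    break;
                }
                text.push(d);
                chars.next();
            }
            tokens.push(Token::Ident(text));
        } else if "(){}[];,.=+-*/<>!&|".contains(c) {
            tokens.push(Token::Punct(c));
            chars.next();
        } else {
            return Err(format!("unexpected character {:?}", c));
        }
    }
    Ok(tokens)
}
```

For example, `tokenize("x = 42;")` produces an identifier, a punctuation token, an integer and another punctuation token. Keeping the tokenizer a plain function from `&str` to `Vec<Token>` makes it easy to test in CI before any parser exists.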

Files and directories

This is a short description of all the files and directories in use by this project.

File name	Description
.gitignore	Git-specific globbing patterns for excluding files and directories from being treated by Git.
.vimrc	Configuration of VIM text editor. I'll be working within different environments and am not always going to configure them from scratch. VIM luckily allows for defining configurations on a per-directory basis, that's why I'm shipping the configuration with this repository.
ARCHITECTURE.md	Detailed description of design choices and implementation details
bitbucket-pipelines.yaml	Bitbucket Pipelines CI service specification. Defines a sequence of shell commands to be executed when commits are pushed to the remote Git server.
Cargo.toml	Rust project specification (for `cargo` build driver)
Cargo.lock	Lockfile of `cargo` build driver (autogenerated)
configure	Generated GNU Autoconf autoconfiguration script to initialize the build environment. This is intended for use by CI services, so that I can be flexible when choosing/switching build environments (basically a Linux container image) and get some verbosity for quickly debugging misconfigurations.
configure.ac	Source file for GNU Autoconf for generating a (POSIX) shell autoconfiguration script to initialize the build environment. Every time the file changes, `autoconf` must be executed, in order to regenerate the `configure` file.
CHANGELOG.md	Log of changes (additions, fixes, removals) in this project. It follows the Keep a Changelog convention.
CONTRIBUTING.md	Describes the development workflow, as well as maintenance tasks and general conventions to follow when contributing to this project, which will only be me though (for the time being).
docs/	Collection of assets for documentation purposes.
examples/	Examples of actually applying the project to use-cases.
LICENSE	License applicable to this project.
Makefile	GNU Make specification, which serves as an interface to the `cargo` build driver (and other things) for CI services. It can also aid in getting an overview of common build/maintenance tasks.
README.md	Project introduction and overview (what you are currently viewing).
scripts/test.sh	Wrapper for `cargo test`, to make testing more conventional by integrating testing, as well as reporting.
scripts/todo.sh	(somehow) quantifies (known) technical debt. This is a guard rail, should I get too excited with implementing and forget about resolving issues first.
src/bin/	Source code for auxiliary binaries. These provide a CLI to the program.
src/lib.rs	Code entrypoint of this project. The project is intended as a library, so that it does not require a CLI when using it on headless platforms. All public modules of the library are defined in this file.
tests/	Directory containing unit and integration tests. I'm breaking with Rust convention a little, since I want a stronger separation of code and test code, than is intended by the Rust team. Therefore I'm wrapping `cargo test` to make it behave the way I want it to, which may result in some confusing message output. I will adapt my approach over time.

Acknowledgments

Kudos go to the creators of the Zilch framework. I'm just a UNIX sysadmin who is intrigued by the research and not a fan of Oracle.

D. Mouris and N. G. Tsoutsos, "Zilch: A Framework for Deploying Transparent Zero-Knowledge Proofs,"
in IEEE Transactions on Information Forensics and Security (TIFS), 2021, DOI: 10.1109/TIFS.2021.3074869

I stumbled upon this research while looking at benchmarks for zero-knowledge proof algorithms and wondering why the Zilch framework was (considerably) faster than other implementations. I got an answer, but honestly no longer care about it. I'm just excited to do something with the Zilch framework, on my own terms.

ByteB4rb1e/zj2zmips
