CLox

Another implementation of a bytecode interpreter for Lox in C, with tons of new features.

Introduction

CLox is an implementation of the programming language Lox in C. Currently it uses a naive single-pass compiler, and only runtime optimization is performed. In the future, it is planned to have a multi-pass compiler with an AST and compiler optimization. The initial version of CLox has only the features already present in the original Lox reference implementation, but subsequent versions will continue to add new features to make it a powerful language. This is an experiment in the design and implementation of language features, before I start to implement the backend of my own programming language, Mysidia. Stay tuned.

The original version of the Lox programming language can be found in Bob Nystrom's repository: https://github.com/munificent/craftinginterpreters

Features

Original Features

  • Stack-based bytecode VM with basic opcode support.
  • On-demand Scanner, Pratt Parser and Single-Pass Compiler.
  • Uniform runtime representation for Lox Value.
  • Basic Unary and Binary Expression/Operators.
  • Support for global and local variables.
  • If..Else and Switch conditional statements.
  • While Loop, Break and Continue Statement.
  • Functions and Closures with automatic upvalue capture.
  • Mark and sweep Garbage Collector.
  • Classes, Objects, Methods and this keyword.
  • Single Inheritance and super keyword.
  • Performance improvement with faster hash table probing and NaN-boxing.
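
As an illustration of the NaN-boxing item above, here is a minimal sketch in C. It follows the bit layout used in the Crafting Interpreters book, not necessarily CLox's current layout (which also has to distinguish Int from Float). The helper names (numberToValue, isBool and so on) are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* A boxed value is a raw 64-bit pattern: either an ordinary double, or a
   quiet NaN whose low bits carry a small type tag. */
typedef uint64_t Value;

/* Exponent bits all set, plus the two highest mantissa bits (book's scheme). */
#define QNAN      ((uint64_t)0x7ffc000000000000)
#define TAG_NIL   1
#define TAG_FALSE 2
#define TAG_TRUE  3

#define NIL_VAL   ((Value)(QNAN | TAG_NIL))
#define FALSE_VAL ((Value)(QNAN | TAG_FALSE))
#define TRUE_VAL  ((Value)(QNAN | TAG_TRUE))

/* Reinterpret the bits of a double as a Value without violating aliasing rules. */
static inline Value numberToValue(double num) {
    Value value;
    memcpy(&value, &num, sizeof(double));
    return value;
}

static inline double valueToNumber(Value value) {
    double num;
    memcpy(&num, &value, sizeof(Value));
    return num;
}

/* Anything that does not have every QNAN bit set is a plain double. */
static inline bool isNumber(Value value) { return (value & QNAN) != QNAN; }
static inline bool isNil(Value value) { return value == NIL_VAL; }

/* FALSE_VAL and TRUE_VAL differ only in the lowest bit. */
static inline bool isBool(Value value) { return (value | 1) == TRUE_VAL; }
static inline Value boolToValue(bool b) { return b ? TRUE_VAL : FALSE_VAL; }

static inline bool valueToBool(Value value) {
    assert(isBool(value)); /* precondition: caller has checked the type */
    return value == TRUE_VAL;
}
```

The point of the design is that every value fits in 8 bytes and numbers need no unwrapping beyond a bit copy, which is why the technique shows up as a performance item.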

New Features

  • Framework for creating native functions, methods and classes.
  • Array/Dictionary Literals and square bracket notation for array/dictionary access.
  • New Operators: Modulo(%), Range(..) and Nil Handling(?., ??, ?:).
  • Operator overloading to allow operators to be treated as method calls, thus can be used by user defined classes.
  • Variadic Functions, Anonymous Functions(local return) and Lambda Expressions(nonlocal return).
  • Object root class for every class in Lox, everything is an object, and every object has a class.
  • Object ID and generic object map which enable inheritance for special built-in classes such as String and Array.
  • Class methods in class declaration using class keyword, and trait declaration using trait keyword.
  • Anonymous classes/traits similar to anonymous functions/lambda.
  • Namespace as CLox's module system, allowing importing namespace and aliasing of imported classes, traits and functions.
  • CLox Standard Library for packages lang, util, collection, io and net in built-in namespace clox.std.
  • Raise exception with throw keyword, and exception handling with try/catch/finally statement.
  • Interceptor methods which are invoked automatically when getting/setting properties, invoking methods or throwing exceptions.
  • Generator functions/methods which can yield control back to the caller and resume at a later point of execution.
  • Promise API with event loop provided by libuv library for non-blocking IO operations.
  • Introduction of async and await keywords, which allows C#/JS style of concurrency.
  • Optional static typing support for function/method parameters and return values, types only exist at compile time and are erased at runtime.
  • Semicolon inference as well as basic type inference for immutable local/global variables.
  • Customized Runtime configuration for CLox using clox.ini.
  • Loading of Lox source files from a Lox script or from another Lox source file with the require keyword.
  • Cross-platform build with CMake and package management with vcpkg.
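
To make the first item in the list above (a framework for native functions) more concrete, here is a minimal sketch in C of how a host could register native functions with a VM instance. This is not CLox's actual API: the names VM, NativeFn, defineNative, findNative and sumNative are assumptions made for the example, and values are simplified to plain doubles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NATIVES 16

typedef struct VM VM;

/* Hypothetical native signature: the VM is passed explicitly instead of
   being read from a global, and arguments arrive as a counted array. */
typedef double (*NativeFn)(VM* vm, int argCount, const double* args);

typedef struct {
    const char* name;
    NativeFn function;
} NativeEntry;

struct VM {
    NativeEntry natives[MAX_NATIVES];
    int nativeCount;
};

void initVM(VM* vm) { vm->nativeCount = 0; }

/* Registers a native under a name; returns 1 on success, 0 if the table is full. */
int defineNative(VM* vm, const char* name, NativeFn function) {
    assert(vm != NULL && name != NULL && function != NULL);
    if (vm->nativeCount == MAX_NATIVES) return 0;
    vm->natives[vm->nativeCount].name = name;
    vm->natives[vm->nativeCount].function = function;
    vm->nativeCount++;
    return 1;
}

/* Linear lookup by name; returns NULL when no native is registered. */
NativeFn findNative(const VM* vm, const char* name) {
    for (int i = 0; i < vm->nativeCount; i++) {
        if (strcmp(vm->natives[i].name, name) == 0) return vm->natives[i].function;
    }
    return NULL;
}

/* Example variadic native: returns the sum of all its arguments. */
double sumNative(VM* vm, int argCount, const double* args) {
    (void)vm; /* unused in this example */
    double total = 0.0;
    for (int i = 0; i < argCount; i++) total += args[i];
    return total;
}
```

A host would call initVM once, register its natives with defineNative, and let the interpreter resolve calls through findNative. A real implementation would more likely store natives in the VM's hash table of globals than in a fixed array.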

Enhanced or Removed Features

  • VM is no longer a global variable, allowing CLox VM to be embedded in other host applications.
  • Multi-pass compiler with abstract syntax tree, semantic analyzer(resolver), symbol table, type checker, and generation of bytecode by walking AST.
  • Parser is extended with look-ahead capability, with field next storing the next token.
  • Print statement removed, use native function print and println instead.
  • Initializer method is renamed from init to __init__ as an interceptor method.
  • Separated integer values(Int) and floating point(Float) values from Number.
  • Improved string concatenation, addition of string interpolation and UTF-8 strings.
  • C style For Loop replaced with Python/Kotlin style for-in Loop.
  • Introduction of keyword extends instead of using < operator for class inheritance.
  • Global variables are scoped within the file in which they are declared, effectively becoming module variables.
  • Function/Method parameters become immutable by default, but may be mutable with var keyword.
  • Built-in and user defined classes/functions become immutable, and cannot be accidentally overwritten.
  • Fix reentrancy problem with CLox, calling Lox closures in C API becomes possible.
  • Most runtime errors in VM interpreter loop replaced with Exceptions that can be caught at runtime.
  • Inline caching for VM optimization, as well as implementation of Shape(Hidden Class) for better instance variable representation.
  • Upgraded the mark and sweep GC to a generational GC with multiple regions for objects of various generations, which makes the GC run faster when marking/freeing objects.
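
The inline caching and Shape (hidden class) item above can be sketched in C as follows. This is a simplified illustration under stated assumptions, not CLox's implementation: the types Shape, Instance and InlineCache, the function getField, and the use of doubles as field values are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FIELDS 8

/* A shape maps field names to slot indexes. Instances with the same set of
   fields share one shape. Shape ids are assumed to be non-negative. */
typedef struct {
    int id;
    int fieldCount;
    const char* fieldNames[MAX_FIELDS];
} Shape;

typedef struct {
    const Shape* shape;
    double slots[MAX_FIELDS];
} Instance;

/* One cache per property-access site in the bytecode, so the property name
   at a given site never changes. shapeId == -1 marks an empty cache. */
typedef struct {
    int shapeId;
    int slot;
    int hits;
    int misses;
} InlineCache;

/* Slow path: search the shape for the field name. */
int shapeLookup(const Shape* shape, const char* name) {
    for (int i = 0; i < shape->fieldCount; i++) {
        if (strcmp(shape->fieldNames[i], name) == 0) return i;
    }
    return -1;
}

/* Reads a field, using the cached slot when the instance's shape matches the
   shape seen last time at this site. Returns 1 on success, 0 if undefined. */
int getField(const Instance* instance, const char* name, InlineCache* cache, double* out) {
    assert(instance != NULL && cache != NULL && out != NULL);
    if (cache->shapeId == instance->shape->id) {
        cache->hits++;
        *out = instance->slots[cache->slot];
        return 1;
    }
    cache->misses++;
    int slot = shapeLookup(instance->shape, name);
    if (slot < 0) return 0;
    cache->shapeId = instance->shape->id;
    cache->slot = slot;
    *out = instance->slots[slot];
    return 1;
}
```

The first access at a site pays for the name lookup. Later accesses on instances with the same shape reduce to an integer comparison and an array index, which is where the speedup comes from.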

Roadmap

Below are the features planned for future versions of Lox2. The list is subject to change, but it gives a basic idea of the future direction of this project.

For a list of implemented features, please see the change logs in the /notes directory.

Lox 2.1.0(next version)

  • Extend parser with infinite lookahead, allowing qualified names to be used for function/method signature, class/trait declaration and catch statement.
  • Dedicated syntax for declaring function types and metaclass types, enabling anonymous functions/lambda to be typed.
  • Allow declaration of object properties outside of class initializer, which also enables optional static typing.
  • Redesign of Iterator/Enumerator API for ease of use and implementation of iterable types.

Lox 2.2.0

  • Enhanced type system with basic support for generics/parametric polymorphism.
  • type keyword used as declaration of type alias, useful for complex generic types.
  • Capability of saving bytecode to disk as a .loxo file, which can be loaded later for faster compilation.
  • Refactor classes in the existing standard library to use generics(in clox.std parent package), such as Array<T> and Promise<T>.

Lox 2.3.0

  • Additional type system enhancement for union types, with | operator on types such as String | Number.
  • Support for structural pattern matching using match keyword, remove switch statement as it has been superseded.
  • Improved type system with non-nullable by default for type declaration, as well as variance for method parameter/return types.
  • Trailing closure similar to Kotlin and Swift which allows the last lambda argument to be placed outside of the parentheses.

Lox 2.4.0

  • Refine if and match as expressions, with the value produced being the last expression/statement of the expression body.
  • Object literal syntax similar to JavaScript which can be good for configuration objects.
  • Add new package clox.std.text which handles text processing for MIME types such as json and xml.
  • Foreign function interface(FFI) as a way to write CLox libraries in C and load in lox script.

Lox 2.5.0

  • C# style property accessor syntax, also inline simple getter/setter calls.
  • First class continuation with keyword context, enabling manipulation of call stack in userland.
  • Add CLox CLI to run Lox scripts easily from command line, backed by libuv.
  • Implement a profiler which can identify the "Hotspots" of the program and how long they execute, a prerequisite for a future JIT.

Build and Run CLox

Windows(with git, cmake and vcpkg, need to replace [$VCPKG_PATH] with installation path of vcpkg)

git clone -b master https://github.com/HallofFamer/CLox.git
cd CLox
cmake -DCMAKE_TOOLCHAIN_FILE:STRING="[$VCPKG_PATH]/scripts/buildsystems/vcpkg.cmake" -S . -B ./build
cmake --build ./build --config Release
./x64/Release/CLox

Linux(with git, cmake, curl and libuv, need to install one of the libcurl4-dev and libuv1-dev packages)

git clone -b master https://github.com/HallofFamer/CLox.git
cd CLox
mkdir build
cmake -S . -B ./build
cmake --build ./build --config Release
./build/CLox

Docker(linux, need to replace [$LinuxDockerfile] with the actual docker file name, e.g. UbuntuDockerfile)

git clone -b master https://github.com/HallofFamer/CLox.git
cd CLox
docker build -t clox:linux -f Docker/[$LinuxDockerfile] .
docker run -w /CLox-1.9.0/CLox -i -t clox:linux

Credits & Special Thanks

Below is the attribution list for CLox's design and implementation. Please contact me if anything is missing. This does not include 3rd-party libraries whose copyrights are already present in the header files.

  • Robert Nystrom: For the original Lox language specification and source code that this project is based on.
  • Smalltalk-lang: For the inspiration of object and class model.
  • Ruby-lang: For the inspiration of generic ID map/table which allows inheritance and properties on any objects.
  • Javascript-lang: For the inspiration of object shape/hidden-class.
  • Wren-lang: For the inspiration of UTF-8 string implementation as well as dynamic array macros.
  • Michael Malone(cometlang): For the inspiration of exception and try/catch/finally statement.
  • RevengerWizard(teascript): For the inspiration of reentrancy in Lox language.

FAQ

What is the purpose of this project?

CLox is an implementation of the Lox language with a bytecode VM instead of a tree-walk interpreter. It is the last experiment on feature implementations before I begin writing my own programming language, Mysidia, in C.

What are the reasons behind the design of CLox?

Please see the DESIGN.md document for more information. It provides a detailed explanation of the choice of new features being implemented, as well as insights into the evolution of CLox as a general purpose programming language.

Can I use the code of CLox as base for my own project?

This project is open source and the codebase can be used as a base for someone else's project. It has an MIT license, and attribution must be given except for code from the original Lox implementation or third party libraries.

What will happen to KtLox?

Nothing. KtLox development is on hold until I can figure out a way to generate JVM bytecode instead of running the interpreter by walking down the AST. Tree-walk interpreters are way too slow to be practical without JIT, and in my honest opinion the book should not have covered tree-walk interpreters, but I digress.