Releases: Konrad1991/ast2ast

Major update

13 Jan 09:55

ast2ast translates R functions into optimized C++ code while preserving R-like semantics,
enabling fast execution, automatic differentiation, and static analysis - without rewriting code by hand.

Example: compute a Jacobian using reverse-mode automatic differentiation from ordinary R code:

f <- function(y, x) {
  y[[1L]] <- x[[1L]] * x[[2L]]
  y[[2L]] <- x[[1L]] + x[[2L]]*x[[2L]]
  jac <- deriv(y, x)
  return(jac)
}
fcpp_reverse <- ast2ast::translate(f, derivative = "reverse")
y <- c(0, 0)
x <- c(2, 3)
fcpp_reverse(y, x)
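
For reference, the Jacobian in this example can be worked out by hand from y1 = x1 * x2 and y2 = x1 + x2 * x2. The layout below (one row per output, one column per input) is the usual convention and is assumed here; the release notes do not state how the returned matrix is arranged.

```latex
J = \begin{pmatrix}
\partial y_1/\partial x_1 & \partial y_1/\partial x_2 \\
\partial y_2/\partial x_1 & \partial y_2/\partial x_2
\end{pmatrix}
= \begin{pmatrix} x_2 & x_1 \\ 1 & 2 x_2 \end{pmatrix},
\qquad
J\big|_{x = (2,\,3)} = \begin{pmatrix} 3 & 2 \\ 1 & 6 \end{pmatrix}
```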

Highlights

  • Added forward- and reverse-mode automatic differentiation, enabling efficient gradient and Jacobian computation.

  • Significantly improved error messages: errors now show the source line first, followed by a precise diagnostic.

a <- foo()
Invalid function foo

  • Introduced a type system with automatic type inference in R, supporting:
    • Scalar types: logical, integer, double
    • Data structures: scalar, vector, matrix
  • NA, NaN, and Inf semantics now match R’s behavior.
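
To give a feel for the forward-mode differentiation mentioned in the highlights, here is a minimal C++ sketch based on dual numbers, applied to the function from the example above. It is an illustration of the general technique only: the `Dual` type, `jacobian`, and `check_example` are hypothetical names and do not reflect the code ast2ast actually generates.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical dual number: a value plus its derivative with respect to
// one chosen input. This is not ast2ast's internal type.
struct Dual {
  double val;  // function value
  double dot;  // derivative carried alongside the value
};

inline Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.dot + b.dot}; }

// Product rule: (a * b)' = a' * b + a * b'
inline Dual operator*(Dual a, Dual b) {
  return {a.val * b.val, a.dot * b.val + a.val * b.dot};
}

// The function from the release example: y1 = x1 * x2, y2 = x1 + x2 * x2.
inline std::array<Dual, 2> f(const std::array<Dual, 2>& x) {
  std::array<Dual, 2> y = {{x[0] * x[1], x[0] + x[1] * x[1]}};
  return y;
}

// Forward mode needs one sweep per input: seed input j with dot = 1 and the
// other input with dot = 0, then read column j of the Jacobian from the outputs.
inline std::array<std::array<double, 2>, 2> jacobian(double x1, double x2) {
  std::array<std::array<double, 2>, 2> J{};
  for (std::size_t j = 0; j < 2; ++j) {
    const std::array<Dual, 2> x = {{Dual{x1, j == 0 ? 1.0 : 0.0},
                                    Dual{x2, j == 1 ? 1.0 : 0.0}}};
    const std::array<Dual, 2> y = f(x);
    for (std::size_t i = 0; i < 2; ++i) J[i][j] = y[i].dot;
  }
  return J;
}

// Usage check against the hand-derived Jacobian at x = (2, 3).
inline void check_example() {
  const auto J = jacobian(2.0, 3.0);
  assert(J[0][0] == 3.0 && J[0][1] == 2.0);
  assert(J[1][0] == 1.0 && J[1][1] == 6.0);
}
```

Forward mode costs one sweep per input, whereas reverse mode costs one sweep per output, which is why having both modes helps depending on the shape of the problem.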

Design improvements

  • New frontend functions can now be added by registering them in a centralized function registry, making extensions much easier.
  • The entire AST is now represented using R6-based node classes, enabling structured transformations, easier analysis, and future optimizations.
  • Experimental support for n-dimensional arrays already exists on the C++ side (R interface coming soon).

Future work

  • Optimize preserved subsetting, which is currently too expensive.
  • Add scalar references to the scalar system.
  • Expose array support in the R interface.

V0.4.0

30 Aug 15:53

Major Rewrite and Refactoring

C++ Code:

  • Rewrote large parts of the C++ code.
  • Improved memory management with several new classes handling memory on the heap.
  • Introduced support for bool, int, and double data types.
  • Migrated from C++17 to C++20, reducing lines of code.
  • Added traits to distinguish r-value from l-value vectors, optimizing memory assignment.
  • Changed subsetting to use pointers and index vectors, making it more efficient.
  • Designed the memory management system to allow for easy addition of new memory classes.
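
The pointer-plus-indices approach to subsetting listed above can be pictured with a small sketch. This assumes a non-owning view over a `double` buffer; `SubsetView` and `scale_first_and_third` are hypothetical names chosen for illustration, not classes from the ast2ast code base.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical non-owning view: a pointer to the parent's data plus the
// selected positions. No element is copied when the subset is created.
class SubsetView {
 public:
  SubsetView(double* data, std::size_t parent_size, std::vector<std::size_t> idx)
      : data_(data), idx_(std::move(idx)) {
    // Indices are 0-based here and must lie inside the parent buffer.
    for (std::size_t i : idx_) assert(i < parent_size);
  }

  std::size_t size() const { return idx_.size(); }

  // Reads and writes go straight through to the parent buffer.
  double& operator[](std::size_t i) { return data_[idx_[i]]; }
  double operator[](std::size_t i) const { return data_[idx_[i]]; }

 private:
  double* data_;
  std::vector<std::size_t> idx_;
};

// Example: scale the first and third elements in place, roughly like the
// R statement v[c(1, 3)] <- v[c(1, 3)] * 10 (R indices are 1-based).
inline std::vector<double> scale_first_and_third() {
  std::vector<double> v = {1.0, 2.0, 3.0, 4.0};
  SubsetView s(v.data(), v.size(), {0, 2});
  for (std::size_t i = 0; i < s.size(); ++i) s[i] *= 10.0;
  return v;
}
```

Because the view only stores a pointer and the index vector, creating a subset costs memory proportional to the number of selected elements rather than a copy of the data itself. The view must not outlive the buffer it points into.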

R Code:

  • Users can now declare variable types. Six types are supported: Logical, Int, Double, Logical Vector, Int Vector, and Double Vector.
  • Users can declare types when defining variables using var::type.
  • Improved argument passing to the ast2ast function to allow more detailed declarations.

Error Management:

  • Enhanced error management throughout the codebase.

Future Plans:

  • Further improve the robustness of the R code, possibly through a complete rewrite.
  • Consider changing the variable declaration syntax to var %type% "type".
  • Make the subsetting mechanism even lazier.

Derivative Calculation System:

The functionality of the J function has been replaced with a more flexible system for calculating derivatives. The key components are:

  • set_indep(x): Sets x as the independent variable.
  • get_deriv(var): Retrieves the derivative of var with respect to x.
  • unset_indep(x): Unsets the independent variable.

Example Usage:

set_indep(x)
y = x * x
assign(y, get_deriv(x) * x + x * get_deriv(x))
unset_indep(x)
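
The expression passed in the example is the product rule written out by hand. Reading get_deriv(x) as x' = dx/dx = 1 while x is the independent variable (an interpretation of the snippet, not a statement about the implementation), the derivative of y = x * x works out as:

```latex
\frac{dy}{dx} = \frac{d}{dx}\,(x \cdot x) = x' \cdot x + x \cdot x' = 1 \cdot x + x \cdot 1 = 2x
```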