Skip to content

Implementation sketch of tagged unions (ADTs)

EliasC edited this page May 14, 2015 · 2 revisions

This page sketches an implementation of tagged unions or ADTs, if we wanted to add it to Encore.

The Example

As the running example for this text, I'm going to use a user-supplied myOption data type. The way this type is specified does not matter, just that it has three constructors; Some that takes a value of type int, Another that takes an int and a Foo and None that is a value on its own:

á la Haskell

data myOption = Some int | Another int Foo | None

á la Rust

enum myOption{
    Some(int),
    Another(int, Foo),
    None
}

Implementation

To track which kind of myOption a value is, we define an enum:

enum _enc__myOption_tag{
  _ENC__TAG_myOption_Some,
  _ENC__TAG_myOption_Another,
  _ENC__TAG_myOption_None,
};

The value itself will be represented by a struct containing a tag (of type _enc__myOption_tag) and a union of structs, one for each constructor:

struct _enc__myOption_t{
  enum _enc__myOption_tag tag;
  union{
    // Some
    struct{
      int64_t some_f1;
    };
    // Another
    struct{
      int64_t another_f1;
      _enc__active_Foo_t *another_f2;
    };
    // None
    struct{};
  };
};

The value of tag decides which of the unions is "active" for a particular value. Creating a new value of type myOption is straightforward:

Encore

let x = Another(42, NULL) in ...

C

struct _enc__myOption_t* _tmp_x = pony_alloc(sizeof(struct _enc__myOption_t));
_tmp_x->tag = _ENC__TAG_myOption_Another;
_tmp_x->another_f1 = 42;
_tmp_x->another_f2 = NULL;

Matching (when we have it) is as simple as doing a switch on the tag and binding the fields within:

Encore

match x with
    Some(n) => print("{}\n", n);
    Another(n, o) => print("{} and {}\n", n, o);
    _ => print("No value!\n");

C

switch(_tmp_x->tag)
{
  case _ENC__TAG_myOption_Some:
  {
    int64_t n = _tmp_x->some_f1;
    printf("%d\n", n);
    break;
  }
  case _ENC__TAG_myOption_Another:
  {
    int64_t n = _tmp_x->Another_f1;
    _enc__active_Foo_t *o = _tmp_x->Another_f2;
    printf("%d and %p\n", n, o);
    break;
  }
  default: 
  {
    printf("No value!\n");
  }
}

Multi-level matching just translates into nested switch statements.

Tracing functions would look like a match, and then like tracing the fields of a class.

Extensions

Adding polymorphism should not be harder than how we do polymorphic classes right now. We might want to store a pointer to the runtime type of the parameter of the union (to make tracing simpler). If we wanted to add names to the "fields" (like records in Haskell) this should also not pose any real problems.

Clone this wiki locally