hazelgrove
diff --git a/dune-project b/dune-project
Lines changed: 1 addition & 0 deletions
diff --git a/hazel.opam b/hazel.opam
Lines changed: 1 addition & 0 deletions
diff --git a/hazel.opam.locked b/hazel.opam.locked
Lines changed: 1 addition & 0 deletions
diff --git a/src/b2t2/Datasheet.md b/src/b2t2/Datasheet.md
Lines changed: 197 additions & 0 deletions
diff --git a/src/b2t2/Datasheet.re b/src/b2t2/Datasheet.re
Lines changed: 19 additions & 0 deletions
diff --git a/src/b2t2/README.md b/src/b2t2/README.md
Lines changed: 41 additions & 0 deletions
diff --git a/src/b2t2/Slides.re b/src/b2t2/Slides.re
Lines changed: 42 additions & 0 deletions
diff --git a/src/b2t2/dune b/src/b2t2/dune
Lines changed: 28 additions & 0 deletions
@@ -33,6 +33,7 @@
    (>= 3.12.0))
   ppx_yojson_conv_lib
   ppx_yojson_conv
+  ppx_blob
   incr_dom
   bisect_ppx
   (omd
 
@@ -0,0 +1,197 @@
+## Reference
+
+> Q. Where can we learn about the programming medium covered by this datasheet?
+> (Feel free to link to multiple kinds of artifacts: repositories, papers, videos, etc.
+> Please also include version information where applicable.)
+
+- **Website**: http://hazel.org  
+- **Source Code**: https://github.com/hazelgrove/hazel
+- **App**: https://hazel.org/build/dev/  
+
+> Q. What is the URL of the version of the benchmark being used?
+https://github.com/brownplt/B2T2/blob/fd227efadf532a20aefd25c7a8580978c2d684a2/Datasheet.md  
+
+
+> Q. On what date was this version of the datasheet last updated?
+2025-11-05
+
+> Q. If you are not using the latest benchmark available on that date, please explain why not.
+N/A: this datasheet uses the latest version of the benchmark available on that date.
+
+## Example Tables
+
+> Q. Do tables express heterogeneous data, or must data be homogenized?
+  Hazel tables are represented as *lists of labeled tuples*.  
+  - Columns may be heterogeneously typed.   
+  - Rows must be homogeneously typed.
+    - The unknown type allows some degree of heterogeneity across rows.
+
+> Q. Do tables capture missing data and, if so, how? Do missing values affect the output constraints of any operations,
+  for example `groupBy`?
+  - Represented via `Option` types (`Some` / `None`)  
+  - Incomplete programs can use expression holes (holes are not programmatically discernible)  
+  - No special handling in operations — `Option` values are ordinary
+
+> Q. Are mutable tables supported? Are there any limitations?
+Mutable tables are not supported
+
+> You may reference, instead of duplicating, the responses to the above questions in answering those below:
+
+> Q. Which tables are inexpressible? Why?
+
+None — all tables can be expressed using `Option` types for missing data
+
+> Q. Which tables are only partially expressible? Why, and what’s missing?
+
+N/A
+
+> Q. Which tables’ expressibility is unknown? Why?
+
+N/A
+
+> Q. Which tables can be expressed more precisely than in the benchmark? How?
+
+None. Hazel represents the tables as precisely as the benchmark does. As noted above, explicit `Option` types make
+optional columns explicit.
+
+> Q. How direct is the mapping from the tables in the benchmark to representations in your system? How complex 
+is the encoding?
+
+  - Very direct  
+  - Benchmark tables map naturally to Hazel's lists of labeled tuples  
+  - Missing values use `Option`
+  - Nested tables use nested labeled tuples or lists
+
+## TableAPI
+
+> Q. Are there consistent changes made to the way the operations are represented?
+The operations are mostly presented as depicted, but here are a few variations:
+- Some operations use explicit polymorphism via Hazel's `typfun` keyword and therefore require explicit type
+  application, as implicit polymorphism has not been added to Hazel as of 2025-07-08.
+- Hazel tables are represented as lists of labeled tuples, so no runtime schema is available to operations.
+  Certain operations, such as `leftJoin`, must therefore inspect the head element to determine the schema and
+  define a fallback behavior for the case where no such element exists.
+- Certain operations return an optional value rather than raising an error.
+- Hazel does not have first-class labels, so some operations take column names as strings.
+  If such an operation were written inline, primitive operators could be used to recover type safety.
+
+> Q. Which operations are entirely inexpressible? Why?
+All the operations are at least partially expressible.
+
+> Q. Which operations are only partially expressible? Why, and what’s missing?
+- `leftJoin` can only build the resulting columns if both tables have at least one row from which to determine the schema.
+- Various operations only work if there is at least one row from which to determine the schema:
+  - `ncols`, `header`
+- `dropna` only works if every column in a table is optional, since there is no way to dynamically dispatch based on
+   column sort.
+
+> Q. Which operations’ expressibility is unknown? Why?
+N/A
+
+> Q. Which operations can be expressed more precisely than in the benchmark? How?
+- Several operations could be expressed in a more type-safe manner if a projection function were passed instead of a
+  column name.
+  - e.g. `selectColumn(table, fun e -> e.name)` as opposed to ``selectColumn(table, `name`)``
+
+## Example Programs
+
+> Q. Which examples are inexpressible? Why?
+- `sampleRows` is inexpressible because Hazel is pure and the operation requires randomness
+
+
+> Q. Which examples’ expressibility is unknown? Why?
+N/A
+
+> Q. Which examples, or aspects thereof, can be expressed especially precisely? How?
+The examples are expressed as precisely as the benchmark
+
+> Q. How direct is the mapping from the pseudocode in the benchmark to representations in your system? How complex is
+  the encoding?
+- The mapping is quite direct as implemented. A less direct mapping could accomplish a more type-safe translation of
+  several of the programs.
+
+## Errors
+
+> There are (at least) two parts to errors: representing the source program that causes the error, and generating output
+> that explains it. The term “error situation” refers to a representation of the cause of the error in the program 
+> source.
+> 
+> For each error situation it may be that the language:
+> 
+> - isn’t expressive enough to capture it
+> - can at least partially express the situation
+> - prevents the program from being constructed
+> 
+> Expressiveness, in turn, can be for multiple artifacts:
+> 
+> - the buggy versions of the programs
+> - the correct variants of the programs
+> - the type system’s representation of the constraints
+> - the type system’s reporting of the violation
+
+> Q. Which error situations are known to be inexpressible? Why?
+Many of the programs require explicit parametric polymorphism and the higher-order function versions of the TableAPI
+operations to get the best feedback. 
+
+* `getOnlyRow` provides no feedback on the error as we do not currently track table size information statically
+
+
+> Q. Which error situations are only partially expressible? Why, and what’s missing?
+* Two versions of `brownJellybeans` are implemented with tradeoffs on expressibility:
+  * The first version takes a string column name and uses our more dynamic operations to select the column. 
+    This provides no feedback on the error but more closely matches the implementation in the benchmark.
+  * The second version takes a function that selects the column and uses our more type-safe operations to select the 
+    column. This correctly localizes the error to the column selection.
+
+> Q. Which error situations’ expressibility is unknown? Why?
+None
+
+> Q. Which error situations can be expressed more precisely than in the benchmark? How?
+None
+
+> Q. Which error situations are prevented from being constructed? How?
+None
+
+> Q. For each error situation that is at least partially expressible, what is the quality of feedback to the programmer?
+* Malformed Tables
+  * Missing schemas, rows, and cells are represented by syntactic holes in the program. These are easily
+    visible in the editor and can be filled in by the programmer.
+  * For tables whose schema has the wrong length, static errors are added to each row showing the type
+    inconsistency between the schema type and the row type.
+    * If extraneous columns are present, the error is localized to the column label and an error message is shown
+      * e.g. `favorite color is not part of expected labels: name, age`.
+    * If a cell has the wrong type, the error is localized to the cell and an inconsistent-type error is shown
+      * e.g. `String inconsistent with expected type Int for label age`
+
+Note that in the following programs the errors are partially localized based on the chosen explicit type 
+application. Using different type-hole inference or choices for parametric type application would change the error
+localization and message.
+
+* `midFinal`
+  * Localizes the error to the column selection `mid` in the editor.
+  * Message: `Label mid not found in tuple's labels: name age quiz1 quiz2 midterm quiz3 quiz4 final`
+* `blackAndWhite`
+  * Localizes the error to the column selection `black and white` in the editor.
+  * Message: 
+    ``Label `black and white` not found in tuple's labels: get_acne red black white green yellow brown orange pink purple``
+* `pieCount`
+  * Localizes the errors to the column selections `true` and `get_count` in the editor.
+  * The error messages are similar to above
+* `brownAndGetAcne`
+  * Localizes the error to the column selection `brown and get acne` in the editor.
+  * The error messages are similar to above
+* `favoriteColor`
+  * Localizes the error to the column selection `favorite color` in the editor.
+  * The error message: `String is inconsistent with expected type Bool`
+* `brownJellybeans`
+  * The first version provides no feedback on the error as it uses the string column name.
+  * The second version localizes the error to the column selection `color`, with an error message similar to above.
+* `employee_to_department`
+  * Localizes an error to the column selection `last_name` in the editor
+  * Localizes another error to the tuple extension, saying the resulting row's type is inconsistent since `last_name` is
+    an `Int` but the expected type is `String`
+  * The error message: `Label department not found in tuple's labels: name age department salary`
+
+
+> Q. For each error situation that is prevented from being constructed, what is the quality of feedback to the programmer?
+N/A
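
The datasheet above describes tables as lists of labeled tuples, missing cells as `Option` values, optional results instead of errors, and projection functions as a more type-safe alternative to string column names. Since Hazel's own syntax for these features is not shown in this diff, the following is only a rough OCaml analogue of those ideas. The record type, the sample rows, and the names `select_column`, `get_only_row`, and `drop_missing` are all illustrative assumptions, not the PR's Hazel code.

```ocaml
(* Hypothetical OCaml analogue of a Hazel table: a list of records.
   Record fields stand in for the fields of Hazel's labeled tuples. *)
type student = {
  name : string;
  age : int;
  favorite_color : string option; (* None models a missing cell *)
}

(* Illustrative rows only; not copied from the benchmark. *)
let students : student list = [
  { name = "Bob"; age = 12; favorite_color = Some "blue" };
  { name = "Alice"; age = 17; favorite_color = Some "green" };
  { name = "Eve"; age = 13; favorite_color = None };
]

(* Column selection via a projection function: a misspelled field is
   rejected by the type checker instead of failing at lookup time. *)
let select_column (table : 'row list) (project : 'row -> 'a) : 'a list =
  List.map project table

(* Returns None rather than raising when the table does not contain
   exactly one row; the row count is not tracked statically. *)
let get_only_row (table : 'row list) : 'row option =
  match table with
  | [ row ] -> Some row
  | _ -> None

(* Keeps only the rows whose projected optional cell is present. *)
let drop_missing (table : 'row list) (project : 'row -> 'a option) : 'row list =
  List.filter (fun row -> Option.is_some (project row)) table
```

For example, `select_column students (fun r -> r.name)` yields the name column, while a projection such as `fun r -> r.nam` would be a compile-time error, which mirrors the localization the datasheet reports for the function-based `brownJellybeans` variant.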
@@ -0,0 +1,19 @@
+open Haz3lcore;
+open Language;
+let content = [%blob "Datasheet.md"];
+
+let content: string = content |> Util.StringUtil.escape_linebreaks;
+let string_exp = IdTagged.FreshGrammar.Exp.string(content);
+let segment =
+  ProjectorInit.init(
+    TextArea,
+    Segment.parenthesize(
+      ExpToSegment.exp_to_segment(
+        ~settings=ExpToSegment.Settings.editable(~inline=true),
+        string_exp,
+      ),
+    ),
+    Exp(string_exp),
+  )
+  |> Option.get;
+let slide = ("B2T2 / Datasheet", PersistentSegment.persist([segment]));
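
`Datasheet.re` above embeds `Datasheet.md` with `ppx_blob` and passes the contents through `Util.StringUtil.escape_linebreaks` before building a string expression. That helper's implementation is not part of this diff. As a guess at the kind of transformation involved, the OCaml sketch below replaces each newline character with the two-character sequence `\n`; the name `escape_linebreaks_sketch` is hypothetical and the real helper may behave differently.

```ocaml
(* Hypothetical stand-in for Util.StringUtil.escape_linebreaks.
   Assumption: each newline character becomes a backslash followed by 'n'. *)
let escape_linebreaks_sketch (s : string) : string =
  String.split_on_char '\n' s |> String.concat "\\n"
```

Under that assumption, a multi-line markdown file becomes a single-line string that can be placed inside one string literal.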
@@ -0,0 +1,41 @@
+# B2T2 Implementation in Hazel
+
+This directory contains Hazel's implementation of the B2T2 (Brown Benchmark for Table Types) benchmark.
+
+## What is B2T2?
+
+B2T2 is a language design benchmark for evaluating type systems for table programming. It provides a standardized framework to compare the expressive power and diagnostic quality of different programming languages and systems when handling tabular data operations.
+
+The benchmark was created by researchers at Brown University and is documented in the paper:
+
+**"Types for Tables: A Language Design Benchmark"**  
+Authors: Kuang-Chen Lu, Ben Greenman, Shriram Krishnamurthi  
+Published in: The Art, Science, and Engineering of Programming, 2022
+
+- **Paper**: https://cs.brown.edu/~sk/Publications/Papers/Published/lgk-b2t2/
+- **Repository**: https://github.com/brownplt/B2T2
+
+## What is this Directory?
+
+This directory contains Hazel's implementation and evaluation of the B2T2 benchmark. This implementation demonstrates how well Hazel's type system handles table programming constructs.
+
+The implementation includes:
+- **Datasheet** (`Datasheet.md`): A comprehensive evaluation of how Hazel addresses each component of the B2T2 benchmark
+- **Implementation** (`Datasheet.re`): Code used to turn the markdown datasheet into a documentation slide in the editor
+- **Documentation Slides** (`slides/`): Interactive examples demonstrating B2T2 concepts in Hazel
+- **Slides Module** (`Slides.re`): Aggregates all B2T2 slides for integration into Hazel's documentation system
+
+## B2T2 Benchmark Components
+
+The B2T2 benchmark consists of several key components that implementations must address:
+
+1. **Table Definition**: Specification of what constitutes a table in the language
+2. **Example Tables**: Various table structures that must be expressible
+3. **Table API**: A standard library of table operations (filtering, joining, grouping, etc.)
+4. **Example Programs**: Real-world programs that manipulate tables
+5. **Error Scenarios**: Common programming errors and how well the type system catches them
+6. **Datasheet**: Structured evaluation of the implementation's capabilities
+
+## Documentation Slides
+
+The slides are organized in `Slides.re` and automatically loaded into Hazel's documentation system via `src/web/init/Init.re`.
@@ -0,0 +1,42 @@
+let all_slides = [
+  Datasheet.slide,
+  B2T2ExampleTables.out,
+  B2T2TableAPIConstructorsemptyTable.out,
+  B2T2TableAPIConstructorsaddRows.out,
+  B2T2TableAPIConstructorsaddColumn.out,
+  B2T2TableAPIConstructorsbuildColumn.out,
+  B2T2TableAPIConstructorsvcat.out,
+  B2T2TableAPIConstructorshcat.out,
+  B2T2TableAPIConstructorsvalues.out,
+  B2T2TableAPIConstructorscrossJoin.out,
+  B2T2TableAPIConstructorsleftJoin.out,
+  B2T2TableAPIProperties.out,
+  B2T2TableAPIAccessSubcomponents.out,
+  B2T2TableAPISubtable.out,
+  B2T2TableAPIOrdering.out,
+  B2T2TableAPIAggregate.out,
+  B2T2TableAPIMissingValues.out,
+  B2T2TableAPIDataCleaning.out,
+  B2T2TableAPIUtilitiesFlatten.out,
+  B2T2TableAPIUtilitiestransformColumn.out,
+  B2T2TableAPIUtilitiesrenameColumns.out,
+  B2T2TableAPIUtilitiesfind.out,
+  B2T2TableAPIUtilitiesgroupByRetentive.out,
+  B2T2TableAPIUtilitiesgroupBySubtractive.out,
+  B2T2TableAPIUtilitiesupdate.out,
+  B2T2TableAPIUtilitiesselect.out,
+  B2T2TableAPIUtilitiesselectMany.out,
+  B2T2TableAPIUtilitiesgroupJoin.out,
+  B2T2TableAPIUtilitiesjoin.out,
+  B2T2ExampleProgramsDotProduct.out,
+  B2T2ExampleProgramspHackingHomogeneous.out,
+  B2T2ExampleProgramspHackingHeterogeneous.out,
+  B2T2ExampleProgramsquizScoreFilter.out,
+  B2T2ExampleProgramsquizScoreSelect.out,
+  B2T2ExampleProgramsgroupByRetentive.out,
+  B2T2ExampleProgramsgroupBySubtractive.out,
+  B2T2ErrorsMalformedTables.out,
+  B2T2ErrorsUsingTablesPart1.out,
+  B2T2ErrorsUsingTablesPart2.out,
+  B2T2ErrorsUsingTablesPart3.out,
+];
@@ -0,0 +1,28 @@
+(include_subdirs unqualified)
+
+(library
+ (name b2t2)
+ (libraries haz3lcore)
+ (js_of_ocaml)
+ (instrumentation
+  (backend bisect_ppx))
+ (preprocess
+  (pps
+   ppx_yojson_conv
+   js_of_ocaml-ppx
+   ppx_let
+   ppx_blob
+   ppx_sexp_conv
+   ppx_enumerate
+   ppx_deriving.show
+   ppx_deriving.eq))
+ (preprocessor_deps
+  (file Datasheet.md)))
+
+(env
+ (dev
+  (js_of_ocaml
+   (flags :standard --debuginfo --noinline --dynlink --linkall --sourcemap)))
+ (release
+  (js_of_ocaml
+   (flags :standard))))