diff --git a/README.md b/README.md index fae6cd2c..3bffbdfd 100644 --- a/README.md +++ b/README.md @@ -23,13 +23,16 @@ +[![GitHub tag (latest SemVer)](https://img.shields.io/github/v/tag/jeffmay/vapors)](https://img.shields.io/github/v/tag/jeffmay/vapors) +[![codecov](https://codecov.io/gh/jeffmay/vapors/branch/v1/graph/badge.svg?token=M1WYH3T0XA)](https://codecov.io/gh/jeffmay/vapors) + # Vapors The Vapors library provides an embedded-DSL for writing expressions that look similar to the Scala collections library and can be interpreted to produce a function that computes the result given a set of source facts. -These expressions are applicative and can be optimized, evaluated, serialized, and interpreted in various ways -using a tagless final recursive visitor. +These expressions are descriptions of computations that can be optimized, serialized, and interpreted in various ways +using a tagless final recursive visitor (see [`Expr.Visitor`](core-v1/src/main/scala/algebra/Expr.scala)). This library is built on top of Scala's `cats` library for typeclass definitions. See the [Implementation](#implementation) section for more details. @@ -42,13 +45,13 @@ _(The slides are a little bit out of date, but the basic ideas are the same)_ ## Setup -1. **Add it to your `build.sbt`** +1. **Add it to your `build.sbt`** ```sbt - libraryDependencies += "com.rallyhealth" %% "vapors" % "0.16.0" + libraryDependencies += "com.rallyhealth" %% "vapors-v1" % "1.0.0-M2" ``` -2. **Define your fact types.** +2. **Define your fact types.** Every fact type must have a unique name as well as a Scala type. You can define a `FactType` with any Scala type you want, so long as it defines an `Order`. 
This is so the facts can be pulled from the `FactTable` in a reasonable @@ -59,12 +62,12 @@ _(The slides are a little bit out of date, but the basic ideas are the same)_ val DateOfBirth = FactType[LocalDate]("DateOfBirth") val Role = FactType[Role]("Role") } - + sealed trait Role object Role { case object Admin extends Role case object User extends Role - + implicit val order: Order[Role] = Order.reverse[Role] { Order.by { case Admin => 1 @@ -73,71 +76,94 @@ _(The slides are a little bit out of date, but the basic ideas are the same)_ } } ``` - + _Typically, if you have it, you'll order things by timestamp. Stay tuned for a data type that encapsulates `Timestamped` facts._ - -3. **Craft your expressions.** - You must start with either a `RootExpr` or a logical operator (such as `and` / `or`). You will typically - build a `RootExpr` by filtering to a specific type of fact and computing something from the facts found. +3. **Import the DSL.** + + You get different behavior and add support for different interpreters based on which DSL you use. + + If you just want to compute an expression quickly, with no justification wrapper, then you can use: + ``` + import com.rallyhealth.vapors.v1.dsl.uncached._ + ``` + + If you want your expression to produce a [`Justified`](core-v1/src/main/scala/data/Justified.scala) wrapped value, + then you can use: + ``` + import com.rallyhealth.vapors.v1.dsl.uncached.justified._ + ``` + + Now when you call `.run()` or `.runWith()` on your expression, you can traverse the tree of justification for what + operations produced that value. + +4. **Craft your expressions.** + + Typically, you will start your expression using a value from the `FactTable`. You can use the `valuesOfType` + operation to do so. 
+ Since every `FactType` can have multiple instances defined in the `FactTable`, they must also define an `Order`, so that the facts can come out of the `FactTable` in some natural ordering: - + ```scala - import cats.implicits._ - import com.rallyhealth.vapors.dsl._ - + import com.rallyhealth.vapors.v1.dsl.uncached._ + val isAdmin = { - factsOfType(FactTypes.Role).exists(_.value === Role.Admin) + valuesOfType(FactTypes.Role).exists(_ === Role.Admin) } val isOver18 = { - factsOfType(FactTypes.DateOfBirth).exists { fact => - dateDiff(fact.value, today, const(ChronoUnit.YEARS)) >= 18 + valuesOfType(FactTypes.DateOfBirth).exists { fact => + dateDiff(fact, today, ChronoUnit.YEARS.const) >= 18.const } } ``` - -4. **Feed your facts and expression into the evaluator.** + +5. **Run your expression with a set of facts.** Assuming you have these facts: - + ```scala import FactTypes._ - + val dob = FactTypes.DateOfBirth(LocalDate.of(1980, 1, 1)) val adminRole = FactTypes.Role(Role.Admin) val userRole = FactTypes.Role(Role.User) - + val facts = FactTable( dob, adminRole, userRole ) ``` - - You can then evaluate the expression to get the output and the `Evidence` that was used to prove the result. - + + If you use the `justified` DSL, then you can get the list of facts (i.e. the `Evidence`) used to produce the output. 
+ ```scala - import com.rallyhealth.vapors.dsl._ - - val isAdminResult = eval(facts) { - isAdmin - } - assert(isAdminResult.output.value) - assert(isAdminResult.output.evidence == Evidence(adminRole)) - - val combinedResult = evalWithFacts(facts) { - and(isAdmin, isOver18) - } - assert(combinedResult.output.value) - assert(combinedResult.output.evidence == Evidence(adminRole, dob)) + import com.rallyhealth.vapors.v1.dsl.uncached.justified._ + + val isAdminResult = isAdmin.run(facts) + assert(isAdminResult.value) + assertEquals(isAdminResult.evidence, Evidence(adminRole)) + + val combinedResult = and(isAdmin, isOver18).run(facts) + assert(combinedResult.value) + assertEquals(combinedResult.evidence, Evidence(adminRole, dob)) + ``` + + If your expression requires anything other than `Any`, then you can only call `.runWith()` to provide the required + input of the expected type. + + ```scala + val isGt3 = ident[Int] > 3.const + assert(isGt3.runWith(5)) + compileErrors("isGt3.run()") ``` - + ## Complex FactTypes The type of operators that are available for a type are dependent on the instances of cats typeclasses -that are available in implicit scope. So for example, if you want to be able to use `<`, `>=`, `===`, etc +that are available in implicit scope. So for example, if you want to be able to use `<`, `>=`, `===`, `=!=`, etc then you need to define an `Order` instance for your type. 
```scala @@ -151,289 +177,563 @@ object FeetAndInches { # Terminology and Problem Statement -In essence, every expression can be optimized and evaluated to produce a function: +In essence, every expression (`Expr[I, O, OP]`) can be optimized and interpreted to produce a function like: + +```scala +(FactTable, I) => O ``` -type Evaluate[A, P] = Expr[FactTable, A, P] => (FactTable => ExprResult[A, P]) + +NOTE: If the `I` type is `Any`, then the framework will substitute +[`ExprState.Nothing`](core-v1/src/main/scala/data/ExprState.scala#L84) for the input and allow you to treat the +function like: + +```scala +FactTable => O ``` -Note that the type `P` is the captured parameter that can be used when further analyzing -the `ExprResult` and how it came up with the output value and evidence. +The goal of this library is to provide a way to build serializable and introspectable definitions for facts. These +can also be thought of as "rules". Thus making this a rules engine library where the "rules" are built using a +strongly-typed, embedded domain-specific language (eDSL) using the Scala compiler. + +You might wonder: Why not just use an existing rules engine DSL? Or why not just serialize native Scala code? + +1. **Justification** + + The limited DSL allows computing a value alongside the chain of inferences that justifies that value from some set of source facts, config values, and / or constants. + + While many rules engines support serialization and ways of representing this capability, AFAICT none explicitly support it. +2. **Type-Safety** + + By using an embedded DSL, we take advantage of the Scala compiler to catch errors at compile-time and provide an IDE experience. + + While a goal of this project is to allow parsing expressions from files or strings, it should only be done so in a manner that does not violate strongly-typed schemas. In the meantime, we offer capabilities to operate on algebraic data types using `shapeless.HList` operations. + +3. 
**Customization** + + By embedding the construct within a fully-featured language, the library user is empowered to interpret and customize the DSL to suit their own needs with the full power of the Scala programming language and an interface that is designed for extension. + +## Facts + +### Data Types + +#### Fact + +Every `Fact` -- whether a _source fact_ or _derived fact_ -- is produced from a `FactType`. + +#### FactType + +A `FactType[T]` is constructed with both a `String` name and a Scala type parameter. The `FactType[T]` then acts like a +function that produces `TypedFact[T]` instances given values of type `T`. + +```scala +val Age = FactType[Int]("age") +val sourceFact = Age(23) ``` -type Analyze[G[_], A, P] = ExprResult[A, P] => G[A] + +#### TypedFact + +All `TypedFact[_]` instances extend `Fact` which has a dependent type member called `Value` which is assigned to the +type parameter. This allows using these `Fact` instances inside invariant collections (like `Set[Fact]`). You could +also conceivably match on the `.factType` of a `Fact` to determine the `Value` type from the original `FactType[T]`. + +#### FactTypeSet + +In the case that you need to handle multiple facts with the same Scala type, but different names (thus different +`FactType[T]` definitions), you can use a `FactTypeSet[T]`. This can be useful if you are able to handle a common +supertype of multiple `FactType[_]` definitions in a common way. All operations that handle a single `FactType[T]` +can also handle a `FactTypeSet[T]`. Just be careful not to combine two different fact types with the same Scala type +that have different units of measurement. For example, `FactType[Double]("weight_kg")` and +`FactType[Double]("weight_lbs")` might have the same Scala type, `Double`, but you should be careful not to perform +calculations that are only valid if you are operating in the metric system. + +#### FactTable + +Every expression can be computed by providing a `FactTable`.
This is a map of `FactType`s to instances of those +`TypedFact`s that can be retrieved at any point in the computation with `O(1)` performance. + +### Terminology + +#### Source Facts + +When computing an expression, you will often want to feed the expression some starting set of **source facts**. These +facts are available in all computations using the `valuesOfType` operation, which can retrieve all instances of these +facts from the `FactTable` that was initially provided. + +There is no distinction between **source facts** and **derived facts** beyond the understanding of when they were +defined. If they are defined as part of the process of evaluating an expression, then they are "derived", otherwise, +they must have been provided to the `FactTable` as "source" facts. + +#### Derived Facts + +You can use a `define` operation to create a **definition expression**, which can be provided to a `using` expression +to compute some values from the result of that definition expression having added all the values computed by the +expression into the `FactTable`. 
+ + ```scala +import com.rallyhealth.vapors.v1.data.{FactType, FactTable} +import com.rallyhealth.vapors.v1.dsl.uncached._ + +import java.time.LocalDate +import java.time.temporal.ChronoUnit + +val Age = FactType[Int]("age") +val DateOfBirth = FactType[LocalDate]("date_of_birth") +val defineAgeFromDob = define(Age).fromSeq { + valuesOfType(DateOfBirth).map { dob => + dateDiff(dob, today, ChronoUnit.YEARS.const) + } +} +val isOver18 = using(defineAgeFromDob).thenReturn { + _.forall(_ >= 18.const) +} +// This will be true when run any day after 2007-12-31 +assert(isOver18.run(FactTable(DateOfBirth(LocalDate.of(1990, 1, 1))))) ``` ## Expression Algebra - + + + - - + + + + - + + + - - + + + + - + + + - - - - - - + + + - - + + + + - - + + + + + + + + + + + - - + + + + + + + + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + + + + + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + + - - + + + +
ExprExpr Signature*Unwrapped DSL Type DSL ExampleUnwrapped Function** Description
ConstOutput[V, R, P]const(1)Const[+O]Any ~:> O1.const_ => 1 Ignores the input and just returns a constant value.
ReturnInput[V, P]Identity[I]I ~:> I - concat(input, input)
- where the input is some collection type that can be concatenated with itself. + ident[Int] + 1
+ an expression that adds 1 to its input.
(_: Int) + 1 - The identity function. It just returns the input to the expressions. This is vital for the inner workings of the - builder-style DSL. In an ideal world these would be optimized out of the resulting expression evaluators. + The identity function. It just returns the input as the output. This has some applications for direct use, but is + more commonly used to support the builder-style DSL. +
+ In an ideal world, these would be optimized out of the resulting expressions.
These are not often used directly.
Embed[V, R, P] - concat(stringToList, embed(const(List(1))))
-
- where stringToList is some Expr[String, List[Int], P] -
AndThen[-II, +IO, -OI, +OO]II ~:> OO + ident[Seq[Seq[Int]]] +   .andThen(Expr.Flatten()) + + (identity[Seq[Seq[Int]]] _) +   .andThen(_.flatten) + - Ignores the input and just passes the FactTable as the result. The only type of expressions that can be - embedded are RootExprs which only depend on the FactTable.
- In order to embed a constant at the same level, you can use the builder's .embedConst() method. -
-    NOTE: Typically these are applied automatically by implicit conversion +    Chain the output of the first expression to the input of a second expression to produce the output of the second +    expression. This is the foundation for most of the DSL. Many operations only operate on a certain shape of input +    and rely on the `andThen` operation to chain the output of a previous expression into the input.
WithFactsOfType[T, R, P]ValuesOfType[T, +O]Any ~:> Seq[O] - valuesOfType(Age).exists(_ > 21)
+ valuesOfType(Age)

where Age is some FactType[Int]
(_, ft: FactTable) =>
+  ft.getSortedSeq(FactTypeSet.one(Age))
+    .map(_.value)
+
Selects all the facts that have a FactType in the given FactTypeSet. The result of the - expression will be a FoldableExprBuilder that can be filtered, mapped, etc to produce some - typed computation result. + expression will be an Any ~:> Seq[O] that can be filtered, mapped, etc to produce some typed + computation result.
+ NOTE: The only reason for the difference between `T` and `O` is that some DSLs apply a wrapper to every value in + the DSL, so the output type might differ from the `FactType`
Definition[P]N / A - Not an expression node you can construct, but rather a sealed trait that hides the type parameters so that multiple - definitions can be put into a list. The only subclass is the Expr.Define node. + Definition[-I] +

/

+ Define[-I, +C[_] : Foldable, T, OP]
Define[M[_] : Foldable, T, P]define(Age).from(const(30))I ~:> Seq[TypedFact[T]]define(Age).from(21.const)(age: Int, ft: FactTable) => ft.add(Age(age)) - Creates a Definition for a given FactType with a given Expr node. These can - be used in the UsingDefinitions node. + Creates a Definition[I, OP] for a given FactType[T] with a given Expr node. + These can be used in the using expression to guarantee that the facts generated for the given + `FactType` are available for use in the `FactTable` in subsequent expressions within that scope.
UsingDefinitions[V, R, P]usingDefinitions(ageDef) { valuesOfType(Age).exists(_ > 21) }UsingDefinitions[-I, +O]I ~:> Ousing(defineAge).thenReturn(_.exists(_ > 21))
(i: I, ft: FactTable) =>
+  ft.addAll(defineAge(i))
+    .getSortedSeq(Age)
+    .map(_.value)
+    .exists(_ > 21)
+
Adds the given definitions to the FactTable and makes them available to the expression provided in the following block.
And[V, R, P]and(const(true), const(false))Combine[-I, -LI, +LO, -RI, +RO, +O]I ~:> O1.const + 2.constAdd[Int, Int, Int].add(1, 2) + Combine 2 expressions using a pre-defined binary operation. Every operation has a name and a function that combines + the left-side input (LI) with the right-side input (RI) that will be provided the output + of the two input expressions: left-side output (LO) and right-side output (RO). + The result of the binary operation (O) is the final output.
+ Justified[Boolean] defines all operations using Justified.byInference with the operation + name and chains both the left and right side justification. +
And[-I, +B : AsBoolean, W[+_]]I ~:> Booleanand(true.const, false.const)Seq(true, false).forall(identity[Boolean]) Combine expressions that return a value for which there is a Conjunction and combines the results - using the given definition for conjunction. Evidence defines Conjunction by returning - the union of the evidence of all expressions that are falsy if any expressions are falsy, otherwise the union - of all the evidence of all the given expressions that are truthy. + using the given definition for conjunction.
+    Justified[Boolean] defines Conjunction by returning the union of the evidence of all +    expressions that are falsy if any expressions are falsy, otherwise the union of all the evidence of all the +    given expressions that are truthy.
Or[V, R, P]or(const(true), const(false))Or[-I, +B : AsBoolean, W[+_]]I ~:> Booleanor(true.const, false.const)Seq(true, false).exists(identity[Boolean]) Combine expressions that return a value for which there is a Disjunction and combines the results - using the given definition for disjunction. Evidence defines Disjunction by returning - the union of the evidence of all expressions that are truthy if any expressions are truthy, otherwise the union - of all the evidence of all the given expressions that are falsy. + using the given definition for disjunction.
+ Justified[Boolean] defines Disjunction by returning the union of the evidence of all + expressions that are truthy if any expressions are truthy, otherwise the union of all the evidence of all the + given expressions that are falsy. +
Not[-I, +B : AsBoolean, W[+_]]I ~:> Booleannot(true.const)!true + Convert a falsy expression to a truthy expression (or vice versa) using a provided concept of + Negation
The Evidence of a Justified value is not altered when the + value is negated.
Not[V, R, P]not(const(true))Match[-I, +S, +B : AsBoolean, +O]I ~:> O +
ident[Person].matching(
+  Case[Admin] ==> Role.Admin.const,
+  Case[Employee].when {
+    _.get(_.select(_.role)) >= Role.Editor.const
+  } ==> ident[Employee].get(_.role),
+)
+
(_: Person) match {
+  case _: Admin => Some(Role.Admin)
+  case e: Employee if e.role >= Role.Editor => Some(e.role)
+  case _ => None
+}
+
-    Convert a falsy expression to a truthy expression (or vice versa) using a provided concept of Negation - The Evidence never be impacted by this operation. + Constructs an expression that matches on one of a series of branches that first matches on a subclass, then applies + a filter expression, and finally, if the type is correct and the guard expression returns true, then it returns the + output of the expression on the right-hand side of the ==> wrapped in a Some. If no branch + matches the input, then the expression returns None. See also + Expr.MatchCase[-I, S, +B, +O].
When[V, R, P]when(const(true)).thenReturn(const(1)).elseReturn(const(0))When[-I, +B : AsBoolean, +O]I ~:> O +
when(true.const)
+  .thenReturn(1.const)
+  .elseReturn(0.const)
+
if (true) 1 else 0 Evaluates the first subexpression of the given sequence of condition branch expressions that evaluates to - true. + true. This is less powerful than .matching(...) in that it does not allow downcasting to + a subclass before applying the condition expression. However, it does not return an Option and can be + simpler and more clear in certain cases (including Case[A].when() conditions).
SelectFromOutput[V, S, R, P]const(Map("example" -> "value")).withOutputValue.get(_.at("example"))Select[-I, A, B, +O]I ~:> Oident[Map[String, String]].get(_.at("example"))(_: Map[String, String]).get("example") Select a field from a product type, map, or sequence. Returns either an Option or strict value - depending on the type of Indexed instance is available for the type. + depending on which Indexed instance is available for the input type and key type combination.
FilterOutput[V, M[_] : Foldable : FunctorFilter, R, P]valuesOfType(Age).filter(_ > 21)Filter[C[_], A, +B : AsBoolean, D[_]]C[A] ~:> D[A]ident[NonEmptySeq[Int]].filter(_ > 1.const)(_: NonEmptySeq[Int]).filter(_ > 1) - Keeps elements of the given Functor that match the given expression and discards the others. + Keeps elements of the given collection for which the filter expression returns true and discards the others.
MapOutput[V, M[_] : Foldable : Functor, U, R, P]const(List(1, 2, 3, 4)).withOutputFoldable.map(_ * 2)Match[I, S, B : AsBoolean, O]I ~:> O
ident[RoleClass].matching(
+  Case[Editor].when(_.get(_.select(_.articles)) > 0.const) ==>
+    ident[Editor].get(_.select(_.permissions))
+)
(_: RoleClass) match {
+  case e: Editor if e.articles > 0 => Some(e.permissions)
+  case _ => None
+}
- For every value in the given Functor, apply the given expression. + Matches the given cases in the order provided. If any branch matches both the expected type and the guard + expression, if any, then the result of the expression on the right-hand side of the ==> will be + returned inside a Some(_). If no Cases match, then the expression returns + None.
GroupOutput[V, M[_] : Foldable, U : Order, K, P]valuesOfType(FactTypes.Prediction).groupBy(_.get(_.select(_.model)))MapEvery[C[_] : Functor, A, B]C[A] ~:> C[B]ident[List[Int]].map(_ * 2.const)(_: List[Int]).map(_ * 2) - Group the values of the Foldable by key located at the given NamedLens and return the - MapView. + For every value in the given Functor + apply the given expression to create a new collection of the same type with the elements produced by the given + expression.
SortOutput[V, M[_], R, P]const(List(2, 4, 3, 1)).withOutputFoldable.sortedFlatten[C[_] : FlatMap, A]C[C[A]] ~:> C[A]ident[List[List[Int]]].flatten(_: List[List[Int]]).flatten - Sort the values using the given ExprSorter -- either a given natural Order[R] of the - return type or the Order of a field selected by a given NamedLens). + Uses the definition of FlatMap[F[_]] to + flatten the outer collection (C[C[A]]) into a new collection of all the elements in order + (C[A]). This can be used to construct a .flatMap() operation.
ConcatOutput[V, M[_] : MonoidK, R, P]concat(const(List(1, 2)), const(List(3, 4)))GetOrElse[I, O]I ~:> Oident[Option[Int]].getOrElse(0.const)(_: Option[Int]).getOrElse(0) - Concatenates the output of the given expressions that return the same MonoidK[M] to produce a single - M[R] with all elements of all the monoids in the given order. + Calls the Option.getOrElse method on the result of an expression that returns an Option[O] + and provides the evaluation of a default value expression to be computed if the original optional result is empty.
FoldOutput[V, M[_] : Foldable, R : Monoid, P]const(List("hello", "world")).withOutputFoldable.foldSorted[C[_], A]C[A] ~:> C[A]ident[List[Int]].sorted(_: List[Int]).sorted - Folds the output of the given expressions into a single value of the same type, given a Monoid[R]. + Sort the values using the given ExprSorter -- either a given natural Order[R] of the + return type or the Order of a field selected by a given NamedLens).
WrapOutputSeq[V, R, P]wrapSeq(const(1), const(2), const(3), const(4))FoldLeft[-I, +C[_] : Foldable, A, O]I ~:> Oident[List[Int]].foldLeft(0.const) { _ + _ }(_: List[Int]).foldLeft(0) { _ + _ } - Wraps the output of the sequence of given expressions into a sequence of the results. + Folds the output of the given expressions into the result of some initial expression.
WrapOutputHList[V, L <: HList, R, P]wrap(const(1), const("two")).zippedToShortest.asTupleSequence[+C[+_] : Applicative : SemigroupK : Traverse, -I, +O]I ~:> C[O]wrapAll(NonEmptySeq.of(1.const, 2.const))NonEmptySeq.of(_ => 1, _ => 2).map(_(())) - Wraps the output of a heterogeneous list of given expressions into an HList of the return types. + Wraps the output of a sequence of given expressions into an expression that produces the same type of sequence from + the results of every expression given the same input.
ZipOutput[V, M[_] : Align : FunctorFilter, L <: HList, R, P]ToHList[-I, +L <: HList]I ~:> Lwrap(1.const, "two".const).toHList((_ => 1)(()) :: (_ => "two")(()) :: HNil) - Zips a heterogeneous list of expressions into a single HList of the results. + Wraps the output of a heterogeneous ExprHList of given expressions into an HList of the + return types. +
Convert[-I, +O]I ~:> O(1 :: "two" :: HNil).const.as[(Int, String)]Generic[(Int, String)].from(1 :: "two" :: HNil) + Converts an HList to any type that can be isomorphically converted using shapeless.Generic
OutputIsEmpty[V, M[_] : Foldable, R, P]ZipToShortestHList[-I, W[+_], +WL <: HList, +UL <: HList]I ~:> W[UL]wrap(seq(1.const, 2.const), seq("1".const)).zipToShortest(Seq(1, 2) :: Seq("1") :: HNil) - Returns true if the output of the expression is an empty Foldable, otherwise - false. + Zips a heterogeneous list of expressions into a single HList of the results.
TakeFromOutput[V, M[_] : Traverse : TraverseFilter, R, P]Repeat[-I, +O]I ~:> IterableOnce[O]
wrap(
+  Seq(1, 2, 3).const,
+  repeatConstForever(0.const)
+).zipToShortest
+
Seq(1, 2, 3)
+  .zip(Iterator.continually(0))
+  .map {
+    case (l, r) => l :: r :: HNil
+  }
+
- Takes a given number of elements from the front of the traversable result. + Repeats the given expression (either memoizing the result or recomputing it) infinitely (or to some limit) to + produce an expression of IterableOnce[O] that can be zipped with other sequences of known or unknown + length. This is commonly used to thread a single value produced by a prior expression into the input of a + collection operation (like .map(), .forall, etc). This helps to avoid the issue of lacking + closures in the applicative expression language.
ExistsInOutputV, M[_] : Foldable, U, P]valuesOfType(Age).exists(_ > 18)SizeIs[-I, N : AsInt, B : AsBoolean]I ~:> Bident[Seq[Int]].sizeIs > 2.const(_: Seq[Int]).sizeIs > 2 - Returns true if there exists an element in the given Foldable for which the given - predicate expression returns true, otherwise returns false. + Returns true if the output of the expression has a size that meets a given SizeComparison.
AddOutputs[V, R : Addition, P]const(1) + const(1)Slice[C[_] : Traverse, A, D[_]]C[A] ~:> D[A]ident[Seq[Int]].slice(1 <-> -1)(s: Seq[Int]) => s.slice(1, s.size - 1) - Adds the output of the given expressions using the definition of Addition provided for the output type. + Selects a range of values from the starting collection to produce an output collection. The C[_] and + D[_] types are different because if you take a slice from a NonEmptySeq, you will get a + regular Seq that can be empty. There are other collections that can differ when selected. You can even + define your own. Check out the CollectInto type-level function.
SubtractOutputs[V, R : Subtraction, P]const(1) - const(1)Exists[-C[_] : Foldable, A, B : AsBoolean]C[A] ~:> Booleanident[Seq[Int]].exists(_ > 18.const)(_: Seq[Int]).exists(_ > 18) - Subtracts the output of the given expressions using the definition of Subtration provided for the - output type. + Returns true if there exists an element in the input collection for which the given predicate + expression returns true, otherwise returns false (including if the collection is empty).
MultiplyOutputs[V, R : Multiplication, P]const(1) * const(1)ForAll[-C[_] : Foldable, A, B : AsBoolean]C[A] ~:> Booleanident[Seq[Int]].forall(_ > 18.const)(_: Seq[Int]).forall(_ > 18) - Multiplies the output of the given expressions using the definition of Multiplication provided for the - output type. + Returns true if every element in the input collection returns true from the given + predicate expression (or the collection is empty), otherwise returns false.
DivideOutputs[V, R : Division, P]const(1) / const(1)ContainsAny[-I, W[+_] : Extract, C[_] : Foldable, A, +B]I ~:> Boolean1.const in Set(1, 2, 3).constSet(1, 2, 3).contains(1) - Multiplies the output of the given expressions using the definition of Multiplication provided for the - output type. + Returns true if the output of the expression is found in the given Set[A] of values, + otherwise false.
NegativeOutput[V, R : Negative, P]-const(1)WithinWindow[-I, +V, W[+_]]I ~:> Boolean1.const <= 2.const1 <= 2 - Converts the output of the given expression using the definition of Negative to negate the value.
-
- Note: This is different than Negation in that it is an arithmetic operation not a logical operation. + Returns true if the output of the expression is contained by the given + Window[V], otherwise false.
OutputWithinSet[V, R, P]const(1) in Set(1, 2, 3)IsEqual[-I, +V, W[+_]]I ~:> Boolean1.const === 2.const1 == 2 - Returns true if the output of the expression is found in the given Set[R] of values, - otherwise false. + Returns true if the output of the left expression is equal to the output of the right expression + according to the provided definition of + EqualComparable[W, V, OP].
OutputWithinWindow[V, R, P]const(1) <= 2CustomFunction[-I, +O]I ~:> OExpr.CustomFunction("average", (xs: Seq[Int]) => xs.sum / xs.size)(xs: Seq[Int]) => xs.sum / xs.size - Returns true if the output of the expression is contained by the given Window[R], - otherwise false. + Allows you to define a custom expression node that will evaluate the given function. Note that this will bypass any + of the evidence tracking, introspectability, and other use cases for using this library in the first place, so you + should only use it for simple functions that have a well-defined behavior and name. This is mainly to allow calling + out to library code to do things like decryption, datetime calculations, statistical calculations, etc.
+* Every `Expr` has 3 parameters, `-I` input, `+O` output, and `OP[_]` the output parameter. + Since the `OP` type is repeated throughout every expression and has no real implication on the standard + behavior of the language, I have removed it from the signatures. So, for example, if we have a subclass of + `Expr` named `Op`, the signature would be shown as `Op[-I, +O]`, rather than `Op[-I, +O, OP[_]]`. + +** Every `Expr[I, O, OP[_]]` built by an unwrapped DSL is interpreted as a function `(I, FactTable) => O`, + however to simplify this column, any expression that does not depend on the `FactTable` will only use the + input parameter, `I`. + + Some operations (for example `And`, `Or`, `Not`, `WithinWindow`, and `IsEqual`) have a custom "wrapper" type `W[+_]`. This is not the ideal state for the expression + algebra, as the wrapper type is an artifact of the DSL, and not the underlying logic of the language. + Unfortunately, I was not able to figure out how to solve various issues that arose from using a single + concrete type parameter (rather than a higher-kinded parameter and a concrete inner parameter). + ## Expression Type Aliases @@ -443,95 +743,56 @@ type Analyze[G[_], A, P] = ExprResult[A, P] => G[A] - - - - - - - + + - - + + - + + + + + +
Definition
Expr[V, R, P]The super class of all expression nodes.N / A
RootExpr[R, P] - An expression that can operate on a FactTable alone to produce a value of type `R`. - Typically built using a factsOfType expression builder. - I ~:> O - Expr[FactTable, R, P] + The super class of all expression nodes. You can call .run() on an Any ~:> O to get a + Result[O]. Expr[I, O, OP]
Definition[P] - A sealed trait that contains a FactType and its appropriately typed definition. The only subclass of - this type is the Expr.Define expression node. - I =~:> O - [M[_], T] ==> Expr.Define[M, T, P] => Expr[FactTable, M[T], P] + A function from an expression of I ~:> I to produce an expression I ~:> O. Expr[I, I, OP] => Expr[I, O, OP]
CondExpr[V, P]AndThen[I, M, O] - A conditional expression that returns a Boolean. + An alias for the Expr.AndThen expression node where the intermediate type is fixed. Expr.AndThen[I, M, M, O, OP]
I >=< O - Expr[T, Boolean, P] + An alias for the Expr.WithinWindow expression node where the wrapper type and OP type are fixed. Expr.WithinWindow[I, V, W, OP]
-## FactTypeSets +## Justification -Since `FactType`s have both a Scala type parameter and a `String` name, it is possible to handle multiple facts with -the same Scala type, but different names. Typically, you should only do this if these `FactType`s are all defined -with the same underlying meaning (although, they may differ in souce or quality). In this case, you can filter an -expression to a `FactTypeSet`. +## Custom DSLs / Interpreters -## Expression Builders +Note that the type `OP` is the "output parameter" that can be used when building an interpreter (i.e. `Expr.Visitor`) +to provide type-bounded context for the output of every expression node. It can also be used when defining mathematical +or comparison operations like addition, equality, etc. Lastly, it can be used when interpreting an `ExprResult` +(more details on this later). -In order to make building the `Expr` algebra easier, we use expression builders (`ExprBuilder[V, M[_], U, P]`). An -expression builder carries type information from the output of one expression node into a function that can be used -to build an expression that depends on it as input. These functions are designed to look like standard Scala collections -except that they operate over an "Applicative" data structure, rather than an eagerly evaluated monadic collection. +In the simple `uncached` DSL, the result is computed directly without any wrapper. The interpreter produces a +streamlined and efficient function because it can ignore any work required to wrap the results. You can define custom +DSLs and interpreters that produce different results, but they will likely be slower. -To get started, you must import the DSL: - -```scala -import com.rallyhealth.vapors.dsl._ -``` - -There are two types of builders: `ValExprBuilder` and `FoldableExprBuilder` (does not always require a `Foldable` -instance, but it is the most common typeclass constraint of the operations).
- -You use the `ValExprBuilder` for operations on a single value, like comparison operations, selecting fields, etc. -The also implicitly convert to an `Expr` of the same type when needed. - -```scala -val a: ValExprBuilder[FactTable, Int, Unit] = const(1).withOutputValue -val b: ValExprBuilder[FactTable, Boolean, Unit] = a >= 1 -val c: ValExprBuilder[FactTable, LocalDate, Unit] = const(LocalDate.now()).withOutputValue -val d: ValExprBuilder[FactTable, Int, Unit] = c.get(_.select(_.getYear)) -val e: Expr[FactTable, Int, Unit] = d // implicitly converted -``` - -You use the `FoldableExprBuilder` to perform operations that can produce other foldable structures or fold the values -into a `ValExprBuilder`. These do not always implicitly convert properly to an `Expr`. There is more work to be done -here to make this builder syntax more seemless. - -```scala -val f: FoldableExprBuilder[FactTable, Seq, Int, Unit] = const(Seq(1, 2, 3, 4)).withOutputFoldable -val g: FoldableExprBuilder[FactTable, Seq, Int, Unit] = f.filter(_ < 2).map(_ * 2) -val h: ValExprBuilder[FactTable, Boolean, Unit] = g.exists(_ === 8) -``` - -As you can see, the builders are designed to look like the Scala collections libraries, but are implemented using -typeclass definitions from the `cats-core` library. - -### Example +## Example The following a more complete example of a set of facts and an evaluated query. 
```scala import cats.Order -import com.rallyhealth.vapors.dsl._ +import com.rallyhealth.vapors.v1.dsl.uncached.justified._ case class SemVer(major: Int, minor: Int, patch: Int) object SemVer { @@ -558,76 +819,121 @@ object JoeSchmoe { val weight = FactTypes.WeightLbs(260) val dateOfBirth = FactTypes.DateOfBirth(LocalDate.of(1980, 1, 1)) val predictWeightloss = FactTypes.Prediction(PredictionModel("weightloss", SemVer(2, 0, 1), 0.85)) - val facts = Facts(height, weight, dateOfBirth, predictWeightloss) + val facts = FactTable(height, weight, dateOfBirth, predictWeightloss) } object Example { - val query: RootExpr[Boolean, Unit] = { + val query: Any ~:> Boolean = { or( and( - valuesOfType(FactTypes.WeightLbs).exists(_ > 250), - valuesOfType(FactTypes.HeightFt).exists(_ < FeetAndInches(6, 0)) + valuesOfType(FactTypes.WeightLbs).exists(_ > 250.const), + valuesOfType(FactTypes.HeightFt).exists(_ < FeetAndInches(6, 0).const) ), and( - valuesOfType(FactTypes.WeightLbs).exists(_ > 300), - valuesOfType(FactTypes.HeightFt).exists(_ >= FeetAndInches(6, 0)) + valuesOfType(FactTypes.WeightLbs).exists(_ > 300.const), + valuesOfType(FactTypes.HeightFt).exists(_ >= FeetAndInches(6, 0).const) ), valuesOfType(FactTypes.Prediction).exists { prediction => and( - prediction.get(_.select(_.modelName)) === "weightloss", - prediction.get(_.select(_.score)) > 0.8, - prediction.get(_.select(_.modelVersion)) >= SemVer(2, 0, 0) + prediction.get(_.select(_.modelName)) === "weightloss".const, + prediction.get(_.select(_.score)) > 0.8.const, + prediction.get(_.select(_.modelVersion)) >= SemVer(2, 0, 0).const ) } ) } - val rs = eval(JoeSchmoe.facts)(query) - assert(rs.isTrue) - assert(rs.matchingFacts == Facts(JoeSchmoe.height, JoeSchmoe.weight, JoeSchmoe.predictWeightloss)) + val rs = query.run(JoeSchmoe.facts) + assert(rs.value) + assert(rs.evidence == Evidence(JoeSchmoe.height, JoeSchmoe.weight, JoeSchmoe.predictWeightloss)) +} +``` + +## Debugging + +Since you cannot drop a breakpoint at a 
specific point in the construction of the expression (as you probably want +access to the value when it is running, rather than the mostly visible information available when the expression is +being constructed), we provide a mechanism to attach a debug function. + +For example, you can call `.debug` on any `Expr` to provide an `ExprState[?, O] => Unit` function that will be evaluated +when the expression is run. By attaching this debug function, you also get a place where you can drop a breakpoint. + +Note: The input type of the `ExprState` provided to the debug function depends on the `Expr` subclass. This is the +primary reason why DSLs return the specific `Expr` subclass rather than their widened `I ~:> O` type. Likewise, the +specific `Expr` type is unaffected by attaching a debug function. + +```scala +import com.rallyhealth.vapors.v1.dsl.uncached._ + +object Example { + val query: Seq[String] ~:> Boolean = ident[Seq[String]].exists { + _.sizeIs > 3 + }.debug { s => + val in = s.input + val out = s.output + println(s"INPUT: $in") + println(s"OUTPUT: $out") + } } ``` # Implementation +## Core DSLs + +| Import | Status | Interpreter | Function Type | Function Description | +|:------------------------------|:-------|:-----------------|:--------------------------------------------|:-----------------------------------------------------------------------------------| +| `dsl.uncached` | ✅ | `SimpleEngine` | `Expr.State[Any, I] => O` | Produces the unwrapped value directly | +| `dsl.uncached.justified` | ☑️ | `SimpleEngine` | `Expr.State[Any, I] => Justified[O]` | Produces the `Justified` wrapped value directly | +| `dsl.standard` | ◽️ | `StandardEngine` | `Expr.State[Any, I] => ExprResult[I, O, OP]` | Produces the unwrapped value inside of an `ExprResult`, which can be reinterpreted | + +✅ - Fully implemented and tested. Only minor changes to be expected. + +☑️ - Fully implemented but with some inconsistencies. Major changes are plausible.
+ +◽️ - Partially implemented. Major changes are likely. + ## Interpreters In order to convert from an `Expr` into something useful, we must interpret the expression. -The main interpreter is the `InterpretExprAsResultFn` which converts an `Expr[V, R, P]` into a function -`ExprInput[V, P] => ExprResult[V, R, P]`, which can then be evaluated by providing the appropriate input. The -standard `eval` requires a `RootExpr` which takes a `FactTable` as input and produces some arbitrary output. You -can get both the value +The main interpreter is the engine provided by your imported DSL. If you are using an unwrapped DSL, then this is +probably the `SimpleEngine.Visitor` which converts an `Expr[I, O, OP]` into a function `(I, FactTable) => O`. -We provide a tagless-final encoded `Expr.Visitor` that you can extend to interpret `Expr` expressions built by our -embedded DSL. By extending this interface, you must handle each case of the `Expr` algebra in separate methods. +There is also a `Justified` DSL that uses the same `SimpleEngine`, but it returns a `Justified[O]` instead of just the +`O`. This allows you to follow the chain of justification for any produced output. -For example, we can interpret an `Expr` as `Json` by providing an interpreter that looks like: +Lastly, there is a `StandardEngine` which interprets the `Expr` as a function that produces an `ExprResult`, +which is another recursive data structure that mirrors the expressions used to construct it. This mirror data structure +contains all the information of the original expression combined with the state and the captured output parameters at +the point where it was evaluated. This data structure can be re-interpreted with an `ExprResult.Visitor` to serialize +the evaluation, create some kind of visualization, etc. It is much slower to capture all of this data, so this DSL is +only recommended if you need any of these capabilities. 
This is a large part of the reason why the name will likely +change from "standard" to something more descriptive of its purpose. + + This engine requires redesign. The name and implementation are likely to change fairly dramatically. + +### Custom Interpreters + +You can define your own `Expr.Visitor` to produce something other than a function. For example, we can interpret an +`Expr` as a `Json` object by providing an interpreter that looks like: ```scala import io.circe.Json +import io.circe.syntax._ object VisitExprAsJson { - type G[_] = Json + type G[-_, +_] = Json } -class VisitExprAsJson[V] extends Expr.Visitor[VisitExprAsJson.G, V, Unit] { - override def visitConstOutput[R](expr: ConstOutput[V, R, P]): Json = Json.fromString(expr.value.toString) +class VisitExprAsJson[OP[a] <: HasEncoder[a]] extends Expr.Visitor[VisitExprAsJson.G, OP] { + + implicit def encodeOutput[O](implicit op: OP[O]): Encoder[O] = op.encoder + + override def visitConst[O : OP](expr: Expr.Const[O, OP]): Json = Json.obj( + "$type" -> expr.name.asJson, + "value" -> expr.value.asJson, + ) + // ... implement all other visitX methods to produce Json } ``` - -If you want to handle all expressions similarly and don't need any of the typeclass instances, you can use the -`VisitGenericExprWithProxyFn` and just define the `visitGeneric` method. - -```scala -object VisitExprAsString { - type G[_] = String -} - -class VisitExprAsString[V] extends VisitGenericExprWithProxyFn[V, Unit, VisitExprAsString.G] { - override def visitGeneric[U, R]( - expr: Expr[U, R, P], - input: ExprInput[U], - ): String = s"$input -> $expr" -} -```
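The custom interpreter idea above may be easier to follow on a toy algebra. The following is a minimal, self-contained sketch in plain Scala that does not use Vapors at all: `MiniExpr`, `Const`, `And`, `RenderVisitor`, and `EvalVisitor` are hypothetical names invented for illustration, and the real `Expr.Visitor` has more type parameters and many more `visitX` methods. It only shows the general shape: each node dispatches to one visitor method, and the visitor's type constructor `G` decides what the interpretation produces.

```scala
// Hypothetical miniature algebra for illustration only; not the Vapors API.
sealed trait MiniExpr[A] {
  // Every node dispatches to exactly one method of the visitor.
  def visit[G[_]](v: MiniExpr.Visitor[G]): G[A]
}

object MiniExpr {

  // One method per node type; G fixes the result of the interpretation.
  trait Visitor[G[_]] {
    def visitConst[A](expr: Const[A]): G[A]
    def visitAnd(expr: And): G[Boolean]
  }

  final case class Const[A](value: A) extends MiniExpr[A] {
    def visit[G[_]](v: Visitor[G]): G[A] = v.visitConst(this)
  }

  final case class And(left: MiniExpr[Boolean], right: MiniExpr[Boolean]) extends MiniExpr[Boolean] {
    def visit[G[_]](v: Visitor[G]): G[Boolean] = v.visitAnd(this)
  }
}

object Render {
  // Ignores the output type: every node is interpreted as a String.
  type G[A] = String
}

// Interpreter 1: render the expression tree as text.
object RenderVisitor extends MiniExpr.Visitor[Render.G] {
  def visitConst[A](expr: MiniExpr.Const[A]): String = s"const(${expr.value})"
  def visitAnd(expr: MiniExpr.And): String =
    s"and(${expr.left.visit[Render.G](this)}, ${expr.right.visit[Render.G](this)})"
}

object Eval {
  // Identity type constructor: the interpretation is the value itself.
  type Id[A] = A
}

// Interpreter 2: evaluate the expression directly.
object EvalVisitor extends MiniExpr.Visitor[Eval.Id] {
  def visitConst[A](expr: MiniExpr.Const[A]): A = expr.value
  def visitAnd(expr: MiniExpr.And): Boolean =
    expr.left.visit[Eval.Id](this) && expr.right.visit[Eval.Id](this)
}
```

With these definitions, `MiniExpr.And(MiniExpr.Const(true), MiniExpr.Const(false))` renders as `and(const(true), const(false))` under `RenderVisitor` and evaluates to `false` under `EvalVisitor`. The same expression value is reused by both interpreters, which is the property the `Json` visitor above relies on.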