Add non-default ability to disable arity checking in H2O Seq[Double] features #215

@ryan-deak-zefr

Description

Currently, H2O features that produce vectors (Seq[Double]) have arity checking. This should become an option that is on by default but can be turned off.

See https://github.com/eHarmony/aloha/blob/master/aloha-h2o/src/main/scala/com/eharmony/aloha/models/h2o/json/H2oModelJson.scala#L109

Current Code

case class DoubleSeqH2oSpec(name: String, spec: String, size: Int, defVal: Option[Seq[Double]]) extends H2oSpec {
  type A = Seq[Double]
  def ffConverter[B] = f => DoubleSeqFeatureFunction(f, size)
  def refInfo = RefInfo[Seq[Double]]

  protected def sizeErr: String = s"feature '$name' output size != $size"

  // NOTE: override here and wrap spec in Option to avoid adding implicit Option lift for Seq[Double]
  override def compile[B](semantics: Semantics[B]): Either[Seq[String], FeatureFunction[B]] = {
    val wrappedSpec = s"Option($spec).map{x => require(x.size == $size, " + s""""$sizeErr"); x}"""
    semantics.createFunction[Option[Seq[Double]]](wrappedSpec, Option(defVal))(RefInfo[Option[Seq[Double]]]).right.map(f =>
      ffConverter(f.andThenGenAggFunc(_ orElse defVal)))
  }
}
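To make the effect of the synthesized wrapper concrete, here is a minimal standalone sketch of what the generated expression does at runtime. The object and method names (GeneratedCheckSketch, checked) are hypothetical and are not part of Aloha; only the Option(...).map{... require ...} shape is taken from the code above.

```scala
object GeneratedCheckSketch {
  // Hypothetical stand-in for the code string that compile synthesizes.
  // A null input becomes None (so a default value could be applied later);
  // a non-null input of the wrong size makes require throw
  // an IllegalArgumentException.
  def checked(v: Seq[Double], size: Int, name: String): Option[Seq[Double]] =
    Option(v).map { x =>
      require(x.size == size, s"feature '$name' output size != $size")
      x
    }
}
```

With this shape, a vector of the wrong size fails loudly instead of being passed to the model, which is the behavior this issue proposes to make optional.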

What to do

  1. Add an arityChecking: Boolean = true parameter to the case class
  2. Update the compile function to synthesize the proper code.
  3. Update the JSON Format to be able to read and write the option
  4. Write one test.
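Steps 1 and 2 could look roughly like the following. This is a sketch under assumptions, not the actual change: the helper name wrappedSpec and the object ArityCheckSketch are hypothetical, and the sketch only builds the code string that compile would hand to semantics.createFunction. It does not touch Semantics or the JSON format.

```scala
object ArityCheckSketch {
  // Hypothetical helper (not in Aloha): returns the spec string to compile.
  // arityChecking defaults to true so existing behavior is preserved.
  def wrappedSpec(
      name: String,
      spec: String,
      size: Int,
      arityChecking: Boolean = true
  ): String =
    if (arityChecking) {
      val sizeErr = s"feature '$name' output size != $size"
      // Same shape as the current code: wrap in Option, then check the size.
      s"""Option($spec).map{x => require(x.size == $size, "$sizeErr"); x}"""
    } else {
      // Checking disabled: keep the Option wrapping, drop the require.
      s"Option($spec)"
    }
}
```

Keeping the Option wrapping in both branches means the defVal fallback (the orElse in compile) would keep working whether or not the size check is emitted.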

Motivation

When embeddings are produced from the specification file, they are produced after the model definition already exists. If an embedding of the specified arity can't be produced, the model will err because the data produced has the wrong arity. In this situation, Aloha does the right thing, but we want to allow flexibility for practical, real-world usage.
