A lightweight YAML parser for Scala that handles a practical subset of YAML 1.2. Built on FastParse, it cross-compiles for JVM, Scala.js, and Scala Native.
Plenty of YAML parsers already exist for Scala. Here is what makes Kaitai YAML distinct:
- Portable across the whole Scala ecosystem (JVM, Scala.js, Scala Native).
- No type coercion: all scalar values are plain strings. This sidesteps most of what YAML is routinely criticized for, such as the infamous Norway problem (the country code NO read as a boolean) and version numbers being treated as doubles.
- Source positions: every YAML node carries its position, which can be used for precise offset/line/column error reporting.
- Deterministic key order: mappings preserve document order.
- Integration with FastParse: the parser can be embedded into other FastParse parsers, or, vice versa, YAML parsing can be extended with other FastParse parsers.
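
Because scalars stay plain strings, any typed interpretation is left to the caller. The following is a minimal sketch of what such opt-in conversions could look like; `ScalarConversions`, `asBoolean`, and `asInt` are hypothetical names and not part of Kaitai YAML. It works on bare strings, so it does not depend on the library.

```scala
import scala.util.Try

// Hypothetical helpers (not part of Kaitai YAML): the caller opts in to each conversion.
object ScalarConversions {
  // Accept only the two spellings this application chooses to treat as booleans.
  def asBoolean(raw: String): Option[Boolean] = raw match {
    case "true"  => Some(true)
    case "false" => Some(false)
    case _       => None
  }

  // Try keeps this compatible with Scala 2.12, which lacks String.toIntOption.
  def asInt(raw: String): Option[Int] = Try(raw.trim.toInt).toOption
}

object ScalarConversionsDemo {
  import ScalarConversions._

  def main(args: Array[String]): Unit = {
    println(asBoolean("false")) // Some(false)
    println(asBoolean("NO"))    // None: stays a string unless the caller decides otherwise
    println(asInt("8080"))      // Some(8080)
    println(asInt("1.10"))      // None: a version string is never silently read as a number
  }
}
```

With this approach, a value such as NO or 1.10 only changes type where the calling code asks for it.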
If you're looking for a slightly different feature set, consider one of these libraries:
- scala-yaml
- SnakeYAML (YAML 1.1) or SnakeYAML-Engine (YAML 1.2) in Java
  - or Scala wrappers around it like circe-yaml, MoultingYAML
  - or Java wrappers around it like Jackson
- yaml4s
- yamlesque
The library is available for Scala 2.12, 2.13, and 3.x.
// JVM
libraryDependencies += "io.kaitai" %% "kaitai-yaml" % "0.1"
// Scala.js / Scala Native
libraryDependencies += "io.kaitai" %%% "kaitai-yaml" % "0.1"

// Mill
ivy"io.kaitai::kaitai-yaml:0.1"

// Gradle
implementation("io.kaitai:kaitai-yaml_2.13:0.1")

import fastparse._
import io.kaitai.yaml._
val input = """
name: my-service
version: 1.0
tags: [scala, yaml]
config:
  debug: false
  ports:
    - 8080
    - 8443
""".trim
YamlParser.parse(input) match {
  case Parsed.Success(root, _) => println(root)
  case f: Parsed.Failure       => System.err.println(f.trace().longAggregateMsg)
}

YamlParser.parse returns fastparse.Parsed[YamlNode], the standard FastParse result type:
- Parsed.Success carries the parsed AST and the index where parsing stopped.
- Parsed.Failure provides the failure index and a .trace() with detailed diagnostics.
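
Both indices are plain zero-based character offsets into the input. As a rough illustration of what such an offset means, here is a library-free sketch that turns one into a 1-based line and column; `Offsets.lineAndColumn` is a hypothetical helper, and it assumes lines are separated by `\n` only.

```scala
// Hypothetical helper (not part of Kaitai YAML or FastParse).
object Offsets {
  // Returns 1-based (line, column) for a zero-based character offset.
  // Assumption: lines are separated by '\n' only.
  def lineAndColumn(input: String, offset: Int): (Int, Int) = {
    require(offset >= 0 && offset <= input.length, s"offset $offset out of range")
    val before = input.substring(0, offset)
    // One line per newline seen before the offset, plus the first line.
    val line = before.count(_ == '\n') + 1
    // Column counts characters since the last newline (or since the start of input).
    val column = offset - (before.lastIndexOf('\n') + 1) + 1
    (line, column)
  }
}

object OffsetsDemo {
  def main(args: Array[String]): Unit = {
    val input = "a: 1\nb: [x"
    println(Offsets.lineAndColumn(input, 0)) // (1,1)
    println(Offsets.lineAndColumn(input, 8)) // (2,4), the '[' on the second line
  }
}
```

In real code the same offset could be passed to FastParse's own index formatting instead; the sketch only spells out the arithmetic.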
val Parsed.Success(root, _) = YamlParser.parse(input): @unchecked
// YamlMap.apply throws on missing key; .get returns Option
val name = root.asInstanceOf[YamlMap]("name") // YamlScalar("my-service", ...)
val tags = root.asInstanceOf[YamlMap]("tags") // YamlSeq([...], ...)
val debug = root.asInstanceOf[YamlMap]("config")
  .asInstanceOf[YamlMap]("debug") // YamlScalar("false", ...)
println(name.asInstanceOf[YamlScalar].value) // "my-service"
println(debug.asInstanceOf[YamlScalar].value) // "false"

Pattern matching is the idiomatic way to work with the AST:
val Parsed.Success(root, _) = YamlParser.parse(input): @unchecked
root match {
  case YamlMap(fields, _) =>
    fields.foreach { case (key, value) =>
      value match {
        case YamlScalar(v, _)  => println(s"${key.value} = $v")
        case YamlSeq(items, _) => println(s"${key.value} has ${items.size} items")
        case YamlMap(_, _)     => println(s"${key.value} is a nested map")
      }
    }
  case _ => ()
}

On failure you get a Parsed.Failure with the character index where parsing stopped and a full trace:
YamlParser.parse("- [unterminated") match {
  case Parsed.Success(node, _) => // ...
  case f: Parsed.Failure =>
    println(s"Error at index ${f.index}")
    println(f.trace().longAggregateMsg)
}

Every node records where it appeared in the source text as a zero-based character offset:
val Parsed.Success(root, _) = YamlParser.parse("key: value"): @unchecked
val offset = root.pos // 0

This is the same offset scheme used by FastParse itself. To convert an offset to a line:column string, use ParserInput.prettyIndex:
val input = "key: value"
val pi = fastparse.ParserInput.fromString(input)
println(pi.prettyIndex(root.pos)) // "1:1"

The YamlParser.document parser is public, so you can embed YAML parsing inside a larger FastParse grammar:
import fastparse._, NoWhitespace._
def myFormat[A: P]: P[(String, YamlNode)] =
  P("---BEGIN---\n" ~ CharsWhile(_ != '\n').! ~ "\n" ~ YamlParser.document)
val input = "---BEGIN---\nheader-value\nkey: value\n"
fastparse.parse(input, myFormat(_)) match {
  case Parsed.Success((header, yaml), _) => println(s"$header -> $yaml")
  case f: Parsed.Failure                 => println(f.trace().longAggregateMsg)
}

YamlNode (sealed trait)
+-- YamlScalar(value: String, pos: Int)
+-- YamlSeq(items: List[YamlNode], pos: Int)
+-- YamlMap(fields: List[(YamlScalar, YamlNode)], pos: Int)
    get(key: String): Option[YamlNode]
    apply(key: String): YamlNode // throws NoSuchElementException

pos is a zero-based character offset into the source input.
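
The node shapes above lend themselves to small helpers that avoid chains of asInstanceOf. The sketch below shows one possible nested-key lookup built on get. To stay compilable without the library on the classpath, it declares local stand-ins that mirror the documented case classes; `YamlPath`, `lookup`, and `scalar` are hypothetical names, and the offsets in the demo tree are illustrative only.

```scala
// Local stand-ins mirroring the documented node shapes (assumption: with the
// real library you would import io.kaitai.yaml._ instead of declaring these).
sealed trait YamlNode { def pos: Int }
final case class YamlScalar(value: String, pos: Int) extends YamlNode
final case class YamlSeq(items: List[YamlNode], pos: Int) extends YamlNode
final case class YamlMap(fields: List[(YamlScalar, YamlNode)], pos: Int) extends YamlNode {
  // First field whose key matches, in document order.
  def get(key: String): Option[YamlNode] =
    fields.collectFirst { case (k, v) if k.value == key => v }
}

// Hypothetical helper, not part of Kaitai YAML.
object YamlPath {
  // Follows map keys one by one; None if a key is missing or a step is not a map.
  def lookup(root: YamlNode, path: String*): Option[YamlNode] =
    path.foldLeft(Option(root)) {
      case (Some(m: YamlMap), key) => m.get(key)
      case _                       => None
    }

  // Like lookup, but only succeeds when the final node is a scalar.
  def scalar(root: YamlNode, path: String*): Option[String] =
    lookup(root, path: _*).collect { case YamlScalar(v, _) => v }
}

object YamlPathDemo {
  // Hand-built tree for "config:\n  debug: false"; positions are illustrative.
  val root: YamlNode = YamlMap(
    List(
      YamlScalar("config", 0) ->
        YamlMap(List(YamlScalar("debug", 10) -> YamlScalar("false", 17)), 10)
    ),
    0
  )

  def main(args: Array[String]): Unit = {
    println(YamlPath.scalar(root, "config", "debug"))   // Some(false)
    println(YamlPath.scalar(root, "config", "missing")) // None
    println(YamlPath.scalar(root, "config"))            // None: a map, not a scalar
  }
}
```

Returning Option at every step keeps missing keys and unexpected node kinds on the same code path, which is the main reason to prefer get over apply here.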
See SUPPORTED_YAML.md for the full specification of which YAML features are and are not supported.
Requires sbt 1.x and JDK 11+.
# Run tests across all Scala versions (JVM)
sbt "+kaitaiYamlJVM/test"
# Compile for a specific platform
sbt kaitaiYamlJS/compile
sbt kaitaiYamlNative/compile
# Format and lint
sbt scalafmtAll
sbt scalafixAll

MIT -- see LICENSE for details.