
Hadrian Data Format

Jim Pivarski edited this page Nov 17, 2015 · 10 revisions

PFA by itself does not define a data representation, only a type system (Avro's type system). Hadrian, as a software library rather than an application, does not require data to be serialized in a particular format. Three input formats are defined so far (Avro, JSON, and CSV), but applications using the library are encouraged to define their own input formats: whatever suits the workflow in which Hadrian is embedded.

However, data must be represented in some form for processing by PFA functions. The table below lists, for each Avro type, the data format that Hadrian uses internally.

| Avro type | Hadrian's internal format |
|-----------|---------------------------|
| null | null Java Object |
| boolean | java.lang.Boolean |
| int | java.lang.Integer |
| long | java.lang.Long |
| float | java.lang.Float |
| double | java.lang.Double |
| string | Java String |
| bytes | Java array of bytes |
| array | com.opendatagroup.hadrian.data.PFAArray |
| map | com.opendatagroup.hadrian.data.PFAMap |
| record | com.opendatagroup.hadrian.data.PFARecord |
| fixed | com.opendatagroup.hadrian.data.PFAFixed |
| enum | com.opendatagroup.hadrian.data.PFAEnumSymbols |
| union | Java Object |
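
Because a union is held as a plain Java `Object`, code that receives one may need to inspect it at runtime to find out which branch it holds. The following is a minimal sketch of that idea for the primitive rows of the mapping above, using only the JDK. The class `HadrianTypeSketch` and the method `avroPrimitiveTypeOf` are hypothetical names invented for this illustration; they are not part of Hadrian's API. The compound types (array, map, record, fixed, enum) are deliberately left out, since recognizing them would require Hadrian's own `PFA*` classes on the classpath.

```java
public class HadrianTypeSketch {
    /**
     * Hypothetical helper (not part of Hadrian): given a value in the
     * internal representation, return the name of the Avro primitive
     * type it corresponds to, or "unknown" for anything else.
     */
    static String avroPrimitiveTypeOf(Object value) {
        if (value == null) return "null";
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer) return "int";
        if (value instanceof Long) return "long";
        if (value instanceof Float) return "float";
        if (value instanceof Double) return "double";
        if (value instanceof String) return "string";
        if (value instanceof byte[]) return "bytes";
        // array, map, record, fixed, and enum values use Hadrian's
        // PFA* classes, which this JDK-only sketch does not reference.
        return "unknown";
    }

    public static void main(String[] args) {
        // Java literals autobox to the java.lang wrapper classes listed in the table.
        Object[] samples = { null, true, 3, 3L, 3.0f, 3.0, "three", new byte[] { 3 } };
        for (Object sample : samples) {
            System.out.println(avroPrimitiveTypeOf(sample));
        }
    }
}
```

Running `main` prints one Avro type name per sample, in order: `null`, `boolean`, `int`, `long`, `float`, `double`, `string`, `bytes`. Note that the literal `3` and the literal `3L` map to different Avro types (`int` and `long`), so an application feeding data to Hadrian would need to box numbers with the width that the PFA document's schema expects.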
