| 1 | +<!DOCTYPE html> |
| 2 | +<html lang="en"> |
| 3 | +<head> |
| 4 | + <meta charset="UTF-8"> |
| 5 | + <meta name="viewport" content="width=device-width, initial-scale=1.0"> |
| 6 | + <title>STEF SDL: Schema Definition Language</title> |
| 7 | + <link href="https://fonts.googleapis.com/css?family=Roboto:400,700&display=swap" rel="stylesheet"> |
| 8 | + <link rel="stylesheet" href="./style.css"> |
| 9 | + <link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/themes/prism.css"> |
| 10 | + <link rel="stylesheet" href="./prism-stef.css"> |
| 11 | +</head> |
| 12 | +<body> |
| 13 | + <header> |
| 14 | + <h1>STEF SDL</h1> |
| 15 | + <p>Schema Definition Language<br>Define schemas for STEF serialization</p> |
| 16 | + <nav> |
| 17 | + <a href="./index.html">Home</a> |
| 18 | + <a href="https://github.com/splunk/stef">GitHub</a> |
| 19 | + <a href="https://github.com/splunk/stef/blob/main/stef-spec/specification.md">Specification</a> |
| 20 | + </nav> |
| 21 | + </header> |
| 22 | + <main> |
| 23 | + <h2>Overview</h2> |
| 24 | + <p>The STEF Schema Definition Language (SDL) is used to define schemas for STEF serialization. |
| 25 | + It provides a simple, type-safe way to describe data structures that can be efficiently serialized |
| 26 | + and deserialized using the STEF format.</p> |
| 27 | + |
| 28 | + <h2>Package Declaration</h2> |
| 29 | + <p>Every STEF schema file begins with a package declaration:</p> |
| 30 | + <pre><code class="language-stef">package com.example.myschema</code></pre> |
| 31 | + <p>Package names use dot notation and can have one or more dot-delimited components.</p> |
| 32 | + |
| 33 | + <h3>Language-Specific Package Handling</h3> |
| 34 | + <p>Different target languages handle package names differently when generating code:</p> |
| 35 | + <ul> |
| 36 | + <li><strong>Go:</strong> Uses only the last component of the package name. For example, <code>com.example.myschema</code> becomes package <code>myschema</code> in Go.</li> |
| 37 | + <li><strong>Java:</strong> Uses the full package name hierarchy. For example, <code>com.example.myschema</code> becomes package <code>com.example.myschema</code> in Java.</li> |
| 38 | + </ul> |
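<p>The two mappings above can be sketched in a few lines of Python. This only illustrates the naming rule as described here; the function names are hypothetical and are not part of <code>stefgen</code>.</p>

```python
def go_package_name(sdl_package: str) -> str:
    """Return the last dot-delimited component, as described for Go."""
    return sdl_package.split(".")[-1]


def java_package_name(sdl_package: str) -> str:
    """Return the full dotted name unchanged, as described for Java."""
    return sdl_package


print(go_package_name("com.example.myschema"))    # myschema
print(java_package_name("com.example.myschema"))  # com.example.myschema
```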
| 39 | + |
| 40 | + <h2>Comments</h2> |
| 41 | + <p>STEF SDL supports single-line comments that start with <code>//</code> and run to the end of the line:</p> |
| 42 | + <pre><code class="language-stef">// This is a comment |
| 43 | +package com.example // Comments can appear at end of lines</code></pre> |
| 44 | + |
| 45 | + <h2>Primitive Types</h2> |
| 46 | + <p>STEF SDL supports the following primitive data types:</p> |
| 47 | + <ul> |
| 48 | + <li><code>bool</code> - Boolean values (true/false)</li> |
| 49 | + <li><code>int64</code> - 64-bit signed integer</li> |
| 50 | + <li><code>uint64</code> - 64-bit unsigned integer</li> |
| 51 | + <li><code>float64</code> - 64-bit floating point number</li> |
| 52 | + <li><code>string</code> - UTF-8 encoded string</li> |
| 53 | + <li><code>bytes</code> - Binary data</li> |
| 54 | + </ul> |
| 55 | + |
| 56 | + <h2>Structs</h2> |
| 57 | + <p>Structs define composite data types with named fields:</p> |
| 58 | + <pre><code class="language-stef">struct Person { |
| 59 | + Name string |
| 60 | + Age uint64 |
| 61 | + Email string |
| 62 | +}</code></pre> |
| 63 | + |
| 64 | + <h3>Root Structs</h3> |
| 65 | + <p>The <code>root</code> attribute marks a struct as the top-level record type in a STEF stream:</p> |
| 66 | + <pre><code class="language-stef">struct Record root { |
| 67 | + Timestamp uint64 |
| 68 | + Data Person |
| 69 | +}</code></pre> |
| 70 | + <p>Multiple structs can be marked as <code>root</code> in a single schema, allowing the STEF stream to contain different types of records:</p> |
| 71 | + <pre><code class="language-stef">struct MetricRecord root { |
| 72 | + Timestamp uint64 |
| 73 | + Metric Metric |
| 74 | +} |
| 75 | + |
| 76 | +struct TraceRecord root { |
| 77 | + Timestamp uint64 |
| 78 | + Span Span |
| 79 | +}</code></pre> |
| 80 | + <p>When multiple root structs are defined, each record in the stream will be one of the root types, and the STEF format includes type information to distinguish between them during deserialization.</p> |
| 81 | + |
| 82 | + <h3>Dictionary Compression</h3> |
| 83 | + <p>Fields whose values repeat often can be dictionary-compressed with the <code>dict</code> modifier, which names the dictionary to use:</p> |
| 84 | + <pre><code class="language-stef">struct Event { |
| 85 | + EventType string dict(EventTypes) |
| 86 | + Message string |
| 87 | +}</code></pre> |
| 88 | + <p>Structs can also have dictionary compression applied:</p> |
| 89 | + <pre><code class="language-stef">struct Resource dict(Resources) { |
| 90 | + Name string |
| 91 | + Version string |
| 92 | +}</code></pre> |
| 93 | + <p>Dictionary names allow the same dictionary to be shared across multiple fields, even in different structs, as long as the fields have the same type:</p> |
| 94 | + <pre><code class="language-stef">struct MetricEvent { |
| 95 | + ServiceName string dict(ServiceNames) |
| 96 | + EventType string dict(EventTypes) |
| 97 | +} |
| 98 | + |
| 99 | +struct TraceEvent { |
| 100 | + ServiceName string dict(ServiceNames) // Same dictionary as above |
| 101 | + SpanName string dict(SpanNames) |
| 102 | +}</code></pre> |
| 103 | + <p>This sharing enables more efficient compression when the same values appear across different record types.</p> |
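<p>To see why sharing helps, here is a conceptual Python sketch of dictionary encoding. It is not the STEF wire format, which is defined in the specification and may differ. The class name <code>SharedDictionary</code> and the <code>("new", ...)</code> / <code>("ref", ...)</code> tuples are assumptions made for this illustration.</p>

```python
class SharedDictionary:
    """Toy dictionary encoder: the first sighting stores a value, repeats reuse its index."""

    def __init__(self):
        self._index_by_value = {}

    def encode(self, value):
        # A repeated value is replaced by a small integer reference.
        if value in self._index_by_value:
            return ("ref", self._index_by_value[value])
        # A new value is recorded and emitted in full once.
        self._index_by_value[value] = len(self._index_by_value)
        return ("new", value)


service_names = SharedDictionary()  # stands in for dict(ServiceNames)

# Fields from MetricEvent and TraceEvent feed the same dictionary.
encoded = [
    service_names.encode("checkout"),  # MetricEvent.ServiceName
    service_names.encode("checkout"),  # TraceEvent.ServiceName
    service_names.encode("billing"),   # MetricEvent.ServiceName
]
print(encoded)  # [('new', 'checkout'), ('ref', 0), ('new', 'billing')]
```

<p>In this sketch the second <code>"checkout"</code> costs only an index, even though it comes from a different struct, because both fields name the same dictionary.</p>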
| 104 | + |
| 105 | + <h3>Optional Fields</h3> |
| 106 | + <p>Fields can be marked as optional, meaning they may not be present in every record:</p> |
| 107 | + <pre><code class="language-stef">struct User { |
| 108 | + Name string |
| 109 | + Email string optional |
| 110 | + Phone string optional |
| 111 | +}</code></pre> |
| 112 | + |
| 113 | + <h2>Arrays</h2> |
| 114 | + <p>An array type is written by prefixing the element type with <code>[]</code>, and an array can contain zero or more elements of that type:</p> |
| 115 | + <pre><code class="language-stef">struct Container { |
| 116 | + Items []string |
| 117 | + Numbers []int64 |
| 118 | + Objects []Person |
| 119 | +}</code></pre> |
| 120 | + <p>Arrays are variable-length: the schema declares no fixed size, so an array can be empty or contain any number of elements.</p> |
| 121 | + |
| 122 | + <h2>Oneofs (Union Types)</h2> |
| 123 | + <p>Oneofs define union types that can hold one of several possible field types:</p> |
| 124 | + <pre><code class="language-stef">oneof JsonValue { |
| 125 | + String string |
| 126 | + Number float64 |
| 127 | + Bool bool |
| 128 | + Array []JsonValue |
| 129 | + Object JsonObject |
| 130 | +}</code></pre> |
| 131 | + <p>A oneof value with none of its fields set is empty and represents null, that is, the absence of a value.</p> |
| 132 | + |
| 133 | + <h2>Multimaps</h2> |
| 134 | + <p>Multimaps define key-value collections:</p> |
| 135 | + <pre><code class="language-stef">multimap Attributes { |
| 136 | + key string |
| 137 | + value AnyValue |
| 138 | +}</code></pre> |
| 139 | + <p>Multimaps can also use dictionary compression:</p> |
| 140 | + <pre><code class="language-stef">multimap Labels { |
| 141 | + key string dict(LabelKeys) |
| 142 | + value string dict(LabelValues) |
| 143 | +}</code></pre> |
| 144 | + |
| 145 | + <h2>Enums</h2> |
| 146 | + <p>Enums define named constant values:</p> |
| 147 | + <pre><code class="language-stef">enum MetricType { |
| 148 | + Gauge = 0 |
| 149 | + Counter = 1 |
| 150 | + Histogram = 2 |
| 151 | + Summary = 3 |
| 152 | +}</code></pre> |
| 153 | + <p>Enum values must be explicitly assigned unsigned integer values. Multiple number formats are supported:</p> |
| 154 | + <ul> |
| 155 | + <li><strong>Decimal:</strong> <code>MetricType = 42</code></li> |
| 156 | + <li><strong>Hexadecimal:</strong> <code>MetricType = 0x2A</code> or <code>MetricType = 0X2A</code></li> |
| 157 | + <li><strong>Octal:</strong> <code>MetricType = 0o52</code> or <code>MetricType = 0O52</code></li> |
| 158 | + <li><strong>Binary:</strong> <code>MetricType = 0b101010</code> or <code>MetricType = 0B101010</code></li> |
| 159 | + </ul> |
| 160 | + <pre><code class="language-stef">enum StatusCode { |
| 161 | + OK = 0 |
| 162 | + NotFound = 0x194 // 404 in hexadecimal |
| 163 | + InternalError = 0o764 // 500 in octal |
| 164 | + Custom = 0b1111101000 // 1000 in binary |
| 165 | +}</code></pre> |
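<p>The four literal formats can be checked with a short Python sketch. It relies on Python's built-in <code>int(text, 0)</code>, which infers the base from the prefix. The helper name <code>parse_enum_value</code> is hypothetical, and Python's parser may accept forms the SDL parser does not (for example digit-separating underscores), so treat this as a way to verify values rather than as a definition of the grammar.</p>

```python
def parse_enum_value(literal: str) -> int:
    """Parse a decimal, hex, octal, or binary literal into a non-negative int."""
    # Enum values are unsigned, so reject signs, whitespace, and empty input.
    if not literal[:1].isdigit():
        raise ValueError(f"not an unsigned integer literal: {literal!r}")
    # Base 0 tells int() to infer the base from a 0x/0o/0b prefix
    # (either letter case) and to fall back to decimal otherwise.
    return int(literal, 0)


# All seven spellings from the list above denote the same value.
for literal in ["42", "0x2A", "0X2A", "0o52", "0O52", "0b101010", "0B101010"]:
    assert parse_enum_value(literal) == 42

print(parse_enum_value("0x194"))         # 404
print(parse_enum_value("0o764"))         # 500
print(parse_enum_value("0b1111101000"))  # 1000
```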
| 166 | + |
| 167 | + <h2>Complete Example</h2> |
| 168 | + <p>Here's a comprehensive example showing various STEF SDL features:</p> |
| 169 | + <pre><code class="language-stef">package com.example.monitoring |
| 170 | + |
| 171 | +// Enum for metric types |
| 172 | +enum MetricType { |
| 173 | + Gauge = 0 |
| 174 | + Counter = 1 |
| 175 | + Histogram = 2 |
| 176 | +} |
| 177 | + |
| 178 | +// Key-value attributes |
| 179 | +multimap Attributes { |
| 180 | + key string dict(AttributeKeys) |
| 181 | + value AttributeValue |
| 182 | +} |
| 183 | + |
| 184 | +// Union type for attribute values |
| 185 | +oneof AttributeValue { |
| 186 | + StringValue string |
| 187 | + IntValue int64 |
| 188 | + FloatValue float64 |
| 189 | + BoolValue bool |
| 190 | +} |
| 191 | + |
| 192 | +// Resource information with dictionary compression |
| 193 | +struct Resource dict(Resources) { |
| 194 | + ServiceName string dict(ServiceNames) |
| 195 | + ServiceVersion string dict(ServiceVersions) |
| 196 | + Attributes Attributes |
| 197 | +} |
| 198 | + |
| 199 | +// Metric data point |
| 200 | +struct DataPoint { |
| 201 | + Timestamp uint64 |
| 202 | + Value float64 |
| 203 | + Attributes Attributes |
| 204 | +} |
| 205 | + |
| 206 | +// Main metric structure |
| 207 | +struct Metric { |
| 208 | + Name string dict(MetricNames) |
| 209 | + Type MetricType |
| 210 | + Unit string dict(Units) |
| 211 | + Description string optional |
| 212 | + DataPoints []DataPoint |
| 213 | +} |
| 214 | + |
| 215 | +// Root record type |
| 216 | +struct MetricRecord root { |
| 217 | + Resource Resource |
| 218 | + Metric Metric |
| 219 | +}</code></pre> |
| 220 | + |
| 221 | + <h2>Type References</h2> |
| 222 | + <p>STEF SDL supports forward references: a type can be referenced before it is defined in the file. |
| 223 | + The parser resolves all type references after parsing the complete schema.</p> |
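<p>Resolving references only after the whole schema is read amounts to a two-pass check, sketched below in Python. This is not the STEF parser. The input shape (a mapping from each declared type to the type names its fields use) and the function name <code>unresolved_references</code> are assumptions made for the illustration.</p>

```python
PRIMITIVES = {"bool", "int64", "uint64", "float64", "string", "bytes"}


def element_type(type_name: str) -> str:
    """Strip the array prefix, so "[]DataPoint" is checked as "DataPoint"."""
    return type_name[2:] if type_name.startswith("[]") else type_name


def unresolved_references(declarations):
    """Return (owner, type_name) pairs that name no declared or primitive type."""
    # Pass 1: collect every declared name before looking at any field.
    known = PRIMITIVES | set(declarations)
    # Pass 2: check each field type against the complete set of names.
    missing = []
    for owner, field_types in declarations.items():
        for type_name in field_types:
            if element_type(type_name) not in known:
                missing.append((owner, type_name))
    return missing


schema = {
    "MetricRecord": ["Resource", "Metric"],  # both used before they are declared
    "Resource": ["string", "string"],
    "Metric": ["string", "[]DataPoint"],
    "DataPoint": ["uint64", "float64"],
}
print(unresolved_references(schema))  # []
```

<p>Because the set of known names is complete before any field is checked, declaration order does not matter in this sketch.</p>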
| 224 | + |
| 225 | + <h3>Recursive Type Declarations</h3> |
| 226 | + <p>STEF SDL allows recursive type declarations, enabling the definition of tree-like data structures.</p> |
| 227 | + |
| 228 | + <h4>Self-Referential Types</h4> |
| 229 | + <p>A type can reference itself, useful for creating tree structures:</p> |
| 230 | + <pre><code class="language-stef">// Binary tree node |
| 231 | +struct TreeNode { |
| 232 | + Value int64 |
| 233 | + Left TreeNode optional |
| 234 | + Right TreeNode optional |
| 235 | +} |
| 236 | +</code></pre> |
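<p>As an in-memory analogue, the same shape can be written as a Python dataclass. This is only an illustration of the data model, not code produced by <code>stefgen</code>; the <code>depth</code> helper is hypothetical.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    value: int
    left: Optional["TreeNode"] = None   # mirrors "Left TreeNode optional"
    right: Optional["TreeNode"] = None  # mirrors "Right TreeNode optional"


def depth(node: Optional[TreeNode]) -> int:
    """Number of levels in the tree; an absent node contributes zero."""
    if node is None:
        return 0
    return 1 + max(depth(node.left), depth(node.right))


tree = TreeNode(1, left=TreeNode(2, left=TreeNode(4)), right=TreeNode(3))
print(depth(tree))  # 3
```

<p>In this analogue an absent child is <code>None</code>, which is what lets a branch of the tree end.</p>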
| 237 | + |
| 238 | + <h4>Mutually Referential Types</h4> |
| 239 | + <p>Multiple types can reference each other, creating more complex recursive relationships:</p> |
| 240 | + <pre><code class="language-stef">// Expression tree with operators and operands |
| 241 | +struct Expression { |
| 242 | + Node ExpressionNode |
| 243 | +} |
| 244 | + |
| 245 | +oneof ExpressionNode { |
| 246 | + Literal LiteralValue |
| 247 | + BinaryOp BinaryOperation |
| 248 | + UnaryOp UnaryOperation |
| 249 | +} |
| 250 | + |
| 251 | +struct LiteralValue { |
| 252 | + Value float64 |
| 253 | +} |
| 254 | + |
| 255 | +struct BinaryOperation { |
| 256 | + Operator string |
| 257 | + Left Expression // References back to Expression |
| 258 | + Right Expression // References back to Expression |
| 259 | +} |
| 260 | + |
| 261 | +struct UnaryOperation { |
| 262 | + Operator string |
| 263 | + Operand Expression // References back to Expression |
| 264 | +}</code></pre> |
| 265 | + <p>The STEF parser resolves these recursive references like any other type reference, after the complete schema has been parsed.</p> |
| 266 | + |
| 267 | + <h2>Syntax Rules</h2> |
| 268 | + <ul> |
| 269 | + <li>Identifiers must start with a letter and can contain letters, digits, and underscores</li> |
| 270 | + <li>Keywords are case-sensitive</li> |
| 271 | + <li>Struct, oneof, multimap, and enum names must be unique within a schema</li> |
| 272 | + <li>Field names must be unique within their containing struct/oneof/multimap</li> |
| 273 | + <li>Enum values must be unique within their enum</li> |
| 274 | + <li>Whitespace and comments are ignored during parsing</li> |
| 275 | + </ul> |
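<p>The identifier and uniqueness rules can be expressed as small checks. The Python sketch below assumes "letter" means an ASCII letter, which the rules above do not state; the helper names are hypothetical.</p>

```python
import re

# Assumption: "letter" means an ASCII letter; the rule list does not say.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_identifier(name: str) -> bool:
    """True if name starts with a letter, then has only letters, digits, or underscores."""
    return IDENTIFIER.fullmatch(name) is not None


def duplicate_names(names):
    """Return names that occur more than once, in order of first repeat."""
    seen, repeated = set(), []
    for name in names:
        if name in seen and name not in repeated:
            repeated.append(name)
        seen.add(name)
    return repeated


print(is_identifier("Service_Name2"))  # True
print(is_identifier("2fast"))          # False
print(is_identifier("_hidden"))        # False
print(duplicate_names(["Name", "Email", "Name"]))  # ['Name']
```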
| 276 | + |
| 277 | + <h2>Generated Code</h2> |
| 278 | + <p>Use the <code>stefgen</code> tool to generate serialization code from your STEF schema:</p> |
| 279 | + <pre><code class="language-bash">stefgen --lang=go myschema.stef</code></pre> |
| 280 | + <p>This generates efficient serializers and deserializers in your target language.</p> |
| 281 | + |
| 282 | + <h2>Learn More</h2> |
| 283 | + <ul> |
| 284 | + <li><a href="./index.html">STEF Overview</a></li> |
| 285 | + <li><a href="https://github.com/splunk/stef/blob/main/stef-spec/specification.md">STEF Specification</a></li> |
| 286 | + <li><a href="https://github.com/splunk/stef">GitHub Repository</a></li> |
| 287 | + </ul> |
| 288 | + </main> |
| 289 | + <script src="https://cdn.jsdelivr.net/npm/[email protected]/prism.js"></script> |
| 290 | + <script src="https://cdn.jsdelivr.net/npm/[email protected]/components/prism-go.min.js"></script> |
| 291 | + <script src="./prism-stef.js"></script> |
| 292 | +</body> |
| 293 | +</html> |