Commit 1e8817f

Add STEF SDL docs
1 parent f99e126 commit 1e8817f

2 files changed: 294 additions, 0 deletions

docs/index.html

Lines changed: 1 addition & 0 deletions
@@ -97,6 +97,7 @@ <h2>Write STEF Records</h2>
 </code></pre>
 <h2>Learn More</h2>
 <ul>
+<li><a href="./sdl.html">STEF Schema Definition Language</a></li>
 <li><a href="https://github.com/splunk/stef/blob/main/stef-spec/specification.md">STEF Specification</a> (detailed format and protocol)</li>
 <li><a href="./benchmarks.html">Benchmarks</a> (performance results)</li>
 <li><a href="https://github.com/splunk/stef">GitHub Repository</a></li>

docs/sdl.html

Lines changed: 293 additions & 0 deletions
@@ -0,0 +1,293 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>STEF SDL: Schema Definition Language</title>
<link href="https://fonts.googleapis.com/css?family=Roboto:400,700&display=swap" rel="stylesheet">
<link rel="stylesheet" href="./style.css">
<link rel="stylesheet" href="https://cdn.jsdelivr.net/npm/[email protected]/themes/prism.css">
<link rel="stylesheet" href="./prism-stef.css">
</head>
<body>
<header>
<h1>STEF SDL</h1>
<p>Schema Definition Language<br>Define schemas for STEF serialization</p>
<nav>
<a href="./index.html">Home</a>
<a href="https://github.com/splunk/stef">GitHub</a>
<a href="https://github.com/splunk/stef/blob/main/stef-spec/specification.md">Specification</a>
</nav>
</header>
<main>
<h2>Overview</h2>
<p>The STEF Schema Definition Language (SDL) is used to define schemas for STEF serialization.
It provides a simple, type-safe way to describe data structures that can be efficiently serialized
and deserialized using the STEF format.</p>

<h2>Package Declaration</h2>
<p>Every STEF schema file begins with a package declaration:</p>
<pre><code class="language-stef">package com.example.myschema</code></pre>
<p>Package names use dot notation and can have one or more dot-delimited components.</p>

<h3>Language-Specific Package Handling</h3>
<p>Different target languages handle package names differently when generating code:</p>
<ul>
<li><strong>Go:</strong> Uses only the last component of the package name. For example, <code>com.example.myschema</code> becomes package <code>myschema</code> in Go.</li>
<li><strong>Java:</strong> Uses the full package name hierarchy. For example, <code>com.example.myschema</code> becomes package <code>com.example.myschema</code> in Java.</li>
</ul>
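<p>The two mapping rules above can be sketched in a few lines of Go. This is only an illustration of the rules as described on this page; the function names <code>goPackageName</code> and <code>javaPackageName</code> are hypothetical and are not part of <code>stefgen</code>.</p>

```go
package main

import (
	"fmt"
	"strings"
)

// goPackageName returns the last dot-delimited component of an SDL package
// name, following the Go rule described above.
func goPackageName(sdlPackage string) string {
	parts := strings.Split(sdlPackage, ".")
	return parts[len(parts)-1]
}

// javaPackageName returns the SDL package name unchanged, following the
// Java rule described above.
func javaPackageName(sdlPackage string) string {
	return sdlPackage
}

func main() {
	const pkg = "com.example.myschema"
	fmt.Println("Go:  ", goPackageName(pkg))
	fmt.Println("Java:", javaPackageName(pkg))
}
```

<p>A single-component package name such as <code>myschema</code> maps to itself under both rules.</p>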

<h2>Comments</h2>
<p>STEF SDL supports single-line comments that start with <code>//</code>:</p>
<pre><code class="language-stef">// This is a comment
package com.example // Comments can appear at end of lines</code></pre>

<h2>Primitive Types</h2>
<p>STEF SDL supports the following primitive data types:</p>
<ul>
<li><code>bool</code> - Boolean values (true/false)</li>
<li><code>int64</code> - 64-bit signed integer</li>
<li><code>uint64</code> - 64-bit unsigned integer</li>
<li><code>float64</code> - 64-bit floating point number</li>
<li><code>string</code> - UTF-8 encoded string</li>
<li><code>bytes</code> - Binary data</li>
</ul>

<h2>Structs</h2>
<p>Structs define composite data types with named fields:</p>
<pre><code class="language-stef">struct Person {
  Name string
  Age uint64
  Email string
}</code></pre>

<h3>Root Structs</h3>
<p>The <code>root</code> attribute marks a struct as the top-level record type in a STEF stream:</p>
<pre><code class="language-stef">struct Record root {
  Timestamp uint64
  Data Person
}</code></pre>
<p>Multiple structs can be marked as <code>root</code> in a single schema, allowing the STEF stream to contain different types of records:</p>
<pre><code class="language-stef">struct MetricRecord root {
  Timestamp uint64
  Metric Metric
}

struct TraceRecord root {
  Timestamp uint64
  Span Span
}</code></pre>
<p>When multiple root structs are defined, each record in the stream is one of the root types, and the STEF format includes type information so that a reader can tell them apart during deserialization.</p>

<h3>Dictionary Compression</h3>
<p>Fields whose values repeat often can be dictionary-compressed with the <code>dict</code> modifier:</p>
<pre><code class="language-stef">struct Event {
  EventType string dict(EventTypes)
  Message string
}</code></pre>
<p>Structs can also have dictionary compression applied:</p>
<pre><code class="language-stef">struct Resource dict(Resources) {
  Name string
  Version string
}</code></pre>
<p>Dictionary names allow the same dictionary to be shared across multiple fields, even in different structs, as long as the fields have the same type:</p>
<pre><code class="language-stef">struct MetricEvent {
  ServiceName string dict(ServiceNames)
  EventType string dict(EventTypes)
}

struct TraceEvent {
  ServiceName string dict(ServiceNames) // Same dictionary as above
  SpanName string dict(SpanNames)
}</code></pre>
<p>This sharing enables more efficient compression when the same values appear across different record types.</p>
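<p>To build intuition for why a shared dictionary helps, here is a minimal conceptual sketch in Go. It is not the STEF wire format or the code that <code>stefgen</code> produces; the <code>Dictionary</code> type and its <code>Encode</code> method are hypothetical names used only to show the general idea of replacing repeated values with small indexes.</p>

```go
package main

import "fmt"

// Dictionary is a hypothetical helper that assigns each distinct value a
// small integer index the first time the value is seen.
type Dictionary struct {
	index map[string]int
}

// NewDictionary returns an empty dictionary.
func NewDictionary() *Dictionary {
	return &Dictionary{index: make(map[string]int)}
}

// Encode returns the index for value and reports whether the value was
// already present. In this sketch a writer would emit the full string only
// when seen is false and a compact index otherwise.
func (d *Dictionary) Encode(value string) (int, bool) {
	if i, ok := d.index[value]; ok {
		return i, true
	}
	i := len(d.index)
	d.index[value] = i
	return i, false
}

func main() {
	// One dictionary shared by several fields, like dict(ServiceNames).
	serviceNames := NewDictionary()
	for _, name := range []string{"checkout", "cart", "checkout"} {
		i, seen := serviceNames.Encode(name)
		fmt.Printf("%-8s index=%d seen=%v\n", name, i, seen)
	}
}
```

<p>Because both structs would consult the same dictionary in this sketch, a service name first seen in one record type is already known when it appears in the other.</p>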

<h3>Optional Fields</h3>
<p>Fields can be marked as optional, meaning they may not be present in every record:</p>
<pre><code class="language-stef">struct User {
  Name string
  Email string optional
  Phone string optional
}</code></pre>

<h2>Arrays</h2>
<p>Array types are denoted with square brackets and can contain zero or more elements of the specified type:</p>
<pre><code class="language-stef">struct Container {
  Items []string
  Numbers []int64
  Objects []Person
}</code></pre>
<p>Arrays are variable-length: they can be empty or contain any number of elements.</p>

<h2>Oneofs (Union Types)</h2>
<p>Oneofs define union types that can hold one of several possible field types:</p>
<pre><code class="language-stef">oneof JsonValue {
  String string
  Number float64
  Bool bool
  Array []JsonValue
  Object JsonObject
}</code></pre>
<p>An empty oneof represents null/absence of value.</p>

<h2>Multimaps</h2>
<p>Multimaps define key-value collections:</p>
<pre><code class="language-stef">multimap Attributes {
  key string
  value AnyValue
}</code></pre>
<p>Multimaps can also use dictionary compression:</p>
<pre><code class="language-stef">multimap Labels {
  key string dict(LabelKeys)
  value string dict(LabelValues)
}</code></pre>

<h2>Enums</h2>
<p>Enums define named constant values:</p>
<pre><code class="language-stef">enum MetricType {
  Gauge = 0
  Counter = 1
  Histogram = 2
  Summary = 3
}</code></pre>
<p>Each enum value must be explicitly assigned an unsigned integer. Multiple number formats are supported:</p>
<ul>
<li><strong>Decimal:</strong> <code>MetricType = 42</code></li>
<li><strong>Hexadecimal:</strong> <code>MetricType = 0x2A</code> or <code>MetricType = 0X2A</code></li>
<li><strong>Octal:</strong> <code>MetricType = 0o52</code> or <code>MetricType = 0O52</code></li>
<li><strong>Binary:</strong> <code>MetricType = 0b101010</code> or <code>MetricType = 0B101010</code></li>
</ul>
<pre><code class="language-stef">enum StatusCode {
  OK = 0
  NotFound = 0x194 // 404 in hexadecimal
  InternalError = 0o764 // 500 in octal
  Custom = 0b1111101000 // 1000 in binary
}</code></pre>
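<p>When writing enum values in a non-decimal base, it is easy to get the conversion wrong. One way to double-check a literal is Go's <code>strconv.ParseUint</code> with base 0, which infers the base from the same <code>0x</code>, <code>0o</code> and <code>0b</code> prefixes. This is a checking aid only, not the STEF parser: Go's base-0 rules also accept a bare leading <code>0</code> as octal, and this page does not say whether STEF SDL does. The helper name <code>parseEnumValue</code> is hypothetical.</p>

```go
package main

import (
	"fmt"
	"strconv"
)

// parseEnumValue parses an unsigned integer literal. Base 0 tells
// strconv.ParseUint to infer the base from a 0x, 0o, or 0b prefix
// (upper- or lowercase) and to fall back to decimal otherwise.
func parseEnumValue(literal string) (uint64, error) {
	return strconv.ParseUint(literal, 0, 64)
}

func main() {
	literals := []string{"42", "0x2A", "0o52", "0b101010", "0x194", "0o764", "0b1111101000"}
	for _, lit := range literals {
		v, err := parseEnumValue(lit)
		if err != nil {
			fmt.Println(lit, "error:", err)
			continue
		}
		fmt.Printf("%-14s = %d\n", lit, v)
	}
}
```

<p>The first four literals all evaluate to 42, and the last three evaluate to 404, 500 and 1000 respectively.</p>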

<h2>Complete Example</h2>
<p>Here's a comprehensive example showing various STEF SDL features:</p>
<pre><code class="language-stef">package com.example.monitoring

// Enum for metric types
enum MetricType {
  Gauge = 0
  Counter = 1
  Histogram = 2
}

// Key-value attributes
multimap Attributes {
  key string dict(AttributeKeys)
  value AttributeValue
}

// Union type for attribute values
oneof AttributeValue {
  StringValue string
  IntValue int64
  FloatValue float64
  BoolValue bool
}

// Resource information with dictionary compression
struct Resource dict(Resources) {
  ServiceName string dict(ServiceNames)
  ServiceVersion string dict(ServiceVersions)
  Attributes Attributes
}

// Metric data point
struct DataPoint {
  Timestamp uint64
  Value float64
  Attributes Attributes
}

// Main metric structure
struct Metric {
  Name string dict(MetricNames)
  Type MetricType
  Unit string dict(Units)
  Description string optional
  DataPoints []DataPoint
}

// Root record type
struct MetricRecord root {
  Resource Resource
  Metric Metric
}</code></pre>

<h2>Type References</h2>
<p>STEF SDL supports forward references: you can reference types before they are defined in the file.
The parser resolves all type references after parsing the complete schema.</p>
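<p>Resolving references only after the whole schema has been read is commonly done in two passes. The Go sketch below illustrates that general technique under simplifying assumptions; it is not the actual STEF parser, and <code>typeDecl</code> and <code>unresolved</code> are hypothetical names.</p>

```go
package main

import "fmt"

// typeDecl is a hypothetical, simplified view of a parsed declaration:
// a type name plus the names of the types its fields refer to.
type typeDecl struct {
	name string
	refs []string
}

// builtin lists the primitive types named on this page.
var builtin = map[string]bool{
	"bool": true, "int64": true, "uint64": true,
	"float64": true, "string": true, "bytes": true,
}

// unresolved records every declared name first and only then checks each
// reference, so a type may be used before the line that declares it.
func unresolved(decls []typeDecl) []string {
	declared := make(map[string]bool)
	for _, d := range decls { // pass 1: collect all declared names
		declared[d.name] = true
	}
	var missing []string
	for _, d := range decls { // pass 2: check every reference
		for _, r := range d.refs {
			if !builtin[r] && !declared[r] {
				missing = append(missing, d.name+" -> "+r)
			}
		}
	}
	return missing
}

func main() {
	decls := []typeDecl{
		{name: "Record", refs: []string{"uint64", "Person"}}, // Person is declared later
		{name: "Person", refs: []string{"string", "uint64"}},
	}
	fmt.Println("unresolved references:", len(unresolved(decls)))
}
```

<p>A single-pass checker would reject <code>Record</code> here because <code>Person</code> has not been seen yet. Deferring the check to a second pass is what makes declaration order irrelevant in this sketch.</p>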

<h3>Recursive Type Declarations</h3>
<p>STEF SDL allows recursive type declarations, enabling the definition of tree-like data structures.</p>

<h4>Self-Referential Types</h4>
<p>A type can reference itself, useful for creating tree structures:</p>
<pre><code class="language-stef">// Binary tree node
struct TreeNode {
  Value int64
  Left TreeNode optional
  Right TreeNode optional
}
</code></pre>
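<p>For readers who think in Go, the binary tree schema corresponds roughly to a struct with pointer fields. The sketch below is a hand-written analogy, not the code that <code>stefgen</code> generates; it assumes a <code>nil</code> pointer stands in for an absent optional field, and the <code>depth</code> function is a hypothetical helper.</p>

```go
package main

import "fmt"

// TreeNode is a hand-written analogy of the SDL struct. Pointers model the
// optional, self-referential fields: nil means the child is absent.
type TreeNode struct {
	Value int64
	Left  *TreeNode
	Right *TreeNode
}

// depth returns the number of nodes on the longest path from n to a leaf.
func depth(n *TreeNode) int {
	if n == nil {
		return 0
	}
	l, r := depth(n.Left), depth(n.Right)
	if l > r {
		return l + 1
	}
	return r + 1
}

func main() {
	root := &TreeNode{
		Value: 1,
		Left:  &TreeNode{Value: 2, Left: &TreeNode{Value: 3}},
		Right: &TreeNode{Value: 4},
	}
	fmt.Println("depth:", depth(root))
}
```

<p>In this analogy the absent children are what let the recursion stop at the leaves.</p>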

<h4>Mutually Referential Types</h4>
<p>Multiple types can reference each other, creating more complex recursive relationships:</p>
<pre><code class="language-stef">// Expression tree with operators and operands
struct Expression {
  Node ExpressionNode
}

oneof ExpressionNode {
  Literal LiteralValue
  BinaryOp BinaryOperation
  UnaryOp UnaryOperation
}

struct LiteralValue {
  Value float64
}

struct BinaryOperation {
  Operator string
  Left Expression // References back to Expression
  Right Expression // References back to Expression
}

struct UnaryOperation {
  Operator string
  Operand Expression // References back to Expression
}</code></pre>
<p>These recursive patterns are resolved correctly by the STEF parser and enable rich data modeling capabilities.</p>

<h2>Syntax Rules</h2>
<ul>
<li>Identifiers must start with a letter and can contain letters, digits, and underscores</li>
<li>Keywords are case-sensitive</li>
<li>Struct, oneof, multimap, and enum names must be unique within a schema</li>
<li>Field names must be unique within their containing struct/oneof/multimap</li>
<li>Enum values must be unique within their enum</li>
<li>Whitespace and comments are ignored during parsing</li>
</ul>
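<p>The identifier rule in the list above can be expressed as a regular expression. The Go sketch below assumes that "letter" means an ASCII letter, which this page does not state explicitly; <code>isIdentifier</code> is a hypothetical helper and not part of any STEF tool.</p>

```go
package main

import (
	"fmt"
	"regexp"
)

// identifierRE encodes the rule "starts with a letter, then letters, digits,
// or underscores", assuming letters are ASCII a-z and A-Z.
var identifierRE = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_]*$`)

// isIdentifier reports whether s satisfies the identifier rule.
func isIdentifier(s string) bool {
	return identifierRE.MatchString(s)
}

func main() {
	for _, s := range []string{"MetricType", "span_name2", "_hidden", "2fast", ""} {
		fmt.Printf("%-14q valid=%v\n", s, isIdentifier(s))
	}
}
```

<p>Under this reading, <code>MetricType</code> and <code>span_name2</code> are valid, while names that start with an underscore or a digit are not.</p>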

<h2>Generated Code</h2>
<p>Use the <code>stefgen</code> tool to generate serialization code from your STEF schema:</p>
<pre><code class="language-bash">stefgen --lang=go myschema.stef</code></pre>
<p>This generates efficient serializers and deserializers in your target language.</p>

<h2>Learn More</h2>
<ul>
<li><a href="./index.html">STEF Overview</a></li>
<li><a href="https://github.com/splunk/stef/blob/main/stef-spec/specification.md">STEF Specification</a></li>
<li><a href="https://github.com/splunk/stef">GitHub Repository</a></li>
</ul>
</main>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/prism.js"></script>
<script src="https://cdn.jsdelivr.net/npm/[email protected]/components/prism-go.min.js"></script>
<script src="./prism-stef.js"></script>
</body>
</html>
