This document shows how the yactfr library works internally. It’s intended for contributors and, most importantly, for future me.
I try to use abbreviations and acronyms consistently throughout the code to reduce identifier lengths with very common terms.
Once you’re familiar with the yactfr source code vocabulary below, the code becomes much more easy to read. Most of them are obvious.
An abbreviations or acronym can be made plural by adding an “s” at the end. For example, “dsts” means “data stream types”.
This only applies to internal code. Everything that’s public and documented uses unabbreviated names.
| Abbreviation/acronym | Term |
|---|---|
Abs |
Absolute |
Addr |
Address |
Align |
Alignment |
Attr |
Attribute |
Be |
Big-endian |
Blk |
Block |
Bio |
Bit order |
Bo |
Byte order |
Bool |
Boolean |
Cand |
Candidate |
Cc |
Clock class (CTF 2) |
Clk |
Clock |
Cls |
Class (CTF 2) |
Col |
Column |
Conv |
Conversion |
Ctx |
Context |
Cur |
Current |
Decr |
Decrement |
Def |
Default |
Descr |
Description |
Disc |
Discarded |
Disp |
Display |
Dl |
Dynamic-length |
Ds |
Data stream |
Dsc |
Data stream class (CTF 2) |
Dst |
Data stream type |
Dt |
Data type |
Dup |
Duplicate |
Dyn |
Dynamic |
Elem |
Element |
Enum |
Enumeration (CTF 1) |
Er |
Event record |
Erc |
Event record class (CTF 2) |
Ert |
Event record type |
Exc |
Exception |
Fc |
Field class (CTF 2) |
Fl |
Fixed-length |
Float |
Floating point number |
Freq |
Frequency |
Id |
Identifier (numeric) |
Ident |
Identifier (language) |
Incr |
Increment |
Instr |
Instruction |
Int |
Integer |
It, Iter |
Iterator |
Le |
Little-endian |
Len |
Length |
Lex |
Lexical |
Loc |
Location |
Max |
Maximum |
Min |
Minimum |
Ns |
Namespace |
Nt |
Null-terminated |
Opt |
Option, optional |
Orig |
Origin |
Pkt |
Packet |
Pos |
Position |
Pow |
Power |
Prec |
Precision |
Pref |
Preferred |
Proc |
Procedure |
Rec |
Recursive |
Ref |
Reference |
Rej |
Rejecter |
Rel |
Relative |
Req |
Requirement, required |
Sel |
Selector |
Seq |
Sequence |
SInt |
Signed integer |
Sl |
Static-length |
Snap |
Snapshot |
Sp |
Shared pointer |
Spec |
Specific |
Src |
Source |
Ss |
String stream, string scanner |
SSel |
Signed selector |
Std |
Standard |
Str |
String |
Struct |
Structure |
Tc |
Trace class (CTF 2) |
Tgt |
Target |
Ts |
Timestamp (CTF 2) |
UInt |
Unsigned integer |
Up |
Unique pointer |
USel |
Unsigned selector |
Val |
Value |
Var |
Variant |
Vl |
Variable-length |
Vm |
Virtual machine |
Combination examples:
-
convCtxmeans “conversion context”. -
locItmeans “location iterator”. -
pseudoDtsmeans “pseudo data types”. -
isPseudoVarTypeWithoutSelLocRecmeans “is pseudo variant type without selector location (recursive)”. -
uIntValmeans “unsigned integer value”. -
flSIntmeans “fixed-length signed integer”.
The yactfr library lives in two namespaces:
yactfr-
Public API.
yactfr::internal-
Internal API.
Some internal classes and declarations are exposed to the user in public headers (needed for templating), like
yactfr::internal::tryCloneAttrs(), but those headers are placed in a subdirectory namedinternal, and their content is not publicly documented.This namespace is similar to the
detailnamespace commonly found in the Boost libraries, for example.
The metadata objects exist to create a CTF metadata hierarchy.
This hierarchy, once complete, is as follows:
Trace type
Packet header type (optional)
Clock types (zero or more)
Data stream types (zero or more)
Packet context type (optional)
Event record header type (optional)
Event record common context type (optional)
Event record types (zero or more)
Specific context type (optional)
Payload type (optional)
The concept of a type here is that it represents a set of possible values. For example:
-
A 4-bit fixed-length unsigned integer data type is the set of the fixed-length integer values from 0 to 31.
-
A clock type is a set of possible clocks.
-
A data stream type is a set of possible data streams.
Therefore a trace type is a set of possible CTF traces.
All the metadata objects are composed using unique pointers
(std::unique_ptr), so that all the nodes in this specialized tree are
unique. This is important because it becomes possible to refer to a node
by address since nodes are never reused. So, for example, all the 32-bit
fixed-length unsigned integer types are different objects, even if they
have the same properties. In the future, to optimize memory usage, the
nodes could be unique while their content is shared, keeping the same
API for getters.
All the metadata objects are immutable. Once built, you cannot change
them, and all the accessors are const.
You always build a metadata object by providing everything it needs. On construction, some basic parameters are copied, and some, more heavy, are moved. For example, when you build a data stream type, you move a set of event record types to it. It doesn’t matter that you don’t have this set anymore as the caller because, like any metadata object, event record types are unique anyway, so the data stream type becomes the owner at this point.
There are a few exceptions to immutability to create weak links to parent nodes when you finally build a trace type. This is why, for example, an event record type object has this member:
mutable const DataStreamType *_dst;While you can build a trace type object manually, the most interesting use case is probably to get one out of a standard TSDL document (CTF 1.8) or JSON text sequence document (CTF 2).
The fromMetadataText() functions do exactly that. Those functions
accept either two character pointers (beginning and end) or an
std::string object to synthesize a pair of trace type object and
metadata stream UUID out of the parsed text.
fromMetadataText() requires a textual (non-packetized) version of
the document (for CTF 1.8). If the TSDL content is packetized, you
can use createMetadataStream() to get a metadata stream object which
contains a metadata plain text accessor.
A metadata stream object decodes all the contents on construction and keeps it, so it can get heavy with a heavy metadata stream. That being said, note that a fairly large LTTng kernel trace metadata stream is about 500 Kib: not the end of the world.
The TSDL parser is of the non-predictive recursive descent type. There’s
one method for each construct, and the parser can sometimes backtrack if
it doesn’t reach what it expects. The parser gets its tokens from a
string scanner (internal::StrScanner) which is just a specialized
lexer with a built-in stack to be able to backtrack.
internal::TsdlParser isn’t the fastest parser in the world, but it’s
good enough considering the application: the main work is decoding data
streams when reading a CTF trace, not parsing its metadata stream. The
fact that it’s a recursive descent parser (with helpers like an RAII
lexical scope object) also makes it straightforward to understand,
debug, and modify.
internal::Ctf2JsonSeqParser is the CTF 2 metadata stream parser.
Its validation relies a lot on internal::JsonAnyFragValReq, a JSON
value requirement to validate a single CTF 2 fragment.
A trace type gets translated into a packet procedure once you call its
internal::TraceTypeImpl::packetProc() accessor method. It’s lazily
built because the user could need a trace type without having to read
data streams with it, for example to inspect a metadata stream file. The
trace type implementation itself owns the packet procedure, and the
packet procedure has a weak pointer to its owner.
A packet procedure is a tree of procedures used to decode specific parts of a data packet described by the trace type of the packet procedure.
A procedure is a sequence of instructions, some of which can contain a subprocedure themselves. A yactfr virtual machine (VM) is a packet procedure consumer.
All the possible instructions are found in proc.hpp. They all inherit
internal::Instr.
There are instructions which require the VM to align the current decoding head and then read data in a specific way, for example:
-
internal::ReadFlSIntInstr -
internal::ReadFlFloatInstr -
internal::ReadNtStrInstr
For compound types, a first internal::BeginRead*Instr instruction
indicates to “enter” the compound data. This instruction usually
contains a subprocedure to read its contents. The last element of this
subprocedure is usually an internal::EndReadDataInstr instruction,
which indicates the end of the subprocedure. This avoids a useless
index-size comparison performed before fetching the next instruction in
the VM.
Other instructions are related to the last decoded integer, for example:
-
internal::SetDsIdInstrfollows a “read fixed-length integer” or “read variable-length integer” instruction and indicates to the VM to set the current data stream ID to the last decoded integer value. -
internal::SetExpectedPktTotalLenInstrindicates to the VM to set the expected total length of the current packet to the last decoded integer value. -
internal::UpdateDefClkValInstrindicates to the VM to update the value of a specific data stream default clock with the value (or partial value) of the last decoded integer.
An internal::PktProcBuilder object translates a trace type to a packet
procedure.
An internal::PktProc object contains:
-
A preamble procedure, that is, which procedure to execute initially for any packet of the trace.
-
For each contained data stream type: an
internal::DsPktProcobject.
An internal::DsPktProc object contains:
-
A preamble procedure, that is, which procedure to execute after the preable of the packet procedure for any packet of such a data stream.
-
An event record preamble procedure, that is, which procedure to execute initially for any event record which is part of such a data stream.
-
For each contained event record type: an
internal::ErProcobject.
An internal::ErProc object contains the specific procedure to execute
for any event record of a given type. This procedure is executed after
executing the event record preamble procedure of the data stream packet
procedure.
|
Tip
|
To view a textual representation of a generated packet procedure tree in
a debug build, set the YACTFR_DEBUG_PRINT_PROC environment variable to
1 and create a trace type.
|
There’s a special instruction, internal::SaveValInstr, which requires
the VM to save the value of the last decoded integer to a specific
position (index) within an array of saved values.
This is how the VM knows where to dynamically find the length of a
dynamic-length array/string/BLOB, or the selector of a variant/optional,
as the internal::BeginReadDlArrayInstr,
internal::BeginReadDlStrInstr, internal::BeginReadDlBlobInstr,
internal::BeginReadVarUIntSelInstr,
internal::BeginReadVarSIntSelInstr,
internal::BeginReadOptBoolSelInstr,
internal::BeginReadOptUIntSelInstr, and
internal::BeginReadOptSIntSelInstr instructions contain a numeric
position (index) within this saved value array where to find the length
or selector value.
internal::PktProcBuilder contains the logic to insert
internal::SaveValInstr instructions at specific locations within the
procedures and to assign appropriate positions to link “begin read
dynamic-length array”, “begin read dynamic-length string”, “begin
read dynamic-length BLOB”, “begin read variant”, and “begin read
optional” instructions to their length/selector values.
A data source factory is an object which can build data sources.
The library user can extend the DataSourceFactory class to provide
custom data sources to element sequence iterators.
When you build an element sequence, you need to pass a trace type and a data source factory. Each iterator created by the element sequence creates its own data source, making all iterators independent and usable in different threads without explicit locking.
The MemoryMappedFileViewFactory class ships with the yactfr library.
When you build it, you pass a path to the data stream file to use. While
the factory itself is responsible for opening the path and getting a
file descriptor, each created data source (called memory mapped file
view) has its own memory map on that shared file descriptor. Thanks to
appropriate internal shared pointers, the shared file descriptor is
never closed before all created data sources are destroyed.
The yactfr VM (internal::Vm) is the bridge between a
packet procedure and a data source.
A VM executes the appropriate instructions of a packet procedure, reading its data (when needed) from its own data source.
The VM has an internal position (internal::VmPos). This is the whole
state of the VM, including:
-
Current offsets in the element sequence, current packet, and data source buffer.
-
Current state.
-
Last byte order.
-
Last decoded integer value.
-
Current packet procedure, data stream packet procedure, and event record procedure.
-
Current expected packet total and content lengths.
-
Stack of frames containing the next instruction to execute and the parent procedure.
-
Array of saved values.
-
Current data stream clock value.
-
Concrete element objects to set when executing the VM.
The VM position is a different object because this is what
internal::Vm::savePosition() (called from the public
ElementSequenceIterator::savePosition()) copies to an
ElementSequenceIteratorPosition object.
On construction, the VM initializes an array of instruction handlers.
This is a function table which the VM uses to handle specific
instructions according to their numeric kind. I’m only going to claim
without numbers here that I tried using virtual calls and this approach
is faster. It’s also faster than a big switch statement. I didn’t opt
for computed gotos only because they’re not portable and it would make
an eventual portability effort more complicated.
State handling however is an inline switch statement with about
15 cases. This seems faster than a function table for some reason.
The reason why there are instructions and VM states is that the yactfr
instruction set is not general enough. This would result in many useless
function calls in some situations. For example, the
internal::BeginReadSlArrayInstr requires the VM to start reading a
static-length array. This instruction object contains the length of the
static-length array, that is, the number of items to read next. The
static-length array reading instruction contains a subprocedure which
only contains which instruction to execute to read a single array
element. It doesn’t contain register decrementation, comparison, and
jump instructions like you would find in a typical assembly loop.
Instead, the state of the VM is changed (to VmState::ExecArrayInstr)
so that it knows that it’s currently decoding an array at this stack
level, and the VM position contains the number of remaining elements.
The decrementation, comparison, and stack popping when it’s done are
implicit. This proves faster than executing three instructions for each
array item.
The VM decodes “standard” fixed-length integers, that is, integers
which are at least aligned on a byte and have lengths of 8, 16, 32, or
64, using the inline functions in std-fl-int-reader.hpp. Those use
std::memcpy() with a length known at build time and Boost.Endian
which generate efficient CPU instructions the last time I checked. Those
fixed-length integers are typically the most commonly found in a data
stream.
The VM decodes all the other fixed-length integers with the methods of
fl-int-reader.hpp. This file is generated by
tools/genflintreaderfile.py and contains the exact statements needed
to decode all the possible fixed-length integers up to a length of
64 bits. Function tables are created to select:
-
The length of the integer in bits (1 to 64).
-
The current bit position within the first byte of data (0 to 7).
-
The byte order (little-endian, big-endian).
-
The signedness (unsigned, signed).
The parameters above yield 2048 permutations. However, my assumption is that during the decoding process, only a few of those functions are called, so they should stay in cache. This is another place where computed gotos would probably prove useful.
An element sequence iterator object and its VM are tightly coupled.
The VM knows its owning iterator because, when it executes one or more
instructions and the current element changes, it sets the _curElem
member of the iterator to the address of one of its already allocated
and filled elements (located within the VM position object). It also
sets the offset and mark of the iterator.
The mark of the element sequence iterator is the index of the current
element within its packet. In combination with the current offset (bits
from the beginning of the element sequence), this is enough to compare
two iterators which were created from the same element sequence without
relying on the VM. Therefore the comparison operators of the iterator
are inlined, just as operator*() and operator->().
An “end” (passed-the-end) element sequence iterator has its offset set
to ElementSequenceIterator::_endOffset which is the maximum value for
the offset type, and its mark set to 0. Therefore, any iterator
which isn’t passed the end is less than a passed-the-end iterator.
It’s possible that an “end” iterator has no VM because its constructor
won’t allocate one when it’s directly built as a passed-the-end iterator
by ElementSequence::end(). However, all iterators contain:
-
The trace type of its creating element sequence.
-
The data source factory of its creating element sequence.
Those are enough to create a new VM when necessary, for example when assigning a “non-end” iterator to an “end” iterator:
auto beginIter = mySeq.begin(); // has a VM
auto endIter = mySeq.end(); // has no VM
endIter = beginIter; // creates a copy of the other VMAn element sequence iterator can seek a packet within the data of its
data source known to be located at a specific offset in bytes. When you
call ElementSequenceIterator::seekPacket(), it resets the VM position
of the iterator and the buffers. There can’t be any validation that this
is indeed the beginning of a packet: it’s the library user’s
responsibility.
Some use cases can require that you need a lot of iterators from the same element sequence at the same time, but only one at a time is used. Remember that the VM of an iterator has its own data source, and this means active resources. To avoid having too many active data sources, yactfr offers the element sequence iterator position API. It’s pretty simple to use:
ElementSequenceIteratorPosition pos;
myIter.savePosition(pos);
// ...
myIter.restorePosition(pos);An element sequence iterator position doesn’t contain any data source. It contains:
-
A copy of the VM position when
ElementSequenceIterator::savePosition()was called. -
The offset, mark, and current element of the iterator when
ElementSequenceIterator::savePosition()was called.
This is enough to reset any iterator (created from the same element sequence) to an exact position later.