Callback perf #89
Draft
tmcgilchrist wants to merge 15 commits into
Still merges the user's existing OCAMLRUNPARAM value.
Honour the user's intent if OCAML_RUNTIME_EVENTS_PRESERVE is set and leave the file in place.
Fixes some EINTR edge-cases for system calls (sleepf and waitpid).
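For reviewers unfamiliar with the pattern, the usual way to handle EINTR is to restart the interrupted call. The sketch below shows that idea in plain C. It is not the code in this PR (which goes through the OCaml Unix bindings), and `waitpid_retry` and `sleep_full` are hypothetical names chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Hypothetical helper: call waitpid again whenever a signal interrupts it. */
static pid_t waitpid_retry(pid_t pid, int *status, int options)
{
    pid_t r;
    do {
        r = waitpid(pid, status, options);
    } while (r == -1 && errno == EINTR);
    return r;
}

/* Hypothetical helper: sleep for the whole duration, resuming with the
 * remaining time reported by nanosleep after each interruption. */
static int sleep_full(double seconds)
{
    struct timespec req, rem;
    assert(seconds >= 0.0);
    req.tv_sec = (time_t)seconds;
    req.tv_nsec = (long)((seconds - (double)req.tv_sec) * 1e9);
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;      /* a real error, report it to the caller */
        req = rem;          /* interrupted: sleep only for what is left */
    }
    return 0;
}
```

Any error other than EINTR is still returned to the caller, so a genuine failure such as ECHILD from waitpid is not hidden by the retry loop.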
Adds a C stub olly_is_process_alive that uses kill(pid, 0) on Unix and OpenProcess + GetExitCodeProcess on Windows.
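A minimal sketch of the Unix half of that check, assuming the conventional reading of `kill(pid, 0)`: signal 0 delivers nothing and only performs the existence and permission checks. `is_process_alive` is a hypothetical standalone function, not the actual `olly_is_process_alive` stub, and the Windows branch is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: true if a process with this pid currently exists. */
static bool is_process_alive(pid_t pid)
{
    if (pid <= 0)
        return false;           /* 0 and negative pids address process groups */
    if (kill(pid, 0) == 0)
        return true;            /* process exists and we may signal it */
    return errno == EPERM;      /* exists, but owned by another user */
}
```

One caveat with this approach is that a zombie which has not been reaped yet still counts as existing, so the check may need to be paired with waitpid when the target is a child process.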
Changed emit from trace -> Event.t -> unit to trace -> ~ring_id:int -> ~ts:int64 -> ~name:string -> ~kind:Event.kind -> unit, passing fields directly instead of boxing them into a record on every event. This should avoid one Event.t allocation per event. Event.t type is preserved since kind is still needed as an extensible type.
This should avoid allocations in Printf (1-2 allocations), Some(fun oc -> ...) closure + option wrapper for counters, and %t function argument indirection.
Provides a direct path for counter events that takes ~value:int instead of ~kind:(Counter value).
Thread_ref.ref only supports values 1–255 (inline fuchsia trace refs), which broke when we bumped max_doms to 4096. Thread_ref.inline supports arbitrary pid/tid values — the trade-off is 2 extra words per event record in the trace file, which is negligible.
Key design choices:
- C stubs (fxt_put_event_header, fxt_put_arg_header_i32/i64) pack multi-field headers into int64 words using C bit manipulation, with zero Int64 boxing.
- String interning (32K table): repeated phase names like "major" and "minor" are emitted once as string records, then referenced by index. This makes traces ~50% smaller and avoids redundant string writes.
- Thread interning (256 table): domain thread refs are registered once, then referenced by index.
- Single Bytes.t buffer (64KB) flushed to an out_channel: no Buf_chain, no locks, no pool (olly is single-threaded).
- %caml_bytes_set64u compiler intrinsic for timestamp writes: single instruction, no allocation.
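To make the header-packing point concrete, here is a rough sketch of packing an event record header into one 64-bit word. The field layout is my reading of the Fuchsia trace format and should be checked against the spec. `pack_event_header` is a hypothetical function and its signature is an assumption; the real `fxt_put_event_header` stub may take different arguments and presumably writes straight into the output buffer rather than returning a value.

```c
#include <assert.h>
#include <stdint.h>

/* Record type 4 is an event record (assumed from the trace format spec). */
enum { FXT_RECORD_EVENT = 4 };

/* Hypothetical helper: pack the event header fields into a single word.
 * Assumed layout, low bits first:
 *   [0..3]   record type      [4..15]  record size in 64-bit words
 *   [16..19] event type       [20..23] argument count
 *   [24..31] thread ref       [32..47] category string ref
 *   [48..63] name string ref                                        */
static uint64_t pack_event_header(unsigned size_words, unsigned event_type,
                                  unsigned n_args, unsigned thread_ref,
                                  unsigned category_ref, unsigned name_ref)
{
    /* Each field must fit its bit width, or it would corrupt a neighbour. */
    assert(size_words < (1u << 12));
    assert(event_type < (1u << 4));
    assert(n_args < (1u << 4));
    assert(thread_ref < (1u << 8));
    assert(category_ref < (1u << 16));
    assert(name_ref < (1u << 16));

    return (uint64_t)FXT_RECORD_EVENT
         | ((uint64_t)size_words   << 4)
         | ((uint64_t)event_type   << 16)
         | ((uint64_t)n_args       << 20)
         | ((uint64_t)thread_ref   << 24)
         | ((uint64_t)category_ref << 32)
         | ((uint64_t)name_ref     << 48);
}
```

Doing the shifts in C keeps every intermediate in a machine register. The equivalent OCaml expression over Int64.t values would risk boxing intermediates on the heap, which is the allocation this design is trying to avoid.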
With a self-contained Fuchsia implementation there is no reason to keep this package.
On top of #88
Trying out a rewrite to avoid GC allocations in the callback paths. The issues we've seen around dropped events, unterminated fuchsia (#20) and olly crashes could all be linked back to taking a GC during one of the runtime events callbacks.