@@ -88,29 +88,31 @@ was carried out in single threaded mode. Similar to code loading, this
can impose a severe problem for availability that grows with the
number of cores.

- In OTP R16, breakpoints are set in the code without blocking the VM.
+ Since OTP R16, breakpoints are set in the code without blocking the VM.
Erlang processes may continue executing undisturbed in parallel during the
entire operation. The same base technique is used as for code loading. A
staging area of breakpoints is prepared and then made active with a single
atomic operation.

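+ As a rough, self-contained illustration (not the emulator's actual code), the
+ publish step amounts to building the staged data off to the side and then
+ flipping one atomic pointer, so a reader always sees either the complete old
+ state or the complete new state:
+
+ ```c
+ #include <stdatomic.h>
+
+ /* Hypothetical breakpoint set; stands in for the real staging data. */
+ typedef struct bp_set { int n_breakpoints; /* ... */ } BpSet;
+
+ /* Schedulers always dereference the currently active set. */
+ static _Atomic(BpSet *) active_bps;
+
+ /* Writer: build the staged set off-line, then publish it atomically.
+  * The release store pairs with the readers' acquire load, so a reader
+  * that sees the new pointer also sees its fully initialized contents. */
+ void commit_staged_breakpoints(BpSet *staged)
+ {
+     atomic_store_explicit(&active_bps, staged, memory_order_release);
+ }
+
+ /* Reader: a single acquire load; no locks and no blocking. */
+ const BpSet *current_breakpoints(void)
+ {
+     return atomic_load_explicit(&active_bps, memory_order_acquire);
+ }
+ ```
+
+ The replaced generation must not be freed while some scheduler may still be
+ reading it, which is the kind of guarantee the thread progress mechanism
+ (seen again in the stats collection below) provides.
+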
- ### Redesign of Breakpoint Wheel
+ ### Trace sessions

- To make it easier to manage breakpoints without single threaded mode a
- redesign of the breakpoint mechanism has been made. The old
- "breakpoint wheel" data structure was a circular double-linked list of
- breakpoints for each instrumented function. It was invented before the
- SMP emulator. To support it in the SMP emulator, is was essentially
- expanded to one breakpoint wheel per scheduler. As more breakpoint
- types have been added, the implementation have become messy and hard
- to understand and maintain.
+ Since OTP 27, dynamic trace sessions that are isolated from each other can be
+ created. Trace sessions are represented by instances of the struct
+ `ErtsTraceSession`. The old legacy session (kept for backward compatibility) is
+ represented by the static instance `erts_trace_session_0`.
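+
+ A hypothetical, heavily simplified sketch of the idea (field names here are
+ invented; the real `ErtsTraceSession` carries much more state):
+
+ ```c
+ #include <stdlib.h>
+
+ /* Invented fields; only the overall shape matters for the sketch. */
+ typedef struct trace_session {
+     struct trace_session *next; /* assumed: sessions kept on a list */
+     void *tracer;               /* receiver of this session's trace messages */
+     /* ... per-session trace flags, refcount, name, ... */
+ } TraceSessionSketch;
+
+ /* The legacy session is a static instance that always exists, so code
+  * written against the old tracing API keeps working. */
+ static TraceSessionSketch legacy_session_0;
+
+ /* Dynamic sessions are created and destroyed independently; tracing set
+  * up in one session never affects another, which is the isolation. */
+ TraceSessionSketch *session_create(void *tracer)
+ {
+     TraceSessionSketch *s = calloc(1, sizeof *s);
+     if (s != NULL)
+         s->tracer = tracer;
+     return s;
+ }
+ ```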

- In the new design the old wheel was dropped and instead replaced by
- one struct (`GenericBp`) to hold the data for all types of breakpoints
- for each instrumented function. A bit-flag field is used to indicate
- what different type of break actions that are enabled.
+ ### Breakpoints

- ### Same Same but Different
+ For call tracing, breakpoints are created and inserted at the entry of each
+ traced Erlang function. A pointer to an allocated struct `GenericBp` is
+ inserted; it holds the data for all types of breakpoints. A bit-flag field
+ indicates which break actions are enabled. The `GenericBp` struct is session
+ specific. If more than one trace session affects a function, one `GenericBp`
+ instance is created for each session. They are linked together in a singly
+ linked list that is traversed when the breakpoint is hit.
+
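+ As a sketch (field and flag names are illustrative, not the real emulator
+ definitions), the per-session breakpoint data and the traversal on a hit look
+ roughly like this:
+
+ ```c
+ #include <stdint.h>
+ #include <stddef.h>
+
+ /* Illustrative bit flags for enabled break actions. */
+ enum {
+     BP_CALL_TRACE  = 1u << 0,
+     BP_CALL_TIME   = 1u << 1,
+     BP_CALL_MEMORY = 1u << 2
+ };
+
+ struct trace_session;              /* owning session, see previous section */
+
+ /* One instance per (function, session); instances for the same function
+  * are chained in a singly linked list. */
+ typedef struct generic_bp {
+     struct generic_bp *next;       /* breakpoint data of the next session */
+     struct trace_session *session; /* session this instance belongs to */
+     uint32_t flags;                /* which break actions are enabled */
+     /* ... per-action data: match specs, counters, ... */
+ } GenericBpSketch;
+
+ /* On a breakpoint hit, walk the list and perform every enabled action. */
+ void hit_breakpoint(GenericBpSketch *bp_list)
+ {
+     for (GenericBpSketch *bp = bp_list; bp != NULL; bp = bp->next) {
+         if (bp->flags & BP_CALL_TRACE)  { /* send a call trace message */ }
+         if (bp->flags & BP_CALL_TIME)   { /* update call_time counters  */ }
+         if (bp->flags & BP_CALL_MEMORY) { /* update call_memory counters */ }
+     }
+ }
+ ```
+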
+ ### Similar to Code Loading but Different

Even though `trace_pattern` use the same technique as the non-blocking
code loading with replicated generations of data structures and an
@@ -290,6 +292,35 @@ tracing is that we insert the `op_i_generic_breakpoint` instruction
(with its pointer at offset -4) in the export entry rather than in the
code.

+ ### call_time and call_memory tracing
+
+ For profiling, `call_time` and/or `call_memory` tracing can be set for a
+ function. This will measure the time/memory spent by the function. The
+ measured time/memory is kept in individual counters for every call-traced
+ process that calls the function. To ensure scalability, scheduler-specific
+ hash tables (`BpTimemTrace`) are used in the breakpoint to map the calling
+ process pid to its time/memory counters.
+
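+ As a rough model (the layout and hashing are invented; the real `BpTimemTrace`
+ tables differ), each scheduler updates only its own table from the breakpoint,
+ so concurrent calls never contend on a shared lock:
+
+ ```c
+ #include <stddef.h>
+ #include <stdint.h>
+
+ #define N_SCHEDULERS 8   /* assumed fixed for the sketch */
+ #define N_BUCKETS    256
+
+ typedef struct pid_counters {
+     uint64_t pid;        /* calling process; 0 means empty slot */
+     uint64_t calls;
+     uint64_t time;       /* accumulated call_time */
+     uint64_t memory;     /* accumulated call_memory */
+ } PidCounters;
+
+ /* One open-addressing hash table per scheduler, owned by that scheduler. */
+ typedef struct timem_table { PidCounters slot[N_BUCKETS]; } TimemTable;
+
+ static TimemTable tables[N_SCHEDULERS];
+
+ /* Called from the breakpoint on the scheduler running the traced process;
+  * no locking is needed because each scheduler touches only its own table. */
+ void bp_account(unsigned scheduler_ix, uint64_t pid,
+                 uint64_t time, uint64_t memory)
+ {
+     TimemTable *t = &tables[scheduler_ix];
+     size_t i = (size_t)(pid % N_BUCKETS);
+     while (t->slot[i].pid != 0 && t->slot[i].pid != pid)
+         i = (i + 1) % N_BUCKETS;   /* linear probing; table assumed not full */
+     t->slot[i].pid     = pid;
+     t->slot[i].calls  += 1;
+     t->slot[i].time   += time;
+     t->slot[i].memory += memory;
+ }
+ ```
+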
+ The function `trace:info` is used to collect stats for `call_time`,
+ `call_memory` or both (`all`). It has to aggregate the counters from all those
+ scheduler-specific hash tables to build a list with one tuple of counters for
+ each pid. This cannot be done safely while the hash tables may be concurrently
+ updated by traced processes.
+
+ Since OTP 29, `trace:info` collects `call_time` and `call_memory` stats without
+ blocking all schedulers from running. This is done by using the active and
+ staging halves of the breakpoint. During normal operation both halves of the
+ breakpoint refer to the same scheduler-specific hash tables. To collect the
+ stats safely, temporary hash tables are created to be used by traced calls that
+ happen during the call to `trace:info`. The temporary hash tables are made
+ active while the "real" hash tables are left inactive in the staging half. When
+ the hash tables are inactive, they can be safely traversed. When done, the real
+ tables are made active again. A final consolidation step collects any stats
+ from the temporary tables, deletes them and makes the two halves of the
+ breakpoint identical again, using the same real hash tables. Scheduling with
+ thread progress is done (`trace_info_finisher()`) between the switches to make
+ sure the traversed hash tables are not being concurrently updated.
+
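+ In pseudo-C, the collection sequence sketched above looks roughly as follows
+ (all helper names are invented for the illustration; details such as locking,
+ error handling and result building are omitted):
+
+ ```c
+ #include <stdlib.h>
+
+ typedef struct tables { int dummy; } Tables;        /* stand-in hash tables */
+
+ typedef struct bp {
+     Tables *half[2];   /* the active and staging halves of the breakpoint */
+     int active_ix;     /* which half traced calls currently use */
+ } Bp;
+
+ static Tables *create_temporary_tables(void) { return calloc(1, sizeof(Tables)); }
+ static void wait_thread_progress(void)       { /* schedule thread progress */ }
+ static void traverse_and_report(Tables *t)   { (void)t; /* build stats list */ }
+ static void consolidate_and_delete(Bp *bp, Tables *tmp) { (void)bp; free(tmp); }
+
+ /* Collection sequence used for call_time/call_memory stats. */
+ void collect_stats(Bp *bp)
+ {
+     int staging = 1 - bp->active_ix;
+     Tables *real = bp->half[bp->active_ix];
+     Tables *tmp  = create_temporary_tables();
+
+     /* 1. Stage the temporary tables and make them active; traced calls
+      *    from now on update the temporaries, not the real tables. */
+     bp->half[staging] = tmp;
+     bp->active_ix = staging;
+
+     /* 2. Wait for thread progress so no scheduler can still be updating
+      *    the real tables, then traverse them safely. */
+     wait_thread_progress();
+     traverse_and_report(real);
+
+     /* 3. Make the real tables active again, wait for thread progress once
+      *    more, then fold the temporaries back in and delete them, leaving
+      *    both halves referring to the same real tables. */
+     bp->active_ix = 1 - bp->active_ix;
+     wait_thread_progress();
+     consolidate_and_delete(bp, tmp);
+     bp->half[1 - bp->active_ix] = real;
+ }
+ ```
+
+ The essential point is that each switch is followed by a wait for thread
+ progress before the now-inactive tables are read or deleted.
+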
### Future work

We still go to single threaded mode when new code is loaded for a