Description
TracePoint is a mechanism to install hooks and count various GC-related events in a code block. GC.stat returns the internal statistics of the GC. Neither is currently well supported when using MMTk, and this has caused some test cases to fail.
Failing test cases
Tests related to TracePoint, mainly TestTracepointObj#test_tracks_objspace_events and TestTracepointObj#test_tracks_objspace_count, failed for various reasons.
In Debug mode, the gc_trace_point() call in obj_free attempts to call GET_EC() to find the execution context of the current mutator thread. However, when using MMTk, obj_free is executed by GC worker threads, which do not have execution contexts, so the process crashes with SIGSEGV.
In Release mode, some counts differ from the expected values. Currently, when using MMTk, we do not call gc_event_trace() during object allocation, so the number of newobj events is always observed as 0.
The test case TestTracepointObj#test_tracks_objspace_count also reads free_count, gc_start_count, gc_end_mark_count and gc_end_sweep_count. They are not implemented, either. It also reads from GC.stat, which does not have the required keys when using MMTk.
Supporting TracePoint
TracePoint is based on gc_event_hook. The GC-related code calls gc_event_hook at various places. Those events are defined in event.h:
RUBY_INTERNAL_EVENT_NEWOBJ
RUBY_INTERNAL_EVENT_FREEOBJ (calling obj_free)
RUBY_INTERNAL_EVENT_GC_START
RUBY_INTERNAL_EVENT_GC_END_MARK
RUBY_INTERNAL_EVENT_GC_END_SWEEP
RUBY_INTERNAL_EVENT_GC_ENTER
RUBY_INTERNAL_EVENT_GC_EXIT
It is not hard to add hooks to newobj_of. Checking gc_event_newobj_hook_needed_p(objspace) and calling gc_event_hook_prep(objspace, RUBY_INTERNAL_EVENT_NEWOBJ, obj, newobj_zero_slot(obj)); is sufficient to get the newobj_count number correct.
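One way to check from the Ruby side whether the NEWOBJ hook fires is the objspace extension's allocation tracing, which in CRuby is implemented on top of the NEWOBJ and FREEOBJ internal events. The following is a minimal sketch; the helper name newobj_hook_observed? is hypothetical and is not part of the test suite discussed here.

```ruby
require "objspace"

# Hypothetical helper: returns true if allocation tracing recorded a source
# file for a freshly allocated object, which suggests the NEWOBJ hook was
# invoked for that allocation.
def newobj_hook_observed?
  observed = false
  ObjectSpace.trace_object_allocations do
    obj = Object.new
    # Query while tracing is still active and obj is still alive.
    observed = !ObjectSpace.allocation_sourcefile(obj).nil?
  end
  observed
end

puts "NEWOBJ hook observed: #{newobj_hook_observed?}"
```

On a build where the hook is not called during allocation, this would be expected to report false rather than crash, which makes it a cheap smoke test before running the full TracePoint tests.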
Other places can be supported similarly, but some things need to be changed.
- Some events are emitted by GC workers. For example, obj_free is now executed by GC worker threads which do not have "execution context", and the GC-start, GC-end events are related to GC workers, too.
- Some events do not make sense for MMTk. For example,
  - When not using MarkSweep, GC_END_MARK and GC_END_SWEEP won't make sense.
  - GC_ENTER and GC_EXIT are CRuby's default GC's notion of whether the VM is "in GC". In the current definition, the VM is also considered "in GC" when a mutator is executing finalizers. But in MMTk, the VM is "in GC" from the point all mutators have stopped until the GC ends.
It is useful to have something like TracePoint for debugging. But it should be adapted to MMTk, or other different GCs, too.
Supporting GC.stat
GC.stat extracts GC-specific statistics. MMTk internally keeps various statistics, too, and the data is used by harness_begin and harness_end. To bridge Ruby's GC.stat with MMTk, we just need to expose the needed API and call into mmtk-core.
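Until such a bridge exists, Ruby code can at least probe which keys the running GC provides. This is a minimal sketch: the helper name missing_gc_stat_keys is hypothetical, and the key list is an assumption for illustration (these keys exist in CRuby's default GC; they are not confirmed here as the exact set the failing test reads).

```ruby
# Hypothetical helper: report which of the wanted keys GC.stat does not provide.
# GC.stat with no argument returns a Hash keyed by Symbol.
def missing_gc_stat_keys(wanted, stat = GC.stat)
  wanted.reject { |key| stat.key?(key) }
end

# Assumed key list, for illustration only.
wanted_keys = %i[count total_allocated_objects total_freed_objects]

missing = missing_gc_stat_keys(wanted_keys)
puts(missing.empty? ? "all keys present" : "missing: #{missing.inspect}")
```

Looking keys up in the returned Hash avoids the ArgumentError that CRuby raises when GC.stat is called with an unknown key.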
About testing
Because the statistics collected from TracePoint and GC.stat can be GC-specific, test cases should be written in a way generic to all GCs, or written specifically for each GC implementation.
However, the VM may perform optimizations that change the number of objects allocated. For example, if the VM detects that a function or a code block never changes a String argument, it may reuse the same String instance instead of allocating a new instance each time. So the number of objects the following code snippet (inspired by TestTracepointObj#test_tracks_objspace_count) allocates may vary if the JIT compiler or the interpreter optimizes the code.
100.times { "" }
200.times { puts "Hello world!".length }
If the VM reuses the same "Hello world!" instance, there will be no objects allocated, or only one object (the String "Hello world!" itself) allocated.
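Whether a literal is reused can be observed by comparing object identities. This is a minimal sketch with a hypothetical helper, distinct_objects. The result for each literal depends on the Ruby implementation and its settings (for example, whether frozen string literals are enabled), so no particular output is claimed.

```ruby
# Hypothetical helper: evaluate the block n times, keep the results alive,
# and count how many distinct objects were returned.
def distinct_objects(n)
  objs = Array.new(n) { yield }
  objs.map(&:object_id).uniq.size
end

# A plain literal usually produces a new String per evaluation, while a
# literal followed by .freeze is commonly deduplicated by CRuby's compiler.
plain  = distinct_objects(3) { "Hello world!" }
frozen = distinct_objects(3) { "Hello world!".freeze }

puts "plain literal:  #{plain} distinct object(s)"
puts "frozen literal: #{frozen} distinct object(s)"
```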
So test cases depending on the implementation details of the GC or potential JIT compiler optimizations may fail mysteriously.
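One way to make such a test less brittle is to assert a lower bound rather than an exact count, and to skip when the GC does not report the statistic. This is a sketch only; the helper name allocated_during is hypothetical, and it assumes the GC exposes :total_allocated_objects through GC.stat as CRuby's default GC does.

```ruby
# Hypothetical helper: number of objects allocated while the block runs, or
# nil when the GC does not report :total_allocated_objects.
def allocated_during
  before = GC.stat[:total_allocated_objects]
  yield
  after = GC.stat[:total_allocated_objects]
  before && after ? after - before : nil
end

# Objects stored in an array cannot be elided by reusing one instance.
delta = allocated_during { Array.new(1000) { Object.new } }

if delta.nil?
  puts "skipped: GC does not report total_allocated_objects"
elsif delta >= 1000
  puts "ok: at least 1000 objects allocated"
else
  raise "expected at least 1000 allocations, got #{delta}"
end
```

The bound tolerates extra allocations made by the VM or by GC.stat itself, so the check does not depend on an exact count.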