Heap introspection API for debugging #1302

Open
@wks


In #803, we proposed a heap traversal API for debugging GC algorithms. We have already introduced the MMTK::enumerate_objects method, which VMs can use to implement user-level traversal APIs, but that API is not enough for debugging GC. GC developers also want to introspect the concrete structure of each policy, such as the chunk/block/line structure of ImmixSpace.

#1300 is one attempt at exposing the block structure of MallocSpace and ImmixSpace. However, it is too much for a simple user-facing object-enumerating API, yet not rich enough for GC developers.

We can provide an API for this kind of fine-grained introspection. For example,

{ // The VM must stop mutators from modifying the heap before entering this block
    let inspector = mmtk.inspect_heap();

    // Create an inspector object for a space.
    let space: SpaceInspector = inspector.space("immixspace").unwrap();

    // Inspect general information about a space.
    eprintln!("Space occupancy: {}", space.used_pages() as f64 / space.total_pages() as f64);

    // Cast to a concrete inspector for more information
    let immix_space: ImmixSpaceInspector = space.downcast::<ImmixSpaceInspector>().unwrap();
    eprintln!("Immix block size: {}", immix_space.block_size());

    // Some users are interested in block-level details.
    for block in immix_space.blocks() { // block: ImmixBlockInspector
        eprintln!("Block {} - {}", block.start(), block.end());
        for object in block.objects() {
            eprintln!("object: {object}");
        }
    }

    // Some users are interested in lines, too.
    for block in immix_space.blocks() { // block: ImmixBlockInspector
        for line in block.lines() { // line: ImmixLineInspector
            for object in line.objects() {
                eprintln!("Block: {block}, line: {line}, object: {object}");
            }
        }
    }

    // We can introspect mutators, too.
    // Since we stopped the world, they are all stopped, too, allowing us to introspect.
    for mutator in inspector.mutators() { // mutator: MutatorInspector
        for bump_pointer in mutator.bump_pointers() {
            eprintln!("Mutator {} is bump-allocating into the region {}-{}", mutator, bump_pointer.cursor, bump_pointer.limit);
        }
    }
} // The VM can resume mutators now.

The key to this API is the set of opaque inspectors, including ImmixSpaceInspector, ImmixBlockInspector and ImmixLineInspector, which allow the VM binding to introspect the heap structure in controlled ways.

For other spaces, we provide other inspectors, such as MarkSweepSpaceInspector, MarkSweepBlockInspector and MarkSweepCellInspector.
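The downcast from an abstract SpaceInspector to a concrete inspector could be built on std::any::Any. The following is only a minimal sketch under assumptions: the trait shape, the as_any helper, the public fields, the "marksweepspace" name and the occupancy function are hypothetical and not part of mmtk-core, and the inspectors hold plain numbers instead of referring to real spaces.

```rust
use std::any::Any;

/// Hypothetical trait shared by all space inspectors.
pub trait SpaceInspector: Any {
    fn name(&self) -> &str;
    fn used_pages(&self) -> usize;
    fn total_pages(&self) -> usize;
    /// Expose `self` as `Any` so that callers can downcast to a concrete type.
    fn as_any(&self) -> &dyn Any;
}

impl dyn SpaceInspector {
    /// Return the concrete inspector if this space really is a `T`.
    pub fn downcast<T: SpaceInspector>(&self) -> Option<&T> {
        self.as_any().downcast_ref::<T>()
    }
}

/// Stand-in for the Immix-specific inspector (fields are illustrative).
pub struct ImmixSpaceInspector {
    pub used_pages: usize,
    pub total_pages: usize,
    pub block_size: usize,
}

impl ImmixSpaceInspector {
    pub fn block_size(&self) -> usize {
        self.block_size
    }
}

impl SpaceInspector for ImmixSpaceInspector {
    fn name(&self) -> &str { "immixspace" }
    fn used_pages(&self) -> usize { self.used_pages }
    fn total_pages(&self) -> usize { self.total_pages }
    fn as_any(&self) -> &dyn Any { self }
}

/// Stand-in for the MarkSweep-specific inspector (fields are illustrative).
pub struct MarkSweepSpaceInspector {
    pub used_pages: usize,
    pub total_pages: usize,
}

impl SpaceInspector for MarkSweepSpaceInspector {
    fn name(&self) -> &str { "marksweepspace" }
    fn used_pages(&self) -> usize { self.used_pages }
    fn total_pages(&self) -> usize { self.total_pages }
    fn as_any(&self) -> &dyn Any { self }
}

/// General information is available without knowing the concrete space type.
pub fn occupancy(space: &dyn SpaceInspector) -> f64 {
    space.used_pages() as f64 / space.total_pages() as f64
}
```

With this shape, a failed downcast returns None instead of panicking, so a debugging tool can probe a space for several concrete inspector types in turn.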

The original idea was that invoking mmtk.inspect_heap() would stop the world, allowing the VM to do all the introspection while no mutators can mutate the heap. Update: mmtk-core doesn't have to initiate STW. Instead, we require that VM bindings only call this API when no mutators can mutate the heap, which is the same requirement as for MMTK::enumerate_objects.

Should this API be public?

We need to call those APIs in the VM binding, so they should be public.

It is debatable whether this API should always be available or be guarded behind a Cargo feature. Most of the functionality provided by this API should have no performance impact. But we may need additional metadata to make some of the introspection possible at mutator time. If such cases exist, we may add a Cargo feature "advanced_introspection" which enables more introspection methods at the cost of some space/time overhead.
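To illustrate what such gating might look like, here is a rough sketch. It assumes the crate's Cargo.toml declares the feature (for example `advanced_introspection = []` under `[features]`). The `used` field, the `new` constructor and the `advanced_introspection_enabled` helper are hypothetical and only serve the example.

```rust
/// Hypothetical inspector for one Immix line.
pub struct ImmixLineInspector {
    start: usize,
    end: usize,
    // Assumed extra metadata that is only recorded when the feature is on.
    #[cfg(feature = "advanced_introspection")]
    used: bool,
}

impl ImmixLineInspector {
    pub fn new(start: usize, end: usize) -> Self {
        Self {
            start,
            end,
            // Placeholder value; a real implementation would read side metadata.
            #[cfg(feature = "advanced_introspection")]
            used: false,
        }
    }

    /// Always available: costs nothing beyond the addresses themselves.
    pub fn start(&self) -> usize {
        self.start
    }

    /// Always available. The end address is exclusive.
    pub fn end(&self) -> usize {
        self.end
    }

    /// Only compiled in when the feature is enabled, because it relies on
    /// the extra metadata above.
    #[cfg(feature = "advanced_introspection")]
    pub fn is_used(&self) -> bool {
        self.used
    }
}

/// Lets a VM binding check at run time how mmtk-core was built.
pub fn advanced_introspection_enabled() -> bool {
    cfg!(feature = "advanced_introspection")
}
```

In this arrangement, builds without the feature pay no space for the extra field, and code that calls is_used() fails to compile rather than silently returning stale data.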

List of API methods

Here is an incomplete list of types and methods we can expose.

  • HeapInspector: The main object for heap inspection / introspection.
    • spaces(): Iterate over spaces.
    • space(name): Get an abstract SpaceInspector for a given space.
    • mutators(): Iterate over mutators.
  • SpaceInspector: The trait for all space inspectors.
    • name(): Get the name
    • ranges(): Go through all contiguous chunk ranges allocated to that space. For Map64, that's just one contiguous range of chunks.
    • objects(): Iterate over all objects
  • ImmixSpaceInspector: For ImmixSpace. Implements SpaceInspector, with Immix-specific methods.
    • blocks(): Iterate over Immix blocks
  • ImmixBlockInspector: A block in ImmixSpace.
    • start(): starting address
    • end(): ending address (exclusive)
    • lines(): Iterate over all lines
    • objects(): Iterate over all objects
  • ImmixLineInspector: A line in an ImmixSpace block.
    • start(): starting address
    • end(): ending address (exclusive)
    • objects(): Iterate over all objects
    • is_used(): Return whether the line is in use. (TODO: Is this decidable at mutator time?)
  • MarkSweepSpaceInspector: For the native MarkSweepSpace. Implements SpaceInspector, with MarkSweep-specific methods.
    • blocks(): Iterate over MarkSweep blocks.
    • blocks_in_size_class(size_class_index): Iterate over all blocks of a given size class.
  • MarkSweepBlockInspector: A block in MarkSweepSpace.
    • start(): starting address
    • end(): ending address (exclusive)
    • size_class(): The size class
    • cells(): Iterate over all cells
    • objects(): Iterate over all objects
  • MarkSweepCellInspector: A cell in a MarkSweepSpace block.
    • start(): starting address
    • end(): ending address (exclusive)
    • objects(): Iterate over all objects
    • is_used(): Return whether the cell is in use. (TODO: Is this decidable at mutator time?)
  • MutatorInspector: Inspect a mutator
    • bump_pointers(): Iterate through all BumpPointer this mutator is using.
    • allocators(): Iterate through all allocators
    • allocator(index): Get an allocator inspector
  • AllocatorInspector: The abstract trait for inspecting allocators.
  • BumpAllocatorInspector: For BumpAllocator
    • bump_pointer(): Get a reference to its bump pointer.
  • ImmixAllocatorInspector: For ImmixAllocator
    • bump_pointer(): Get a reference to its bump pointer.
    • large_bump_pointer(): Get a reference to the bump pointer for medium objects.
  • MarkSweepAllocatorInspector: For MarkSweepAllocator
    • blocks(): Iterate over locally cached blocks.
    • blocks_in_size_class(size_class_index): Iterate over locally cached blocks of a given size class
