
Remove pinning fields in built-in Ruby types. #54

Open
@wks

If an object calls rb_gc_mark on a field, it pins the child. Such objects are potential pinning parents (PPPs) that must be handled specially. Reducing the number of such objects reduces the overhead they impose on a copying GC.
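
To make "handled specially" concrete, here is a toy model in Ruby. It is not CRuby or MMTk code. It assumes the collector keeps a list of PPPs and pins their children before it moves anything, which is a simplification made for this sketch; ToyObj, pinned_children and movable_objects are hypothetical names.

```ruby
require "set"

# Toy heap object. Children listed in `pinning` stand for fields marked
# with rb_gc_mark; children in `movable` stand for fields the GC may update.
class ToyObj
  attr_reader :name, :pinning, :movable

  def initialize(name, pinning: [], movable: [])
    @name = name
    @pinning = pinning
    @movable = movable
  end

  # An object is a potential pinning parent if it has any pinning field.
  def ppp?
    !pinning.empty?
  end
end

# Step 1 (assumed): visit every PPP up front and pin its children,
# so that no pinned child is moved later.
def pinned_children(heap)
  heap.select(&:ppp?).flat_map(&:pinning).to_set
end

# Step 2: trace from the roots. Reachable objects that were not pinned
# in step 1 are the ones a copying GC would be free to move.
def movable_objects(roots, pinned)
  seen = Set.new
  stack = roots.dup
  until stack.empty?
    obj = stack.pop
    next unless seen.add?(obj)
    stack.concat(obj.pinning, obj.movable)
  end
  seen.reject { |o| pinned.include?(o) }
end

leaf_a = ToyObj.new("a")
leaf_b = ToyObj.new("b")
parent = ToyObj.new("parent", pinning: [leaf_a], movable: [leaf_b])

pinned = pinned_children([leaf_a, leaf_b, parent])
puts movable_objects([parent], pinned).map(&:name).sort.inspect  # ["b", "parent"]
```

In this model the extra cost is step 1: every PPP is visited before tracing starts, so fewer PPPs means less work per collection and fewer objects that cannot be copied.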

Related higher-level issues are:

This issue keeps a list of built-in types that are PPPs, and why they pin their children.

Objects that can pin their children

  • T_DATA: Some third-party libraries were written before Ruby introduced moving GC, so their mark functions still use rb_gc_mark.
  • T_IMEMO
    • imemo_ifunc:
      • gc_mark_maybe(RANY(obj)->as.imemo.ifunc.data) type: VALUE
      • ifunc represents a "block written in C",
        and data is the "extra argument" passed to the block in addition to the yielded data.
        • I guess that because the ifunc is written in C,
          the data can be anything (as long as the C function recognizes it),
          even though it is supposed to be a VALUE that holds a Ruby value.
          Marking it conservatively could be a compromise due to frequent misuse.
    • imemo_memo:
      • gc_mark_maybe(RANY(obj)->as.imemo.memo.u3.value)
      • It looks like a generic "memo" type. The u3 field is an untagged union that can be anything.
    • imemo_iseq (no longer a PPP since ruby/ruby#7156, "Make all of the references of iseq movable"):
      • Union aux members
        • rb_gc_mark(iseq->aux.loader.obj)
        • rb_gc_mark(compile_data->catch_table_ary)
        • rb_hook_list_mark(iseq->aux.exec.local_hooks) which calls rb_gc_mark(hook->data) for each hook.
        • rb_iseq_mark_insn_storage(compile_data->insn.storage_head) which calls rb_gc_mark(op)
        • The fields above are parts of a union (iseq->aux).
          Other union variants do not hold a reference at the same offset, so marking has to be conservative.
          • It should be possible to test the union tag to know precisely which case it is.
            • Actually rb_iseq_mark is testing the union tags!
      • MJIT:
        • mjit_mark_cc_entries(body)
    • imemo_tmpbuf:
      • Fully conservative.
        • Calls rb_gc_mark_locations on all offsets.
        • It is used to implement ALLOCV. I think it has to be a PPP because of its conservative nature.
    • imemo_ast:
      • rb_gc_mark(ast->node_buffer->mark_hash)
      • rb_gc_mark(ast->body.compile_option)
      • rb_gc_mark(ast->body.script_lines)
      • rb_ast_update_references calls update_ast_value on each NODE, but does not update the three fields above.
    • imemo_parser_strterm:
      • rb_gc_mark(heredoc->lastline)
      • It is part of a union, but rb_strterm_mark already tests the tag.
  • T_HASH: After Hash#compare_by_identity is called, the hash is marked with pin_key_mark_value, which pins its keys.
    • compare_by_identity: Sets self to consider only identity in comparing keys;
      two keys are considered the same only if they are the same object; returns self.
      • Cannot be undone. A good candidate for using a remembered set.
    • Can be made non-PPP by introducing address-based hashing.
  • Any object that has gen_ivtab (No longer PPP since ruby@de72448)
    • What's that?
      • gen_ivtab = generic instance variable table
        • useful for adding custom variables to anything other than T_OBJECT
      • gc_mark_children -> (if EXIVAR) rb_mark_generic_ivar -> gen_ivtbl_mark -> rb_gc_mark
      • generic_iv_tbl_: (in variable.c) a global st_table mapping obj to gen_ivtable.
    • Seems unnecessary.
      • I patched the code to let it move, and it seems to work.
      • wks@282148b
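
Two items in the list above can be observed from plain Ruby, which may help when reproducing them: identity hashes (T_HASH) and instance variables on objects other than T_OBJECT (the gen_ivtab case). The snippet below is only an illustration of the Ruby-level behavior. It does not show pinning itself, and the remark about where String instance variables are stored describes the CRuby versions this issue refers to.

```ruby
# Identity hash: keys match only if they are the very same object,
# so two strings with equal content become two separate entries.
k1 = String.new("key")
k2 = String.new("key")
by_id = {}.compare_by_identity
by_id[k1] = :first
by_id[k2] = :second

puts by_id.compare_by_identity?          # true
puts by_id.size                          # 2
puts by_id[k1].inspect                   # :first
puts by_id[String.new("key")].inspect    # nil (equal content, different object)

# Instance variable on a String. A String is not a T_OBJECT, so in the
# CRuby versions discussed here the variable is kept in the generic
# instance variable table rather than in the object itself.
str = String.new("hello")
str.instance_variable_set(:@note, "attached to a String")
puts str.instance_variable_get(:@note)   # attached to a String
```

Because lookups in an identity hash depend on which object the key is rather than on its content, moving a key is only safe if the hash stays consistent across the move. That is the motivation for the remembered-set and address-based hashing suggestions above.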
