Use refcount based gc (shared_ptr) when Boehm GC is disabled#5640

Draft
ChrisDodd wants to merge 16 commits into
p4lang:mainfrom
ChrisDodd:cdodd-disablegc
Conversation

@ChrisDodd ChrisDodd commented Jun 1, 2026

Contributor

This is a huge change that touches a lot of code but doesn't really do anything.

The main change is to replace const IR::NodeType * in thousands of places with IR::Ptr<NodeType>, and, when the compiler is configured with ENABLE_GC=OFF, to use a reference-counted 'smart' pointer for that. In the normal build case, IR::Ptr<T> is just an alias for const T *, so there's no real change.
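To make the aliasing idea concrete, here is a minimal sketch, not the PR's actual code. It assumes the switch is the HAVE_LIBGC macro mentioned in the commit log, uses std::shared_ptr as a stand-in for the custom IR::shared_ptr, and introduces a hypothetical `IR::make` helper and a toy `IR::Constant` purely for illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

namespace IR {
struct Node {
    virtual ~Node() = default;
};
// Toy node type for illustration only.
struct Constant : Node {
    int value;
    explicit Constant(int v) : value(v) {}
};

#ifdef HAVE_LIBGC
// GC build: IR::Ptr<T> is a plain const pointer; the collector reclaims nodes.
template <class T>
using Ptr = const T *;
template <class T, class... Args>
Ptr<T> make(Args &&...args) {  // hypothetical helper
    return new T(std::forward<Args>(args)...);
}
#else
// Non-GC build: a reference-counted pointer (std::shared_ptr as a stand-in).
template <class T>
using Ptr = std::shared_ptr<const T>;
template <class T, class... Args>
Ptr<T> make(Args &&...args) {  // hypothetical helper
    return std::make_shared<T>(std::forward<Args>(args)...);
}
#endif
}  // namespace IR

// Code written against IR::Ptr compiles unchanged in both configurations.
int readValue(const IR::Ptr<IR::Constant> &c) { return c->value; }

void example() {
    auto c = IR::make<IR::Constant>(7);
    assert(readValue(c) == 7);
}
```

Because callers only ever spell `IR::Ptr<T>`, the choice of memory management stays a build-time decision rather than something each pass has to know about.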

We use a custom IR::shared_ptr (rather than std::shared_ptr) because in many cases IR::Node objects are allocated on the stack or embedded in other objects (with inline in .def files), so we need to detect whether an object was allocated with new, and only use the refcount to clean up those that were.
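One common way to tell heap objects from stack or embedded ones is to have a class-specific operator new record the block it hands out, and have the base constructor check whether `this` falls inside a recorded block. The sketch below illustrates that general technique only; it is an assumption about how such detection can work, not the PR's implementation. All names (`Counted`, `CountedPtr`, `Leaf`) are hypothetical, and exception safety is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <utility>
#include <vector>

// Intrusive refcount base that remembers whether it came from `new`.
struct Counted {
    static void *operator new(std::size_t sz) {
        void *p = ::operator new(sz);
        // Record the fresh block so the constructor can recognize it.
        pending().push_back({reinterpret_cast<std::uintptr_t>(p), sz});
        return p;
    }
    static void operator delete(void *p) { ::operator delete(p); }

    Counted() : onHeap(claim(this)) {}
    Counted(const Counted &) : onHeap(claim(this)) {}
    Counted &operator=(const Counted &) { return *this; }  // keeps own state
    virtual ~Counted() = default;

    bool isHeapAllocated() const { return onHeap; }

 private:
    struct Block {
        std::uintptr_t base;
        std::size_t size;
    };
    static std::vector<Block> &pending() {
        thread_local std::vector<Block> blocks;
        return blocks;
    }
    // True (and the record is consumed) if `self` lies in a pending block.
    static bool claim(const void *self) {
        auto addr = reinterpret_cast<std::uintptr_t>(self);
        auto &blocks = pending();
        for (auto it = blocks.begin(); it != blocks.end(); ++it) {
            if (addr >= it->base && addr < it->base + it->size) {
                blocks.erase(it);
                return true;
            }
        }
        return false;
    }
    mutable int refs = 0;
    const bool onHeap;
    template <class T>
    friend class CountedPtr;
};

// Smart pointer that only deletes objects that were allocated with `new`.
template <class T>
class CountedPtr {
    const T *p = nullptr;

 public:
    CountedPtr() = default;
    CountedPtr(const T *q) : p(q) {
        if (p) ++p->refs;
    }
    CountedPtr(const CountedPtr &o) : CountedPtr(o.p) {}
    CountedPtr &operator=(CountedPtr o) {
        std::swap(p, o.p);
        return *this;
    }
    ~CountedPtr() {
        if (p && --p->refs == 0 && p->isHeapAllocated()) delete p;
    }
    const T *operator->() const { return p; }
};

struct Leaf : Counted {
    int v;
    explicit Leaf(int v) : v(v) {}
};

int demo() {
    Leaf onStack(1);
    CountedPtr<Leaf> a(&onStack);     // counted, but never deleted
    CountedPtr<Leaf> b(new Leaf(2));  // deleted when the last ref goes away
    assert(!onStack.isHeapAllocated());
    assert(b->isHeapAllocated());
    return a->v + b->v;
}
```

A scheme like this depends on the base constructor running at a predictable point relative to operator new, which is why constructor-ordering details matter for it.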

There are still many places in the compiler that will leak memory (for things that are not IR nodes); those will need to be individually changed to use std::shared_ptr (or equivalent) to avoid leaking.

Note that this PR does not (yet) contain fixes for the tofino backend to make it build/work with ENABLE_GC=OFF.

@ChrisDodd ChrisDodd requested review from asl and fruffy June 1, 2026 23:52
@fruffy fruffy requested a review from smolkaj June 1, 2026 23:54
@fruffy fruffy added bmv2 Topics related to BMv2 or v1model core Topics concerning the core segments of the compiler (frontend, midend, parser) compiler-performance Topics on improving the performance of the compiler core. breaking-change This change may break assumptions of compiler back ends. and removed bmv2 Topics related to BMv2 or v1model labels Jun 1, 2026
@fruffy fruffy requested a review from kfcripps June 1, 2026 23:54
fruffy commented Jun 1, 2026

Collaborator

This is great!!

One way to test is to disable GC for CI and check what happens. We can also disable it for testgen in general, which has long-running tests that are sensitive to leaking memory. Curious about the performance of this too.

@ChrisDodd ChrisDodd marked this pull request as draft June 1, 2026 23:57
Chris Dodd and others added 16 commits June 3, 2026 05:32
- IR::Ptr that maps to 'const *' or IR::shared_ptr depending on
  HAVE_LIBGC

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- fields held in P4::MethodInstance objects and other subclasses
- return values of typechecking canonicalize and specialize methods
- returned value in bindVariables
- CloneExpressions::clone return value
- FrontEnd::run return value
- typeMap::setType argument, due to strangeness with absl maps apparently
  destroying the forwarded argument before copying it.
- check for unreferenced raw pointers in typeMap that would otherwise
  get deleted, due to its internal use of IR::Ptr

Also fix IR ctors to not use delegating ctors, as they evaluate the
delegated args *before* calling the base class ctor, which screws up
the IR::shared_ptr logic for figuring out if an object is on the heap or not

- apply_visitor routines need to ensure there's a ref to the node to be
  returned *before* calling visited.reset()

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- preorder/postorder methods all return raw pointers to allow using
  covariant return typing to return something more specific than an
  IR::Node pointer -- only works for raw pointers, not smart pointers
- guardReturn stashes a copy of an IR::Ptr temporarily in the Transform
  so it won't be freed when returning as a raw pointer if there are no
  other references
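The covariance restriction and the guard idea can be sketched as follows. This is an illustrative sketch under stated assumptions: `Rewriter`, `Doubler`, `rewrite`, and the `guard` member are hypothetical names, std::shared_ptr stands in for IR::Ptr, and the real Transform API is more involved.

```cpp
#include <cassert>
#include <memory>

struct Node {
    virtual ~Node() = default;
};
struct Expr : Node {
    int v = 0;
};

struct Rewriter {
    std::shared_ptr<const Node> guard;  // hypothetical: holds the last result

    // Raw pointer return type, so overrides may narrow it (covariance).
    virtual const Node *rewrite(const Expr &e) { return &e; }
    virtual ~Rewriter() = default;

    // Stash a reference so the node outlives the returned raw pointer.
    template <class T>
    const T *guardReturn(std::shared_ptr<const T> p) {
        guard = p;
        return p.get();
    }
};

struct Doubler : Rewriter {
    // Covariant override: const Expr * instead of const Node *.
    // The same narrowing with std::shared_ptr<const Expr> would not compile,
    // because smart pointer types are not covariant return types.
    const Expr *rewrite(const Expr &e) override {
        auto fresh = std::make_shared<Expr>();
        fresh->v = e.v * 2;
        // Without the guard, `fresh` would be freed on return.
        return guardReturn<Expr>(fresh);
    }
};

int demo() {
    Expr in;
    in.v = 21;
    Doubler d;
    const Expr *out = d.rewrite(in);  // valid while d.guard holds the node
    return out->v;
}
```

The guard only protects the most recent result, so a caller that wants to keep the node has to take its own counted reference before the next call replaces it.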

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- ProgramMap persistent handles to P4Program
- temps in hsIndexSimplify
- ReferenceMap internal caches
- Inlining held values
- ConstantTypeSubstitution convert method return values
- TypeConstraint internals/caches

Can't safely return a pointer to an inline field, as the containing
object might be freed; need explicit clone in DoSimplifyControlFlow

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- constantFolding internal cloning
- DoReplaceTuples
- DoLocalCopyPropagation
- specializeGenericTypes/ReplaceTypeUses
- irutils.cpp
- eliminateTypedefs/DoReplaceTypedef
- defaultArguments/TypeNameSubstitutionVisitor

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- inlining/GeneralInline temp
- removeParameters/DoRemoveActionParameters temps
- sideEffects/DoSimplifyExpressions created temp vars and additions
- specialization (SpecializeFunctions/TypeSpecialization) added things
- getP4Type() return value cached many places
- typeChecker/TypeInferenceBase::constantFold return value
- more typechecking temp/cached values
- TypeSubstitution bindings
- TypeMap leftValues and constants being inserted into maps
- midend flattening transforms temps
- midend/interpreter symbolic values
- parserUnroll/ParserSymbolicInterpreter created temps
- unrollLoops temp body being constructed

Inspector Visitor::Tracker removes a node from the cache if it is to be
revisited.  This reduces the size of the tracker cache for visitDagOnce = false
visitors and avoids a dangling temp problem with toP4

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- move const so we can have both `IR::Ptr` and `IR::MutablePtr`

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- bmv2/dpdk/ebpf/ubpf changes to build and (mostly?) work
- tofino initial changes to main/frontend
- minor typecheck fixes
- Evaluator fixes using IR::MutablePtr and IR::Ptr

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
Signed-off-by: Chris Dodd <cdodd@nvidia.com>
Signed-off-by: Chris Dodd <cdodd@nvidia.com>
Signed-off-by: Chris Dodd <cdodd@nvidia.com>
Signed-off-by: Chris Dodd <cdodd@nvidia.com>
Signed-off-by: Chris Dodd <cdodd@nvidia.com>
Signed-off-by: Chris Dodd <cdodd@nvidia.com>
- do NOT use IR::Ptr for the return value of getWriteDest in parde.def,
  so that we can have covariant return types.
- stateful alu creation use IR::Ptr throughout
- fromv1.0 stuff use IR::Ptr in most places
- MANY changes 'const auto *' => 'auto' to allow inferring IR::Ptr
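The reason for the 'const auto *' => 'auto' rewrite can be seen in a small standalone sketch. The `Type` struct and the two lookup functions are hypothetical, and std::shared_ptr stands in for the counted flavor of IR::Ptr.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Type {
    int width = 8;
};

// Hypothetical lookups returning the two possible flavors of IR::Ptr.
const Type *rawLookup() {
    static const Type t{};
    return &t;
}
std::shared_ptr<const Type> countedLookup() { return std::make_shared<Type>(); }

int widths() {
    const auto *a = rawLookup();  // fine: deduces const Type *
    // const auto *bad = countedLookup();  // error: a shared_ptr is not a pointer type

    // Plain `auto` deduces whichever type the function returns.
    auto b = rawLookup();
    auto c = countedLookup();
    static_assert(std::is_same<decltype(b), const Type *>::value, "raw pointer");
    static_assert(std::is_same<decltype(c), std::shared_ptr<const Type>>::value,
                  "counted pointer");
    assert(c.use_count() == 1);  // `c` owns its node; no raw alias escaped
    return a->width + b->width + c->width;
}
```

With plain `auto`, the same source line keeps a counted reference alive in the refcount build and stays a plain const pointer in the GC build.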

Signed-off-by: Chris Dodd <cdodd@nvidia.com>
@fruffy fruffy added the p4tc Topics related to the P4-TC back end. On PRs, also triggers p4tc CI tests to run. label Jun 4, 2026
@fruffy fruffy added run-validation Use this tag to trigger a Validation CI run. run-sanitizer Use this tag to run a Clang+Sanitzers CI run. run-static Use this tag to trigger static build CI run. labels Jun 4, 2026
Labels

breaking-change This change may break assumptions of compiler back ends. compiler-performance Topics on improving the performance of the compiler core. core Topics concerning the core segments of the compiler (frontend, midend, parser) p4tc Topics related to the P4-TC back end. On PRs, also triggers p4tc CI tests to run. run-sanitizer Use this tag to run a Clang+Sanitzers CI run. run-static Use this tag to trigger static build CI run. run-validation Use this tag to trigger a Validation CI run.

2 participants