CXX allocator concept pass + cpg-analysis/cpg-concepts dependency flip #2774

Open
oxisto wants to merge 6 commits into main from oxisto/allocate-concept-size

oxisto commented Jun 1, 2026

Member

Summary

Replaces the malloc string-match in ArrayValue.getSize with a generic read of an Allocate memory-concept overlay. The actual recognition of allocator calls moves into a new language-specific pass, CXXMemoryAllocationPass, which lives next to the C/C++ frontend.

Along the way:

  • Module layering flipped: cpg-analysis now depends on cpg-concepts instead of the other way round. The only coupling was one file (PolicyQueries.kt), which moved to its semantic home in cpg-analysis. The new dependency chains are cpg-analysis → cpg-concepts → cpg-core and cpg-language-cxx → cpg-concepts → cpg-core.
  • Existing CXX concept passes relocated: CXXDynamicLoadingPass and CXXEntryPointsPass moved from cpg-concepts into cpg-language-cxx. cpg-concepts now contains only language-agnostic concept definitions.
  • Auto-registration: CXXLanguageFrontend carries @RegisterExtraPass(CXXMemoryAllocationPass::class) so the pass runs automatically with defaultPasses() whenever the C/C++ frontend is loaded — no more manual it.registerPass<>() for the common case.

Commit walkthrough

  1. Flip cpg-analysis ↔ cpg-concepts: PolicyQueries.kt → cpg-analysis; gradle deps swapped.
  2. Allocate.size: Expression? — new field on the existing concept + builder support.
  3. Relocate CXX concept passes: CXXDynamicLoadingPass, CXXEntryPointsPass → cpg-language-cxx. ConceptPass.getConceptOrCreate widened from internal to protected.
  4. CXXMemoryAllocationPass — matches malloc/calloc/realloc; for calloc the size is a synthesised count * elemSize BinaryOperator (ValueEvaluator constant-folds when both operands are literals).
  5. Wire ArrayValue.getSize to read Allocate overlays + auto-register the pass. Memory concept attaches to the Component (one heap per program, shared across TUs).
  6. Tests — 6 integration tests covering malloc with a constant, malloc via Assign, calloc, realloc, malloc with a non-constant size, and the negative case (no overlay on unrelated calls).

Why Expression for size?

A bare Int only handles malloc(constant). Keeping the original Expression node means:

  • Constant cases: callers do (size.value.value as Number).toLong().
  • Variable cases (e.g. malloc(n)): callers can run interval analysis, taint, DFG tracing, etc.
  • The synthesised count * elemSize for calloc folds uniformly across literal-times-literal, variable-times-literal, etc.
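
The point of keeping an expression rather than a bare integer can be illustrated with a small, self-contained sketch. The `Expr`, `Lit`, `Ref` and `Mul` types and the `fold` function below are hypothetical stand-ins, not the CPG `Expression` hierarchy or `ValueEvaluator`; they only show how a single size expression serves both the constant and the variable case.

```kotlin
// Toy expression model (assumption: illustrative only, NOT the CPG node classes).
sealed interface Expr
data class Lit(val value: Long) : Expr
data class Ref(val name: String) : Expr
data class Mul(val lhs: Expr, val rhs: Expr) : Expr

// Returns the constant value of an expression, or null when it depends on a variable.
fun fold(e: Expr): Long? = when (e) {
    is Lit -> e.value
    is Ref -> null
    is Mul -> {
        val l = fold(e.lhs)
        val r = fold(e.rhs)
        if (l != null && r != null) l * r else null
    }
}

fun main() {
    // calloc(4, 8): literal times literal folds to a constant.
    println(fold(Mul(Lit(4), Lit(8))))
    // calloc(n, 8): not foldable, but the expression is still available
    // for other analyses that can reason about `n`.
    println(fold(Mul(Ref("n"), Lit(8))))
}
```

In the non-foldable case the caller still holds the full expression tree, which is what a bare integer field would lose.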

Not covered (one-line extensions, left for follow-ups)

  • new / new[] — different AST node shape in CDT.
  • aligned_alloc, posix_memalign — size lives in different argument positions.
  • Python / other language allocator passes — same pattern applies; pass goes in the respective language module.

Test plan

  • :cpg-core:test — green
  • :cpg-analysis:test — green
  • :cpg-language-cxx:test (including new CXXMemoryAllocationPassTest + existing ArraySizeEvaluatorTest, TypedefTest, etc.) — green
  • :cpg-concepts:integrationTest (including the relocated DynamicLoadingTest, EntryPointTest, PolicyTest) — green

🤖 Generated with Claude Code

oxisto added 6 commits May 29, 2026 11:32
The previous direction (cpg-concepts depending on cpg-analysis) was driven
entirely by one file — PolicyQueries.kt — which is semantically a
"query that uses concepts", not a concept definition. Moving it to
cpg-analysis (where the cpg.query DSL lives) lets cpg-concepts shed the
cpg-analysis dependency, and lets cpg-analysis take cpg-concepts as a
clean dependency in turn.

Why: the next steps need cpg-analysis to read concept overlays (e.g.
ArrayValue consulting an Allocate concept for the size of a malloc/new),
and want cpg-language-* to register concept passes via `@RegisterExtraPass`
on their frontend. Both of those require cpg-concepts to sit *below*
cpg-analysis in the dependency graph, not the other way round. Flipping
now avoids a scaffold HasAllocatedSize interface in cpg-core.

- cpg-analysis/build.gradle.kts: api(projects.cpgConcepts).
- cpg-concepts/build.gradle.kts: drop implementation(projects.cpgAnalysis);
  add api(projects.cpgCore) since the analysis api dependency previously
  pulled core in transitively. integrationTestImplementation kept so the
  PolicyTest in cpg-concepts still reaches the relocated query.
- PolicyQueries.kt path moves cpg-concepts → cpg-analysis; package name
  unchanged so no import updates needed downstream.

`var size: Expression?` on the Allocate operation captures the expression
that determines how many bytes/elements the allocation reserves. For C
this is the `N` of `malloc(N)`, the `M * N` of `calloc(M, N)`, etc.;
language-specific concept passes are free to populate it from whatever
shape their allocator takes. `null` when the size can't be derived.

The `newAllocate()` builder takes the new field as an optional parameter
so existing call sites keep compiling unchanged.
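
The backward-compatibility argument rests on Kotlin default arguments. A minimal sketch, assuming simplified stand-in types: the `Allocate` class and `newAllocate` function below are hypothetical and much smaller than the real ones in cpg-concepts, and `size` is modelled as a plain string instead of an `Expression`.

```kotlin
// Hypothetical stand-ins; the real Allocate concept and newAllocate() builder
// in cpg-concepts have different, richer signatures.
data class Allocate(val what: String?, val size: String?)

// `size` defaults to null, so call sites written before the field existed still compile.
fun newAllocate(what: String?, size: String? = null): Allocate = Allocate(what, size)

fun main() {
    val legacy = newAllocate("p")              // pre-existing call shape, size stays null
    val sized = newAllocate("p", size = "16")  // new call shape passes the size
    println(legacy)
    println(sized)
}
```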

Concept passes that recognise C/C++ stdlib/runtime functions belong with
the C/C++ frontend, not in the generic cpg-concepts module. Moves
CXXDynamicLoadingPass and CXXEntryPointsPass over and adds a
cpg-concepts dependency to cpg-language-cxx so they can keep importing
the concept definitions.

Why move them at all: this PR is about to add CXXMemoryAllocationPass.
Putting that next to its CXX siblings keeps everything consistent. It
also unlocks `@RegisterExtraPass` on CXXLanguageFrontend so the passes
auto-register with `defaultPasses()` — addressing the long-standing
"users had to remember `registerPass<>()` manually" friction.

- cpg-language-cxx/build.gradle.kts: implementation(projects.cpgConcepts).
- ConceptPass.getConceptOrCreate: internal → protected so subclasses in
  other modules can still use it.
- Package paths preserved; existing integration tests in
  cpg-concepts/integrationTest keep working unchanged because they
  already had cpg-language-cxx on the integration test classpath.

Recognises malloc / calloc / realloc and attaches an Allocate operation
to each call, populated with `what` (the target variable, when obvious
from the surrounding Variable initializer or Assign) and `size` (the
allocation expression). For calloc the size is synthesised as a `*`
BinaryOperator over the count and element-size arguments — ValueEvaluator
constant-folds it when both operands are literals.

Annotated CXXLanguageFrontend with @RegisterExtraPass so the pass
auto-registers whenever a translation runs with `defaultPasses()` and the
CXX frontend is enabled. Callers can still register it manually for
custom configurations.

Not yet covered: `new` / `new[]` (different AST node), aligned_alloc /
posix_memalign (size in different arg positions) — both are one-line
extensions to the when-branch.
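
The per-allocator choice of size argument described in this commit message can be sketched as a `when` over the callee name. This is a self-contained illustration under stated assumptions: `Expr`, `Call` and `allocationSize` are hypothetical names, not the CPG node types or the actual code of CXXMemoryAllocationPass. The argument positions follow the C signatures `malloc(size)`, `calloc(count, elemSize)` and `realloc(ptr, newSize)`.

```kotlin
// Toy model (assumption: illustrative only, not CPG classes).
sealed interface Expr
data class Lit(val value: Long) : Expr
data class Ref(val name: String) : Expr
data class Mul(val lhs: Expr, val rhs: Expr) : Expr

data class Call(val name: String, val args: List<Expr>)

// Picks the expression that determines the allocation size, or null if it cannot be derived.
fun allocationSize(call: Call): Expr? = when (call.name) {
    "malloc" -> call.args.getOrNull(0)
    "calloc" -> {
        val count = call.args.getOrNull(0)
        val elemSize = call.args.getOrNull(1)
        // Synthesise count * elemSize only when both arguments are present.
        if (count != null && elemSize != null) Mul(count, elemSize) else null
    }
    "realloc" -> call.args.getOrNull(1) // the new size, not the old buffer
    else -> null                        // not a recognised allocator
}

fun main() {
    println(allocationSize(Call("calloc", listOf(Lit(4), Ref("n")))))
    println(allocationSize(Call("realloc", listOf(Ref("p"), Lit(64)))))
    println(allocationSize(Call("printf", listOf(Ref("fmt")))))
}
```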

ArrayValue.getSize previously string-matched `call.name.localName == "malloc"`.
With CXXMemoryAllocationPass now attaching an Allocate concept to every
recognised allocator call, the size lookup becomes generic:

    call.overlays.filterIsInstance<Allocate>().firstOrNull()?.size

The same code path works for malloc/calloc/realloc and for any future
language pass that populates Allocate.size — no per-allocator special
cases in cpg-analysis.
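
To make the generic lookup concrete, here is a minimal self-contained model of it. The `Overlay`, `Allocate`, `OtherConcept` and `Call` types are hypothetical stand-ins for the CPG classes, and `size` is a plain string instead of an `Expression`; only the `filterIsInstance<…>().firstOrNull()?.size` chain mirrors the line quoted above.

```kotlin
// Toy model (assumption: illustrative only, not the CPG overlay API).
interface Overlay
data class Allocate(val size: String?) : Overlay
data class OtherConcept(val label: String) : Overlay

class Call(val name: String, val overlays: List<Overlay> = emptyList())

// Same shape as the lookup in the commit message: no allocator names appear here.
fun allocatedSize(call: Call): String? =
    call.overlays.filterIsInstance<Allocate>().firstOrNull()?.size

fun main() {
    val mallocCall = Call("malloc", listOf(OtherConcept("x"), Allocate("16")))
    val plainCall = Call("printf")
    println(allocatedSize(mallocCall))
    println(allocatedSize(plainCall))
}
```

The lookup returns null both when no Allocate overlay is attached and when one is attached without a derivable size, so callers that need to distinguish the two cases would have to inspect the overlay itself.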

Also for the pass itself:
- Memory concept attaches to the Component, not the TranslationUnit:
  a C/C++ program has one heap conceptually, and all Allocate ops point
  back to that shared concept.
- @dependsOn(EvaluationOrderGraphPass) + @dependsOn(SymbolResolver) so
  the pass runs after the call name is resolved and the EOG is built —
  newOperation inserts into the EOG path, which requires EOG to exist.

Covers:
- malloc with a constant size → Allocate.size is the literal, .what is
  the bound Variable
- malloc via an Assign rather than an initializer → .what still resolves
  through the LHS Reference
- calloc(M, N) → Allocate.size is the synthesised `*` BinaryOperator
  with both operands intact
- realloc(p, N) → .size is the new-size argument, not the old buffer
- malloc(n) with a non-constant size → .size is the Reference to `n`
  (downstream is free to constant-evaluate, interval-bound, or DFG-trace)
- calls to other functions get no Allocate overlay attached
Copilot AI review requested due to automatic review settings June 1, 2026 07:11
codecov Bot commented Jun 1, 2026

Codecov Report

❌ Patch coverage is 61.70213% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.07%. Comparing base (05de61f) to head (c71be29).
⚠️ Report is 14 commits behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ses/concepts/memory/cxx/CXXMemoryAllocationPass.kt | 75.75% | 0 Missing and 8 partials ⚠️ |
| ...isec/cpg/analysis/abstracteval/value/ArrayValue.kt | 20.00% | 1 Missing and 3 partials ⚠️ |
| ...aunhofer/aisec/cpg/graph/concepts/memory/Memory.kt | 33.33% | 4 Missing ⚠️ |
| .../cpg/graph/concepts/memory/MemoryConceptBuilder.kt | 50.00% | 1 Missing ⚠️ |
| ...raunhofer/aisec/cpg/passes/concepts/ConceptPass.kt | 0.00% | 1 Missing ⚠️ |

❌ Your patch check has failed because the patch coverage (61.70%) is below the target coverage (75.00%). You can increase the patch coverage or adjust the target coverage.

Additional details and impacted files
| Files with missing lines | Coverage Δ |
|---|---|
| ...aisec/cpg/queries/concepts/policy/PolicyQueries.kt | 100.00% <ø> (ø) |
| ...fer/aisec/cpg/frontends/cxx/CXXLanguageFrontend.kt | 77.00% <ø> (ø) |
| ...pg/passes/concepts/flows/cxx/CXXEntryPointsPass.kt | 89.47% <ø> (ø) |
| ...asses/concepts/memory/cxx/CXXDynamicLoadingPass.kt | 64.70% <ø> (ø) |
| .../cpg/graph/concepts/memory/MemoryConceptBuilder.kt | 95.65% <50.00%> (+18.37%) ⬆️ |
| ...raunhofer/aisec/cpg/passes/concepts/ConceptPass.kt | 80.00% <0.00%> (ø) |
| ...isec/cpg/analysis/abstracteval/value/ArrayValue.kt | 73.58% <20.00%> (+0.50%) ⬆️ |
| ...aunhofer/aisec/cpg/graph/concepts/memory/Memory.kt | 76.19% <33.33%> (+17.36%) ⬆️ |
| ...ses/concepts/memory/cxx/CXXMemoryAllocationPass.kt | 75.75% <75.75%> (ø) |

... and 4 files with indirect coverage changes

Copilot AI left a comment

Contributor

Pull request overview

This PR moves C/C++ allocator recognition into a CXX-specific concept pass and updates array-size evaluation to consume the language-agnostic Allocate concept, while also flipping the module dependency direction so cpg-analysis depends on cpg-concepts (and cpg-concepts depends only on cpg-core).

Changes:

  • Introduce CXXMemoryAllocationPass to attach Allocate(size, what) overlays to malloc/calloc/realloc calls and auto-register it in the CXX frontend.
  • Extend the Allocate concept with an optional size: Expression? and propagate it through the concept builder.
  • Flip/adjust Gradle module dependencies and relocate CXX-specific concept passes into cpg-language-cxx; move PolicyQueries.kt into cpg-analysis.

Reviewed changes

Copilot reviewed 11 out of 14 changed files in this pull request and generated 3 comments.

| File | Description |
|---|---|
| cpg-language-cxx/src/test/resources/allocations/allocations.c | Adds C allocation fixtures used by the new pass tests. |
| cpg-language-cxx/src/test/kotlin/de/fraunhofer/aisec/cpg/passes/concepts/memory/cxx/CXXMemoryAllocationPassTest.kt | Integration tests validating Allocate overlays and extracted sizes. |
| cpg-language-cxx/src/main/kotlin/de/fraunhofer/aisec/cpg/passes/concepts/memory/cxx/CXXMemoryAllocationPass.kt | New CXX pass that recognizes allocator calls and attaches Allocate concepts. |
| cpg-language-cxx/src/main/kotlin/de/fraunhofer/aisec/cpg/passes/concepts/memory/cxx/CXXDynamicLoadingPass.kt | Relocated CXX dynamic loading concept pass into the CXX language module. |
| cpg-language-cxx/src/main/kotlin/de/fraunhofer/aisec/cpg/passes/concepts/flows/cxx/CXXEntryPointsPass.kt | Relocated CXX entry point concept pass into the CXX language module. |
| cpg-language-cxx/src/main/kotlin/de/fraunhofer/aisec/cpg/frontends/cxx/CXXLanguageFrontend.kt | Auto-registers the new memory allocation pass via @RegisterExtraPass. |
| cpg-language-cxx/build.gradle.kts | Adds dependency on cpg-concepts needed for concept passes living in cpg-language-cxx. |
| cpg-concepts/src/main/kotlin/de/fraunhofer/aisec/cpg/passes/concepts/ConceptPass.kt | Widens getConceptOrCreate visibility to support subclasses outside the module. |
| cpg-concepts/src/main/kotlin/de/fraunhofer/aisec/cpg/graph/concepts/memory/MemoryConceptBuilder.kt | Adds size support to newAllocate. |
| cpg-concepts/src/main/kotlin/de/fraunhofer/aisec/cpg/graph/concepts/memory/Memory.kt | Adds size: Expression? to Allocate and updates equality/hashCode accordingly. |
| cpg-concepts/build.gradle.kts | Adjusts dependencies to remove cpg-analysis and expose cpg-core. |
| cpg-analysis/src/main/kotlin/de/fraunhofer/aisec/cpg/queries/concepts/policy/PolicyQueries.kt | Moves policy query utilities into cpg-analysis. |
| cpg-analysis/src/main/kotlin/de/fraunhofer/aisec/cpg/analysis/abstracteval/value/ArrayValue.kt | Uses Allocate.size overlay instead of malloc name matching. |
| cpg-analysis/build.gradle.kts | Adds dependency on cpg-concepts required by the updated analysis code. |

Comment on lines +116 to +121

    if (count != null && elemSize != null) {
        call.newBinaryOperator("*").apply {
            lhs = count
            rhs = elemSize
        }
    } else null

Comment on lines +21 to +23

    void malloc_unknown_size(int n) {
        char *p = malloc(n);
    }

Comment on lines +132 to +136

    tu.calls
        .filter { it.name.toString() !in setOf("malloc", "calloc", "realloc") }
        .forEach { call ->
            assertNull(call.overlays.filterIsInstance<Allocate>().firstOrNull())
        }

2 participants