Problem
In decompiled output, a single variable like `__dnew` is reused for many unrelated purposes — holding pointer values, size values, and temporaries at different program points:
```c
size_type __dnew;
// ...
_ZN14QStandardPaths16writableLocationENS_16StandardLocationE(&__dnew, 9, ...); // pointer result
// ...
__dnew = (size_type)QVar2.d; // cast from pointer
// ...
_ZN4QDirC1ERK7QString(&__dnew, &local_88); // used as struct address
// ...
__dnew = uVar7; // size value
```
Root Cause
Ghidra's SSA/variable recovery merges unrelated temporaries into a single `HighVariable` when they share the same storage location (e.g., a stack slot or register that's reused across different live ranges). The serializer emits one `DECLARE_LOCAL` for `__dnew` and all uses reference it.
The JSON shows a single declaration:
```
DECLARE_LOCAL name=__dnew key=unique:00410209:9:0
```
But the variable is used in roughly 15 or more different contexts with mutually incompatible types.
Analysis
PcodeSerializer.java's `classifyVariable()` (lines 802-852) categorizes variables but doesn't detect or split merged HighVariables. The `temporaryAddressMap` (line 2502) provides optional SSA-like versioning but only for some temporaries.
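The missing detection step could look roughly like the sketch below. It is a standalone model written in C++ for illustration, not code from `PcodeSerializer.java`: the `TypeKind` categories, the `DefSite` record, and `findMergedVariables` are all hypothetical names, and the real serializer would work on Ghidra's own varnode and data-type objects. The idea is only to group definition sites by variable key and flag any key whose definitions disagree on type category.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified type categories; real data types are richer.
enum class TypeKind { Integer, Pointer, Struct };

// One definition site of a variable (assumed shape, for illustration only).
struct DefSite {
    std::string key;  // variable key, e.g. "unique:00410209:9:0"
    TypeKind kind;    // type category observed at this definition
};

// Returns the keys of variables whose definitions disagree on type category.
// Keys come back in sorted order because std::map iterates in key order.
std::vector<std::string> findMergedVariables(const std::vector<DefSite> &defs) {
    std::map<std::string, std::set<TypeKind>> kindsByKey;
    for (const DefSite &d : defs) {
        kindsByKey[d.key].insert(d.kind);
    }

    std::vector<std::string> merged;
    for (const auto &entry : kindsByKey) {
        if (entry.second.size() > 1) {
            merged.push_back(entry.first);
        }
    }
    return merged;
}
```

A variable such as `__dnew`, defined once as a pointer and once as a size value, would be reported by this check, while a variable with a single type category at every definition would not.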
Proposed Fix
This is fundamentally a Ghidra decompiler limitation, but can be mitigated:
- In `PcodeSerializer.java`: Detect when a single HighVariable has multiple incompatible definitions (different types at different def points). Split into separate JSON variables with unique names.
- In the C++ AST pipeline: Track reaching definitions per use-site. When a variable name is used with a type incompatible with its declaration, create a new variable with a suffixed name.
- Simpler workaround: In FunctionBuilder, when creating a DeclRefExpr for a variable whose declared type doesn't match the use-site type, insert a cast instead of silently mistyping.
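As a rough illustration of the splitting idea (suffixed names per incompatible definition), here is a minimal sketch. It assumes a straight-line sequence of definitions and uses, so "reaching definition" degenerates to "most recent definition". A real implementation would need proper reaching-definition analysis across branches and joins. `Event`, `VersionState`, and `renameByDefType` are hypothetical names that do not exist in the Patchestry code base.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified type categories.
enum class TypeKind { Integer, Pointer, Struct };

// A definition or use of a variable, in program order (assumed shape).
struct Event {
    bool isDef;        // true for a definition, false for a use
    std::string name;  // original variable name, e.g. "__dnew"
    TypeKind kind;     // type at a definition; ignored for uses
};

// Per-variable bookkeeping: current version number and its type.
struct VersionState {
    int version = 0;
    TypeKind kind = TypeKind::Integer;
    bool seen = false;
};

// Returns, for each event, the name it should reference after splitting.
// A definition whose type differs from the current version's type starts a
// new version ("name_1", "name_2", ...); uses refer to the current version.
std::vector<std::string> renameByDefType(const std::vector<Event> &events) {
    std::map<std::string, VersionState> state;
    std::vector<std::string> names;
    names.reserve(events.size());

    for (const Event &e : events) {
        VersionState &s = state[e.name];
        if (e.isDef) {
            if (s.seen && s.kind != e.kind) {
                ++s.version;  // incompatible redefinition: split here
            }
            s.kind = e.kind;
            s.seen = true;
        }
        names.push_back(s.version == 0
                            ? e.name
                            : e.name + "_" + std::to_string(s.version));
    }
    return names;
}
```

With this scheme, a pointer definition of `__dnew` followed by a size definition would yield `__dnew` and `__dnew_1`, each of which can then be declared with its own type. Compatible redefinitions keep the same name, so the number of extra declarations stays small.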
Files
- `scripts/ghidra/util/PcodeSerializer.java` — variable classification and serialization
- `lib/patchestry/AST/FunctionBuilder.cpp` — variable creation and reference
- `lib/patchestry/AST/OperationBuilder.cpp` — varnode creation