
[Clang][analyzer] replace Stmt* with ConstCFGElementRef in SymbolConjured #128251


Open
wants to merge 1 commit into
base: main

Conversation

fangyi-zhou

Closes #57270.

This PR replaces the Stmt * field in SymbolConjured with CFGBlock::ConstCFGElementRef. The motivation is that a statement is not always available when a symbol is conjured, so information about the conjured symbol is lost, whereas a CFGElementRef can always be provided at the call site.

Following this idea, the PR updates the call sites of the functions that create conjured symbols so that they pass appropriate CFGElementRefs.
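
To make the motivation concrete, here is a minimal, self-contained sketch of the idea (C++17). It does not use Clang's real classes: `ToyStmt`, `ToyImplicitDtor`, `ToyBlock`, `ToyElementRef` and `ToyConjured` are hypothetical stand-ins invented for illustration. It only tries to show that a symbol keyed on an element reference always has an origin, while `getStmt()` can still return null for elements that carry no statement.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-ins; none of these are Clang types.
struct ToyStmt { std::string text; };
struct ToyImplicitDtor { std::string typeName; };  // element with no statement

// A CFG element is either a statement or something statement-less.
using ToyElement = std::variant<ToyStmt, ToyImplicitDtor>;

struct ToyBlock { std::vector<ToyElement> elements; };

// Lightweight reference: (block, index). Valid while the block is alive.
struct ToyElementRef {
  const ToyBlock *block;
  std::size_t index;

  const ToyElement &get() const {
    assert(block && index < block->elements.size());
    return block->elements[index];
  }
};

// The conjured symbol remembers the element it was created at,
// not a statement pointer that may be missing.
class ToyConjured {
  ToyElementRef ref;

public:
  explicit ToyConjured(ToyElementRef r) : ref(r) {}

  // May return null: not every element wraps a statement.
  const ToyStmt *getStmt() const { return std::get_if<ToyStmt>(&ref.get()); }

  ToyElementRef getElementRef() const { return ref; }
};
```

In this toy model a symbol conjured at an implicit destructor still knows its block and index, which is the information the `Stmt *` field could not carry.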


@isuckatcs isuckatcs self-requested a review February 22, 2025 00:22
@isuckatcs
Member

The source of the crash you mentioned in the issue is CStringChecker.cpp:1304, where the CallEvent is a nullptr.

return State->invalidateRegions(R, E, C.blockCount(), LCtx,
                                CausesPointerEscape, nullptr, nullptr,
                                &ITraits);

Member

@isuckatcs isuckatcs left a comment

So far we have some progress, so keep it up.

Keep in mind that we'll also need to update SValExplainer, but you'll see it once you run the tests and start seeing the warning messages.

@llvmbot
Member

llvmbot commented Feb 22, 2025

@llvm/pr-subscribers-clang-analysis
@llvm/pr-subscribers-clang

@llvm/pr-subscribers-clang-static-analyzer-1

Author: Fangyi Zhou (fangyi-zhou)



Patch is 68.81 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/128251.diff

25 Files Affected:

  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h (+5)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h (+52-39)
  • (modified) clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h (+37-20)
  • (modified) clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp (+21-13)
  • (modified) clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp (+17-16)
  • (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoModeling.cpp (+1-1)
  • (modified) clang/lib/StaticAnalyzer/Checkers/ErrnoTesterChecker.cpp (+2-1)
  • (modified) clang/lib/StaticAnalyzer/Checkers/Iterator.cpp (+6-5)
  • (modified) clang/lib/StaticAnalyzer/Checkers/Iterator.h (+5-4)
  • (modified) clang/lib/StaticAnalyzer/Checkers/IteratorModeling.cpp (+17-12)
  • (modified) clang/lib/StaticAnalyzer/Checkers/MallocChecker.cpp (+4-2)
  • (modified) clang/lib/StaticAnalyzer/Checkers/RetainCountChecker/RetainCountChecker.cpp (+2-2)
  • (modified) clang/lib/StaticAnalyzer/Checkers/STLAlgorithmModeling.cpp (+8-5)
  • (modified) clang/lib/StaticAnalyzer/Checkers/SmartPtrModeling.cpp (+4-4)
  • (modified) clang/lib/StaticAnalyzer/Checkers/StdLibraryFunctionsChecker.cpp (+3-2)
  • (modified) clang/lib/StaticAnalyzer/Checkers/StreamChecker.cpp (+2-1)
  • (modified) clang/lib/StaticAnalyzer/Checkers/cert/InvalidPtrChecker.cpp (+1-1)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngine.cpp (+9-9)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngineC.cpp (+30-25)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngineCXX.cpp (+7-6)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngineCallAndReturn.cpp (+4-2)
  • (modified) clang/lib/StaticAnalyzer/Core/ExprEngineObjC.cpp (+9-7)
  • (modified) clang/lib/StaticAnalyzer/Core/RegionStore.cpp (+31-26)
  • (modified) clang/lib/StaticAnalyzer/Core/SValBuilder.cpp (+54-39)
  • (modified) clang/lib/StaticAnalyzer/Core/SymbolManager.cpp (+1-1)
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
index 168983fd5cb68..02bd4a91961a9 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/CheckerContext.h
@@ -151,6 +151,11 @@ class CheckerContext {
     return Pred->getSVal(S);
   }
 
+  /// Get the CFG Element Ref from the ExprEngine
+  CFGBlock::ConstCFGElementRef getCFGElementRef() const {
+    return Eng.getCFGElementRef();
+  }
+
   /// Returns true if the value of \p E is greater than or equal to \p
   /// Val under unsigned comparison
   bool isGreaterOrEqual(const Expr *E, unsigned long long Val);
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
index 54430d426a82a..6fb5f15822585 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SValBuilder.h
@@ -19,6 +19,7 @@
 #include "clang/AST/Expr.h"
 #include "clang/AST/ExprObjC.h"
 #include "clang/AST/Type.h"
+#include "clang/Analysis/CFG.h"
 #include "clang/Basic/LLVM.h"
 #include "clang/Basic/LangOptions.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/BasicValueFactory.h"
@@ -171,20 +172,27 @@ class SValBuilder {
 
   // Forwarding methods to SymbolManager.
 
-  const SymbolConjured* conjureSymbol(const Stmt *stmt,
-                                      const LocationContext *LCtx,
-                                      QualType type,
-                                      unsigned visitCount,
-                                      const void *symbolTag = nullptr) {
-    return SymMgr.conjureSymbol(stmt, LCtx, type, visitCount, symbolTag);
+  const SymbolConjured *
+  conjureSymbol(const CFGBlock::ConstCFGElementRef ElemRef,
+                const LocationContext *LCtx, QualType type, unsigned visitCount,
+                const void *symbolTag = nullptr) {
+    return SymMgr.conjureSymbol(ElemRef, LCtx, type, visitCount, symbolTag);
   }
 
-  const SymbolConjured* conjureSymbol(const Expr *expr,
-                                      const LocationContext *LCtx,
-                                      unsigned visitCount,
-                                      const void *symbolTag = nullptr) {
-    return SymMgr.conjureSymbol(expr, LCtx, visitCount, symbolTag);
-  }
+  // const SymbolConjured* conjureSymbol(const Stmt *stmt,
+  //                                     const LocationContext *LCtx,
+  //                                     QualType type,
+  //                                     unsigned visitCount,
+  //                                     const void *symbolTag = nullptr) {
+  //   return SymMgr.conjureSymbol(stmt, LCtx, type, visitCount, symbolTag);
+  // }
+
+  // const SymbolConjured* conjureSymbol(const Expr *expr,
+  //                                     const LocationContext *LCtx,
+  //                                     unsigned visitCount,
+  //                                     const void *symbolTag = nullptr) {
+  //   return SymMgr.conjureSymbol(expr, LCtx, visitCount, symbolTag);
+  // }
 
   /// Construct an SVal representing '0' for the specified type.
   DefinedOrUnknownSVal makeZeroVal(QualType type);
@@ -198,33 +206,38 @@ class SValBuilder {
   /// The advantage of symbols derived/built from other symbols is that we
   /// preserve the relation between related(or even equivalent) expressions, so
   /// conjured symbols should be used sparingly.
-  DefinedOrUnknownSVal conjureSymbolVal(const void *symbolTag,
-                                        const Expr *expr,
-                                        const LocationContext *LCtx,
-                                        unsigned count);
-  DefinedOrUnknownSVal conjureSymbolVal(const void *symbolTag, const Stmt *S,
-                                        const LocationContext *LCtx,
-                                        QualType type, unsigned count);
-  DefinedOrUnknownSVal conjureSymbolVal(const Stmt *stmt,
-                                        const LocationContext *LCtx,
-                                        QualType type,
-                                        unsigned visitCount);
-
-  /// Conjure a symbol representing heap allocated memory region.
-  ///
-  /// Note, the expression should represent a location.
-  DefinedSVal getConjuredHeapSymbolVal(const Expr *E,
-                                       const LocationContext *LCtx,
-                                       unsigned Count);
-
-  /// Conjure a symbol representing heap allocated memory region.
-  ///
-  /// Note, now, the expression *doesn't* need to represent a location.
-  /// But the type need to!
-  DefinedSVal getConjuredHeapSymbolVal(const Expr *E,
-                                       const LocationContext *LCtx,
-                                       QualType type, unsigned Count);
-
+  // DefinedOrUnknownSVal
+  // conjureSymbolVal(const void *symbolTag,
+  //                  const CFGBlock::ConstCFGElementRef elemRef,
+  //                  const LocationContext *LCtx, unsigned count);
+  DefinedOrUnknownSVal
+  conjureSymbolVal(const void *symbolTag,
+                   const CFGBlock::ConstCFGElementRef elemRef,
+                   const LocationContext *LCtx, QualType type, unsigned count);
+  DefinedOrUnknownSVal
+  conjureSymbolVal(const CFGBlock::ConstCFGElementRef elemRef,
+                   const LocationContext *LCtx, QualType type,
+                   unsigned visitCount);
+
+  // /// Conjure a symbol representing heap allocated memory region.
+  // ///
+  // /// Note, the expression should represent a location.
+  // DefinedSVal getConjuredHeapSymbolVal(const Expr *E,
+  //                                      const LocationContext *LCtx,
+  //                                      unsigned Count);
+
+  // /// Conjure a symbol representing heap allocated memory region.
+  // ///
+  // /// Note, now, the expression *doesn't* need to represent a location.
+  // /// But the type need to!
+  // DefinedSVal getConjuredHeapSymbolVal(const Expr *E,
+  //                                      const LocationContext *LCtx,
+  //                                      QualType type, unsigned Count);
+
+  DefinedSVal
+  getConjuredHeapSymbolVal(const CFGBlock::ConstCFGElementRef elemRef,
+                           const LocationContext *LCtx, QualType type,
+                           unsigned Count);
   /// Create an SVal representing the result of an alloca()-like call, that is,
   /// an AllocaRegion on the stack.
   ///
diff --git a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
index cbbea1b56bb40..4e24c9a81ae1f 100644
--- a/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
+++ b/clang/include/clang/StaticAnalyzer/Core/PathSensitive/SymbolManager.h
@@ -17,6 +17,7 @@
 #include "clang/AST/Expr.h"
 #include "clang/AST/Type.h"
 #include "clang/Analysis/AnalysisDeclContext.h"
+#include "clang/Analysis/CFG.h"
 #include "clang/Basic/LLVM.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/APSIntPtr.h"
 #include "clang/StaticAnalyzer/Core/PathSensitive/MemRegion.h"
@@ -80,17 +81,18 @@ class SymbolRegionValue : public SymbolData {
 /// A symbol representing the result of an expression in the case when we do
 /// not know anything about what the expression is.
 class SymbolConjured : public SymbolData {
-  const Stmt *S;
+  const CFGBlock::ConstCFGElementRef ElemRef;
   QualType T;
   unsigned Count;
   const LocationContext *LCtx;
   const void *SymbolTag;
 
   friend class SymExprAllocator;
-  SymbolConjured(SymbolID sym, const Stmt *s, const LocationContext *lctx,
-                 QualType t, unsigned count, const void *symbolTag)
-      : SymbolData(SymbolConjuredKind, sym), S(s), T(t), Count(count),
-        LCtx(lctx), SymbolTag(symbolTag) {
+  SymbolConjured(SymbolID sym, CFGBlock::ConstCFGElementRef elemRef,
+                 const LocationContext *lctx, QualType t, unsigned count,
+                 const void *symbolTag)
+      : SymbolData(SymbolConjuredKind, sym), ElemRef(elemRef), T(t),
+        Count(count), LCtx(lctx), SymbolTag(symbolTag) {
     // FIXME: 's' might be a nullptr if we're conducting invalidation
     // that was caused by a destructor call on a temporary object,
     // which has no statement associated with it.
@@ -102,7 +104,12 @@ class SymbolConjured : public SymbolData {
 
 public:
   /// It might return null.
-  const Stmt *getStmt() const { return S; }
+  const Stmt *getStmt() const {
+    if (auto Stmt = ElemRef->getAs<CFGStmt>()) {
+      return Stmt->getStmt();
+    }
+    return nullptr;
+  }
   unsigned getCount() const { return Count; }
   /// It might return null.
   const void *getTag() const { return SymbolTag; }
@@ -113,11 +120,13 @@ class SymbolConjured : public SymbolData {
 
   void dumpToStream(raw_ostream &os) const override;
 
-  static void Profile(llvm::FoldingSetNodeID &profile, const Stmt *S,
+  static void Profile(llvm::FoldingSetNodeID &profile,
+                      const CFGBlock::ConstCFGElementRef ElemRef,
                       const LocationContext *LCtx, QualType T, unsigned Count,
                       const void *SymbolTag) {
     profile.AddInteger((unsigned)SymbolConjuredKind);
-    profile.AddPointer(S);
+    // profile.Add(ElemRef);
+    // profile.AddPointer(S);
     profile.AddPointer(LCtx);
     profile.Add(T);
     profile.AddInteger(Count);
@@ -125,7 +134,7 @@ class SymbolConjured : public SymbolData {
   }
 
   void Profile(llvm::FoldingSetNodeID& profile) override {
-    Profile(profile, S, LCtx, T, Count, SymbolTag);
+    Profile(profile, ElemRef, LCtx, T, Count, SymbolTag);
   }
 
   // Implement isa<T> support.
@@ -533,20 +542,28 @@ class SymbolManager {
   template <typename SymExprT, typename... Args>
   const SymExprT *acquire(Args &&...args);
 
-  const SymbolConjured *conjureSymbol(const Stmt *E,
-                                      const LocationContext *LCtx, QualType T,
-                                      unsigned VisitCount,
-                                      const void *SymbolTag = nullptr) {
-    return acquire<SymbolConjured>(E, LCtx, T, VisitCount, SymbolTag);
-  }
+  // const SymbolConjured *conjureSymbol(const Stmt *E,
+  //                                     const LocationContext *LCtx, QualType
+  //                                     T, unsigned VisitCount, const void
+  //                                     *SymbolTag = nullptr) {
+  //   return acquire<SymbolConjured>(E, LCtx, T, VisitCount, SymbolTag);
+  // }
+
+  const SymbolConjured *
+  conjureSymbol(const CFGBlock::ConstCFGElementRef ElemRef,
+                const LocationContext *LCtx, QualType T, unsigned VisitCount,
+                const void *SymbolTag = nullptr) {
 
-  const SymbolConjured* conjureSymbol(const Expr *E,
-                                      const LocationContext *LCtx,
-                                      unsigned VisitCount,
-                                      const void *SymbolTag = nullptr) {
-    return conjureSymbol(E, LCtx, E->getType(), VisitCount, SymbolTag);
+    return acquire<SymbolConjured>(ElemRef, LCtx, T, VisitCount, SymbolTag);
   }
 
+  // const SymbolConjured* conjureSymbol(const Expr *E,
+  //                                     const LocationContext *LCtx,
+  //                                     unsigned VisitCount,
+  //                                     const void *SymbolTag = nullptr) {
+  //   return conjureSymbol(E, LCtx, E->getType(), VisitCount, SymbolTag);
+  // }
+
   QualType getType(const SymExpr *SE) const {
     return SE->getType();
   }
diff --git a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
index 39dcaf02dbe25..ea3b815a95bc1 100644
--- a/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/CStringChecker.cpp
@@ -1515,7 +1515,8 @@ void CStringChecker::evalCopyCommon(CheckerContext &C, const CallEvent &Call,
       // conjure a return value for later.
       if (lastElement.isUnknown())
         lastElement = C.getSValBuilder().conjureSymbolVal(
-            nullptr, Call.getOriginExpr(), LCtx, C.blockCount());
+            nullptr, Call.getCFGElementRef(), LCtx,
+            Call.getOriginExpr()->getType(), C.blockCount());
 
       // The byte after the last byte copied is the return value.
       state = state->BindExpr(Call.getOriginExpr(), LCtx, lastElement);
@@ -1665,8 +1666,9 @@ void CStringChecker::evalMemcmp(CheckerContext &C, const CallEvent &Call,
     State = CheckBufferAccess(C, State, Left, Size, AccessKind::read, CK);
     if (State) {
       // The return value is the comparison result, which we don't know.
-      SVal CmpV = Builder.conjureSymbolVal(nullptr, Call.getOriginExpr(), LCtx,
-                                           C.blockCount());
+      SVal CmpV = Builder.conjureSymbolVal(
+          nullptr, Call.getCFGElementRef(), LCtx,
+          Call.getOriginExpr()->getType(), C.blockCount());
       State = State->BindExpr(Call.getOriginExpr(), LCtx, CmpV);
       C.addTransition(State);
     }
@@ -1770,7 +1772,8 @@ void CStringChecker::evalstrLengthCommon(CheckerContext &C,
       // All we know is the return value is the min of the string length
       // and the limit. This is better than nothing.
       result = C.getSValBuilder().conjureSymbolVal(
-          nullptr, Call.getOriginExpr(), LCtx, C.blockCount());
+          nullptr, Call.getCFGElementRef(), LCtx,
+          Call.getOriginExpr()->getType(), C.blockCount());
       NonLoc resultNL = result.castAs<NonLoc>();
 
       if (strLengthNL) {
@@ -1794,7 +1797,8 @@ void CStringChecker::evalstrLengthCommon(CheckerContext &C,
     // value, so it can be used in constraints, at least.
     if (result.isUnknown()) {
       result = C.getSValBuilder().conjureSymbolVal(
-          nullptr, Call.getOriginExpr(), LCtx, C.blockCount());
+          nullptr, Call.getCFGElementRef(), LCtx,
+          Call.getOriginExpr()->getType(), C.blockCount());
     }
   }
 
@@ -2261,8 +2265,9 @@ void CStringChecker::evalStrcpyCommon(CheckerContext &C, const CallEvent &Call,
     // If this is a stpcpy-style copy, but we were unable to check for a buffer
     // overflow, we still need a result. Conjure a return value.
     if (ReturnEnd && Result.isUnknown()) {
-      Result = svalBuilder.conjureSymbolVal(nullptr, Call.getOriginExpr(), LCtx,
-                                            C.blockCount());
+      Result = svalBuilder.conjureSymbolVal(
+          nullptr, Call.getCFGElementRef(), LCtx,
+          Call.getOriginExpr()->getType(), C.blockCount());
     }
   }
   // Set the return value.
@@ -2361,8 +2366,9 @@ void CStringChecker::evalStrcmpCommon(CheckerContext &C, const CallEvent &Call,
   const StringLiteral *RightStrLiteral =
       getCStringLiteral(C, state, Right.Expression, RightVal);
   bool canComputeResult = false;
-  SVal resultVal = svalBuilder.conjureSymbolVal(nullptr, Call.getOriginExpr(),
-                                                LCtx, C.blockCount());
+  SVal resultVal = svalBuilder.conjureSymbolVal(
+      nullptr, Call.getCFGElementRef(), LCtx, Call.getOriginExpr()->getType(),
+      C.blockCount());
 
   if (LeftStrLiteral && RightStrLiteral) {
     StringRef LeftStrRef = LeftStrLiteral->getString();
@@ -2469,14 +2475,15 @@ void CStringChecker::evalStrsep(CheckerContext &C,
     // further along in the same string, or NULL if there are no more tokens.
     State =
         State->bindLoc(*SearchStrLoc,
-                       SVB.conjureSymbolVal(getTag(), Call.getOriginExpr(),
+                       SVB.conjureSymbolVal(getTag(), Call.getCFGElementRef(),
                                             LCtx, CharPtrTy, C.blockCount()),
                        LCtx);
   } else {
     assert(SearchStrVal.isUnknown());
     // Conjure a symbolic value. It's the best we can do.
-    Result = SVB.conjureSymbolVal(nullptr, Call.getOriginExpr(), LCtx,
-                                  C.blockCount());
+    Result =
+        SVB.conjureSymbolVal(nullptr, Call.getCFGElementRef(), LCtx,
+                             Call.getOriginExpr()->getType(), C.blockCount());
   }
 
   // Set the return value, and finish.
@@ -2520,7 +2527,8 @@ void CStringChecker::evalStdCopyCommon(CheckerContext &C,
   SValBuilder &SVB = C.getSValBuilder();
 
   SVal ResultVal =
-      SVB.conjureSymbolVal(nullptr, Call.getOriginExpr(), LCtx, C.blockCount());
+      SVB.conjureSymbolVal(nullptr, Call.getCFGElementRef(), LCtx,
+                           Call.getOriginExpr()->getType(), C.blockCount());
   State = State->BindExpr(Call.getOriginExpr(), LCtx, ResultVal);
 
   C.addTransition(State);
diff --git a/clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp b/clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
index 55ed809bfed6c..74a7b8e0f54ff 100644
--- a/clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
+++ b/clang/lib/StaticAnalyzer/Checkers/ContainerModeling.cpp
@@ -107,13 +107,13 @@ bool frontModifiable(ProgramStateRef State, const MemRegion *Reg);
 bool backModifiable(ProgramStateRef State, const MemRegion *Reg);
 SymbolRef getContainerBegin(ProgramStateRef State, const MemRegion *Cont);
 SymbolRef getContainerEnd(ProgramStateRef State, const MemRegion *Cont);
-ProgramStateRef createContainerBegin(ProgramStateRef State,
+ProgramStateRef createContainerBegin(CheckerContext &C, ProgramStateRef State,
                                      const MemRegion *Cont, const Expr *E,
                                      QualType T, const LocationContext *LCtx,
                                      unsigned BlockCount);
-ProgramStateRef createContainerEnd(ProgramStateRef State, const MemRegion *Cont,
-                                   const Expr *E, QualType T,
-                                   const LocationContext *LCtx,
+ProgramStateRef createContainerEnd(CheckerContext &C, ProgramStateRef State,
+                                   const MemRegion *Cont, const Expr *E,
+                                   QualType T, const LocationContext *LCtx,
                                    unsigned BlockCount);
 ProgramStateRef setContainerData(ProgramStateRef State, const MemRegion *Cont,
                                  const ContainerData &CData);
@@ -260,8 +260,9 @@ void ContainerModeling::handleBegin(CheckerContext &C, const Expr *CE,
   auto State = C.getState();
   auto BeginSym = getContainerBegin(State, ContReg);
   if (!BeginSym) {
-    State = createContainerBegin(State, ContReg, CE, C.getASTContext().LongTy,
-                                 C.getLocationContext(), C.blockCount());
+    State =
+        createContainerBegin(C, State, ContReg, CE, C.getASTContext().LongTy,
+                             C.getLocationContext(), C.blockCount());
     BeginSym = getContainerBegin(State, ContReg);
   }
   State = setIteratorPosition(State, RetVal,
@@ -282,7 +283,7 @@ void ContainerModeling::handleEnd(CheckerContext &C, const Expr *CE,
   auto State = C.getState();
   auto EndSym = getContainerEnd(State, ContReg);
   if (!EndSym) {
-    State = createContainerEnd(State, ContReg, CE, C.getASTContext().LongTy,
+    State = createContainerEnd(C, State, ContReg, CE, C.getASTContext().LongTy,
                                C.getLocationContext(), C.blockCount());
     EndSym = getContainerEnd(State, ContReg);
   }
@@ -326,7 +327,7 @@ void ContainerModeling::handleAssignment(CheckerContext &C, SVal Cont,
           auto &SVB = C.getSValBuilder();
           // Then generate and assign a new "end" symbol for the new container.
           auto NewEndSym =
-              SymMgr.conjureSymbol(CE, C.getLocationContext(),
+              SymMgr.conjureSymbol(C.getCFGElementRef(), C.getLocationContext(),
                                    C.getASTContext().LongTy, C.blockCount());
           State = assumeNoOverflow(State, NewEndSym, 4);
           if (CData) {
@@ -844,7 +845,7 @@ SymbolRef getContainerEnd(ProgramStateRef State, const MemRegion *Cont) {
   return CDataPtr->getEnd();
 }
 
-ProgramStateRef createContainerBegin(ProgramStateRef State,
+ProgramStateRef createContainerBegin(CheckerContext &C, Pr...
[truncated]

@fangyi-zhou
Author

I've made some more progress: the crash is gone. There are still some review comments that I need to address, which I'll try to complete later.

/home/fangyi/playground/bug.cc:21:5: warning: value derived from (symbol of type 'int' conjured at statement '->~S() (Implicit destructor)
') for global variable 'S::a' [debug.ExprInspection]
   21 |     clang_analyzer_explain(S::a);
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 warning generated.

@fangyi-zhou fangyi-zhou changed the title [Clang] [analyzer] replace Stmt* with ConstCFGElementRef in SymbolConjured [Clang][analyzer] replace Stmt* with ConstCFGElementRef in SymbolConjured Feb 22, 2025
@fangyi-zhou fangyi-zhou force-pushed the clang-analyzer-conjured-symbol-use-cfgelement-ref branch 2 times, most recently from d4acfb0 to 7ef2ea5 Compare February 23, 2025 14:37
@fangyi-zhou fangyi-zhou marked this pull request as ready for review February 23, 2025 15:15
Member

@isuckatcs isuckatcs left a comment

I didn't have enough time to check the whole patch; I'll get back to it later.

The general pattern seems to be that when a conjured symbol is created, the reference to the current CFG element is always passed as the source, instead of the statement that was passed previously.

I think every place where we conjure a symbol should be double-checked, to make sure we pass the CFG element reference for the statement that actually conjured the value.

@@ -229,7 +230,7 @@ DefinedSVal makeRetVal(CheckerContext &C, const CallExpr *CE) {

const LocationContext *LCtx = C.getLocationContext();
return C.getSValBuilder()
.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount())
Member

CE is unused.

Contributor

This is not yet fixed.

@@ -27,7 +27,8 @@ namespace ento {
/// by the loop body in any iteration.
ProgramStateRef getWidenedLoopState(ProgramStateRef PrevState,
const LocationContext *LCtx,
-unsigned BlockCount, const Stmt *LoopStmt);
+unsigned BlockCount, const Stmt *LoopStmt,
Collaborator

Do we need both the Stmt and the ElemRef here?

Author

Not any more, will remove.

@@ -101,8 +103,10 @@ class SymbolConjured : public SymbolData {
}

public:
/// It might return null.
const Stmt *getStmt() const { return S; }
Collaborator

Would it make sense to keep getStmt? It could still return null but some callers might find it useful.

@@ -1515,7 +1515,8 @@ void CStringChecker::evalCopyCommon(CheckerContext &C, const CallEvent &Call,
// conjure a return value for later.
if (lastElement.isUnknown())
lastElement = C.getSValBuilder().conjureSymbolVal(
-nullptr, Call.getOriginExpr(), LCtx, C.blockCount());
+nullptr, Call.getOriginExpr(), C.getCFGElementRef(), LCtx,
Collaborator

You can dump the CFGElementRef and the expr. We want to make sure they match, so there is no semantic change. That is one way to check if the code is correct.
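
A rough sketch of the kind of consistency check suggested here, written against toy types rather than the analyzer's own. `ToyExpr`, `ToyElement`, `ToyElementRef` and `refersTo` are hypothetical names introduced only for this illustration; the real check would compare whatever the CFG element wraps against the expression that used to be passed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; not Clang types.
struct ToyExpr { int id; };

struct ToyElement {
  const ToyExpr *expr;  // null when the element is not a statement
};

struct ToyElementRef {
  const std::vector<ToyElement> *block;
  std::size_t index;
};

// True when the referenced element wraps exactly `expected`.
// A mismatch would indicate a semantic change at that call site.
bool refersTo(ToyElementRef ref, const ToyExpr *expected) {
  assert(ref.block && ref.index < ref.block->size());
  return (*ref.block)[ref.index].expr == expected;
}
```

Running such a check (or simply dumping both sides) at each converted call site is one way to confirm that the element reference and the old expression agree.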

const void *symbolTag = nullptr) {
return SymMgr.conjureSymbol(expr, LCtx, visitCount, symbolTag);
const SymbolConjured *
conjureSymbol(const CFGBlock::ConstCFGElementRef ElemRef,
Collaborator

Could we still extract a type from the ElemRef in the implementation? Does the caller have a reasonable type to pass in when ElemRef is not referring to an expression?
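
One possible shape for the answer, sketched with toy types (C++17). `ToyElement` and `resolveType` are hypothetical and types are modeled as plain strings; this is not how `SValBuilder` is implemented. The assumption is that an explicitly passed type wins, since some call sites in the patch (for example the `strsep` modeling) pass a type that differs from the origin expression's type, and that the element's own type is only a fallback.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in; not a Clang type.
struct ToyElement {
  // Set only when the element wraps an expression.
  std::optional<std::string> exprType;
};

// Use the caller's type when given; otherwise fall back to the type of the
// expression the element wraps. Returns nullopt when neither is available,
// which is exactly the case the question above is asking about.
std::optional<std::string>
resolveType(const ToyElement &elem,
            const std::optional<std::string> &explicitType = std::nullopt) {
  if (explicitType)
    return explicitType;
  return elem.exprType;
}
```

The empty result for a statement-less element without an explicit type shows why an overload that omits the type can only be safe when the element is known to wrap an expression.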

Member

@isuckatcs isuckatcs left a comment

Can you please add a test case with the snippet from the issue that used to crash?

ID.AddInteger(Index);
}

int64_t getID() const {
Member

IIUC, this ID is only used when dumping a conjured symbol. How about removing it completely?

Author

Any suggestion for replacement?

Member

Why do we want to print the ID in the first place? It probably holds no value for our users anyway.

Either print a statement if there is one, or a source location, etc. The CFG is essentially the statements in the source file in execution order, so I imagine we always have a statement.
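
The fallback order described here could look roughly like the following sketch (C++17). `ToyLoc`, `ToyElement` and `describeOrigin` are hypothetical names, and the output wording is made up for the example; it is not the text that `SValExplainer` or `dumpToStream` produces.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins; not Clang types.
struct ToyLoc { unsigned line; unsigned column; };

struct ToyElement {
  std::optional<std::string> stmtText;  // pretty-printed statement, if any
  std::optional<ToyLoc> loc;            // source location, if any
};

// Prefer the statement, then the source location, then a generic fallback.
std::string describeOrigin(const ToyElement &elem) {
  if (elem.stmtText)
    return "statement '" + *elem.stmtText + "'";
  if (elem.loc)
    return "location " + std::to_string(elem.loc->line) + ":" +
           std::to_string(elem.loc->column);
  return "unknown origin";
}
```

The last branch is kept on purpose: the destructor case from the linked issue suggests that not every element has a statement to print.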

-return PrevState->invalidateRegions(Regions, getLoopCondition(LoopStmt),
-BlockCount, LCtx, true, nullptr, nullptr,
-&ITraits);
+return PrevState->invalidateRegions(Regions, ElemRef, BlockCount, LCtx, true,
Member

Does ElemRef point to the loop condition?

const Expr *Ex,
const LocationContext *LCtx,
unsigned Count) {
/// When using this overload, the \p elemRef provided must be a \p CFGStmt.
Member

This is asserted. The comment is not necessary I think.

@fangyi-zhou fangyi-zhou force-pushed the clang-analyzer-conjured-symbol-use-cfgelement-ref branch from a5ec28e to 2716967 Compare March 1, 2025 02:44
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:analysis labels Mar 1, 2025
@fangyi-zhou fangyi-zhou force-pushed the clang-analyzer-conjured-symbol-use-cfgelement-ref branch from 2716967 to 683aee6 Compare March 1, 2025 03:06
@fangyi-zhou fangyi-zhou force-pushed the clang-analyzer-conjured-symbol-use-cfgelement-ref branch from 683aee6 to eeb6f61 Compare March 13, 2025 17:22
@fangyi-zhou fangyi-zhou requested a review from isuckatcs March 13, 2025 17:24
@fangyi-zhou
Author

May I get a re-review for the changes please?

Contributor

@steakhal steakhal left a comment

I've only spot-checked it. It looks correct at first glance.
I somewhat dislike that practically every file now needs to include CFG.h; I'd advise revisiting this.

Do you intentionally take CFGBlock::ConstCFGElementRef as a const parameter? That doesn't seem to do anything. I'd rather not spell const unless there is a compelling reason to do so.
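
For reference, a small standalone illustration of why the top-level `const` has no effect on callers (C++17). `Ref` and `indexOf` are hypothetical names used only here. The language drops top-level cv-qualifiers from by-value parameters when forming the function type, so the two declarations below name the same function.

```cpp
#include <type_traits>

struct Ref { int index; };  // hypothetical small value type

// Top-level const on a by-value parameter is not part of the function type,
// so the second line redeclares the first instead of overloading it.
int indexOf(Ref r);
int indexOf(const Ref r);

// In a definition, the const only prevents the body from modifying its copy.
int indexOf(const Ref r) { return r.index; }

static_assert(std::is_same_v<int(Ref), int(const Ref)>,
              "top-level const does not change the signature");
```

The qualifier can still be kept in a definition as a local safeguard, but in a header declaration it is only extra text.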

One final question: why is ConstCFGElementRef nested under CFGBlock? The qualified name has to be spelled out every time, which adds noise. I'd revisit this and possibly change its implementation to allow a less verbose spelling of this type.
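
One low-cost option, sketched with toy names (C++17): a namespace-scope alias for the nested type. The `toy` namespace, `Block`, and `indexOf` are hypothetical and only mirror the shape of `CFGBlock::ConstCFGElementRef`; whether such an alias fits the real headers is an open question for the PR.

```cpp
#include <cstddef>
#include <type_traits>

namespace toy {  // hypothetical namespace, not clang's

class Block {
public:
  struct ConstElementRef {
    const Block *parent;
    std::size_t index;
  };
};

// Namespace-scope alias: the same type with a shorter spelling.
using ConstElementRef = Block::ConstElementRef;

static_assert(std::is_same_v<ConstElementRef, Block::ConstElementRef>,
              "an alias introduces no new type");

// Signatures can now use the short name.
inline std::size_t indexOf(ConstElementRef ref) { return ref.index; }

} // namespace toy
```

An alias shortens the spelling but does not by itself remove the include: a nested class cannot be forward-declared without the definition of its enclosing class, so reducing the `CFG.h` dependency would need the type moved out of the class or hidden behind another interface.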

@@ -229,7 +230,7 @@ DefinedSVal makeRetVal(CheckerContext &C, const CallExpr *CE) {

const LocationContext *LCtx = C.getLocationContext();
return C.getSValBuilder()
.conjureSymbolVal(nullptr, CE, LCtx, C.blockCount())
Contributor

This is not yet fixed.

@fangyi-zhou fangyi-zhou force-pushed the clang-analyzer-conjured-symbol-use-cfgelement-ref branch from eeb6f61 to b67d90b Compare April 15, 2025 16:42
…ured

Closes llvm#57270.

This PR replaces the `Stmt *` field in `SymbolConjured` with
`CFGBlock::ConstCFGElementRef`. The motivation is that, when conjuring a
symbol, there might not always be a statement available, causing
information to be lost for conjured symbols, whereas the CFGElementRef
can always be provided at the callsite.

Following the idea, this PR changes callsites of functions to create
conjured symbols, and replaces them with appropriate `CFGElementRef`s.
@fangyi-zhou fangyi-zhou force-pushed the clang-analyzer-conjured-symbol-use-cfgelement-ref branch from b67d90b to 87b45cc Compare April 15, 2025 16:45
@fangyi-zhou
Author

Sorry, I've been a bit busy with other things and have only just had time to address the review comments. Please let me know if anything else needs changing.

@fangyi-zhou fangyi-zhou requested a review from steakhal April 15, 2025 17:04
Labels
clang:analysis clang:static analyzer clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[analyzer] Crash using clang_analyzer_explain() in the debug.ExprInspection checker
5 participants