Skip to content

[clang-tidy][dataflow] Add bugprone-null-check-after-dereference check #84166

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 21 commits into
base: main
Choose a base branch
from

Conversation

Discookie
Copy link
Contributor

@Discookie Discookie commented Mar 6, 2024

This check implements a null-pointer analysis using the data-flow framework. In particular, the check finds cases where the pointer's nullability is checked after it has been dereferenced:

int f(int *ptr) {
  *ptr += 20;   // note: one of the locations where the pointer's value cannot be null
  
  if (ptr) {
    *ptr += 42; // warning: pointer value is checked even though it cannot be null at this point
    return *ptr;
  }
  return 0;

It currently recognizes the following operations

  • Dereference and member access through arrow operators
  • Pointer-to-boolean cast
  • Assigning &obj, nullptr, and other values

Notable limitations

The checker only supports C++ due to the limitations of the framework (#65301).

Function calls taking a reference of a pointer are not handled yet.
void external(int *&ref);

The note tag is only displayed at one location, and not all operations are supported or displayed.

More examples and limitations in the checker docs.

Results on open-source projects

Available at: https://codechecker-demo.eastus.cloudapp.azure.com/Default/runs?run=Discookie_reverse-null

The reference-of-ptr limitation leads to a class of false positives across the tested projects (example).

But the checker also found quite a few true positives, on LLVM: [1] [2] [3] and a couple more, listed here.


This checker corresponds to the the CERT rule EXP34-C. Do not dereference null pointers, example #3.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang-tools-extra clangd clang-tidy clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang:analysis labels Mar 6, 2024
@whisperity whisperity removed clang Clang issues not falling into any other category clang-tools-extra clangd clang:analysis labels Mar 6, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 6, 2024

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-tidy
@llvm/pr-subscribers-clang-tools-extra
@llvm/pr-subscribers-clang-analysis

@llvm/pr-subscribers-clangd

Author: Discookie (Discookie)

Changes

This checker implements a null-pointer analysis checker using the data-flow framework. In particular, the checker finds cases where the pointer's nullability is checked after it has been dereferenced:

int f(int *ptr) {
  *ptr += 20; // note: one of the locations where the pointer's value cannot be null
  
  if (ptr) {
    *ptr += 42; // warning: pointer value is checked even though it cannot be null at this point
    return *ptr;
  }
  return 0;

It currently recognizes the following operations:

  • Dereference and arrow operators
  • Pointer-to-boolean cast
  • Assigning &obj, nullptr and other values

Notable limitations:

The checker only supports C++ due to the limitations of the framework (llvm/llvm-project#65301).

Function calls taking a reference of a pointer are not handled yet.
void external(int *&ref);

The note tag is only displayed at one location, and not all operations are supported or displayed.

More examples and limitations in the checker docs.


Results on open-source projects:
https://codechecker-demo.eastus.cloudapp.azure.com/Default/runs?run=Discookie_reverse-null

The reference-of-ptr limitation leads to a class of false positives across the tested projects (example).

But the checker also found quite a few true positives, on LLVM: 1 2 3 and a couple more, listed here.


This checker corresponds to the the CERT rule EXP34-C, Do not dereference null pointers, example 3.


Patch is 63.42 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/84166.diff

12 Files Affected:

  • (modified) clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp (+3)
  • (modified) clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt (+1)
  • (added) clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp (+175)
  • (added) clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.h (+37)
  • (modified) clang-tools-extra/clangd/TidyProvider.cpp (+2-1)
  • (added) clang-tools-extra/docs/clang-tidy/checks/bugprone/null-check-after-dereference.rst (+158)
  • (added) clang-tools-extra/test/clang-tidy/checkers/bugprone/null-check-after-dereference.cpp (+330)
  • (added) clang/include/clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h (+112)
  • (modified) clang/lib/Analysis/FlowSensitive/Models/CMakeLists.txt (+1)
  • (added) clang/lib/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.cpp (+629)
  • (modified) clang/unittests/Analysis/FlowSensitive/CMakeLists.txt (+1)
  • (added) clang/unittests/Analysis/FlowSensitive/NullPointerAnalysisModelTest.cpp (+336)
diff --git a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
index a8a23b045f80bb..ddd708dd513355 100644
--- a/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp
@@ -48,6 +48,7 @@
 #include "NoEscapeCheck.h"
 #include "NonZeroEnumToBoolConversionCheck.h"
 #include "NotNullTerminatedResultCheck.h"
+#include "NullCheckAfterDereferenceCheck.h"
 #include "OptionalValueConversionCheck.h"
 #include "ParentVirtualCallCheck.h"
 #include "PosixReturnCheck.h"
@@ -180,6 +181,8 @@ class BugproneModule : public ClangTidyModule {
     CheckFactories.registerCheck<PosixReturnCheck>("bugprone-posix-return");
     CheckFactories.registerCheck<ReservedIdentifierCheck>(
         "bugprone-reserved-identifier");
+    CheckFactories.registerCheck<NullCheckAfterDereferenceCheck>(
+        "bugprone-null-check-after-dereference");
     CheckFactories.registerCheck<SharedPtrArrayMismatchCheck>(
         "bugprone-shared-ptr-array-mismatch");
     CheckFactories.registerCheck<SignalHandlerCheck>("bugprone-signal-handler");
diff --git a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
index 1cd6fb207d7625..5dbe761cb810e7 100644
--- a/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
+++ b/clang-tools-extra/clang-tidy/bugprone/CMakeLists.txt
@@ -44,6 +44,7 @@ add_clang_library(clangTidyBugproneModule
   NoEscapeCheck.cpp
   NonZeroEnumToBoolConversionCheck.cpp
   NotNullTerminatedResultCheck.cpp
+  NullCheckAfterDereferenceCheck.cpp
   OptionalValueConversionCheck.cpp
   ParentVirtualCallCheck.cpp
   PosixReturnCheck.cpp
diff --git a/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp
new file mode 100644
index 00000000000000..1ba4edc1eaf0e4
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp
@@ -0,0 +1,175 @@
+//===--- NullCheckAfterDereferenceCheck.cpp - clang-tidy-------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "NullCheckAfterDereferenceCheck.h"
+#include "clang/AST/ASTContext.h"
+#include "clang/AST/DeclCXX.h"
+#include "clang/AST/DeclTemplate.h"
+#include "clang/ASTMatchers/ASTMatchFinder.h"
+#include "clang/ASTMatchers/ASTMatchers.h"
+#include "clang/Analysis/CFG.h"
+#include "clang/Analysis/FlowSensitive/ControlFlowContext.h"
+#include "clang/Analysis/FlowSensitive/DataflowAnalysisContext.h"
+#include "clang/Analysis/FlowSensitive/DataflowEnvironment.h"
+#include "clang/Analysis/FlowSensitive/DataflowLattice.h"
+#include "clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h"
+#include "clang/Analysis/FlowSensitive/WatchedLiteralsSolver.h"
+#include "clang/Basic/SourceLocation.h"
+#include "llvm/ADT/Any.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/Support/Error.h"
+#include <memory>
+#include <vector>
+
+namespace clang::tidy::bugprone {
+
+using ast_matchers::MatchFinder;
+using dataflow::NullPointerAnalysisModel;
+using dataflow::NullCheckAfterDereferenceDiagnoser;
+
+static constexpr llvm::StringLiteral FuncID("fun");
+
+struct ExpandedResult {
+  SourceLocation WarningLoc;
+  std::optional<SourceLocation> DerefLoc;
+};
+
+using ExpandedResultType = std::pair<std::vector<ExpandedResult>,
+                                     std::vector<ExpandedResult>>;
+
+static std::optional<ExpandedResultType> analyzeFunction(
+        const FunctionDecl &FuncDecl) {
+  using dataflow::ControlFlowContext;
+  using dataflow::DataflowAnalysisState;
+  using llvm::Expected;
+
+  ASTContext &ASTCtx = FuncDecl.getASTContext();
+
+  if (FuncDecl.getBody() == nullptr) {
+    return std::nullopt;
+  }
+
+  Expected<ControlFlowContext> Context =
+      ControlFlowContext::build(FuncDecl, *FuncDecl.getBody(), ASTCtx);
+  if (!Context)
+    return std::nullopt;
+
+  dataflow::DataflowAnalysisContext AnalysisContext(
+      std::make_unique<dataflow::WatchedLiteralsSolver>());
+  dataflow::Environment Env(AnalysisContext, FuncDecl);
+  NullPointerAnalysisModel Analysis(ASTCtx);
+  NullCheckAfterDereferenceDiagnoser Diagnoser;
+  NullCheckAfterDereferenceDiagnoser::ResultType Diagnostics;
+
+  using LatticeState = DataflowAnalysisState<NullPointerAnalysisModel::Lattice>;
+  using DetailMaybeLatticeStates = std::vector<std::optional<LatticeState>>;
+
+  auto DiagnoserImpl = [&ASTCtx, &Diagnoser, &Diagnostics](
+              const CFGElement &Elt,
+              const LatticeState &S) mutable -> void {
+            auto EltDiagnostics = Diagnoser.diagnose(ASTCtx, &Elt, S.Env);
+            llvm::move(EltDiagnostics.first,
+                       std::back_inserter(Diagnostics.first));
+            llvm::move(EltDiagnostics.second,
+                       std::back_inserter(Diagnostics.second));
+          };
+
+  Expected<DetailMaybeLatticeStates>
+      BlockToOutputState = dataflow::runDataflowAnalysis(
+          *Context, Analysis, Env, DiagnoserImpl);
+
+  if (llvm::Error E = BlockToOutputState.takeError()) {
+    llvm::dbgs() << "Dataflow analysis failed: " << llvm::toString(std::move(E))
+                 << ".\n";
+    return std::nullopt;
+  }
+
+  ExpandedResultType ExpandedDiagnostics;
+
+  llvm::transform(Diagnostics.first,
+                  std::back_inserter(ExpandedDiagnostics.first),
+                  [&](SourceLocation WarningLoc) -> ExpandedResult {
+                    if (auto Val = Diagnoser.WarningLocToVal[WarningLoc];
+                        auto DerefExpr = Diagnoser.ValToDerefLoc[Val]) {
+                      return {WarningLoc, DerefExpr->getBeginLoc()};
+                    }
+
+                    return {WarningLoc, std::nullopt};
+                  });
+
+  llvm::transform(Diagnostics.second,
+                  std::back_inserter(ExpandedDiagnostics.second),
+                  [&](SourceLocation WarningLoc) -> ExpandedResult {
+                    if (auto Val = Diagnoser.WarningLocToVal[WarningLoc];
+                        auto DerefExpr = Diagnoser.ValToDerefLoc[Val]) {
+                      return {WarningLoc, DerefExpr->getBeginLoc()};
+                    }
+
+                    return {WarningLoc, std::nullopt};
+                  });
+
+  return ExpandedDiagnostics;
+}
+
+void NullCheckAfterDereferenceCheck::registerMatchers(MatchFinder *Finder) {
+  using namespace ast_matchers;
+
+  auto hasPointerValue =
+      hasDescendant(NullPointerAnalysisModel::ptrValueMatcher());
+  Finder->addMatcher(
+      decl(anyOf(functionDecl(unless(isExpansionInSystemHeader()),
+                              // FIXME: Remove the filter below when lambdas are
+                              // well supported by the check.
+                              unless(hasDeclContext(cxxRecordDecl(isLambda()))),
+                              hasBody(hasPointerValue)),
+                 cxxConstructorDecl(hasAnyConstructorInitializer(
+                     withInitializer(hasPointerValue)))))
+          .bind(FuncID),
+      this);
+}
+
+void NullCheckAfterDereferenceCheck::check(
+    const MatchFinder::MatchResult &Result) {
+  if (Result.SourceManager->getDiagnostics().hasUncompilableErrorOccurred())
+    return;
+
+  const auto *FuncDecl = Result.Nodes.getNodeAs<FunctionDecl>(FuncID);
+  assert(FuncDecl && "invalid FuncDecl matcher"); 
+  if (FuncDecl->isTemplated())
+    return;
+
+  if (const auto Diagnostics = analyzeFunction(*FuncDecl)) {
+    const auto& [CheckWhenNullLocations, CheckAfterDereferenceLocations] =
+        *Diagnostics;
+
+    for (const auto [WarningLoc, DerefLoc] : CheckAfterDereferenceLocations) {
+      diag(WarningLoc,
+           "pointer value is checked even though "
+           "it cannot be null at this point");
+
+      if (DerefLoc) {
+        diag(*DerefLoc,
+             "one of the locations where the pointer's value cannot be null",
+             DiagnosticIDs::Note);
+      }
+    }
+
+    for (const auto [WarningLoc, DerefLoc] : CheckWhenNullLocations) {
+      diag(WarningLoc,
+           "pointer value is checked but it can only be null at this point");
+      
+      if (DerefLoc) {
+        diag(*DerefLoc,
+             "one of the locations where the pointer's value can only be null",
+             DiagnosticIDs::Note);
+      }
+    }
+  }
+}
+
+} // namespace clang::tidy::bugprone
diff --git a/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.h b/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.h
new file mode 100644
index 00000000000000..e5ac27e79deb11
--- /dev/null
+++ b/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.h
@@ -0,0 +1,37 @@
+//===--- NullCheckAfterDereferenceCheck.h - clang-tidy ----------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE_NULLCHECKAFTERDEREFERENCECHECK_H
+#define LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE_NULLCHECKAFTERDEREFERENCECHECK_H
+
+#include "../ClangTidyCheck.h"
+#include "clang/ASTMatchers/ASTMatchFinder.h"
+
+namespace clang::tidy::bugprone {
+
+/// Finds checks for pointer nullability after a pointer has already been
+/// dereferenced, using the data-flow framework.
+///
+/// For the user-facing documentation see:
+/// http://clang.llvm.org/extra/clang-tidy/checks/bugprone/null-check-after-dereference.html
+class NullCheckAfterDereferenceCheck : public ClangTidyCheck {
+public:
+  NullCheckAfterDereferenceCheck(StringRef Name, ClangTidyContext *Context)
+      : ClangTidyCheck(Name, Context) {}
+
+  // The data-flow framework does not support C because of AST differences.
+  bool isLanguageVersionSupported(const LangOptions &LangOpts) const override {
+    return LangOpts.CPlusPlus;
+  }
+  void registerMatchers(ast_matchers::MatchFinder *Finder) override;
+  void check(const ast_matchers::MatchFinder::MatchResult &Result) override;
+};
+
+} // namespace clang::tidy::bugprone
+
+#endif // LLVM_CLANG_TOOLS_EXTRA_CLANG_TIDY_BUGPRONE_NULLCHECKAFTERDEREFERENCECHECK_H
diff --git a/clang-tools-extra/clangd/TidyProvider.cpp b/clang-tools-extra/clangd/TidyProvider.cpp
index b658a80559937c..cf7b4dff16070b 100644
--- a/clang-tools-extra/clangd/TidyProvider.cpp
+++ b/clang-tools-extra/clangd/TidyProvider.cpp
@@ -219,9 +219,10 @@ TidyProvider disableUnusableChecks(llvm::ArrayRef<std::string> ExtraBadChecks) {
       "-bugprone-use-after-move",
       // Alias for bugprone-use-after-move.
       "-hicpp-invalid-access-moved",
-      // Check uses dataflow analysis, which might hang/crash unexpectedly on
+      // Checks use dataflow analysis, which might hang/crash unexpectedly on
       // incomplete code.
       "-bugprone-unchecked-optional-access",
+      "-bugprone-null-check-after-dereference",
 
       // ----- Performance problems -----
 
diff --git a/clang-tools-extra/docs/clang-tidy/checks/bugprone/null-check-after-dereference.rst b/clang-tools-extra/docs/clang-tidy/checks/bugprone/null-check-after-dereference.rst
new file mode 100644
index 00000000000000..90e8897d7817ed
--- /dev/null
+++ b/clang-tools-extra/docs/clang-tidy/checks/bugprone/null-check-after-dereference.rst
@@ -0,0 +1,158 @@
+.. title:: clang-tidy - bugprone-null-check-after-dereference
+
+bugprone-null-check-after-dereference
+=====================================
+
+.. note::
+
+   This check uses a flow-sensitive static analysis to produce its
+   results. Therefore, it may be more resource intensive (RAM, CPU) than the
+   average clang-tidy check.
+
+This check identifies redundant pointer null-checks, by finding cases where the
+pointer cannot be null at the location of the null-check.
+
+Redundant null-checks can signal faulty assumptions about the current value of
+a pointer at different points in the program. Either the null-check is
+redundant, or there could be a null-pointer dereference earlier in the program.
+
+.. code-block:: c++
+
+   int f(int *ptr) {
+     *ptr = 20; // note: one of the locations where the pointer's value cannot be null
+     // ...
+     if (ptr) { // bugprone: pointer is checked even though it cannot be null at this point
+       return *ptr;
+     }
+     return 0;
+   }
+
+Supported pointer operations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Pointer null-checks
+-------------------
+
+The checker currently supports null-checks on pointers that use
+``operator bool``, such as when being used as the condition
+for an `if` statement.
+
+.. code-block:: c++
+
+   int f(int *ptr) {
+     if (ptr) {
+       if (ptr) { // bugprone: pointer is re-checked after its null-ness is already checked.
+         return *ptr;
+       }
+
+       return ptr ? *ptr : 0; // bugprone: pointer is re-checked after its null-ness is already checked.
+     }
+     return 0;
+   }
+
+Pointer dereferences
+--------------------
+
+Pointer star- and arrow-dereferences are supported.
+
+.. code-block:: c++
+
+   struct S {
+     int val;
+   };
+
+   void f(int *ptr, S *wrapper) {
+     *ptr = 20;
+     wrapper->val = 15;
+   }
+
+Null-pointer and other value assignments
+----------------------------------------
+
+The checker supports assigning various values to pointers, making them *null*
+or *non-null*. The checker also supports passing pointers of a pointer to
+external functions.
+
+.. code-block:: c++
+
+   extern int *external();
+   extern void refresh(int **ptr_ptr);
+   
+   int f() {
+     int *ptr_null = nullptr;
+     if (ptr_null) { // bugprone: pointer is checked where it cannot be non-null.
+       return *ptr_null;
+     }
+
+     int *ptr = external();
+     if (ptr) { // safe: external() could return either nullable or nonnull pointers.
+       return *ptr;
+     }
+
+     int *ptr2 = external();
+     *ptr2 = 20;
+     refresh(&ptr2);
+     if (ptr2) { // safe: pointer could be changed by refresh().
+       return *ptr2;
+     }
+     return 0;
+   }
+
+Limitations
+~~~~~~~~~~~
+
+The check only supports C++ due to limitations in the data-flow framework.
+
+The check currently does not output the locations of the previous
+dereferences, even in simple cases.
+
+The annotations ``_nullable`` and ``_nonnull`` are not supported.
+
+.. code-block:: c++
+
+   extern int *_nonnull external_nonnull();
+
+   int annotations() {
+     int *ptr = external_nonnull();
+
+     return ptr ? *ptr : 0; // false-negative: pointer is known to be non-null.
+   }
+
+Function calls taking a pointer value as a reference or a pointer-to-pointer are not supported.
+
+.. code-block:: c++
+   extern int *external();
+   extern void refresh_ref(int *&ptr);
+   extern void refresh_ptr(int **ptr);
+
+   int extern_ref() {
+     int *ptr = external();
+     *ptr = 20;
+
+     refresh_ref(ptr);
+     refresh_ptr(&ptr);
+
+     return ptr ? *ptr : 0; // false-positive: pointer could be changed by refresh_ref().
+   }
+
+Note tags are currently appended to a single location, even if all paths ensure a pointer is not null.
+
+.. code-block:: c++
+   int branches(int *p, bool b) {
+     if (b) {
+       *p = 42; // true-positive: note-tag appended here
+     } else {
+       *p = 20; // false-positive: note tag not appended here
+     }
+
+     return ptr ? *ptr : 0;
+   }
+
+Declarations and some other operations are not supported by note tags yet. This can sometimes result in erroneous note tags being shown instead of the correct one.
+
+.. code-block:: c++
+   int note_tags() {
+      int *ptr = nullptr; // false-negative: note tag not shown
+
+      return ptr ? *ptr : 0;
+   }
diff --git a/clang-tools-extra/test/clang-tidy/checkers/bugprone/null-check-after-dereference.cpp b/clang-tools-extra/test/clang-tidy/checkers/bugprone/null-check-after-dereference.cpp
new file mode 100644
index 00000000000000..21e9eff4290f7a
--- /dev/null
+++ b/clang-tools-extra/test/clang-tidy/checkers/bugprone/null-check-after-dereference.cpp
@@ -0,0 +1,330 @@
+// RUN: %check_clang_tidy %s bugprone-null-check-after-dereference %t
+
+struct S {
+  int a;
+};
+
+int warning_deref(int *p) {
+  *p = 42;
+
+  if (p) {
+    // CHECK-MESSAGES: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point [bugprone-null-check-after-dereference]
+    // CHECK-MESSAGES: :[[@LINE-4]]:3: note: one of the locations where the pointer's value cannot be null
+  // FIXME: If there's a direct path, make the error message more precise, ie. remove `one of the locations`
+    *p += 20;
+    return *p;
+  } else {
+    return 0;
+  }
+}
+
+int warning_member(S *q) {
+  q->a = 42;
+
+  if (q) {
+    // CHECK-MESSAGES: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point
+    // CHECK-MESSAGES: :[[@LINE-4]]:3: note: one of the locations where the pointer's value cannot be null
+    q->a += 20;
+    return q->a;
+  } else {
+    return 0;
+  }
+}
+
+int negative_warning(int *p) {
+  *p = 42;
+
+  if (!p) {
+    // CHECK-MESSAGES: :[[@LINE-1]]:8: warning: pointer value is checked even though it cannot be null at this point
+    // CHECK-MESSAGES: :[[@LINE-4]]:3: note: one of the locations where the pointer's value cannot be null
+    return 0;
+  } else {
+    *p += 20;
+    return *p;
+  }
+}
+
+int no_warning(int *p, bool b) {
+  if (b) {
+    *p = 42;
+  }
+
+  if (p) {
+    // CHECK-MESSAGES-NOT: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point 
+    *p += 20;
+    return *p;
+  } else {
+    return 0;
+  }
+}
+
+int else_branch_warning(int *p, bool b) {
+  if (b) {
+    *p = 42;
+  } else {
+    *p = 20;
+  }
+
+  if (p) {
+    // CHECK-MESSAGES: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point
+    // CHECK-MESSAGES: :[[@LINE-7]]:5: note: one of the locations where the pointer's value cannot be null
+    return 0;
+  } else {
+    *p += 20;
+    return *p;
+  }
+}
+
+int two_branches_warning(int *p, bool b) {
+  if (b) {
+    *p = 42;
+  }
+  
+  if (!b) {
+    *p = 20;
+  }
+
+  if (p) {
+    // CHECK-MESSAGES: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point
+    // CHECK-MESSAGES: :[[@LINE-9]]:5: note: one of the locations where the pointer's value cannot be null
+    return 0;
+  } else {
+    *p += 20;
+    return *p;
+  }
+}
+
+int two_branches_reversed(int *p, bool b) {
+  if (!b) {
+    *p = 42;
+  }
+  
+  if (b) {
+    *p = 20;
+  }
+
+  if (p) {
+    // CHECK-MESSAGES: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point
+    // CHECK-MESSAGES: :[[@LINE-9]]:5: note: one of the locations where the pointer's value cannot be null
+    return 0;
+  } else {
+    *p += 20;
+    return *p;
+  }
+}
+
+
+int regular_assignment(int *p, int *q) {
+  *p = 42;
+  q = p;
+
+  if (q) {
+    // CHECK-MESSAGES: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point
+    // CHECK-MESSAGES: :[[@LINE-5]]:3: note: one of the locations where the pointer's value cannot be null
+    *p += 20; 
+    return *p;
+  } else {
+    return 0;
+  }
+}
+
+int nullptr_assignment(int *nullptr_param, bool b) {
+  *nullptr_param = 42;
+  int *nullptr_assigned;
+
+  if (b) {
+    nullptr_assigned = nullptr;
+  } else {
+    nullptr_assigned = nullptr_param;
+  }
+
+  if (nullptr_assigned) {
+    // CHECK-MESSAGES-NOT: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at this point
+    *nullptr_assigned = 20;
+    return *nullptr_assigned;
+  } else {
+    return 0;
+  }
+}
+
+extern int *fncall();
+extern void refresh_ref(int *&ptr);
+extern void refresh_ptr(int **ptr);
+
+int fncall_reassignment(int *fncall_reassigned) {
+  *fncall_reassigned = 42;
+
+  fncall_reassigned = fncall();
+
+  if (fncall_reassigned) {
+    // CHECK-MESSAGES-NOT: :[[@LINE-1]]:7: warning: pointer value is checked even though it cannot be null at...
[truncated]

Copy link

github-actions bot commented Mar 6, 2024

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.h clang-tools-extra/test/clang-tidy/checkers/bugprone/null-check-after-dereference.cpp clang/include/clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h clang/lib/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.cpp clang/unittests/Analysis/FlowSensitive/NullPointerAnalysisModelTest.cpp clang-tools-extra/clang-tidy/bugprone/BugproneTidyModule.cpp clang-tools-extra/clangd/TidyProvider.cpp
View the diff from clang-format here.
diff --git a/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp b/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp
index 6ea8e174b..fb2035d86 100644
--- a/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp
+++ b/clang-tools-extra/clang-tidy/bugprone/NullCheckAfterDereferenceCheck.cpp
@@ -66,9 +66,11 @@ analyzeFunction(const FunctionDecl &FuncDecl) {
   Diagnoser Diagnoser;
 
   Diagnoser::ResultType Diagnostics;
-  
-  if (llvm::Error E = dataflow::diagnoseFunction<NullPointerAnalysisModel, Diagnoser::DiagnosticEntry>(
-    FuncDecl, ASTCtx, Diagnoser).moveInto(Diagnostics)) {
+
+  if (llvm::Error E = dataflow::diagnoseFunction<NullPointerAnalysisModel,
+                                                 Diagnoser::DiagnosticEntry>(
+                          FuncDecl, ASTCtx, Diagnoser)
+                          .moveInto(Diagnostics)) {
     llvm::dbgs() << "Dataflow analysis failed: " << llvm::toString(std::move(E))
                  << ".\n";
     return std::nullopt;
@@ -76,8 +78,7 @@ analyzeFunction(const FunctionDecl &FuncDecl) {
 
   ExpandedResultType ExpandedDiagnostics;
 
-  llvm::transform(Diagnostics,
-                  std::back_inserter(ExpandedDiagnostics),
+  llvm::transform(Diagnostics, std::back_inserter(ExpandedDiagnostics),
                   [&](Diagnoser::DiagnosticEntry Entry) -> ExpandedResult {
                     if (auto Val = Diagnoser.WarningLocToVal[Entry.Location];
                         auto DerefExpr = Diagnoser.ValToDerefLoc[Val]) {
@@ -139,9 +140,10 @@ void NullCheckAfterDereferenceCheck::check(
              "pointer value is checked but it can only be null at this point");
 
         if (DerefLoc) {
-          diag(*DerefLoc,
-               "one of the locations where the pointer's value can only be null",
-               DiagnosticIDs::Note);
+          diag(
+              *DerefLoc,
+              "one of the locations where the pointer's value can only be null",
+              DiagnosticIDs::Note);
         }
         break;
       }
diff --git a/clang/include/clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h b/clang/include/clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h
index af5145aae..d5ce5afde 100644
--- a/clang/include/clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h
+++ b/clang/include/clang/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.h
@@ -41,6 +41,7 @@ class NullPointerAnalysisModel
     : public DataflowAnalysis<NullPointerAnalysisModel, NoopLattice> {
 private:
   CFGMatchSwitch<Environment> TransferMatchSwitch;
+
 public:
   explicit NullPointerAnalysisModel(ASTContext &Context);
 
diff --git a/clang/lib/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.cpp b/clang/lib/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.cpp
index 36907c8fd..e64ed532c 100644
--- a/clang/lib/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.cpp
+++ b/clang/lib/Analysis/FlowSensitive/Models/NullPointerAnalysisModel.cpp
@@ -60,12 +60,7 @@ enum class SatisfiabilityResult {
   Unknown
 };
 
-enum class CompareResult {
-  Same,
-  Different,
-  Top,
-  Unknown
-};
+enum class CompareResult { Same, Different, Top, Unknown };
 
 using SR = SatisfiabilityResult;
 using CR = CompareResult;
@@ -99,9 +94,8 @@ auto addressofMatcher() {
 }
 
 auto functionCallMatcher() {
-  return callExpr(
-      callee(functionDecl(hasAnyParameter(anyOf(hasType(pointerType()),
-                                                hasType(referenceType()))))));
+  return callExpr(callee(functionDecl(hasAnyParameter(
+      anyOf(hasType(pointerType()), hasType(referenceType()))))));
 }
 
 auto assignMatcher() {
@@ -117,9 +111,9 @@ auto nullCheckExprMatcher() {
 // FIXME: When TK_IgnoreUnlessSpelledInSource is removed from ptrWithBinding(),
 // this matcher should be merged with nullCheckExprMatcher().
 auto equalExprMatcher() {
-  return binaryOperator(hasAnyOperatorName("==", "!="),
-                        hasOperands(ptrWithBinding(kVar),
-                                    ptrWithBinding(kValue)));
+  return binaryOperator(
+      hasAnyOperatorName("==", "!="),
+      hasOperands(ptrWithBinding(kVar), ptrWithBinding(kValue)));
 }
 
 auto anyPointerMatcher() { return expr(hasType(isAnyPointer())).bind(kVar); }
@@ -246,8 +240,8 @@ void matchDereferenceExpr(const Stmt *stmt,
 }
 
 void matchNullCheckExpr(const Expr *NullCheck,
-                    const MatchFinder::MatchResult &Result,
-                    Environment &Env) {
+                        const MatchFinder::MatchResult &Result,
+                        Environment &Env) {
   const auto *Var = Result.Nodes.getNodeAs<Expr>(kVar);
   assert(Var != nullptr);
 
@@ -271,17 +265,16 @@ void matchNullCheckExpr(const Expr *NullCheck,
     CondValue = &A.makeAtomValue();
     Env.setValue(*NullCheck, *CondValue);
   }
-  const Formula &CondFormula = IsNonnullOp ? CondValue->formula()
-                                             : A.makeNot(CondValue->formula());
+  const Formula &CondFormula =
+      IsNonnullOp ? CondValue->formula() : A.makeNot(CondValue->formula());
 
   Env.assume(A.makeImplies(CondFormula, A.makeNot(IsNull.formula())));
-  Env.assume(A.makeImplies(A.makeNot(CondFormula),
-                           A.makeNot(IsNonnull.formula())));
+  Env.assume(
+      A.makeImplies(A.makeNot(CondFormula), A.makeNot(IsNonnull.formula())));
 }
 
 void matchEqualExpr(const BinaryOperator *EqualExpr,
-                    const MatchFinder::MatchResult &Result,
-                    Environment &Env) {
+                    const MatchFinder::MatchResult &Result, Environment &Env) {
   bool IsNotEqualsOp = EqualExpr->getOpcode() == BO_NE;
 
   const auto *LHSVar = Result.Nodes.getNodeAs<Expr>(kVar);
@@ -308,17 +301,19 @@ void matchEqualExpr(const BinaryOperator *EqualExpr,
     Env.setValue(*EqualExpr, *CondValue);
   }
 
-  const Formula &CondFormula = IsNotEqualsOp ? A.makeNot(CondValue->formula())
-                                       : CondValue->formula();
+  const Formula &CondFormula =
+      IsNotEqualsOp ? A.makeNot(CondValue->formula()) : CondValue->formula();
 
   // FIXME: Simplify formulas
   // If the pointers are equal, the nullability properties are the same.
-  Env.assume(A.makeImplies(CondFormula, 
+  Env.assume(A.makeImplies(
+      CondFormula,
       A.makeAnd(A.makeEquals(LHSNull.formula(), RHSNull.formula()),
                 A.makeEquals(LHSNonnull.formula(), RHSNonnull.formula()))));
 
   // If the pointers are not equal, at most one of the pointers is null.
-  Env.assume(A.makeImplies(A.makeNot(CondFormula),
+  Env.assume(A.makeImplies(
+      A.makeNot(CondFormula),
       A.makeNot(A.makeAnd(LHSNull.formula(), RHSNull.formula()))));
 }
 
@@ -365,8 +360,7 @@ void matchPtrArgFunctionExpr(const CallExpr *fncall,
   for (const auto *Arg : fncall->arguments()) {
     // FIXME: Add handling for reference types as arguments
     if (Arg->getType()->isPointerType()) {
-      PointerValue *OuterValue = cast_or_null<PointerValue>(
-          Env.getValue(*Arg));
+      PointerValue *OuterValue = cast_or_null<PointerValue>(Env.getValue(*Arg));
 
       if (!OuterValue)
         continue;
@@ -376,27 +370,27 @@ void matchPtrArgFunctionExpr(const CallExpr *fncall,
         continue;
 
       StorageLocation &InnerLoc = OuterValue->getPointeeLoc();
-      
+
       PointerValue *InnerValue =
           cast_or_null<PointerValue>(Env.getValue(InnerLoc));
 
       if (!InnerValue)
         continue;
-      
+
       Value *NewValue = Env.createValue(InnerType);
       assert(NewValue && "Failed to re-initialize a pointer's value");
 
       Env.setValue(InnerLoc, *NewValue);
 
-    // FIXME: Recursively invalidate all member pointers of eg. a struct
-    // Should be part of the framework, most likely.
+      // FIXME: Recursively invalidate all member pointers of eg. a struct
+      // Should be part of the framework, most likely.
     }
   }
 
   if (fncall->getCallReturnType(*Result.Context)->isPointerType() &&
       !Env.getValue(*fncall)) {
-    Value *PtrValue = Env.createValue( 
-        fncall->getCallReturnType(*Result.Context));
+    Value *PtrValue =
+        Env.createValue(fncall->getCallReturnType(*Result.Context));
     if (!PtrValue)
       return;
 
@@ -445,9 +439,10 @@ diagnoseDerefLocation(const Expr *Deref, const MatchFinder::MatchResult &Result,
   return {};
 }
 
-Diagnoser::ResultType diagnoseAssignLocation(const Expr *Assign,
-                                             const MatchFinder::MatchResult &Result,
-                                             Diagnoser::DiagnoseArgs &Data) {
+Diagnoser::ResultType
+diagnoseAssignLocation(const Expr *Assign,
+                       const MatchFinder::MatchResult &Result,
+                       Diagnoser::DiagnoseArgs &Data) {
   auto [ValToDerefLoc, WarningLocToVal, Env] = Data;
 
   const auto *RHSVar = Result.Nodes.getNodeAs<Expr>(kValue);
@@ -467,8 +462,8 @@ Diagnoser::ResultType diagnoseAssignLocation(const Expr *Assign,
 
 NullCheckAfterDereferenceDiagnoser::ResultType
 diagnoseNullCheckExpr(const Expr *NullCheck,
-      const MatchFinder::MatchResult &Result,
-      const Diagnoser::DiagnoseArgs &Data) {
+                      const MatchFinder::MatchResult &Result,
+                      const Diagnoser::DiagnoseArgs &Data) {
   auto [ValToDerefLoc, WarningLocToVal, Env] = Data;
 
   const auto *Var = Result.Nodes.getNodeAs<Expr>(kVar);
@@ -488,7 +483,8 @@ diagnoseNullCheckExpr(const Expr *NullCheck,
         assert(Inserted && "multiple warnings at the same source location");
         (void)Inserted;
 
-        return {{Var->getBeginLoc(), Diagnoser::DiagnosticType::CheckAfterDeref}};
+        return {
+            {Var->getBeginLoc(), Diagnoser::DiagnosticType::CheckAfterDeref}};
       }
 
       if (IsNull && !IsNonnull) {
@@ -520,22 +516,24 @@ diagnoseEqualExpr(const Expr *PtrCheck, const MatchFinder::MatchResult &Result,
   assert(LHSVar != nullptr);
   const auto *RHSVar = Result.Nodes.getNodeAs<Expr>(kValue);
   assert(RHSVar != nullptr);
-  
+
   Arena &A = Env.arena();
   llvm::SmallVector<Diagnoser::DiagnosticEntry> NullVarLocations;
 
   if (Value *LHSValue = Env.getValue(*LHSVar);
-      LHSValue->getProperty(kIsNonnull) && 
+      LHSValue->getProperty(kIsNonnull) &&
       Env.proves(A.makeNot(getVal(kIsNonnull, *LHSValue).formula()))) {
     WarningLocToVal.try_emplace(LHSVar->getBeginLoc(), LHSValue);
-    NullVarLocations.push_back({LHSVar->getBeginLoc(), Diagnoser::DiagnosticType::CheckWhenNull});
+    NullVarLocations.push_back(
+        {LHSVar->getBeginLoc(), Diagnoser::DiagnosticType::CheckWhenNull});
   }
 
   if (Value *RHSValue = Env.getValue(*RHSVar);
-      RHSValue->getProperty(kIsNonnull) && 
+      RHSValue->getProperty(kIsNonnull) &&
       Env.proves(A.makeNot(getVal(kIsNonnull, *RHSValue).formula()))) {
     WarningLocToVal.try_emplace(RHSVar->getBeginLoc(), RHSValue);
-    NullVarLocations.push_back({RHSVar->getBeginLoc(), Diagnoser::DiagnosticType::CheckWhenNull});
+    NullVarLocations.push_back(
+        {RHSVar->getBeginLoc(), Diagnoser::DiagnosticType::CheckWhenNull});
   }
 
   return NullVarLocations;
@@ -652,7 +650,7 @@ void NullPointerAnalysisModel::join(QualType Type, const Value &Val1,
   } else {
     MergedVal.setProperty(kIsNonnull, NonnullValue);
     MergedVal.setProperty(kIsNull, NullValue);
-  
+
     MergedEnv.assume(MergedEnv.makeOr(NonnullValue, NullValue).formula());
   }
 }
@@ -699,12 +697,10 @@ ComparisonResult NullPointerAnalysisModel::compare(QualType Type,
   if (NullComparison == CR::Top || NonnullComparison == CR::Top)
     return ComparisonResult::Same;
 
-  if (NullComparison == CR::Different ||
-      NonnullComparison == CR::Different)
+  if (NullComparison == CR::Different || NonnullComparison == CR::Different)
     return ComparisonResult::Different;
 
-  if (NullComparison == CR::Unknown ||
-      NonnullComparison == CR::Unknown)
+  if (NullComparison == CR::Unknown || NonnullComparison == CR::Unknown)
     return ComparisonResult::Unknown;
 
   return ComparisonResult::Same;
@@ -746,12 +742,10 @@ ComparisonResult compareAndReplace(QualType Type, Value &Val1,
   if (NullComparison == CR::Top || NonnullComparison == CR::Top)
     return ComparisonResult::Same;
 
-  if (NullComparison == CR::Different ||
-      NonnullComparison == CR::Different)
+  if (NullComparison == CR::Different || NonnullComparison == CR::Different)
     return ComparisonResult::Different;
 
-  if (NullComparison == CR::Unknown ||
-      NonnullComparison == CR::Unknown)
+  if (NullComparison == CR::Unknown || NonnullComparison == CR::Unknown)
     return ComparisonResult::Unknown;
 
   return ComparisonResult::Same;
@@ -779,7 +773,7 @@ NullCheckAfterDereferenceDiagnoser::NullCheckAfterDereferenceDiagnoser()
 
 NullCheckAfterDereferenceDiagnoser::ResultType
 NullCheckAfterDereferenceDiagnoser::operator()(
-    const CFGElement &Elt, ASTContext &Ctx, 
+    const CFGElement &Elt, ASTContext &Ctx,
     const TransferStateForDiagnostics<NoopLattice> &State) {
   DiagnoseArgs Args = {ValToDerefLoc, WarningLocToVal, State.Env};
   return DiagnoseMatchSwitch(Elt, Ctx, Args);

Copy link
Member

@whisperity whisperity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(@Discookie and I had an extensive prior review at Discookie#1 in case someone is interested in the details.)

@whisperity whisperity changed the title [clang][dataflow] Add null-check after dereference checker [clang-tidy][dataflow] Add bugprone-null-check-after-dereference check Mar 6, 2024
@llvmbot llvmbot added clang Clang issues not falling into any other category clang-tools-extra clangd clang:analysis labels Mar 6, 2024
@gribozavr
Copy link
Collaborator

gribozavr commented Mar 6, 2024

A complete review will take some time - but I wanted to first thank you for this contribution and say that I'm very excited to see more tools for improving C++ nullability correctness, as well as the dataflow framework being used in-tree!

cc @martinboehme @ymand from our group which is also working on C++ nullability verification and inference. We are planning to contribute our ClangTidy check to Clang upstream in future. At the moment we're still evaluating and fine tuning it, as well as working on inference tooling. So we are not yet ready to write an RFC that pins down the semantics in a way that is justified by our internal deployment experience.

However, if you are interested, I would invite you to take a look at our modeling of nullability that uses two booleans per pointer to arrive at a gradual typing solution. I would like to also draw your attention at our "nullability vector" approach that complements an arbitrary C++ type with nullability information. This "shadow type system" approach allows us to understand nullability annotations not only at the top level but also through C++ templates (for example, we understand that the return value of first() on a std::vector<int * _Nullable> is nullable; unfortunately, Clang's resugaring does not always help in more complex cases).

Copy link
Contributor

@EugeneZelenko EugeneZelenko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention new check in Release Notes.


Env.assume(A.makeNot(IsNonnull.formula()));
NewRootValue->setProperty(kIsNonnull, Env.getBoolLiteralValue(false));
}
Copy link
Collaborator

@gribozavr gribozavr Mar 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this function could be significantly simplified.

You should check if the Expr argument of the cast has a modeled pointer value, and if it is, then one of the booleans that you're storing as synthetic properties on that pointer value would be the result of the pointer to bool cast.

See https://github.com/google/crubit/blob/085108ea0a3b8da2e400613d7d235274de1f34b9/nullability/pointer_nullability_analysis.cc#L822 for a sample transfer function for a cast.

If not all variables reliably get a modeled pointer value (I'm guessing this is the case because a good chunk of this transfer function is dedicated to creating values), then that problem should be solved uniformly for all AST nodes, not just for casts. PTAL at unpackPointerValue, initPointerNullState and initPointerFromTypeNullability in our code, and how they are used - specifically transferValue_Pointer and that it is called for every pointer-typed expression, not just variables. To reliably initialize synthetic properties (kIsNull etc.) the idea is to call initialization in a central place, so that each individual transfer function does not need to worry about it.

unpackPointerValue also handles "top" and I think your code would benefit from something similar, as it leads to significantly better convergence of loop analysis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, thanks for the in-depth explanation! I'll have to take a better look at how you handle transferValue_Pointer.

In general, the checker would benefit greatly from centralizing common operations, I just wasn't sure what would be a good way to group such operations. Thank you for the pointers on that as well, will need to refactor the code a bit to implement those.

(As an aside, the approach of setting the cast expr's Value directly, instead of making the nullability depend on the cast expr's result is something I didn't think of before.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I am only doing what you said about transferValue_Pointer, and only matching special cases where a pointer is used instead of created. I'm in the process of simplifying all of the logic along the lines of this, though.

Comment on lines +524 to +682
SatisfiabilityResult LHSResult = computeSatisfiability(LHSVar, Env1);
SatisfiabilityResult RHSResult = computeSatisfiability(RHSVar, Env2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a pretty expensive way of evaluating a comparison: You're potentially calling the SAT solver four times (twice for each computeSatisfiability()).

Instead, I'd suggest testing "proves(makeEquals(*LHSVar, *RHSVar))" and "proves(makeNot(makeEquals(*LHSVar, *RHSVar)))" (pseudocode), which gives you at most two SAT solver invocations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue with evaluating something similar to LHS == RHS is that in the environments Env1, Env2 and MergedEnv can have different satisfiability results.
I can imagine evaluating something similar to (FlowCondition1 => LHS) == (FlowCondition2 => RHS) against MergedEnv could work, but I'll need to test that first.


// Computes satisfiability by using the flow condition. Slower, but more
// precise.
SatisfiabilityResult computeSatisfiability(BoolValue *Val,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nit: This isn't really computing satisfiability; it's computing whether or not the environment proves that a BoolValue is necessarily true or false.

I think you could rename this tryToProve(), and you could probably do without the SatisfiabilityResult type entirely (which isn't really an accurate name either) and instead return a BoolValue & (since that's what the caller ends up needing anyway). This would be either a literal true or false (if the environment can prove the value true or false) or *Val if the environment can't prove anything about the variable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning a BoolValue & sounds useful. I wouldn't be able to differentiate cases such as nullptr vs unknown, but maybe I don't even need to, since I'm gonna initialize the variable later anyways.

Copy link
Member

@PiotrZSL PiotrZSL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like idea behind this check, and I think that there should be version of this check not only for raw pointers but also for optionals, smart pointers and iterators.

My main problem is that dataflow framework is slow and unstable, there are 20 issues open for an bugprone-unchecked-optional-access check that uses this framework and 19 issues for a framework alone. It crashes, it hangs and only cause problems.

Personally I would prefer check like this to be written in simpler way by using same method as bugprone-use-after-move is using. Even if it would find less issues, but at-least wouldn't force half of the projects to disable it due to crashes or long execution time.

Otherwise please consider implementing this check as an clang-analyzer check, so that all incoming issues could be redirected there instead pin to clang-tidy.

@Discookie Discookie removed the request for review from ymand May 13, 2024 08:21
@Discookie
Copy link
Contributor Author

Should be ready for another round of review.

Simplified a lot of the logic, implemented most of the changes requested, added support for p != nullptr, and void f(int **ptr) invalidating the value of *ptr.

I've been experimenting with trying to remove TK_IgnoreUnlessSpelledInSource, without much luck. I need to be able to set a new Value to all of my pointers, and currently the framework doesn't let me set it in a persistent manner.
I'm still yet to refactor handling Top, currently it's used in early returns where possible, but I still need to think of a clear way to make central functions for it.

@Discookie
Copy link
Contributor Author

gentle ping @martinboehme @gribozavr

@martinboehme
Copy link
Contributor

gentle ping @martinboehme @gribozavr

Sorry, I didn't notice your original comment that this is ready for review again.

I've added a few individual comments, but I haven't reviewed the whole PR in depth because I see that there are still quite a number of old comments that haven't been addressed. Please could you address these comments before the next review round?

Also:

I'm still yet to refactor handling Top, currently it's used in early returns where possible, but I still need to think of a clear way to make central functions for it.

Can you address this too before the next review round? It's hard to review a PR that the author themself acknowledges isn't in a final state yet.

If you're not sure what the best way is to do certain things, happy to help with that of course -- but then we should address those specific topics first before continuing to review the whole PR.

llvm::DenseMap<const Value *, const Expr *> ValToDerefLoc;
// Maps Maps a warning's SourceLocation to its relevant Value.
llvm::DenseMap<SourceLocation, const Value *> WarningLocToVal;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks, though, as if these maps are passed to the diagnose-match-switch through a DiagnoseArgs parameter, rather than being accessed directly. So I think these could (and should) still be private?

Copy link
Contributor Author

@Discookie Discookie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still on this checker on and off, and there's been a bit of development since my last comments.
Added handling and early returns for Top values, fixed more crashes, and I'm currently in the process of making the logic simpler, eliminating unneeded solver invocations. Fixed all the review comments, except for the one related to unpackPointerValue, which I still need to figure out a good system for.


Env.assume(A.makeNot(IsNonnull.formula()));
NewRootValue->setProperty(kIsNonnull, Env.getBoolLiteralValue(false));
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I am only doing what you said about transferValue_Pointer, and only matching special cases where a pointer is used instead of created. I'm in the process of simplifying all of the logic along the lines of this, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:analysis clang:dataflow Clang Dataflow Analysis framework - https://clang.llvm.org/docs/DataFlowAnalysisIntro.html clang Clang issues not falling into any other category clang-tidy clang-tools-extra clangd
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants