Skip to content

Conversation

@Jianghibo
Copy link
Contributor

Allows merging the memory profiles with basic sampling profiles or LBR sampling profils.

Using std::map instead of StringMap to keep records in order: For profiles that mix branch and memory information, since llvm-bolt processes branch and memory information sequentially, it is essential to ensure that the data of these two types remains relatively independent.

@llvmbot
Copy link
Member

llvmbot commented Jan 5, 2026

@llvm/pr-subscribers-bolt

Author: Haibo Jiang (Jianghibo)

Changes

Allows merging the memory profiles with basic sampling profiles or LBR sampling profils.

Using std::map instead of StringMap to keep records in order: For profiles that mix branch and memory information, since llvm-bolt processes branch and memory information sequentially, it is essential to ensure that the data of these two types remains relatively independent.


Full diff: https://github.com/llvm/llvm-project/pull/174405.diff

2 Files Affected:

  • (added) bolt/test/merge-fdata-mixed-branch-memory.test (+15)
  • (modified) bolt/tools/merge-fdata/merge-fdata.cpp (+22-4)
diff --git a/bolt/test/merge-fdata-mixed-branch-memory.test b/bolt/test/merge-fdata-mixed-branch-memory.test
new file mode 100644
index 0000000000000..8ad3e3f8ac9dc
--- /dev/null
+++ b/bolt/test/merge-fdata-mixed-branch-memory.test
@@ -0,0 +1,15 @@
+## Check that merge-fdata correctly handles merging two fdata files with no_lbr/lbr sampling and memory sampling.
+
+# REQUIRES: system-linux
+
+# RUN: split-file %s %t
+# RUN: merge-fdata %t/a.fdata %t/b.fdata -o %t/merged.fdata
+# RUN: FileCheck %s --input-file %t/merge.fdata
+#
+# CHECK: 1 main 0 1 main 2 0 1
+# CHECK: 4 main 10 4 othreSym 123 1
+
+#--- a.fdata
+1 main 0 1 main 2 0 1
+#--- b.fdata
+4 main 10 4 otherSym 123 1
diff --git a/bolt/tools/merge-fdata/merge-fdata.cpp b/bolt/tools/merge-fdata/merge-fdata.cpp
index f5b0251ea0331..1b94c67a53fda 100644
--- a/bolt/tools/merge-fdata/merge-fdata.cpp
+++ b/bolt/tools/merge-fdata/merge-fdata.cpp
@@ -263,6 +263,14 @@ bool isYAML(const StringRef Filename) {
   return false;
 }
 
+bool isBasicOrLBRItem(APInt SymbolValue) {
+  return SymbolValue.sge(0) && SymbolValue.sle(2);
+}
+
+bool isMemoryItem(APInt SymbolValue) {
+  return SymbolValue.sge(3) && SymbolValue.sle(5);
+}
+
 void mergeLegacyProfiles(const SmallVectorImpl<std::string> &Filenames) {
   errs() << "Using legacy profile format.\n";
   std::optional<bool> BoltedCollection;
@@ -319,8 +327,12 @@ void mergeLegacyProfiles(const SmallVectorImpl<std::string> &Filenames) {
       auto [Signature, ExecCount] = Line.rsplit(' ');
       if (ExecCount.getAsInteger(10, Count.Exec))
         report_error(Filename, "Malformed / corrupted execution count");
+
+      auto [Symbol, Record] = Signature.split(' ');
+      APInt SymbolValue;
+      Symbol.getAsInteger(10, SymbolValue);
       // Only LBR profile has misprediction field
-      if (!NoLBRCollection.value_or(false)) {
+      if (!NoLBRCollection.value_or(false) && isBasicOrLBRItem(SymbolValue)) {
         auto [SignatureLBR, MispredCount] = Signature.rsplit(' ');
         Signature = SignatureLBR;
         if (MispredCount.getAsInteger(10, Count.Mispred))
@@ -343,10 +355,13 @@ void mergeLegacyProfiles(const SmallVectorImpl<std::string> &Filenames) {
     Pool.async(ParseProfile, std::cref(Filename), std::ref(ParsedProfiles));
   Pool.wait();
 
-  ProfileTy MergedProfile;
+  std::map<StringRef, CounterTy> MergedProfile;
   for (const auto &[Thread, Profile] : ParsedProfiles)
     for (const auto &[Key, Value] : Profile) {
-      CounterTy Count = MergedProfile.lookup(Key) + Value;
+      auto It = MergedProfile.find(Key);
+      CounterTy Count = It != MergedProfile.end() ? It->second + Value
+                                                  : Value;
+
       MergedProfile.insert_or_assign(Key, Count);
     }
 
@@ -356,7 +371,10 @@ void mergeLegacyProfiles(const SmallVectorImpl<std::string> &Filenames) {
     output() << "no_lbr\n";
   for (const auto &[Key, Value] : MergedProfile) {
     output() << Key << " ";
-    if (!NoLBRCollection.value_or(false))
+    auto [Symbol, Record] = key.split(' ');
+    APInt SymbolValue;
+    Symbol.getAsInteger(10, SymbolValue);
+    if (!NoLBRCollection.value_or(false) && isBasicOrLBRItem(SymbolValue))
       output() << Value.Mispred << " ";
     output() << Value.Exec << "\n";
   }

@github-actions
Copy link

github-actions bot commented Jan 5, 2026

✅ With the latest revision this PR passed the C/C++ code formatter.

@github-actions
Copy link

github-actions bot commented Jan 5, 2026

🐧 Linux x64 Test Results

  • 608 tests passed
  • 50 tests skipped

✅ The build succeeded and all tests passed.

@Jianghibo Jianghibo force-pushed the tmp-branch branch 2 times, most recently from ab07ce9 to ed276f5 Compare January 5, 2026 13:50
Allows merging the memory profiles with basic sampling profiles or
LBR sampling profils.

Using std::map instead of StringMap to keep records in order: For
profiles that mix branch and memory information, since llvm-bolt
processes branch and memory information sequentially, it is
essential to ensure that the data of these two types remains
relatively independent.
@aaupov
Copy link
Contributor

aaupov commented Jan 5, 2026

The issue should be addressed by #128108, please take a look

@Jianghibo
Copy link
Contributor Author

The issue should be addressed by #128108, please take a look

Thanks, I will close this PR.

@Jianghibo
Copy link
Contributor Author

The implementation already exists. see #128108

@Jianghibo Jianghibo closed this Jan 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants