[Windows] Add support for emitting PGO/LTO magic strings in the Windows PE debug directory #114260
@llvm/pr-subscribers-lld @llvm/pr-subscribers-mc
Author: Mikołaj Piróg (mikolaj-pirog)
Changes
This PR adds support for putting magic strings indicating PGO/LTO in the debug directory of Windows PE files. This is to make clang and lld behave like the MSVC compiler in this regard.
This needs a little background: the MSVC compiler puts magic strings ("PGI" for instrumentation, "PGU" for a binary built using instrumented data, "LTCG" for LTO builds) in the debug directory of Windows PE files. You can see these strings by using the `dumpbin` utility (`dumpbin /HEADERS a.exe`) on files built with MSVC and PGO/LTO on Windows.
This PR "ports" this behavior to clang and lld, so compiling and linking using lld with `-fprofile-generate` will result in a Windows PE file with the "PGI" string in the debug dir; for `-fprofile-use` and `-fprofile-sample-use` this will result in the "PGU" string; for `-flto` this will result in the "LTCG" string in the debug dir.
The implementation is split between the lld linker and the emission of COFF object files. The linker puts the magic strings in the debug directory; the problem is that it needs to know when to do that -- lld isn't aware of instrumentation, or of building based on profiling info. It knows only about linking of an LTOed binary. Naturally, lld has to somehow know that it needs to put the "PGI"/"PGU" string -- I have solved this by putting special sections (".pgi", ".pgu") in the COFF object files when they are compiled with instrumentation, or with data from instrumentation. lld checks for these sections and, based on their presence, puts the magic string in the debug directory of the Windows PE file.
Full diff: https://github.com/llvm/llvm-project/pull/114260.diff
9 Files Affected:
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index f018130807519d..fcf3dc25d95fc0 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -525,6 +525,9 @@ static bool initTargetOptions(DiagnosticsEngine &Diags,
Options.MCOptions.PPCUseFullRegisterNames =
CodeGenOpts.PPCUseFullRegisterNames;
Options.MisExpect = CodeGenOpts.MisExpect;
+ Options.MCOptions.PgoInstrumentation = CodeGenOpts.getProfileInstr() > 0;
+ Options.MCOptions.PgoUse =
+ CodeGenOpts.getProfileUse() > 0 || !CodeGenOpts.SampleProfileFile.empty();
return true;
}
diff --git a/clang/test/CodeGen/debug-dir-win-pe-ltcg-string.c b/clang/test/CodeGen/debug-dir-win-pe-ltcg-string.c
new file mode 100644
index 00000000000000..a121ab8c9acc45
--- /dev/null
+++ b/clang/test/CodeGen/debug-dir-win-pe-ltcg-string.c
@@ -0,0 +1,13 @@
+// This test checks if a Windows PE file compiled with the -flto option contains a magic
+// string "LTCG" to indicate LTO compilation.
+
+// REQUIRES: system-windows
+
+// RUN: %clang --target=x86_64-pc-windows-msvc -flto -fuse-ld=lld %s -o %t.exe
+// RUN: dumpbin /HEADERS %t.exe | FileCheck %s
+// CHECK: {{.*}}LTCG{{.*}}
+
+int main(void) {
+
+ return 0;
+}
diff --git a/clang/test/CodeGen/debug-dir-win-pe-pgi-string.c b/clang/test/CodeGen/debug-dir-win-pe-pgi-string.c
new file mode 100644
index 00000000000000..7f1e9e35aaf120
--- /dev/null
+++ b/clang/test/CodeGen/debug-dir-win-pe-pgi-string.c
@@ -0,0 +1,14 @@
+// This test checks if Windows PE file compiled with
+// -fprofile-generate has magic string "PGI" to indicate so.
+
+
+// REQUIRES: system-windows
+
+// RUN: %clang --target=x86_64-pc-windows-msvc -fprofile-generate -fuse-ld=lld %s -o %t.exe
+// RUN: dumpbin /HEADERS %t.exe | FileCheck --check-prefix=CHECK2 %s
+// CHECK2: {{.*}}PGI{{.*}}
+
+int main(void) {
+
+ return 0;
+}
diff --git a/clang/test/CodeGen/debug-dir-win-pe-pgu-string.c b/clang/test/CodeGen/debug-dir-win-pe-pgu-string.c
new file mode 100644
index 00000000000000..12c63425aee0f5
--- /dev/null
+++ b/clang/test/CodeGen/debug-dir-win-pe-pgu-string.c
@@ -0,0 +1,18 @@
+// This test checks if Windows PE file contains a "PGU" string to indicate that
+// it was compiled using profiling data.
+
+// REQUIRES: system-windows
+
+// RUN: %clang --target=x86_64-pc-windows-msvc -fprofile-instr-generate="%profdata" -fuse-ld=lld %s -o %t.exe
+// RUN: %t.exe
+// RUN: llvm-profdata merge -output=%code.profdata %profdata
+// RUN: %clang --target=x86_64-pc-windows-msvc -fprofile-use=%code.profdata -fuse-ld=lld %s -o %t.exe
+// RUN: dumpbin /HEADERS %t.exe | FileCheck %s
+
+// CHECK: {{.*}}PGU{{.*}}
+
+int main(void) {
+
+ return 0;
+}
+
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 71ee5ce4685553..0ce62ad21c4634 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -77,6 +77,12 @@ static unsigned char dosProgram[] = {
static_assert(sizeof(dosProgram) % 8 == 0,
"DOSProgram size must be multiple of 8");
+static char ltcg[] = "LTCG";
+static char pgi[] = "PGI";
+static char pgu[] = "PGU";
+static char pgiSectionName[] = ".pgi";
+static char pguSectionName[] = ".pgu";
+
static const int dosStubSize = sizeof(dos_header) + sizeof(dosProgram);
static_assert(dosStubSize % 8 == 0, "DOSStub size must be multiple of 8");
@@ -179,6 +185,23 @@ class ExtendedDllCharacteristicsChunk : public NonSectionChunk {
uint32_t characteristics = 0;
};
+class DebugDirStringChunk : public NonSectionChunk {
+public:
+ DebugDirStringChunk(std::string str) : str(str.begin(), str.end()) {
+ while (this->str.size() % 4 != 0)
+ this->str.push_back(0);
+ }
+ size_t getSize() const override { return str.size(); }
+
+ void writeTo(uint8_t *b) const override {
+ char *p = reinterpret_cast<char *>(b);
+ auto strReverse = str;
+ std::reverse(strReverse.begin(), strReverse.end());
+ memcpy(p, strReverse.data(), strReverse.size());
+ }
+ std::vector<char> str;
+};
+
// PartialSection represents a group of chunks that contribute to an
// OutputSection. Collating a collection of PartialSections of same name and
// characteristics constitutes the OutputSection.
@@ -1165,6 +1188,23 @@ void Writer::createMiscChunks() {
llvm::TimeTraceScope timeScope("Misc chunks");
Configuration *config = &ctx.config;
+ auto searchForPgoMagicSection = [this](char sectionName[]) {
+ for (auto *obj : ctx.objFileInstances) {
+ for (auto &chunk : obj->getChunks()) {
+ if (chunk->kind() == Chunk::SectionKind &&
+ chunk->getSectionName() == sectionName) {
+ return true;
+ }
+ }
+ }
+ return false;
+ };
+
+ bool writePgi = searchForPgoMagicSection(pgiSectionName);
+ bool writePgu = !writePgi && searchForPgoMagicSection(pguSectionName);
+ bool writeLTO = ctx.bitcodeFileInstances.size();
+
+
for (MergeChunk *p : ctx.mergeChunkInstances) {
if (p) {
p->finalizeContents();
@@ -1181,7 +1221,7 @@ void Writer::createMiscChunks() {
// Create Debug Information Chunks
debugInfoSec = config->mingw ? buildidSec : rdataSec;
if (config->buildIDHash != BuildIDHash::None || config->debug ||
- config->repro || config->cetCompat) {
+ config->repro || config->cetCompat || writePgi || writePgu || writeLTO) {
debugDirectory =
make<DebugDirectoryChunk>(ctx, debugRecords, config->repro);
debugDirectory->setAlignment(4);
@@ -1206,6 +1246,20 @@ void Writer::createMiscChunks() {
IMAGE_DLL_CHARACTERISTICS_EX_CET_COMPAT));
}
+
+ if (writeLTO) {
+ debugRecords.emplace_back(COFF::IMAGE_DEBUG_TYPE_POGO,
+ make<DebugDirStringChunk>(ltcg));
+ }
+
+ if (writePgi) {
+ debugRecords.emplace_back(COFF::IMAGE_DEBUG_TYPE_POGO,
+ make<DebugDirStringChunk>(pgi));
+ } else if (writePgu) {
+ debugRecords.emplace_back(COFF::IMAGE_DEBUG_TYPE_POGO,
+ make<DebugDirStringChunk>(pgu));
+ }
+
// Align and add each chunk referenced by the debug data directory.
for (std::pair<COFF::DebugType, Chunk *> r : debugRecords) {
r.second->setAlignment(4);
diff --git a/lld/test/COFF/debug_dir_magic_strings_from_section_pgi.s b/lld/test/COFF/debug_dir_magic_strings_from_section_pgi.s
new file mode 100644
index 00000000000000..b1782dade39042
--- /dev/null
+++ b/lld/test/COFF/debug_dir_magic_strings_from_section_pgi.s
@@ -0,0 +1,17 @@
+// This test checks if lld puts magic string "PGI" when an object file contains
+// .pgi section.
+
+// REQUIRES: system-windows
+
+// RUN: llvm-mc -filetype=obj -triple=x86_64-pc-windows-msvc %s -o %t.main.obj
+
+// RUN: lld-link -out:%t.exe %t.main.obj -entry:entry -subsystem:console -debug:symtab
+// RUN: dumpbin /HEADERS %t.exe | FileCheck %s
+// CHECK: PGI
+
+#--- main.s
+.section .pgi
+.global entry
+entry:
+ movl %edx, %edx
+
diff --git a/lld/test/COFF/debug_dir_magic_strings_from_section_pgu.s b/lld/test/COFF/debug_dir_magic_strings_from_section_pgu.s
new file mode 100644
index 00000000000000..341f88d25bbaad
--- /dev/null
+++ b/lld/test/COFF/debug_dir_magic_strings_from_section_pgu.s
@@ -0,0 +1,17 @@
+// This test checks if lld puts magic string "PGU" when an object file contains
+// .pgu section.
+
+// REQUIRES: system-windows
+
+// RUN: llvm-mc -filetype=obj -triple=x86_64-pc-windows-msvc %s -o %t.main.obj
+
+// RUN: lld-link -out:%t.exe %t.main.obj -entry:entry -subsystem:console -debug:symtab
+// RUN: dumpbin /HEADERS %t.exe | FileCheck %s
+// CHECK: PGU
+
+#--- main.s
+.section .pgu
+.global entry
+entry:
+ movl %edx, %edx
+
diff --git a/llvm/include/llvm/MC/MCTargetOptions.h b/llvm/include/llvm/MC/MCTargetOptions.h
index 7b0d81faf73d2d..5e6a58a36615b1 100644
--- a/llvm/include/llvm/MC/MCTargetOptions.h
+++ b/llvm/include/llvm/MC/MCTargetOptions.h
@@ -1,5 +1,4 @@
//===- MCTargetOptions.h - MC Target Options --------------------*- C++ -*-===//
-//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
@@ -112,6 +111,8 @@ class MCTargetOptions {
// Whether or not to use full register names on PowerPC.
bool PPCUseFullRegisterNames : 1;
+ bool PgoInstrumentation = false;
+ bool PgoUse = false;
MCTargetOptions();
/// getABIName - If this returns a non-empty string this represents the
diff --git a/llvm/lib/MC/WinCOFFObjectWriter.cpp b/llvm/lib/MC/WinCOFFObjectWriter.cpp
index 62f53423126ea9..e413a2d3e48b9e 100644
--- a/llvm/lib/MC/WinCOFFObjectWriter.cpp
+++ b/llvm/lib/MC/WinCOFFObjectWriter.cpp
@@ -28,6 +28,7 @@
#include "llvm/MC/MCSectionCOFF.h"
#include "llvm/MC/MCSymbol.h"
#include "llvm/MC/MCSymbolCOFF.h"
+#include "llvm/MC/MCTargetOptions.h"
#include "llvm/MC/MCValue.h"
#include "llvm/MC/MCWinCOFFObjectWriter.h"
#include "llvm/MC/StringTableBuilder.h"
@@ -981,6 +982,18 @@ static std::time_t getTime() {
uint64_t WinCOFFWriter::writeObject(MCAssembler &Asm) {
uint64_t StartOffset = W.OS.tell();
+ const auto *Options = Asm.getContext().getTargetOptions();
+
+ if (Options && Options->PgoInstrumentation) {
+ auto *Section = Asm.getContext().getCOFFSection(".pgi", 0);
+ defineSection(Asm, *Section);
+ }
+
+ if (Options && Options->PgoUse) {
+ auto *Section = Asm.getContext().getCOFFSection(".pgu", 0);
+ defineSection(Asm, *Section);
+ }
+
if (Sections.size() > INT32_MAX)
report_fatal_error(
"PE COFF object files can't have more than 2147483647 sections");
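The `DebugDirStringChunk` added in the lld part of the diff pads the string with NUL bytes to a multiple of 4 and then reverses the whole buffer before writing it. A minimal standalone sketch of that byte transformation follows; `encodeMagic` is a hypothetical name used only for illustration and is not part of the patch.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper mirroring the transformation done by DebugDirStringChunk
// in the diff: pad with NUL bytes to a multiple of 4, then reverse the buffer.
std::vector<char> encodeMagic(const std::string &magic) {
  std::vector<char> bytes(magic.begin(), magic.end());
  while (bytes.size() % 4 != 0)
    bytes.push_back('\0');
  // After the reversal, "PGI" is stored as {'\0', 'I', 'G', 'P'} and
  // "LTCG" as {'G', 'C', 'T', 'L'}.
  std::reverse(bytes.begin(), bytes.end());
  return bytes;
}
```

Read back as a little-endian 32-bit value, the four bytes `G C T L` give 0x4C544347, whose bytes from most to least significant spell "LTCG"; this is presumably why the chunk reverses the string rather than copying it directly.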
Thanks for working on this. It's always good to be compatible with MSVC as much as possible. Some comments and questions. We also need some more people that know more COFF to weigh in.
What mechanism does MSVC compiler use to indicate PGO to the linker for non-LTO builds? We probably want to do the same thing if possible, since people can mix compilers/linkers.
}
}
return false;
};
Why do we need to do this? Why can we not just emit the magic content with COMDAT and let /debug handle the preservation?
If I understand you correctly, you suggest creating a section holding the debug dir during COFF file emission, with the expectation that the linker will preserve it, right? But then we would have to update the "Debug" field of the "Optional Header Data Directories" of the Windows PE file to point to the debug directory, which I believe would be more or less the same as the current solution: we would have to iterate over all sections of all object files to update the "Debug" entry.
Hmm, my thinking was that if we have content that is guaranteed to be folded into the debug data directory, the directory will be emitted. As such, the linker will link the directory in the header and emit that. This would avoid the need to iterate all the sections; it would simply force the emission of the debug directory without `/debug` being passed.
I don't see how this content is guaranteed to be folded into the debug data directory. I can create a debug dir entry in the COFF with COMDAT, and this will be folded, but then I still have to adjust the pointers in the optional header `Debug` field, to point to the debug dir, and in the debug dir `AddressOfRawData` field, to point to the specific entries. This has to be done during the linking phase, because these structures don't exist in the COFF files.
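The point about addresses only being known at link time can be pictured with a small sketch of one debug directory entry. The field names follow the PE/COFF specification's debug directory layout; `makePogoEntry` and its parameters are hypothetical and exist only for this illustration, they are not lld code.

```cpp
#include <cstdint>

// One entry of the PE debug directory (28 bytes), with field names taken from
// the PE/COFF specification.
struct DebugDirectoryEntry {
  uint32_t Characteristics;
  uint32_t TimeDateStamp;
  uint16_t MajorVersion;
  uint16_t MinorVersion;
  uint32_t Type;
  uint32_t SizeOfData;
  uint32_t AddressOfRawData; // RVA of the payload in the loaded image
  uint32_t PointerToRawData; // file offset of the payload
};
static_assert(sizeof(DebugDirectoryEntry) == 28, "unexpected entry size");

constexpr uint32_t kImageDebugTypePogo = 13; // IMAGE_DEBUG_TYPE_POGO

// Hypothetical helper: the RVA and file offset are inputs because they are
// only known once the linker has laid out the output image.
DebugDirectoryEntry makePogoEntry(uint32_t payloadSize, uint32_t payloadRva,
                                  uint32_t payloadFileOffset) {
  DebugDirectoryEntry entry{};
  entry.Type = kImageDebugTypePogo;
  entry.SizeOfData = payloadSize;
  entry.AddressOfRawData = payloadRva;
  entry.PointerToRawData = payloadFileOffset;
  return entry;
}
```

An object file cannot fill in `AddressOfRawData` or `PointerToRawData` on its own, since both depend on the final section layout, which is the argument made in the comment above.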
I am just curious what is the purpose of this? Is it just for feature parity? Can you also investigate if MSVC emits other special things/sections in the binary in LTCG/PGO builds?
I believe they simply pass every PGO/LTO flag to the linker. This is not the case for clang -- the linker (be that ld, lld or anything else) doesn't know it's linking PGOed files. Regarding mixing and matching the compilers/linkers -- I just tested that it's sort of possible. This patch obviously works only when using the lld linker. It's possible to link clang object files (with PGO) with the MSVC linker. I tried simply passing a flag indicating LTO to the MSVC linker (while linking a clang object file without LTO) -- it complained that /LTCG is unnecessary, but put the magic string regardless. I will investigate further whether this trick also works with PGO, and I will try to incorporate it into this patch (simply pass some options in the driver when using the MSVC linker). This also suggests that the MSVC linker simply checks for CLI options being present to put a magic string.
Yes, feature parity is the reason this patch exists. I haven't noticed any other special things MSVC does when doing PGO/LTO; then again, I wasn't looking for them. I can dig a little more to see if there are any, but I would like to do it after this patch is finished.
I would like to benchmark
I think benchmarking If you have trouble building all this I can provide more detailed instructions, please let me know.
I can recommend the hyperfine tool when benchmarking. It runs the same command multiple times and does all the fancy things to make sure you are measuring correctly.
FWIW, no further objections from me on this one, but others may want to have another look (potentially also CC @MaskRay for the clang tests).
Gently pinging @aganea @MaskRay @HaohaiWen. If there are no objections to this patch, I would like to merge it
@@ -112,6 +112,8 @@ class MCTargetOptions {
// Whether or not to use full register names on PowerPC.
bool PPCUseFullRegisterNames : 1;

bool PgoInstrumentation = false;
Target options like this don't play well with (Thin)LTO, because they don't carry over naturally from the frontend compilation step to the backend compilation step, which LTO separates. Is there an existing global named metadata flag you can look for instead to control this debug info setting?
As far as I am aware, there isn't any global metadata flag I could fetch from within MC. Could you elaborate a bit more on when the current solution would cause problems? I am not that familiar with the inner workings of LTO.
Thanks to the review above mentioning LTO, I realized that this solution has a problem with LTO: using LTO + PGO/PGU will only emit the LTCG string in the binary. This is because right now the magic sections are only written during COFF object file emission. I will also add the emission of these sections in the AsmPrinter, so they should appear in the bitcode files, and using LTO + PGO/PGU will emit both the "LTCG" and "PGO/PGU" strings.
@@ -77,6 +77,12 @@ static unsigned char dosProgram[] = {
static_assert(sizeof(dosProgram) % 8 == 0,
"DOSProgram size must be multiple of 8");

static char ltcg[] = "LTCG";
constexpr char ...[]
constexpr/const make this internal linkage, making static unneeded
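A minimal sketch of what this suggestion could look like, reusing the variable names from the diff. The `static_assert` lines and the `isPgoMarkerSection` helper are hypothetical additions for illustration, not part of the patch.

```cpp
#include <string_view>

// At namespace scope, constexpr implies const, and a const variable that is
// not declared extern has internal linkage, so `static` adds nothing here.
constexpr char ltcg[] = "LTCG";
constexpr char pgi[] = "PGI";
constexpr char pgu[] = "PGU";
constexpr char pgiSectionName[] = ".pgi";
constexpr char pguSectionName[] = ".pgu";

// The arrays keep their terminating NUL, so sizeof counts one extra byte.
static_assert(sizeof(ltcg) == 5, "4 characters plus the terminating NUL");
static_assert(sizeof(pgi) == 4, "3 characters plus the terminating NUL");

// Hypothetical helper: consumers take a read-only view of the constants.
constexpr bool isPgoMarkerSection(std::string_view name) {
  return name == pgiSectionName || name == pguSectionName;
}
```

One consequence worth noting: the lambda in the diff takes `char sectionName[]`, and a `constexpr` array decays to `const char *`, so that parameter would need to become `const char *` or a string view type for the code to compile after this change.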
As I'm not a Windows developer, I defer to other reviewers' expertise on MSVC's PGO/LTO feature. However, to be honest, I'm unsure about the value of porting the strings given the large feature differences between Clang and MSVC on PGO and LTO. There are many PGO flavors. If we add this, there could be some inconsistency every time someone adds a new flavor of PGO and does not port this piece of code.
I had a more in-depth look at this. Overall I don't agree with the whole direction of this patch. I don't think it's wise for LLD to emit debug records/
@@ -1206,6 +1245,19 @@ void Writer::createMiscChunks() {
IMAGE_DLL_CHARACTERISTICS_EX_CET_COMPAT));
}

if (writeLTO) {
debugRecords.emplace_back(COFF::IMAGE_DEBUG_TYPE_POGO,
make<DebugDirStringChunk>(ltcg));
I didn’t mean to discourage you @mikolaj-pirog. But if you could come up with a proper structure here for IMAGE_DEBUG_TYPE_POGO, I think the PR would be acceptable.
> I didn’t mean to discourage you @mikolaj-pirog. But if you could come up with a proper structure here for IMAGE_DEBUG_TYPE_POGO, I think the PR would be acceptable.
No worry, you didn't discourage me, I appreciate each piece of feedback :) Just to be clear, would this patch be accepted if I manage to make lld emit the appropriate structure (like MSVC does) for the PGO/PGU/LTCG?
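For context on what a fuller IMAGE_DEBUG_TYPE_POGO payload might contain, here is a rough sketch. The layout is an assumption: the thread does not spell it out, and it is taken from what third-party PE parsing tools report rather than from official documentation, namely a 4-byte signature followed by entries of start RVA, size, and a NUL-terminated name padded to 4 bytes. `PogoEntry`, `putU32`, and `serializePogo` are hypothetical names.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// ASSUMED entry layout (not confirmed by the PR): RVA, size, section name.
struct PogoEntry {
  uint32_t startRva;
  uint32_t size;
  std::string name;
};

// Appends a 32-bit value in little-endian byte order.
static void putU32(std::vector<uint8_t> &out, uint32_t value) {
  for (int shift = 0; shift < 32; shift += 8)
    out.push_back(static_cast<uint8_t>((value >> shift) & 0xff));
}

// Serializes the assumed payload: signature first, then each entry with its
// name NUL-terminated and padded so the next entry starts 4-byte aligned.
std::vector<uint8_t> serializePogo(uint32_t signature,
                                   const std::vector<PogoEntry> &entries) {
  std::vector<uint8_t> out;
  putU32(out, signature);
  for (const PogoEntry &entry : entries) {
    putU32(out, entry.startRva);
    putU32(out, entry.size);
    out.insert(out.end(), entry.name.begin(), entry.name.end());
    out.push_back(0);
    while (out.size() % 4 != 0)
      out.push_back(0);
  }
  return out;
}
```

With no entries and the signature 0x4C544347, this produces the same four bytes (`G C T L`) that the patch's string chunk writes for "LTCG", so under this assumption the patch as posted would correspond to a payload with a signature and an empty entry list.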
I agree that it's a little awkward to port this behavior to clang, since MSVC does PGO/LTO differently: LTO is enabled by default, and it's impossible to do PGO without LTO. The value of this patch, as I saw it when creating it, is feature parity with MSVC, accepting the awkwardness.
This PR adds support for putting magic strings indicating PGO/LTO in the debug directory of Windows PE files. This is to make clang and lld behave like the MSVC compiler in this regard.
This needs a little background: the MSVC compiler puts magic strings ("PGI" for instrumentation, "PGU" for a binary built using instrumented data, "LTCG" for LTO builds) in the debug directory of Windows PE files. You can see these strings by using the `dumpbin` utility (`dumpbin /HEADERS a.exe`) on files built with MSVC and PGO/LTO on Windows; see the screenshots for `dumpbin` showing the debug dir of a binary file built with PGO/LTO.
This PR "ports" this behavior to clang and lld, so compiling and linking using lld with `-fprofile-generate` will result in a Windows PE file with the "PGI" string in the debug dir; for `-fprofile-use` and `-fprofile-sample-use` this will result in the "PGU" string; for `-flto` this will result in the "LTCG" string in the debug dir.
The implementation is split between the lld linker and the emission of COFF object files. The linker puts the magic strings in the debug directory; the problem is that it needs to know when to do that -- lld isn't aware of instrumentation, or of building based on profiling info. It knows only about linking of an LTOed binary. Naturally, lld has to somehow know that it needs to put the "PGI"/"PGU" string -- I have solved this by putting special sections (".pgi", ".pgu") in the COFF object files when they are compiled with instrumentation, or with data from instrumentation. lld checks for these sections and, based on their presence, puts the magic string in the debug directory of the Windows PE file.
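The linker-side decision described above can be modelled in a few lines. This is a simplified sketch under stated assumptions, not lld code: `InputObject`, `hasSection`, and `selectMagicStrings` are hypothetical names, and each input is reduced to a bitcode flag plus a list of section names.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one linker input; not an lld type.
struct InputObject {
  bool isBitcode;                        // true for LTO (bitcode) inputs
  std::vector<std::string> sectionNames; // sections of a native COFF object
};

// Returns true if any input carries a section with the given name.
static bool hasSection(const std::vector<InputObject> &inputs,
                       const std::string &name) {
  for (const InputObject &obj : inputs)
    for (const std::string &section : obj.sectionNames)
      if (section == name)
        return true;
  return false;
}

// Mirrors the precedence in the patch: "LTCG" whenever bitcode was linked,
// then "PGI" if a .pgi marker exists, otherwise "PGU" if a .pgu marker exists.
std::vector<std::string>
selectMagicStrings(const std::vector<InputObject> &inputs) {
  std::vector<std::string> magic;
  bool anyBitcode = false;
  for (const InputObject &obj : inputs)
    anyBitcode = anyBitcode || obj.isBitcode;
  if (anyBitcode)
    magic.push_back("LTCG");
  const bool writePgi = hasSection(inputs, ".pgi");
  const bool writePgu = !writePgi && hasSection(inputs, ".pgu");
  if (writePgi)
    magic.push_back("PGI");
  else if (writePgu)
    magic.push_back("PGU");
  return magic;
}
```

In this model, an input set containing both `.pgi` and `.pgu` markers yields only "PGI", and a set consisting purely of bitcode inputs with no marker sections yields only "LTCG", which matches the LTO limitation raised in the review thread.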