-
Notifications
You must be signed in to change notification settings - Fork 13.3k
[SHT_LLVM_FUNC_MAP][ObjectYaml]Introduce function address map section and emit dynamic instruction count(ObjectYaml part) #124332
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
2e0f4ff
d2b1895
5841112
cc11306
e8a1dce
d2d627e
970042e
3703e53
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1149,6 +1149,7 @@ enum : unsigned { | |
SHT_LLVM_OFFLOADING = 0x6fff4c0b, // LLVM device offloading data. | ||
SHT_LLVM_LTO = 0x6fff4c0c, // .llvm.lto for fat LTO. | ||
SHT_LLVM_JT_SIZES = 0x6fff4c0d, // LLVM jump tables sizes. | ||
SHT_LLVM_FUNC_MAP = 0x6fff4c0e, // LLVM function address map. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please also update https://llvm.org/docs/Extensions.html. |
||
// Android's experimental support for SHT_RELR sections. | ||
// https://android.googlesource.com/platform/bionic/+/b7feec74547f84559a1467aca02708ff61346d2a/libc/include/elf.h#512 | ||
SHT_ANDROID_RELR = 0x6fffff00, // Relocation entries; only offsets. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -195,6 +195,12 @@ struct PGOAnalysisMapEntry { | |
std::optional<std::vector<PGOBBEntry>> PGOBBEntries; | ||
}; | ||
|
||
struct FuncMapEntry { | ||
uint8_t Version; | ||
llvm::yaml::Hex64 Address; | ||
uint64_t DynamicInstCount; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If I'm not mistaken, not using There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right. This is just a count. So there is no point in using the hex encoding in Yaml. |
||
}; | ||
|
||
struct StackSizeEntry { | ||
llvm::yaml::Hex64 Address; | ||
llvm::yaml::Hex64 Size; | ||
|
@@ -229,6 +235,7 @@ struct Chunk { | |
DependentLibraries, | ||
CallGraphProfile, | ||
BBAddrMap, | ||
FuncMap, | ||
|
||
// Special chunks. | ||
SpecialChunksStart, | ||
|
@@ -355,6 +362,18 @@ struct BBAddrMapSection : Section { | |
} | ||
}; | ||
|
||
struct FuncMapSection : Section { | ||
std::optional<std::vector<FuncMapEntry>> Entries; | ||
|
||
FuncMapSection() : Section(ChunkKind::FuncMap) {} | ||
|
||
std::vector<std::pair<StringRef, bool>> getEntries() const override { | ||
return {{"Entries", Entries.has_value()}}; | ||
}; | ||
|
||
static bool classof(const Chunk *S) { return S->Kind == ChunkKind::FuncMap; } | ||
}; | ||
|
||
struct StackSizesSection : Section { | ||
std::optional<std::vector<StackSizeEntry>> Entries; | ||
|
||
|
@@ -762,6 +781,7 @@ bool shouldAllocateFileSpace(ArrayRef<ProgramHeader> Phdrs, | |
} // end namespace llvm | ||
|
||
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::StackSizeEntry) | ||
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::FuncMapEntry) | ||
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::BBAddrMapEntry) | ||
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::BBAddrMapEntry::BBEntry) | ||
LLVM_YAML_IS_SEQUENCE_VECTOR(llvm::ELFYAML::BBAddrMapEntry::BBRangeEntry) | ||
|
@@ -929,6 +949,10 @@ template <> struct MappingTraits<ELFYAML::StackSizeEntry> { | |
static void mapping(IO &IO, ELFYAML::StackSizeEntry &Rel); | ||
}; | ||
|
||
template <> struct MappingTraits<ELFYAML::FuncMapEntry> { | ||
static void mapping(IO &IO, ELFYAML::FuncMapEntry &E); | ||
}; | ||
|
||
template <> struct MappingTraits<ELFYAML::BBAddrMapEntry> { | ||
static void mapping(IO &IO, ELFYAML::BBAddrMapEntry &E); | ||
}; | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -287,6 +287,9 @@ template <class ELFT> class ELFState { | |||||
void writeSectionContent(Elf_Shdr &SHeader, | ||||||
const ELFYAML::BBAddrMapSection &Section, | ||||||
ContiguousBlobAccumulator &CBA); | ||||||
void writeSectionContent(Elf_Shdr &SHeader, | ||||||
const ELFYAML::FuncMapSection &Section, | ||||||
ContiguousBlobAccumulator &CBA); | ||||||
void writeSectionContent(Elf_Shdr &SHeader, | ||||||
const ELFYAML::HashSection &Section, | ||||||
ContiguousBlobAccumulator &CBA); | ||||||
|
@@ -894,6 +897,8 @@ void ELFState<ELFT>::initSectionHeaders(std::vector<Elf_Shdr> &SHeaders, | |||||
writeSectionContent(SHeader, *S, CBA); | ||||||
} else if (auto S = dyn_cast<ELFYAML::BBAddrMapSection>(Sec)) { | ||||||
writeSectionContent(SHeader, *S, CBA); | ||||||
} else if (auto S = dyn_cast<ELFYAML::FuncMapSection>(Sec)) { | ||||||
writeSectionContent(SHeader, *S, CBA); | ||||||
} else { | ||||||
llvm_unreachable("Unknown section type"); | ||||||
} | ||||||
|
@@ -1537,6 +1542,33 @@ void ELFState<ELFT>::writeSectionContent( | |||||
} | ||||||
} | ||||||
|
||||||
template <class ELFT> | ||||||
void ELFState<ELFT>::writeSectionContent(Elf_Shdr &SHeader, | ||||||
const ELFYAML::FuncMapSection &Section, | ||||||
ContiguousBlobAccumulator &CBA) { | ||||||
if (!Section.Entries) | ||||||
return; | ||||||
|
||||||
for (const auto &[Idx, E] : llvm::enumerate(*Section.Entries)) { | ||||||
unsigned Size = 0; | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The size setting code in this loop could be dramatically simplified for now, now that we're using fixed size entries. Outside the loop, you could just do something like:
|
||||||
if (Section.Type == llvm::ELF::SHT_LLVM_FUNC_MAP) { | ||||||
if (E.Version > 1) | ||||||
WithColor::warning() << "unsupported SHT_LLVM_FUNC_MAP version: " | ||||||
<< static_cast<int>(E.Version) | ||||||
<< "; encoding using the most recent version"; | ||||||
CBA.write(E.Version); | ||||||
Size += 1; | ||||||
} | ||||||
CBA.write<uintX_t>(E.Address, ELFT::Endianness); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Correct me if I'm wrong, but shouldn't this be using the Elf_Addr type, since it's representing an address? They might amount to the same thing, but it conveys the meaning better. (NB: I haven't tested it, so this might not work as desired)
Suggested change
|
||||||
Size += sizeof(uintX_t); | ||||||
CBA.write<uint64_t>(E.DynamicInstCount, ELFT::Endianness); | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I wonder if the DynamicInstCount should be architecture-dependent type, so that on 32-bit architectures, it only takes up 4 bytes? There's probably an argument actually that it should always just be 4 bytes, since you'd be hard pressed to need more than what 4 bytes can represent in instruction counts. |
||||||
Size += 8; | ||||||
|
||||||
SHeader.sh_size += Size; | ||||||
SHeader.sh_entsize = Size; | ||||||
} | ||||||
} | ||||||
|
||||||
template <class ELFT> | ||||||
void ELFState<ELFT>::writeSectionContent( | ||||||
Elf_Shdr &SHeader, const ELFYAML::LinkerOptionsSection &Section, | ||||||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,134 @@ | ||
## Check how obj2yaml produces YAML .llvm_func_map descriptions. | ||
|
||
## Check that obj2yaml uses the "Entries" tag to describe an .llvm_func_map section. | ||
|
||
# RUN: yaml2obj --docnum=1 %s -o %t1 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You probably should have a 32-bit version of this test case too, since you're writing an address, which has a different size on 32-bit architectures. |
||
# RUN: obj2yaml %t1 | FileCheck %s --check-prefix=VALID | ||
|
||
# VALID: --- !ELF | ||
# VALID-NEXT: FileHeader: | ||
# VALID-NEXT: Class: ELFCLASS64 | ||
# VALID-NEXT: Data: ELFDATA2LSB | ||
# VALID-NEXT: Type: ET_EXEC | ||
# VALID-NEXT: Sections: | ||
# VALID-NEXT: - Name: .llvm_func_map | ||
# VALID-NEXT: Type: SHT_LLVM_FUNC_MAP | ||
# VALID-NEXT: EntSize: 0x11 | ||
# VALID-NEXT: Entries: | ||
# VALID-NEXT: - Version: 1 | ||
## The 'Address' field is omitted when it's zero. | ||
# VALID-NEXT: DynInstCnt: 16 | ||
## The 'DynInstCnt' field is omitted when it's zero. | ||
# VALID-NEXT: - Version: 1 | ||
# VALID-NEXT: Address: 0x1 | ||
# VALID-NEXT: - Version: 1 | ||
# VALID-NEXT: Address: 0xFFFFFFFFFFFFFFF1 | ||
# VALID-NEXT: DynInstCnt: 100001 | ||
|
||
--- !ELF | ||
FileHeader: | ||
Class: ELFCLASS64 | ||
Data: ELFDATA2LSB | ||
Type: ET_EXEC | ||
Sections: | ||
- Name: .llvm_func_map | ||
Type: SHT_LLVM_FUNC_MAP | ||
ShSize: [[SIZE=<none>]] | ||
Entries: | ||
- Version: 1 | ||
Address: 0x0 | ||
DynInstCnt: 16 | ||
- Version: 1 | ||
Address: 0x1 | ||
DynInstCnt: 0 | ||
- Version: 1 | ||
Address: 0xFFFFFFFFFFFFFFF1 | ||
DynInstCnt: 100001 | ||
|
||
## Check obj2yaml can dump empty .llvm_func_map sections. | ||
|
||
# RUN: yaml2obj --docnum=2 %s -o %t2 | ||
# RUN: obj2yaml %t2 | FileCheck %s --check-prefix=EMPTY | ||
|
||
# EMPTY: --- !ELF | ||
# EMPTY-NEXT: FileHeader: | ||
# EMPTY-NEXT: Class: ELFCLASS64 | ||
# EMPTY-NEXT: Data: ELFDATA2LSB | ||
# EMPTY-NEXT: Type: ET_EXEC | ||
# EMPTY-NEXT: Sections: | ||
# EMPTY-NEXT: - Name: .llvm_func_map | ||
# EMPTY-NEXT: Type: SHT_LLVM_FUNC_MAP | ||
# EMPTY-NOT: Content: | ||
|
||
--- !ELF | ||
FileHeader: | ||
Class: ELFCLASS64 | ||
Data: ELFDATA2LSB | ||
Type: ET_EXEC | ||
Sections: | ||
- Name: .llvm_func_map | ||
Type: SHT_LLVM_FUNC_MAP | ||
Content: "" | ||
|
||
## Check obj2yaml can dump multiple .llvm_func_map sections. | ||
|
||
# RUN: yaml2obj --docnum=3 %s -o %t3 | ||
# RUN: obj2yaml %t3 | FileCheck %s --check-prefix=MULTI | ||
|
||
# MULTI: --- !ELF | ||
# MULTI-NEXT: FileHeader: | ||
# MULTI-NEXT: Class: ELFCLASS64 | ||
# MULTI-NEXT: Data: ELFDATA2LSB | ||
# MULTI-NEXT: Type: ET_EXEC | ||
# MULTI-NEXT: Sections: | ||
# MULTI-NEXT: - Name: .llvm_func_map | ||
# MULTI-NEXT: Type: SHT_LLVM_FUNC_MAP | ||
# MULTI-NEXT: EntSize: 0x11 | ||
# MULTI-NEXT: Entries: | ||
# MULTI-NEXT: - Version: 1 | ||
# MULTI-NEXT: Address: 0x2 | ||
# MULTI-NEXT: DynInstCnt: 3 | ||
# MULTI-NEXT: - Name: '.llvm_func_map (1)' | ||
# MULTI-NEXT: Type: SHT_LLVM_FUNC_MAP | ||
# MULTI-NEXT: EntSize: 0x11 | ||
# MULTI-NEXT: Entries: | ||
# MULTI-NEXT: - Version: 1 | ||
# MULTI-NEXT: Address: 0xA | ||
# MULTI-NEXT: DynInstCnt: 100 | ||
|
||
--- !ELF | ||
FileHeader: | ||
Class: ELFCLASS64 | ||
Data: ELFDATA2LSB | ||
Type: ET_EXEC | ||
Sections: | ||
- Name: .llvm_func_map | ||
Type: SHT_LLVM_FUNC_MAP | ||
Entries: | ||
- Version: 1 | ||
Address: 0x2 | ||
DynInstCnt: 3 | ||
- Name: '.llvm_func_map (1)' | ||
Type: SHT_LLVM_FUNC_MAP | ||
Entries: | ||
- Version: 1 | ||
Address: 0xA | ||
DynInstCnt: 100 | ||
|
||
## Check that obj2yaml uses the "Content" tag to describe an .llvm_func_map section | ||
## when it can't extract the entries, for example, when the section is truncated. | ||
|
||
# RUN: yaml2obj --docnum=1 -DSIZE=0x8 %s -o %t4 | ||
# RUN: obj2yaml %t4 | FileCheck %s --check-prefixes=TRUNCATED,INVALID | ||
|
||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: double blank line |
||
# INVALID: --- !ELF | ||
# INVALID-NEXT: FileHeader: | ||
# INVALID-NEXT: Class: ELFCLASS64 | ||
# INVALID-NEXT: Data: ELFDATA2LSB | ||
# INVALID-NEXT: Type: ET_EXEC | ||
# INVALID-NEXT: Sections: | ||
# INVALID-NEXT: - Name: .llvm_func_map | ||
# INVALID-NEXT: Type: SHT_LLVM_FUNC_MAP | ||
# INVALID-NEXT: EntSize: 0x11 | ||
# TRUNCATED-NEXT: Content: '{{([[:xdigit:]]{16})}}'{{$}} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd be tempted to explicitly check the expected 8 bytes here, to show that it's not producing garbage here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced we want a version byte for every single entry. That feels wasteful when there could be thousands of functions. However, we also can't just have a single version byte at the start of the section, because then the section becomes ambiguous when concatenated together by the linker from two different objects.
Would a better idea be to have a header, consisting of a size (or count of function entries in this table) and a version number (for now), followed by all the function address/count entries? A section might consist of one or more of these header + address/count blocks, to allow for this concatenation. The end of a block (and therefore start of the next block) is identified via the size/count member of the header.
Another idea, if you adopt the header + body approach, is to split the entries into separate function list/data, i.e. you'd have something like the following:
This would be useful for reducing the amount of data that needs to be read to find out information for a specific function. However, it can only be used if the data is fixed size for all entries within a block (i.e. no ULEBs), because accessing the data requires finding the right function and then using its index in the function address list to jump to the right block of data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jh7370 Does the header approach require adding custom merging support in the linker?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's the same approach as taken by many DWARF sections. Pull the common stuff into a table header and then put the data in the table body, with a size in the header to indicate how much data there is (an alternative approach would be some kind of end marker in the data, but that has issues under some conditions, depending on the values used).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds a great idea, I will give a try! Thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. So we will get one section per module with multiple function entries and one header. Then the linker will simply concatenate these sections. So we may end up with multiple headers. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jh7370 Sorry for late reply and thank you for the detailed clarification, that was super helpful!
I've now done the single section approach you suggested, it does work to emit the good data!(IIUC, the first suggestion might rely on features that aren’t ready yet)
Then to understand the tradeoffs, I ran some experiments to compare the original design(duplicated version filed) vs the one single section design. I ran them on one of our top services(big size binary, contains 1M+ functions), I noticed one significant diff in finial binary's section size.
It's 2~3X more size, which I think that's due to the dead entries(missing gc-sections). For other overheads, I think that's not a significant factor for our system. For build time, as for our major services, the build time could take 30mins+ time, the extra linking time for the section is too small to measure. And the disk/network overhead is fine for the small intermediate elf obj size increase. But for the finial binary size, given we could extend more data, that means for each additional data, it would cost 2 ~ 3X more(dead entry) size, which I feel could be a problem for long run. Given this, I'm leaning towards the original design. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem. It's important to get these things right! On the plus side, as long as the first part of an entry is the version byte, if we come up with a better format in the future, all we have to do is bump the version number and ensure the header or first entry has that version byte first again, i.e. the header approach (should you decide to switch in the future for whatever reason) is interchangeable with the version-per-entry approach, assuming you have the correct version byte, since the version-per-entry approach is effectively "header per entry" where the version field is the sole component of the header.
Another consideration: is it important to be able to traverse this structure quickly to find the appropriate entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it!
Is it related to your earlier comment about "using a fixed size instead of the ULEBs"? That sounds good. As we could extend more data in future, it should be beneficial to quickly look up the entry. I will change to use a fixed size so that we can skip parsing the unused data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With fixed-sized entries, you can then use a binary search algorithm to search the section for a specific address, assuming that the entries are in address order. I think this will be guaranteed by the SHF_LINK_ORDER flag. You might want to set the seciton's
sh_entsize
appropriately too.Some related thoughts:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate the suggestion! Updated to set the entry size.
I verified that on a local test, right, the linker set
sh_entsize
to 0 if the entry size is not fixed size(combine different entry sizes from two versions)