-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[DebugInfo] Add fast path for parsing DW_TAG_compile_unit abbrevs #108757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
In `DWARFDebugInfoEntry::extractFast`, we were parsing all abbreviation declarations belonging to the compilation unit by calling the `getAbbreviations` method. This resulted in a large overhead (mostly vector resizes and ULEB128 parsing) in cases where only the Compilation Unit DIE ended up being used. As `DW_TAG_compile_unit` typically comes first in the abbreviation table, this commit adds a fast-path function (`tryExtractCUAbbrevFast`) which attempts to read only the first abbreviation, without constructing a full `DWARFAbbreviationDeclarationSet`. This significantly speeds up `ld64.lld`'s generation of `N_OSO` stab information (which needs `DW_AT_name` from the Compilation Unit DIE). The following measurement was taken on an M1 Mac Mini linking Chromium with full debug info: x: before +: after N Min Max Median Avg Stddev x 15 3.136759 4.390569 3.5234511 3.6028554 0.38726359 + 15 2.7222703 3.5872169 3.237128 3.1830136 0.31002649 Difference at 95.0% confidence -0.419842 +/- 0.26232 -11.653% +/- 7.28088% (Student's t, pooled s = 0.350777)
@llvm/pr-subscribers-debuginfo Author: Daniel Bertalan (BertalanD) ChangesIn As This significantly speeds up x: before
x 15 3.136759 4.390569 3.5234511 3.6028554 0.38726359
Full diff: https://github.com/llvm/llvm-project/pull/108757.diff 5 Files Affected:
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h
index 6439827ef70f0f..18555bafdc1f01 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h
@@ -62,6 +62,7 @@ class DWARFDebugAbbrev {
mutable DWARFAbbreviationDeclarationSetMap AbbrDeclSets;
mutable DWARFAbbreviationDeclarationSetMap::const_iterator PrevAbbrOffsetPos;
mutable std::optional<DataExtractor> Data;
+ mutable std::map<uint64_t, DWARFAbbreviationDeclaration> CUAbbrevs;
public:
DWARFDebugAbbrev(DataExtractor Data);
@@ -69,6 +70,9 @@ class DWARFDebugAbbrev {
Expected<const DWARFAbbreviationDeclarationSet *>
getAbbreviationDeclarationSet(uint64_t CUAbbrOffset) const;
+ Expected<const DWARFAbbreviationDeclaration *>
+ tryExtractCUAbbrevFast(uint64_t CUAbbrOffset) const;
+
void dump(raw_ostream &OS) const;
Error parse() const;
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
index 80c27aea893123..87f8742fd9d9f0 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
@@ -419,6 +419,10 @@ class DWARFUnit {
uint64_t getAbbreviationsOffset() const { return Header.getAbbrOffset(); }
+ /// Extracts only the abbreviation declaration with code 1, which is
+ /// typically the compile unit DIE (DW_TAG_compile_unit).
+ const DWARFAbbreviationDeclaration *tryExtractCUAbbrevFast() const;
+
const DWARFAbbreviationDeclarationSet *getAbbreviations() const;
static bool isMatchingUnitTypeAndTag(uint8_t UnitType, dwarf::Tag Tag) {
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp b/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp
index 85959ecc5e17f1..7944fc881e6bd1 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp
@@ -168,3 +168,21 @@ DWARFDebugAbbrev::getAbbreviationDeclarationSet(uint64_t CUAbbrOffset) const {
.first;
return &PrevAbbrOffsetPos->second;
}
+
+Expected<const DWARFAbbreviationDeclaration *>
+DWARFDebugAbbrev::tryExtractCUAbbrevFast(uint64_t CUAbbrOffset) const {
+ if (auto AbbrevDecl = CUAbbrevs.find(CUAbbrOffset);
+ AbbrevDecl != CUAbbrevs.end())
+ return &AbbrevDecl->second;
+
+ DWARFAbbreviationDeclaration Decl;
+ uint64_t Offset = CUAbbrOffset;
+ Expected<DWARFAbbreviationDeclaration::ExtractState> ES =
+ Decl.extract(*Data, &Offset);
+ if (!ES)
+ return ES.takeError();
+ if (Decl.getCode() != 1)
+ return nullptr;
+
+ return &(CUAbbrevs[CUAbbrOffset] = std::move(Decl));
+}
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp b/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
index 4f0a6d96ace9e2..030faad13f46f6 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
@@ -34,36 +34,49 @@ bool DWARFDebugInfoEntry::extractFast(const DWARFUnit &U, uint64_t *OffsetPtr,
return false;
}
assert(DebugInfoData.isValidOffset(UEndOffset - 1));
+ AbbrevDecl = nullptr;
+
uint64_t AbbrCode = DebugInfoData.getULEB128(OffsetPtr);
if (0 == AbbrCode) {
// NULL debug tag entry.
- AbbrevDecl = nullptr;
return true;
}
- const auto *AbbrevSet = U.getAbbreviations();
- if (!AbbrevSet) {
- U.getContext().getWarningHandler()(
- createStringError(errc::invalid_argument,
- "DWARF unit at offset 0x%8.8" PRIx64 " "
- "contains invalid abbreviation set offset 0x%" PRIx64,
- U.getOffset(), U.getAbbreviationsOffset()));
- // Restore the original offset.
- *OffsetPtr = Offset;
- return false;
+
+ // Fast path: parsing the entire abbreviation table is wasteful if we only
+ // need the unit DIE (typically AbbrCode == 1).
+ if (1 == AbbrCode) {
+ AbbrevDecl = U.tryExtractCUAbbrevFast();
+ assert(!AbbrevDecl || AbbrevDecl->getCode() == AbbrCode);
}
- AbbrevDecl = AbbrevSet->getAbbreviationDeclaration(AbbrCode);
+
if (!AbbrevDecl) {
- U.getContext().getWarningHandler()(
- createStringError(errc::invalid_argument,
- "DWARF unit at offset 0x%8.8" PRIx64 " "
- "contains invalid abbreviation %" PRIu64 " at "
- "offset 0x%8.8" PRIx64 ", valid abbreviations are %s",
- U.getOffset(), AbbrCode, *OffsetPtr,
- AbbrevSet->getCodeRange().c_str()));
- // Restore the original offset.
- *OffsetPtr = Offset;
- return false;
+ const auto *AbbrevSet = U.getAbbreviations();
+ if (!AbbrevSet) {
+ U.getContext().getWarningHandler()(createStringError(
+ errc::invalid_argument,
+ "DWARF unit at offset 0x%8.8" PRIx64 " "
+ "contains invalid abbreviation set offset 0x%" PRIx64,
+ U.getOffset(), U.getAbbreviationsOffset()));
+ // Restore the original offset.
+ *OffsetPtr = Offset;
+ return false;
+ }
+ AbbrevDecl = AbbrevSet->getAbbreviationDeclaration(AbbrCode);
+
+ if (!AbbrevDecl) {
+ U.getContext().getWarningHandler()(createStringError(
+ errc::invalid_argument,
+ "DWARF unit at offset 0x%8.8" PRIx64 " "
+ "contains invalid abbreviation %" PRIu64 " at "
+ "offset 0x%8.8" PRIx64 ", valid abbreviations are %s",
+ U.getOffset(), AbbrCode, *OffsetPtr,
+ AbbrevSet->getCodeRange().c_str()));
+ // Restore the original offset.
+ *OffsetPtr = Offset;
+ return false;
+ }
}
+
// See if all attributes in this DIE have fixed byte sizes. If so, we can
// just add this size to the offset to skip to the next DIE.
if (std::optional<size_t> FixedSize =
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
index bdd04b00f557bd..dcf323525b10ee 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
@@ -1051,6 +1051,17 @@ DWARFUnit::getLastChildEntry(const DWARFDebugInfoEntry *Die) const {
return nullptr;
}
+const DWARFAbbreviationDeclaration *DWARFUnit::tryExtractCUAbbrevFast() const {
+ Expected<const DWARFAbbreviationDeclaration *> AbbrevOrError =
+ Abbrev->tryExtractCUAbbrevFast(getAbbreviationsOffset());
+ if (!AbbrevOrError) {
+ // FIXME: We should propagate this error upwards.
+ consumeError(AbbrevOrError.takeError());
+ return nullptr;
+ }
+ return *AbbrevOrError;
+}
+
const DWARFAbbreviationDeclarationSet *DWARFUnit::getAbbreviations() const {
if (!Abbrevs) {
Expected<const DWARFAbbreviationDeclarationSet *> AbbrevsOrError =
|
@llvm/pr-subscribers-lld-macho Author: Daniel Bertalan (BertalanD) ChangesIn As This significantly speeds up x: before
x 15 3.136759 4.390569 3.5234511 3.6028554 0.38726359
Full diff: https://github.com/llvm/llvm-project/pull/108757.diff 5 Files Affected:
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h
index 6439827ef70f0f..18555bafdc1f01 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFDebugAbbrev.h
@@ -62,6 +62,7 @@ class DWARFDebugAbbrev {
mutable DWARFAbbreviationDeclarationSetMap AbbrDeclSets;
mutable DWARFAbbreviationDeclarationSetMap::const_iterator PrevAbbrOffsetPos;
mutable std::optional<DataExtractor> Data;
+ mutable std::map<uint64_t, DWARFAbbreviationDeclaration> CUAbbrevs;
public:
DWARFDebugAbbrev(DataExtractor Data);
@@ -69,6 +70,9 @@ class DWARFDebugAbbrev {
Expected<const DWARFAbbreviationDeclarationSet *>
getAbbreviationDeclarationSet(uint64_t CUAbbrOffset) const;
+ Expected<const DWARFAbbreviationDeclaration *>
+ tryExtractCUAbbrevFast(uint64_t CUAbbrOffset) const;
+
void dump(raw_ostream &OS) const;
Error parse() const;
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
index 80c27aea893123..87f8742fd9d9f0 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFUnit.h
@@ -419,6 +419,10 @@ class DWARFUnit {
uint64_t getAbbreviationsOffset() const { return Header.getAbbrOffset(); }
+ /// Extracts only the abbreviation declaration with code 1, which is
+ /// typically the compile unit DIE (DW_TAG_compile_unit).
+ const DWARFAbbreviationDeclaration *tryExtractCUAbbrevFast() const;
+
const DWARFAbbreviationDeclarationSet *getAbbreviations() const;
static bool isMatchingUnitTypeAndTag(uint8_t UnitType, dwarf::Tag Tag) {
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp b/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp
index 85959ecc5e17f1..7944fc881e6bd1 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFDebugAbbrev.cpp
@@ -168,3 +168,21 @@ DWARFDebugAbbrev::getAbbreviationDeclarationSet(uint64_t CUAbbrOffset) const {
.first;
return &PrevAbbrOffsetPos->second;
}
+
+Expected<const DWARFAbbreviationDeclaration *>
+DWARFDebugAbbrev::tryExtractCUAbbrevFast(uint64_t CUAbbrOffset) const {
+ if (auto AbbrevDecl = CUAbbrevs.find(CUAbbrOffset);
+ AbbrevDecl != CUAbbrevs.end())
+ return &AbbrevDecl->second;
+
+ DWARFAbbreviationDeclaration Decl;
+ uint64_t Offset = CUAbbrOffset;
+ Expected<DWARFAbbreviationDeclaration::ExtractState> ES =
+ Decl.extract(*Data, &Offset);
+ if (!ES)
+ return ES.takeError();
+ if (Decl.getCode() != 1)
+ return nullptr;
+
+ return &(CUAbbrevs[CUAbbrOffset] = std::move(Decl));
+}
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp b/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
index 4f0a6d96ace9e2..030faad13f46f6 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFDebugInfoEntry.cpp
@@ -34,36 +34,49 @@ bool DWARFDebugInfoEntry::extractFast(const DWARFUnit &U, uint64_t *OffsetPtr,
return false;
}
assert(DebugInfoData.isValidOffset(UEndOffset - 1));
+ AbbrevDecl = nullptr;
+
uint64_t AbbrCode = DebugInfoData.getULEB128(OffsetPtr);
if (0 == AbbrCode) {
// NULL debug tag entry.
- AbbrevDecl = nullptr;
return true;
}
- const auto *AbbrevSet = U.getAbbreviations();
- if (!AbbrevSet) {
- U.getContext().getWarningHandler()(
- createStringError(errc::invalid_argument,
- "DWARF unit at offset 0x%8.8" PRIx64 " "
- "contains invalid abbreviation set offset 0x%" PRIx64,
- U.getOffset(), U.getAbbreviationsOffset()));
- // Restore the original offset.
- *OffsetPtr = Offset;
- return false;
+
+ // Fast path: parsing the entire abbreviation table is wasteful if we only
+ // need the unit DIE (typically AbbrCode == 1).
+ if (1 == AbbrCode) {
+ AbbrevDecl = U.tryExtractCUAbbrevFast();
+ assert(!AbbrevDecl || AbbrevDecl->getCode() == AbbrCode);
}
- AbbrevDecl = AbbrevSet->getAbbreviationDeclaration(AbbrCode);
+
if (!AbbrevDecl) {
- U.getContext().getWarningHandler()(
- createStringError(errc::invalid_argument,
- "DWARF unit at offset 0x%8.8" PRIx64 " "
- "contains invalid abbreviation %" PRIu64 " at "
- "offset 0x%8.8" PRIx64 ", valid abbreviations are %s",
- U.getOffset(), AbbrCode, *OffsetPtr,
- AbbrevSet->getCodeRange().c_str()));
- // Restore the original offset.
- *OffsetPtr = Offset;
- return false;
+ const auto *AbbrevSet = U.getAbbreviations();
+ if (!AbbrevSet) {
+ U.getContext().getWarningHandler()(createStringError(
+ errc::invalid_argument,
+ "DWARF unit at offset 0x%8.8" PRIx64 " "
+ "contains invalid abbreviation set offset 0x%" PRIx64,
+ U.getOffset(), U.getAbbreviationsOffset()));
+ // Restore the original offset.
+ *OffsetPtr = Offset;
+ return false;
+ }
+ AbbrevDecl = AbbrevSet->getAbbreviationDeclaration(AbbrCode);
+
+ if (!AbbrevDecl) {
+ U.getContext().getWarningHandler()(createStringError(
+ errc::invalid_argument,
+ "DWARF unit at offset 0x%8.8" PRIx64 " "
+ "contains invalid abbreviation %" PRIu64 " at "
+ "offset 0x%8.8" PRIx64 ", valid abbreviations are %s",
+ U.getOffset(), AbbrCode, *OffsetPtr,
+ AbbrevSet->getCodeRange().c_str()));
+ // Restore the original offset.
+ *OffsetPtr = Offset;
+ return false;
+ }
}
+
// See if all attributes in this DIE have fixed byte sizes. If so, we can
// just add this size to the offset to skip to the next DIE.
if (std::optional<size_t> FixedSize =
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
index bdd04b00f557bd..dcf323525b10ee 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFUnit.cpp
@@ -1051,6 +1051,17 @@ DWARFUnit::getLastChildEntry(const DWARFDebugInfoEntry *Die) const {
return nullptr;
}
+const DWARFAbbreviationDeclaration *DWARFUnit::tryExtractCUAbbrevFast() const {
+ Expected<const DWARFAbbreviationDeclaration *> AbbrevOrError =
+ Abbrev->tryExtractCUAbbrevFast(getAbbreviationsOffset());
+ if (!AbbrevOrError) {
+ // FIXME: We should propagate this error upwards.
+ consumeError(AbbrevOrError.takeError());
+ return nullptr;
+ }
+ return *AbbrevOrError;
+}
+
const DWARFAbbreviationDeclarationSet *DWARFUnit::getAbbreviations() const {
if (!Abbrevs) {
Expected<const DWARFAbbreviationDeclarationSet *> AbbrevsOrError =
|
|
||
// Fast path: parsing the entire abbreviation table is wasteful if we only | ||
// need the unit DIE (typically AbbrCode == 1). | ||
if (1 == AbbrCode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- minor, we don't usually do these "constant on the LHS" style comparison in LLVM, I think?
- The abbreviation numbers in the abbrev table are arbitrary/not ordered - so we probably shouldn't be needing/trying to check for the value is 1.
Could this either generalize this to check the first abbrev in the table, regardless of numbering - or change abbrev parsing to be lazy in general? (like parse as much of the table as is needed to find the number - oh, I guess we do have some optimizations that only fire if the abbrev numbering is strictly increasing (doesn't have to start at 1, doesn't have to increase by 1 each time - but so long as it's always increasing we can still binary search))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with David here, this is seems like an overfit. Both suggestions make sense to me, either check the first entry in the table (which seems to be what you're doing anyway) or make abbreviation parsing lazier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or change abbrev parsing to be lazy in general?
I have considered this, but this would likely pessimize the general case.
Profiling shows that the majority of ld64.lld
's DWARF parsing time is split evenly between ULEB128 parsing and appending to the DWARFAbbreviationDeclaration::attribute_specs
vector.
As far as I can tell, everything is ULEB128-encoded in the abbrev table, so if we do it lazily, we'll end up constantly re-parsing the abbrev declarations that come before the one we're looking for. So we'd parse these numbers O(n^2)
times instead of O(n)
. I guess we could do a first pass where we only note down the begin offsets of the abbreviations, but that would bring us back to the situation with the large number of vector appends (although the count is |abbrevs|
, not |attributes|
, that might be better)
And if we do need everything, the total cost of building e.g. an std::map
would end up being larger than the vector appends' cost. For the use cases where everything is needed (LLDB, dsymutil, etc.), constructing the single sorted vector upfront (i.e. what DWARFAbbreviationDeclarationSet
does) is faster than the above approach.
Could this either generalize this to check the first abbrev in the table
Do you mean always trying to parse the first abbreviation (so no AbbrCode == 1
check), and checking afterwards if the code happens to be what we need? That sounds okay to me, since after the first call, the value is cached, so its cost would be a single std::map
lookup.
Just out of curiosity, how common is the first abbreviation code not being 1 or code 1 not being DW_TAG_compile_unit
? All C++/Swift/Rust programs I tested hit my fast path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(re 1: Agreed, it's not common in LLVM, but if (0 == AbbrCode) {
is just 7 lines above in this file. So it does match local style.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or change abbrev parsing to be lazy in general?
I have considered this, but this would likely pessimize the general case.
Ah, sorry, not /that/ lazy. Like we could store the existing vector of abbrevs, and an offset to past the end of the last abbrev we parsed (or some sentinel if we reached the end of the abbrev list, with its 0 marker). Then if an abbrev number is requested that isn't in the list already, we parse abbrevs (& put them in the vector, etc) until we find the one we're looking for.
But I think the only place I'm thinking of that'd benefit from lazy abbrev parsing is just for the CU abbrev too anyway (building an address lookup table when the DWARF doesn't include .debug_aranges - currently that involves parsing all the abbrevs, and should only need the CU's abbrev too)
& also there's some code in llvm-dwp that tries to parse just the first CU to access the dwo_id for DWARFv4. Though I'm not sure that code's using libDebugInfoDWARF at all, since it mostly doesn't need to parse DWARF.
I guess it'd help to be more generalized laziness would handle cases where the CU DIE abbrev isn't the first one (like with type units, but those aren't used on MacOS, so aren't relevant to ld64 perf).
Why is ld64 reading the CUs anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(will reflect on the other bits later):
Why is ld64 reading the CUs anyway?
For emitting N_OSO
stabs; which indicate the source file names of symbols. We need to read DW_AT_source_dir
for this, see ObjFile::sourceFile
in LLD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One generalization would be to try symbolizing (llvm-symbolizer) where the abbrev parsing probably isn't /as/ large a part of the profile, but probably it shows up and lazy CU abbrev parsing probably helps there too?
|
||
// Fast path: parsing the entire abbreviation table is wasteful if we only | ||
// need the unit DIE (typically AbbrCode == 1). | ||
if (1 == AbbrCode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with David here, this is seems like an overfit. Both suggestions make sense to me, either check the first entry in the table (which seems to be what you're doing anyway) or make abbreviation parsing lazier.
... but still drop it on the floor in the end, as we want to re-try the full DIE parsing.
I don't have data, but things to try would be: enabling type units (a type unit might precede the compile unit); enabling split DWARF (would have a skeleton unit). I agree that a compilation without these features is pretty sure to have the CU show up first. |
In
DWARFDebugInfoEntry::extractFast
, we were parsing all abbreviation declarations belonging to the compilation unit by calling thegetAbbreviations
method. This resulted in a large overhead (mostly vector resizes and ULEB128 parsing) in cases where only the Compilation Unit DIE ended up being used.As
DW_TAG_compile_unit
typically comes first in the abbreviation table, this commit adds a fast-path function (tryExtractCUAbbrevFast
) which attempts to read only the first abbreviation, without constructing a fullDWARFAbbreviationDeclarationSet
.This significantly speeds up
ld64.lld
's generation ofN_OSO
stab information (which needsDW_AT_name
from the Compilation Unit DIE). The following measurement was taken on an M1 Mac Mini linking Chromium with full debug info: