[clang] Fix the crash when dumping deserialized decls #133395

@@ -27,6 +27,8 @@ class MacroInfo;
 class Module;
 class SourceLocation;

+// IMPORTANT: when you add a new interface to this class, please update the
+// DelegatingDeserializationListener in FrontendAction.cpp
Review comment: Future idea: maybe we could move the
 class ASTDeserializationListener {
 public:
   virtual ~ASTDeserializationListener();
@@ -57,6 +59,8 @@ class ASTDeserializationListener {
   /// A module import was read from the AST file.
   virtual void ModuleImportRead(serialization::SubmoduleID ID,
                                 SourceLocation ImportLoc) {}
+  /// The deserialization of the AST file was finished.
Review comment: Could we document that causing more serialization in the callbacks may have complicated side effects and that implementors should be careful? I believe it's a very non-trivial detail that folks reading the code should be warned about.
+  virtual void FinishedDeserializing() {}
Review comment: We should also have
Reply: Oh, I guess the interface name might be misleading -- it doesn't fully align with
Reply: Yes, I saw that. But in any case you probably want the other callback too.
 };
 }

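The reviewers' point that deserializations start and finish many times, and may nest, can be illustrated with a small model. This is only a sketch under stated assumptions: `MockReader`, `FinishListener`, `begin()` and `end()` are hypothetical names invented for this example and are not Clang APIs. The nesting counter is one plausible way to decide when an outermost deserialization has ended; it is not taken from the patch.

```cpp
// Hypothetical listener that only counts "finished" notifications.
struct FinishListener {
  int FinishedCount = 0;
  void FinishedDeserializing() { ++FinishedCount; }
};

// Hypothetical reader that tracks how deeply deserialization is nested.
class MockReader {
  int Depth = 0;
  FinishListener *Listener;

public:
  explicit MockReader(FinishListener *Listener) : Listener(Listener) {}

  // Called when a (possibly nested) deserialization starts.
  void begin() { ++Depth; }

  // Called when a deserialization ends; the listener is notified only
  // when the outermost one completes.
  void end() {
    if (--Depth == 0 && Listener)
      Listener->FinishedDeserializing();
  }
};

// Runs one nested deserialization followed by one independent one and
// returns how many times the listener was notified.
int countNotifications() {
  FinishListener Listener;
  MockReader Reader(&Listener);
  Reader.begin(); // outer deserialization
  Reader.begin(); // nested deserialization
  Reader.end();   // nested one ends: no notification yet
  Reader.end();   // outer one ends: first notification
  Reader.begin(); // a later, independent deserialization
  Reader.end();   // second notification
  return Listener.FinishedCount;
}
```

In this model a listener can receive the notification several times during one run, so an implementation should not treat it as a one-time "everything is loaded" signal.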

@@ -76,6 +76,10 @@ class DelegatingDeserializationListener : public ASTDeserializationListener {
     if (Previous)
       Previous->IdentifierRead(ID, II);
   }
+  void MacroRead(serialization::MacroID ID, MacroInfo *MI) override {
+    if (Previous)
+      Previous->MacroRead(ID, MI);
+  }
   void TypeRead(serialization::TypeIdx Idx, QualType T) override {
     if (Previous)
       Previous->TypeRead(Idx, T);
@@ -93,6 +97,19 @@ class DelegatingDeserializationListener : public ASTDeserializationListener {
     if (Previous)
       Previous->MacroDefinitionRead(PPID, MD);
   }
+  void ModuleRead(serialization::SubmoduleID ID, Module *Mod) override {
+    if (Previous)
+      Previous->ModuleRead(ID, Mod);
+  }
+  void ModuleImportRead(serialization::SubmoduleID ID,
+                        SourceLocation ImportLoc) override {
+    if (Previous)
+      Previous->ModuleImportRead(ID, ImportLoc);
+  }
+  void FinishedDeserializing() override {
+    if (Previous)
+      Previous->FinishedDeserializing();
+  }
 };

 /// Dumps deserialized declarations.
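The forwarding pattern in the hunks above can be sketched outside Clang as follows. This is a minimal, self-contained illustration, not the real classes: `Listener`, `DelegatingListener`, `RecordingListener` and `runChain` are hypothetical stand-ins, and plain `int` IDs replace the real serialization ID types. It shows why each callback needs a forwarding override: any callback the delegating class does not override falls back to the base-class no-op and never reaches the previous listener.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the listener interface: every callback
// defaults to a no-op.
class Listener {
public:
  virtual ~Listener() = default;
  virtual void DeclRead(int /*ID*/) {}
  virtual void MacroRead(int /*ID*/) {}
  virtual void FinishedDeserializing() {}
};

// Forwards each callback to an optional previous listener.
class DelegatingListener : public Listener {
  Listener *Previous;

public:
  explicit DelegatingListener(Listener *Previous) : Previous(Previous) {}

  void DeclRead(int ID) override {
    if (Previous)
      Previous->DeclRead(ID);
  }
  void MacroRead(int ID) override {
    if (Previous)
      Previous->MacroRead(ID);
  }
  void FinishedDeserializing() override {
    if (Previous)
      Previous->FinishedDeserializing();
  }
};

// Records the callbacks it receives so the chain can be inspected.
class RecordingListener : public Listener {
public:
  std::vector<std::string> Events;

  void DeclRead(int ID) override {
    Events.push_back("decl " + std::to_string(ID));
  }
  void MacroRead(int ID) override {
    Events.push_back("macro " + std::to_string(ID));
  }
  void FinishedDeserializing() override { Events.push_back("finished"); }
};

// Sends three callbacks through the chain and returns what the last
// listener observed.
std::vector<std::string> runChain() {
  RecordingListener Tail;
  DelegatingListener Head(&Tail);
  Head.DeclRead(1);
  Head.MacroRead(2);
  Head.FinishedDeserializing();
  return Tail.Events;
}
```

If the `MacroRead` override were removed from `DelegatingListener` in this sketch, `runChain()` would return only two events, which is the kind of silent gap the "IMPORTANT" comment added to the interface header warns about.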
@@ -103,15 +120,30 @@ class DeserializedDeclsDumper : public DelegatingDeserializationListener {
       : DelegatingDeserializationListener(Previous, DeletePrevious) {}

   void DeclRead(GlobalDeclID ID, const Decl *D) override {
-    llvm::outs() << "PCH DECL: " << D->getDeclKindName();
-    if (const NamedDecl *ND = dyn_cast<NamedDecl>(D)) {
-      llvm::outs() << " - ";
-      ND->printQualifiedName(llvm::outs());
+    PendingDecls.push_back(D);
+    DelegatingDeserializationListener::DeclRead(ID, D);
+  }
+  void FinishedDeserializing() override {
+    auto Decls = std::move(PendingDecls);
+    for (const auto *D : Decls) {
+      llvm::outs() << "PCH DECL: " << D->getDeclKindName();
+      if (const NamedDecl *ND = dyn_cast<NamedDecl>(D)) {
+        llvm::outs() << " - ";
+        ND->printQualifiedName(llvm::outs());
+      }
+      llvm::outs() << "\n";
     }
-    llvm::outs() << "\n";

-    DelegatingDeserializationListener::DeclRead(ID, D);
+    if (!PendingDecls.empty()) {
Review comment: Our theory is that
At the point where the callback is called now, we have already updated the state of
We probably need to figure out an API that does not require handling situations like this.
Reply: I tested it with the crash case, and I think a broader question is: once deserialization is finished, can we safely assume that using a loaded declaration will never trigger additional deserialization? Currently, the contract for
Reply: No, we cannot assume that. E.g. we can load a function without a body, and requesting the body at any other point in the code may cause deserialization of the body itself, all declarations it references, and so on. Deserializations get started and completed throughout the program many times and it's generally fine. I don't think this works, actually. It's very hard to write code that does not deserialize, and it's probably not necessary to have that level of scrutiny. Deserializing from inside the callbacks in the deserialization itself is cheesy, but deserializing more outside of the deserialization is a perfectly valid use case. I would recommend a different approach: instead, put it on the author of the interface to figure out when they want to process their results. We would require wiring up other callbacks (
So maybe it crashed simply because we did not propagate the other callbacks? If that's the case, a narrower change that does not add more methods to the interface would be enough. Or is that not the case?
+      llvm::errs() << "Deserialized more decls while printing, total of "
+                   << PendingDecls.size() << "\n";
+      PendingDecls.clear();
+    }
+    DelegatingDeserializationListener::FinishedDeserializing();
   }

+private:
+  std::vector<const Decl *> PendingDecls;
 };

 /// Checks deserialized declarations and emits error if a name

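The buffering approach in this hunk (collect in `DeclRead`, print in `FinishedDeserializing`) can be modelled without Clang. The sketch below rests on assumptions: `Decl` here is a hypothetical two-field struct, `LoadsOnPrint` is an invented hook that simulates printing one declaration triggering the deserialization of another, and `DeferredDumper` and `runDemo` are illustrative names only. It shows why the pending list is moved into a local before iterating: reads that arrive during printing land in the now-empty member vector, so the loop's vector is never modified while being traversed and the late arrivals can be counted afterwards.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a declaration. LoadsOnPrint simulates a
// declaration whose printing causes another one to be deserialized.
struct Decl {
  std::string Name;
  const Decl *LoadsOnPrint;
};

class DeferredDumper {
  std::vector<const Decl *> Pending;

public:
  std::vector<std::string> Output; // stands in for llvm::outs()
  std::size_t LateReads = 0;       // reads that arrived while printing

  // Only buffer here; no printing during deserialization.
  void DeclRead(const Decl *D) { Pending.push_back(D); }

  void FinishedDeserializing() {
    // Move-constructing leaves Pending empty, so reads triggered by the
    // loop below are collected separately from the list being iterated.
    auto Decls = std::move(Pending);
    for (const Decl *D : Decls) {
      Output.push_back("PCH DECL: " + D->Name);
      if (D->LoadsOnPrint)
        DeclRead(D->LoadsOnPrint); // simulated re-entrant deserialization
    }
    if (!Pending.empty()) {
      LateReads = Pending.size();
      Pending.clear();
    }
  }
};

// Buffers two declarations, one of which triggers a late read when printed.
DeferredDumper runDemo() {
  Decl Late{"Late", nullptr};
  Decl First{"First", &Late};
  Decl Second{"Second", nullptr};

  DeferredDumper Dumper;
  Dumper.DeclRead(&First);
  Dumper.DeclRead(&Second);
  Dumper.FinishedDeserializing();
  return Dumper;
}
```

As in the patch, the late arrivals in this sketch are reported by count and then dropped instead of being printed, which keeps the callback from looping if printing keeps triggering more reads.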
Review comment: NIT: is it -Xclang=-dump-deserialized-decls? Users might be confused if they decide to try it out simply because it's mentioned in the release notes.