Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Backend switching for Mooncake #768
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Backend switching for Mooncake #768
Changes from 10 commits
1a389a6
08b176a
ba0c9e6
1340d92
2ce1ee2
08de6df
84f27c9
2e95299
13233e5
1e8df98
afdddd4
233c312
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if there are derivatives inside
dw
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im not sure i understand. dw would be a function and if derivatives are used inside i think that would be handled by the substitute backend?
Check warning on line 7 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L3-L7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about this, to me it seems like we're forgetting to carry around the
FData
ofy
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im not sure i understand what you mean by this. what ive tried here is instead of creating the output Codual later, we create it early so that in cases of memory address function return types, the pullback functions can have access to the FData to create the correct adjoints.
Check warning on line 12 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L10-L12
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not sure that
f
has noRData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all functions would have no RData
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not all callable objects are functions though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I'm not even convinced the current Mooncake behavior is correct, see chalk-lab/Mooncake.jl#557
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, Mooncake treats all callable objects as a function, giving
NoRData()
. The issue you have opened in mooncake, i think would need to be figured out independent of DI (atleast this PR) as of now (comes in when Mooncake is a substitute backend as well).Check warning on line 18 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L16-L18
Check warning on line 21 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L21
Check warning on line 29 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L24-L29
Check warning on line 31 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L31
Check warning on line 37 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L34-L37
Check warning on line 44 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L41-L44
Check warning on line 47 in DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl
DifferentiationInterface/ext/DifferentiationInterfaceMooncakeExt/differentiate_with.jl#L47