Replies: 2 comments
-
Thanks for the proposal, @datajoely! I think that's a reasonable idea - let's implement it! |
Beta Was this translation helpful? Give feedback.
0 replies
-
Somehow connected to #2728 Turning this into a discussion to gather a bit more feedback |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Description
I have a massive dataset which is too big to work on quickly if we use the full table. I wanted to write a hook that would kick in based on run environment, intercept the data before it was saved and add a
limit(n)
to operation before it was saved by the catalog. To my surprise it turned out that this wasn't possible.Another possible use case would be something like PII stripping before save.
I solved this problem with a custom dataset but it feels clunky.
Context
The hook implementation (like all hooks actually)
return None
hookimpl
Possible Implementation
The runner tasky.py implements this hook and could be tweaked based on whether anything is returned.
Beta Was this translation helpful? Give feedback.
All reactions