add npy export capability to dataset #337

Merged
merged 6 commits into from
May 26, 2023

jperez999
Collaborator

This PR adds support for exporting a dataset to an npy file. Certain caveats and checks must be abided by for this to work. It allows two methods of writing: the first is one large dataframe dump; the second is append, which lets you add larger-than-memory data to the file. List columns must be exploded along the column axis before they can be written to an npy file. A helper method has been created to facilitate that.
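
To illustrate the two ideas in the description (exploding list columns along the column axis, then writing a file piece by piece), here is a minimal sketch using only NumPy and pandas. It is not the code from this PR and does not use the Merlin API: `explode_list_columns` and `write_npy_in_chunks` are hypothetical names, the list columns are assumed to have a fixed length, and the chunked write swaps a preallocated memory-mapped file (`np.lib.format.open_memmap`) in for a true append, so the total row count must be known up front.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def explode_list_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Spread each fixed-length list column across several scalar columns.

    Hypothetical helper for illustration; not the helper added in this PR.
    """
    parts = []
    for name in df.columns:
        col = df[name]
        if col.map(lambda v: isinstance(v, (list, tuple, np.ndarray))).all():
            # One new column per list position: b -> b_0, b_1, ...
            wide = pd.DataFrame(col.tolist(), index=df.index)
            wide.columns = [f"{name}_{i}" for i in range(wide.shape[1])]
            parts.append(wide)
        else:
            parts.append(col.to_frame())
    return pd.concat(parts, axis=1)


def write_npy_in_chunks(chunks, path, total_rows, n_cols, dtype=np.float64):
    """Fill a preallocated .npy file chunk by chunk (assumed approach)."""
    out = np.lib.format.open_memmap(
        path, mode="w+", dtype=dtype, shape=(total_rows, n_cols)
    )
    start = 0
    for chunk in chunks:
        block = explode_list_columns(chunk).to_numpy(dtype=dtype)
        out[start:start + len(block)] = block
        start += len(block)
    out.flush()
    del out  # release the memory map before the file is read back
    return start


df = pd.DataFrame(
    {"a": [1.0, 2.0, 3.0, 4.0], "b": [[1, 2], [3, 4], [5, 6], [7, 8]]}
)
path = os.path.join(tempfile.mkdtemp(), "data.npy")

# Only one two-row chunk is converted to an array at a time.
rows = write_npy_in_chunks([df.iloc[:2], df.iloc[2:]], path, total_rows=4, n_cols=3)
result = np.load(path)
```

Here `result` has one row per input row and three columns (`a`, `b_0`, `b_1`). An npy file holds a single homogeneous array, which is why a list column has to be flattened into scalar columns before it can be written.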

@jperez999 jperez999 added the enhancement New feature or request label May 25, 2023
@jperez999 jperez999 added this to the Merlin 23.06 milestone May 25, 2023
@jperez999 jperez999 self-assigned this May 25, 2023
@github-actions
Documentation preview

https://nvidia-merlin.github.io/core/review/pr-337

2 participants