Skip to content

Conversation

@akchinSTC
Copy link
Member

@akchinSTC akchinSTC commented Jun 24, 2022

What changes were proposed in this pull request?

Updated the docs and placeholder text in response to

#2789 and
#995 (comment)

How was this pull request tested?

Developer's Certificate of Origin 1.1

   By making a contribution to this project, I certify that:

   (a) The contribution was created in whole or in part by me and I
       have the right to submit it under the Apache License 2.0; or

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

   (c) The contribution was provided directly to me by some other
       person who certified (a), (b) or (c) and I have not modified
       it.

   (d) I understand and agree that this project and the contribution
       are public and that a record of the contribution (including all
       personal information I submit with it, including my sign-off) is
       maintained indefinitely and may be redistributed consistent with
       this project or the open source license(s) involved.

@elyra-bot
Copy link

elyra-bot bot commented Jun 24, 2022

Thanks for making a pull request to Elyra!

To try out this branch on binder, follow this link: Binder

Copy link
Member

@ptitzler ptitzler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Should we perhaps also reference the data volume topic as it provides an alternative for large (in terms of disk size or number of files) outputs ?

@ptitzler ptitzler added area:documentation Improvements or additions to documentation kind:bug Something isn't working labels Jun 24, 2022
@ptitzler ptitzler added this to the 3.10.0 milestone Jun 24, 2022
@akchinSTC
Copy link
Member Author

LGTM! Should we perhaps also reference the data volume topic as it provides an alternative for large (in terms of disk size or number of files) outputs ?

Im probably going to follow that up in a separate PR since Id like to address the issues brought up by cupdike in #2794. Maybe updating the best practices doc and explain the diff between the intermediate volume workspace and the output file parameter

@cupdike
Copy link

cupdike commented Jun 24, 2022

Apologies if I am missing context here, but doesn't this suggest that wildcards actually are supported?
https://github.com/elyra-ai/elyra/blob/main/elyra/airflow/bootstrapper.py#L197

@akchinSTC
Copy link
Member Author

thank you @cupdike.
@ptitzler, im probably going modify this PR a bit more while fixing #995, since the current documentation is correct (albeit with the bug). Listing files in outputs with wildcards does function correctly, but subsequent child nodes that we pass the wildcard path to cannot properly expand it and grab from s3.
wondering if we should disable wildcarding for now, so that subsequent nodes are able to proceed and provide guidance on how to properly push/pull artifacts to/from s3 (via compressed archives), or perhaps add the compression steps when wildcards are present to the bootstrapper and make the process transparent ?

@ptitzler
Copy link
Member

Listing files in outputs with wildcards does function correctly, but subsequent child nodes that we pass the wildcard path to cannot properly expand it and grab from s3. wondering if we should disable wildcarding for now, so that subsequent nodes are able to proceed

In a way doing so would result in a breaking change, as there might be users that use this feature to make artifacts available in the last (or only) pipeline node.

and provide guidance on how to properly push/pull artifacts to/from s3 (via compressed archives), or perhaps add the compression steps when wildcards are present to the bootstrapper and make the process transparent ?

Should we perhaps we update the best practices document for now and fix the implementation (based on your suggestion) in a future release? We are at the tail end of the 3.10 release and I'm not sure the latter can be handled in time.

@akchinSTC akchinSTC merged commit 56b4981 into elyra-ai:main Jun 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:documentation Improvements or additions to documentation kind:bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants