Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add extract method to MetaflowCode #2246

Merged
merged 6 commits into from
Feb 19, 2025
Merged

add extract method to MetaflowCode #2246

merged 6 commits into from
Feb 19, 2025

Conversation

savingoyal
Copy link
Collaborator

No description provided.

this object is garbage collected.
"""
exclusions = [
"metaflow/",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have constants for this so we should use that. +1 to factoring it out as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why would we need to factor this out? #2238 can directly invoke this method unless i am missing something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend waiting on #2269 which will make this easier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, discussion was:

  • release now
  • later will be made easier by 2269
  • we can add a comment about where condav2-1.cnd comes from (but yes, it should be included)

return tmp

@property
def script(self) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe: script_name. That would leave us the ability to return the actual script (whcih we already extract and have) if we want to without name conflcit.

The directory and its contents are automatically deleted when
this object is garbage collected.
"""
EXCLUSIONS = [
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be made configurable or put in metaflow_config instead? We might have some more files at Netflix, that should be excluded.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have an example? ideally it would be nicer to include exclusions neither in the code or the config but the code package itself

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have the conda*.cnd files in the code package.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it conda.cnd or something else?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's condav2-1.cnd specifically - I don't see the harm in making it configurable with sane defaults like the ones we already have.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i have currently hard coded it in - ideally if we can avoid adding more to configs that would be my preference since it can become quite tricky to keep track of different versions of metaflow with a singular config. a proper solution would be to add the non-user files to INFO and then exclude them.

this object is garbage collected.
"""
exclusions = [
"metaflow/",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend waiting on #2269 which will make this easier.

@@ -878,6 +879,72 @@ def tarball(self) -> tarfile.TarFile:
"""
return self._tar

@property
def extract(self) -> TemporaryDirectory:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would rename to extract_user_code or something like that. We discussed a bit and we could do:

  • one method called extract with a flag that determines WHAT (user code, environment, everything)
  • extract_user_code which would be this and then the tarball extraction would be the other one.

I think we were leaning towards the second solution and that could work but it feels a little less symetric.

I think we also discussed a slightly different UX than the temporary directory. I guess the question is how will users use this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it a function instead of a property.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final comment from me on this:

  • ok with name extract
  • make it a function

@savingoyal savingoyal merged commit a19e782 into master Feb 19, 2025
29 checks passed
@savingoyal savingoyal deleted the savin/extract branch February 19, 2025 23:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants