@Json-Andriopoulos (Contributor) commented Oct 7, 2025

  • Allow specifying multiple entities in the same log_metadata() call
  • Batch-submit RunMetadataResource objects
  • Allow attaching metadata to multiple artifacts with infer
  • Split RunMetadataResource resolution logic for testability

Describe changes

I implemented/fixed _ to achieve _.

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the Contribution guide on rebasing a branch to develop.
  • IMPORTANT: I made sure that my changes are reflected properly in the following resources:
    • ZenML Docs
    • Dashboard: Needs to be communicated to the frontend team.
    • Templates: Might need adjustments (that are not reflected in the template tests) in case of non-breaking changes and deprecations.
    • Projects: Depending on the version dependencies, different projects might get affected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

@github-actions bot added the enhancement (New feature or request) label Oct 7, 2025
@Json-Andriopoulos force-pushed the feature/4015-multi-log-metadata branch 4 times, most recently from 6784dcb to 67dfc61, Oct 7, 2025 12:17
@Json-Andriopoulos marked this pull request as ready for review Oct 8, 2025 08:45

def __hash__(self) -> int:
    """Hash operator."""
    # Only safe if _key() is made from immutable values

What does _key() here refer to?
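
For context, the code comment under review appears to allude to a common Python idiom in which `__eq__` and `__hash__` both delegate to a `_key()` method that returns a tuple of immutable values. The sketch below illustrates that idiom only; the class name `MetadataTarget` and its fields are hypothetical and are not taken from the PR.

```python
class MetadataTarget:
    """Hypothetical value object; the name and fields are illustrative only."""

    def __init__(self, entity_id: str, entity_type: str) -> None:
        self.entity_id = entity_id
        self.entity_type = entity_type

    def _key(self) -> tuple:
        # Tuple of immutable values that defines this object's identity.
        return (self.entity_id, self.entity_type)

    def __eq__(self, other: object) -> bool:
        if not isinstance(other, MetadataTarget):
            return NotImplemented
        return self._key() == other._key()

    def __hash__(self) -> int:
        # Consistent with __eq__ because both derive from _key().
        return hash(self._key())


a = MetadataTarget("1234", "step_run")
b = MetadataTarget("1234", "step_run")
unique_targets = {a, b}  # objects with equal keys collapse to one set member
```

If no `_key()` method exists on the class, the comment is stale and the hash should be described in terms of the fields it actually uses.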

Comment on lines 409 to 411
The `log_metadata` function does not support logging metadata for
multiple entities of the same type. To do it, you can use the ZenML Client
functionality directly:

Suggested change
- The `log_metadata` function does not support logging metadata for
- multiple entities of the same type. To do it, you can use the ZenML Client
- functionality directly:
+ The `log_metadata` function does not support logging the same metadata for
+ multiple entities of the same type at once. To do it, you can use the ZenML Client
+ functionality directly:

Comment on lines 36 to 41
# Manual logging to a step
log_metadata(metadata={}, step_name=..., run_id_name_or_prefix=...)
log_metadata(metadata={}, step_id=...)
# Manual logging to a run
log_metadata(metadata={}, run_id_name_or_prefix=...)

I'm not sure how to fix this without breaking the functionality, but I think something is off here. Previously, if I provided:

  1. Step id -> I log it to the step
  2. Run id name or prefix -> I log it to the run
  3. Run id name or prefix and step name -> I log it to the step

While the first two stay the same, the third one now logs to the run as well as the step. In a setting with a lot of steps, this might end up overloading the pipeline run with metadata. Any ideas on how to fix it?
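
The three cases listed above can be written down as a small resolution function. This is a minimal sketch of the previous behavior as described in this comment, not ZenML's implementation; the helper name `resolve_step_or_run_target` and the `(entity_type, identifier)` return shape are assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def resolve_step_or_run_target(
    step_id: Optional[str] = None,
    step_name: Optional[str] = None,
    run_id_name_or_prefix: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return (entity_type, identifier) pairs; hypothetical helper."""
    if step_id is not None:
        # Case 1: a step ID targets that step only.
        return [("step", step_id)]
    if run_id_name_or_prefix is not None and step_name is not None:
        # Case 3: run plus step name targets the step only, never the run.
        return [("step", f"{run_id_name_or_prefix}/{step_name}")]
    if run_id_name_or_prefix is not None:
        # Case 2: a run identifier on its own targets the run.
        return [("run", run_id_name_or_prefix)]
    return []
```

Under these assumptions, case 3 yields a single step target, so logging from many steps never touches the run's own metadata.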

@bcdurak (Contributor) commented Oct 13, 2025

Also, if I provide all three of them together, I think step_id will be processed, but step_name and run_id_name_or_prefix will be silently ignored. I haven't checked it yet, but the same might apply to other entities as well. I am not sure that ignoring them is the best idea here. I would lean towards raising a ValueError or, failing that, emitting a warning.
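
A minimal sketch of the ValueError option follows. The helper name `validate_step_arguments` is hypothetical, and the exact rules (rejecting `step_id` together with either of the other two arguments, and requiring a run identifier alongside `step_name`) are assumptions about what "conflicting" should mean, not behavior confirmed by the PR.

```python
from typing import Optional


def validate_step_arguments(
    step_id: Optional[str] = None,
    step_name: Optional[str] = None,
    run_id_name_or_prefix: Optional[str] = None,
) -> None:
    """Raise ValueError on ambiguous combinations; hypothetical helper."""
    if step_id is not None and (
        step_name is not None or run_id_name_or_prefix is not None
    ):
        # Assumed rule: an ID is a complete address, so extra hints conflict.
        raise ValueError(
            "Pass either `step_id` or `step_name` with "
            "`run_id_name_or_prefix`, not both."
        )
    if step_name is not None and run_id_name_or_prefix is None:
        # Assumed rule: a step name is only meaningful within a given run.
        raise ValueError(
            "`step_name` requires `run_id_name_or_prefix` to locate the step."
        )
```

Swapping each `raise` for `warnings.warn(...)` would give the softer variant mentioned above, at the cost of still ignoring the extra arguments.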

@Json-Andriopoulos (Contributor, Author) commented Oct 13, 2025

Sure, I can add a validation mechanism, so that you pass values either by ID or by name/version. I think the best solution here is to introduce a new function whose signature groups those options into valid parameter groups:

def log_metadata(
    step_identifier: StepRunIdentifier | None,
    model_identifier: VersionedIdentifier | None,
    artifact_identifier: VersionedIdentifier | None,
    pipeline_run_id,
):
    ...

where the identifiers are, for instance:

class VersionedIdentifier:
    uuid: UUID | None
    name: str | None
    version: str | None

    def validate_options(self) -> None:
        # Validate that either the ID or the name/version pair is set.
        ...

For step we could have something like:

class StepRunIdentifier:
    id: UUID | None
    name: str | None
    pipeline_identifier: str | UUID | None

    def validate_options(self) -> None:
        # Validate that either the ID or the name/pipeline pair is used.
        ...

With this implementation, when you specify a StepRunIdentifier, it doesn't get mixed up with pipeline_run_id, so you can control separately whether the metadata is logged for the pipeline run or not.
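
The identifier idea above could be made runnable roughly as follows. This is a sketch under stated assumptions, not the PR's code: it uses a frozen dataclass, runs the check in `__post_init__` instead of a separate `validate_options()` method, and treats `version` as optional when addressing by name.

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID, uuid4


@dataclass(frozen=True)
class VersionedIdentifier:
    """Identify an entity by UUID, or by name with an optional version."""

    uuid: Optional[UUID] = None
    name: Optional[str] = None
    version: Optional[str] = None

    def __post_init__(self) -> None:
        by_name = self.name is not None or self.version is not None
        # Exactly one addressing mode is allowed: by ID or by name/version.
        if self.uuid is not None and by_name:
            raise ValueError("Use either `uuid` or `name`/`version`, not both.")
        if self.uuid is None and self.name is None:
            raise ValueError("Provide `uuid` or `name`.")


by_id = VersionedIdentifier(uuid=uuid4())
by_name = VersionedIdentifier(name="my_model", version="3")
```

Validating at construction time means an invalid identifier can never reach `log_metadata`, and freezing the dataclass keeps instances hashable.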

@Json-Andriopoulos force-pushed the feature/4015-multi-log-metadata branch from a1ac646 to f65865f, Oct 17, 2025 07:09
github-actions bot commented Oct 17, 2025

ZenML CLI Performance Comparison (Threshold: 1.0s, Timeout: 60s, Slow: 5s)

❌ Failed Commands on Current Branch (feature/4015-multi-log-metadata)

  • zenml stack list: Command failed on run 1 (exit code: 1)
  • zenml pipeline list: Command failed on run 1 (exit code: 1)
  • zenml model list: Command failed on run 1 (exit code: 1)

🚨 New Failures Introduced

The following commands fail on your branch but worked on the target branch:

  • zenml stack list
  • zenml pipeline list
  • zenml model list

Performance Comparison

| Command | develop Time (s) | feature/4015-multi-log-metadata Time (s) | Difference | Status |
|---|---|---|---|---|
| zenml --help | 1.364060 ± 0.012490 | 1.385330 ± 0.025020 | +0.021s | ✓ No significant change |
| zenml model list | Not tested | Failed | N/A | ❌ Broken in current branch |
| zenml pipeline list | Not tested | Failed | N/A | ❌ Broken in current branch |
| zenml stack --help | 1.364508 ± 0.010296 | 1.412566 ± 0.024698 | +0.048s | ✓ No significant change |
| zenml stack list | Not tested | Failed | N/A | ❌ Broken in current branch |

Summary

  • Total commands analyzed: 5
  • Commands compared for timing: 2
  • Commands improved: 0 (0.0% of compared)
  • Commands degraded: 0 (0.0% of compared)
  • Commands unchanged: 2 (100.0% of compared)
  • Failed commands: 3 (NEW FAILURES INTRODUCED)
  • Timed out commands: 0
  • Slow commands: 0

Environment Info

  • Target branch: Linux 6.11.0-1018-azure
  • Current branch: Linux 6.11.0-1018-azure
  • Test timestamp: 2025-10-17T08:25:29Z
  • Timeout: 60 seconds
  • Slow threshold: 5 seconds

@bcdurak linked an issue (1 task) that may be closed by this pull request, Oct 17, 2025
@Json-Andriopoulos force-pushed the feature/4015-multi-log-metadata branch 2 times, most recently from ed97a14 to 1e71ad9, Oct 17, 2025 10:10
@Json-Andriopoulos force-pushed the feature/4015-multi-log-metadata branch from 1e71ad9 to 5f1c747, Oct 17, 2025 11:56

Labels

enhancement (New feature or request), run-slow-ci

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add ability to log metadata for multiple entries at once

2 participants