
Conversation

safoinme
Contributor

Describe changes

I implemented/fixed _ to achieve _.

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the contribution guide on rebasing a branch onto develop.
  • IMPORTANT: I made sure that my changes are reflected properly in the following resources:
    • ZenML Docs
    • Dashboard: Needs to be communicated to the frontend team.
    • Templates: Might need adjustments (that are not reflected in the template tests) in case of non-breaking changes and deprecations.
    • Projects: Depending on the version dependencies, different projects might get affected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

safoinme added 11 commits June 23, 2025 13:36
- Introduced a new implementation plan for ZenML's prompt abstraction, emphasizing prompt management as first-class artifacts.
- Removed analytics components and simplified the core Prompt class.
- Created utility functions for prompt operations and added comprehensive example pipelines for prompt development, comparison, and experimentation.
- Updated documentation with detailed README and example prompts for better user guidance.
- Simplified the structure of the `training.py` and `pipelines.py` files by reorganizing variable assignments and removing unnecessary line breaks.
- Enhanced the clarity of the `run.py` script by adjusting import statements and streamlining function definitions.
- Updated the `steps.py` file to improve the organization of prompt-related functions and ensure consistent formatting.
- Made minor adjustments to the prompt evaluation and materialization processes for better maintainability.
- Overall, these changes aim to enhance code readability and maintainability across the prompt engineering module.
- Completed the final implementation of the prompt abstraction, removing all analytics components and simplifying the core Prompt class to focus on essential fields (a rough sketch of such a class follows this commit list).
- Updated server endpoints and visualization components to align with the new prompt structure.
- Enhanced example pipelines and documentation to demonstrate the new simplified prompt management features.
- Introduced comprehensive testing results confirming the functionality of prompt creation, comparison, and integration with ZenML's pipeline system.
- Removed deprecated prompt management models and endpoints to streamline the codebase and improve maintainability.
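The simplified Prompt class itself is not reproduced in this conversation, so the following is only a rough sketch of the idea; the field names (`template`, `variables`) and the `format()` helper are assumptions, not necessarily ZenML's actual API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Prompt:
    """Minimal prompt abstraction: a template plus default variable values."""

    template: str
    variables: Dict[str, Any] = field(default_factory=dict)

    def format(self, **overrides: Any) -> str:
        """Render the template, letting call-time values override the defaults."""
        values = {**self.variables, **overrides}
        return self.template.format(**values)


# Example: a reusable summarization prompt rendered with different inputs.
summarizer = Prompt(
    template="Summarize the following text in {num_sentences} sentences:\n{text}",
    variables={"num_sentences": 3},
)
print(summarizer.format(text="ZenML tracks prompts as versioned artifacts."))
```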
Contributor

coderabbitai bot commented Jul 29, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

safoinme added 5 commits July 29, 2025 16:24
- Introduced a new `IMPLEMENTATION_SUMMARY.md` detailing the completed implementation of a streamlined prompt management system, focusing on user needs for simplicity and Git-like versioning.
- Added `compare_prompts_simple()` utility for A/B testing of prompts, enhancing user experience with metric-based comparisons (sketched after this commit list).
- Created example scripts (`simple_comparison.py` and `test_prompt_functionality.py`) to demonstrate core functionalities and testing of the new prompt features.
- Removed overengineered components, including the PromptTemplate entity system and associated REST API endpoints, to simplify the codebase and improve maintainability.
- Verified functionality through comprehensive testing, confirming successful prompt creation, formatting, and comparison capabilities.
- Improved formatting and organization in `simple_comparison.py` and `test_prompt_functionality.py` for better readability and consistency.
- Enhanced error handling and output messages in the prompt comparison utility.
- Removed unnecessary whitespace and streamlined code for clarity.
- Updated comments and print statements to align with user requirements and improve user experience during prompt testing.
…nd focus on essential components of the prompt management system. This deletion aligns with the goal of simplifying the codebase and reducing unnecessary complexity.
- Simplified the assignment of `latest_metric` and `current_metric` by consolidating the unpacking of values into a single line for improved readability.
- Removed unnecessary whitespace in the configuration file `training_rf.yaml` to enhance clarity and maintain consistency.
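The `compare_prompts_simple()` utility referenced above is described as a metric-based A/B comparison, but its implementation is not shown in this conversation. Below is a minimal sketch of that idea with assumed inputs and return shape, not the actual ZenML code.

```python
from typing import Dict, Tuple


def compare_prompts_simple(
    metrics_a: Dict[str, float],
    metrics_b: Dict[str, float],
) -> Tuple[str, Dict[str, float]]:
    """Naive A/B comparison over the metrics both prompts share.

    Assumes higher is always better, which is a simplification (latency, for
    example, would need the opposite treatment).
    """
    deltas = {
        name: metrics_b[name] - metrics_a[name]
        for name in metrics_a.keys() & metrics_b.keys()
    }
    wins_for_b = sum(1 for delta in deltas.values() if delta > 0)
    winner = "B" if wins_for_b > len(deltas) / 2 else "A"
    return winner, deltas


winner, deltas = compare_prompts_simple(
    {"accuracy": 0.82, "helpfulness": 0.70},
    {"accuracy": 0.88, "helpfulness": 0.79},
)
print(winner, deltas)  # winner is "B"; both deltas are positive
```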
Contributor

github-actions bot commented Jul 29, 2025

Documentation Link Check Results

  • Absolute links check failed: there are broken absolute links in the documentation. See workflow logs for details.
  • Relative links check failed: there are broken relative links in the documentation. See workflow logs for details.
Last checked: 2025-08-06 16:58:50 UTC

safoinme added 2 commits July 29, 2025 19:10
- Changed the link for the "LLM-Complete Guide" project to point to the GitHub repository for better resource access.
- Ensured all project descriptions and links remain accurate and up-to-date for user reference.
- Added a new section on prompt engineering best practices to the user guide, providing production-tested strategies for effective prompt management.
- Introduced a dedicated README for prompt engineering, outlining core features such as version control, A/B testing, and dashboard integration.
- Updated the table of contents to include links to new prompt engineering resources, ensuring users can easily navigate to relevant information.
- Removed outdated example scripts and streamlined the project structure to focus on essential components for prompt comparison and testing.
- Implemented a simple comparison pipeline to demonstrate the core functionalities of prompt engineering in ZenML.
Contributor

github-actions bot commented Jul 30, 2025

🔍 Broken Links Report

Summary

  • 📁 Files with broken links: 1
  • 🔗 Total broken links: 2
  • 📄 Broken markdown links: 2
  • 🖼️ Broken image links: 0
  • ⚠️ Broken reference placeholders: 0

Details

File                                                    Link Type  Link Text                       Broken Path
prompt-engineering/understanding-prompt-management.md  📄         "Basic prompt workflows"        basic-prompt-workflows.md
prompt-engineering/understanding-prompt-management.md  📄         "Version control and testing"   version-control-and-testing.md
📂 Full file paths
  • /home/runner/work/zenml/zenml/scripts/../docs/book/user-guide/llmops-guide/prompt-engineering/understanding-prompt-management.md
  • /home/runner/work/zenml/zenml/scripts/../docs/book/user-guide/llmops-guide/prompt-engineering/understanding-prompt-management.md

Contributor

github-actions bot commented Jul 30, 2025

ZenML CLI Performance Comparison (Threshold: 1.0s, Timeout: 60s, Slow: 5s)

❌ Failed Commands on Current Branch (feature/prompt-abstraction)

  • zenml stack list: Command failed on run 1 (exit code: 1)
  • zenml pipeline list: Command failed on run 1 (exit code: 1)
  • zenml model list: Command failed on run 1 (exit code: 1)

🚨 New Failures Introduced

The following commands fail on your branch but worked on the target branch:

  • zenml stack list
  • zenml pipeline list
  • zenml model list

Performance Comparison

Command              develop Time (s)     feature/prompt-abstraction Time (s)  Difference  Status
zenml --help         1.545868 ± 0.019313  1.587354 ± 0.012166                  +0.041s     ✓ No significant change
zenml model list     Not tested           Failed                               N/A         ❌ Broken in current branch
zenml pipeline list  Not tested           Failed                               N/A         ❌ Broken in current branch
zenml stack --help   1.547151 ± 0.016971  1.569801 ± 0.031512                  +0.023s     ✓ No significant change
zenml stack list     Not tested           Failed                               N/A         ❌ Broken in current branch

Summary

  • Total commands analyzed: 5
  • Commands compared for timing: 2
  • Commands improved: 0 (0.0% of compared)
  • Commands degraded: 0 (0.0% of compared)
  • Commands unchanged: 2 (100.0% of compared)
  • Failed commands: 3 (NEW FAILURES INTRODUCED)
  • Timed out commands: 0
  • Slow commands: 0

Environment Info

  • Target branch: Linux 6.11.0-1018-azure
  • Current branch: Linux 6.11.0-1018-azure
  • Test timestamp: 2025-08-06T16:47:27Z
  • Timeout: 60 seconds
  • Slow threshold: 5 seconds

safoinme and others added 5 commits August 3, 2025 22:17
- Updated docstrings in the `Prompt` class to include return descriptions for better understanding.
- Corrected a minor typo in the caching function's docstring for improved accuracy.
- Removed unnecessary comments and docstring content in `run_simple_comparison.py` for clarity.
- Enhanced docstring formatting in `simple_comparison.py` to improve readability and added return descriptions.
- Added whitespace for better code organization in `prompt_creation.py` and `prompt_testing.py` to maintain consistency.
- Replaced the "Prompt engineering in 30 lines" section with a "Quick start" link in the table of contents and README for better accessibility.
- Deleted the `basic-prompt-workflows.md` file to streamline content and focus on essential prompt engineering practices.
- Removed the `version-control-and-testing.md` file to eliminate redundancy and enhance clarity in the documentation structure.
- Updated README to reflect changes and emphasize core features of prompt engineering in ZenML.
- Rearranged import statements in `helpers.py` for better readability.
- Updated regex pattern in `validate_prompt_template` function to use double quotes for consistency.
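The `validate_prompt_template` function itself is not reproduced in this thread; the sketch below shows what regex-based placeholder validation might look like, with an assumed signature rather than the merged one.

```python
import re
from typing import Set

# Matches simple str.format-style placeholders such as {text} or {num_sentences}.
PLACEHOLDER_PATTERN = re.compile(r"\{([a-zA-Z_][a-zA-Z0-9_]*)\}")


def validate_prompt_template(template: str, allowed_variables: Set[str]) -> None:
    """Raise if the template references a placeholder that was not declared."""
    referenced = set(PLACEHOLDER_PATTERN.findall(template))
    missing = referenced - allowed_variables
    if missing:
        raise ValueError(
            f"Template references undeclared variables: {sorted(missing)}"
        )


validate_prompt_template(
    "Summarize {text} in {num_sentences} sentences.", {"text", "num_sentences"}
)
```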
Contributor

@strickvl strickvl left a comment


Still some work to be done here. The weakest part is the comparison functionality. I think I'd want to think about this before attaching too much of this at the Prompt level. (Though I understand why it was there, given the dashboard you built etc).

# logger.warning("Failed to parse LLM judge response as JSON")
# return {criterion: 5.0 for criterion in criteria} # fallback scores

# Mock scores for development - replace with actual implementation
Contributor


:)

return mock_scores


def _get_quality_level(score: float) -> str:
Contributor


again, probably should just be a number and/or an Enum here.
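For illustration only, the suggestion could look like the following Enum-based return type (a sketch, not code from this PR; thresholds are invented):

```python
from enum import Enum


class QualityLevel(str, Enum):
    EXCELLENT = "excellent"
    GOOD = "good"
    FAIR = "fair"
    POOR = "poor"


def _get_quality_level(score: float) -> QualityLevel:
    # Thresholds are arbitrary here; the point is the typed return value.
    if score >= 9.0:
        return QualityLevel.EXCELLENT
    if score >= 7.0:
        return QualityLevel.GOOD
    if score >= 5.0:
        return QualityLevel.FAIR
    return QualityLevel.POOR
```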

return "Poor"


def _generate_recommendations(scores: Dict[str, float]) -> List[str]:
Contributor


this feels under-thought-out

) -> Annotated[Dict[str, float], "evaluation_metrics"]:
"""Evaluate prompt performance with various metrics."""

# Simulate evaluation metrics based on response characteristics
Contributor


a bit weird all this 'simulation' going on here.

Contributor


this one is sort of an entire example. was it meant to be in here?

safoinme and others added 11 commits August 4, 2025 21:45
Co-authored-by: Alex Strick van Linschoten <[email protected]>
- Added a new section on ZenML's philosophy emphasizing simplicity in prompt management.
- Introduced the PromptType enum to standardize prompt types (system, user, assistant) in the Prompt class.
- Updated the prompt instantiation examples to utilize the new PromptType enum for clarity and consistency.
- Introduced `demo_diff.py` to showcase ZenML's core prompt comparison features, including template and output diffs.
- Removed `run_simple_comparison.py` as it was redundant with the new demo.
- Updated `simple_demo.py` to utilize the new diffing capabilities and improved output presentation.
- Enhanced `comparison.py` and `evaluation.py` to integrate comprehensive diff analysis and output comparison.
- Added `diff_utils.py` for core diffing utilities, including GitHub-style text diffs and prompt comparisons (sketched below).
- Refactored `__init__.py` to include new diff utilities and updated prompt imports for consistency.
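Neither the PromptType enum introduced above nor `diff_utils.py` is shown in this conversation. The sketch below illustrates both ideas with standard-library tools; the names and signatures are assumptions, not the merged implementation.

```python
import difflib
from enum import Enum


class PromptType(str, Enum):
    """Assumed member names, based on the system/user/assistant roles described above."""

    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"


def diff_prompt_templates(old: str, new: str) -> str:
    """Return a GitHub-style unified diff between two prompt templates."""
    return "\n".join(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="prompt/v1",
            tofile="prompt/v2",
            lineterm="",
        )
    )


print(diff_prompt_templates(
    "You are a helpful assistant.\nSummarize the text below.\n{text}",
    "You are a helpful assistant.\nSummarize the text below in {num_sentences} sentences.\n{text}",
))
```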
…utomatic versioning

- Introduced a new "Quick Start" guide to help users quickly leverage ZenML's prompt engineering features.
- Updated documentation to reflect automatic versioning capabilities, removing manual version management examples.
- Improved clarity on GitHub-style diff comparisons and dashboard visualizations.
- Enhanced README to emphasize core features and simplify the user experience.
- Adjusted spacing in print statements and formatted strings in `demo_diff.py`, `run_prompt_comparison.py`, and `helpers.py` to ensure uniformity in output presentation.
- Enhanced readability by maintaining consistent formatting across comparison outputs.
…readability

- Adjusted print statements in `run_text_summarization.py` to enhance output clarity by standardizing spacing and formatting.
- Ensured consistent presentation of results and metrics for better user experience.
- Introduced a new document extraction project featuring a main script (`main.py`) to run the extraction pipeline.
- Implemented a comprehensive document extraction pipeline in `document_extraction_pipeline.py`, integrating various processing steps.
- Added utility functions for document processing, API interactions, and text extraction from different file types.
- Created prompt management for invoice extraction with structured templates in `invoice_prompts.py`.
- Developed Pydantic schemas for validation of extracted invoice data in `invoice_schema.py` (an illustrative schema is sketched after these commits).
- Included sample documents for testing and demonstration purposes.
- Updated README with setup instructions, project structure, and quick start guide for users.
- Updated the `select_prompt` function in `main.py` to return prompt objects instead of template strings for better integration with ZenML.
- Changed the type of `extraction_prompt` in `document_extraction_pipeline.py` and `extract_batch_data.py` from string to ZenML's `Prompt` class for improved type safety and functionality.
- Removed the obsolete `base_prompt.py` file to streamline prompt management.
- Adjusted invoice prompt definitions in `invoice_prompts.py` to utilize a dictionary for variable management, enhancing clarity and usability.
…nd validation

- Refactored `main.py` to utilize the new `Prompt` class for better integration with ZenML's prompt management.
- Updated `invoice_prompts.py` to include structured output schemas and examples for enhanced extraction accuracy.
- Added new sample documents to test various extraction scenarios, including challenging and poor-quality OCR cases.
- Enhanced README documentation to reflect new features, including structured output schemas and comprehensive response tracking.
- Introduced `PromptResponse` artifacts for capturing LLM outputs, improving validation and metadata tracking in the extraction pipeline.
- Updated best practices documentation to include structured output, response tracking, and cost optimization strategies.
- Expanded quick start guide with detailed instructions on dashboard visualization and enhanced prompt features.
- Improved README to reflect comprehensive capabilities, including structured output schemas and response tracking.
- Added new sections on advanced prompt management techniques, including metadata linking and performance monitoring.
- Enhanced examples for clarity and usability, showcasing structured output and response tracking in practice.
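The invoice extraction commits above mention Pydantic schemas (`invoice_schema.py`) and structured output, but the concrete fields are not listed in this thread. The model below is purely illustrative, with assumed field names.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class LineItem(BaseModel):
    description: str
    quantity: float = Field(gt=0)
    unit_price: float = Field(ge=0)


class InvoiceData(BaseModel):
    """Illustrative structured-output schema for LLM-extracted invoice fields."""

    invoice_number: str
    vendor_name: str
    total_amount: float = Field(ge=0)
    currency: str = "USD"
    line_items: List[LineItem] = []
    notes: Optional[str] = None


# Validating the raw extraction output catches malformed responses before they
# flow further down the pipeline.
invoice = InvoiceData(
    invoice_number="INV-001",
    vendor_name="Acme Corp",
    total_amount=120.50,
    line_items=[{"description": "GPU hours", "quantity": 10, "unit_price": 12.05}],
)
print(invoice.invoice_number)
```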

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff   Package             Supply Chain Security  Vulnerability  Quality  Maintenance  License
Added  spacy@3.8.7         74                     100            100      100          80
Added  pillow@11.3.0       85                     100            100      100          70
Added  pymupdf@1.26.3      87                     100            100      100          70
Added  reportlab@4.4.3     92                     100            100      100          70
Added  textract@1.6.5      98                     100            100      100          100
Added  pytesseract@0.3.13  100                    100            100      100          100

View full report
