
Conversation

safoinme
Contributor

Describe changes

I implemented/fixed _ to achieve _.

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the contribution guide on rebasing a branch onto develop.
  • IMPORTANT: I made sure that my changes are reflected properly in the following resources:
    • ZenML Docs
    • Dashboard: Needs to be communicated to the frontend team.
    • Templates: Might need adjustments (that are not reflected in the template tests) in case of non-breaking changes and deprecations.
    • Projects: Depending on the version dependencies, different projects might get affected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

safoinme added 11 commits June 23, 2025 13:36
- Introduced a new implementation plan for ZenML's prompt abstraction, emphasizing prompt management as first-class artifacts.
- Removed analytics components and simplified the core Prompt class.
- Created utility functions for prompt operations and added comprehensive example pipelines for prompt development, comparison, and experimentation.
- Updated documentation with detailed README and example prompts for better user guidance.
- Simplified the structure of the `training.py` and `pipelines.py` files by reorganizing variable assignments and removing unnecessary line breaks.
- Enhanced the clarity of the `run.py` script by adjusting import statements and streamlining function definitions.
- Updated the `steps.py` file to improve the organization of prompt-related functions and ensure consistent formatting.
- Made minor adjustments to the prompt evaluation and materialization processes for better maintainability.
- Overall, these changes aim to enhance code readability and maintainability across the prompt engineering module.
- Completed the final implementation of the prompt abstraction, removing all analytics components and simplifying the core Prompt class to focus on essential fields (a rough sketch of such a class follows this commit list).
- Updated server endpoints and visualization components to align with the new prompt structure.
- Enhanced example pipelines and documentation to demonstrate the new simplified prompt management features.
- Introduced comprehensive testing results confirming the functionality of prompt creation, comparison, and integration with ZenML's pipeline system.
- Removed deprecated prompt management models and endpoints to streamline the codebase and improve maintainability.
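The simplified Prompt class itself is not reproduced in this conversation, so the following is only a rough sketch of the idea; the field names (`template`, `variables`) and the `format()` helper are assumptions, not necessarily ZenML's actual API.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Prompt:
    """Minimal prompt abstraction: a template plus default variable values."""

    template: str
    variables: Dict[str, Any] = field(default_factory=dict)

    def format(self, **overrides: Any) -> str:
        """Render the template, letting call-time values override the defaults."""
        values = {**self.variables, **overrides}
        return self.template.format(**values)


# Example: a reusable summarization prompt rendered with different inputs.
summarizer = Prompt(
    template="Summarize the following text in {num_sentences} sentences:\n{text}",
    variables={"num_sentences": 3},
)
print(summarizer.format(text="ZenML tracks prompts as versioned artifacts."))
```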
Contributor

coderabbitai bot commented Jul 29, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai generate unit tests to generate unit tests for this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

safoinme added 5 commits July 29, 2025 16:24
- Introduced a new `IMPLEMENTATION_SUMMARY.md` detailing the completed implementation of a streamlined prompt management system, focusing on user needs for simplicity and Git-like versioning.
- Added `compare_prompts_simple()` utility for A/B testing of prompts, enhancing user experience with metric-based comparisons (sketched after this commit list).
- Created example scripts (`simple_comparison.py` and `test_prompt_functionality.py`) to demonstrate core functionalities and testing of the new prompt features.
- Removed overengineered components, including the PromptTemplate entity system and associated REST API endpoints, to simplify the codebase and improve maintainability.
- Verified functionality through comprehensive testing, confirming successful prompt creation, formatting, and comparison capabilities.
- Improved formatting and organization in `simple_comparison.py` and `test_prompt_functionality.py` for better readability and consistency.
- Enhanced error handling and output messages in the prompt comparison utility.
- Removed unnecessary whitespace and streamlined code for clarity.
- Updated comments and print statements to align with user requirements and improve user experience during prompt testing.
…nd focus on essential components of the prompt management system. This deletion aligns with the goal of simplifying the codebase and reducing unnecessary complexity.
- Simplified the assignment of `latest_metric` and `current_metric` by consolidating the unpacking of values into a single line for improved readability.
- Removed unnecessary whitespace in the configuration file `training_rf.yaml` to enhance clarity and maintain consistency.
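The `compare_prompts_simple()` utility referenced above is described as a metric-based A/B comparison, but its implementation is not shown in this conversation. Below is a minimal sketch of that idea with assumed inputs and return shape, not the actual ZenML code.

```python
from typing import Dict, Tuple


def compare_prompts_simple(
    metrics_a: Dict[str, float],
    metrics_b: Dict[str, float],
) -> Tuple[str, Dict[str, float]]:
    """Naive A/B comparison over the metrics both prompts share.

    Assumes higher is always better, which is a simplification (latency, for
    example, would need the opposite treatment).
    """
    deltas = {
        name: metrics_b[name] - metrics_a[name]
        for name in metrics_a.keys() & metrics_b.keys()
    }
    wins_for_b = sum(1 for delta in deltas.values() if delta > 0)
    winner = "B" if wins_for_b > len(deltas) / 2 else "A"
    return winner, deltas


winner, deltas = compare_prompts_simple(
    {"accuracy": 0.82, "helpfulness": 0.70},
    {"accuracy": 0.88, "helpfulness": 0.79},
)
print(winner, deltas)  # winner is "B"; both deltas are positive
```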
Contributor

github-actions bot commented Jul 29, 2025

Documentation Link Check Results

  • Absolute links check failed: there are broken absolute links in the documentation. See workflow logs for details.
  • Relative links check failed: there are broken relative links in the documentation. See workflow logs for details.
Last checked: 2025-08-06 16:58:50 UTC

safoinme added 2 commits July 29, 2025 19:10
- Changed the link for the "LLM-Complete Guide" project to point to the GitHub repository for better resource access.
- Ensured all project descriptions and links remain accurate and up-to-date for user reference.
- Added a new section on prompt engineering best practices to the user guide, providing production-tested strategies for effective prompt management.
- Introduced a dedicated README for prompt engineering, outlining core features such as version control, A/B testing, and dashboard integration.
- Updated the table of contents to include links to new prompt engineering resources, ensuring users can easily navigate to relevant information.
- Removed outdated example scripts and streamlined the project structure to focus on essential components for prompt comparison and testing.
- Implemented a simple comparison pipeline to demonstrate the core functionalities of prompt engineering in ZenML.
Contributor

github-actions bot commented Jul 30, 2025

🔍 Broken Links Report

Summary

  • 📁 Files with broken links: 1
  • 🔗 Total broken links: 2
  • 📄 Broken markdown links: 2
  • 🖼️ Broken image links: 0
  • ⚠️ Broken reference placeholders: 0

Details

File                                                    Link Type  Link Text                       Broken Path
prompt-engineering/understanding-prompt-management.md  📄         "Basic prompt workflows"        basic-prompt-workflows.md
prompt-engineering/understanding-prompt-management.md  📄         "Version control and testing"   version-control-and-testing.md
📂 Full file paths
  • /home/runner/work/zenml/zenml/scripts/../docs/book/user-guide/llmops-guide/prompt-engineering/understanding-prompt-management.md
  • /home/runner/work/zenml/zenml/scripts/../docs/book/user-guide/llmops-guide/prompt-engineering/understanding-prompt-management.md

Contributor

github-actions bot commented Jul 30, 2025

ZenML CLI Performance Comparison (Threshold: 1.0s, Timeout: 60s, Slow: 5s)

❌ Failed Commands on Current Branch (feature/prompt-abstraction)

  • zenml stack list: Command failed on run 1 (exit code: 1)
  • zenml pipeline list: Command failed on run 1 (exit code: 1)
  • zenml model list: Command failed on run 1 (exit code: 1)

🚨 New Failures Introduced

The following commands fail on your branch but worked on the target branch:

  • zenml stack list
  • zenml pipeline list
  • zenml model list

Performance Comparison

Command              develop Time (s)     feature/prompt-abstraction Time (s)  Difference  Status
zenml --help         1.545868 ± 0.019313  1.587354 ± 0.012166                  +0.041s     ✓ No significant change
zenml model list     Not tested           Failed                               N/A         ❌ Broken in current branch
zenml pipeline list  Not tested           Failed                               N/A         ❌ Broken in current branch
zenml stack --help   1.547151 ± 0.016971  1.569801 ± 0.031512                  +0.023s     ✓ No significant change
zenml stack list     Not tested           Failed                               N/A         ❌ Broken in current branch

Summary

  • Total commands analyzed: 5
  • Commands compared for timing: 2
  • Commands improved: 0 (0.0% of compared)
  • Commands degraded: 0 (0.0% of compared)
  • Commands unchanged: 2 (100.0% of compared)
  • Failed commands: 3 (NEW FAILURES INTRODUCED)
  • Timed out commands: 0
  • Slow commands: 0

Environment Info

  • Target branch: Linux 6.11.0-1018-azure
  • Current branch: Linux 6.11.0-1018-azure
  • Test timestamp: 2025-08-06T16:47:27Z
  • Timeout: 60 seconds
  • Slow threshold: 5 seconds

safoinme and others added 5 commits August 3, 2025 22:17
- Updated docstrings in the `Prompt` class to include return descriptions for better understanding.
- Corrected a minor typo in the caching function's docstring for improved accuracy.
- Removed unnecessary comments and docstring content in `run_simple_comparison.py` for clarity.
- Enhanced docstring formatting in `simple_comparison.py` to improve readability and added return descriptions.
- Added whitespace for better code organization in `prompt_creation.py` and `prompt_testing.py` to maintain consistency.
- Replaced the "Prompt engineering in 30 lines" section with a "Quick start" link in the table of contents and README for better accessibility.
- Deleted the `basic-prompt-workflows.md` file to streamline content and focus on essential prompt engineering practices.
- Removed the `version-control-and-testing.md` file to eliminate redundancy and enhance clarity in the documentation structure.
- Updated README to reflect changes and emphasize core features of prompt engineering in ZenML.
- Rearranged import statements in `helpers.py` for better readability.
- Updated regex pattern in `validate_prompt_template` function to use double quotes for consistency.
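The `validate_prompt_template` function itself is not reproduced in this thread; the sketch below shows what regex-based placeholder validation might look like, with an assumed signature rather than the merged one.

```python
import re
from typing import Set

# Matches simple str.format-style placeholders such as {text} or {num_sentences}.
PLACEHOLDER_PATTERN = re.compile(r"\{([a-zA-Z_][a-zA-Z0-9_]*)\}")


def validate_prompt_template(template: str, allowed_variables: Set[str]) -> None:
    """Raise if the template references a placeholder that was not declared."""
    referenced = set(PLACEHOLDER_PATTERN.findall(template))
    missing = referenced - allowed_variables
    if missing:
        raise ValueError(
            f"Template references undeclared variables: {sorted(missing)}"
        )


validate_prompt_template(
    "Summarize {text} in {num_sentences} sentences.", {"text", "num_sentences"}
)
```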
Contributor

@strickvl strickvl left a comment


Still some work to be done here. The weakest part is the comparison functionality. I think I'd want to think about this before attaching too much of this at the Prompt level. (Though I understand why it was there, given the dashboard you built etc).

# logger.warning("Failed to parse LLM judge response as JSON")
# return {criterion: 5.0 for criterion in criteria} # fallback scores

# Mock scores for development - replace with actual implementation
Contributor


:)

return mock_scores


def _get_quality_level(score: float) -> str:
Contributor


again, probably should just be a number and/or an Enum here.
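For illustration only, the suggestion could look like the following Enum-based return type (a sketch, not code from this PR; thresholds are invented):

```python
from enum import Enum


class QualityLevel(str, Enum):
    EXCELLENT = "excellent"
    GOOD = "good"
    FAIR = "fair"
    POOR = "poor"


def _get_quality_level(score: float) -> QualityLevel:
    # Thresholds are arbitrary here; the point is the typed return value.
    if score >= 9.0:
        return QualityLevel.EXCELLENT
    if score >= 7.0:
        return QualityLevel.GOOD
    if score >= 5.0:
        return QualityLevel.FAIR
    return QualityLevel.POOR
```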

return "Poor"


def _generate_recommendations(scores: Dict[str, float]) -> List[str]:
Contributor


this feels under-thought-out

) -> Annotated[Dict[str, float], "evaluation_metrics"]:
"""Evaluate prompt performance with various metrics."""

# Simulate evaluation metrics based on response characteristics
Contributor


a bit weird all this 'simulation' going on here.

Contributor


this one is sort of an entire example. was it meant to be in here?

safoinme and others added 11 commits August 4, 2025 21:45
Co-authored-by: Alex Strick van Linschoten <[email protected]>
- Added a new section on ZenML's philosophy emphasizing simplicity in prompt management.
- Introduced the PromptType enum to standardize prompt types (system, user, assistant) in the Prompt class.
- Updated the prompt instantiation examples to utilize the new PromptType enum for clarity and consistency.
- Introduced `demo_diff.py` to showcase ZenML's core prompt comparison features, including template and output diffs.
- Removed `run_simple_comparison.py` as it was redundant with the new demo.
- Updated `simple_demo.py` to utilize the new diffing capabilities and improved output presentation.
- Enhanced `comparison.py` and `evaluation.py` to integrate comprehensive diff analysis and output comparison.
- Added `diff_utils.py` for core diffing utilities, including GitHub-style text diffs and prompt comparisons (sketched below).
- Refactored `__init__.py` to include new diff utilities and updated prompt imports for consistency.
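Neither the PromptType enum introduced above nor `diff_utils.py` is shown in this conversation. The sketch below illustrates both ideas with standard-library tools; the names and signatures are assumptions, not the merged implementation.

```python
import difflib
from enum import Enum


class PromptType(str, Enum):
    """Assumed member names, based on the system/user/assistant roles described above."""

    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"


def diff_prompt_templates(old: str, new: str) -> str:
    """Return a GitHub-style unified diff between two prompt templates."""
    return "\n".join(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="prompt/v1",
            tofile="prompt/v2",
            lineterm="",
        )
    )


print(diff_prompt_templates(
    "You are a helpful assistant.\nSummarize the text below.\n{text}",
    "You are a helpful assistant.\nSummarize the text below in {num_sentences} sentences.\n{text}",
))
```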
…utomatic versioning

- Introduced a new "Quick Start" guide to help users quickly leverage ZenML's prompt engineering features.
- Updated documentation to reflect automatic versioning capabilities, removing manual version management examples.
- Improved clarity on GitHub-style diff comparisons and dashboard visualizations.
- Enhanced README to emphasize core features and simplify the user experience.
- Adjusted spacing in print statements and formatted strings in `demo_diff.py`, `run_prompt_comparison.py`, and `helpers.py` to ensure uniformity in output presentation.
- Enhanced readability by maintaining consistent formatting across comparison outputs.
…readability

- Adjusted print statements in `run_text_summarization.py` to enhance output clarity by standardizing spacing and formatting.
- Ensured consistent presentation of results and metrics for better user experience.
- Introduced a new document extraction project featuring a main script (`main.py`) to run the extraction pipeline.
- Implemented a comprehensive document extraction pipeline in `document_extraction_pipeline.py`, integrating various processing steps.
- Added utility functions for document processing, API interactions, and text extraction from different file types.
- Created prompt management for invoice extraction with structured templates in `invoice_prompts.py`.
- Developed Pydantic schemas for validation of extracted invoice data in `invoice_schema.py` (an illustrative schema is sketched after these commits).
- Included sample documents for testing and demonstration purposes.
- Updated README with setup instructions, project structure, and quick start guide for users.
- Updated the `select_prompt` function in `main.py` to return prompt objects instead of template strings for better integration with ZenML.
- Changed the type of `extraction_prompt` in `document_extraction_pipeline.py` and `extract_batch_data.py` from string to ZenML's `Prompt` class for improved type safety and functionality.
- Removed the obsolete `base_prompt.py` file to streamline prompt management.
- Adjusted invoice prompt definitions in `invoice_prompts.py` to utilize a dictionary for variable management, enhancing clarity and usability.
…nd validation

- Refactored `main.py` to utilize the new `Prompt` class for better integration with ZenML's prompt management.
- Updated `invoice_prompts.py` to include structured output schemas and examples for enhanced extraction accuracy.
- Added new sample documents to test various extraction scenarios, including challenging and poor-quality OCR cases.
- Enhanced README documentation to reflect new features, including structured output schemas and comprehensive response tracking.
- Introduced `PromptResponse` artifacts for capturing LLM outputs, improving validation and metadata tracking in the extraction pipeline.
- Updated best practices documentation to include structured output, response tracking, and cost optimization strategies.
- Expanded quick start guide with detailed instructions on dashboard visualization and enhanced prompt features.
- Improved README to reflect comprehensive capabilities, including structured output schemas and response tracking.
- Added new sections on advanced prompt management techniques, including metadata linking and performance monitoring.
- Enhanced examples for clarity and usability, showcasing structured output and response tracking in practice.
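The invoice extraction commits above mention Pydantic schemas (`invoice_schema.py`) and structured output, but the concrete fields are not listed in this thread. The model below is purely illustrative, with assumed field names.

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class LineItem(BaseModel):
    description: str
    quantity: float = Field(gt=0)
    unit_price: float = Field(ge=0)


class InvoiceData(BaseModel):
    """Illustrative structured-output schema for LLM-extracted invoice fields."""

    invoice_number: str
    vendor_name: str
    total_amount: float = Field(ge=0)
    currency: str = "USD"
    line_items: List[LineItem] = []
    notes: Optional[str] = None


# Validating the raw extraction output catches malformed responses before they
# flow further down the pipeline.
invoice = InvoiceData(
    invoice_number="INV-001",
    vendor_name="Acme Corp",
    total_amount=120.50,
    line_items=[{"description": "GPU hours", "quantity": 10, "unit_price": 12.05}],
)
print(invoice.invoice_number)
```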

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

Diff   Package             Supply Chain Security  Vulnerability  Quality  Maintenance  License
Added  spacy@3.8.7         74                     100            100      100          80
Added  pillow@11.3.0       85                     100            100      100          70
Added  pymupdf@1.26.3      87                     100            100      100          70
Added  reportlab@4.4.3     92                     100            100      100          70
Added  textract@1.6.5      98                     100            100      100          100
Added  pytesseract@0.3.13  100                    100            100      100          100

View full report
