Prompt abstraction #3862
base: develop
Conversation
…ure/prompt-abstraction
…entation from the examples directory.
- Introduced a new implementation plan for ZenML's prompt abstraction, emphasizing prompt management as first-class artifacts.
- Removed analytics components and simplified the core Prompt class.
- Created utility functions for prompt operations and added comprehensive example pipelines for prompt development, comparison, and experimentation.
- Updated documentation with detailed README and example prompts for better user guidance.
- Simplified the structure of the `training.py` and `pipelines.py` files by reorganizing variable assignments and removing unnecessary line breaks.
- Enhanced the clarity of the `run.py` script by adjusting import statements and streamlining function definitions.
- Updated the `steps.py` file to improve the organization of prompt-related functions and ensure consistent formatting.
- Made minor adjustments to the prompt evaluation and materialization processes for better maintainability.
- Overall, these changes aim to enhance code readability and maintainability across the prompt engineering module.
- Completed the final implementation of the prompt abstraction, removing all analytics components and simplifying the core Prompt class to focus on essential fields.
- Updated server endpoints and visualization components to align with the new prompt structure.
- Enhanced example pipelines and documentation to demonstrate the new simplified prompt management features.
- Introduced comprehensive testing results confirming the functionality of prompt creation, comparison, and integration with ZenML's pipeline system.
- Removed deprecated prompt management models and endpoints to streamline the codebase and improve maintainability.
- Introduced a new `IMPLEMENTATION_SUMMARY.md` detailing the completed implementation of a streamlined prompt management system, focusing on user needs for simplicity and Git-like versioning.
- Added `compare_prompts_simple()` utility for A/B testing of prompts, enhancing user experience with metric-based comparisons.
- Created example scripts (`simple_comparison.py` and `test_prompt_functionality.py`) to demonstrate core functionalities and testing of the new prompt features.
- Removed overengineered components, including the PromptTemplate entity system and associated REST API endpoints, to simplify the codebase and improve maintainability.
- Verified functionality through comprehensive testing, confirming successful prompt creation, formatting, and comparison capabilities.
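The body of `compare_prompts_simple()` isn't shown in this thread. As a rough sketch, a metric-based A/B comparison along the lines the commit describes could look like this (the signature, tie-breaking rule, and example values are assumptions, not the PR's actual API):

```python
from typing import Dict, Tuple


def compare_prompts_simple(
    metrics_a: Dict[str, float],
    metrics_b: Dict[str, float],
) -> Tuple[str, Dict[str, str]]:
    """Declare a per-metric and overall winner; higher scores win."""
    per_metric: Dict[str, str] = {}
    wins_a = 0
    for name in sorted(metrics_a.keys() & metrics_b.keys()):
        per_metric[name] = "A" if metrics_a[name] >= metrics_b[name] else "B"
        wins_a += per_metric[name] == "A"
    overall = "A" if wins_a * 2 >= len(per_metric) else "B"
    return overall, per_metric


winner, breakdown = compare_prompts_simple(
    {"accuracy": 0.82, "conciseness": 0.64},
    {"accuracy": 0.79, "conciseness": 0.71},
)
print(winner, breakdown)  # A {'accuracy': 'A', 'conciseness': 'B'}
```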
- Improved formatting and organization in `simple_comparison.py` and `test_prompt_functionality.py` for better readability and consistency.
- Enhanced error handling and output messages in the prompt comparison utility.
- Removed unnecessary whitespace and streamlined code for clarity.
- Updated comments and print statements to align with user requirements and improve user experience during prompt testing.
…nd focus on essential components of the prompt management system. This deletion aligns with the goal of simplifying the codebase and reducing unnecessary complexity.
- Simplified the assignment of `latest_metric` and `current_metric` by consolidating the unpacking of values into a single line for improved readability.
- Removed unnecessary whitespace in the configuration file `training_rf.yaml` to enhance clarity and maintain consistency.
Documentation Link Check Results: ❌ Absolute links check failed
- Changed the link for the "LLM-Complete Guide" project to point to the GitHub repository for better resource access.
- Ensured all project descriptions and links remain accurate and up-to-date for user reference.
- Added a new section on prompt engineering best practices to the user guide, providing production-tested strategies for effective prompt management.
- Introduced a dedicated README for prompt engineering, outlining core features such as version control, A/B testing, and dashboard integration.
- Updated the table of contents to include links to new prompt engineering resources, ensuring users can easily navigate to relevant information.
- Removed outdated example scripts and streamlined the project structure to focus on essential components for prompt comparison and testing.
- Implemented a simple comparison pipeline to demonstrate the core functionalities of prompt engineering in ZenML.
ZenML CLI Performance Comparison (Threshold: 1.0s, Timeout: 60s, Slow: 5s): ❌ Failed commands on the current branch (feature/prompt-abstraction)
🚨 New failures introduced: the following commands fail on this branch but worked on the target branch.
- Updated docstrings in the `Prompt` class to include return descriptions for better understanding.
- Corrected a minor typo in the caching function's docstring for improved accuracy.
- Removed unnecessary comments and docstring content in `run_simple_comparison.py` for clarity.
- Enhanced docstring formatting in `simple_comparison.py` to improve readability and added return descriptions.
- Added whitespace for better code organization in `prompt_creation.py` and `prompt_testing.py` to maintain consistency.
- Replaced the "Prompt engineering in 30 lines" section with a "Quick start" link in the table of contents and README for better accessibility.
- Deleted the `basic-prompt-workflows.md` file to streamline content and focus on essential prompt engineering practices.
- Removed the `version-control-and-testing.md` file to eliminate redundancy and enhance clarity in the documentation structure.
- Updated README to reflect changes and emphasize core features of prompt engineering in ZenML.
- Rearranged import statements in `helpers.py` for better readability.
- Updated regex pattern in `validate_prompt_template` function to use double quotes for consistency.
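Only the quoting change is visible here; guessing purely from its name, `validate_prompt_template` presumably checks that every `{placeholder}` in a template has a matching variable, roughly like this (hypothetical sketch, not the PR's code):

```python
import re

PLACEHOLDER_PATTERN = re.compile(r"\{([a-zA-Z_][a-zA-Z0-9_]*)\}")


def validate_prompt_template(template: str, variables: dict) -> bool:
    """True if every {placeholder} in the template has a supplied variable."""
    required = set(PLACEHOLDER_PATTERN.findall(template))
    return required.issubset(variables)


assert validate_prompt_template("Hello {name}", {"name": "Ada"})
assert not validate_prompt_template("Hello {name}", {})
```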
Still some work to be done here. The weakest part is the comparison functionality. I think I'd want to think about this before attaching too much of this at the Prompt level. (Though I understand why it was there, given the dashboard you built etc).
# logger.warning("Failed to parse LLM judge response as JSON")
# return {criterion: 5.0 for criterion in criteria}  # fallback scores

# Mock scores for development - replace with actual implementation
:)
return mock_scores


def _get_quality_level(score: float) -> str:
again, probably should just be a number and/or an Enum here.
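One way to read that suggestion, with made-up thresholds (the PR's actual cut-offs aren't visible in this excerpt):

```python
from enum import Enum


class QualityLevel(Enum):
    EXCELLENT = 4
    GOOD = 3
    FAIR = 2
    POOR = 1


def _get_quality_level(score: float) -> QualityLevel:
    # Thresholds are illustrative only.
    if score >= 9.0:
        return QualityLevel.EXCELLENT
    if score >= 7.0:
        return QualityLevel.GOOD
    if score >= 5.0:
        return QualityLevel.FAIR
    return QualityLevel.POOR
```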
return "Poor" | ||
|
||
|
||
def _generate_recommendations(scores: Dict[str, float]) -> List[str]: |
this feels under-thought out
) -> Annotated[Dict[str, float], "evaluation_metrics"]:
    """Evaluate prompt performance with various metrics."""

    # Simulate evaluation metrics based on response characteristics
a bit weird all this 'simulation' going on here.
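For readers without the full diff: "simulating" here appears to mean deriving scores from surface features of the response rather than from a real evaluation, something along these lines (a guess at the shape, not the PR's code):

```python
from typing import Dict, List


def simulated_metrics(response: str, keywords: List[str]) -> Dict[str, float]:
    """Heuristic scores computed from response characteristics alone."""
    words = response.split()
    hits = sum(kw.lower() in response.lower() for kw in keywords)
    return {
        "length_score": min(len(words) / 100.0, 1.0),  # saturates at 100 words
        "keyword_coverage": hits / max(len(keywords), 1),
    }
```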
this one is sort of an entire example. was it meant to be in here?
Co-authored-by: Alex Strick van Linschoten <[email protected]>
- Added a new section on ZenML's philosophy emphasizing simplicity in prompt management.
- Introduced the PromptType enum to standardize prompt types (system, user, assistant) in the Prompt class.
- Updated the prompt instantiation examples to utilize the new PromptType enum for clarity and consistency.
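A minimal sketch of what the enum and the updated instantiation might look like (the `Prompt` stand-in and its fields are assumptions; the real class ships with ZenML):

```python
from dataclasses import dataclass
from enum import Enum


class PromptType(str, Enum):
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"


@dataclass
class Prompt:
    """Stand-in with only the essential fields the commits describe."""
    template: str
    prompt_type: PromptType = PromptType.USER

    def format(self, **variables: str) -> str:
        return self.template.format(**variables)


system_prompt = Prompt(
    template="You are a helpful assistant specialized in {domain}.",
    prompt_type=PromptType.SYSTEM,
)
print(system_prompt.format(domain="tax law"))
```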
- Introduced `demo_diff.py` to showcase ZenML's core prompt comparison features, including template and output diffs.
- Removed `run_simple_comparison.py` as it was redundant with the new demo.
- Updated `simple_demo.py` to utilize the new diffing capabilities and improved output presentation.
- Enhanced `comparison.py` and `evaluation.py` to integrate comprehensive diff analysis and output comparison.
- Added `diff_utils.py` for core diffing utilities, including GitHub-style text diffs and prompt comparisons.
- Refactored `__init__.py` to include new diff utilities and updated prompt imports for consistency.
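`diff_utils.py` itself isn't reproduced in the thread; the standard library's `difflib` is the natural building block for a GitHub-style unified diff of two templates, e.g. (a sketch under that assumption):

```python
import difflib


def prompt_template_diff(old: str, new: str) -> str:
    """Unified (GitHub-style) diff between two prompt template versions."""
    return "\n".join(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile="prompt/v1",
            tofile="prompt/v2",
            lineterm="",
        )
    )


print(prompt_template_diff(
    "Summarize the text below.\nBe concise.",
    "Summarize the text below.\nBe concise and cite sources.",
))
```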
…utomatic versioning
- Introduced a new "Quick Start" guide to help users quickly leverage ZenML's prompt engineering features.
- Updated documentation to reflect automatic versioning capabilities, removing manual version management examples.
- Improved clarity on GitHub-style diff comparisons and dashboard visualizations.
- Enhanced README to emphasize core features and simplify the user experience.
- Adjusted spacing in print statements and formatted strings in `demo_diff.py`, `run_prompt_comparison.py`, and `helpers.py` to ensure uniformity in output presentation.
- Enhanced readability by maintaining consistent formatting across comparison outputs.
…readability
- Adjusted print statements in `run_text_summarization.py` to enhance output clarity by standardizing spacing and formatting.
- Ensured consistent presentation of results and metrics for better user experience.
- Introduced a new document extraction project featuring a main script (`main.py`) to run the extraction pipeline.
- Implemented a comprehensive document extraction pipeline in `document_extraction_pipeline.py`, integrating various processing steps.
- Added utility functions for document processing, API interactions, and text extraction from different file types.
- Created prompt management for invoice extraction with structured templates in `invoice_prompts.py`.
- Developed Pydantic schemas for validation of extracted invoice data in `invoice_schema.py`.
- Included sample documents for testing and demonstration purposes.
- Updated README with setup instructions, project structure, and quick start guide for users.
- Updated the `select_prompt` function in `main.py` to return prompt objects instead of template strings for better integration with ZenML.
- Changed the type of `extraction_prompt` in `document_extraction_pipeline.py` and `extract_batch_data.py` from string to ZenML's `Prompt` class for improved type safety and functionality.
- Removed the obsolete `base_prompt.py` file to streamline prompt management.
- Adjusted invoice prompt definitions in `invoice_prompts.py` to utilize a dictionary for variable management, enhancing clarity and usability.
…nd validation
- Refactored `main.py` to utilize the new `Prompt` class for better integration with ZenML's prompt management.
- Updated `invoice_prompts.py` to include structured output schemas and examples for enhanced extraction accuracy.
- Added new sample documents to test various extraction scenarios, including challenging and poor-quality OCR cases.
- Enhanced README documentation to reflect new features, including structured output schemas and comprehensive response tracking.
- Introduced `PromptResponse` artifacts for capturing LLM outputs, improving validation and metadata tracking in the extraction pipeline.
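The schemas in `invoice_schema.py` aren't reproduced here; a Pydantic (v2) model for validating extracted invoice JSON typically looks like the following (field names are illustrative, not the PR's):

```python
from typing import List, Optional

from pydantic import BaseModel, Field


class LineItem(BaseModel):
    description: str
    quantity: float = Field(gt=0)
    unit_price: float = Field(ge=0)


class InvoiceData(BaseModel):
    invoice_number: str
    vendor_name: str
    total_amount: float = Field(ge=0)
    currency: str = "USD"
    line_items: List[LineItem] = Field(default_factory=list)
    notes: Optional[str] = None


# Validate an LLM's raw JSON output in one call (raises on schema violations):
parsed = InvoiceData.model_validate_json(
    '{"invoice_number": "INV-001", "vendor_name": "Acme", "total_amount": 42.5}'
)
print(parsed.invoice_number, parsed.total_amount)
```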
- Updated best practices documentation to include structured output, response tracking, and cost optimization strategies.
- Expanded quick start guide with detailed instructions on dashboard visualization and enhanced prompt features.
- Improved README to reflect comprehensive capabilities, including structured output schemas and response tracking.
- Added new sections on advanced prompt management techniques, including metadata linking and performance monitoring.
- Enhanced examples for clarity and usability, showcasing structured output and response tracking in practice.
Describe changes
I implemented/fixed _ to achieve _.
Pre-requisites
Please ensure you have done the following:
- Your branch is based on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the Contribution guide on rebasing a branch to develop.
Types of changes