
Conversation

@aprilk-ms (Member)

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

  • The pull request does not introduce breaking changes.
  • CHANGELOG is updated for new features, bug fixes, or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

- Add test_samples_evaluations.py with a custom preparer and LLM instructions
- Add a UTF-8 encoding fix to sample_executor.py for Windows compatibility (see the sketch after this list)
- Add azure_ai_agent_name to servicePreparer in test_base.py
- Update assets.json with the new recording tag
- Samples covered:
  - sample_model_evaluation.py
  - sample_agent_response_evaluation.py
  - sample_evaluations_builtin_with_dataset_id.py
  - sample_evaluations_builtin_with_inline_data.py
- Add a sanitizer for the eval dataset timestamp (eval-data-YYYY-MM-DD_HHMMSS_UTC)
- Remove sample_evaluations_builtin_with_dataset_id from the tests (it requires a blob storage upload, which does not work in playback)
- All 4 evaluation samples now pass in playback mode
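
The encoding fix matters because `open()` without an explicit `encoding` falls back to the platform default (often cp1252 on Windows), so writing sample output that contains non-ASCII characters raises `UnicodeEncodeError`. A minimal sketch of the idea, with an illustrative file name and log text rather than the actual sample_executor.py code:

```python
# Pinning UTF-8 makes the log writing portable; without it, the platform
# default codec is used and non-ASCII output can fail on Windows.
# "sample_run.log" and the written text are illustrative placeholders.
with open("sample_run.log", "w", encoding="utf-8") as log_file:
    log_file.write("evaluation output may contain non-ASCII text: ✓\n")
```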
Copilot AI left a comment (Contributor)

Pull request overview

This pull request adds recording tests for evaluation samples, enabling automated testing of evaluation-related sample code. It introduces infrastructure for testing evaluation samples that require agent configuration and adds necessary sanitization patterns for evaluation-specific data in test recordings.

Changes:

  • Added a new test class, TestSamplesEvaluations, to run the evaluation samples with recording support (see the sketch after this list)
  • Enhanced the sample executor with UTF-8 encoding for log file writing
  • Extended the test infrastructure with an azure_ai_agent_name parameter for the evaluation samples
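
For context, a minimal sketch of the recorded-test pattern, assuming the devtools_testutils helpers used across azure-sdk-for-python; the environment variable names, fake values, and test body below are assumptions, not the actual test_samples_evaluations.py code:

```python
import functools

from devtools_testutils import (
    AzureRecordedTestCase,
    EnvironmentVariableLoader,
    recorded_by_proxy,
)

# The preparer injects (sanitized) environment variables into each test;
# azure_ai_agent_name is the parameter this pull request adds.
servicePreparer = functools.partial(
    EnvironmentVariableLoader,
    "azure_ai_projects",
    azure_ai_projects_endpoint="https://fake-resource.services.ai.azure.com",  # assumed name/value
    azure_ai_agent_name="fake-agent-name",  # assumed fake value
)

class TestSamplesEvaluations(AzureRecordedTestCase):
    @servicePreparer()
    @recorded_by_proxy
    def test_sample_model_evaluation(self, **kwargs):
        agent_name = kwargs.pop("azure_ai_agent_name")
        # ...execute the sample under the test proxy and assert on its output...
```

In playback mode the preparer supplies the fake values shown above instead of reading real environment variables, which is what lets the tests run without live resources.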

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Summary per file:

| File | Description |
| --- | --- |
| sdk/ai/azure-ai-projects/tests/samples/test_samples_evaluations.py | New test file for the evaluation samples, with custom validation instructions and test configuration for 4 evaluation samples |
| sdk/ai/azure-ai-projects/tests/samples/sample_executor.py | Added explicit UTF-8 encoding to the file-writing operation for better cross-platform compatibility |
| sdk/ai/azure-ai-projects/tests/test_base.py | Added the azure_ai_agent_name parameter to servicePreparer for the evaluation sample tests |
| sdk/ai/azure-ai-projects/tests/conftest.py | Added a regex sanitizer for eval dataset names with timestamps to keep test recordings consistent (sketched after this table) |
| sdk/ai/azure-ai-projects/assets.json | Updated the test recording asset tag to reference the new recordings |
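
A minimal sketch of what the conftest.py sanitizer could look like, assuming the devtools_testutils sanitizer API and deriving the regex from the eval-data-YYYY-MM-DD_HHMMSS_UTC pattern named in the description; the replacement value is an assumption:

```python
import pytest
from devtools_testutils import add_general_regex_sanitizer, test_proxy

@pytest.fixture(scope="session", autouse=True)
def add_sanitizers(test_proxy):
    # Replace the timestamped dataset name (e.g. eval-data-2025-01-15_093000_UTC)
    # with a fixed value so playback matches recordings made at any time.
    add_general_regex_sanitizer(
        regex=r"eval-data-\d{4}-\d{2}-\d{2}_\d{6}_UTC",
        value="eval-data-2024-01-01_000000_UTC",
    )
```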

Added:
- sample_eval_catalog.py
- sample_eval_catalog_code_based_evaluators.py
- sample_eval_catalog_prompt_based_evaluators.py
- sample_evaluation_compare_insight.py
- sample_agent_response_evaluation_with_function_tool.py

Skipped sample_evaluations_builtin_with_inline_data_oai.py: it uses the OpenAI client directly with get_bearer_token_provider, which does not work with mock credentials (see the sketch below).
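
A minimal sketch of the pattern that makes the skipped sample incompatible with playback; the endpoint, scope, and API version below are assumptions. get_bearer_token_provider wraps a credential in a plain callable that the OpenAI client invokes on every request, outside the Azure SDK pipeline, so this credential flow cannot be satisfied by the test framework's mock credentials:

```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# The provider fetches a fresh Entra ID token for each request.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(),
    "https://cognitiveservices.azure.com/.default",  # assumed scope
)

client = AzureOpenAI(
    azure_endpoint="https://fake-resource.openai.azure.com",  # assumed endpoint
    azure_ad_token_provider=token_provider,
    api_version="2024-10-21",  # assumed API version
)
```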