Releases · Azure-Samples/azure-search-openai-demo

23 May 17:06

2025-05-23

1b9885c

2025-05-23: Optional feature for agentic retrieval from Azure AI Search Latest

Latest

This release includes an exciting new option to turn on an agentic retrieval API from Azure AI Search (currently in public preview).
Read the docs about it here:
https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/agentic_retrieval.md

You can also watch this talk from @mattgotteiner and @pamelafox at Microsoft Build 2025 about agentic retrieval:
https://build.microsoft.com/en-US/sessions/BRK142

Please share your feedback in either the issue tracker or discussions here. Since the retrieval API is in public preview, this is a great time to give feedback to the AI Search team.

What's Changed

Explicitly activate the uv environment in CI by @pamelafox in #2534
Updates the baseline evals with embedding 3 large, renames other folders for clarity by @pamelafox in #2533
Add support for agentic retrieval by @mattgotteiner in #2537
Remove locust from requirements-dev.txt by @pamelafox in #2539
Fix UI and answer gen issues by @mattgotteiner in #2541

Full Changelog: 2025-05-08...2025-05-23

Contributors

pamelafox and mattgotteiner

Assets 2

09 May 06:44

pamelafox

2025-05-08

faf0d46

2025-05-08: Default to text-embedding-3-large with compression, GlobalStandard SKU

This release upgrades the infrastructure and code to default to the text-embedding-3-large model from OpenAI. The model has a maximum dimensions of 3072, but we are using BinaryQuantizationCompression and truncating the dimensions to 1024, with oversampling and rescoring enabled. That means the embeddings will be stored efficiently, but search quality should remain high.
Learn more about compression from this RAG time episode or Azure AI Search documentation.

If you are already using the repository and don't wish to use the new embedding model, you can continue to use text-embedding-ada-002. You may need to set azd environment variables if they aren't already set, see the embedding models customization guide. If you want to switch over to the new embedding model, you will either need to re-ingest your data from scratch in a new index, or you will need to add a new field for the new model and re-generate embeddings for just that field. The code now has a variable for the embedding column field, so it should be possible to have a search index with fields for two different embedding models.

As part of this change, all model deploments now default to the GlobalStandard SKU. We made that change since it is easier to find regions in common across the many models used by this repository when using the GlobalStandard SKU. However, if you can't use that SKU for whatever reason, you can still customize the SKU using the parameters described in the documentation.

Please let us know in the issue tracker if you encounter any issues with the new default embedding model configuration.

What's Changed

Upgrade syntax to Python 3.9 by @tonybaloney in #2484
Remove outdated docs by @pamelafox in #2492
Use ENFORCE_ACCESS_CONTROL to decide whether to make acls by @pamelafox in #2494
Bump idna from 3.8 to 3.10 by @dependabot in #2464
Bump vite from 5.4.14 to 5.4.18 in /app/frontend by @dependabot in #2486
Bump types-html5lib from 1.1.11.20240806 to 1.1.11.20241018 by @dependabot in #2462
Bump msal-extensions from 1.2.0 to 1.3.1 by @dependabot in #2463
Update reasoning docs to include API version by @pamelafox in #2499
Bump @babel/runtime from 7.25.6 to 7.27.0 in /app/frontend by @dependabot in #2497
Upgrade Bicep versions of resources by @pamelafox in #2500
Add missing output for reasoning effort, updated evals including o3-mini by @pamelafox in #2501
Resolve datetime deprecation warnings by @emmanuel-ferdman in #2502
Upgrade to text-embedding-3-large model as default, with vector storage optimizations by @pamelafox in #2470
Update evals requirements by @pamelafox in #2528
Raise minimum node version by @pamelafox in #2519
Add migration script for Azure Cosmos DB, old container to new container by @pamelafox in #2442
Bump astral-sh/setup-uv from 5 to 6 in the github-actions group by @dependabot in #2512

New Contributors

@emmanuel-ferdman made their first contribution in #2502

Full Changelog: 2025-04-02...2025-05-08

Contributors

pamelafox, tonybaloney, and 2 other contributors

Assets 2

03 Apr 02:40

pamelafox

2025-04-02

56294c9

2025-04-02: Support for reasoning models and token usage display

You can now optionally use a reasoning model (o1 or o3-mini) for all chat completion requests, following the reasoning guide.

When using a reasoning model, you can select the reasoning effort (low/medium/high):

For all models, you can now see token usage in the "Thought process" tab:

Reasoning models incur more latency, due to the thinking process, so it is an option for developers to try, but not necessarily what you want to use for most RAG domains.

This PR also includes several fixes for performance, Windows support, and deployment.

What's Changed

Add quotes to azd env set by @mattgotteiner in #2413
Upgrade ms graph SDK packages to remove pendulum dependency by @pamelafox in #2454
Reduce list to only the available ones for gpt-4o-mini/Standard by @pamelafox in #2459
Add support for reasoning models and token usage display by @mattgotteiner in #2448
Upgrade prompty by @pamelafox in #2475

Full Changelog: 2025-03-26...2025-04-02

Contributors

pamelafox and mattgotteiner

Assets 2

26 Mar 22:43

pamelafox

2025-03-26

cb5149d

2025-03-26: Removal of conversation truncation logic

Previously, we had logic that would truncate conversation history by counting the tokens (with tiktoken) and only keeping the messages that fit inside the context window. Now that we are using a model with a higher context window (128K) and most models have that high limit, we have removed that truncation logic, so all conversations will be sent in full to the model.
See the pull request for more reasoning behind the decision.

## What's Changed

Remove token-counting library for conversation history truncation by @pamelafox in #2449

Full Changelog: 2025-03-25...2025-03-26

Contributors

pamelafox

Assets 2

24 Mar 23:38

pamelafox

2025-03-25

236b592

2025-03-25: Chat completion model is gpt-4o-mini by default

The infrastructure for this project was previously deploying a gpt-35-turbo model. We have since upgraded to the more recent gpt-4o-mini model, which has a much higher context window (128K) and cheaper per-token costs.
In terms of performance, it gives similarly accurate responses, but it does tend to produce more verbose responses. You can see the comparisons on the sample data in the evals folder, and you can read my blog post summarizing the differences. You may want to adjust the prompt to generate shorter results if you find the new answers to be too verbose.

For developers with existing deployments, it will continue to use gpt-35-turbo. You can follow the steps in the docs to use gpt-4o-mini or other models.

## What's Changed

Port to gpt-4o-mini as default by @pamelafox in #2443

Full Changelog: 2025-03-21...2025-03-25

Contributors

pamelafox

Assets 2

21 Mar 22:24

pamelafox

2025-03-21

88f987e

Container apps deployment now allows scaling to zero

To lower costs for developers experimenting, we've adjusted the scaling rules for the container apps deployment. See the productionizing guide for tips of what to change if you're preparing code based on this repository for production:
https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/productionizing.md#azure-container-apps

What's Changed

Adjust container apps to scale to zero by @pamelafox in #2440
Bump dompurify from 3.2.0 to 3.2.4 in /app/frontend by @dependabot in #2363
Bump react-i18next from 15.1.1 to 15.4.1 in /app/frontend by @dependabot in #2376

Full Changelog: 2025-03-19...2025-03-21

Contributors

pamelafox and dependabot

Assets 2

19 Mar 23:44

pamelafox

2025-03-19

62f8b58

2025-03-19: Query rewriting from Azure AI Search

This release adds a new optional feature, the query rewriting option from Azure AI Search. This is distinct from the already existing query rewriting step in our RAG flows, which incorporates conversation history. The query rewriting from Azure AI Search focuses on expanding the query to semantically similar queries that can improve retrieval.

Enable the feature following the documentation:
https://github.com/Azure-Samples/azure-search-openai-demo/blob/main/docs/deploy_features.md#enabling-query-rewriting

What's Changed

Add auth-related azd env variable checks and improve docs by @pamelafox in #2386
Upgrade to latest GA API Version by @pamelafox in #2334
Upgrade Ubuntu runner for tests in Github Workflow to latest by @egor-yudkin in #2428
Bump jinja2 from 3.1.5 to 3.1.6 in /app/backend by @dependabot in #2435
Add query rewriting option by @mattgotteiner in #2437

New Contributors

@egor-yudkin made their first contribution in #2428

Full Changelog: 2025-02-20...2025-03-19

Contributors

pamelafox, egor-yudkin, and 2 other contributors

Assets 2

20 Feb 19:41

pamelafox

2025-02-20

31ea846

2025-02-20: Safety evaluations

This project now includes optional AI Safety evaluations, using an Azure AI Project and the Azure Azure AI evaluation SDK.
See documentation for instructons on running the evaluations.

What's Changed

Upgrading openai and removing numpy dependency by @pamelafox in #2362
Bump Azure/setup-azd from 2.0.0 to 2.1.0 in the github-actions group by @dependabot in #2366
AI Safety evaluations (with AI Project provisioning) by @pamelafox in #2370

Full Changelog: 2025-02-13...2025-02-20

Contributors

pamelafox and dependabot

Assets 2

14 Feb 06:57

pamelafox

2025-02-13

efbf397

2025-02-13: Italian localization

The UI is now available in Italian, so the text will display in Italian if the user's browser is configured accordingly, or if the app has the language picker enabled and the user picks italian.

What's Changed

Bump cryptography from 44.0.0 to 44.0.1 in /app/backend by @dependabot in #2354
Improve locust test script by @tonybaloney in #2357
Fix screenshot for Monitoring doc by @pamelafox in #2355
Added support for italian language by @ivanvaccarics in #2356

New Contributors

@ivanvaccarics made their first contribution in #2356

Full Changelog: 2025-02-11...2025-02-13

Contributors

pamelafox, tonybaloney, and 2 other contributors

Assets 2

11 Feb 08:19

pamelafox

2025-02-11

e873ba9

2025-02-11: Evaluation scripts and workflow

For a long time, we've directed developers to follow the steps in ai-rag-chat-evaluator to run evaluations on this app. To make it easier, we've now integrated evaluation directly into the repository, both as CLI scripts and GitHub Actions workflow.

Learn more from the evaluation guide or watch this video about evaluation.

What's Changed

Make it easy to run evaluation directly from this repo by @pamelafox in #2233
Use uv managed python in GHA workflows by @eifinger in #2342
Evaluation workflow for GitHub Actions by @pamelafox in #2350

New Contributors

@eifinger made their first contribution in #2342

Full Changelog: 2025-02-07...2025-02-11

Contributors

pamelafox and eifinger

Assets 2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

What's Changed

Contributors

Uh oh!

What's Changed

New Contributors

Contributors

Uh oh!

What's Changed

Contributors

Uh oh!

Contributors

Uh oh!

Contributors

Uh oh!

What's Changed

Contributors

Uh oh!

What's Changed

New Contributors

Contributors

Uh oh!

What's Changed

Contributors

Uh oh!

What's Changed

New Contributors

Contributors

Uh oh!

What's Changed

New Contributors

Contributors

Uh oh!

Releases: Azure-Samples/azure-search-openai-demo

2025-05-23: Optional feature for agentic retrieval from Azure AI Search

What's Changed

Contributors

Uh oh!

2025-05-08: Default to text-embedding-3-large with compression, GlobalStandard SKU

What's Changed

New Contributors

Contributors

Uh oh!

2025-04-02: Support for reasoning models and token usage display

What's Changed

Contributors

Uh oh!

2025-03-26: Removal of conversation truncation logic

Contributors

Uh oh!

2025-03-25: Chat completion model is gpt-4o-mini by default

Contributors

Uh oh!

Container apps deployment now allows scaling to zero

What's Changed

Contributors

Uh oh!

2025-03-19: Query rewriting from Azure AI Search

What's Changed

New Contributors

Contributors

Uh oh!

2025-02-20: Safety evaluations

What's Changed

Contributors

Uh oh!

2025-02-13: Italian localization

What's Changed

New Contributors

Contributors

Uh oh!

2025-02-11: Evaluation scripts and workflow

What's Changed

New Contributors

Contributors

Uh oh!