Commit a7f9924

Jarno Hakulinen
authored
Merge pull request #54 from jhakulin/main
Support for o1 model for ChatAssistant and experimental Realtime support
2 parents 2fd6ccf + a103b78 commit a7f9924

50 files changed

Lines changed: 5056 additions & 966 deletions

README.md

Lines changed: 57 additions & 37 deletions
@@ -7,55 +7,89 @@
77
![CrossPlatform](https://img.shields.io/badge/cross-platform-blue)
88
</div>
99
<div align="center">
10-
⚡Develop stateful copilot applications powered by Azure OpenAI Assistants at lightning speed⚡
10+
⚡Develop AI agents powered by Azure OpenAI Assistants, Chat Completion and Realtime APIs at lightning speed⚡
1111
</div>
1212
<br>
1313

14-
**Azure AI Assistants tool** is an experimental Python application and middleware designed to simplify the development, experimentation, testing, and debugging of Assistants created with **Azure OpenAI Assistants (Preview)** _(see below)_. Use this powerful, easy-to-setup low-code / no code playground tool to quickly experiment and build AI Assistants within your application with Azure OpenAI Assistants API.
14+
## Table of Contents
15+
16+
- 🤖🛠️ Azure AI Assistants Tool
17+
- 🆕 Latest News
18+
- 🧱 What is Assistants from Azure OpenAI Service?
19+
- 🚀 How does this Tool help?
20+
- 🔊🎤 OpenAI Realtime Support (Experimental)
21+
- ✨ Quick Start: Getting Started with the Tool
22+
- 📖 License
23+
- Contributing
24+
- Code of Conduct
25+
- Getting Help
26+
27+
## 🤖🛠️ Azure AI Assistants tool
28+
Azure AI Assistants tool is an experimental Python application and middleware designed to simplify the development, experimentation, testing, and debugging of AI agents created with Azure OpenAI Assistants, Chat Completion and/or Realtime API based technologies. Use this powerful, easy-to-setup low-code playground tool to quickly experiment and build AI agents within your application.
1529

1630
> [!IMPORTANT]
17-
> **The Azure AI Assistant Tool is currently in Alpha**. This early stage of development means the project is actively evolving, with significant updates and improvements expected. Users should anticipate changes as we work towards refining features, enhancing functionality, and expanding capabilities. We welcome feedback and contributions during this phase to help shape the future of the tool.
31+
> **The Azure AI Assistant Tool is experimental**, created to support your product ideation and experimentation using AI agents. As the tool evolves, expect significant updates and improvements. We welcome feedback and contributions to help shape its future.
32+
33+
## 🆕 Latest News
34+
35+
- **January 20, 2025:** Released version 0.5.1 of the tool. It adds **o1 Model Support**, which lets you use o1 models with ChatAssistant (with limited completion settings), and **OpenAI Realtime Support**, which provides real-time audio interaction. Azure Cognitive Services speech input and output has been removed from the tool; the Azure Speech SDK is still used within OpenAI Realtime for keyword-based detection. For more detail, see the OpenAI Realtime Support section below.
1836

1937

2038
## 🧱 What is Assistants from Azure OpenAI service?
2139

22-
🌟**Assistants**, a new API from Azure OpenAI Service, is a stateful evolution of the Chat Completions API. Assistants makes it easier for developers to create applications with sophisticated copilot-like experiences in their applications and enable developer access to powerful tools like Code Interpreter and Retrieval. Assistants is built on the same capabilities that power OpenAI’s GPT product and offers unparalleled flexibility for creating a wide range of copilot-like applications. Copilots created with Assistants can sift through data, suggest solutions, and automate tasks and use cases span a wide range: AI-powered product recommender, sales analyst app, coding assistant, employee Q&A chatbot, and more.
40+
🌟**Assistants**, an API from Azure OpenAI Service, is a stateful evolution of the Chat Completions API. Assistants makes it easier for developers to build applications with sophisticated copilot-like experiences and gives them access to powerful tools like Code Interpreter and File Search. Assistants is built on the same capabilities that power OpenAI’s GPT product and offers unparalleled flexibility for creating a wide range of copilot-like applications. Copilots created with Assistants can sift through data, suggest solutions, and automate tasks. Use cases span a wide range: AI-powered product recommender, sales analyst app, coding assistant, employee Q&A chatbot, and more.
2341

2442
**Features** include:
2543

26-
💬 Inbuilt thread and memory management <br>
27-
📊 Advanced Data Analysis, create data visualizations and solving complex code and math problems with **Code Interpreter**<br>
28-
🚀 Build your own tools or call external tools and APIs with **Function Calling**<br>
29-
📚 Retrieval Augmented Generation with **File Search** tool (coming soon to Azure OpenAI Assistants)<br>
30-
🎤📢 Speech transcription and synthesis using Azure CognitiveServices Speech SDK<br>
31-
📤 Exporting the assistant configuration into simple CLI application
44+
- Inbuilt thread and memory management <br>
45+
- Advanced data analysis: create data visualizations and solve complex code and math problems with **Code Interpreter**<br>
46+
- Retrieval Augmented Generation with **File Search** tool<br>
47+
- Build your own tools or call external tools and APIs with **Function Calling**<br>
3248

3349
**Learn more** about Assistants on Azure OpenAI Service:
3450

35-
📹 Watch a [short video](https://www.youtube.com/watch?v=CMXtAe5DhXc&embeds_referring_euri=https%3A%2F%2Ftechcommunity.microsoft.com%2F&source_ve_path=OTY3MTQ&feature=emb_imp_woyt) about Azure OpenAI Assistants
36-
📖 Read the [launch announcement](https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/azure-openai-service-announces-assistants-api-new-models-for/ba-p/4049940)
37-
📌 Get familiar with the [Assistants API Quickstart](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/assistant)
51+
- Watch a [short video](https://www.youtube.com/watch?v=CMXtAe5DhXc&embeds_referring_euri=https%3A%2F%2Ftechcommunity.microsoft.com%2F&source_ve_path=OTY3MTQ&feature=emb_imp_woyt) about Azure OpenAI Assistants
52+
- Read the [launch announcement](https://techcommunity.microsoft.com/t5/ai-azure-ai-services-blog/azure-openai-service-announces-assistants-api-new-models-for/ba-p/4049940)
53+
- Get familiar with the [Assistants API Quickstart](https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/assistant)
54+
55+
56+
## 🚀 How Does This Tool Help?
57+
58+
- **Enable Rapid AI Agent Prototyping:** Rapidly create AI agent prototypes using OpenAI's Assistants, Chat Completion, and Realtime APIs. This includes user-friendly configuration of different agents, built-in system functions, specific tool configurations, and LLM configurations.
59+
60+
- **Enhance Developer Productivity:** Streamline the agent development process through built-in middleware libraries and tools, utilizing tools in prompt engineering to automate your coding tasks and integrate AI capabilities into your copilot applications more effectively.
61+
3862

63+
## 🔊🎤 OpenAI Realtime Support (Experimental)
3964

40-
## 🚀 How does this Tool help?
65+
This section covers the Realtime capabilities for AI agent prototyping with OpenAI's Realtime APIs, focusing on speech and text input/output through real-time WebSocket communication.
4166

42-
✔️ **Enhance Developer Productivity:** Streamline the assistant development process with Azure OpenAI Assistans through built-in middleware libraries and tools that making it easy to integrate AI capabilities into your copilot applications
67+
Please note that these capabilities are offered as an experimental feature. They are intended primarily for exploration, demos, or proof-of-concept usage.
68+
We do not recommend using these features in production or business-critical applications until further notice.
4369

44-
✔️ **Enable rapid prototyping:** Create amazing demos with AOAI Assistants and develop end-to-end assistant solutions with a robust set of features, including built-in system functions, dynamic generation of user functions specification and implementation, assistant task creation and scheduling, and much more.
70+
### Key Features
4571

46-
✔️**Optimize your copilot development workflow:** Get a reliable and scalable framework to test new Copilot use cases and dynamic AI applications with Assistants API without the need to build out manual tooling and configurations
72+
- **Real-time Audio Interaction**: Use the Realtime API with speech input, using the predefined, integrated keyword `Computer` to trigger the conversation.
73+
- Using a keyword can help optimize the cost and reliability of your application. To create your own keywords, see [Creating the Custom Keyword](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/custom-keyword-basics?pivots=programming-language-python). Currently, only one keyword is supported.
74+
- **Real-time Text Interaction**: Use the Realtime API with text input. The agent can respond back with audio or text.
75+
- **Local Voice Activity Detection**: Manage audio data efficiently by detecting speech locally with voice activity detection.
76+
- **Function Calling**: Customize the realtime agent with your own functions, which run asynchronously in the background.
77+
- **Configurable AI Options**: Fine-tune realtime agent responses and behaviors with different options in Realtime API.
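
To give a feel for what local voice activity detection does, here is a minimal sketch that uses a plain energy threshold. It is a stand-in for illustration only: it is not the Silero VAD model referenced in this commit, and the names `frame_rms` and `detect_speech`, the threshold, and the minimum segment length are all hypothetical.

```python
import math


def frame_rms(samples):
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def detect_speech(frames, threshold=500.0, min_frames=3):
    """Return (start, end) frame-index ranges whose RMS stays at or above threshold.

    Hypothetical energy-based detector; threshold and min_frames are
    illustrative values, not settings taken from the tool.
    """
    segments = []
    start = None
    count = 0
    for i, frame in enumerate(frames):
        count = i + 1
        active = frame_rms(frame) >= threshold
        if active and start is None:
            start = i  # a candidate speech segment begins here
        elif not active and start is not None:
            if i - start >= min_frames:
                segments.append((start, i))  # end index is exclusive
            start = None
    # Close a segment that is still open when the audio ends.
    if start is not None and count - start >= min_frames:
        segments.append((start, count))
    return segments
```

Only the frames inside the returned ranges would need to be forwarded to a remote service, which is the cost-saving idea behind detecting speech locally.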
4778

79+
### Demo Video
4880

81+
Check out the demo video to see the OpenAI Realtime Support in action!
4982

50-
## 💥 Highlights
83+
https://github.com/user-attachments/assets/b0c80b34-b825-4442-a80c-93f314909a92
5184

52-
- **Easy Configuration**: Set up your assistant with the model, custom instructions, files, and tools
53-
- **Tool Integration**: Incorporate knowledge retrieval, code interpreters, and built-in system and dynamic user functions to enhance assistant skills and capabilities.
54-
- **Dynamic User Functions**: Quickly create and apply user-defined functions to assistants.
55-
- **Task Management**: Efficiently manage and schedule tasks, including batch and multi-step operations, for parallel execution.
85+
### Resources
5686

87+
- [Realtime AI GitHub Repository](https://github.com/jhakulin/realtime-ai)
88+
- [OpenAI Realtime WebSocket API Documentation](https://platform.openai.com/docs/guides/realtime)
89+
- [Azure Speech Services Documentation](https://azure.microsoft.com/en-us/services/cognitive-services/speech-services/)
5790

58-
## ✨ Quick Start
91+
92+
## ✨ Quick Start: Getting Started with the Tool
5993

6094
### Step 1: Complete Azure prerequisites
6195

@@ -143,20 +177,6 @@ export AZURE_OPENAI_API_VERSION="Azure OpenAI version"
143177
export OPENAI_API_KEY="Your OpenAI Key"
144178
```
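
Before launching, it can help to confirm that the variables from the export commands above are visible to Python. The following is a minimal sketch; the helper name `missing_env_vars` is hypothetical, and the list of names to check is an assumption based only on the two variables visible in this hunk (which ones you actually need depends on whether you use Azure OpenAI or OpenAI).

```python
import os

# Names taken from the export commands shown in this hunk; adjust as needed.
VARS_TO_CHECK = ("AZURE_OPENAI_API_VERSION", "OPENAI_API_KEY")


def missing_env_vars(names=VARS_TO_CHECK, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All checked environment variables are set.")
```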
145179

146-
2. Set Cognitive Services Speech key (this is optional and if you want to use speech input & output).
147-
148-
**Windows:**
149-
```
150-
setx AZURE_AI_SPEECH_KEY "Your Speech Key"
151-
setx AZURE_AI_SPEECH_REGION "Your Speech Region"
152-
```
153-
154-
**Linux/Mac**
155-
```
156-
export AZURE_AI_SPEECH_KEY="Your Speech Key"
157-
export AZURE_AI_SPEECH_REGION="Your Speech Region"
158-
```
159-
160180
### Step 7: Launch the application
161181

162182
#### ⌨️ Command Line (CLI)

THIRD_PARTY_LICENSES.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# Third-Party Components and Licenses
2+
3+
This file provides information regarding third-party components included in this project (or referenced by it) and their respective licenses. Please review these licenses to ensure compliance.
4+
5+
---
6+
7+
## 1. Silero Voice Activity Detector (VAD) Model
8+
9+
- **Repository:**
10+
[https://github.com/snakers4/silero-vad](https://github.com/snakers4/silero-vad)
11+
12+
- **License:**
13+
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
14+
15+
- **Full License Text:**
16+
The full text of the CC BY-NC-SA 4.0 license can be accessed here:
17+
[https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode](https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode)
18+
19+
**NOTE:** The Silero VAD model is not bundled directly in this repository’s wheel by default. If you opt to download and use it, you must comply with the CC BY-NC-SA 4.0 license—particularly regarding noncommercial use and share-alike obligations. Our code remains licensed under the MIT License, found in `LICENSE.md`.

assets/RealtimeAssistant.mp4

39.5 MB
Binary file not shown.

assets/kws.table

4.74 MB
Binary file not shown.

assets/silero_vad.onnx

1.72 MB
Binary file not shown.

config/SpeechTranscriptionSummarizer_assistant_config.yaml

Lines changed: 0 additions & 42 deletions
This file was deleted.

config/o1_assistant_assistant_config.yaml

Lines changed: 0 additions & 71 deletions
This file was deleted.
