---
title: 'Deploy DeepSeek R1 with Azure AI Foundry and Gradio'
description: 'Create a chat interface for a thinking LLM using Azure AI Foundry, DeepSeek, and Gradio.'
pubDate: 'Jan 29 2025'
heroImage: '/cookbook/thinking-llm-hero.png'
tags: ["Azure AI", "Azure AI Foundry", "Model catalog", "Models-as-a-Service", "MaaS", "Generative AI", "Gradio", "Advanced reasoning", "LLM"]
---

In this walkthrough, I'll show you how to create a "thinking LLM" chat interface with thought bubbles using DeepSeek R1, the Azure AI Foundry SDK, and Gradio.

> This post is based on a Jupyter notebook example I created. You can use it alongside this walkthrough. Find it here: [DeepSeek R1 with Azure AI Foundry and Gradio](https://github.com/nicholasdbrady/cookbook/blob/main/examples/deepseek/deepseek-r1-with-azure-ai-foundry-and-gradio.ipynb)

## Table of Contents

- [Introduction](#introduction)
- [DeepSeek R1 on Azure AI Foundry](#deepseek-r1-on-azure-ai-foundry)
- [Benefits of Using DeepSeek R1 on Azure AI Foundry](#benefits-of-using-deepseek-r1-on-azure-ai-foundry)
- [Prerequisites](#prerequisites)
- [Setting Up the ChatCompletionsClient](#step-1-setting-up-the-chatcompletionsclient)
- [Implementing a Streaming Response Function](#step-2-implementing-a-streaming-response-function)
- [Creating the Gradio Interface](#step-3-creating-the-gradio-interface)
- [Conclusion](#conclusion)

## Introduction

[![DeepSeek logo](https://raw.githubusercontent.com/deepseek-ai/DeepSeek-V2/refs/heads/main/figures/logo.svg)](https://github.com/deepseek-ai/DeepSeek-R1)

**DeepSeek R1** has gained widespread attention for its advanced reasoning capabilities, excelling in language processing, scientific problem-solving, and coding. With 671B total parameters, 37B active parameters, and a 128K context length, it pushes the boundaries of AI-driven reasoning ([explore DeepSeek R1 on Azure AI Foundry](https://ai.azure.com/explore/models/DeepSeek-R1/version/1/registry/azureml-deepseek)). Benchmarks highlight its performance against other models on reasoning tasks ([evaluation results](https://github.com/deepseek-ai/DeepSeek-R1/tree/main?tab=readme-ov-file#4-evaluation-results)). Building on prior models, DeepSeek R1 integrates Chain-of-Thought (CoT) reasoning, reinforcement learning (RL), and fine-tuning on curated datasets to achieve state-of-the-art performance. This tutorial walks you through deploying DeepSeek R1 from [Azure AI Foundry's model catalog](https://ai.azure.com/explore/models/) and integrating it with [Gradio](https://www.gradio.app/) to build a real-time streaming chatbot designed specifically for thinking LLMs like **DeepSeek R1**.

### DeepSeek R1 on Azure AI Foundry

On **January 29, 2025**, Microsoft announced that **DeepSeek R1** is now available on **Azure AI Foundry** and **GitHub**, making it part of a growing portfolio of over **1,800 AI models** available for enterprise use. With this integration, businesses can deploy DeepSeek R1 using **serverless APIs**, ensuring seamless scalability, security, and compliance with Microsoft’s responsible AI principles. ([Azure AI Foundry announcement](https://azure.microsoft.com/en-us/blog/deepseek-r1-on-azure-ai-foundry))

![Model catalog](https://github.com/nicholasdbrady/cookbook/blob/main/src/assets/deepseek-azure-foundry.gif?raw=true)

### Benefits of Using DeepSeek R1 on Azure AI Foundry

- **Enterprise-Ready AI:** DeepSeek R1 is available as a trusted, scalable, and secure AI model, backed by Microsoft's infrastructure.
- **Optimized Model Evaluation:** Built-in tools allow developers to benchmark performance and compare outputs across different models.
- **Security and Responsible AI:** DeepSeek R1 has undergone **rigorous safety evaluations**, including automated assessments, security reviews, and **Azure AI Content Safety** integration for filtering potentially harmful content.
- **Flexible Deployment:** Developers can deploy the model via Azure AI Studio, the Azure CLI, ARM templates, or the Python SDK, as sketched below.
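
As one example of the programmatic route, here is a minimal sketch of a serverless deployment using the `azure-ai-ml` SDK (`pip install azure-ai-ml azure-identity`). The subscription, resource group, and project names are hypothetical placeholders; see [Deploy models as serverless APIs](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-serverless) for the authoritative steps:

```python
# Illustrative sketch only -- subscription, resource group, and project names
# below are hypothetical placeholders, not values from this tutorial.
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ServerlessEndpoint
from azure.identity import DefaultAzureCredential

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="my-rg",    # hypothetical resource group
    workspace_name="my-project",    # hypothetical Azure AI project
)

# Point the endpoint at DeepSeek R1 in the azureml-deepseek registry.
endpoint = ServerlessEndpoint(
    name="deepseek-r1",
    model_id="azureml://registries/azureml-deepseek/models/DeepSeek-R1",
)
created = ml_client.serverless_endpoints.begin_create_or_update(endpoint).result()

# Retrieve the scoring URI and key used in the Prerequisites section below.
keys = ml_client.serverless_endpoints.get_keys("deepseek-r1")
print(created.scoring_uri)
```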

By combining **DeepSeek R1**'s robust language understanding with Gradio's interactive capabilities, you can create a powerful chatbot application that processes and responds to user inputs in real time. The rest of this tutorial walks through the necessary steps, from setting up the inference client to building a responsive Gradio interface.

[Return to top](#top)

## Prerequisites

Before we begin, ensure you have the following:

- **Python 3.8+** installed on your system.
- An **Azure AI Foundry** model deployment with an endpoint. If you haven't deployed DeepSeek R1 as a serverless API yet, follow the steps in [Deploy models as serverless APIs](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-serverless).
- The required Python packages installed:
```sh
pip install azure-ai-inference gradio
```
- Environment variables set for your Azure AI credentials:
```sh
export AZURE_INFERENCE_ENDPOINT="https://your-endpoint-name.region.inference.ai.azure.com"
export AZURE_INFERENCE_CREDENTIAL="your-api-key"
```
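
Before going further, it's worth confirming that both variables are actually visible to Python. This quick check uses only the standard library, nothing SDK-specific:

```python
import os

# Fail fast with a clear message if either credential variable is missing.
for var in ("AZURE_INFERENCE_ENDPOINT", "AZURE_INFERENCE_CREDENTIAL"):
    if not os.environ.get(var):
        raise RuntimeError(f"Missing environment variable: {var}")
print("Azure AI endpoint and credential found.")
```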
[Return to top](#top)

## Step 1: Setting Up the ChatCompletionsClient

```python
import os
from typing import Iterator

import gradio as gr
from gradio import ChatMessage
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage, AssistantMessage
from azure.core.credentials import AzureKeyCredential

# Create a client for the serverless DeepSeek R1 endpoint using the
# endpoint URL and API key exported in the Prerequisites step.
client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"])
    # If you're authenticating with Microsoft Entra ID, use DefaultAzureCredential()
    # or other supported credentials instead of AzureKeyCredential.
)
```
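
Before building the UI, you can confirm the client works with a one-off, non-streaming call. This is just a sanity check using the same `client.complete` method the app relies on:

```python
# Sanity check: a single non-streaming completion against the endpoint.
response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="In one sentence, what is a 'thinking' LLM?"),
    ]
)
# Note: R1 may wrap its reasoning in <think>...</think> tags inside the reply;
# Step 2 shows how to surface that reasoning as a thought bubble.
print(response.choices[0].message.content)
```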
[Return to top](#top)

## Step 2: Implementing a Streaming Response Function

```python
def stream_response(user_message: str, messages: list) -> Iterator[list]:
    if not messages:
        messages = []

    # Convert Gradio chat history into Azure AI Inference messages
    azure_messages = [SystemMessage(content="You are a helpful assistant.")]
    for msg in messages:
        if isinstance(msg, ChatMessage):
            azure_msg = UserMessage(content=msg.content) if msg.role == "user" else AssistantMessage(content=msg.content)
        elif isinstance(msg, dict) and "role" in msg and "content" in msg:
            azure_msg = UserMessage(content=msg["content"]) if msg["role"] == "user" else AssistantMessage(content=msg["content"])
        else:
            continue
        azure_messages.append(azure_msg)

    # Ensure only serializable objects are sent to Azure
    azure_messages = [msg.dict() if hasattr(msg, "dict") else msg for msg in azure_messages]

    response = client.complete(messages=azure_messages, stream=True)

    # Buffers for the model's hidden reasoning and its visible answer
    thought_buffer = ""
    response_buffer = ""
    inside_thought = False

    for update in response:
        if update.choices:
            current_chunk = update.choices[0].delta.content
            if not current_chunk:
                # Some stream events carry no content (e.g., role-only deltas); skip them.
                continue

            if "<think>" in current_chunk:
                # R1 wraps its chain-of-thought in <think>...</think> tags;
                # open a collapsible "thinking" bubble in the chat.
                inside_thought = True
                messages.append(ChatMessage(role="assistant", content="", metadata={"title": "🧠 R1 Thinking...", "status": "pending"}))
                yield messages
                continue
            elif "</think>" in current_chunk:
                inside_thought = False
                messages[-1] = ChatMessage(
                    role="assistant",
                    content=thought_buffer.strip(),
                    metadata={"title": "🧠 R1 Thinking...", "status": "done"}
                )
                yield messages  # Yield the completed thought bubble immediately
                thought_buffer = ""
                continue

            if inside_thought:
                thought_buffer += current_chunk
                messages[-1] = ChatMessage(
                    role="assistant",
                    content=thought_buffer,
                    metadata={"title": "🧠 R1 Thinking...", "status": "pending"}
                )
                yield messages  # Update the thought bubble as it streams
            else:
                response_buffer += current_chunk
                last = messages[-1] if messages else None
                if (
                    isinstance(last, ChatMessage)
                    and last.role == "assistant"
                    and (not last.metadata or "title" not in last.metadata)
                ):
                    # Keep streaming into the existing answer message
                    messages[-1] = ChatMessage(role="assistant", content=response_buffer)
                else:
                    # First answer token after the thought bubble: start a new message
                    messages.append(ChatMessage(role="assistant", content=response_buffer))
                yield messages
```
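
To see what this tag handling does without calling the service, here is a tiny offline sketch with hand-made chunks. It also makes one simplifying assumption of the parser explicit: each `<think>` or `</think>` tag is expected to arrive intact within a single chunk.

```python
# Offline illustration of the <think> tag handling above.
# These chunks are hand-made; real chunk boundaries come from the streaming API.
chunks = ["<think>", "The user greeted me. ", "A short reply is fine.", "</think>", "Hello! ", "How can I help?"]

inside_thought = False
thought, answer = "", ""
for chunk in chunks:
    if "<think>" in chunk:
        inside_thought = True   # everything until </think> is reasoning
        continue
    if "</think>" in chunk:
        inside_thought = False  # what follows is the visible answer
        continue
    if inside_thought:
        thought += chunk
    else:
        answer += chunk

print("Thought bubble:", thought)  # -> The user greeted me. A short reply is fine.
print("Answer:", answer)           # -> Hello! How can I help?
```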
[Return to top](#top)

## Step 3: Creating the Gradio Interface

```python
with gr.Blocks(title="DeepSeek R1 with Azure AI Foundry", fill_height=True, fill_width=True) as demo:
    title = gr.Markdown("## DeepSeek R1 with Azure AI Foundry 🤭")
    chatbot = gr.Chatbot(
        type="messages",  # ChatMessage format, so metadata-titled thought bubbles render
        label="DeepSeek-R1",
        render_markdown=True,
        show_label=False,
        scale=1,
    )

    input_box = gr.Textbox(
        lines=1,
        submit_btn=True,
        show_label=False,
    )

    msg_store = gr.State("")

    # On submit: stash the message in msg_store and clear the textbox...
    input_box.submit(
        lambda msg: (msg, msg, ""),
        inputs=[input_box],
        outputs=[msg_store, input_box, input_box],
        queue=False,
    )
    # ...then append the user turn to the chat and stream the model's response.
    input_box.submit(
        lambda msg, chat: (msg, chat + [ChatMessage(role="user", content=msg)]),
        inputs=[msg_store, chatbot],
        outputs=[msg_store, chatbot],
        queue=False,
    ).then(
        stream_response, inputs=[msg_store, chatbot], outputs=chatbot
    )

demo.launch()
```
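
`demo.launch()` serves the app locally (Gradio defaults to port 7860). A couple of standard launch variations, should you need them:

```python
# Generate a temporary public share link instead of serving only on localhost:
demo.launch(share=True)

# Or bind explicitly, e.g. when running inside a container:
# demo.launch(server_name="0.0.0.0", server_port=7860)
```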

## Conclusion

In this tutorial, we built a **streaming chatbot** using **DeepSeek R1** on **Azure AI Foundry** with **Gradio**.

![DeepSeek R1 chatbot demo](https://github.com/nicholasdbrady/cookbook/blob/main/src/assets/deepseek-demo.gif?raw=true)

We covered:

- Setting up the DeepSeek R1 model on Azure.
- Creating and handling chat completion requests.
- Implementing real-time streaming responses.
- Deploying the chatbot using Gradio.

Get started today by visiting **[Azure AI Foundry](https://azure.microsoft.com/en-us/products/ai-services/ai-foundry)** and **[DeepSeek on GitHub](https://github.com/deepseek-ai/DeepSeek-R1)**.

Happy coding! 🚀
