Add GPT-4V sample code and images #66

Open · wants to merge 13 commits into base: main
57 changes: 57 additions & 0 deletions Basic_Samples/GPT-4V/README.md
@@ -0,0 +1,57 @@

# Introduction

This repository contains samples demonstrating how to use GPT-4V for Chat Completions via REST API.

## Installation
Install the Python modules and packages listed in the requirements.txt file with the command below.

```bash
pip install -r requirements.txt
```

### Microsoft Azure Endpoints
To use the REST API with Microsoft Azure endpoints, set GPT-4V_MODEL, OPENAI_API_BASE, OPENAI_API_VERSION, and VISION_API_ENDPOINT in the _config.json_ file.

```json
{
"GPT-4V_MODEL":"<GPT-4V Model Name>",
"OPENAI_API_BASE":"https://<Your Azure Resource Name>.openai.azure.com",
"OPENAI_API_VERSION":"<OpenAI API Version>",

"VISION_API_ENDPOINT": "https://<Your Azure Vision Resource Name>.cognitiveservices.azure.com"
}
```
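As an illustration, a filled-in config might look like the sketch below. The resource names, deployment name, and API version are placeholder assumptions, not values to copy verbatim; check the Azure OpenAI Service REST API reference for the versions your resource supports.

```json
{
    "GPT-4V_MODEL": "my-gpt4v-deployment",
    "OPENAI_API_BASE": "https://my-openai-resource.openai.azure.com",
    "OPENAI_API_VERSION": "2023-12-01-preview",
    "VISION_API_ENDPOINT": "https://my-vision-resource.cognitiveservices.azure.com"
}
```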

### Getting started
- Add "OPENAI_API_KEY" and, optionally, "VISION_API_KEY" as environment variable names, with \<Your API Key Value\> and \<Your VISION Key Value\> as their values.

  You can find the OPENAI_API_KEY and VISION_API_KEY values in the Azure Portal: go to https://portal.azure.com, open your resource, and under "Resource Management" -> "Keys and Endpoints" copy one of the "Keys" values.

Windows:
```cmd
setx OPENAI_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"
setx VISION_API_KEY "REPLACE_WITH_YOUR_KEY_VALUE_HERE"
```

macOS/Linux:
```bash
export OPENAI_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"
export VISION_API_KEY="REPLACE_WITH_YOUR_KEY_VALUE_HERE"
```

- To find your "OPENAI_API_BASE" and "VISION_API_ENDPOINT" values, go to https://portal.azure.com, open your resource, and under "Resource Management" -> "Keys and Endpoints" look for the "Endpoint" value.

Learn more about Azure OpenAI Service REST API [here](https://learn.microsoft.com/en-us/azure/cognitive-services/openai/reference).
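Putting the pieces together, the notebooks in this folder read the endpoint settings from _config.json_ and the key from the environment, then build the chat completions URL. The sketch below mirrors that flow as two small helpers; the function names are illustrative, not part of the samples themselves.

```python
import json
import os


def load_settings(config_path: str) -> dict:
    """Read endpoint settings from the config file; the key comes from the environment."""
    with open(config_path) as config_file:
        config = json.load(config_file)
    return {
        "deployment_name": config["GPT-4V_MODEL"],
        "openai_api_base": config["OPENAI_API_BASE"],
        "openai_api_version": config["OPENAI_API_VERSION"],
        # Keys are kept out of the config file on purpose (see Getting started).
        "openai_api_key": os.getenv("OPENAI_API_KEY"),
    }


def chat_completions_url(base: str, deployment: str, api_version: str) -> str:
    """Build the Azure OpenAI chat completions URL used in the sample notebooks."""
    return (
        f"{base}/openai/deployments/{deployment}"
        f"/chat/completions?api-version={api_version}"
    )
```

The split keeps secrets in environment variables while non-sensitive settings stay in version-controllable JSON.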


## Requirements
- Python 3.8+
- Jupyter Notebook 6.5.2

## Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft
trademarks or logos is subject to and must follow
[Microsoft's Trademark & Brand Guidelines](https://www.microsoft.com/en-us/legal/intellectualproperty/trademarks/usage/general).
Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship.
Any use of third-party trademarks or logos is subject to those third parties' policies.
166 changes: 166 additions & 0 deletions Basic_Samples/GPT-4V/basic_chatcompletions_example_restapi.ipynb
@@ -0,0 +1,166 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "759f9ec0",
"metadata": {},
"source": [
"<h1 align =\"center\"> REST API Basic Samples</h1>\n",
"<hr>\n",
" \n",
"# Chat Completions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4b3d21a",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"import requests\n",
"import base64"
]
},
{
"cell_type": "markdown",
"id": "5b2d4a0f",
"metadata": {},
"source": [
"### Setup Parameters\n",
"\n",
"\n",
"Here we will load the configurations from _config.json_ file to setup deployment_name, openai_api_base, openai_api_key and openai_api_version."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd85fb30",
"metadata": {},
"outputs": [],
"source": [
"# Load config values\n",
"with open(r'config.json') as config_file:\n",
" config_details = json.load(config_file)\n",
" \n",
"# Setting up the deployment name\n",
"deployment_name = config_details['GPT-4V_MODEL']\n",
"\n",
"# The base URL for your Azure OpenAI resource. e.g. \"https://<your resource name>.openai.azure.com\"\n",
"openai_api_base = config_details['OPENAI_API_BASE']\n",
"\n",
"# The API key for your Azure OpenAI resource.\n",
"openai_api_key = os.getenv(\"OPENAI_API_KEY\")\n",
"\n",
"# Currently OPENAI API have the following versions available: 2022-12-01. All versions follow the YYYY-MM-DD date structure.\n",
"openai_api_version = config_details['OPENAI_API_VERSION']"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aef62557",
"metadata": {},
"outputs": [],
"source": [
"\n",
"#Image Description Assistant\n",
"image_file_path = \"../../common/images/ImageDescriptionAssistant.jpg\" # Update with your image path\n",
"sys_message = \"You are an AI assistant that helps people craft a clear and detailed sentence that describes the content depicted in an image.\"\n",
"user_prompt = \"Describe image\"\n",
"\n",
"#Image Tagging Assistant\n",
"\"\"\"\n",
"image_file_path = \"../../common/images/ImageTaggingAssistant.jpg\" \n",
"sys_message = \"Generate a list of descriptive tags for the following image. Analyze the image carefully and produce tags that accurately represent the image. Ensure the tags are relevant.\"\n",
"user_prompt = \"Provide tags for this image.\"\n",
"\"\"\"\n",
"\n",
"#Listing Assistant\n",
"\"\"\"\n",
"image_file_path = \"../../common/images/ListingAssistant.jpg\" \n",
"sys_message = \"You are an AI assistant which generates listings for vacation rentals. Please generate exciting and inviting content for this image but don't talk about content that you cannot see. Follow the format of an attention-grabbing title and provide a description that is 6 sentences long.\"\n",
"user_prompt = \"Generate content.\"\n",
"\"\"\"\n",
"\n",
"# Encode the image in base64\n",
"with open(image_file_path, 'rb') as image_file:\n",
" encoded_image = base64.b64encode(image_file.read()).decode('ascii')\n",
"\n",
"# Construct the API request URL\n",
"api_url = f\"{openai_api_base}/openai/deployments/{deployment_name}/chat/completions?api-version={openai_api_version}\"\n",
"\n",
"# Including the api-key in HTTP headers\n",
"headers = {\n",
" \"Content-Type\": \"application/json\",\n",
" \"api-key\": openai_api_key,\n",
"}\n",
"\n",
"# Payload for the request\n",
"payload = {\n",
" \"messages\": [\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": [\n",
" sys_message\n",
" ]\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" user_prompt, # Pass the prompt\n",
" {\n",
" \"image\": encoded_image #Pass the encoded image\n",
" }\n",
" ]\n",
" }\n",
" ],\n",
" \"temperature\": 0.7,\n",
" \"top_p\": 0.95,\n",
" \"max_tokens\": 800\n",
"}\n",
"\n",
"# Send the request and handle the response\n",
"try:\n",
" response = requests.post(api_url, headers=headers, json=payload)\n",
" response.raise_for_status() # Raise an error for bad HTTP status codes\n",
" response_content = response.json()\n",
" print(response_content['choices'][0]['message']['content']) # Print the content of the response\n",
"except requests.RequestException as e:\n",
" raise SystemExit(f\"Failed to make the request. Error: {e}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b6165c63",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
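The notebook above assembles the request payload inline. As a standalone sketch, the same construction can be factored into a helper; the image bytes below are dummy data, and the message shape mirrors the preview-API format used in the sample (plain strings and an `{"image": ...}` dict inside `content`).

```python
import base64


def build_payload(sys_message: str, user_prompt: str, image_bytes: bytes) -> dict:
    """Assemble the chat completions payload used in the sample notebook."""
    # Images are sent base64-encoded inside the user message content
    encoded_image = base64.b64encode(image_bytes).decode("ascii")
    return {
        "messages": [
            {"role": "system", "content": [sys_message]},
            {"role": "user", "content": [user_prompt, {"image": encoded_image}]},
        ],
        "temperature": 0.7,
        "top_p": 0.95,
        "max_tokens": 800,
    }
```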
7 changes: 7 additions & 0 deletions Basic_Samples/GPT-4V/config.json
@@ -0,0 +1,7 @@
{
"GPT-4V_MODEL":"<GPT-4V Model Name>",
"OPENAI_API_BASE":"https://<Your Azure Resource Name>.openai.azure.com",
"OPENAI_API_VERSION":"<OpenAI API Version>",

"VISION_API_ENDPOINT": "https://<Your Azure Vision Resource Name>.cognitiveservices.azure.com"
}

Comment: We should document where to find this configuration setting. For example, this link has a list of supported versions: https://learn.microsoft.com/en-us/azure/ai-services/openai/reference

Comment: We should also just put a default API version here based on what we know we'll be using at announce. There's no reason to force someone to go figure this out themselves.

Author: We documented this in the README file.
155 changes: 155 additions & 0 deletions Basic_Samples/GPT-4V/enhancement_chatcompletions_example_restapi.ipynb
@@ -0,0 +1,155 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "759f9ec0",
"metadata": {},
"source": [
"<h1 align =\"center\"> REST API Enhanchment Samples</h1>\n",
"<hr>\n",
" \n",
"# Chat Completions"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f4b3d21a",
"metadata": {},
"outputs": [],
"source": [
"import json\n",
"import os\n",
"import requests\n",
"import base64"
]
},
{
"cell_type": "markdown",
"id": "5b2d4a0f",
"metadata": {},
"source": [
"### Setup Parameters\n",
"\n",
"\n",
"Here we will load the configurations from _config.json_ file to setup deployment_name, openai_api_base, openai_api_key and openai_api_version."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd85fb30",
"metadata": {},
"outputs": [],
"source": [
"# Load config values\n",
"with open(r'config.json') as config_file:\n",
" config_details = json.load(config_file)\n",
" \n",
"# Setting up the deployment name\n",
"deployment_name = config_details['GPT-4V_MODEL']\n",
"\n",
"# The base URL for your Azure OpenAI resource. e.g. \"https://<your resource name>.openai.azure.com\"\n",
"openai_api_base = config_details['OPENAI_API_BASE']\n",
"\n",
"# The API key for your Azure OpenAI resource.\n",
"openai_api_key = os.getenv(\"OPENAI_API_KEY\")\n",
Comment: Can we just use the config file for all settings? Using the environment variables just adds one more step to setup.

Author: I agree that simplifying the setup process is important. However, prioritizing the security of sensitive data like the API key is crucial. Keeping it in environment variables rather than in a configuration file offers an additional layer of security.
"\n",
"# Currently OPENAI API have the following versions available: 2022-12-01. All versions follow the YYYY-MM-DD date structure.\n",
"openai_api_version = config_details['OPENAI_API_VERSION']"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "aef62557",
"metadata": {},
"outputs": [],
"source": [
"image_file_path = \"../../common/images/AdobeStock_221666419.jpeg\" # Update with your image path\n",
"sys_message = \"You are an AI assistant that helps people find information.\"\n",
"user_prompt = \"Based on this flight information board, can you provide specifics for my trip to Zurich?\"\n",
"\n",
"# Encode the image in base64\n",
"with open(image_file_path, 'rb') as image_file:\n",
" encoded_image = base64.b64encode(image_file.read()).decode('ascii')\n",
"\n",
"# Construct the API request URL\n",
"api_url = f\"{openai_api_base}/openai/deployments/{deployment_name}/extensions/chat/completions?api-version={openai_api_version}\"\n",
"\n",
"# Including the api-key in HTTP headers\n",
"headers = {\n",
" \"Content-Type\": \"application/json\",\n",
" \"api-key\": openai_api_key,\n",
"}\n",
"\n",
"# Payload for the request\n",
"payload = {\n",
" \"enhancements\": {\n",
" \"ocr\": {\n",
" \"enabled\": True # Enable OCR enhancement\n",
" },\n",
" },\n",
" \"messages\": [\n",
" {\n",
" \"role\": \"system\",\n",
" \"content\": [\n",
" sys_message\n",
" ]\n",
" },\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": [\n",
" user_prompt, # Pass the prompt\n",
" {\n",
" \"image\": encoded_image #Pass the encoded image\n",
" }\n",
" ]\n",
" }\n",
" ],\n",
" \"temperature\": 0.7,\n",
" \"top_p\": 0.95,\n",
" \"max_tokens\": 800\n",
"}\n",
"\n",
"# Send the request and handle the response\n",
"try:\n",
" response = requests.post(api_url, headers=headers, json=payload)\n",
" response.raise_for_status() # Raise an error for bad HTTP status codes\n",
" response_content = response.json()\n",
" print(response_content['choices'][0]['message']['content']) # Print the content of the response\n",
"except requests.RequestException as e:\n",
" raise SystemExit(f\"Failed to make the request. Error: {e}\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b6165c63",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
}
},
"nbformat": 4,
"nbformat_minor": 5
}