The transcription data prep scripts download YouTube video transcripts and prepare them for use with the Semantic Search with OpenAI Embeddings and Functions sample.
The transcription data prep scripts have been tested on the latest releases of Windows 11, macOS Ventura, and Ubuntu 22.04 (and above).
Important
We recommend updating the Azure CLI to the latest version to ensure compatibility with OpenAI. See Documentation.
- Create a resource group
Note
For these instructions we use a resource group named "semantic-video-search" in East US. You can change the name of the resource group, but if you change the location of the resources, check the model availability table.
az group create --name semantic-video-search --location eastus
- Create an Azure OpenAI Service resource.
az cognitiveservices account create --name semantic-video-openai --resource-group semantic-video-search \
--location eastus --kind OpenAI --sku s0
- Get the endpoint and keys for use in this application
az cognitiveservices account show --name semantic-video-openai \
--resource-group semantic-video-search | jq -r .properties.endpoint
az cognitiveservices account keys list --name semantic-video-openai \
--resource-group semantic-video-search | jq -r .key1
- Deploy the following models:
  - text-embedding-ada-002 version 2 or greater, named text-embedding-ada-002
  - gpt-35-turbo version 0613 or greater, named gpt-35-turbo
az cognitiveservices account deployment create \
--name semantic-video-openai \
--resource-group semantic-video-search \
--deployment-name text-embedding-ada-002 \
--model-name text-embedding-ada-002 \
--model-version "2" \
--model-format OpenAI \
--scale-settings-scale-type "Standard"
az cognitiveservices account deployment create \
--name semantic-video-openai \
--resource-group semantic-video-search \
--deployment-name gpt-35-turbo \
--model-name gpt-35-turbo \
--model-version "0613" \
--model-format OpenAI \
--sku-capacity 100 \
--sku-name "Standard"
Required software:
- Python 3.9 or greater
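The version requirement above can be checked before going further. The following is a minimal sketch, not part of the sample itself; the helper name `python_is_supported` is hypothetical and only illustrates comparing the running interpreter against the stated minimum.

```python
import sys

# Minimum version taken from the requirement stated above.
MIN_PYTHON = (3, 9)


def python_is_supported(version=None, minimum=MIN_PYTHON):
    """Return True when `version` (major, minor, ...) meets the minimum.

    Hypothetical helper: defaults to the interpreter running this script.
    """
    if version is None:
        version = sys.version_info
    # Compare only (major, minor) so patch levels do not matter.
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if python_is_supported():
        print(f"Python {current} is supported.")
    else:
        print(f"Python {current} is too old; 3.9 or greater is required.")
```

Run it with the same interpreter you plan to use for the virtual environment, since `python` and `python3` may point to different installations.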
The following environment variables are required to run the YouTube transcription data prep scripts.
On Windows:
We recommend adding the variables to your user environment variables.
Windows Start > Edit the system environment variables > Environment Variables > User variables for [USER] > New.
AZURE_OPENAI_API_KEY <your Azure OpenAI Service API key>
AZURE_OPENAI_ENDPOINT <your Azure OpenAI Service endpoint>
AZURE_OPENAI_MODEL_DEPLOYMENT_NAME <your Azure OpenAI Service model deployment name>
GOOGLE_DEVELOPER_API_KEY <your Google developer API key>
On macOS and Linux:
We recommend adding the following exports to your ~/.bashrc or ~/.zshrc file.
export AZURE_OPENAI_API_KEY=<your Azure OpenAI Service API key>
export AZURE_OPENAI_ENDPOINT=<your Azure OpenAI Service endpoint>
export AZURE_OPENAI_MODEL_DEPLOYMENT_NAME=<your Azure OpenAI Service model deployment name>
export GOOGLE_DEVELOPER_API_KEY=<your Google developer API key>
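A quick way to confirm the four variables are visible to Python is to look them up in the process environment. This is a minimal sketch under the assumption that the variable names are exactly those listed above; the helper `missing_variables` is hypothetical and is not part of the data prep scripts.

```python
import os

# Names copied from the environment variable list above.
REQUIRED_VARIABLES = (
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_MODEL_DEPLOYMENT_NAME",
    "GOOGLE_DEVELOPER_API_KEY",
)


def missing_variables(environ=None, required=REQUIRED_VARIABLES):
    """Return the required names that are unset or blank in `environ`.

    Hypothetical helper: `environ` defaults to the current process environment.
    """
    if environ is None:
        environ = os.environ
    return [name for name in required if not environ.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Open a new terminal before running the check, because variables added through the Windows dialog or a shell profile are only picked up by newly started sessions.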
- Install the git client if it is not already installed.
- From a Terminal window, clone the sample to your preferred repo folder.
git clone https://github.com/gloveboxes/semanic-search-openai-embeddings-functions.git
- Navigate to the data_prep folder.
cd semanic-search-openai-embeddings-functions/src/data_prep
- Create a Python virtual environment.
On Windows:
python -m venv .venv
On macOS and Linux:
python3 -m venv .venv
- Activate the Python virtual environment.
On Windows:
.venv\Scripts\activate
On macOS and Linux:
source .venv/bin/activate
- Install the required libraries.
On Windows:
pip install -r requirements.txt
On macOS and Linux:
pip3 install -r requirements.txt
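If the libraries appear to install but the scripts cannot import them, the virtual environment may not have been active. The sketch below shows one way to check, relying on the standard behavior that `sys.prefix` differs from `sys.base_prefix` inside an environment created with `python -m venv`. The helper name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment(prefix=None, base_prefix=None):
    """Return True when the interpreter prefix differs from the base prefix.

    Hypothetical helper: both arguments default to the running interpreter.
    """
    if prefix is None:
        prefix = sys.prefix
    if base_prefix is None:
        # Fall back to sys.prefix if base_prefix is unavailable.
        base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return prefix != base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment is active; activate .venv first.")
```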
Run the data prep script.
On Windows:
.\transcripts_prepare.ps1
On macOS and Linux:
./transcripts_prepare.sh
Disclaimer:
This document has been translated using the AI translation service Co-op Translator. While we strive for accuracy, please be aware that automated translations may contain errors or inaccuracies. The original document in its native language should be considered the authoritative source. For critical information, professional human translation is recommended. We are not liable for any misunderstandings or misinterpretations arising from the use of this translation.