-
Notifications
You must be signed in to change notification settings - Fork 205
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
PR for merging content on Notebook for RAG with Llama 3 open-source a…
…nd Elastic #260 (#266) * Readme Updated * Cleared Cell Outputs * Added notebooks to skip list * Updated as per PR comments. --------- Co-authored-by: rishikeshradhakrishnan <[email protected]>
- Loading branch information
1 parent
0282941
commit 4c66913
Showing
4 changed files
with
617 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
# RAG-Elastic-Llama3 | ||
|
||
Building RAG with Llama 3 open-source and Elastic. | ||
|
||
The notebooks in this repo are companion to the blog content [Rag with Llama3 and Elastic](https://www.elastic.co/search-labs/blog/rag-with-llama3-elastic) | ||
|
||
Two examples of RAG using Llama3 as an LLM. Llama3 will be hosted using Ollama locally. | ||
|
||
1. Using Llama3 as an LLM and use it to generate embeddings. | ||
2. Using Llama3 as an LLM and use Elastic ELSER for embeddings and semantic search. | ||
|
304 changes: 304 additions & 0 deletions
304
notebooks/integrations/llama3/rag-elastic-llama3-elser.ipynb
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,304 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "430bd0198f228af0", | ||
"metadata": {}, | ||
"source": [ | ||
"# RAG with Elastic ELSER and Llama3 using Langchain\n", | ||
"\n", | ||
"This interactive notebook uses `Langchain` to process fictional workplace documents and uses `ELSER v2` running in `Elasticsearch` to transform these documents into embeddings and store them into `Elasticsearch`. We then ask a question, retrieve the relevant documents from `Elasticsearch` and use `Llama3` running locally using `Ollama` to provide a response. \n", | ||
"\n", | ||
"**_Note_** : _`Llama3` is expected to be running using `Ollama` on the same machine where you will be running this notebook._" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "fc4d3a839afa5bf1", | ||
"metadata": {}, | ||
"source": [ | ||
"## Requirements\n", | ||
"\n", | ||
"For this example, you will need:\n", | ||
"\n", | ||
"- An Elastic deployment\n", | ||
" - We'll be using [Elastic Cloud](https://www.elastic.co/guide/en/cloud/current/ec-getting-started.html) for this example (available with a [free trial](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook))\n", | ||
" - For LLM we will be using [Ollama](https://ollama.com/) and [Llama3](https://ollama.com/library/llama3) configured locally. \n", | ||
"\n", | ||
"### Use Elastic Cloud\n", | ||
"\n", | ||
"If you don't have an Elastic Cloud deployment, follow these steps to create one.\n", | ||
"\n", | ||
"1. Go to [Elastic cloud Registration](https://cloud.elastic.co/registration?utm_source=github&utm_content=elasticsearch-labs-notebook) and sign up for a free trial\n", | ||
"2. Select **Create Deployment** and follow the instructions" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7497ed11046b9a21", | ||
"metadata": {}, | ||
"source": [ | ||
"## Install required dependencies\n", | ||
"First we install the packages we need for this example." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "a0c049f18215f065", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!pip install langchain langchain-elasticsearch langchain-community tiktoken" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e97921e3cf79cc71", | ||
"metadata": {}, | ||
"source": [ | ||
"## Import packages\n", | ||
"Next we import the required packages as required. The imports are placed in the cells as required." | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "96d6fab134fd493d", | ||
"metadata": {}, | ||
"source": [ | ||
"## Prompt user to provide Cloud ID and API Key\n", | ||
"We now prompt the user to provide us Cloud ID and API Key using `getpass`. We get these details from the deployment." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "5be75ecc855e59b9", | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2024-06-04T11:36:07.879351Z", | ||
"start_time": "2024-06-04T11:35:57.006555Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from getpass import getpass\n", | ||
"\n", | ||
"# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#finding-your-cloud-id\n", | ||
"ELASTIC_CLOUD_ID = getpass(\"Elastic Cloud ID: \")\n", | ||
"\n", | ||
"# https://www.elastic.co/search-labs/tutorials/install-elasticsearch/elastic-cloud#creating-an-api-key\n", | ||
"ELASTIC_API_KEY = getpass(\"Elastic Api Key: \")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "84f107432173d8df", | ||
"metadata": {}, | ||
"source": [ | ||
"## Prepare documents for chunking and ingestion\n", | ||
"We now prepare the data to be ingested into `Elasticsearch`. We use `LangChain`'s `RecursiveCharacterTextSplitter` and split the documents' text at 512 characters with an overlap of 256 characters. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "4f949859a20b22d6", | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2024-06-04T11:36:11.046835Z", | ||
"start_time": "2024-06-04T11:36:10.641480Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from urllib.request import urlopen\n", | ||
"import json\n", | ||
"from langchain.text_splitter import RecursiveCharacterTextSplitter\n", | ||
"\n", | ||
"\n", | ||
"url = \"https://raw.githubusercontent.com/elastic/elasticsearch-labs/main/datasets/workplace-documents.json\"\n", | ||
"\n", | ||
"response = urlopen(url)\n", | ||
"\n", | ||
"workplace_docs = json.loads(response.read())\n", | ||
"metadata = []\n", | ||
"content = []\n", | ||
"for doc in workplace_docs:\n", | ||
" content.append(doc[\"content\"])\n", | ||
" metadata.append(\n", | ||
" {\n", | ||
" \"name\": doc[\"name\"],\n", | ||
" \"summary\": doc[\"summary\"],\n", | ||
" \"rolePermissions\": doc[\"rolePermissions\"],\n", | ||
" }\n", | ||
" )\n", | ||
"\n", | ||
"text_splitter = RecursiveCharacterTextSplitter.from_tiktoken_encoder(\n", | ||
" chunk_size=512, chunk_overlap=256\n", | ||
")\n", | ||
"docs = text_splitter.create_documents(content, metadatas=metadata)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "a52ec8a0a663f581", | ||
"metadata": {}, | ||
"source": [ | ||
"## Define Elasticsearch Vector Store\n", | ||
"We define `ElasticsearchStore` as the vector store with [SparseVectorStrategy](https://python.langchain.com/v0.2/docs/integrations/vectorstores/elasticsearch/#sparsevectorstrategy-elser).`SparseVectorStrategy` converts each document into tokens and would be stored in vector field with datatype `rank_features`.\n", | ||
"We will be using text embedding from [ELSER v2](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html#elser-v2) model `.elser_model_2_linux-x86_64`\n", | ||
"\n", | ||
"Note: Before we begin indexing, ensure you have [downloaded and deployed ELSER v2 model](https://www.elastic.co/guide/en/machine-learning/current/ml-nlp-elser.html#download-deploy-elser) in your deployment and is running in ml node. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "fc52a85c69f6e9fb", | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2024-06-04T11:36:33.046663Z", | ||
"start_time": "2024-06-04T11:36:32.381802Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_elasticsearch import ElasticsearchStore\n", | ||
"from langchain_elasticsearch import SparseVectorStrategy\n", | ||
"\n", | ||
"es_vector_store = ElasticsearchStore(\n", | ||
" es_cloud_id=ELASTIC_CLOUD_ID,\n", | ||
" es_api_key=ELASTIC_API_KEY,\n", | ||
" index_name=\"workplace_index_elser\",\n", | ||
" strategy=SparseVectorStrategy(model_id=\".elser_model_2_linux-x86_64\"),\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "7f38b91b2a358994", | ||
"metadata": {}, | ||
"source": [ | ||
"## Add docs processed above. \n", | ||
"The document has already been chunked. We do not use any specific embedding function here, since the tokens are inferred at index time and at query time within Elasticsearch. \n", | ||
"This requires that the `ELSER v2` model to be loaded and running in Elasticsearch." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "5ee38329856ea47c", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"es_vector_store.add_documents(documents=docs)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "8a5a628abfb6a5a", | ||
"metadata": {}, | ||
"source": [ | ||
"## LLM Configuration\n", | ||
"This connects to your local LLM. Please refer to https://ollama.com/library/llama3 for details on steps to run Llama3 locally. \n", | ||
"\n", | ||
"_If you have sufficient resources (atleast >64 GB Ram and GPU available) then you could try the 70B parameter version of Llama3_\n" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "23e35e152e9a2714", | ||
"metadata": { | ||
"ExecuteTime": { | ||
"end_time": "2024-06-04T11:36:40.306996Z", | ||
"start_time": "2024-06-04T11:36:40.302460Z" | ||
} | ||
}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain_community.llms import Ollama\n", | ||
"\n", | ||
"llm = Ollama(model=\"llama3\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "dbca801f2aae187", | ||
"metadata": {}, | ||
"source": [ | ||
"## Semantic Search using Elasticsearch ELSER v2 and Llama3\n", | ||
"\n", | ||
"We will perform a semantic search on query with `ELSER v2` as the model. The contextually relevant answer is then composed into a template along with the users original query. \n", | ||
"\n", | ||
"We then user `Llama3` to answer your questions with contextually relevant data fetched earlier from Elasticsearch using the retriever. " | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "d348766295f19e35", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from langchain.prompts import ChatPromptTemplate\n", | ||
"from langchain.schema.output_parser import StrOutputParser\n", | ||
"from langchain.schema.runnable import RunnablePassthrough\n", | ||
"\n", | ||
"\n", | ||
"def format_docs(docs):\n", | ||
" return \"\\n\\n\".join(doc.page_content for doc in docs)\n", | ||
"\n", | ||
"\n", | ||
"retriever = es_vector_store.as_retriever()\n", | ||
"template = \"\"\"Answer the question based only on the following context:\\n\n", | ||
"\n", | ||
" {context}\n", | ||
" \n", | ||
" Question: {question}\n", | ||
" \"\"\"\n", | ||
"prompt = ChatPromptTemplate.from_template(template)\n", | ||
"chain = (\n", | ||
" {\"context\": retriever | format_docs, \"question\": RunnablePassthrough()}\n", | ||
" | prompt\n", | ||
" | llm\n", | ||
" | StrOutputParser()\n", | ||
")\n", | ||
"\n", | ||
"a = chain.invoke(\"What are the organizations sales goals?\")\n", | ||
"\n", | ||
"print(a)" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "1e855e64da97b052", | ||
"metadata": {}, | ||
"source": [ | ||
"_You could now try experimenting with other questions._\n" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
Oops, something went wrong.