chatbot-rag-app: recover from timeouts on first use of ELSER
Fixes #307

Signed-off-by: Adrian Cole <[email protected]>
codefromthecrypt committed Feb 20, 2025
1 parent ce9eeb2 commit 48d829f
Showing 10 changed files with 266 additions and 127 deletions.
20 changes: 18 additions & 2 deletions .github/workflows/docker-chatbot-rag-app.yml
@@ -12,7 +12,9 @@ on:
    branches:
      - main
    paths:
-      # Verify changes to the Dockerfile on PRs
+      # Verify changes to the Dockerfile on PRs, tainted when we update ES.
+      - docker/docker-compose-elastic.yml
+      - example-apps/chatbot-rag-app/docker-compose.yml
      - example-apps/chatbot-rag-app/Dockerfile
      - .github/workflows/docker-chatbot-rag-app.yml
      - '!**/*.md'
@@ -42,13 +44,27 @@ jobs:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
+      # This builds the image and pushes its digest if a multi-architecture
+      # image will be made later (event_name == 'push'). If PR, the image is
+      # loaded into docker for testing.
      - uses: docker/build-push-action@v6
        id: build
        with:
          context: example-apps/chatbot-rag-app
-          outputs: type=image,name=${{ env.IMAGE }},push-by-digest=true,name-canonical=true,push=${{ github.event_name == 'push' && 'true' || 'false' }}
+          outputs: type=${{ github.event_name == 'pull_request' && 'docker' || 'image' }},name=${{ env.IMAGE }},push-by-digest=true,name-canonical=true,push=${{ github.event_name == 'push' && 'true' || 'false' }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
+      - name: start elasticsearch
+        if: github.event_name == 'pull_request'
+        run: docker compose -f docker/docker-compose-elastic.yml up --quiet-pull -d --wait --wait-timeout 120 elasticsearch
+      - name: test image
+        if: github.event_name == 'pull_request'
+        working-directory: example-apps/chatbot-rag-app
+        run: |  # This tests ELSER is working, which doesn't require an LLM.
+          cp env.example .env
+          # same as `docker compose run --rm -T create-index`, except pull never
+          docker run --rm --name create-index --env-file .env --pull never \
+            --add-host "localhost:host-gateway" ${{ env.IMAGE }} flask create-index
      - name: export digest
        if: github.event_name == 'push'
        run: |
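
The rewritten `outputs:` line uses GitHub Actions' `cond && x || y` expression idiom as a ternary: pull requests produce a `docker` output (loaded into the local daemon for the test step), pushes produce an `image` output pushed by digest. A sketch of that selection logic in Python — the default image name here is a hypothetical placeholder, not the workflow's `env.IMAGE`:

```python
def build_outputs(event_name: str, image: str = "ghcr.io/example/chatbot-rag-app") -> str:
    """Mimic the workflow's outputs expression for a given trigger event."""
    # PR builds load the image into the local docker daemon for testing;
    # push builds produce a registry image, pushed by digest.
    out_type = "docker" if event_name == "pull_request" else "image"
    push = "true" if event_name == "push" else "false"
    return (f"type={out_type},name={image},"
            f"push-by-digest=true,name-canonical=true,push={push}")
```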
2 changes: 1 addition & 1 deletion docker/README.md
@@ -12,7 +12,7 @@ wget https://raw.githubusercontent.com/elastic/elasticsearch-labs/refs/heads/mai
Use docker compose to run Elastic stack in the background:

```bash
-docker compose -f docker-compose-elastic.yml up --force-recreate -d
+docker compose -f docker-compose-elastic.yml up --force-recreate --wait -d
```

Then, you can view Kibana at http://localhost:5601/app/home#/
22 changes: 15 additions & 7 deletions docker/docker-compose-elastic.yml
@@ -2,7 +2,7 @@ name: elastic-stack

 services:
   elasticsearch:
-    image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0
+    image: docker.elastic.co/elasticsearch/elasticsearch:8.17.2
     container_name: elasticsearch
     ports:
       - 9200:9200
@@ -16,21 +16,29 @@ services:
       - xpack.security.http.ssl.enabled=false
       - xpack.security.transport.ssl.enabled=false
       - xpack.license.self_generated.type=trial
-      - ES_JAVA_OPTS=-Xmx8g
+      # Note that ELSER is recommended to have 2GB, but it is JNI (PyTorch).
+      # So, ELSER's memory is in addition to the heap and other overhead.
+      - ES_JAVA_OPTS=-Xms2g -Xmx2g
     ulimits:
       memlock:
         soft: -1
         hard: -1
     healthcheck:
-      test: ["CMD-SHELL", "curl -s http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=500ms"]
-      retries: 300
+      test: # readiness probe taken from kbn-health-gateway-server script
+        [
+          "CMD-SHELL",
+          "curl -s http://localhost:9200 | grep -q 'missing authentication credentials'",
+        ]
+      start_period: 10s
+      interval: 1s
+      timeout: 10s
+      retries: 120

   elasticsearch_settings:
     depends_on:
       elasticsearch:
         condition: service_healthy
-    image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0
+    image: docker.elastic.co/elasticsearch/elasticsearch:8.17.2
     container_name: elasticsearch_settings
     restart: 'no'
     command: >
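
The new healthcheck above deliberately probes the root endpoint without credentials: a secured, running Elasticsearch answers 401 with a body mentioning `missing authentication credentials`, which distinguishes "up but secured" from "not accepting connections yet". A minimal sketch of the same probe in Python, under the assumption that these function names are illustrative only:

```python
import time
import urllib.error
import urllib.request


def looks_ready(body: str) -> bool:
    """Mirror the compose healthcheck: an unauthenticated request to a
    secured, running Elasticsearch returns a security error mentioning
    'missing authentication credentials'."""
    return "missing authentication credentials" in body


def wait_for_elasticsearch(url="http://localhost:9200", retries=120, interval=1.0):
    """Poll like the container healthcheck: 120 retries, 1s apart."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(url) as resp:
                body = resp.read().decode()
        except urllib.error.HTTPError as e:
            body = e.read().decode()  # a 401 still carries the JSON error body
        except OSError:
            body = ""  # not accepting connections yet
        if looks_ready(body):
            return True
        time.sleep(interval)
    return False
```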
@@ -42,7 +50,7 @@ services:
       '

   kibana:
-    image: docker.elastic.co/kibana/kibana:8.17.0
+    image: docker.elastic.co/kibana/kibana:8.17.2
     container_name: kibana
     depends_on:
       elasticsearch_settings:
@@ -66,7 +74,7 @@
       interval: 1s

   apm-server:
-    image: docker.elastic.co/apm/apm-server:8.17.0
+    image: docker.elastic.co/apm/apm-server:8.17.2
     container_name: apm-server
     depends_on:
       elasticsearch:
35 changes: 18 additions & 17 deletions example-apps/chatbot-rag-app/README.md
@@ -45,34 +45,36 @@ and configure its templated connection settings:

## Running the App

This application contains two services:
* create-index: Installs ELSER and ingests data into elasticsearch
* api-frontend: Hosts the chatbot-rag-app application on http://localhost:4000

There are two ways to run the app: via Docker or locally. Docker is advised for
ease while locally is advised if you are making changes to the application.

### Run with docker

-Docker compose is the easiest way, as you get one-step to:
-* ingest data into elasticsearch
-* run the app, which listens on http://localhost:4000
+Docker compose is the easiest way to get started, as you don't need to have a
+working Python environment.

**Double-check you have a `.env` file with all your variables set first!**

```bash
docker compose up --pull always --force-recreate
```

-*Note*: First time creating the index can fail on timeout. Wait a few minutes
-and retry.
+*Note*: The first run may take several minutes to become available.

Clean up when finished, like this:

```bash
docker compose down
```

-### Run locally
+### Run with Python

-If you want to run this example with Python and Node.js, you need to do a few
-things listed in the [Dockerfile](Dockerfile). The below uses the same
+If you want to run this example with Python, you need to do a few things listed
+in the [Dockerfile](Dockerfile) to build it first. The below uses the same
production mode as used in Docker to avoid problems in debug mode.

**Double-check you have a `.env` file with all your variables set first!**
@@ -89,7 +91,7 @@ nvm use --lts
(cd frontend; yarn install; REACT_APP_API_HOST=/api yarn build)
```

-#### Configure your python environment
+#### Configure your Python environment

Before we can run the app, we need a working Python environment with the
correct packages installed:
@@ -102,17 +104,16 @@ pip install "python-dotenv[cli]"
pip install -r requirements.txt
```

-#### Run the ingest command
+#### Create your Elasticsearch index

-First, ingest the data into elasticsearch:
```bash
-FLASK_APP=api/app.py dotenv run -- flask create-index
+dotenv run -- flask create-index
```

-*Note*: First time creating the index can fail on timeout. Wait a few minutes
-and retry.
+*Note*: This may take several minutes to complete
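
The old advice ("wait a few minutes and retry") is dropped because the commit makes the app recover from ELSER's slow first load itself (Fixes #307). A hedged sketch of one way such recovery can work — retrying a timed-out operation with exponential backoff; the function and demo names here are illustrative, not the app's actual code:

```python
import time


def with_retries(op, attempts=5, base_delay=2.0):
    """Retry a zero-arg callable with exponential backoff, e.g. while the
    ELSER model is still downloading/starting on first use."""
    for attempt in range(attempts):
        try:
            return op()
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the timeout
            time.sleep(base_delay * (2 ** attempt))


# demo: an operation that times out twice before succeeding, standing in
# for `flask create-index` while ELSER warms up
calls = {"n": 0}


def flaky_create_index():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("ELSER model still loading")
    return "ok"
```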

-#### Run the app
+#### Run the application

Now, run the app, which listens on http://localhost:4000
```bash
@@ -185,10 +186,10 @@ passages. Modify this script to index your own data.

See [Langchain documentation][loader-docs] for more ways to load documents.

-### Building from source with docker
+### Running from source with Docker

-To build the app from source instead of using published images, pass the `--build`
-flag to Docker Compose.
+To build the app from source instead of using published images, pass the
+`--build` flag to Docker Compose instead of `--pull always`

```bash
docker compose up --build --force-recreate
8 changes: 7 additions & 1 deletion example-apps/chatbot-rag-app/api/chat.py
@@ -6,6 +6,7 @@
     get_elasticsearch_chat_message_history,
 )
 from flask import current_app, render_template, stream_with_context
+from functools import cache
 from langchain_elasticsearch import (
     ElasticsearchStore,
     SparseVectorStrategy,
@@ -27,11 +28,16 @@
     strategy=SparseVectorStrategy(model_id=ELSER_MODEL),
 )

-llm = get_llm()
+
+@cache
+def get_lazy_llm():
+    return get_llm()


 @stream_with_context
 def ask_question(question, session_id):
+    llm = get_lazy_llm()
+
     yield f"data: {SESSION_ID_TAG} {session_id}\n\n"
     current_app.logger.debug("Chat session ID: %s", session_id)

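The chat.py change moves LLM construction out of import time: `functools.cache` makes `get_lazy_llm()` build the client once, on the first request, instead of when the module loads. A minimal self-contained sketch of the pattern — `get_llm` here is a stand-in for the app's real factory, with a counter added to make the caching observable:

```python
from functools import cache

CONSTRUCTIONS = 0


def get_llm():
    """Stand-in for the app's LLM factory; in practice this may be slow."""
    global CONSTRUCTIONS
    CONSTRUCTIONS += 1
    return object()


@cache
def get_lazy_llm():
    # First call pays the construction cost; later calls reuse the instance.
    return get_llm()


# Import time: nothing constructed yet (CONSTRUCTIONS == 0).
llm_a = get_lazy_llm()  # first request constructs the LLM
llm_b = get_lazy_llm()  # subsequent requests reuse the same instance
```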
