chatbot-rag-app: recover from timeouts on first use of ELSER #397

Merged 1 commit on Feb 21, 2025
20 changes: 18 additions & 2 deletions .github/workflows/docker-chatbot-rag-app.yml
@@ -12,7 +12,9 @@ on:
branches:
- main
paths:
# Verify changes to the Dockerfile on PRs
# Verify changes to the Dockerfile on PRs, and re-verify when we update ES.
- docker/docker-compose-elastic.yml
- example-apps/chatbot-rag-app/docker-compose.yml
- example-apps/chatbot-rag-app/Dockerfile
- .github/workflows/docker-chatbot-rag-app.yml
- '!**/*.md'
@@ -42,13 +44,27 @@ jobs:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
# This builds the image and pushes its digest if a multi-architecture
# image will be made later (event_name == 'push'). If PR, the image is
# loaded into docker for testing.
- uses: docker/build-push-action@v6
id: build
with:
context: example-apps/chatbot-rag-app
outputs: type=image,name=${{ env.IMAGE }},push-by-digest=true,name-canonical=true,push=${{ github.event_name == 'push' && 'true' || 'false' }}
outputs: type=${{ github.event_name == 'pull_request' && 'docker' || 'image' }},name=${{ env.IMAGE }},push-by-digest=true,name-canonical=true,push=${{ github.event_name == 'push' && 'true' || 'false' }}
cache-from: type=gha
cache-to: type=gha,mode=max
- name: start elasticsearch
if: github.event_name == 'pull_request'
run: docker compose -f docker/docker-compose-elastic.yml up --quiet-pull -d --wait --wait-timeout 120 elasticsearch
- name: test image
if: github.event_name == 'pull_request'
working-directory: example-apps/chatbot-rag-app
run: | # This tests ELSER is working, which doesn't require an LLM.
cp env.example .env
# same as `docker compose run --rm -T create-index`, except pull never
docker run --rm --name create-index --env-file .env --pull never \
--add-host "localhost:host-gateway" ${{ env.IMAGE }} flask create-index
- name: export digest
if: github.event_name == 'push'
run: |
2 changes: 1 addition & 1 deletion docker/README.md
@@ -12,7 +12,7 @@ wget https://raw.githubusercontent.com/elastic/elasticsearch-labs/refs/heads/mai
Use docker compose to run Elastic stack in the background:

```bash
docker compose -f docker-compose-elastic.yml up --force-recreate -d
docker compose -f docker-compose-elastic.yml up --force-recreate --wait -d
```

Then, you can view Kibana at http://localhost:5601/app/home#/
22 changes: 15 additions & 7 deletions docker/docker-compose-elastic.yml
@@ -2,7 +2,7 @@ name: elastic-stack

services:
elasticsearch:
image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0
image: docker.elastic.co/elasticsearch/elasticsearch:8.17.2
container_name: elasticsearch
ports:
- 9200:9200
@@ -16,21 +16,29 @@ services:
- xpack.security.http.ssl.enabled=false
- xpack.security.transport.ssl.enabled=false
- xpack.license.self_generated.type=trial
- ES_JAVA_OPTS=-Xmx8g
# Note that ELSER is recommended to have 2GB of memory, but it runs natively via JNI (PyTorch),
# so its memory is in addition to the JVM heap and other overhead.
- ES_JAVA_OPTS=-Xms2g -Xmx2g
ulimits:
memlock:
soft: -1
hard: -1
healthcheck:
test: ["CMD-SHELL", "curl -s http://localhost:9200/_cluster/health?wait_for_status=yellow&timeout=500ms"]
retries: 300
test: # readiness probe taken from kbn-health-gateway-server script
[
"CMD-SHELL",
"curl -s http://localhost:9200 | grep -q 'missing authentication credentials'",
]
start_period: 10s
interval: 1s
timeout: 10s
retries: 120

elasticsearch_settings:
depends_on:
elasticsearch:
condition: service_healthy
image: docker.elastic.co/elasticsearch/elasticsearch:8.17.0
image: docker.elastic.co/elasticsearch/elasticsearch:8.17.2
container_name: elasticsearch_settings
restart: 'no'
command: >
@@ -42,7 +50,7 @@ services:
'

kibana:
image: docker.elastic.co/kibana/kibana:8.17.0
image: docker.elastic.co/kibana/kibana:8.17.2
container_name: kibana
depends_on:
elasticsearch_settings:
@@ -66,7 +74,7 @@ services:
interval: 1s

apm-server:
image: docker.elastic.co/apm/apm-server:8.17.0
image: docker.elastic.co/apm/apm-server:8.17.2
container_name: apm-server
depends_on:
elasticsearch:
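The new `elasticsearch` healthcheck in the compose file above treats the node as ready once an unauthenticated request to port 9200 comes back with the `missing authentication credentials` message, polling every second. A minimal Python sketch of that polling idea follows. The names `wait_until_ready` and `fetch_root` are hypothetical and not part of this PR, and the loop only approximates Docker's `interval` and `retries` settings (it does not model `start_period`).

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Marker text grepped for by the compose healthcheck.
MARKER = "missing authentication credentials"


def fetch_root(url: str = "http://localhost:9200") -> str:
    """Hypothetical helper: return the response body, even for a 401 reply."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        # An unauthenticated request is rejected, but the error body
        # carries the marker text that signals the HTTP layer is up.
        return err.read().decode()


def wait_until_ready(
    fetch: Callable[[], str],
    marker: str = MARKER,
    retries: int = 120,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `fetch` until its output contains `marker` or retries run out."""
    for attempt in range(retries):
        try:
            if marker in fetch():
                return True
        except OSError:
            # Connection refused or reset while the node is still starting.
            pass
        if attempt < retries - 1:
            sleep(interval)
    return False
```

Calling `wait_until_ready(fetch_root)` would poll a local node. Passing `fetch` and `sleep` as parameters keeps the loop testable without a running Elasticsearch.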
35 changes: 18 additions & 17 deletions example-apps/chatbot-rag-app/README.md
@@ -45,34 +45,36 @@ and configure its templated connection settings:

## Running the App

This application contains two services:
* create-index: Installs ELSER and ingests data into Elasticsearch
* api-frontend: Hosts the chatbot-rag-app application on http://localhost:4000

There are two ways to run the app: via Docker or locally. Docker is the easier
option, while running locally is better if you are making changes to the application.

### Run with docker

Docker compose is the easiest way, as you get one-step to:
* ingest data into elasticsearch
* run the app, which listens on http://localhost:4000
Docker Compose is the easiest way to get started, as you don't need to have a
working Python environment.

**Double-check you have a `.env` file with all your variables set first!**

```bash
docker compose up --pull always --force-recreate
```

*Note*: First time creating the index can fail on timeout. Wait a few minutes
and retry.
*Note*: The first run may take several minutes to become available.

Clean up when finished, like this:

```bash
docker compose down
```

### Run locally
### Run with Python

If you want to run this example with Python and Node.js, you need to do a few
things listed in the [Dockerfile](Dockerfile). The below uses the same
If you want to run this example with Python, you first need to do a few of the
build steps listed in the [Dockerfile](Dockerfile). The steps below use the same
production mode as Docker to avoid problems in debug mode.

**Double-check you have a `.env` file with all your variables set first!**
@@ -89,7 +91,7 @@ nvm use --lts
(cd frontend; yarn install; REACT_APP_API_HOST=/api yarn build)
```

#### Configure your python environment
#### Configure your Python environment

Before we can run the app, we need a working Python environment with the
correct packages installed:
@@ -102,17 +104,16 @@ pip install "python-dotenv[cli]"
pip install -r requirements.txt
```

#### Run the ingest command
#### Create your Elasticsearch index

First, ingest the data into elasticsearch:
```bash
FLASK_APP=api/app.py dotenv run -- flask create-index
dotenv run -- flask create-index
```

*Note*: First time creating the index can fail on timeout. Wait a few minutes
and retry.
*Note*: This may take several minutes to complete.
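
The PR title mentions recovering from timeouts on first use of ELSER, but the code that does so is not part of the diff shown here. Purely as an illustration of the general idea, here is a retry-with-backoff sketch. It assumes timeouts surface as Python's built-in `TimeoutError` (the real Elasticsearch client raises its own exception types), and `retry_on_timeout` is a hypothetical helper, not a function from this repository.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_on_timeout(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `operation`, retrying with exponential backoff on TimeoutError.

    Assumption: the caller maps client-specific timeout errors to
    TimeoutError; this is not how the PR itself is known to work.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: surface the last timeout
            # Wait 1x, 2x, 4x, ... the base delay before trying again.
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Wrapping the first indexing call in such a helper would replace the manual "wait a few minutes and retry" step that the old note described.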

#### Run the app
#### Run the application

Now, run the app, which listens on http://localhost:4000
```bash
@@ -185,10 +186,10 @@ passages. Modify this script to index your own data.

See [Langchain documentation][loader-docs] for more ways to load documents.

### Building from source with docker
### Running from source with Docker

To build the app from source instead of using published images, pass the `--build`
flag to Docker Compose.
To build the app from source rather than using published images, pass the
`--build` flag to Docker Compose instead of `--pull always`:

```bash
docker compose up --build --force-recreate
8 changes: 7 additions & 1 deletion example-apps/chatbot-rag-app/api/chat.py
@@ -6,6 +6,7 @@
get_elasticsearch_chat_message_history,
)
from flask import current_app, render_template, stream_with_context
from functools import cache
from langchain_elasticsearch import (
ElasticsearchStore,
SparseVectorStrategy,
@@ -27,11 +28,16 @@
strategy=SparseVectorStrategy(model_id=ELSER_MODEL),
)

llm = get_llm()

@cache
def get_lazy_llm():
return get_llm()


@stream_with_context
def ask_question(question, session_id):
llm = get_lazy_llm()

yield f"data: {SESSION_ID_TAG} {session_id}\n\n"
current_app.logger.debug("Chat session ID: %s", session_id)
