chatbot-rag-app: adds Kubernetes manifest and instructions #396

Draft: wants to merge 3 commits into main

codefromthecrypt
Collaborator

Decided to take this on so that we have a coherent experience between Docker Compose and k8s. This is as close as I could get it. If folks have feedback or a different direction in mind, do tell!

Fixes #366

@codefromthecrypt
Collaborator Author

Note: everything we do runs back into this. It would be great to have a way to quickly initialize ELSER: not just installing it, but getting through first-time use without several minutes of timeouts (#307).

@codefromthecrypt
Collaborator Author

I have work almost done to make this a "normal k8s" local setup, but wanted to solve the timeout first, so I'll push the commit after #397 is merged.

@codefromthecrypt
Collaborator Author

will bump this tomorrow or when an approver looks at #397

@codefromthecrypt codefromthecrypt changed the base branch from main to recover-from-timeout February 21, 2025 03:57
@codefromthecrypt
Collaborator Author

Rebased and changed to non-host-network k8s. I'll leave this in draft until #397 is merged, as using not-yet-deployed images in k8s is a pain.

Base automatically changed from recover-from-timeout to main February 21, 2025 12:13
@codefromthecrypt
Collaborator Author

Waiting to get the Docker image smaller (#407) before marking this "ready for review", as I noticed my network lagging.

Signed-off-by: Adrian Cole <[email protected]>
@codefromthecrypt
Collaborator Author

OK, things work in general, but I'm not seeing traces in Kibana. I have to put this down for a bit, as I have other, more urgent things to address.


Note: If you haven't checked out this repository, all you need is one file:
```bash
wget https://raw.githubusercontent.com/elastic/elasticsearch-labs/refs/heads/main/docker/docker-compose-elastic.yml
```
Collaborator

I think this is the wrong file.

Collaborator Author

yep

ports:
- containerPort: 8200
command:
- /usr/local/bin/docker-entrypoint
Collaborator

TBH I didn't go as far as running the chatbot yet, but I could confirm traces for the normal OpenAI examples with this change.

Collaborator Author

good eye!

labels:
  app: chatbot-rag-app
spec:
  # The below will recreate your secret based on the gcloud credentials file
Collaborator

I personally find the GCP-specific instructions in the readme and manifest sort of jarring. One point is that they are mostly for running locally, while in a cluster the pod should use service account auth provided by the cluster. The same can be said about using static AWS credentials via env.

"Note: for cloud-provided backends such as Vertex AI and Bedrock, the pod needs to be authenticated with access to them. In running clusters this is usually done with service account authentication; when testing locally, it can be done with static AWS credentials passed as environment variables, or by copying a GCP credentials file into a secret mounted as a volume."

Not sure if this is too much information, but I think it would be good if this yaml could avoid the gcloud-specific parts.
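
For reference, the two authentication paths described in the note above could look roughly like the sketch below. It is only an illustration, not the manifest from this PR: the service account name, image, secret name, and mount path are all hypothetical. It also assumes the secret was created from a file named `application_default_credentials.json`, so that the file name becomes the key inside the secret.

```yaml
# Hypothetical sketch: names, image, and paths are assumptions, not taken from this PR.
apiVersion: v1
kind: Pod
metadata:
  name: chatbot-rag-app
  labels:
    app: chatbot-rag-app
spec:
  # Running cluster: rely on a service account that the cluster maps to
  # cloud credentials. The account name here is a placeholder.
  serviceAccountName: chatbot-rag-app
  containers:
    - name: chatbot-rag-app
      image: example.invalid/chatbot-rag-app:latest  # placeholder image
      env:
        # Local testing only: point Google client libraries at the mounted file.
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/secrets/gcloud/application_default_credentials.json
      volumeMounts:
        - name: gcloud-credentials
          mountPath: /var/secrets/gcloud
          readOnly: true
  volumes:
    # Local testing only: a pre-created secret holding the credentials file.
    - name: gcloud-credentials
      secret:
        secretName: gcloud-credentials
```

In a real cluster the `env`, `volumeMounts`, and `volumes` entries would likely be dropped in favor of the service account alone, which is the reason for keeping them out of the shared manifest.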

Collaborator Author

good point, I also was troubled by it. better to leave this out as it is arduous and creates FUD.

@codefromthecrypt
Collaborator Author

Due to ElasticON Singapore and Sydney, while I'm excited about this, I'm not finishing it this weekend. Maybe Tuesday.

Development

Successfully merging this pull request may close these issues.

chatbot-rag-app: add instructions for k8s deployment
2 participants