
Unable to access the Llama API after following the deployment instructions #184

@lzaeh

Description


What happened?

I have been following the instructions in How to run Llama-3-8B with Kubernetes, but I cannot get it working. The container is in the Running state, and I added the -s 0.0.0.0:8080 flag to expose the service (learned from LlamaEdge), but I am still unable to access it.

Here is a summary of the steps I followed:

I applied the Kubernetes deployment YAML below to start the Llama API server (the runtime and wasm-sandboxer are both set up correctly):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: llama
  labels:
    app: llama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llama
  template:
    metadata:
      labels:
        app: llama
    spec:
      containers:
      - command:
        - llama-api-server.wasm
        args: ["--prompt-template", "llama-3-chat", "--ctx-size", "4096", "--model-name", "Llama-3-8B", "-s", "0.0.0.0:8080"]
        env:
        - name: io.kuasar.wasm.nn_preload
          value: default:GGML:AUTO:Meta-Llama-3-8B-Instruct-Q5_K_M.gguf
        image: docker.io/kuasario/llama-api-server:v1
        name: llama-api-server
      runtimeClassName: kuasar-wasm
  • The container is running successfully (kubectl get pods shows Running), which means the prerequisites have been met.
  • I am using the correct -s 0.0.0.0:8080 flag in the command arguments.
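For reference, the steps above amount to roughly the following commands (a sketch; llama-deployment.yaml is an assumed filename for the YAML above):

```shell
# Apply the Deployment and confirm the pod reaches Running
kubectl apply -f llama-deployment.yaml
kubectl get pods -l app=llama

# Running only means the sandbox started; it does not prove the
# wasm module bound port 8080. Check Events for runtime errors:
kubectl describe pod -l app=llama
```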

First, instead of creating a Kubernetes Service, I opted for a lightweight test: I used port forwarding to expose the container's port 8080 on my local machine's port 8000, then made API requests with a slightly modified version of the curl command from the LlamaEdge docs. The result was:
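The port-forward test looked roughly like this (a sketch; the /v1/chat/completions path follows the LlamaEdge example, and the model name matches the --model-name argument in the YAML above):

```shell
# Forward local port 8000 to the pod's port 8080
kubectl port-forward deployment/llama 8000:8080 &

# Hit the OpenAI-compatible chat endpoint exposed by llama-api-server
curl -X POST http://localhost:8000/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"Hello"}],"model":"Llama-3-8B"}'
```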

[Screenshots: the curl requests fail with connection errors]

  • This error suggests that the application inside the container is not listening on port 8080, or that port forwarding is failing due to an internal issue with the container setup.
  • The container is indeed in the Running state, but its logs are empty; even after testing with the --log-all option there are still no logs, which could indicate that the service never started or encountered an error early on.
  • Since this is a Wasm sandbox, it is impossible to enter the container directly to check whether the service started properly.
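Without exec access to the sandbox, the checks available from outside are roughly these (a sketch; the container ID must be looked up first):

```shell
# Logs via kubectl (empty in my case, even with --log-all):
kubectl logs deployment/llama

# Lower-level sandbox state via the CRI, bypassing kubectl:
crictl ps                  # is the wasm container actually running?
crictl logs <container-id> # any output the runtime captured
```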

Here are a few things that could be happening:

  • The service inside the container might not be binding to port 8080.
  • Port forwarding might not be set up properly, or Kubernetes networking might be blocking the connection to the port.
  • There could be an issue with the Wasm runtime or the wasmedge environment not handling the service correctly.
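One way to separate these possibilities is to curl the pod IP directly from another pod, bypassing port-forward entirely (a sketch; <pod-ip> must be filled in, and the /v1/models path is assumed from the LlamaEdge API):

```shell
# Get the pod IP
kubectl get pod -l app=llama -o wide

# curl it from a throwaway pod on the cluster network
kubectl run curl-test --rm -it --image=curlimages/curl --restart=Never -- \
  curl -v http://<pod-ip>:8080/v1/models
# Connection refused here        => the wasm service never bound 8080.
# Works here but not forwarded   => the port-forward path is at fault.
```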

Could you provide guidance on troubleshooting this issue further, or is there something I'm missing in the configuration or setup?

What did you expect to happen?

I expected to be able to deploy this service successfully with Kuasar as the runtime.

How can we reproduce it (as minimally and precisely as possible)?

Repeat the steps in How to run Llama-3-8B with Kubernetes. I strictly followed the versions of containerd and wasmedge recommended in the repository, yet it still does not work.

Anything else we need to know?

In the test above, I used the -s parameter to specify the listening address for the server inside the container. The original configuration did not include this parameter, but after checking, the default port appears to be 8080 anyway. Either way, the test results are the same.
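To isolate whether the problem is in Kuasar or in the wasm module itself, the same module can be run directly with WasmEdge outside Kubernetes, mirroring the env and args from the Deployment (a sketch, assuming the .wasm and .gguf files are in the current directory):

```shell
# Run llama-api-server.wasm standalone; if this binds 8080 and answers
# curl locally, the module is fine and the issue lies in the Kuasar path.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Meta-Llama-3-8B-Instruct-Q5_K_M.gguf \
  llama-api-server.wasm \
  --prompt-template llama-3-chat --ctx-size 4096 \
  --model-name Llama-3-8B -s 0.0.0.0:8080
```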

Dev environment

No response

Metadata

Labels: kind/bug
