Description
Code of Conduct
- I agree to follow this project's Code of Conduct
Search before asking
- I have searched in the issues and found no similar issues.
Describe the subtask
Currently, Kyuubi supports a Chat engine that invokes the online ChatGPT REST API:
Client => Kyuubi Server => Kyuubi Chat engine (invokes the ChatGPT REST API)
But is there any chance we can add a built-in offline GPT engine? The answer is YES.
Client => Kyuubi Server => Kyuubi Chat engine (does prediction using a local GPT model)
There is a project, https://github.com/karpathy/nanoGPT, which can train the GPT-2 model. Per its README:

... currently the file train.py reproduces GPT-2 (124M) on OpenWebText, running on a single 8XA100 40GB node in about 4 days of training.
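For context on what "training" involves here: nanoGPT consumes its corpus as a flat binary of integer token ids, produced by per-dataset prepare.py scripts (its real scripts write numpy uint16 arrays to train.bin/val.bin). Below is a rough, stdlib-only sketch of that preparation step using a char-level vocabulary, as in nanoGPT's shakespeare_char example; the sample corpus and file names are hypothetical, not taken from nanoGPT itself:

```python
# Sketch of nanoGPT-style data preparation: encode a raw text corpus
# into a flat sequence of integer token ids, then split train/val.
# Stdlib-only stand-in for nanoGPT's prepare.py scripts (which use numpy).
from array import array

corpus = "SELECT id FROM users WHERE age > 30;"  # hypothetical SQL sample

# Build a character-level vocabulary (stoi/itos, as in nanoGPT's
# shakespeare_char example).
chars = sorted(set(corpus))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

def encode(text: str) -> array:
    """Map text to an array of unsigned 16-bit token ids."""
    return array("H", (stoi[ch] for ch in text))

def decode(ids) -> str:
    """Map token ids back to text."""
    return "".join(itos[i] for i in ids)

# 90/10 train/val split, mirroring nanoGPT's prepare scripts.
n = len(corpus)
train_ids = encode(corpus[: int(n * 0.9)])
val_ids = encode(corpus[int(n * 0.9):])

# A real script would then dump the ids to disk, e.g.:
# with open("train.bin", "wb") as f:
#     train_ids.tofile(f)
```

A SQL-focused Chat engine would swap the corpus for a large body of SQL text, which is exactly the dataset question raised below.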
So the basic idea is: train a GPT-like model, and have the Kyuubi Chat engine invoke that model to answer questions.
There are some specific questions:
- Given that Kyuubi is a kind of SQL gateway, we may want it to be smart in the SQL area; how should we choose the training dataset?
- karpathy/nanoGPT is written in PyTorch, so how can the Kyuubi Chat engine invoke it? The options may be:
  - use a PyTorch Java SDK (does such a thing exist? the Deep Java Library, DJL, does provide a PyTorch engine) to load the model, so the Kyuubi Chat engine simply calls a function to do prediction
  - launch a PyTorch serving service that exposes a RESTful/gRPC/Thrift API, and have the Kyuubi Chat engine make RPC calls for prediction
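For the second option, the serving side could be something like TorchServe, which serves predictions over a REST endpoint at /predictions/&lt;model_name&gt;. The sketch below shows the wire-protocol side of that hop in Python using only the standard library (the actual Chat engine is JVM code, so this is purely illustrative); the host, port, model name, and JSON body shape are assumptions, since the request payload for TorchServe depends on the model's handler:

```python
# Sketch of the engine -> model-server hop: POST the user's prompt to a
# TorchServe-style REST inference endpoint and read back the completion.
# Host, port, model name, and the {"prompt": ...} body are hypothetical.
import json
import urllib.request

def build_prediction_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build the HTTP request; factored out so it can be tested offline."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predictions/{model}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict(base_url: str, model: str, prompt: str) -> str:
    """Send the prompt and return the model server's response body as text."""
    req = build_prediction_request(base_url, model, prompt)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8")

# Building the request needs no server; actually sending it does:
req = build_prediction_request("http://localhost:8080", "kyuubi-gpt", "SHOW TABLES")
# print(predict("http://localhost:8080", "kyuubi-gpt", "SHOW TABLES"))
```

The same request shape would translate directly to a JVM HTTP client inside the Chat engine; gRPC or Thrift would replace only the transport, not the overall flow.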
Also, there is another interesting project, https://github.com/ggerganov/llama.cpp, which runs LLaMA-family models locally in plain C/C++.
Parent issue
Are you willing to submit PR?
- Yes. I would be willing to submit a PR with guidance from the Kyuubi community to improve it.
- No. I cannot submit a PR at this time.