Description
Motivation
- Decouple dialogue templates from the inference engine.
- Reduce the barrier to adding new dialogue templates.
- Remove `model_name` from `EngineConfig` to avoid redundant specification.
- Support external dialogue templates compatible with Transformers.
Major features
- The `Tokenizer` class supports Transformers' Jinja dialogue templates.
- The original `model.get_prompt` will be moved to `Tokenizer.apply_chat_template`.
- `model_name` is removed from `TurbomindEngineConfig` and `PytorchEngineConfig`.
How to use
For `api_server`, to use an extra template, the command could be:

```shell
lmdeploy serve api_server $MODEL_PATH --chat-template $JINJA
```
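For illustration, the file passed via `--chat-template` might contain a Jinja template like the following. This is a hypothetical ChatML-style example modeled on the Hugging Face Transformers chat-template format, not a template shipped with LMDeploy:

```jinja
{%- for message in messages -%}
<|im_start|>{{ message.role }}
{{ message.content }}<|im_end|>
{%- endfor -%}
{%- if add_generation_prompt -%}
<|im_start|>assistant
{%- endif -%}
```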
For APIs like `pipeline`, we will provide documentation showing how to add a chat template in Python or as a Jinja template.
The code will be:

```python
chat_template = PythonTemplate()  # or a function, a Jinja string, or a file path
input_ids = tokenizer.apply_chat_template(messages, chat_template=chat_template)
pipeline(input_ids)
```
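As a rough sketch of how such an `apply_chat_template` might dispatch on the template type (the function body below is an assumption for illustration, not LMDeploy's actual implementation; it renders to a prompt string rather than token ids to stay self-contained):

```python
from jinja2 import Template


def apply_chat_template(messages, chat_template):
    """Illustrative only: render a list of {role, content} messages into a
    prompt string, accepting either a Python callable or a Jinja string."""
    if callable(chat_template):
        # A Python template object or plain function: call it directly.
        return chat_template(messages)
    # Otherwise treat it as a Jinja template string and render it.
    return Template(chat_template).render(messages=messages)


# Usage with a Jinja string template.
jinja_tmpl = "{% for m in messages %}<|{{ m.role }}|>{{ m.content }}\n{% endfor %}"
prompt = apply_chat_template([{"role": "user", "content": "hi"}], jinja_tmpl)
```

A real implementation would additionally accept a file path and tokenize the rendered prompt, but the dispatch-on-type idea is the same.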