[RFC] Refactor chat template and remove model name from engine config #1065

Open
@AllentDan

Description

Motivation

  • Decouple dialogue templates from the inference engine.
  • Reduce the barrier to adding new dialogue templates.
  • Remove model_name from EngineConfig to avoid redundant specification.
  • Support external dialogue templates compatible with Transformers.

Major features

  • The Tokenizer class supports Transformers’ Jinja dialogue templates.
  • The original model.get_prompt will be moved to Tokenizer.apply_chat_template.
  • model_name is removed from TurbomindEngineConfig and PytorchEngineConfig.

How to use

For api_server, to use a custom template, the command could be:

lmdeploy serve api_server $MODEL_PATH --chat-template $JINJA_TEMPLATE
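As a sketch, the file passed via --chat-template might contain a Transformers-style Jinja template along these lines (a hypothetical ChatML-style example for illustration, not an actual LMDeploy template):

```jinja
{# Hypothetical ChatML-style chat template. Each message is expected to
   follow the Transformers convention: a dict with 'role' and 'content'. #}
{% for message in messages %}<|im_start|>{{ message['role'] }}
{{ message['content'] }}<|im_end|>
{% endfor %}{% if add_generation_prompt %}<|im_start|>assistant
{% endif %}
```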

For APIs like pipeline, we are going to provide documentation showing how to add a chat template in Python or Jinja.
The code will be:

chat_template = PythonTemplate()  # or a function, a Jinja string, or a file path
input_ids = tokenizer.apply_chat_template(messages, chat_template=chat_template)
pipeline(input_ids)
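The proposed apply_chat_template flow can be sketched in plain Python by rendering a Jinja template over the message list (a minimal sketch assuming jinja2 is available; the template string, function signature, and special tokens are illustrative assumptions, not the final API — real use would tokenize the rendered prompt):

```python
# Sketch of what Tokenizer.apply_chat_template could do with a Jinja
# chat template. Hypothetical ChatML-style template for illustration.
from jinja2 import Template

CHAT_TEMPLATE = (
    "{% for m in messages %}"
    "<|im_start|>{{ m['role'] }}\n{{ m['content'] }}<|im_end|>\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
)


def apply_chat_template(messages, chat_template=CHAT_TEMPLATE,
                        add_generation_prompt=True):
    """Render messages into one prompt string (tokenization omitted)."""
    return Template(chat_template).render(
        messages=messages, add_generation_prompt=add_generation_prompt)


messages = [{'role': 'user', 'content': 'Hello!'}]
prompt = apply_chat_template(messages)
print(prompt)
# → <|im_start|>user
#   Hello!<|im_end|>
#   <|im_start|>assistant
```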
