Low performance when running DeepSeek in MTP mode with xinference's built-in executor #176

@ZhikaiGuo960110

Description

Is your feature request related to a problem? Please describe

Running DeepSeek in MTP mode with xinference's built-in executor is currently about 25% slower than running it with vLLM's native built-in executor.

Image

Describe the solution you'd like

A possible cause is that xoscar does not apply zero-copy serialization to PyTorch tensors internally. This still needs to be investigated.
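
For reference, here is a minimal sketch of what zero-copy serialization means in general, using only the standard library's pickle protocol 5 with out-of-band buffers. This is not xoscar's code: `serialize` and `deserialize` are hypothetical helper names, and a `bytearray` stands in for the tensor's memory. How a real PyTorch tensor would be exposed as a buffer is left open here.

```python
import pickle


def serialize(shape, payload):
    """Pickle (shape, payload) with the payload passed out-of-band.

    `payload` is any writable buffer (a bytearray here, standing in for
    tensor memory). Only a small header is pickled; the payload bytes are
    handed to `buffer_callback` as a PickleBuffer without being copied.
    """
    buffers = []
    header = pickle.dumps(
        (shape, pickle.PickleBuffer(payload)),
        protocol=5,
        buffer_callback=buffers.append,  # returns None, so buffers stay out-of-band
    )
    return header, buffers


def deserialize(header, buffers):
    """Rebuild (shape, view); `view` is a memoryview over the supplied buffer."""
    shape, buf = pickle.loads(header, buffers=buffers)
    return shape, memoryview(buf)


if __name__ == "__main__":
    payload = bytearray(4 * 1024 * 1024)  # 4 MiB of fake tensor data
    header, buffers = serialize((1024, 1024), payload)
    in_band = pickle.dumps((1024, 1024, payload), protocol=5)
    print("header bytes (out-of-band):", len(header))
    print("pickle bytes (in-band):   ", len(in_band))
```

With out-of-band buffers the pickled header stays tiny regardless of payload size, and a transport layer can send the buffers as-is. Whether a missing equivalent of this in xoscar explains the 25% gap is still only a hypothesis.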

Describe alternatives you've considered

Add this handling to the serialization logic: https://github.com/xorbitsai/xoscar/tree/main/python/xoscar/serialization
