[Feature] Add common SGLang runtime override options to `sgl-omni serve`

## Motivation
`sgl-omni serve` currently exposes only high-level serving options. Common SGLang runtime settings such as tensor parallel size, CPU offload, and static memory fraction can only be set by manually editing a pipeline config YAML.

This affects Ming-flash-omni-2.0 immediately, because the default generated config for `sgl-omni serve` uses `cpu_offload_gb=0` and `mem_fraction_static=0.7`, which is often not enough to launch the model. But the underlying problem is at the CLI level: users need a stable way to pass common SGLang `ServerArgs` overrides through `sgl-omni serve`.

## Proposed CLI options
  - `--tp-size`
  - `--cpu-offload-gb`
  - `--mem-fraction-static`
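
A minimal sketch of how these flags could be registered, assuming an `argparse`-based CLI. The helper names `add_runtime_override_args` and `collect_overrides` are hypothetical and not part of the current `sgl-omni` code. Each flag defaults to `None`, so only values the user explicitly passes override the pipeline config, and an explicit `--cpu-offload-gb 0` is still honored.

```python
import argparse

# Hypothetical helper names; the destination keys mirror the SGLang
# ServerArgs fields mentioned in this issue.
OVERRIDE_KEYS = ("tp_size", "cpu_offload_gb", "mem_fraction_static")


def add_runtime_override_args(parser: argparse.ArgumentParser) -> None:
    """Register the proposed flags; None means 'keep the config value'."""
    parser.add_argument("--tp-size", type=int, default=None,
                        help="Tensor parallel size for the SGLang stage.")
    parser.add_argument("--cpu-offload-gb", type=int, default=None,
                        help="GB of model weights to offload to CPU memory.")
    parser.add_argument("--mem-fraction-static", type=float, default=None,
                        help="Static GPU memory fraction for the SGLang stage.")


def collect_overrides(args: argparse.Namespace) -> dict:
    """Return only the options that were explicitly set on the command line."""
    return {
        key: getattr(args, key)
        for key in OVERRIDE_KEYS
        if getattr(args, key) is not None  # 0 is a valid explicit value
    }
```

Comparing against `None` rather than relying on truthiness is what keeps `--cpu-offload-gb 0` distinguishable from "not provided".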

## Mapping Semantics
The initial implementation can apply these options to the pipeline's primary SGLang generation stage. For Qwen-style and Ming omni pipelines, this is the thinker stage.

For pipelines with multiple SGLang-backed stages, we could:
- apply only to the primary generation stage and document that behavior
- add a stage-targeting mechanism later
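
The first option could be sketched as follows. The config layout used here (a `stages` list whose entries carry a `name` and a `server_args` dict) is an assumption for illustration only and may not match the real pipeline config schema. The function `apply_overrides` and its `primary_stage` parameter are likewise hypothetical.

```python
from copy import deepcopy


def apply_overrides(pipeline_config: dict, overrides: dict,
                    primary_stage: str = "thinker") -> dict:
    """Return a copy of the config with overrides applied to one stage only.

    Assumed (hypothetical) layout:
        {"stages": [{"name": "thinker", "server_args": {...}}, ...]}
    """
    config = deepcopy(pipeline_config)  # leave the caller's config untouched
    for stage in config["stages"]:
        if stage["name"] == primary_stage:
            # Merge CLI overrides over the generated defaults for this stage.
            stage.setdefault("server_args", {}).update(overrides)
            return config
    raise ValueError(f"stage {primary_stage!r} not found in pipeline config")
```

Under these assumptions, other SGLang-backed stages keep their generated defaults. A later stage-targeting mechanism could reuse the same function by passing a different `primary_stage`.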

## Example Target UX

  TP=1:
```
  CUDA_VISIBLE_DEVICES=0 \
    sgl-omni serve \
      --model-path inclusionAI/Ming-flash-omni-2.0 \
      --port 8000 \
      --model-name ming-omni \
      --cpu-offload-gb 80 \
      --mem-fraction-static 0.92
```
  TP=2:
```
  CUDA_VISIBLE_DEVICES=0,1 \
    sgl-omni serve \
      --model-path inclusionAI/Ming-flash-omni-2.0 \
      --port 8000 \
      --model-name ming-omni \
      --tp-size 2 \
      --cpu-offload-gb 0 \
      --mem-fraction-static 0.80
```
