Add NPU-compatible select_device() and new test cases for data preprocessing#9512

Open
Ginray wants to merge 10 commits into modelscope:main from Ginray:main

@Ginray Ginray commented Jun 8, 2026

Contributor

A new round of testing has been performed on Swift test cases running on NPU, including manually triggered cases. Adjustments have been made in response to the test outcomes. Please review.

PR type

  • [ √ ] Bug Fix
  • New Feature
  • Document Updates
  • More Models or Datasets Support

PR information

Swift's test suite has two potential limitations: (1) CUDA_VISIBLE_DEVICES may not take effect on NPUs; (2) the core training, inference, and preprocessing tests are top-level functions that unittest may not discover, which could leave gaps in CI coverage.

Changes:

  1. New tests/_test_utils.py: provides select_device(), which sets ASCEND_RT_VISIBLE_DEVICES on NPU and CUDA_VISIBLE_DEVICES on GPU.

  2. 51 existing files: only replaced os.environ['CUDA_VISIBLE_DEVICES'] with select_device(); no other code changes.

  3. 2 new unittest.TestCase files (auto-discovered by CI):

    • tests/general/test_data_preprocess.py (6 cases, ~52s): dataset encode, truncation, collator, multi-turn, tool message, packing
    • tests/megatron/test_megatron_args.py (4 cases, ~5s, auto-skip if no megatron): Megatron args construction

  4. 1 new top-level function file (manual call, not in CI):

    • tests/infer/test_transformers_engine.py (3 cases, ~129s): TransformersEngine batch/stream/system inference

  5. 6 existing files with appended top-level functions (commented out by default):

    • test_sft.py: test_lora_sft_minimal, test_full_sft_minimal
    • test_rlhf.py: test_dpo_minimal
    • test_grpo.py: test_grpo_minimal (auto-skip if trl<0.26)
    • test_pt.py: test_pretrain_minimal
    • test_quant.py: test_lora_merge_export_minimal
    • test_eval.py: test_eval_native_minimal (auto-skip if evalscope not installed)

Verification: All new cases executed and passed on NPU.
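
For readers who want a concrete picture of item 1, the behavior could look roughly like the sketch below. This is not the code from the PR: the `is_npu` parameter, the `torch_npu` lookup used for detection, and the return value are all assumptions made for illustration. Detection goes through `importlib.util.find_spec` so that no device library is imported before the variable is set, and the IDs are cast to `str` because `os.environ` only accepts strings.

```python
import importlib.util
import os


def select_device(device_ids, is_npu=None):
    """Set the device-visibility variable for the current accelerator.

    Illustrative sketch only; the signature is an assumption, not the PR's API.
    """
    if is_npu is None:
        # Assumption: an importable `torch_npu` package means an Ascend NPU host.
        # find_spec() locates the package without importing (and initializing) it.
        is_npu = importlib.util.find_spec('torch_npu') is not None
    env_key = 'ASCEND_RT_VISIBLE_DEVICES' if is_npu else 'CUDA_VISIBLE_DEVICES'
    # os.environ rejects non-string values, so an int such as 0 must be cast.
    os.environ[env_key] = str(device_ids)
    return env_key
```

A test file would call something like `select_device('0')` at the very top, before importing any module that touches the accelerator.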

No impact on existing UTs:

  • Original function signatures, parameters, and logic are completely unchanged
  • __main__ blocks are identical to the original
  • New unittest.TestCase cases total <60s, so the impact on CI time is minimal
  • Top-level functions are commented out by default and not auto-executed
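
The distinction between auto-discovered cases and manual top-level functions, together with the auto-skip behavior mentioned for optional dependencies, can be sketched as follows. The class, function, and helper names are hypothetical, and the module name `megatron` is an assumption about how the optional dependency is imported.

```python
import importlib.util
import unittest


def has_module(name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(name) is not None


class TestMegatronArgsSketch(unittest.TestCase):
    """Picked up by unittest discovery because it subclasses TestCase."""

    # Assumption: the optional dependency is importable as `megatron`.
    @unittest.skipUnless(has_module('megatron'), 'megatron is not installed')
    def test_needs_megatron(self):
        import megatron  # imported lazily, only when the test actually runs
        self.assertIsNotNone(megatron)

    def test_always_runs(self):
        self.assertTrue(has_module('unittest'))


def test_manual_minimal():
    """Top-level function: the unittest loader ignores it, so it is called by hand."""
    return 'ran manually'
```

With this layout a missing dependency is reported as a skip instead of an import error at collection time.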

Experiment results

All new test cases passed, including manual ones:

image

@Ginray Ginray changed the title Add NPU-compatible setup_device_env() and new test cases for data pre… Add NPU-compatible setup_device_env() and new test cases for data preprocessing Jun 8, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request centralizes device environment setup across tests using a new setup_device_env utility, and introduces lightweight tests for data preprocessing, Megatron arguments, and the TransformersEngine. Key feedback includes casting device_ids to a string to prevent type errors, using a try...finally block to avoid test pollution when modifying template properties, adding defensive checks for empty choices in streaming responses, gracefully handling missing trl imports during test collection, and lazy-loading the TransformersEngine to prevent resource consumption during test discovery.
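
Two of the suggestions in this summary are generic enough to sketch outside the PR's code. The helpers below are hypothetical and do not reproduce the reviewed files: `temporary_attr` shows the try...finally pattern for restoring a modified property, and `join_stream` shows a defensive check for empty `choices`. The dictionary shape of the streamed chunks is an assumption made for illustration.

```python
from contextlib import contextmanager


@contextmanager
def temporary_attr(obj, name, value):
    """Set obj.<name> to value for the duration of the block, then restore it."""
    original = getattr(obj, name)
    setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Runs even if the test body raises, so later tests see the old value.
        setattr(obj, name, original)


def join_stream(chunks):
    """Concatenate streamed text, skipping chunks whose `choices` list is empty."""
    parts = []
    for chunk in chunks:
        choices = chunk.get('choices') or []
        if not choices:
            continue  # nothing to read in this chunk
        content = choices[0].get('delta', {}).get('content')
        if content:
            parts.append(content)
    return ''.join(parts)
```

A test would wrap the mutation as `with temporary_attr(template, 'max_length', 16): ...`, where `template` and `max_length` stand in for whatever object and property the test modifies.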

Comment thread tests/_test_utils.py Outdated
Comment thread tests/general/test_data_preprocess.py
Comment thread tests/infer/test_transformers_engine.py
Comment thread tests/train/test_grpo.py Outdated
Comment thread tests/infer/test_transformers_engine.py
Ginray added 2 commits June 8, 2026 16:49
…cessing, TransformersEngine inference, and Megatron args to improve CI coverage
# Conflicts:
#	tests/deploy/test_dataset.py
#	tests/deploy/test_logprobs.py
#	tests/eval/test_eval.py
#	tests/export/test_quant.py
#	tests/general/test_data_preprocess.py
#	tests/infer/test_agent.py
#	tests/infer/test_infer.py
#	tests/infer/test_logprobs.py
#	tests/infer/test_main.py
#	tests/infer/test_mllm.py
#	tests/infer/test_sglang.py
#	tests/infer/test_transformers_engine.py
#	tests/llm/test_ollama_export.py
#	tests/llm/test_run.py
#	tests/llm/test_template.py
#	tests/megatron/test_embedding.py
#	tests/megatron/test_export.py
#	tests/megatron/test_gkd.py
#	tests/megatron/test_grpo.py
#	tests/megatron/test_kto.py
#	tests/megatron/test_lora.py
#	tests/megatron/test_rlhf.py
#	tests/megatron/test_train.py
#	tests/models/test_llm.py
#	tests/models/test_mllm.py
#	tests/test_align/test_cls.py
#	tests/test_align/test_lmdeploy_vlm.py
#	tests/test_align/test_padding_side.py
#	tests/test_align/test_template/test_agent.py
#	tests/test_align/test_template/test_audio.py
#	tests/test_align/test_template/test_gene.py
#	tests/test_align/test_template/test_llm.py
#	tests/test_align/test_template/test_tool.py
#	tests/test_align/test_template/test_video.py
#	tests/test_align/test_template/test_vision.py
#	tests/test_align/test_vllm_vlm.py
#	tests/train/test_channel.py
#	tests/train/test_cls.py
#	tests/train/test_embedding.py
#	tests/train/test_freeze.py
#	tests/train/test_gkd.py
#	tests/train/test_grpo.py
#	tests/train/test_kto.py
#	tests/train/test_liger.py
#	tests/train/test_multilabel.py
#	tests/train/test_packing.py
#	tests/train/test_ppo.py
#	tests/train/test_pt.py
#	tests/train/test_resume_from_checkpoint.py
#	tests/train/test_rlhf.py
#	tests/train/test_sft.py
#	tests/train/test_train_eval.py
#	tests/train/test_vit_lr.py
@Ginray Ginray changed the title Add NPU-compatible setup_device_env() and new test cases for data preprocessing Add NPU-compatible select_device() and new test cases for data preprocessing Jun 8, 2026
@Ginray Ginray requested a review from hjh0119 June 8, 2026 13:57

Ginray commented Jun 8, 2026

Contributor Author

We confirmed that the UT failure is caused by No module named 'swift.ray' and is unrelated to this PR. PR pull/9471 and follow-up PRs are expected to fix it.

Comment thread swift/utils/env.py Outdated
@Jintao-Huang

Collaborator

You can ignore these CI failures.

@Ginray Ginray requested a review from Jintao-Huang June 9, 2026 06:54
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.utils import select_device

Collaborator

This will initialize CUDA/NPU here, which makes the environment variable ineffective.
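
One way to picture the concern in this comment is the toy model below. It is not how CUDA or the NPU runtime is implemented: `FakeDeviceRuntime` is a hypothetical stand-in for any component that reads the visibility variable once, at initialization, and caches the result. If an import triggers that initialization, a later assignment to the variable is never seen.

```python
import os


class FakeDeviceRuntime:
    """Toy stand-in for a device runtime that reads the environment only once."""

    _visible = None  # cached at first initialization

    @classmethod
    def init(cls):
        if cls._visible is None:
            cls._visible = os.environ.get('CUDA_VISIBLE_DEVICES', 'all')
        return cls._visible


os.environ.pop('CUDA_VISIBLE_DEVICES', None)
seen_at_init = FakeDeviceRuntime.init()    # stands in for an import that initializes the device
os.environ['CUDA_VISIBLE_DEVICES'] = '0'   # set too late
seen_after_set = FakeDeviceRuntime.init()  # still returns the cached value
```

Under this model the variable has to be set before anything that may initialize the device is imported.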
