Port https://github.com/NovaSky-AI/SkyRL/pull/1095 to skyrl folder #1129
Conversation
Code Review
This pull request refactors the model loading logic within the tests by centralizing it into a load_model helper function. This is a great improvement that simplifies the test files, removes redundant code, and eliminates the need for saving models to temporary directories. The changes make the tests cleaner and more maintainable. However, I've noticed a consistent omission of the shard_attention_heads=True parameter in the new load_model calls across multiple tests. This parameter was explicitly set in the previous implementation, and its absence could alter the model configuration and potentially impact the correctness of the tests. I've added specific comments with suggestions to address this.
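For readers without the full diff: the helper's exact signature is not shown in this excerpt, but the call sites quoted in the inline comments below suggest a shape roughly like the sketch here. Everything beyond those quoted call sites (the `make_mesh` utility and the constructor form in particular) is an assumption, not the PR's actual implementation.

```python
# Hypothetical sketch of the centralized helper; the real code in the PR may differ.
# Parameter names are inferred from the call sites quoted in the comments below.
def load_model(tmp, model_name, config_cls, model_cls, mesh_axes, **config_kwargs):
    """Build a config and instantiate the model in one place, so individual
    tests no longer duplicate this setup or save weights to temp directories."""
    # Per this review, shard_attention_heads=True should be threaded through
    # here (or accepted via config_kwargs) to match the pre-refactor behavior.
    config = config_cls(**config_kwargs)
    mesh = make_mesh(mesh_axes)  # assumed mesh-construction utility
    return model_cls(config, mesh=mesh)  # assumed constructor shape
```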
```python
    gradient_checkpointing=False,
)
```
The shard_attention_heads=True parameter appears to be missing from the call to load_model. The original load_model function in this file hardcoded this parameter, so its omission in the refactored code could change the model's configuration and affect the test's correctness. It should be added to preserve the previous behavior.
Suggested change:

```diff
     gradient_checkpointing=False,
+    shard_attention_heads=True,
 )
```
```python
model = load_model(tmp, model_name, config_cls, model_cls, mesh_axes, loss_chunk_size=chunk_size)
outputs = model(input_ids, attention_mask=attention_mask)
logprobs_chunked = np.asarray(model.compute_logprobs(outputs.last_hidden_state, target_ids))
common_kwargs = dict(max_lora_adapters=1, max_lora_rank=1, gradient_checkpointing=False)
```
The shard_attention_heads=True parameter seems to be missing from common_kwargs. The original model loading logic included this parameter. To ensure the test behaves as it did before the refactoring, this parameter should be added to the common keyword arguments passed to load_model.
Suggested change:

```diff
-common_kwargs = dict(max_lora_adapters=1, max_lora_rank=1, gradient_checkpointing=False)
+common_kwargs = dict(max_lora_adapters=1, max_lora_rank=1, gradient_checkpointing=False, shard_attention_heads=True)
```
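Since common_kwargs is presumably splatted into the subsequent load_model calls (not shown in this excerpt), adding the flag once propagates it to every variant in the test. A hedged illustration of the resulting call shape, reusing loss_chunk_size from the quoted context:

```python
common_kwargs = dict(
    max_lora_adapters=1,
    max_lora_rank=1,
    gradient_checkpointing=False,
    shard_attention_heads=True,
)
# Hypothetical call site: every model variant now shares identical sharding settings.
model = load_model(tmp, model_name, config_cls, model_cls, mesh_axes,
                   loss_chunk_size=chunk_size, **common_kwargs)
```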
```python
    max_lora_rank=32,
)
```
The shard_attention_heads=True parameter is missing from the load_model call. The original code explicitly set shard_attention_heads=True when creating the Qwen3Config. This parameter should be added to the load_model call to ensure the model configuration remains the same after refactoring.
Suggested change:

```diff
     max_lora_rank=32,
+    shard_attention_heads=True,
 )
```
```python
    max_lora_rank=max(cfg.r for cfg in lora_configs),
)
```
The shard_attention_heads=True parameter is missing from the load_model call. The original code explicitly set this parameter when creating the Qwen3Config. To maintain the original test behavior, it should be included in the call to load_model.
Suggested change:

```diff
     max_lora_rank=max(cfg.r for cfg in lora_configs),
+    shard_attention_heads=True,
 )
```
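Beyond fixing each call site, one way to guard against this class of regression is to assert the resolved configuration after loading. This is a sketch, assuming the config object exposes the flag as a plain attribute:

```python
model = load_model(
    tmp, model_name, config_cls, model_cls, mesh_axes,
    max_lora_rank=max(cfg.r for cfg in lora_configs),
    shard_attention_heads=True,
)
# Fails fast if a future refactor silently drops the flag again.
assert model.config.shard_attention_heads, "attention heads are not sharded"
```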
See #1095