Skip to content

Conversation

@taoluo
Copy link
Contributor

@taoluo taoluo commented Sep 2, 2025

Summary

  • Adding Megatron backend support for critic models (WIP)
  • Created comprehensive design document outlining the implementation
    approach
  • Established configuration file for testing the new backend

Description

This PR introduces the design and planning phase for adding Megatron
backend support to critic models, enabling them to leverage Megatron's
advanced parallelism features (tensor, pipeline, context, and expert
parallel) alongside the existing DeepSpeed backend.

What's included:

  • Design document (docs/critic_megatron_backend_design_final.md):
    Complete technical design with:
    • McaValueModel class implementation details
    • Model provider integration strategy
    • Data flow comparison between backends
    • Minimal 3-phase implementation plan
  • Test configuration
    (examples/docs_examples/example_ppo_megatron_critic.yaml): PPO config for
    testing Megatron critic

What's coming next:

  • Implement McaValueModel class in mcore_adapter
  • Update default_value_model_provider to support Megatron
  • Add integration tests with CriticWorker
  • Validate distributed training features

Status

🚧 Work in Progress - This PR currently contains only the design
documentation and configuration. Implementation will follow based on the
outlined plan.

cc: @PanAndy it would be great if you can review the design doc before implementation begins, thanks

taoluo and others added 4 commits September 2, 2025 11:50
  Add design document for Megatron backend support in critic models,
  including
  McaValueModel class implementation, model provider integration, and minimal
  3-phase implementation plan.
@CLAassistant
Copy link

CLAassistant commented Sep 4, 2025

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
1 out of 2 committers have signed the CLA.

✅ taoluo
❌ kemurukagami
You have signed the CLA already but the status is still pending? Let us recheck it.


logger = get_logger(__name__)

class McaValueModel(McaGPTModel):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some model classes don’t instantiate McaGPTModel directly; they subclass it (e.g., Qwen2VLModel). If we introduce an McaValueModel to provide a value head, we’d have to create a matching ValueModel variant for every new model class. Would it be better to add a value_head option in McaModelConfig and build the value-head capability into the McaGPTModel base class, or to add a post-init hook that runs after model init?

- Add use_value_head config field to McaModelConfig
- Create ValueHeadWrapper class with weight property
- Replace output_layer with value head when use_value_head=True
- Set share_embeddings_and_output_weights=False for value models
- Filter value_head weights from missing_keys during checkpoint loading
- Initialize value head weights to 0.01 for testing parity
- Enable value head for CriticWorker in MegatronStrategy
- Remove unused McaValueModel class
- Update design documentation
- Add test pipeline and configs for critic comparison

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants