Add MoeAdamHHeuristic, drop dense layers, fix align_kv_heads sharding #4096

This job was skipped