Add MoeAdamHHeuristic, drop dense layers, fix align_kv_heads sharding #30

This job was skipped