
Question about the performance of tensor parallelism #4525

Shardformer implements a tensor parallelism strategy designed specifically for transformer models, which effectively reduces communication costs. You can leverage these features through the Hybrid Parallel Plugin: https://github.com/hpcaitech/ColossalAI/blob/feature/shardformer/colossalai/booster/plugin/hybrid_parallel_plugin.py
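A minimal sketch of how the plugin might be used with the Booster API, assuming the interface on the linked branch; the `tp_size`/`pp_size` values and the GPT-2 toy model are illustrative assumptions, not taken from this thread, so check the plugin source for the exact options in your version:

```python
# Sketch: tensor parallelism via ColossalAI's HybridParallelPlugin (assumed interface).
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin
from transformers import GPT2Config, GPT2LMHeadModel

colossalai.launch_from_torch(config={})        # initialize the distributed env (launch with torchrun)

plugin = HybridParallelPlugin(tp_size=2,       # tensor-parallel degree handled by Shardformer
                              pp_size=1)       # pipeline parallelism disabled in this sketch
booster = Booster(plugin=plugin)

model = GPT2LMHeadModel(GPT2Config())          # placeholder: any transformer supported by Shardformer policies
optimizer = torch.optim.Adam(model.parameters())

# boost() applies Shardformer's tensor-parallel sharding to the model
# and wraps the optimizer accordingly.
model, optimizer, *_ = booster.boost(model, optimizer)
```

With `tp_size=2`, the script would be launched on two processes (e.g. `torchrun --nproc_per_node=2`), which then form one tensor-parallel group.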

Answer selected by eric8607242
