Commit Graph

8 Commits (2d642eea0f92c7f7c1fb7bef3abdfdb0cb61d1bf)

Author SHA1 Message Date
Hongxin Liu 9664b1bc19
[shardformer] hotfix attn mask (#5945)
4 months ago
Guangyao Zhang 1c961b20f3
[ShardFormer] fix qwen2 sp (#5903)
5 months ago
Guangyao Zhang 669849d74b
[ShardFormer] Add Ulysses Sequence Parallelism support for Command-R, Qwen2 and ChatGLM (#5897)
5 months ago
Edenzzzz fbf33ecd01
[Feature] Enable PP + SP for llama (#5868)
5 months ago
Jianghai 8ab46b4000
[Shardformer] change qwen2 modeling into gradient checkpointing style (#5874)
5 months ago
Hongxin Liu 73e88a5553
[shardformer] fix import (#5788)
6 months ago
Wang Binluo 537f6a3855
[Shardformer]fix the num_heads assert for llama model and qwen model (#5704)
7 months ago
Wang Binluo a3cc68ca93
[Shardformer] Support the Qwen2 model (#5699)
7 months ago