Wenxuan Tan
62c13e7969
[Ring Attention] Improve comments (#6085)
* improve comments
* improve comments
---------
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
2024-10-16 11:23:35 +08:00
Wang Binluo
dcd41d0973
Merge pull request #6071 from wangbluo/ring_attention
[Ring Attention] fix the 2d ring attn when using multiple machines
2024-10-15 15:17:21 +08:00
wangbluo
83cf2f84fb
fix
2024-10-15 14:50:27 +08:00
wangbluo
bc7eeade33
fix
2024-10-15 13:28:33 +08:00
wangbluo
fd92789af2
fix
2024-10-15 13:26:44 +08:00
wangbluo
6be9862aaf
fix
2024-10-15 11:56:49 +08:00
wangbluo
3dc08c8a5a
fix
2024-10-15 11:01:34 +08:00
wangbluo
8ff7d0c780
fix
2024-10-14 18:16:03 +08:00
wangbluo
fe9208feac
fix
2024-10-14 18:07:56 +08:00
wangbluo
3201377e94
fix
2024-10-14 18:06:24 +08:00
wangbluo
23199e34cc
fix
2024-10-14 18:01:53 +08:00
wangbluo
d891e50617
fix
2024-10-14 14:56:05 +08:00
wangbluo
e1e86f9f1f
fix
2024-10-14 11:45:35 +08:00
Tong Li
4c8e85ee0d
[Coati] Train DPO using PP (#6054)
* update dpo
* remove unsupported plugin
* update msg
* update dpo
* remove unsupported plugin
* update msg
* update template
* update dataset
* add pp for dpo
* update dpo
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add dpo fn
* update dpo
* update dpo
* update dpo
* update dpo
* minor update
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update loss
* update help
* polish code
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-11 19:32:00 +08:00
wangbluo
703bb5c18d
fix the test
2024-10-11 17:34:20 +08:00
wangbluo
4e0e99bb6a
fix the test
2024-10-11 17:31:40 +08:00
wangbluo
1507a7528f
fix
2024-10-11 06:20:34 +00:00
wangbluo
0002ae5956
fix
2024-10-11 14:16:21 +08:00
Hongxin Liu
dc2cdaf3e8
[shardformer] optimize seq parallelism (#6086)
* [shardformer] optimize seq parallelism
* [shardformer] fix gpt2 fused linear col
* [plugin] update gemini plugin
* [plugin] update moe hybrid plugin
* [test] update gpt2 fused linear test
* [shardformer] fix gpt2 fused linear reduce
2024-10-11 13:44:40 +08:00
wangbluo
efe3042bb2
fix
2024-10-10 18:38:47 +08:00
梁爽
6b2c506fc5
Update README.md (#6087)
add HPC-AI.COM activity
2024-10-10 17:02:49 +08:00
wangbluo
5ecc27e150
fix
2024-10-10 15:35:52 +08:00
wangbluo
f98384aef6
fix
2024-10-10 15:17:06 +08:00
Hongxin Liu
646b3c5a90
[shardformer] fix linear 1d row and support uneven splits for fused qkv linear (#6084)
* [tp] hotfix linear row
* [tp] support uneven split for fused linear
* [tp] support sp for fused linear
* [tp] fix gpt2 mlp policy
* [tp] fix gather fused and add fused linear row
2024-10-10 14:34:45 +08:00
wangbluo
b635dd0669
fix
2024-10-09 14:05:26 +08:00
wangbluo
3532f77b90
fix
2024-10-09 10:57:19 +08:00
wangbluo
3fab92166e
fix
2024-09-26 18:03:09 +08:00
binmakeswell
f4daf04270
add funding news (#6072)
* add funding news
* add funding news
* add funding news
2024-09-26 12:29:27 +08:00
wangbluo
6705dad41b
fix
2024-09-25 19:02:21 +08:00
wangbluo
91ed32c256
fix
2024-09-25 19:00:38 +08:00
wangbluo
6fb1322db1
fix
2024-09-25 18:56:18 +08:00
wangbluo
65c8297710
fix the attn
2024-09-25 18:51:03 +08:00
wangbluo
cfd9eda628
fix the ring attn
2024-09-25 18:34:29 +08:00
binmakeswell
cbaa104216
release FP8 news (#6068)
* add FP8 news
* release FP8 news
* release FP8 news
2024-09-25 11:57:16 +08:00
Hongxin Liu
dabc2e7430
[release] update version (#6062)
2024-09-19 10:45:32 +08:00
Camille Zhong
f9546ba0be
[ColossalEval] support for vllm (#6056)
* support vllm
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* modify vllm and update readme
* run pre-commit
* remove duplicated lines and refine code
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update param name
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* refine code
* update readme
* refine code
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-18 17:09:45 +08:00
botbw
4fa6b9509c
[moe] add parallel strategy for shared_expert && fix test for deepseek (#6063)
2024-09-18 10:09:01 +08:00
Wang Binluo
63314ce4e4
Merge pull request #6064 from wangbluo/fix_attn
[sp]: fix the attention kernel for sp
2024-09-18 10:08:15 +08:00
wangbluo
10e4f7da72
fix
2024-09-16 13:45:04 +08:00
Wang Binluo
37e35230ff
Merge pull request #6061 from wangbluo/sp_fix
[sp]: fix the attention kernel for sp
2024-09-14 20:54:35 +08:00
wangbluo
827ef3ee9a
fix
2024-09-14 10:40:35 +00:00
Guangyao Zhang
bdb125f83f
[doc] FP8 training and communication document (#6050)
* Add FP8 training and communication document
* add fp8 docstring for plugins
* fix typo
* fix typo
2024-09-14 11:01:05 +08:00
Guangyao Zhang
f20b066c59
[fp8] Disable all_gather intranode. Disable redundant all_gather fp8 (#6059)
* all_gather only internode, fix pytest
* fix cuda arch <89 compile pytest error
* fix pytest failure
* disable all_gather_into_tensor_flat_fp8
* fix fp8 format
* fix pytest
* fix conversations
* fix chunk tuple to list
2024-09-14 10:40:01 +08:00
wangbluo
b582319273
fix
2024-09-13 10:24:41 +00:00
wangbluo
0ad3129cb9
fix
2024-09-13 09:01:26 +00:00
wangbluo
0b14a5512e
fix
2024-09-13 07:06:14 +00:00
botbw
696fced0d7
[fp8] fix missing fp8_comm flag in mixtral (#6057)
2024-09-13 14:30:05 +08:00
wangbluo
dc032172c3
fix
2024-09-13 06:00:58 +00:00
wangbluo
f393867cff
fix
2024-09-13 05:24:52 +00:00
wangbluo
6eb8832366
fix
2024-09-13 05:06:56 +00:00