Commit Graph

3783 Commits (1fa7e8b2281d3156bc09b1227e5cbc617fc541d4)

Author SHA1 Message Date
Wenxuan Tan 62c13e7969
[Ring Attention] Improve comments (#6085)
* improve comments

* improve comments

---------

Co-authored-by: Edenzzzz <wtan45@wisc.edu>
2024-10-16 11:23:35 +08:00
Wang Binluo dcd41d0973
Merge pull request #6071 from wangbluo/ring_attention
[Ring Attention] fix the 2d ring attn when using multiple machine
2024-10-15 15:17:21 +08:00
wangbluo 83cf2f84fb fix 2024-10-15 14:50:27 +08:00
wangbluo bc7eeade33 fix 2024-10-15 13:28:33 +08:00
wangbluo fd92789af2 fix 2024-10-15 13:26:44 +08:00
wangbluo 6be9862aaf fix 2024-10-15 11:56:49 +08:00
wangbluo 3dc08c8a5a fix 2024-10-15 11:01:34 +08:00
wangbluo 8ff7d0c780 fix 2024-10-14 18:16:03 +08:00
wangbluo fe9208feac fix 2024-10-14 18:07:56 +08:00
wangbluo 3201377e94 fix 2024-10-14 18:06:24 +08:00
wangbluo 23199e34cc fix 2024-10-14 18:01:53 +08:00
wangbluo d891e50617 fix 2024-10-14 14:56:05 +08:00
wangbluo e1e86f9f1f fix 2024-10-14 11:45:35 +08:00
Tong Li 4c8e85ee0d
[Coati] Train DPO using PP (#6054)
* update dpo

* remove unsupported plugin

* update msg

* update dpo

* remove unsupported plugin

* update msg

* update template

* update dataset

* add pp for dpo

* update dpo

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* add dpo fn

* update dpo

* update dpo

* update dpo

* update dpo

* minor update

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update loss

* update help

* polish code

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-10-11 19:32:00 +08:00
wangbluo 703bb5c18d fix the test 2024-10-11 17:34:20 +08:00
wangbluo 4e0e99bb6a fix the test 2024-10-11 17:31:40 +08:00
wangbluo 1507a7528f fix 2024-10-11 06:20:34 +00:00
wangbluo 0002ae5956 fix 2024-10-11 14:16:21 +08:00
Hongxin Liu dc2cdaf3e8
[shardformer] optimize seq parallelism (#6086)
* [shardformer] optimize seq parallelism

* [shardformer] fix gpt2 fused linear col

* [plugin] update gemini plugin

* [plugin] update moe hybrid plugin

* [test] update gpt2 fused linear test

* [shardformer] fix gpt2 fused linear reduce
2024-10-11 13:44:40 +08:00
wangbluo efe3042bb2 fix 2024-10-10 18:38:47 +08:00
梁爽 6b2c506fc5
Update README.md (#6087)
add HPC-AI.COM activity
2024-10-10 17:02:49 +08:00
wangbluo 5ecc27e150 fix 2024-10-10 15:35:52 +08:00
wangbluo f98384aef6 fix 2024-10-10 15:17:06 +08:00
Hongxin Liu 646b3c5a90
[shardformer] fix linear 1d row and support uneven splits for fused qkv linear (#6084)
* [tp] hotfix linear row

* [tp] support uneven split for fused linear

* [tp] support sp for fused linear

* [tp] fix gpt2 mlp policy

* [tp] fix gather fused and add fused linear row
2024-10-10 14:34:45 +08:00
wangbluo b635dd0669 fix 2024-10-09 14:05:26 +08:00
wangbluo 3532f77b90 fix 2024-10-09 10:57:19 +08:00
wangbluo 3fab92166e fix 2024-09-26 18:03:09 +08:00
binmakeswell f4daf04270
add funding news (#6072)
* add funding news

* add funding news

* add funding news
2024-09-26 12:29:27 +08:00
wangbluo 6705dad41b fix 2024-09-25 19:02:21 +08:00
wangbluo 91ed32c256 fix 2024-09-25 19:00:38 +08:00
wangbluo 6fb1322db1 fix 2024-09-25 18:56:18 +08:00
wangbluo 65c8297710 fix the attn 2024-09-25 18:51:03 +08:00
wangbluo cfd9eda628 fix the ring attn 2024-09-25 18:34:29 +08:00
binmakeswell cbaa104216
release FP8 news (#6068)
* add FP8 news

* release FP8 news

* release FP8 news
2024-09-25 11:57:16 +08:00
Hongxin Liu dabc2e7430
[release] update version (#6062) 2024-09-19 10:45:32 +08:00
Camille Zhong f9546ba0be
[ColossalEval] support for vllm (#6056)
* support vllm

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* modify vllm and update readme

* run pre-commit

* remove duplicated lines and refine code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update param name

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* refine code

* update readme

* refine code

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-09-18 17:09:45 +08:00
botbw 4fa6b9509c
[moe] add parallel strategy for shared_expert && fix test for deepseek (#6063) 2024-09-18 10:09:01 +08:00
Wang Binluo 63314ce4e4
Merge pull request #6064 from wangbluo/fix_attn
[sp] : fix the attention kernel for sp
2024-09-18 10:08:15 +08:00
wangbluo 10e4f7da72 fix 2024-09-16 13:45:04 +08:00
Wang Binluo 37e35230ff
Merge pull request #6061 from wangbluo/sp_fix
[sp] : fix the attention kernel for sp
2024-09-14 20:54:35 +08:00
wangbluo 827ef3ee9a fix 2024-09-14 10:40:35 +00:00
Guangyao Zhang bdb125f83f
[doc] FP8 training and communication document (#6050)
* Add FP8 training and communication document

* add fp8 docstring for plugins

* fix typo

* fix typo
2024-09-14 11:01:05 +08:00
Guangyao Zhang f20b066c59
[fp8] Disable all_gather intranode. Disable Redundant all_gather fp8 (#6059)
* all_gather only internode, fix pytest

* fix cuda arch <89 compile pytest error

* fix pytest failure

* disable all_gather_into_tensor_flat_fp8

* fix fp8 format

* fix pytest

* fix conversations

* fix chunk tuple to list
2024-09-14 10:40:01 +08:00
wangbluo b582319273 fix 2024-09-13 10:24:41 +00:00
wangbluo 0ad3129cb9 fix 2024-09-13 09:01:26 +00:00
wangbluo 0b14a5512e fix 2024-09-13 07:06:14 +00:00
botbw 696fced0d7
[fp8] fix missing fp8_comm flag in mixtral (#6057) 2024-09-13 14:30:05 +08:00
wangbluo dc032172c3 fix 2024-09-13 06:00:58 +00:00
wangbluo f393867cff fix 2024-09-13 05:24:52 +00:00
wangbluo 6eb8832366 fix 2024-09-13 05:06:56 +00:00