88 Commits (ckpt)

Author SHA1 Message Date
Wenxuan Tan 8fd25d6e09
[Feature] Split cross-entropy computation in SP (#5959)
3 months ago
Hanks b480eec738
[Feature]: support FP8 communication in DDP, FSDP, Gemini (#5928)
4 months ago
flybird11111 0c10afd372
[FP8] rebase main (#5963)
4 months ago
BurkeHulk 66018749f3
add fp8_communication flag in the script
5 months ago
hxwang 154720ba6e
[chore] refactor profiler utils
6 months ago
hxwang ca674549e0
[chore] remove unnecessary test & changes
6 months ago
genghaozhe a280517dd9
remove unrelated file
6 months ago
genghaozhe df63db7e63
remote comments
6 months ago
hxwang 2e68eebdfe
[chore] refactor & sync
6 months ago
Edenzzzz c25f83c85f
fix missing pad token (#5690)
7 months ago
Hongxin Liu 7f8b16635b
[misc] refactor launch API and tensor constructor (#5666)
7 months ago
Edenzzzz d83c633ca6
[hotfix] Fix examples no pad token & auto parallel codegen bug; (#5606)
7 months ago
Wenhao Chen bb0a668fee
[hotfix] set return_outputs=False in examples and polish code (#5404)
8 months ago
flybird11111 29695cf70c
[example]add gpt2 benchmark example script. (#5295)
9 months ago
Hongxin Liu d202cc28c0
[npu] change device to accelerator api (#5239)
11 months ago
Baizhou Zhang df66741f77
[bug] fix get_default_parser in examples (#4764)
1 year ago
Wenhao Chen 7b9b86441f
[chat]: update rm, add wandb and fix bugs (#4471)
1 year ago
Hongxin Liu 079bf3cb26
[misc] update pre-commit and run all files (#4752)
1 year ago
github-actions[bot] 3c6b831c26
[format] applied code formatting on changed files in pull request 4743 (#4750)
1 year ago
Hongxin Liu b5f9e37c70
[legacy] clean up legacy code (#4743)
1 year ago
Bin Jia 608cffaed3
[example] add gpt2 HybridParallelPlugin example (#4653)
1 year ago
Hongxin Liu 554aa9592e
[legacy] move communication and nn to legacy and refactor logger (#4671)
1 year ago
Hongxin Liu ac178ca5c1
[legacy] move builder and registry to legacy (#4603)
1 year ago
Hongxin Liu 89fe027787
[legacy] move trainer to legacy (#4545)
1 year ago
Hongxin Liu 27061426f7
[gemini] improve compatibility and add static placement policy (#4479)
1 year ago
digger yu 2d40759a53
fix #3852 path error (#4058)
1 year ago
Baizhou Zhang 4da324cd60
[hotfix]fix argument naming in docs and examples (#4083)
1 year ago
LuGY 160c64c645
[example] fix bucket size in example of gpt gemini (#4028)
1 year ago
digger yu 33eef714db
fix typo examples and docs (#3932)
1 year ago
jiangmingyan 5f79008c4a
[example] update gemini examples (#3868)
2 years ago
binmakeswell 15024e40d9
[auto] fix install cmd (#3772)
2 years ago
digger-yu b9a8dff7e5
[doc] Fix typo under colossalai and doc(#3618)
2 years ago
Frank Lee 80eba05b0a
[test] refactor tests with spawn (#3452)
2 years ago
ver217 26b7aac0be
[zero] reorganize zero/gemini folder structure (#3424)
2 years ago
Yan Fang 189347963a
[auto] fix requirements typo for issue #3125 (#3209)
2 years ago
Zihao 18dbe76cae
[auto-parallel] add auto-offload feature (#3154)
2 years ago
Ziyue Jiang 400f63012e
[pipeline] Add Simplified Alpa DP Partition (#2507)
2 years ago
github-actions[bot] da056285f2
[format] applied code formatting on changed files in pull request 2922 (#2923)
2 years ago
binmakeswell 12bafe057f
[doc] update installation for GPT (#2922)
2 years ago
dawei-wang 55424a16a5
[doc] fix GPT tutorial (#2860)
2 years ago
cloudhuang 43dffdaba5
[doc] fixed a typo in GPT readme (#2736)
2 years ago
Jiatong (Julius) Han a255a38f7f
[example] Polish README.md (#2658)
2 years ago
HELSON 6e0faa70e0
[gemini] add profiler in the demo (#2534)
2 years ago
HELSON 66dfcf5281
[gemini] update the gpt example (#2527)
2 years ago
HELSON 707b11d4a0
[gemini] update ddp strict mode (#2518)
2 years ago
HELSON 2d1a7dfe5f
[zero] add strict ddp mode (#2508)
2 years ago
Jiarui Fang e327e95144
[hotfix] gpt example titans bug #2493 (#2494)
2 years ago
binmakeswell fcc6d61d92
[example] fix requirements (#2488)
2 years ago
Jiarui Fang 3a21485ead
[example] titans for gpt (#2484)
2 years ago
Jiarui Fang 7c31706227
[CI] add test_ci.sh for palm, opt and gpt (#2475)
2 years ago