Commit Graph

69 Commits (8e6fdb4f2948bf27b67ca25501f381a0fb514146)

Author SHA1 Message Date
Ziyue Jiang 4b01da24cd
[TP] change the check assert in split batch 2d (#772)
3 years ago
アマデウス b8899e0905
[TP] allow layernorm without bias (#750)
3 years ago
Frank Lee eda30a058e
[compatibility] fixed tensor parallel compatibility with torch 1.9 (#700)
3 years ago
HELSON a9b8300d54
[zero] improve adaptability for not-shard parameters (#708)
3 years ago
アマデウス 3fc8a204dc
[]Corrected 3d vocab parallel embedding (#707)
3 years ago
HELSON b31daed4cf
fix bugs in CPU adam (#633)
3 years ago
Liang Bowen 828e465622
[hotfix] Raise messages for indivisible batch sizes with tensor parallelism (#622)
3 years ago
アマデウス 77ad24bf94
[model checkpoint] updated saving/loading for 3d layers (#597)
3 years ago
アマデウス 93089ed708
[model checkpoint] updated saving/loading for 2.5d layers (#596)
3 years ago
アマデウス c50bfb807b
[model checkpoint] updated saving/loading for 1d layers (#594)
3 years ago
アマデウス 7636d518e1
[model checkpoint] updated saving/loading for 2d layers (#595)
3 years ago
アマデウス cd13b63832
[model checkpoint] reworked unified layers for ease of save/load states (#593)
3 years ago
Ziyue Jiang 1c40ee8749
[TP] add assert for tp1d (#621)
3 years ago
ver217 e619a651fb
polish optimizer docstring (#619)
3 years ago
ver217 8432dc7080
polish moe docsrting (#618)
3 years ago
ver217 104cbbb313
[hotfix] add hybrid adam to __init__ (#584)
3 years ago
HELSON e6d50ec107
[zero] adapt zero for unsharded parameters (#561)
3 years ago
Wesley 46c9ba33da
update code format
3 years ago
Wesley 666cfd094a
fix parallel_input flag for Linear1D_Col gather_output
3 years ago
Liang Bowen 2c45efc398
html refactor (#555)
3 years ago
LuGY c44d797072
[docs] updatad docs of hybrid adam and cpu adam (#552)
3 years ago
Ziyue Jiang 763dc325f1
[TP] Add gather_out arg to Linear (#541)
3 years ago
HELSON 8c90d4df54
[zero] add zero context manager to change config during initialization (#546)
3 years ago
Liang Bowen ec5086c49c
Refactored docstring to google style
3 years ago
LuGY 105c5301c3
[zero]added hybrid adam, removed loss scale in adam (#527)
3 years ago
LuGY 6a3f9fda83
[cuda] modify the fused adam, support hybrid of fp16 and fp32 (#497)
3 years ago
Jiarui Fang a445e118cf
[polish] polish singleton and global context (#500)
3 years ago
ver217 9ec1ce6ab1
[zero] sharded model support the reuse of fp16 shard (#495)
3 years ago
HELSON c9023d4078
[MOE] support PR-MOE (#488)
3 years ago
ver217 62b0a8d644
[zero] sharded optim support hybrid cpu adam (#486)
3 years ago
HELSON d7ea63992b
[MOE] add FP32LinearGate for MOE in NaiveAMP context (#480)
3 years ago
Jiarui Fang 65c0f380c2
[format] polish name format for MOE (#481)
3 years ago
HELSON 7544347145
[MOE] add unitest for MOE experts layout, gradient handler and kernel (#469)
3 years ago
HELSON aff9d354f7
[MOE] polish moe_env (#467)
3 years ago
HELSON bccbc15861
[MOE] changed parallelmode to dist process group (#460)
3 years ago
Jiarui Fang 0fcfb1e00d
[test] make zero engine test really work (#447)
3 years ago
Jiarui Fang 237d08e7ee
[zero] hybrid cpu adam (#445)
3 years ago
HELSON dbdc9a7783
added Multiply Jitter and capacity factor eval for MOE (#434)
3 years ago
HELSON 3f70a2b12f
removed noisy function during evaluation of MoE router (#419)
3 years ago
Jiang Zhuo 5a4a3b77d9
fix format (#376)
3 years ago
LuGY de46450461
Added activation offload (#331)
3 years ago
Kai Wang (Victor Kai) 53bb3bcc0a
fix format (#362)
3 years ago
Yuer867 4a0f8c2c50
fix format parallel_2p5d (#357)
3 years ago
Liang Bowen 7eb87f516d
flake8 style (#352)
3 years ago
xuqifan897 148207048e
Qifan formated file ColossalAI\colossalai\nn\layer\parallel_1d\layers.py (#342)
3 years ago
DouJS cbb6436ff0
fix format for dir-[parallel_3d] (#333)
3 years ago
LuGY a3269de5c9
[zero] cpu adam kernel (#288)
3 years ago
1SAA 82023779bb
Added TPExpert for special situation
3 years ago
HELSON 36b8477228
Fixed parameter initialization in FFNExpert (#251)
3 years ago
アマデウス e13293bb4c
fixed CI dataset directory; fixed import error of 2.5d accuracy (#255)
3 years ago