Commit Graph

23 Commits (29159d9b5b1c2e4dd2dac7b795b6eeeb092dd993)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| LuGY | 13ed4b6441 | [model zoo] add activation offload for gpt model (#582) | 3 years ago |
| HELSON | 0f2d219162 | [MOE] add MOEGPT model (#510) | 3 years ago |
| Jiarui Fang | a445e118cf | [polish] polish singleton and global context (#500) | 3 years ago |
| HELSON | c9023d4078 | [MOE] support PR-MOE (#488) | 3 years ago |
| ver217 | d70f43dd7a | embedding remove attn mask (#474) | 3 years ago |
| HELSON | 7544347145 | [MOE] add unitest for MOE experts layout, gradient handler and kernel (#469) | 3 years ago |
| ver217 | 1559c0df41 | fix attn mask shape of gpt (#472) | 3 years ago |
| ver217 | 304263c2ce | fix gpt attention mask (#461) | 3 years ago |
| HELSON | dbdc9a7783 | added Multiply Jitter and capacity factor eval for MOE (#434) | 3 years ago |
| Frank Lee | 0f5f5dd556 | fixed gpt attention mask in pipeline (#430) | 3 years ago |
| lucasliunju | ce886a9062 | fix format (#374) | 3 years ago |
| Ziheng Qin | 0db43fa995 | fix format (#364) | 3 years ago |
| 1SAA | 82023779bb | Added TPExpert for special situation | 3 years ago |
| 1SAA | 219df6e685 | Optimized MoE layer and fixed some bugs; | 3 years ago |
| アマデウス | 9ee197d0e9 | moved env variables to global variables; (#215) | 3 years ago |
| HELSON | 1ff5be36c2 | Added moe parallel example (#140) | 3 years ago |
| HELSON | dceae85195 | Added MoE parallel (#127) | 3 years ago |
| ver217 | 7904baf6e1 | fix layers/schedule for hybrid parallelization (#111) (#112) | 3 years ago |
| アマデウス | e5b9f9a08d | added gpt model & benchmark (#95) | 3 years ago |
| アマデウス | 01a80cd86d | Hotfix/Colossalai layers (#92) | 3 years ago |
| アマデウス | 0fedef4f3c | Layer integration (#83) | 3 years ago |
| Frank Lee | da01c234e1 | Develop/experiments (#59) | 3 years ago |
| zbian | 404ecbdcc6 | Migrated project | 3 years ago |