HELSON
|
e5ea3fdeef
|
[gemini] add GeminiMemoryManager (#832)
* refactor StatefulTensor, tensor utilities
* add unit test for GeminiMemoryManager
|
2022-04-24 13:08:48 +08:00 |
HELSON
|
a9b8300d54
|
[zero] improve adaptability for not-shard parameters (#708)
* adapt post grad hooks for not-shard parameters
* adapt optimizer for not-shard parameters
* offload gradients for not-replicated parameters
|
2022-04-11 13:38:51 +08:00 |
ver217
|
8432dc7080
|
polish moe docstring (#618)
|
2022-04-01 16:15:36 +08:00 |
HELSON
|
e6d50ec107
|
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero
* add unit test for moe-zero model init
* polish moe gradient handler
|
2022-03-31 18:34:11 +08:00 |
HELSON
|
8c90d4df54
|
[zero] add zero context manager to change config during initialization (#546)
|
2022-03-29 17:57:59 +08:00 |
Liang Bowen
|
ec5086c49c
|
Refactored docstring to google style
|
2022-03-29 17:17:47 +08:00 |
Jiarui Fang
|
a445e118cf
|
[polish] polish singleton and global context (#500)
|
2022-03-23 18:03:39 +08:00 |
HELSON
|
c9023d4078
|
[MOE] support PR-MOE (#488)
|
2022-03-22 16:48:22 +08:00 |
HELSON
|
d7ea63992b
|
[MOE] add FP32LinearGate for MOE in NaiveAMP context (#480)
|
2022-03-22 10:50:20 +08:00 |
Jiarui Fang
|
65c0f380c2
|
[format] polish name format for MOE (#481)
|
2022-03-21 23:19:47 +08:00 |
HELSON
|
aff9d354f7
|
[MOE] polish moe_env (#467)
|
2022-03-19 15:36:25 +08:00 |
HELSON
|
bccbc15861
|
[MOE] changed parallelmode to dist process group (#460)
|
2022-03-19 13:46:29 +08:00 |
HELSON
|
dbdc9a7783
|
added Multiply Jitter and capacity factor eval for MOE (#434)
|
2022-03-16 16:47:44 +08:00 |
HELSON
|
3f70a2b12f
|
removed noisy function during evaluation of MoE router (#419)
|
2022-03-15 12:06:09 +08:00 |
1SAA
|
82023779bb
|
Added TPExpert for special situation
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
36b8477228
|
Fixed parameter initialization in FFNExpert (#251)
|
2022-03-11 15:50:28 +08:00 |
1SAA
|
219df6e685
|
Optimized MoE layer and fixed some bugs
* Decreased moe tests
* Added FFNExperts and ViTMoE model
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
0f8c7f9804
|
Fixed docstring in colossalai (#171)
|
2022-01-21 10:44:30 +08:00 |
HELSON
|
dceae85195
|
Added MoE parallel (#127)
|
2022-01-07 15:08:36 +08:00 |