ver217
|
26b7aac0be
|
[zero] reorganize zero/gemini folder structure (#3424)
* [zero] refactor low-level zero folder structure
* [zero] fix legacy zero import path
* [zero] fix legacy zero import path
* [zero] remove useless import
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor gemini folder structure
* [zero] refactor legacy zero import path
* [zero] fix test import path
* [zero] fix test
* [zero] fix circular import
* [zero] update import
|
2023-04-04 13:48:16 +08:00 |
HELSON
|
1a1d68b053
|
[moe] add checkpoint for moe models (#3354)
* [moe] add checkpoint for moe models
* [hotfix] fix bugs in unit test
|
2023-03-31 09:20:33 +08:00 |
HELSON
|
f7f2248771
|
[moe] fix MoE bugs (#1628)
* remove forced FP32 modules
* correct no_shard-contexts' positions
|
2022-09-22 13:56:30 +08:00 |
HELSON
|
e6d50ec107
|
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero
* add unitest for moe-zero model init
* polish moe gradient handler
|
2022-03-31 18:34:11 +08:00 |
HELSON
|
8c90d4df54
|
[zero] add zero context manager to change config during initialization (#546)
|
2022-03-29 17:57:59 +08:00 |
Liang Bowen
|
ec5086c49c
|
Refactored docstring to google style
|
2022-03-29 17:17:47 +08:00 |
Jiarui Fang
|
a445e118cf
|
[polish] polish singleton and global context (#500)
|
2022-03-23 18:03:39 +08:00 |
Jiarui Fang
|
65c0f380c2
|
[format] polish name format for MOE (#481)
|
2022-03-21 23:19:47 +08:00 |
HELSON
|
aff9d354f7
|
[MOE] polish moe_env (#467)
|
2022-03-19 15:36:25 +08:00 |
1SAA
|
82023779bb
|
Added TPExpert for special situation
|
2022-03-11 15:50:28 +08:00 |
HELSON
|
36b8477228
|
Fixed parameter initialization in FFNExpert (#251)
|
2022-03-11 15:50:28 +08:00 |
1SAA
|
219df6e685
|
Optimized MoE layer and fixed some bugs;
Decreased moe tests;
Added FFNExperts and ViTMoE model
|
2022-03-11 15:50:28 +08:00 |