Commit Graph

11 Commits (e1b0a78afa9d78fbac81d772834a4d6f7b45ce5b)

Author SHA1 Message Date
HELSON e5ea3fdeef
[gemini] add GeminiMemoryManger (#832)
* refactor StatefulTensor, tensor utilities

* add unitest for GeminiMemoryManager
2022-04-24 13:08:48 +08:00
HELSON a9b8300d54
[zero] improve adaptability for not-shard parameters (#708)
* adapt post grad hooks for not-shard parameters
* adapt optimizer for not-shard parameters
* offload gradients for not-replicated parameters
2022-04-11 13:38:51 +08:00
ver217 8432dc7080
polish moe docsrting (#618) 2022-04-01 16:15:36 +08:00
HELSON e6d50ec107
[zero] adapt zero for unsharded parameters (#561)
* support existing sharded and unsharded parameters in zero

* add unitest for moe-zero model init

* polish moe gradient handler
2022-03-31 18:34:11 +08:00
Liang Bowen ec5086c49c Refactored docstring to google style 2022-03-29 17:17:47 +08:00
Jiarui Fang a445e118cf
[polish] polish singleton and global context (#500) 2022-03-23 18:03:39 +08:00
HELSON d7ea63992b
[MOE] add FP32LinearGate for MOE in NaiveAMP context (#480) 2022-03-22 10:50:20 +08:00
HELSON aff9d354f7
[MOE] polish moe_env (#467) 2022-03-19 15:36:25 +08:00
HELSON dbdc9a7783
added Multiply Jitter and capacity factor eval for MOE (#434) 2022-03-16 16:47:44 +08:00
1SAA 82023779bb Added TPExpert for special situation 2022-03-11 15:50:28 +08:00
1SAA 219df6e685 Optimized MoE layer and fixed some bugs;
Decreased moe tests;

Added FFNExperts and ViTMoE model
2022-03-11 15:50:28 +08:00