Commit Graph

5 Commits (86bcda5ca9ce73d8f0751f58e80a280d61a1c93f)

Author SHA1 Message Date
Wenwen Qu 86bcda5ca9 add default setting for expert parallel size 2023-08-24 18:52:12 +08:00
Wenwen Qu 12c614db94 create expert data group and broadcast moe parameter in expert data group 2023-08-21 11:40:39 +08:00
Wenwen Qu c357288a8b feat(XXX): add moe 2023-08-07 20:17:49 +08:00
ytxiong c219065348 feat(*): support sequence_parallel (#180) 2023-08-07 16:42:52 +08:00
  * support sequence_parallel for no pipeline
  * sequence_parallel does not support no-flash-attn
  * support sequence parallel for pipeline
  * add memory profiler
  * Update 13B.py
  * add memory profiler
  * fix evaluation bug
  * remove some unnecessary code
  * remove some unnecessary code
  * Update parallel_context.py
  * modify the config
  * remove memory profiler
  * modify the config
  * support selective dropout
Sun Peng fa7337b37b initial commit 2023-07-06 12:55:23 +08:00