ver217
|
823f3b9cf4
|
[doc] add deepspeed citation and copyright (#2996)
* [doc] add deepspeed citation and copyright
|
2023-03-04 20:08:11 +08:00 |
HELSON
|
56ddc9ca7a
|
[hotfix] add correct device for fake_param (#2796)
|
2023-02-17 15:29:07 +08:00 |
HELSON
|
b528eea0f0
|
[zero] add zero wrappers (#2523)
* [zero] add zero wrappers
* change names
* add wrapper functions to init
|
2023-01-29 17:52:58 +08:00 |
HELSON
|
2bfeb24308
|
[zero] add warning for ignored parameters (#2446)
|
2023-01-11 15:30:09 +08:00 |
HELSON
|
7829aa094e
|
[ddp] add is_ddp_ignored (#2434)
[ddp] rename to is_ddp_ignored
|
2023-01-11 12:22:45 +08:00 |
HELSON
|
dddacd2d2c
|
[hotfix] add norm clearing for the overflow step (#2416)
|
2023-01-10 15:43:06 +08:00 |
HELSON
|
e7d3afc9cc
|
[optimizer] add div_scale for optimizers (#2117)
* [optimizer] add div_scale for optimizers
* [zero] use div_scale in zero optimizer
* fix testing error
|
2022-12-12 17:58:57 +08:00 |
HELSON
|
63fbba3c19
|
[zero] add L2 gradient clipping for ZeRO (#2112)
* [zero] add L2 gradient clipping
* [testing] add MlpModel
* [zero] add unit test for grad clipping
* fix atol
|
2022-12-09 18:09:17 +08:00 |
Jiarui Fang
|
f7e276fa71
|
[Gemini] add GeminiAdamOptimizer (#1960)
|
2022-11-16 14:44:28 +08:00 |