ColossalAI/colossalai
ver217 d26902645e
[ddp] add save/load state dict for ColoDDP (#1127)
* add save/load state dict for ColoDDP

* add unit test

* refactor unit test folder

* polish unit test

* rename unit test
2022-06-20 10:51:47 +08:00
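The PR above adds `state_dict` save/load support to the `ColoDDP` wrapper. The actual ColoDDP API is not shown here, but the usual pattern for such DDP-style wrappers is to forward checkpoint operations to the wrapped module, so that saved keys carry no wrapper prefix and a checkpoint can be reloaded into a bare, unwrapped module. A minimal stand-alone sketch of that pattern, with all class and attribute names purely illustrative (not the real ColossalAI API):

```python
from collections import OrderedDict

class TinyModule:
    """Hypothetical stand-in for a plain model with a state dict."""
    def __init__(self):
        self.weight = 1.0
        self.bias = 0.0

    def state_dict(self):
        # Export parameters as an ordered mapping, torch-style.
        return OrderedDict(weight=self.weight, bias=self.bias)

    def load_state_dict(self, state):
        self.weight = state["weight"]
        self.bias = state["bias"]

class DDPWrapper:
    """Hypothetical DDP-style wrapper around a module."""
    def __init__(self, module):
        self.module = module

    # Forward checkpoint ops to the wrapped module so the saved
    # state dict has the same keys as the unwrapped module's.
    def state_dict(self):
        return self.module.state_dict()

    def load_state_dict(self, state):
        self.module.load_state_dict(state)

# Save from a wrapped model, reload into a plain one.
src = DDPWrapper(TinyModule())
src.module.weight = 42.0
ckpt = src.state_dict()

dst = TinyModule()        # no wrapper needed to restore
dst.load_state_dict(ckpt)
print(dst.weight)         # → 42.0
```

Because the wrapper delegates rather than re-namespacing keys, checkpoints stay interchangeable between wrapped and unwrapped models, which is the usual reason DDP-style wrappers implement these two methods explicitly.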
Name | Last commit message | Last commit date
amp [amp] included dict for type casting of model output (#1102) 2022-06-13 14:18:04 +08:00
builder [pipeline] refactor the pipeline module (#1087) 2022-06-10 11:27:38 +08:00
cli [hotfix] fix some bugs caused by size mismatch. (#1011) 2022-05-23 14:02:28 +08:00
communication [pipeline]refactor ppschedule to support tensor list (#1050) 2022-06-02 13:48:59 +08:00
context [usability] improved error messages in the context module (#856) 2022-04-25 13:42:31 +08:00
engine [hotfix]fix bugs caused by refactored pipeline (#1133) 2022-06-17 17:54:15 +08:00
fx [fx]add autoparallel passes (#1121) 2022-06-15 16:36:46 +08:00
gemini [gemini] gemini mgr supports "cpu" placement policy (#1118) 2022-06-15 15:05:19 +08:00
kernel [NFC] polish colossalai/kernel/cuda_native/csrc/colossal_C_frontend.cpp code style 2022-05-20 23:57:38 +08:00
logging [doc] improved docstring in the logging module (#861) 2022-04-25 13:42:00 +08:00
nn [ddp] add save/load state dict for ColoDDP (#1127) 2022-06-20 10:51:47 +08:00
pipeline [hotfix]fix bugs caused by refactored pipeline (#1133) 2022-06-17 17:54:15 +08:00
registry Remove duplication registry (#1078) 2022-06-08 07:47:24 +08:00
tensor [hotfix] fix param op hook (#1131) 2022-06-17 16:12:05 +08:00
testing [test] skip tests when not enough GPUs are detected (#1090) 2022-06-09 17:19:13 +08:00
trainer fix issue #1080 (#1071) 2022-06-07 17:21:11 +08:00
utils [pipeline] refactor the pipeline module (#1087) 2022-06-10 11:27:38 +08:00
zero [hotfix] fix zero init ctx numel (#1128) 2022-06-16 17:17:27 +08:00
__init__.py [NFC] polish __init__.py code style (#965) 2022-05-17 10:25:06 +08:00
constants.py fix typo in constants (#1027) 2022-05-26 08:45:08 +08:00
core.py [polish] polish singleton and global context (#500) 2022-03-23 18:03:39 +08:00
global_variables.py [MOE] add unitest for MOE experts layout, gradient handler and kernel (#469) 2022-03-21 13:35:04 +08:00
initialize.py [ddp] supported customized torch ddp configuration (#1123) 2022-06-15 18:11:53 +08:00