ColossalAI/colossalai
HELSON 340e59f968
[utils] add synchronized cuda memory monitor (#740)
2022-04-13 10:50:54 +08:00
Name | Last commit | Date
amp | [bug] fixed grad scaler compatibility with torch 1.8 (#735) | 2022-04-12 16:04:21 +08:00
builder | [NFC] polish colossalai/builder/builder.py code style (#662) | 2022-04-06 11:40:59 +08:00
communication | [util] fixed communication API depth with PyTorch 1.9 (#721) | 2022-04-12 09:44:40 +08:00
context | [utils] support detection of number of processes on current node (#723) | 2022-04-12 09:28:19 +08:00
engine | [refactor] zero directory (#724) | 2022-04-11 23:13:02 +08:00
kernel | [NFC] polish colossalai/kernel/cuda_native/csrc/multi_tensor_adam.cu code style (#667) | 2022-04-06 11:40:59 +08:00
logging | Refactored docstring to google style | 2022-03-29 17:17:47 +08:00
nn | [compatibility] fixed tensor parallel compatibility with torch 1.9 (#700) | 2022-04-11 13:44:50 +08:00
registry | Refactored docstring to google style | 2022-03-29 17:17:47 +08:00
testing | [test] fixed rerun_on_exception and adapted test cases (#487) | 2022-03-25 17:25:12 +08:00
trainer | [utils] add synchronized cuda memory monitor (#740) | 2022-04-13 10:50:54 +08:00
utils | [utils] add synchronized cuda memory monitor (#740) | 2022-04-13 10:50:54 +08:00
zero | [hotfix] fix memory leak in backward of sharded model (#741) | 2022-04-13 09:59:05 +08:00
__init__.py | Develop/experiments (#59) | 2021-12-09 15:08:29 +08:00
constants.py | fix format constants.py (#358) | 2022-03-11 15:50:28 +08:00
core.py | [polish] polish singleton and global context (#500) | 2022-03-23 18:03:39 +08:00
global_variables.py | [MOE] add unitest for MOE experts layout, gradient handler and kernel (#469) | 2022-03-21 13:35:04 +08:00
initialize.py | [utils] support detection of number of processes on current node (#723) | 2022-04-12 09:28:19 +08:00