amp
[hotfix] fix memory leak in zero (#781)
2022-04-18 13:57:03 +08:00
builder
modefied the pp build for ckpt adaptation (#803)
2022-04-24 12:23:16 +08:00
cli
[cli] added check installation cli
2022-04-20 12:13:27 +08:00
communication
[util] fixed communication API depth with PyTorch 1.9 (#721)
2022-04-12 09:44:40 +08:00
context
[compatibility] used backward-compatible API for global process group (#758)
2022-04-14 17:20:35 +08:00
engine
[refactor] moving grad acc logic to engine (#804)
2022-04-19 14:03:21 +08:00
gemini
[tensor] reorganize files (#820)
2022-04-21 14:15:48 +08:00
kernel
Revert "[zero] add ZeroTensorShardStrategy (#793)" (#806)
2022-04-19 14:40:02 +08:00
logging
Refactored docstring to google style
2022-03-29 17:17:47 +08:00
nn
[TP] change the check assert in split batch 2d (#772)
2022-04-16 21:29:57 +08:00
registry
[dependency] removed torchvision (#833)
2022-04-22 15:24:35 +08:00
tensor
[hotfix] the bug of numel() in ColoTensor (#845)
2022-04-24 12:32:10 +08:00
testing
[test] added a decorator for address already in use error with backward compatibility (#760)
2022-04-14 16:48:44 +08:00
trainer
[log] local throughput metrics (#811)
2022-04-20 10:05:39 +08:00
utils
[pipelinable]use pipelinable context to initialize non-pipeline model (#816)
2022-04-24 13:03:12 +08:00
zero
revert zero tensors back (#829)
2022-04-22 12:12:35 +08:00
__init__.py
Develop/experiments (#59)
2021-12-09 15:08:29 +08:00
constants.py
fix format constants.py (#358)
2022-03-11 15:50:28 +08:00
core.py
[polish] polish singleton and global context (#500)
2022-03-23 18:03:39 +08:00
global_variables.py
[MOE] add unitest for MOE experts layout, gradient handler and kernel (#469)
2022-03-21 13:35:04 +08:00
initialize.py
modefied the pp build for ckpt adaptation (#803)
2022-04-24 12:23:16 +08:00