ColossalAI/colossalai
Kirigaya Kazuto 3b2a59b0ba
[pipeline/rank_recorder] fix bug when process data before backward | add a tool for multiple ranks debug (#1681)
* [pipeline/tuning] improve dispatch performance both time and space cost

* [pipeline/converge] add interface for testing convergence

* [NFC] polish colossalai/utils/multi_tensor_apply/multi_tensor_apply.py code style

* Update PipelineBase.py

* [pipeline/chimera] reconstruct PipelineBase and Worker to support more feasible custom schedule | finish Chimera

* [pipeline/chimera] test chimera | fix bug of initializing

* [pipeline/pytree] add pytree to process args and kwargs | provide  to process args and kwargs after forward
2022-10-09 17:32:57 +08:00
amp [doc] update rst and docstring (#1351) 2022-07-21 15:54:53 +08:00
auto_parallel [autoparallel] add unary element wise handler v2 (#1674) 2022-10-09 17:30:42 +08:00
builder [NFC] polish colossalai/builder/__init__.py code style (#1560) 2022-09-08 22:11:04 +08:00
cli [hotfix] fix some bugs caused by size mismatch. (#1011) 2022-05-23 14:02:28 +08:00
communication [communication] add p2p_v2.py to support communication with List[Any] (#1407) 2022-08-09 11:40:04 +08:00
context [moe] initialize MoE groups by ProcessGroup (#1640) 2022-09-23 17:20:41 +08:00
device [tensor]add 1D device mesh (#1492) 2022-08-25 16:48:12 +08:00
engine [engin/schedule] use p2p_v2 to recontruct pipeline_schedule (#1408) 2022-08-12 11:33:26 +08:00
fx [autoparallel] fix insecure subprocess (#1680) 2022-10-06 15:07:03 +08:00
gemini [feature] A new ZeRO implementation (#1644) 2022-10-09 09:18:51 +08:00
kernel [hotfix] fix CPUAdam kernel nullptr (#1410) 2022-08-05 19:45:45 +08:00
logging [doc] improved docstring in the logging module (#861) 2022-04-25 13:42:00 +08:00
nn [feature] A new ZeRO implementation (#1644) 2022-10-09 09:18:51 +08:00
pipeline [pipeline/rank_recorder] fix bug when process data before backward | add a tool for multiple ranks debug (#1681) 2022-10-09 17:32:57 +08:00
registry Remove duplication registry (#1078) 2022-06-08 07:47:24 +08:00
tensor [autoparallel] update CommSpec (#1667) 2022-09-29 11:20:59 +08:00
testing [NFC] polish colossalai/testing/comparison.py code style. (#1558) 2022-09-08 22:11:04 +08:00
trainer [NFC] polish ./colossalai/trainer/hooks/_lr_scheduler_hook.py code style (#1576) 2022-09-08 22:11:04 +08:00
utils [pipeline/rank_recorder] fix bug when process data before backward | add a tool for multiple ranks debug (#1681) 2022-10-09 17:32:57 +08:00
zero [feature] A new ZeRO implementation (#1644) 2022-10-09 09:18:51 +08:00
__init__.py update version to 0.1.10 (#1676) 2022-10-09 10:43:29 +08:00
_meta_registrations.py [fx/profiler] tuned the calculation of memory estimation (#1619) 2022-09-23 10:59:47 +08:00
constants.py fix typo in constants (#1027) 2022-05-26 08:45:08 +08:00
core.py [Tensor] distributed view supports inter-process hybrid parallel (#1169) 2022-06-27 09:45:26 +08:00
global_variables.py [MOE] add unitest for MOE experts layout, gradient handler and kernel (#469) 2022-03-21 13:35:04 +08:00
initialize.py [hotfix] remove potiential circle import (#1307) 2022-07-14 13:44:26 +08:00