ColossalAI

Commit Graph

Author	SHA1	Message	Date
Frank Lee	235792f170	[shardformer] updated readme (#3827)	1 year ago
FoolPlayer	8cc11235c0	[shardformer]: Feature/shardformer, add some docstring and readme (#3816 ) * init shardformer code structure * add implement of sharder (inject and replace) * add implement of replace layer to colossal layer * separate different layer policy, add some notion * implement 1d and 2d slicer, can tell col or row * fix bug when slicing and inject model * fix some bug; add inference test example * add share weight and train example * add train * add docstring and readme * add docstring for other files * pre-commit	1 year ago
FoolPlayer	8d68de767d	[shardformer] init shardformer code structure (#3731 ) * init shardformer code structure * add implement of sharder (inject and replace) * add implement of replace layer to colossal layer * separate different layer policy, add some notion * implement 1d and 2d slicer, can tell col or row * fix bug when slicing and inject model * fix some bug; add inference test example	1 year ago
Baizhou Zhang	1350ece492	[hotfix] fix import bug in checkpoint_io (#4142)	1 year ago
digger yu	8abc87798f	fix Tensor is not defined (#4129)	1 year ago
digger yu	7e46bc87b6	fix CheckpointIndexFile is not defined (#4109)	1 year ago
digger yu	09fe9dc704	[nfc]fix ColossalaiOptimizer is not defined (#4122)	1 year ago
Frank Lee	95e95b6d58	[testing] move pytest to be inside the function (#4087)	1 year ago
Baizhou Zhang	0bb0b481b4	[gemini] fix argument naming during chunk configuration searching	1 year ago
github-actions[bot]	a52f62082d	[format] applied code formatting on changed files in pull request 4021 (#4022 ) Co-authored-by: github-actions <github-actions@github.com>	1 year ago
Baizhou Zhang	822c3d4d66	[checkpointio] sharded optimizer checkpoint for DDP plugin (#4002)	1 year ago
Wenhao Chen	725af3eeeb	[booster] make optimizer argument optional for boost (#3993 ) * feat: make optimizer optional in Booster.boost * test: skip unet test if diffusers version > 0.10.2	1 year ago
Baizhou Zhang	c9cff7e7fa	[checkpointio] General Checkpointing of Sharded Optimizers (#3984)	1 year ago
Frank Lee	71fe52769c	[gemini] fixed the gemini checkpoint io (#3934)	1 year ago
Frank Lee	ddcf58cacf	Revert "[sync] sync feature/shardformer with develop"	1 year ago
FoolPlayer	24651fdd4f	Merge pull request #3931 from FrankLeeeee/sync/develop-to-shardformer [sync] sync feature/shardformer with develop	1 year ago
FoolPlayer	ef1537759c	[shardformer] add gpt2 policy and modify shard and slicer to support (#3883 ) * add gpt2 policy and modify shard and slicer to support * remove unused code * polish code	2 years ago
FoolPlayer	6370a935f6	update README (#3909)	2 years ago
FoolPlayer	21a3915c98	[shardformer] add Dropout layer support different dropout pattern (#3856 ) * add dropout layer, add dropout test * modify seed manager as context manager * add a copy of col_nn.layer * add dist_crossentropy loss; separate module test * polish the code * fix dist crossentropy loss	2 years ago
FoolPlayer	997544c1f9	[shardformer] update readme with modules implement doc (#3834 ) * update readme with modules content * remove img	2 years ago
Frank Lee	537a52b7a2	[shardformer] refactored the user api (#3828 ) * [shardformer] refactored the user api * polish code	2 years ago
Frank Lee	bc19024bf9	[shardformer] updated readme (#3827)	2 years ago
FoolPlayer	58f6432416	[shardformer]: Feature/shardformer, add some docstring and readme (#3816 ) * init shardformer code structure * add implement of sharder (inject and replace) * add implement of replace layer to colossal layer * separate different layer policy, add some notion * implement 1d and 2d slicer, can tell col or row * fix bug when slicing and inject model * fix some bug; add inference test example * add share weight and train example * add train * add docstring and readme * add docstring for other files * pre-commit	2 years ago
FoolPlayer	6a69b44dfc	[shardformer] init shardformer code structure (#3731 ) * init shardformer code structure * add implement of sharder (inject and replace) * add implement of replace layer to colossal layer * separate different layer policy, add some notion * implement 1d and 2d slicer, can tell col or row * fix bug when slicing and inject model * fix some bug; add inference test example	2 years ago
Frank Lee	eb39154d40	[dtensor] updated api and doc (#3845)	2 years ago
digger yu	de0d7df33f	[nfc] fix typo colossalai/zero (#3923)	2 years ago
digger yu	a9d1cadc49	fix typo with colossalai/trainer utils zero (#3908)	2 years ago
Frank Lee	d51e83d642	Merge pull request #3916 from FrankLeeeee/sync/dtensor-with-develop [sync] sync feature/dtensor with develop	2 years ago
Hongxin Liu	9c88b6cbd1	[lazy] fix compatibility problem on torch 1.13 (#3911)	2 years ago
digger yu	0e484e6201	[nfc]fix typo colossalai/pipeline tensor nn (#3899 ) * fix typo colossalai/autochunk auto_parallel amp * fix typo colossalai/auto_parallel nn utils etc. * fix typo colossalai/auto_parallel autochunk fx/passes etc. * fix typo docs/ * change placememt_policy to placement_policy in docs/ and examples/ * fix typo colossalai/ applications/ * fix typo colossalai/cli fx kernel * fix typo colossalai/nn * revert change warmuped * fix typo colossalai/pipeline tensor nn	2 years ago
Baizhou Zhang	c1535ccbba	[doc] fix docs about booster api usage (#3898)	2 years ago
digger yu	1878749753	[nfc] fix typo colossalai/nn (#3887 ) * fix typo colossalai/autochunk auto_parallel amp * fix typo colossalai/auto_parallel nn utils etc. * fix typo colossalai/auto_parallel autochunk fx/passes etc. * fix typo docs/ * change placememt_policy to placement_policy in docs/ and examples/ * fix typo colossalai/ applications/ * fix typo colossalai/cli fx kernel * fix typo colossalai/nn * revert change warmuped	2 years ago
Hongxin Liu	ae02d4e4f7	[bf16] add bf16 support (#3882 ) * [bf16] add bf16 support for fused adam (#3844) * [bf16] fused adam kernel support bf16 * [test] update fused adam kernel test * [test] update fused adam test * [bf16] cpu adam and hybrid adam optimizers support bf16 (#3860) * [bf16] implement mixed precision mixin and add bf16 support for low level zero (#3869) * [bf16] add mixed precision mixin * [bf16] low level zero optim support bf16 * [text] update low level zero test * [text] fix low level zero grad acc test * [bf16] add bf16 support for gemini (#3872) * [bf16] gemini support bf16 * [test] update gemini bf16 test * [doc] update gemini docstring * [bf16] add bf16 support for plugins (#3877) * [bf16] add bf16 support for legacy zero (#3879) * [zero] init context support bf16 * [zero] legacy zero support bf16 * [test] add zero bf16 test * [doc] add bf16 related docstring for legacy zero	2 years ago
Liu Ziming	8065cc5fba	Modify torch version requirement to adapt torch 2.0 (#3896)	2 years ago
Hongxin Liu	dbb32692d2	[lazy] refactor lazy init (#3891 ) * [lazy] remove old lazy init * [lazy] refactor lazy init folder structure * [lazy] fix lazy tensor deepcopy * [test] update lazy init test	2 years ago
digger yu	70c8cdecf4	[nfc] fix typo colossalai/cli fx kernel (#3847 ) * fix typo colossalai/autochunk auto_parallel amp * fix typo colossalai/auto_parallel nn utils etc. * fix typo colossalai/auto_parallel autochunk fx/passes etc. * fix typo docs/ * change placememt_policy to placement_policy in docs/ and examples/ * fix typo colossalai/ applications/ * fix typo colossalai/cli fx kernel	2 years ago
digger yu	e2d81eba0d	[nfc] fix typo colossalai/ applications/ (#3831 ) * fix typo colossalai/autochunk auto_parallel amp * fix typo colossalai/auto_parallel nn utils etc. * fix typo colossalai/auto_parallel autochunk fx/passes etc. * fix typo docs/ * change placememt_policy to placement_policy in docs/ and examples/ * fix typo colossalai/ applications/	2 years ago
wukong1992	3229f93e30	[booster] add warning for torch fsdp plugin doc (#3833)	2 years ago
Hongxin Liu	7c9f2ed6dd	[dtensor] polish sharding spec docstring (#3838 ) * [dtensor] polish sharding spec docstring * [dtensor] polish sharding spec example docstring	2 years ago
digger yu	7f8203af69	fix typo colossalai/auto_parallel autochunk fx/passes etc. (#3808)	2 years ago
wukong1992	6b305a99d6	[booster] torch fsdp fix ckpt (#3788)	2 years ago
digger yu	9265f2d4d7	[NFC]fix typo colossalai/auto_parallel nn utils etc. (#3779 ) * fix typo colossalai/autochunk auto_parallel amp * fix typo colossalai/auto_parallel nn utils etc.	2 years ago
jiangmingyan	e871e342b3	[API] add docstrings and initialization to apex amp, naive amp (#3783 ) * [mixed_precison] add naive amp demo * [mixed_precison] add naive amp demo * [api] add docstrings and initialization to apex amp, naive amp * [api] add docstring to apex amp/ naive amp * [api] add docstring to apex amp/ naive amp * [api] add docstring to apex amp/ naive amp * [api] add docstring to apex amp/ naive amp * [api] add docstring to apex amp/ naive amp * [api] add docstring to apex amp/ naive amp * [api] fix * [api] fix	2 years ago
Frank Lee	f5c425c898	fixed the example docstring for booster (#3795)	2 years ago
Hongxin Liu	72688adb2f	[doc] add booster docstring and fix autodoc (#3789 ) * [doc] add docstr for booster methods * [doc] fix autodoc	2 years ago
Hongxin Liu	3c07a2846e	[plugin] a workaround for zero plugins' optimizer checkpoint (#3780 ) * [test] refactor torch ddp checkpoint test * [plugin] update low level zero optim checkpoint * [plugin] update gemini optim checkpoint	2 years ago
Hongxin Liu	60e6a154bc	[doc] add tutorial for booster checkpoint (#3785 ) * [doc] add checkpoint related docstr for booster * [doc] add en checkpoint doc * [doc] add zh checkpoint doc * [doc] add booster checkpoint doc in sidebar * [doc] add cuation about ckpt for plugins * [doc] add doctest placeholder * [doc] add doctest placeholder * [doc] add doctest placeholder	2 years ago
digger yu	32f81f14d4	[NFC] fix typo colossalai/amp auto_parallel autochunk (#3756)	2 years ago
Hongxin Liu	5452df63c5	[plugin] torch ddp plugin supports sharded model checkpoint (#3775 ) * [plugin] torch ddp plugin add save sharded model * [test] fix torch ddp ckpt io test * [test] fix torch ddp ckpt io test * [test] fix low level zero plugin test * [test] fix low level zero plugin test * [test] add debug info * [test] add debug info * [test] add debug info * [test] add debug info * [test] add debug info * [test] fix low level zero plugin test * [test] fix low level zero plugin test * [test] remove debug info	2 years ago
jiangmingyan	2703a37ac9	[amp] Add naive amp demo (#3774 ) * [mixed_precison] add naive amp demo * [mixed_precison] add naive amp demo	2 years ago
digger yu	1baeb39c72	[NFC] fix typo with colossalai/auto_parallel/tensor_shard (#3742 ) * fix typo applications/ and colossalai/ date 5.11 * fix typo colossalai/	2 years ago
wukong1992	b37797ed3d	[booster] support torch fsdp plugin in booster (#3697 ) Co-authored-by: 纪少敏 <jishaomin@jishaomindeMBP.lan>	2 years ago
digger-yu	ad6460cf2c	[NFC] fix typo applications/ and colossalai/ (#3735)	2 years ago
digger-yu	b7141c36dd	[CI] fix some spelling errors (#3707 ) * fix spelling error with examples/comminity/ * fix spelling error with tests/ * fix some spelling error with tests/ colossalai/ etc.	2 years ago
jiangmingyan	20068ba188	[booster] add tests for ddp and low level zero's checkpointio (#3715 ) * [booster] update tests for booster * [booster] update tests for booster * [booster] update tests for booster * [booster] update tests for booster * [booster] update tests for booster * [booster] update booster tutorials#3717, fix recursive check	2 years ago
Hongxin Liu	6552cbf8e1	[booster] fix no_sync method (#3709 ) * [booster] fix no_sync method * [booster] add test for ddp no_sync * [booster] fix merge * [booster] update unit test * [booster] update unit test * [booster] update unit test	2 years ago
Hongxin Liu	3bf09efe74	[booster] update prepare dataloader method for plugin (#3706 ) * [booster] add prepare dataloader method for plug * [booster] update examples and docstr	2 years ago
Hongxin Liu	f83ea813f5	[example] add train resnet/vit with booster example (#3694 ) * [example] add train vit with booster example * [example] update readme * [example] add train resnet with booster example * [example] enable ci * [example] enable ci * [example] add requirements * [hotfix] fix analyzer init * [example] update requirements	2 years ago
YH	2629f9717d	[tensor] Refactor handle_trans_spec in DistSpecManager	2 years ago
Hongxin Liu	d0915f54f4	[booster] refactor all dp fashion plugins (#3684 ) * [booster] add dp plugin base * [booster] inherit dp plugin base * [booster] refactor unit tests	2 years ago
jiangmingyan	307894f74d	[booster] gemini plugin support shard checkpoint (#3610 ) * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin add shard checkpoint save/load * gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint * [API Refactoring]gemini plugin support shard checkpoint --------- Co-authored-by: luchen <luchen@luchendeMBP.lan> Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local>	2 years ago
YH	a22407cc02	[zero] Suggests a minor change to confusing variable names in the ZeRO optimizer. (#3173 ) * Fix confusing variable name in zero opt * Apply lint * Fix util func * Fix minor util func * Fix zero param optimizer name	2 years ago
Hongxin Liu	50793b35f4	[gemini] accelerate inference (#3641 ) * [gemini] support don't scatter after inference * [chat] update colossalai strategy * [chat] fix opt benchmark * [chat] update opt benchmark * [gemini] optimize inference * [test] add gemini inference test * [chat] fix unit test ci * [chat] fix ci * [chat] fix ci * [chat] skip checkpoint test	2 years ago
Hongxin Liu	4b3240cb59	[booster] add low level zero plugin (#3594 ) * [booster] add low level zero plugin * [booster] fix gemini plugin test * [booster] fix precision * [booster] add low level zero plugin test * [test] fix booster plugin test oom * [test] fix booster plugin test oom * [test] fix googlenet and inception output trans * [test] fix diffuser clip vision model * [test] fix torchaudio_wav2vec2_base * [test] fix low level zero plugin test	2 years ago
digger-yu	b9a8dff7e5	[doc] Fix typo under colossalai and doc(#3618 ) * Fixed several spelling errors under colossalai * Fix the spelling error in colossalai and docs directory * Cautious Changed the spelling error under the example folder * Update runtime_preparation_pass.py revert autograft to autograd * Update search_chunk.py utile to until * Update check_installation.py change misteach to mismatch in line 91 * Update 1D_tensor_parallel.md revert to perceptron * Update 2D_tensor_parallel.md revert to perceptron in line 73 * Update 2p5D_tensor_parallel.md revert to perceptron in line 71 * Update 3D_tensor_parallel.md revert to perceptron in line 80 * Update README.md revert to resnet in line 42 * Update reorder_graph.py revert to indice in line 7 * Update p2p.py revert to megatron in line 94 * Update initialize.py revert to torchrun in line 198 * Update routers.py change to detailed in line 63 * Update routers.py change to detailed in line 146 * Update README.md revert random number in line 402	2 years ago
Hongxin Liu	12eff9eb4c	[gemini] state dict supports fp16 (#3590 ) * [gemini] save state dict support fp16 * [gemini] save state dict shard support fp16 * [gemini] fix state dict * [gemini] fix state dict	2 years ago
Hongxin Liu	dac127d0ee	[fx] fix meta tensor registration (#3589 ) * [meta] fix torch 1.13.1 * [meta] fix torch 2.0.0 * [meta] fix torch 1.13.0 * [meta] polish code	2 years ago
Hongxin Liu	f313babd11	[gemini] support save state dict in shards (#3581 ) * [gemini] support state dict shard * [gemini] add test state dict shard * [gemini] polish docstr * [gemini] fix merge * [gemini] polish code	2 years ago
YH	d329c294ec	Add docstr for zero3 chunk search utils (#3572)	2 years ago
Hongxin Liu	173dad0562	[misc] add verbose arg for zero and op builder (#3552 ) * [misc] add print verbose * [gemini] add print verbose * [zero] add print verbose for low level * [misc] add print verbose for op builder	2 years ago
Hongxin Liu	4341f5e8e6	[lazyinit] fix clone and deepcopy (#3553)	2 years ago
Hongxin Liu	152239bbfa	[gemini] gemini supports lazy init (#3379 ) * [gemini] fix nvme optimizer init * [gemini] gemini supports lazy init * [gemini] add init example * [gemini] add fool model * [zero] update gemini ddp * [zero] update init example * add chunk method * add chunk method * [lazyinit] fix lazy tensor tolist * [gemini] fix buffer materialization * [misc] remove useless file * [booster] update gemini plugin * [test] update gemini plugin test * [test] fix gemini plugin test * [gemini] fix import * [gemini] fix import * [lazyinit] use new metatensor * [lazyinit] use new metatensor * [lazyinit] fix __set__ method	2 years ago
jiangmingyan	366a035552	[checkpoint] Shard saved checkpoint need to be compatible with the naming format of hf checkpoint files (#3479 ) * [checkpoint] support huggingface style sharded checkpoint, to be compatible with hf file naming format * [checkpoint] support huggingface style sharded checkpoint, to be compatible with hf file naming format * [checkpoint] Shard saved checkpoint add 'variant' field to customize filename * [checkpoint] Shard saved checkpoint add 'variant' field to customize filename * [checkpoint] Shard saved checkpoint add 'variant' field to customize filename * [checkpoint] Shard saved checkpoint add 'variant' field to customize filename --------- Co-authored-by: luchen <luchen@luchendeMacBook-Pro.local> Co-authored-by: luchen <luchen@luchendeMBP.lan>	2 years ago
YH	bcf0cbcbe7	[doc] Add docs for clip args in zero optim (#3504)	2 years ago
jiangmingyan	52a933e175	[checkpoint] support huggingface style sharded checkpoint (#3461 ) * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint * [checkpoint] support huggingface style sharded checkpoint --------- Co-authored-by: luchen <luchen@luchendeMBP.lan>	2 years ago
Frank Lee	80eba05b0a	[test] refactor tests with spawn (#3452 ) * [test] added spawn decorator * polish code * polish code * polish code * polish code * polish code * polish code	2 years ago
Frank Lee	7d8d825681	[booster] fixed the torch ddp plugin with the new checkpoint api (#3442)	2 years ago
YH	8f740deb53	Fix typo (#3448)	2 years ago
Hakjin Lee	46c009dba4	[format] Run lint on colossalai.engine (#3367)	2 years ago
YuliangLiu0306	ffcdbf0f65	[autoparallel]integrate auto parallel feature with new tracer (#3408 ) * [autoparallel] integrate new analyzer in module level * unify the profiling method * polish * fix no codegen bug * fix pass bug * fix liveness test * polish	2 years ago
ver217	573af84184	[example] update examples related to zero/gemini (#3431 ) * [zero] update legacy import * [zero] update examples * [example] fix opt tutorial * [example] fix opt tutorial * [example] fix opt tutorial * [example] fix opt tutorial * [example] fix import	2 years ago
Frank Lee	1beb85cc25	[checkpoint] refactored the API and added safetensors support (#3427 ) * [checkpoint] refactored the API and added safetensors support * polish code	2 years ago
ver217	26b7aac0be	[zero] reorganize zero/gemini folder structure (#3424 ) * [zero] refactor low-level zero folder structure * [zero] fix legacy zero import path * [zero] fix legacy zero import path * [zero] remove useless import * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor gemini folder structure * [zero] refactor legacy zero import path * [zero] fix test import path * [zero] fix test * [zero] fix circular import * [zero] update import	2 years ago
Frank Lee	638a07a7f9	[test] fixed gemini plugin test (#3411 ) * [test] fixed gemini plugin test * polish code * polish code	2 years ago
ver217	5f2e34e6c9	[booster] implement Gemini plugin (#3352 ) * [booster] add gemini plugin * [booster] update docstr * [booster] gemini plugin add coloparam convertor * [booster] fix coloparam convertor * [booster] fix gemini plugin device * [booster] add gemini plugin test * [booster] gemini plugin ignore sync bn * [booster] skip some model * [booster] skip some model * [booster] modify test world size * [booster] modify test world size * [booster] skip test	2 years ago
HELSON	1a1d68b053	[moe] add checkpoint for moe models (#3354 ) * [moe] add checkpoint for moe models * [hotfix] fix bugs in unit test	2 years ago
YuliangLiu0306	fee2af8610	[autoparallel] adapt autoparallel with new analyzer (#3261 ) * [autoparallel] adapt autoparallel with new analyzer * fix all node handler tests * polish * polish	2 years ago
Ofey Chan	8706a8c66c	[NFC] polish colossalai/engine/gradient_handler/__init__.py code style (#3329)	2 years ago
yuxuan-lou	198a74b9fd	[NFC] polish colossalai/context/random/__init__.py code style (#3327)	2 years ago
YuliangLiu0306	fbd2a9e05b	[hotfix] meta_tensor_compatibility_with_torch2	2 years ago
Michelle	ad285e1656	[NFC] polish colossalai/fx/tracer/_tracer_utils.py (#3323 ) * [NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style * [NFC] polish colossalai/fx/tracer/_tracer_utils.py code style --------- Co-authored-by: Qianran Ma <qianranm@luchentech.com>	2 years ago
Xu Kai	64350029fe	[NFC] polish colossalai/gemini/paramhooks/_param_hookmgr.py code style	2 years ago
RichardoLuo	1ce9d0c531	[NFC] polish initializer_data.py code style (#3287)	2 years ago
Ziheng Qin	1bed38ef37	[NFC] polish colossalai/cli/benchmark/models.py code style (#3290)	2 years ago
Kai Wang (Victor Kai)	964a28678f	[NFC] polish initializer_3d.py code style (#3279)	2 years ago
Sze-qq	94eec1c5ad	[NFC] polish colossalai/engine/gradient_accumulation/_gradient_accumulation.py code style (#3277 ) Co-authored-by: siqi <siqi@siqis-MacBook-Pro.local>	2 years ago
Arsmart1	8af977f223	[NFC] polish colossalai/context/parallel_context.py code style (#3276)	2 years ago
Zirui Zhu	1168b50e33	[NFC] polish colossalai/engine/schedule/_pipeline_schedule_v2.py code style (#3275)	2 years ago
Tong Li	196d4696d0	[NFC] polish colossalai/nn/_ops/addmm.py code style (#3274)	2 years ago
lucasliunju	4b95464994	[NFC] polish colossalai/amp/__init__.py code style (#3272)	2 years ago
Xuanlei Zhao	6b3bb2c249	[NFC] polish code style (#3273)	2 years ago
CZYCW	4cadb25b96	[NFC] policy colossalai/fx/proxy.py code style (#3269)	2 years ago
Yuanchen	d58fa705b2	[NFC] polish code style (#3268 ) Co-authored-by: Yuanchen Xu <yuanchen.xu00@gmail.com>	2 years ago
Camille Zhong	c4a226b729	[NFC] polish tensor_placement_policy.py code style (#3265)	2 years ago
CsRic	00778abc48	[NFC] polish colossalai/fx/passes/split_module.py code style (#3263 ) Co-authored-by: csric <richcsr256@gmail.com>	2 years ago
jiangmingyan	488f37048c	[NFC] polish colossalai/global_variables.py code style (#3259 ) Co-authored-by: luchen <luchen@luchendeMBP.lan>	2 years ago
LuGY	1ff7d5bfa5	[NFC] polish colossalai/engine/gradient_handler/_moe_gradient_handler.py (#3260)	2 years ago
dayellow	204ca2f09a	[NFC] polish colossalai/fx/profiler/experimental/profiler_module/embedding.py code style (#3256 ) Co-authored-by: Minghao Huang <huangminghao@luchentech.com>	2 years ago
HELSON	02b058032d	[fx] meta registration compatibility (#3253 ) * [fx] meta registration compatibility * fix error	2 years ago
Frank Lee	73d3e4d309	[booster] implemented the torch ddd + resnet example (#3232 ) * [booster] implemented the torch ddd + resnet example * polish code	2 years ago
YH	1a229045af	Add interface for colo tesnor dp size (#3227 )	2 years ago
YuliangLiu0306	4d5d8f98a4	[API] implement device mesh manager (#3221 ) * [API] implement device mesh manager * polish	2 years ago
Frank Lee	cd142fbefa	[api] implemented the checkpoint io module (#3205 ) * [api] implemented the checkpoint io module * polish code * polish code	2 years ago
ver217	f8289d4221	[lazyinit] combine lazy tensor with dtensor (#3204 ) * [lazyinit] lazy tensor add distribute * [lazyinit] refactor distribute * [lazyinit] add test dist lazy init * [lazyinit] add verbose info for dist lazy init * [lazyinit] fix rnn flatten weight op * [lazyinit] polish test * [lazyinit] polish test * [lazyinit] fix lazy tensor data setter * [lazyinit] polish test * [lazyinit] fix clean * [lazyinit] make materialize inplace * [lazyinit] refactor materialize * [lazyinit] refactor test distribute * [lazyinit] fix requires_grad * [lazyinit] fix tolist after materialization * [lazyinit] refactor distribute module * [lazyinit] polish docstr * [lazyinit] polish lazy init context * [lazyinit] temporarily skip test * [lazyinit] polish test * [lazyinit] add docstr	2 years ago
Frank Lee	e3ad88fb48	[booster] implemented the cluster module (#3191 ) * [booster] implemented the cluster module * polish code	2 years ago
YuliangLiu0306	f57d34958b	[FX] refactor experimental tracer and adapt it with hf models (#3157 ) * pass gpt trace and meta_prop * pass t5 trace and meta_prop * [FX] refactor experimental tracer and adapt it with hf models * pass all mainstream model zoo * fix CI * fix CI * fix CI * fix CI * fix CI * fix CI * fix CI * fix CI * skip tests * fix CI * using packaging version * polish	2 years ago
Frank Lee	e7f3bed2d3	[booster] added the plugin base and torch ddp plugin (#3180 ) * [booster] added the plugin base and torch ddp plugin * polish code * polish code * polish code	2 years ago
Zihao	18dbe76cae	[auto-parallel] add auto-offload feature (#3154 ) * add auto-offload feature * polish code * fix syn offload runtime pass bug * add offload example * fix offload testing bug * fix example testing bug	2 years ago
YuliangLiu0306	258b43317c	[hotfix] layout converting issue (#3188)	2 years ago
YH	80aed29cd3	[zero] Refactor ZeroContextConfig class using dataclass (#3186)	2 years ago
YH	9d644ff09f	Fix docstr for zero statedict (#3185)	2 years ago
zbian	7bc0afc901	updated flash attention usage	2 years ago
Frank Lee	a9b8402d93	[booster] added the accelerator implementation (#3159)	2 years ago
ver217	6ae8ed0407	[lazyinit] add correctness verification (#3147 ) * [lazyinit] fix shared module * [tests] add lazy init test utils * [tests] add torchvision for lazy init * [lazyinit] fix pre op fn * [lazyinit] handle legacy constructor * [tests] refactor lazy init test models * [tests] refactor lazy init test utils * [lazyinit] fix ops don't support meta * [tests] lazy init test timm models * [lazyinit] fix set data * [lazyinit] handle apex layers * [tests] lazy init test transformers models * [tests] lazy init test torchaudio models * [lazyinit] fix import path * [tests] lazy init test torchrec models * [tests] update torch version in CI * [tests] revert torch version in CI * [tests] skip lazy init test	2 years ago
Frank Lee	ed19290560	[booster] implemented mixed precision class (#3151 ) * [booster] implemented mixed precision class * polish code	2 years ago
YuliangLiu0306	2eca4cd376	[DTensor] refactor dtensor with new components (#3089 ) * [DTensor] refactor dtensor with new components * polish	2 years ago
ver217	ed8f60b93b	[lazyinit] refactor lazy tensor and lazy init ctx (#3131 ) * [lazyinit] refactor lazy tensor and lazy init ctx * [lazyinit] polish docstr * [lazyinit] polish docstr	2 years ago
Frank Lee	95a36eae63	[kernel] added kernel loader to softmax autograd function (#3093 ) * [kernel] added kernel loader to softmax autograd function * [release] v0.2.6	2 years ago
Super Daniel	fff98f06ed	[analyzer] a minimal implementation of static graph analyzer (#2852 ) * [hotfix] meta tensor default device. * [siu] add experimental submodules to main branch. * [siu] * [siu] * [analyzer] init. * [analyzer] readme. * [analyzer] readme. * [analyzer] readme. * [analyzer] readme. * [test] add test. * Update symbolic_trace.py * mark skip tests. * try except. * try except. * try except. * s * init * init * fix * skip * skip --------- Co-authored-by: Daniel Shao <superdainiu@MININT-PVARVID.fareast.corp.microsoft.com> Co-authored-by: Daniel Shao <superdainiu@Daniels-Mac.local>	2 years ago
Xuanlei Zhao	10c61de2f7	[autochunk] support vit (#3084 ) support vit for autochunk * support some new ops for vit * fix some bugs * add test for vit	2 years ago
YuliangLiu0306	8e4e8601b7	[DTensor] implement layout converter (#3055 ) * [DTensor] refactor LayoutConverter for DTensor * polish code * polish docstring	2 years ago
Frank Lee	f19b49e164	[booster] init module structure and definition (#3056)	2 years ago
Xuanlei Zhao	2ca9728cbb	[autochunk] refactor chunk memory estimation (#2762 ) * refact memory code * dont log free var memory * add memory align * update chunk target * update setting for new memory * finish test * update tracer * update typo * update test	2 years ago
YuliangLiu0306	29386a54e6	[DTensor] refactor CommSpec (#3034)	2 years ago
YuliangLiu0306	cd2b0eaa8d	[DTensor] refactor sharding spec (#2987 ) * [autoparallel] refactor sharding spec * rename function name	2 years ago
Ziyue Jiang	400f63012e	[pipeline] Add Simplified Alpa DP Partition (#2507 ) * add alpa dp split * add alpa dp split * use fwd+bwd instead of fwd only --------- Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2 years ago
Super Daniel	b42d3d28ed	[fx] remove depreciated algorithms. (#2312) (#2313)	2 years ago
github-actions[bot]	82503a96f2	[format] applied code formatting on changed files in pull request 2997 (#3008 ) Co-authored-by: github-actions <github-actions@github.com>	2 years ago
binmakeswell	52a5078988	[doc] add ISC tutorial (#2997 ) * [doc] add ISC tutorial * [doc] add ISC tutorial * [doc] add ISC tutorial * [doc] add ISC tutorial	2 years ago
ver217	823f3b9cf4	[doc] add deepspeed citation and copyright (#2996 ) * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright * [doc] add deepspeed citation and copyright	2 years ago
YuliangLiu0306	e414e4092b	[DTensor] implementation of dtensor (#2946 ) * [DTensor] implementation of dtensor * test layout convert * polish	2 years ago
YuliangLiu0306	47fb214b3b	[hotfix] add shard dim to aviod backward communication error (#2954)	2 years ago
ver217	090f14fd6b	[misc] add reference (#2930 ) * [misc] add reference * [misc] add license	2 years ago
YuliangLiu0306	197d0bf4ed	[autoparallel] apply repeat block to reduce solving time (#2912)	2 years ago
YH	a848091141	Fix port exception type (#2925)	2 years ago
zbian	61e687831d	fixed using zero with tp cannot access weight correctly	2 years ago
YH	7b13f7db18	[zero] trivial zero optimizer refactoring (#2869 ) * Fix mionr grad store interface * Apply lint	2 years ago
Jiatong (Julius) Han	8c8a39be95	[hotfix]: Remove math.prod dependency (#2837 ) * Remove math.prod dependency * Fix style * Fix style --------- Co-authored-by: Jiatong Han <jiatong.han@u.nus.edu>	2 years ago
YuliangLiu0306	819e25d8b1	[hotfix] fix autoparallel compatibility test issues (#2754 )	2 years ago
YuliangLiu0306	0f392d7403	[autoparallel] find repeat blocks (#2854 ) * [autoparallel] find repeat blocks * polish * polish * polish	2 years ago
junxu	c52edcf0eb	Rename class method of ZeroDDP (#2692 )	2 years ago
HELSON	6e4ac08172	[hotfix] fix chunk size can not be divided (#2867 ) * [hotfix] fix chunk size can not be divided * [hotfix] use numpy for python3.8	2 years ago
Boyuan Yao	eae77c831d	[autoparallel] Patch meta information for nodes that will not be handled by SPMD solver (#2823 ) * [autoparallel] non spmd meta information generator * [autoparallel] patch meta information for non spmd nodes	2 years ago
Boyuan Yao	c7764d3f22	[autoparallel] Patch meta information of `torch.where` (#2822 ) * [autoparallel] patch meta information of torch.where * [autoparallel] pre-commit modified	2 years ago
Boyuan Yao	fcc4097efa	[autoparallel] Patch meta information of `torch.tanh()` and `torch.nn.Dropout` (#2773 ) * [autoparallel] tanh meta information * [autoparallel] remove redundant code * [autoparallel] patch meta information of torch.nn.Dropout	2 years ago
Frank Lee	935346430f	[cli] handled version check exceptions (#2848 ) * [cli] handled version check exceptions * polish code	2 years ago
Frank Lee	918bc94b6b	[triton] added copyright information for flash attention (#2835 ) * [triton] added copyright information for flash attention * polish code	2 years ago
Boyuan Yao	7ea6bc7f69	[autoparallel] Patch tensor related operations meta information (#2789 ) * [autoparallel] tensor related meta information prototype * [autoparallel] tensor related meta information * [autoparallel] tensor related meta information * [autoparallel] tensor related meta information * [autoparallel] tensor related meta information	2 years ago
Michelle	c008d4ad0c	[NFC] polish colossalai/engine/schedule/_pipeline_schedule.py code style (#2744 )	2 years ago
YuliangLiu0306	2059fdd6b0	[hotfix] add copyright for solver and device mesh (#2803 ) * [hotfix] add copyright for solver and device mesh * add readme * add alpa license * polish	2 years ago
Boyuan Yao	8593ae1a3f	[autoparallel] rotor solver refactor (#2813 ) * [autoparallel] rotor solver refactor * [autoparallel] rotor solver refactor	2 years ago
HELSON	56ddc9ca7a	[hotfix] add correct device for fake_param (#2796 )	2 years ago
Boyuan Yao	a2b43e393d	[autoparallel] Patch meta information of `torch.nn.Embedding` (#2760 ) * [autoparallel] embedding metainfo * [autoparallel] fix function name in test_activation_metainfo * [autoparallel] undo changes in activation metainfo and related tests	2 years ago
Boyuan Yao	8e3f66a0d1	[zero] fix wrong import (#2777 )	2 years ago
Nikita Shulga	01066152f1	Don't use `torch._six` (#2775 ) * Don't use `torch._six` This is a private API which is gone after https://github.com/pytorch/pytorch/pull/94709 * Update common.py	2 years ago
binmakeswell	93b788b95a	Merge branch 'main' into fix/format	2 years ago
xyupeng	2fd528b9f4	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style (#2737 )	2 years ago
YuliangLiu0306	1dc003c169	[autoparallel] distinguish different parallel strategies (#2699 )	2 years ago
YH	ae86a29e23	Refact method of grad store (#2687 )	2 years ago
Zirui Zhu	c9e3ee389e	[NFC] polish colossalai/context/process_group_initializer/initializer_2d.py code style (#2726 )	2 years ago
Zangwei Zheng	1819373e5c	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/batch_norm_handler.py code style (#2728 )	2 years ago
Wangbo Zhao(黑色枷锁)	8331420520	[NFC] polish colossalai/cli/cli.py code style (#2734 )	2 years ago
ziyuhuang123	d344313533	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style (#2725 )	2 years ago
Xue Fuzhao	e81caeb4bc	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/cost_graph.py code style (#2720 ) Co-authored-by: Fuzhao Xue <fuzhao@login2.ls6.tacc.utexas.edu>	2 years ago
yuxuan-lou	51c45c2460	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/where_handler.py code style (#2723 )	2 years ago
YuliangLiu0306	21d6a48f4d	[autoparallel] add shard option (#2696 ) * [autoparallel] add shard option * polish	2 years ago
YuliangLiu0306	5b24987fa7	[autoparallel] fix parameters sharding bug (#2716 )	2 years ago
Ziyue Jiang	4603538ddd	[NFC] posh colossalai/context/process_group_initializer/initializer_sequence.py code style (#2712 ) Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2 years ago
YuliangLiu0306	cb2c6a2415	[autoparallel] refactor runtime pass (#2644 ) * [autoparallel] refactor runtime pass * add unit test * polish	2 years ago
Zihao	b3d10db5f1	[NFC] polish colossalai/cli/launcher/__init__.py code style (#2709 )	2 years ago
YuliangLiu0306	0b2a738393	[autoparallel] remove deprecated codes (#2664 )	2 years ago
YuliangLiu0306	7fa6be49d2	[autoparallel] test compatibility for gemini and auto parallel (#2700 )	2 years ago
CZYCW	4ac8bfb072	[NFC] polish colossalai/engine/gradient_handler/utils.py code style (#2708 )	2 years ago
Liu Ziming	6427c406cf	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/strategy_generator.py code style (#2695 ) Co-authored-by: shenggan <csg19971016@gmail.com>	2 years ago
アマデウス	534f68c83c	[NFC] polish pipeline process group code style (#2694 )	2 years ago
LuGY	56ff1921e9	[NFC] polish colossalai/context/moe_context.py code style (#2693 )	2 years ago
Shawn-Kong	1712da2800	[NFC] polish colossalai/gemini/gemini_context.py code style (#2690 )	2 years ago
HELSON	df4f020ee3	[zero1&2] only append parameters with gradients (#2681 )	2 years ago
ver217	f0aa191f51	[gemini] fix colo_init_context (#2683 )	2 years ago
Boyuan Yao	40c916b192	[autoparallel] Patch meta information of `torch.nn.functional.softmax` and `torch.nn.Softmax` (#2674 ) * [autoparallel] softmax metainfo * [autoparallel] softmax metainfo	2 years ago
HELSON	8213f89fd2	[gemini] add fake_release_chunk for keep-gathered chunk in the inference mode (#2671 )	2 years ago
binmakeswell	9ab14b20b5	[doc] add CVPR tutorial (#2666 )	2 years ago
Boyuan Yao	0385b26ebf	[autoparallel] Patch meta information of `torch.nn.LayerNorm` (#2647 ) * [autoparallel] layernorm metainfo patch * [autoparallel] polish test	2 years ago
YuliangLiu0306	37df666f38	[autoparallel] refactor handlers which reshape input tensors (#2615 ) * [autoparallel] refactor handlers which reshape input tensors * polish	2 years ago
YuliangLiu0306	28398f1c70	add overlap option (#2613 )	2 years ago
YuliangLiu0306	cb3d1bef62	[autoparallel] adapt autoparallel tests with latest api (#2626 )	2 years ago
Boyuan Yao	90a9fdd91d	[autoparallel] Patch meta information of `torch.matmul` (#2584 ) * [autoparallel] matmul metainfo * [auto_parallel] remove unused print * [tests] skip test_matmul_handler when torch version is lower than 1.12.0	2 years ago
oahzxl	6ba8364881	[autochunk] support diffusion for autochunk (#2621 ) * add alphafold benchmark * renae alphafold test * rename tests * rename diffuser * renme * rename * update transformer * update benchmark * update benchmark * update bench memory * update transformer benchmark * rename * support diffuser * support unet metainfo prop * fix bug and simplify code * update linear and support some op * optimize max region search, support conv * update unet test * support some op * support groupnorm and interpolate * update flow search * add fix dim in node flow * fix utils * rename * support diffusion * update diffuser * update chunk search * optimize imports * import * finish autochunk	2 years ago
Frank Lee	8518263b80	[test] fixed the triton version for testing (#2608 )	2 years ago
HELSON	552183bb74	[polish] polish ColoTensor and its submodules (#2537 )	2 years ago
Frank Lee	dd14783f75	[kernel] fixed repeated loading of kernels (#2549 ) * [kernel] fixed repeated loading of kernels * polish code * polish code	2 years ago
ver217	5b1854309a	[hotfix] fix zero ddp warmup check (#2545 )	2 years ago
oahzxl	fa3d66feb9	support unet metainfo prop (#2544 )	2 years ago
oahzxl	05671fcb42	[autochunk] support multi outputs chunk search (#2538 ) Support multi outputs chunk search. Previously we only support single output chunk search. It is more flexible and improve performance by a large margin. For transformer, we reduce memory by 40% than previous search strategy. 1. rewrite search strategy to support multi outputs chunk search 2. fix many, many bugs 3. update tests	2 years ago
oahzxl	63199c6687	[autochunk] support transformer (#2526 )	2 years ago
HELSON	a4ed9125ac	[hotfix] fix lightning error (#2529 )	2 years ago
HELSON	66dfcf5281	[gemini] update the gpt example (#2527 )	2 years ago
HELSON	b528eea0f0	[zero] add zero wrappers (#2523 ) * [zero] add zero wrappers * change names * add wrapper functions to init	2 years ago
Super Daniel	c198c7c0b0	[hotfix] meta tensor default device. (#2510 )	2 years ago
HELSON	077a5cdde4	[zero] fix gradient clipping in hybrid parallelism (#2521 ) * [zero] fix gradient clipping in hybrid parallelism * [testing] change model name to avoid pytest warning * [hotfix] fix unit testing	2 years ago
YuliangLiu0306	aa0f6686f9	[autoparallel] accelerate gpt2 training (#2495 )	2 years ago
HELSON	707b11d4a0	[gemini] update ddp strict mode (#2518 ) * [zero] add strict ddp mode for chunk init * [gemini] update gpt example	2 years ago
HELSON	2d1a7dfe5f	[zero] add strict ddp mode (#2508 ) * [zero] add strict ddp mode * [polish] add comments for strict ddp mode * [zero] fix test error	2 years ago
oahzxl	c04f183237	[autochunk] support parsing blocks (#2506 )	2 years ago
Super Daniel	35c0c0006e	[utils] lazy init. (#2148 ) * [utils] lazy init. * [utils] remove description. * [utils] complete. * [utils] finalize. * [utils] fix names.	2 years ago
oahzxl	72341e65f4	[auto-chunk] support extramsa (#3 ) (#2504 )	2 years ago
Ziyue Jiang	0f02b8c6e6	add avg partition (#2483 ) Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2 years ago
アマデウス	99d9713b02	Revert "Update parallel_context.py (#2408 )" This reverts commit `7d5640b9db`.	2 years ago
oahzxl	ecccc91f21	[autochunk] support autochunk on evoformer (#2497 )	2 years ago
oahzxl	5db3a5bf42	[fx] allow control of ckpt_codegen init (#2498 ) * [fx] allow control of ckpt_codegen init Currently in ColoGraphModule, ActivationCheckpointCodeGen will be set automatically in __init__. But other codegen can't be set if so. So I add an arg to control whether to set ActivationCheckpointCodeGen in __init__. * code style	2 years ago
HELSON	d565a24849	[zero] add unit testings for hybrid parallelism (#2486 )	2 years ago
oahzxl	4953b4ace1	[autochunk] support evoformer tracer (#2485 ) support full evoformer tracer, which is a main module of alphafold. previously we just support a simplifed version of it. 1. support some evoformer's op in fx 2. support evoformer test 3. add repos for test code	2 years ago
YuliangLiu0306	67e1912b59	[autoparallel] support origin activation ckpt on autoprallel system (#2468 )	2 years ago
Ziyue Jiang	fef5c949c3	polish pp middleware (#2476 ) Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2 years ago
HELSON	a5dc4253c6	[zero] polish low level optimizer (#2473 )	2 years ago
Frank Lee	8b7495dd54	[example] integrate seq-parallel tutorial with CI (#2463 )	2 years ago
Jiarui Fang	867c8c2d3a	[zero] low level optim supports ProcessGroup (#2464 )	2 years ago
Frank Lee	14d9299360	[cli] fixed hostname mismatch error (#2465 )	2 years ago
Haofan Wang	9358262992	Fix False warning in initialize.py (#2456 ) * Update initialize.py * pre-commit run check	2 years ago
YuliangLiu0306	8221fd7485	[autoparallel] update binary elementwise handler (#2451 ) * [autoparallel] update binary elementwise handler * polish	2 years ago
HELSON	2bfeb24308	[zero] add warning for ignored parameters (#2446 )	2 years ago
Frank Lee	39163417a1	[example] updated the hybrid parallel tutorial (#2444 ) * [example] updated the hybrid parallel tutorial * polish code	2 years ago
HELSON	5521af7877	[zero] fix state_dict and load_state_dict for ddp ignored parameters (#2443 ) * [ddp] add is_ddp_ignored [ddp] rename to is_ddp_ignored * [zero] fix state_dict and load_state_dict * fix bugs * [zero] update unit test for ZeroDDP	2 years ago
YuliangLiu0306	2731531bc2	[autoparallel] integrate device mesh initialization into autoparallelize (#2393 ) * [autoparallel] integrate device mesh initialization into autoparallelize * add megatron solution * update gpt autoparallel examples with latest api * adapt beta value to fit the current computation cost	2 years ago
Frank Lee	c72c827e95	[cli] provided more details if colossalai run fail (#2442 )	2 years ago
Super Daniel	c41e59e5ad	[fx] allow native ckpt trace and codegen. (#2438 )	2 years ago
YuliangLiu0306	41429b9b28	[autoparallel] add shard option (#2423 )	2 years ago
HELSON	7829aa094e	[ddp] add is_ddp_ignored (#2434 ) [ddp] rename to is_ddp_ignored	2 years ago
HELSON	bb4e9a311a	[zero] add inference mode and its unit test (#2418 )	2 years ago
Jiarui Fang	93f62dd152	[autochunk] add autochunk feature	2 years ago
HELSON	dddacd2d2c	[hotfix] add norm clearing for the overflow step (#2416 )	2 years ago
oahzxl	7ab2db206f	adapt new fx	2 years ago
oahzxl	e532679c95	Merge branch 'main' of https://github.com/oahzxl/ColossalAI into chunk	2 years ago
Haofan Wang	7d5640b9db	Update parallel_context.py (#2408 )	2 years ago
oahzxl	fd818cf144	change imports	2 years ago
oahzxl	a591d45b29	add available	2 years ago
oahzxl	615e7e68d9	update doc	2 years ago
oahzxl	7d4abaa525	add doc	2 years ago
oahzxl	1be0ac3cbf	add doc for trace indice	2 years ago
oahzxl	0b6af554df	remove useless function	2 years ago
oahzxl	d914a21d64	rename	2 years ago
oahzxl	865f2e0196	rename	2 years ago
HELSON	ea13a201bb	[polish] polish code for get_static_torch_model (#2405 ) * [gemini] polish code * [testing] remove code * [gemini] make more robust	2 years ago
oahzxl	a4ed5b0d0d	rename in doc	2 years ago
oahzxl	1bb1f2ad89	rename	2 years ago
oahzxl	cb9817f75d	rename function from index to indice	2 years ago
oahzxl	0ea903b94e	rename trace_index to trace_indice	2 years ago
Frank Lee	551cafec14	[doc] updated kernel-related optimisers' docstring (#2385 ) * [doc] updated kernel-related optimisers' docstring * polish doc	2 years ago
oahzxl	065f0b4c27	add doc for search	2 years ago
oahzxl	a68d240ed5	add doc for search chunk	2 years ago
oahzxl	1951f7fa87	code style	2 years ago
oahzxl	212b5b1b5f	add comments	2 years ago
oahzxl	19cc64b1d3	remove autochunk_available	2 years ago
eric8607242	9880fd2cd8	Fix state_dict key missing issue of the ZeroDDP (#2363 ) * Fix state_dict output for ZeroDDP duplicated parameters * Rewrite state_dict based on get_static_torch_model * Modify get_static_torch_model to be compatible with the lower version (ZeroDDP)	2 years ago
oahzxl	4d223e18a2	fix typo	2 years ago
Frank Lee	ce08661eb1	[cli] updated installation check cli for aot/jit build (#2395 )	2 years ago
jiaruifang	69d9180c4b	[hotfix] issue #2388	2 years ago
Jiarui Fang	4e96039649	[device] find best logical mesh	2 years ago
Jiarui Fang	8f72b6f8fb	[hotfix] fix implement error in diffusers	2 years ago
Frank Lee	40d376c566	[setup] support pre-build and jit-build of cuda kernels (#2374 ) * [setup] support pre-build and jit-build of cuda kernels * polish code * polish code * polish code * polish code * polish code * polish code	2 years ago
1SAA	33f3023e19	[hotfix] fix implement error in diffusers	2 years ago
Jiarui Fang	12c8bf38d7	[Pipeline] Refine GPT PP Example	2 years ago
oahzxl	8a989a0d89	code style	2 years ago
oahzxl	c3a2bf48b4	code style	2 years ago
oahzxl	a6cdbf9161	seperate trace flow	2 years ago
oahzxl	4748967fb1	ad reorder graph	2 years ago
oahzxl	da4076846d	rename	2 years ago
oahzxl	c3d72f7db9	seperate reorder	2 years ago
binmakeswell	a881d6d000	Revert "[NFC] polish code format" (#2372 )	2 years ago
Ziyue Jiang	9ae9e74017	fix diff device in some partition	2 years ago
Jiarui Fang	0dcc410f57	[NFC] polish code format	2 years ago
oahzxl	6685a9d022	seperate non chunk input	2 years ago
binmakeswell	d634eae05b	Revert "[NFC] polish code format (#2367 )" (#2371 ) This reverts commit `1f8ab6f1f5`.	2 years ago
oahzxl	f856611d21	seperate prepose_nodes	2 years ago
Shawn-Kong	d42aecdda1	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/embedding_handler.py code style (#2368 )	2 years ago
Jiarui Fang	1aaeb596c6	[example] gpt, shard init on all processes (#2366 )	2 years ago
oahzxl	f4a1607e56	seperate input node dim search	2 years ago
binmakeswell	1f8ab6f1f5	[NFC] polish code format (#2367 )	2 years ago
oahzxl	ae27a8b26d	seperate flow tracer	2 years ago
oahzxl	fd87d78a28	rename ambiguous variable	2 years ago
oahzxl	2bde9d2b7f	code format	2 years ago
oahzxl	8a634af2f5	close mem and code print	2 years ago
oahzxl	1a6d2a740b	take apart chunk code gen	2 years ago
ExtremeViscent	ac0d30fe2e	[NFC] polish batch_norm_handler.py code style (#2359 )	2 years ago
HELSON	48d33b1b17	[gemini] add get static torch model (#2356 )	2 years ago
oahzxl	efb1c64c30	restruct dir	2 years ago
ziyuhuang123	7080a8edb0	[workflow]New version: Create workflow files for examples' auto check (#2298 ) * [workflows]bug_repair * [workflow]new_pr_fixing_bugs Co-authored-by: binmakeswell <binmakeswell@gmail.com>	2 years ago
LuGY	e11a005c02	[NFC] polish colossalai/auto_parallel/tensor_shard/utils/factory.py code style (#2349 )	2 years ago
YuliangLiu0306	b5a3a4a65f	[device] find best logical mesh	2 years ago
yuxuan-lou	28e2d16794	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/graph_analysis.py code style (#2340 )	2 years ago
YuliangLiu0306	9c9246c0d9	[device] alpha beta profiler (#2311 ) * [device] alpha beta profiler * add usage * fix variable name	2 years ago
Maruyama_Aya	bd12a49e2a	[NFC] polish <colossalai/auto_parallel/tensor_shard/deprecated/constants.py> code style (#2339 )	2 years ago
Zihao	35427bcab4	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/unary_elementwise_handler.py code style (#2326 )	2 years ago
Jiarui Fang	db6eea3583	[builder] reconfig op_builder for pypi install (#2314 )	2 years ago
Junming Wu	4a79c10750	[NFC] polish colossalai/cli/benchmark/__init__.py code style (#2308 )	2 years ago
Ofey Chan	87d2defda6	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/layer_norm_handler.py code style (#2305 )	2 years ago
ver217	116e3d0b8f	[NFC] polish communication/p2p_v2.py code style (#2303 )	2 years ago
xyupeng	b965585d05	[NFC] polish colossalai/amp/torch_amp/torch_amp.py code style (#2290 )	2 years ago
Zangwei Zheng	d1e5bafcd4	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/__init__.py code style (#2291 )	2 years ago
shenggan	950685873f	[NFC] polish colossalai/auto_parallel/tensor_shard/deprecated/op_handler/reshape_handler.py code style (#2292 )	2 years ago
Ziheng Qin	3041014089	[NFC] polish colossalai/amp/naive_amp/grad_scaler/dynamic_grad_scaler.py code style (#2299 ) Co-authored-by: henryqin1997 <henryqin1997@gamil.com>	2 years ago
アマデウス	49715a78f0	[NFC] polish colossalai/cli/benchmark/benchmark.py code style (#2287 )	2 years ago
Zirui Zhu	1c29b173c9	[NFC] polish colossalai/auto_parallel/tensor_shard/node_handler/getitem_handler.py code style (#2289 )	2 years ago
Zihao	3a02b46447	[auto-parallel] refactoring ColoTracer (#2118 ) * add meta_data_computing * add checkpoint_annotation * rename proxy.data to proxy.meta_data and add bias addition pass * polish code * delete meta_prop_pass invoke and rename ori_node to orig_node * add TracerType * unify meta data computing * delete TracerType * handle setitem operation * operator.setitem	2 years ago
HELSON	5d3a2be3af	[amp] add gradient clipping for unit tests (#2283 ) * [amp] add gradient clipping in unit tests * fix bugs	2 years ago
Boyuan Yao	d45695d94e	Merge pull request #2258 from hpcaitech/debug/ckpt-autoparallel [autockpt] provide option for activation checkpoint search in SPMD solver	2 years ago
Jiarui Fang	16cc8e6aa7	[builder] MOE builder (#2277 )	2 years ago
Boyuan Yao	b904748210	[autoparallel] bypass MetaInfo when unavailable and modify BCAST_FUNC_OP metainfo (#2293 ) * [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline * [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop * [autoparallel] specifycomm nodes' memory cost in construct chain * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation * [autoparallel] bypass metainfo when available and modify BCAST_FUNC_OP	2 years ago
Super Daniel	8ea50d999e	[hotfix] pass a parameter. (#2288 ) * [autockpt] make it work. * [autockpt] linearize / merge shape-consistency nodes. * [autockpt] considering parameter and optimizer weights. * [hotfix] pass a parameter.	2 years ago
zbian	e94c79f15b	improved allgather & reducescatter for 3d	2 years ago
HELSON	62c38e3330	[zero] polish low level zero optimizer (#2275 )	2 years ago
Ziyue Jiang	ac863a01d6	[example] add benchmark (#2276 ) * add benchmark * merge common func * add total and avg tflops Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2 years ago
Boyuan Yao	22e947f982	[autoparallel] fix runtime apply memory estimation (#2281 ) * [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline * [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop * [autoparallel] specifycomm nodes' memory cost in construct chain * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation * [autoparallel] fix wrong runtime apply calculation	2 years ago
Super Daniel	8e8900ff3f	[autockpt] considering parameter and optimizer weights. (#2279 ) * [autockpt] make it work. * [autockpt] linearize / merge shape-consistency nodes. * [autockpt] considering parameter and optimizer weights.	2 years ago
YuliangLiu0306	f027ef7913	[hotfix] fix fp16 optimzier bug (#2273 )	2 years ago
YuliangLiu0306	fb87322773	[autoparallel] fix spelling error (#2270 )	2 years ago
Jiarui Fang	af32022f74	[Gemini] fix the convert_to_torch_module bug (#2269 )	2 years ago
Super Daniel	b0d21d0c4f	[autockpt] linearize / merge shape-consistency nodes. (#2271 ) * [autockpt] make it work. * [autockpt] linearize / merge shape-consistency nodes.	2 years ago
YuliangLiu0306	4b29112ab2	[autoparallel] gpt2 autoparallel examples (#2267 ) * [autoparallel] gpt2 autoparallel examples * polish code * polish code	2 years ago
Ziyue Jiang	8b045b3c1f	[Pipeline Middleware] Reduce comm redundancy by getting accurate output (#2232 ) * move to cpu to avoid dead lock * get output by offsets Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2 years ago
Boyuan Yao	5c2ef9fc76	[autoparallel] modify comm nodes' memory cost in construct chain (#2263 ) * [autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline * [autoparallel] using fwd_time and bwd_time instead of fwd_flop and bwd_flop * [autoparallel] specifycomm nodes' memory cost in construct chain	2 years ago
Boyuan Yao	1ea99b869e	[autoparallel] align the data_ptr with the old version of auto activation checkpoint pipeline (#2261 )	2 years ago
Super Daniel	3ccf58aa76	[autockpt] make it work. (#2257 )	2 years ago
Boyuan Yao	ac3739930d	[autoparallel] modify construct chain in rotor solver (#2254 )	2 years ago
Boyuan Yao	ab38aebace	[autoparallel] Hook all meta information on ResNet nodes for auto activation checkpoint (#2248 ) * [autoparallel] hook node meta on graph nodes for checkpoint solver * [autoparallel] polish code * [autoparallel] restore some node handlers * colossalai/auto_parallel/passes/meta_info_prop.py * [autoparallel] remove some unused import * [autoparallel] hook bwd_mem_out	2 years ago
Boyuan Yao	c8c79102f0	[autoparallel] patch torch.flatten metainfo for autoparallel (#2247 ) * [autoparallel] patch torch.flatten	2 years ago
YuliangLiu0306	8897b8f753	[autoparallel] autoparallel initialize (#2238 )	2 years ago
xcnick	85178a397a	[hotfix] fix error for torch 2.0 (#2243 )	2 years ago
Super Daniel	b7d0990c61	[autoparallel] fix construct meta info. (#2245 )	2 years ago
Ziyue Jiang	57929a6210	fix type of num_worker_threads (#2237 ) Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>	2 years ago
Jiarui Fang	db4cbdc7fb	[builder] builder for scaled_upper_triang_masked_softmax (#2234 )	2 years ago
Super Daniel	78483a9fdd	[logger] hotfix, missing _FORMAT (#2231 )	2 years ago
Jiarui Fang	54de05da5d	[builder] polish builder with better base class (#2216 ) * [builder] polish builder * remove print	2 years ago
YuliangLiu0306	3b1b91eaf4	[autoparallel] record parameter attribute in colotracer (#2217 ) * [autoparallel] record parameter attribute in collotracer * [autoparallel] fix construct_meta_info bug	2 years ago
Jiarui Fang	7675792100	[builder] raise Error when CUDA_HOME is not set (#2213 )	2 years ago
Jiarui Fang	d5e3e3ec01	[example] update gpt example for larger model scale (#2211 )	2 years ago
Boyuan Yao	24246f7aa5	[autoparallel] Attach input, buffer and output tensor to MetaInfo class (#2162 ) * [fx] metainfo class for auto parallel * [fx] add unit test for linear metainfo * [fx] fix bwd param for linear * [fx] modify unit test * [fx] modify unit test * [fx] modify import * [fx] modify import * [fx] modify import * [fx] move meta profiler to auto parallel * [fx] add conv metainfo class * [fx] restore profiler * [fx] restore meta profiler * [autoparallel] modify unit test * [fx] modify unit test * [autoparallel] add batchnorm metainfo class * [autoparallel] fix batchnorm unit test function declaration * [fx] restore profiler * [fx] add relu metainfo class * [fx] restore profiler * [autoparallel] modify metainfo input * [autoparallel] add pooling metainfo * [autoparallel] add F.linear metainfo generator * [autoparallel] add binary elementwise metainfo * [fx] recover profiler * [autoparallel] fix forward memory calculation * [autoparallel] modify constants.py * [autoparallel] remove redundant print * [autoparallel] add F.conv metainfo * [autoparallel] linear fix * [autoparallel] memory estimation for communication actions * [autoparallel] fix docstring * [autoparallel] fix variables name * [autoparallel] attach tensor to metainfo class * [autoparallel] fix dangerous try except * [autoparallel] attach memory cost to shape consistency node * [autoparallel] attach shape consistency node's metainfo to the node * [autoparallel] remove todo in shape consistency memory estimation * [autoparallel] fix the annotation	2 years ago
Boyuan Yao	d0bc5a1b34	[autoparallel] new metainfoprop based on metainfo class (#2179 ) * [autoparallel] new metainfoprop to combine SPMD solver and checkpoint solver * [autoparallel] new metainfoprop to combine SPMD solver and checkpoint solver * [autoparallel] modify placeholder handler * [autoparallel] modify metainfoprop * [autoparallel] fix function typo * [autoparallel] fix placeholder handler	2 years ago
YuliangLiu0306	78509124d3	[autoparallel] update getitem handler (#2207 )	2 years ago
Jiarui Fang	1cb532ffec	[builder] multihead attn runtime building (#2203 ) * [hotfix] correcnt cpu_optim runtime compilation * [builder] multihead attn * fix bug * fix a bug	2 years ago

1728 Commits (6a3086a5055235e51a1bca8a20c4bd967409a259)