Jiarui Fang
3ddbd1bce1
[gemini] collect CPU-GPU data movement volume in each iteration (#813)
3 years ago
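A minimal sketch of how per-iteration CPU-GPU traffic accounting can work; the names below (MoveVolumeTracker, move) are hypothetical, not ColossalAI's actual API:

```python
import torch

class MoveVolumeTracker:
    """Accumulates bytes moved between CPU and GPU within one iteration."""

    def __init__(self):
        self.cpu_to_gpu = 0
        self.gpu_to_cpu = 0

    def move(self, tensor: torch.Tensor, device: torch.device) -> torch.Tensor:
        # count the payload before performing the actual transfer
        nbytes = tensor.numel() * tensor.element_size()
        if tensor.device.type == "cpu" and device.type == "cuda":
            self.cpu_to_gpu += nbytes
        elif tensor.device.type == "cuda" and device.type == "cpu":
            self.gpu_to_cpu += nbytes
        return tensor.to(device)

    def reset(self):
        # call at the start of each iteration, after logging the last one
        self.cpu_to_gpu = self.gpu_to_cpu = 0
```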
FrankLeeeee
d522cb704e
[cli] fixed single-node process launching
3 years ago
Jiarui Fang
61c20b44bc
[log] local throughput metrics (#811)
* Revert "[zero] add ZeroTensorShardStrategy (#793)"
This reverts commit 88759e289e.
* [gemini] set cpu memory capacity
* [log] local throughput collecting
* polish
* polish
* polish
* polish code
* polish
3 years ago
ver217
dd92b90a68
[DO NOT MERGE] [zero] init fp16 params directly in ZeroInitContext (#808)
* init fp16 param directly
* polish code
3 years ago
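A minimal sketch of creating parameters directly in fp16 rather than converting after construction, assuming a PyTorch version that accepts torch.float16 as the default dtype; this is illustrative, not the ZeroInitContext implementation:

```python
from contextlib import contextmanager

import torch
import torch.nn as nn

@contextmanager
def fp16_init():
    # recent PyTorch accepts float16 here; older releases may only allow
    # float32/float64, in which case this sketch does not apply
    old_dtype = torch.get_default_dtype()
    torch.set_default_dtype(torch.float16)
    try:
        yield
    finally:
        torch.set_default_dtype(old_dtype)

with fp16_init():
    model = nn.Linear(1024, 1024)  # weight and bias are created as fp16

assert model.weight.dtype == torch.float16
```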
Jiarui Fang
227d1cd4b3
[gemini] APIs to set cpu memory capacity (#809)
3 years ago
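A sketch of what a cpu-memory-capacity API can look like; the function names below are hypothetical and do not mirror ColossalAI's interface:

```python
import psutil

_cpu_memory_capacity = None  # None means "use all physical memory"

def set_cpu_memory_capacity(num_bytes: int) -> None:
    """Cap how much host memory tensor placement may use."""
    global _cpu_memory_capacity
    _cpu_memory_capacity = num_bytes

def get_cpu_memory_capacity() -> int:
    if _cpu_memory_capacity is not None:
        return _cpu_memory_capacity
    return psutil.virtual_memory().total  # fall back to physical RAM

set_cpu_memory_capacity(32 * 1024 ** 3)  # e.g. allow 32 GiB of host memory
```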
YuliangLiu0306
f6dcd23fb9
Merge pull request #807 from FrankLeeeee/feature/cli
[cli] fixed a bug in user args and refactored the module structure
3 years ago
FrankLeeeee
f63e91d280
[cli] fixed a bug in user args and refactored the module structure
3 years ago
Jiarui Fang
e761ad2cd7
Revert "[zero] add ZeroTensorShardStrategy (#793)" (#806)
3 years ago
HELSON
88759e289e
[zero] add ZeroTensorShardStrategy (#793)
3 years ago
Jiarui Fang
681addb512
[refactor] moving grad acc logic to engine (#804)
3 years ago
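Gradient accumulation itself is a small, generic pattern; a plain-PyTorch sketch of the logic the engine now owns (the function and its arguments are illustrative):

```python
def train_with_grad_accumulation(model, optimizer, criterion, batches,
                                 accum_steps: int = 4):
    optimizer.zero_grad()
    for i, (x, y) in enumerate(batches):
        loss = criterion(model(x), y) / accum_steps  # scale so grads average
        loss.backward()                              # grads accumulate in .grad
        if (i + 1) % accum_steps == 0:
            optimizer.step()                         # apply once per window
            optimizer.zero_grad()
```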
Frank Lee
05d9ae5999
[cli] add missing requirement (#805)
3 years ago
YuliangLiu0306
de2f581d43
[cli] added micro benchmarking for tp (#789)
* [CLI] add CLI launcher
* Revert "[CLI] add CLI launcher"
This reverts commit df7e6506d4.
* [CLI] add cli benchmark feature
* fix CodeFactor issues.
* refactor the module structure.
3 years ago
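A typical micro-benchmark shape for GPU code, using CUDA events so queued kernels are timed correctly; the matmul workload is an arbitrary stand-in, not the benchmark #789 ships:

```python
import torch

def benchmark(fn, warmup: int = 10, iters: int = 100) -> float:
    for _ in range(warmup):                  # warm up kernels and caches
        fn()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn()
    end.record()
    torch.cuda.synchronize()                 # wait for queued work to finish
    return start.elapsed_time(end) / iters   # mean latency in milliseconds

a = torch.randn(4096, 4096, device="cuda")
b = torch.randn(4096, 4096, device="cuda")
print(f"{benchmark(lambda: a @ b):.3f} ms per matmul")
```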
YuliangLiu0306
cfadc9df8e
[cli] added distributed launcher command (#791)
* [CLI] add CLI launcher
* Revert "[CLI] add CLI launcher"
This reverts commit df7e6506d4.
* [CLI] add cli launcher feature
* remove testing message used during developing
* refactor the module structure.
3 years ago
Jiarui Fang
97cd9b03b3
[log] display tflops if available (#802)
3 years ago
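Reporting TFLOPS only needs a FLOP estimate and a step time; the 6 * params * tokens approximation below is a common rule of thumb for transformer training, not necessarily the formula this repo uses:

```python
def tflops(num_params: int, tokens_per_step: int, step_time_s: float) -> float:
    flops = 6 * num_params * tokens_per_step  # rough fwd + bwd estimate
    return flops / step_time_s / 1e12

# e.g. a 1.3B-parameter model processing 512 * 1024 tokens in 2.5 s
print(f"{tflops(1_300_000_000, 512 * 1024, 2.5):.1f} TFLOPS")
```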
Jiarui Fang
4d9332b4c5
[refactor] moving memtracer to gemini (#801)
3 years ago
Jiarui Fang
8711c706f4
[hotfix] fix grad offload when enabling reuse_fp16_shard
3 years ago
ver217
f1fa1a675f
fix grad offload when enabling reuse_fp16_shard
3 years ago
HELSON
4c4388c46e
[hotfix] fix memory leak in zero (#781)
3 years ago
Ziyue Jiang
4b01da24cd
[TP] change the assertion check in split batch 2d (#772)
3 years ago
ver217
846406a07a
[gemini] fix auto tensor placement policy (#775)
3 years ago
ver217
38102cf61a
update version (#779)
3 years ago
HELSON
a65cbb7e4e
[zero] refactor shard and gather operation (#773)
3 years ago
Frank Lee
5a1a095b92
[test] refactored with the new rerun decorator (#763)
* [test] refactored with the new rerun decorator
* polish test case
3 years ago
binmakeswell
deaf99f4c9
[readme] sync CN readme (#766)
3 years ago
ver217
6e553748a7
polish sharded optim docstring and warning (#770)
3 years ago
LuGY
80e37eec42
fix the ckpt bugs when using DDP (#769)
3 years ago
Jiarui Fang
1f698f4406
[readme] polish readme (#764)
* [readme] polish readme
* centering image
3 years ago
Frank Lee
920fe31526
[compatibility] used backward-compatible API for global process group (#758)
3 years ago
Frank Lee
4ea49cb536
[test] added a decorator for address already in use error with backward compatibility (#760)
* [test] added a decorator for address already in use error with backward compatibility
* [test] added a decorator for address already in use error with backward compatibility
3 years ago
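A sketch of such a rerun decorator, assuming the mechanism is matching the error text and retrying; names and defaults are illustrative:

```python
import functools

def rerun_on_address_in_use(max_attempts: int = 3):
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except Exception as e:
                    # re-raise unrelated failures, or give up after the
                    # final attempt
                    if ("Address already in use" not in str(e)
                            or attempt == max_attempts):
                        raise
        return wrapper
    return decorator

@rerun_on_address_in_use()
def test_distributed_init():
    ...  # set up a process group on a possibly contended port
```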
Jiarui Fang
10ef8afdd2
[gemini] init gemini individual directory (#754)
3 years ago
ver217
dcca614eee
[hotfix] fix test_stateful_tensor_mgr (#762)
3 years ago
github-actions[bot]
6978980f6d
Automated submodule synchronization (#751)
Co-authored-by: github-actions <github-actions@github.com>
3 years ago
ver217
a93a7d7364
[hotfix] fix reuse_fp16_shard of sharded model (#756)
* fix reuse_fp16_shard
* disable test stm
* polish code
3 years ago
ver217
8f7ce94b8e
[hotfix] fix auto tensor placement policy (#753)
3 years ago
HELSON
84c6700b2a
[zero] refactor memstats_collector (#746)
3 years ago
アマデウス
b8899e0905
[TP] allow layernorm without bias (#750)
3 years ago
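The bias-free variant works because torch.nn.functional.layer_norm accepts bias=None; a self-contained sketch (not the repository's layer):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LayerNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-5, bias: bool = True):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))
        if bias:
            self.bias = nn.Parameter(torch.zeros(dim))
        else:
            self.register_parameter("bias", None)  # no bias is registered

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.layer_norm(x, self.weight.shape, self.weight, self.bias,
                            self.eps)

ln = LayerNorm(256, bias=False)
y = ln(torch.randn(8, 256))
```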
Jiarui Fang
3d7dc46d33
[zero] use factory pattern for tensor_placement_policy (#752)
3 years ago
ver217
4b048a8728
fix prepare grads in sharded optim (#749)
3 years ago
ver217
097772546e
fix initialization of zero
3 years ago
ver217
e396bb71f2
[zero] add tensor placement policies (#743)
* add tensor placement policies
* polish comments
* polish comments
* update moe unit tests
3 years ago
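Combining this commit with the factory refactor in #752 above, the overall shape is a handful of policy classes built by name; the class and key names here are illustrative, not ColossalAI's:

```python
import torch

class CPUPlacementPolicy:
    def device_for(self, tensor: torch.Tensor) -> torch.device:
        return torch.device("cpu")           # always keep tensors on the host

class CUDAPlacementPolicy:
    def device_for(self, tensor: torch.Tensor) -> torch.device:
        return torch.device("cuda")          # always keep tensors on the GPU

class AutoPlacementPolicy:
    def device_for(self, tensor: torch.Tensor) -> torch.device:
        nbytes = tensor.numel() * tensor.element_size()
        free, _ = torch.cuda.mem_get_info()  # place on GPU only if it fits
        return torch.device("cuda" if nbytes < free else "cpu")

_POLICIES = {
    "cpu": CPUPlacementPolicy,
    "cuda": CUDAPlacementPolicy,
    "auto": AutoPlacementPolicy,
}

def create_placement_policy(name: str):
    return _POLICIES[name]()                 # factory: build a policy by name

policy = create_placement_policy("auto")
```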
HELSON
22c4b88d56
[zero] refactor ShardedParamV2 for convenience (#742)
3 years ago
HELSON
340e59f968
[utils] add synchronized cuda memory monitor (#740)
3 years ago
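A minimal sketch of a synchronized CUDA memory monitor: synchronize before reading so queued kernels are included, then use the allocator's peak counter (plain PyTorch, not the repo's utility):

```python
import torch

class SyncCudaMemoryMonitor:
    def start(self) -> None:
        torch.cuda.synchronize()
        torch.cuda.reset_peak_memory_stats()

    def finish(self) -> int:
        """Return peak allocated bytes since start()."""
        torch.cuda.synchronize()  # ensure all pending work is measured
        return torch.cuda.max_memory_allocated()

monitor = SyncCudaMemoryMonitor()
monitor.start()
x = torch.randn(1024, 1024, device="cuda")
y = x @ x
print(f"peak: {monitor.finish() / 1024 ** 2:.1f} MiB")
```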
ver217
e6212f56cd
[hotfix] fix memory leak in backward of sharded model (#741)
3 years ago
Frank Lee
f4f42d4c3c
[bug] fixed DDP compatibility with torch 1.8 (#739)
3 years ago
Frank Lee
a4e91bc87f
[bug] fixed grad scaler compatibility with torch 1.8 (#735)
3 years ago
Jiarui Fang
53cb584808
[utils] correct cpu memory usage and capacity in the multi-process context (#726)
3 years ago
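A sketch of the correction's idea: with one process per GPU, each process should see only its share of host memory, and "used" should mean this process's RSS; LOCAL_WORLD_SIZE is assumed to be set by the launcher:

```python
import os

import psutil

def cpu_memory_capacity_per_process() -> int:
    # divide physical RAM evenly among the processes on this node
    local_world_size = int(os.environ.get("LOCAL_WORLD_SIZE", "1"))
    return psutil.virtual_memory().total // local_world_size

def cpu_memory_used_by_this_process() -> int:
    # resident set size of this process only, not the whole machine
    return psutil.Process(os.getpid()).memory_info().rss
```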
Jiarui Fang
7db3ccc79b
[hotfix] remove duplicated param register to stateful tensor manager (#728)
3 years ago
binmakeswell
600e769a42
add video (#732)
3 years ago
Frank Lee
a5c3f072f6
[bug] removed zero installation requirements (#731)
3 years ago
HELSON
b9b469ea50
[moe] add checkpoint for moe zero test (#729)
3 years ago