Frank Lee
725a39f4bd
update github CI with the current workflow (#441)
3 years ago
Frank Lee
5a1e33b97f
update contributing.md with the current workflow (#440)
3 years ago
Jiarui Fang
17b8274f8a
[unitest] polish zero config in unittest (#438)
3 years ago
Jiarui Fang
640a6cd304
[refactory] refactory the initialize method for new zero design (#431)
3 years ago
Frank Lee
4f85b687cf
[misc] replace codebeat with codefactor on readme (#436)
3 years ago
Frank Lee
bffd85bf34
added testing module (#435)
3 years ago
HELSON
dbdc9a7783
added Multiply Jitter and capacity factor eval for MOE (#434)
3 years ago
Frank Lee
b03b3ae99c
fixed mem monitor device (#433)
3 years ago
Frank Lee
14a7094243
fixed fp16 optimizer none grad bug (#432)
3 years ago
ver217
fce9432f08
sync before creating empty grad
3 years ago
ver217
ea6905a898
free param.grad
3 years ago
ver217
9506a8beb2
use double buffer to handle grad
3 years ago
Frank Lee
0f5f5dd556
fixed gpt attention mask in pipeline (#430)
3 years ago
Jiarui Fang
f9c762df85
[test] merge zero optim tests (#428)
3 years ago
Frank Lee
f0d6e2208b
[polish] add license meta to setup.py (#427)
3 years ago
Jiarui Fang
5d7dc3525b
[hotfix] run cpu adam unittest in pytest (#424)
3 years ago
Jiarui Fang
54229cd33e
[log] better logging display with rich (#426)
* better logger using rich
* remove deepspeed in zero requirements
3 years ago
HELSON
3f70a2b12f
removed noisy function during evaluation of MoE router (#419)
3 years ago
Jiarui Fang
adebb3e041
[zero] cuda margin space for OS (#418)
3 years ago
Jiarui Fang
56bb412e72
[polish] use GLOBAL_MODEL_DATA_TRACER (#417)
3 years ago
Jiarui Fang
23ba3fc450
[zero] refactory ShardedOptimV2 init method (#416)
3 years ago
Frank Lee
e79ea44247
[fp16] refactored fp16 optimizer (#392)
3 years ago
Frank Lee
f8a0e7fb01
Merge pull request #412 from hpcaitech/develop
merge develop to main
3 years ago
Jiarui Fang
21dc54e019
[zero] memtracer to record cuda memory usage of model data and overall system (#395)
3 years ago
Jiarui Fang
a37bf1bc42
[hotfix] rm test_tensor_detector.py (#413)
3 years ago
Jiarui Fang
370f567e7d
[zero] new interface for ShardedOptimv2 (#406)
3 years ago
LuGY
a9c27be42e
Added tensor detector (#393)
* Added tensor detector
* Added the - states
* Allowed change include_cpu when detect()
3 years ago
Frank Lee
32296cf462
Merge pull request #409 from 1SAA/develop
[hotfix] fixed error when no collective communication in CommProfiler
3 years ago
1SAA
907ac4a2dc
fixed error when no collective communication in CommProfiler
3 years ago
Frank Lee
62b08acc72
update hf badge link (#410)
3 years ago
Frank Lee
2fe68b359a
Merge pull request #403 from ver217/feature/shard-strategy
[zero] Add bucket tensor shard strategy
3 years ago
Frank Lee
cf92a779dc
added huggingface badge (#407)
3 years ago
HELSON
dfd0363f68
polished output format for communication profiler and pcie profiler (#404)
fixed typing error
3 years ago
ver217
63469c0f91
polish code
3 years ago
ver217
54fd37f0e0
polish unit test
3 years ago
ver217
88804aee49
add bucket tensor shard strategy
3 years ago
Frank Lee
aaead33cfe
Merge pull request #397 from hpcaitech/create-pull-request/patch-sync-submodule
[Bot] Synchronize Submodule References
3 years ago
github-actions
6098bc4cce
Automated submodule synchronization
3 years ago
Frank Lee
6937f85004
Merge pull request #402 from oikosohn/oikosohn-patch-1
fix typo in CHANGE_LOG.md
3 years ago
sohn
ff4f5d7231
fix typo in CHANGE_LOG.md
- fix typo, `Unifed` -> `Unified` below Added
3 years ago
Frank Lee
fc5101f24c
Merge pull request #401 from hpcaitech/develop
3 years ago
Frank Lee
fc2fd0abe5
Merge pull request #400 from hpcaitech/hotfix/readme
fixed broken badge link
3 years ago
Frank Lee
6d3a4f51bf
fixed broken badge link
3 years ago
HELSON
7c079d9c33
[hotfix] fixed bugs in ShardStrategy and PcieProfiler (#394)
3 years ago
Frank Lee
1e4bf85cdb
fixed bug in activation checkpointing test (#387)
3 years ago
Jiarui Fang
3af13a2c3e
[zero] polish ShardedOptimV2 unittest (#385)
* place params on cpu after zero init context
* polish code
* bucketzed cpu gpu tensor transter
* find a bug in sharded optim unittest
* add offload unittest for ShardedOptimV2.
* polish code and make it more robust
3 years ago
binmakeswell
ce7b2c9ae3
update README and images path (#384)
3 years ago
ScalableEKNN
2fcd4f38ee
fix format (#379)
3 years ago
Jiang Zhuo
5a4a3b77d9
fix format (#376)
3 years ago
lucasliunju
ce886a9062
fix format (#374)
3 years ago