Wang Binluo
8e08c27e19
[ckpt] Add async ckpt api (#6136)
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
* fix
6 days ago
Hongxin Liu
d4a436051d
[checkpointio] support async model save (#6131)
* [checkpointio] support async model save
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
6 days ago
Hongxin Liu
5a03d2696d
[cli] support run as module option (#6135)
2 weeks ago
Hanks
cc40fe0e6f
[fix] multi-node backward slowdown (#6134)
* remove redundant memcpy during backward
* get back record_stream
2 weeks ago
duanjunwen
c2fe3137e2
[hotfix] fix flash attn window_size err (#6132)
* [fix] fix flash attn
* [hotfix] fix flash-atten version
* [fix] fix flash_atten version
* [fix] fix flash-atten versions
* [fix] fix flash-attn not enough values to unpack error
* [fix] fix test_ring_attn
* [fix] fix test ring attn
2 weeks ago
Hongxin Liu
a2596519fd
[zero] support extra dp (#6123)
* [zero] support extra dp
* [zero] update checkpoint
* fix bugs
* fix bugs
2 weeks ago
Tong Li
30a9443132
[Coati] Refine prompt for better inference (#6117)
* refine prompt
* update prompt
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
3 weeks ago
Tong Li
7a60161035
update readme (#6116)
3 weeks ago
Hongxin Liu
a15ab139ad
[plugin] support get_grad_norm (#6115)
3 weeks ago
Hongxin Liu
13ffa08cfa
[release] update version (#6109)
3 weeks ago
pre-commit-ci[bot]
2f583c1549
[pre-commit.ci] pre-commit autoupdate (#6078)
updates:
- [github.com/psf/black-pre-commit-mirror: 24.8.0 → 24.10.0](https://github.com/psf/black-pre-commit-mirror/compare/24.8.0...24.10.0)
- [github.com/pre-commit/mirrors-clang-format: v18.1.8 → v19.1.2](https://github.com/pre-commit/mirrors-clang-format/compare/v18.1.8...v19.1.2)
- [github.com/pre-commit/pre-commit-hooks: v4.6.0 → v5.0.0](https://github.com/pre-commit/pre-commit-hooks/compare/v4.6.0...v5.0.0)
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
4 weeks ago
Hongxin Liu
c2e8f61592
[checkpointio] fix hybrid plugin model save (#6106)
4 weeks ago
Tong Li
89a9a600bc
[MCTS] Add self-refined MCTS (#6098)
* add reasoner
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update code
* delete llama
* update prompts
* update readme
* update readme
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 month ago
binmakeswell
4294ae83bb
[doc] sora solution news (#6100)
* [doc] sora solution news
* [doc] sora solution news
1 month ago
Hongxin Liu
80a8ca916a
[extension] hotfix compile check (#6099)
1 month ago
Hanks
dee63cc5ef
Merge pull request #6096 from BurkeHulk/hotfix/lora_ckpt
[hotfix] fix lora ckpt saving format
1 month ago
BurkeHulk
6d6cafabe2
pre-commit fix
1 month ago
BurkeHulk
b10339df7c
fix lora ckpt save format (ColoTensor to Tensor)
1 month ago
Hongxin Liu
19baab5fd5
[release] update version (#6094)
1 month ago
Hongxin Liu
58d8b8a2dd
[misc] fit torch api upgrade and remove legacy import (#6093)
* [amp] fit torch's new api
* [amp] fix api call
* [amp] fix api call
* [misc] fit torch pytree api upgrade
* [misc] remove legacy import
* [misc] fit torch amp api
* [misc] fit torch amp api
1 month ago
Hongxin Liu
5ddad486ca
[fp8] add fallback and make compile option configurable (#6092)
1 month ago
botbw
3b1d7d1ae8
[chore] refactor
1 month ago
botbw
2bcd0b6844
[ckpt] add safetensors util
1 month ago
Hongxin Liu
cd61353bae
[pipeline] hotfix backward for multiple outputs (#6090)
* [pipeline] hotfix backward for multiple outputs
* [pipeline] hotfix backward for multiple outputs
1 month ago
Wenxuan Tan
62c13e7969
[Ring Attention] Improve comments (#6085)
* improve comments
* improve comments
---------
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
1 month ago
Wang Binluo
dcd41d0973
Merge pull request #6071 from wangbluo/ring_attention
[Ring Attention] fix the 2d ring attn when using multiple machines
1 month ago
wangbluo
83cf2f84fb
fix
1 month ago
wangbluo
bc7eeade33
fix
1 month ago
wangbluo
fd92789af2
fix
1 month ago
wangbluo
6be9862aaf
fix
1 month ago
wangbluo
3dc08c8a5a
fix
1 month ago
wangbluo
8ff7d0c780
fix
1 month ago
wangbluo
fe9208feac
fix
1 month ago
wangbluo
3201377e94
fix
1 month ago
wangbluo
23199e34cc
fix
1 month ago
wangbluo
d891e50617
fix
1 month ago
wangbluo
e1e86f9f1f
fix
1 month ago
Tong Li
4c8e85ee0d
[Coati] Train DPO using PP (#6054)
* update dpo
* remove unsupported plugin
* update msg
* update dpo
* remove unsupported plugin
* update msg
* update template
* update dataset
* add pp for dpo
* update dpo
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* add dpo fn
* update dpo
* update dpo
* update dpo
* update dpo
* minor update
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* update loss
* update help
* polish code
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2 months ago
wangbluo
703bb5c18d
fix the test
2 months ago
wangbluo
4e0e99bb6a
fix the test
2 months ago
wangbluo
1507a7528f
fix
2 months ago
wangbluo
0002ae5956
fix
2 months ago
Hongxin Liu
dc2cdaf3e8
[shardformer] optimize seq parallelism (#6086)
* [shardformer] optimize seq parallelism
* [shardformer] fix gpt2 fused linear col
* [plugin] update gemini plugin
* [plugin] update moe hybrid plugin
* [test] update gpt2 fused linear test
* [shardformer] fix gpt2 fused linear reduce
2 months ago
wangbluo
efe3042bb2
fix
2 months ago
梁爽
6b2c506fc5
Update README.md (#6087)
add HPC-AI.COM activity
2 months ago
wangbluo
5ecc27e150
fix
2 months ago
wangbluo
f98384aef6
fix
2 months ago
Hongxin Liu
646b3c5a90
[shardformer] fix linear 1d row and support uneven splits for fused qkv linear (#6084)
* [tp] hotfix linear row
* [tp] support uneven split for fused linear
* [tp] support sp for fused linear
* [tp] fix gpt2 mlp policy
* [tp] fix gather fused and add fused linear row
2 months ago
wangbluo
b635dd0669
fix
2 months ago
wangbluo
3532f77b90
fix
2 months ago