ColossalAI

Commit Graph

Author	SHA1	Message	Date
Tong Li	1d96a562bb	update	2024-01-11 14:05:44 +08:00
Tong Li	dac240563c	minor update	2024-01-10 11:12:09 +08:00
Tong Li	ea088b5f75	update train code	2024-01-10 10:42:37 +08:00
Tong Li	4b7f273022	add moe	2024-01-09 11:59:38 +08:00
ver217	63ee6fffe6	Merge branch 'main' into exp/mixtral	2024-01-08 16:43:54 +08:00
ver217	ce1cff26bd	Merge branch 'main' into exp/mixtral	2024-01-08 16:42:00 +08:00
Elsa Granger	d565df3821	[pipeline] A more general _communicate in p2p (#5062) * A more general _communicate * feat: finish tree_flatten version p2p * fix: update p2p api calls --------- Co-authored-by: Wenhao Chen <cwher@outlook.com>	2024-01-08 15:37:27 +08:00
binmakeswell	7bc6969ce6	[doc] SwiftInfer release (#5236) * [doc] SwiftInfer release * [doc] SwiftInfer release * [doc] SwiftInfer release * [doc] SwiftInfer release * [doc] SwiftInfer release	2024-01-08 09:55:12 +08:00
github-actions[bot]	4fb4a22a72	[format] applied code formatting on changed files in pull request 5234 (#5235) Co-authored-by: github-actions <github-actions@github.com>	2024-01-07 20:55:34 +08:00
binmakeswell	b9b32b15e6	[doc] add Colossal-LLaMA-2-13B (#5234) * [doc] add Colossal-LLaMA-2-13B * [doc] add Colossal-LLaMA-2-13B * [doc] add Colossal-LLaMA-2-13B	2024-01-07 20:53:12 +08:00
JIMMY ZHAO	ce651270f1	[doc] Make leaderboard format more uniform and good-looking (#5231) * Make leaderboard format more unifeid and good-looking * Update README.md * Update README.md	2024-01-06 17:12:29 +08:00
Camille Zhong	915b4652f3	[doc] Update README.md of Colossal-LLAMA2 (#5233) * Update README.md * Update README.md	2024-01-06 17:06:41 +08:00
Tong Li	d992b55968	[Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model (#5224) * update readme * update readme * update link * update * update readme * update * update * update * update title * update example * update example * fix content * add conclusion * add license * update * update * update version * fix minor	2024-01-05 17:24:26 +08:00
Wenhao Chen	196b85368b	[pipeline]: add p2p fallback order and fix interleaved pp deadlock (#5214) * fix: add fallback order option and update 1f1b * fix: fix deadlock comm in interleaved pp * test: modify p2p test	2024-01-05 14:01:54 +08:00
Wenhao Chen	931d0e0731	[pipeline]: support arbitrary batch size in forward_only mode (#5201) * fix: remove drop last in val & test dataloader * feat: add run_forward_only, support arbitrary bs * chore: modify ci script	2024-01-05 14:01:39 +08:00
Wenhao Chen	1810b9100f	[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134) * test: add more p2p tests * fix: remove send_forward_recv_forward as p2p op list need to use the same group * fix: make send and receive atomic * feat: update P2PComm fn * feat: add metadata cache in 1f1b * feat: add metadata cache in interleaved pp * feat: modify is_xx_stage fn * revert: add _broadcast_object_list * feat: add interleaved pp in llama policy * feat: set NCCL_BUFFSIZE in HybridParallelPlugin	2024-01-05 13:58:53 +08:00
digger yu	b0b53a171c	[nfc] fix typo colossalai/shardformer/ (#5133)	2024-01-04 16:21:55 +08:00
Xuanlei Zhao	6b69f3085b	update	2024-01-03 15:37:59 +08:00
flybird11111	451e9142b8	fix flash attn (#5209)	2024-01-03 14:39:53 +08:00
flybird11111	365671be10	fix-test (#5210) fix-test fix-test	2024-01-03 14:26:13 +08:00
Xuanlei Zhao	8ca8cf8ec3	update optim	2024-01-03 11:57:23 +08:00
Hongxin Liu	7f3400b560	[devops] update torch versoin in ci (#5217)	2024-01-03 11:46:33 +08:00
Wenhao Chen	d799a3088f	[pipeline]: add p2p fallback order and fix interleaved pp deadlock (#5214) * fix: add fallback order option and update 1f1b * fix: fix deadlock comm in interleaved pp * test: modify p2p test	2024-01-03 11:34:49 +08:00
Wenhao Chen	3c0d82b19b	[pipeline]: support arbitrary batch size in forward_only mode (#5201) * fix: remove drop last in val & test dataloader * feat: add run_forward_only, support arbitrary bs * chore: modify ci script	2024-01-02 23:41:12 +08:00
Xuanlei Zhao	f037583bd2	update train	2024-01-02 14:01:58 +08:00
flybird11111	02d2328a04	support linear accumulation fusion (#5199) support linear accumulation fusion support linear accumulation fusion fix	2023-12-29 18:22:42 +08:00
Xuanlei Zhao	0b8c33f474	update	2023-12-29 18:20:32 +08:00
Xuanlei Zhao	c1c6af6368	update	2023-12-29 18:09:28 +08:00
Xuanlei Zhao	0bb317d9e6	update	2023-12-29 17:28:46 +08:00
Xuanlei Zhao	ccad7014c6	update optim	2023-12-29 16:51:29 +08:00
Xuanlei Zhao	44014faa67	fix optim	2023-12-28 21:58:08 +08:00
Xuanlei Zhao	0a3aae509b	update utils and fwd bwd	2023-12-28 18:54:56 +08:00
Xuanlei Zhao	a5580e6289	update test	2023-12-28 18:52:37 +08:00
Xuanlei Zhao	73aa406b96	update	2023-12-28 15:48:04 +08:00
Zhongkai Zhao	64519eb830	[doc] Update required third-party library list for testing and torch comptibility checking (#5207) * doc/update requirements-test.txt * update torch-cuda compatibility check	2023-12-27 18:03:45 +08:00
Xuanlei Zhao	570f5cd693	update pytest	2023-12-27 16:05:00 +08:00
Xuanlei Zhao	54b197cc02	update readme	2023-12-26 17:39:38 +08:00
Xuanlei Zhao	4922641098	script	2023-12-26 17:33:32 +08:00
Xuanlei Zhao	d660a41850	update	2023-12-26 17:32:59 +08:00
Xuanlei Zhao	b8fadb68a7	add pad	2023-12-25 17:02:05 +08:00
Xuanlei Zhao	23341687ed	update	2023-12-25 16:29:47 +08:00
Xuanlei Zhao	aa2e091dc6	update	2023-12-25 16:05:42 +08:00
Yuanchen	eae01b6740	Improve logic for selecting metrics (#5196) Co-authored-by: Xu <yuanchen.xu00@gmail.com>	2023-12-22 14:52:50 +08:00
Wenhao Chen	4fa689fca1	[pipeline]: fix p2p comm, add metadata cache and support llama interleaved pp (#5134) * test: add more p2p tests * fix: remove send_forward_recv_forward as p2p op list need to use the same group * fix: make send and receive atomic * feat: update P2PComm fn * feat: add metadata cache in 1f1b * feat: add metadata cache in interleaved pp * feat: modify is_xx_stage fn * revert: add _broadcast_object_list * feat: add interleaved pp in llama policy * feat: set NCCL_BUFFSIZE in HybridParallelPlugin	2023-12-22 10:44:00 +08:00
BlueRum	af952673f7	polish readme in application/chat (#5194)	2023-12-20 11:28:39 +08:00
Xuanlei Zhao	7c5b1a585f	update	2023-12-18 10:37:07 +08:00
flybird11111	681d9b12ef	[doc] update pytorch version in documents. (#5177) * fix aaa fix fix fix * fix * fix * test ci * fix ci fix * update pytorch version in documents	2023-12-15 18:16:48 +08:00
Xuanlei Zhao	ebd8cc579a	update script	2023-12-15 16:38:51 +08:00
Xuanlei Zhao	f66469e209	update	2023-12-15 16:32:32 +08:00
Yuanchen	3ff60d13b0	Fix ColossalEval (#5186) Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>	2023-12-15 15:06:06 +08:00

2967 Commits (feat/moe)