* delete xformers
* fix
* fix
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* fix: simplify merge_batch
* fix: use return_outputs=False to eliminate extra memory consumption (see the sketch at the end of this log)
* feat: add return_outputs warning
* style: remove `return_outputs=False` as it is the default value
* [shardformer] implement policy for all GPT-J models and add tests
* [shardformer] support interleaved pipeline parallelism for BERT fine-tuning
* [shardformer] support Falcon (#4883)
* [shardformer] fix interleaved pipeline for the BERT model (#5048)
* [hotfix] disable sequence parallelism for GPT-J and Falcon, and polish code (#5093)
* Add Mistral support for Shardformer (#5103)
* [shardformer] add tests for Mistral (#5105)
---------
Co-authored-by: Pengtai Xu <henryxu880@gmail.com>
Co-authored-by: ppt0011 <143150326+ppt0011@users.noreply.github.com>
Co-authored-by: flybird11111 <1829166702@qq.com>
Co-authored-by: eric8607242 <e0928021388@gmail.com>
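A small, hypothetical sketch of the return_outputs behavior referenced in the fixes above. The function `run_pipeline_step`, its signature, and `forward_fn` are illustrative placeholders rather than ColossalAI's actual API; the point is only the pattern: leaving the flag at its default False means per-micro-batch outputs are never stored, and opting in triggers a warning.

```python
import warnings
from typing import Any, Dict, Iterable, List

def run_pipeline_step(data_iter: Iterable, forward_fn,
                      return_loss: bool = True,
                      return_outputs: bool = False) -> Dict[str, Any]:
    """Hypothetical pipeline step runner illustrating the return_outputs flag."""
    if return_outputs:
        # The "return_outputs warning": opting in keeps every micro-batch's
        # outputs in memory until the step finishes.
        warnings.warn(
            "return_outputs=True retains the outputs of every micro-batch and "
            "increases memory consumption; leave it at the default False if "
            "you only need the loss.",
            stacklevel=2,
        )
    losses: List[float] = []
    outputs: List[Any] = []
    for micro_batch in data_iter:
        out = forward_fn(micro_batch)
        losses.append(float(out))
        if return_outputs:
            outputs.append(out)  # stored only when explicitly requested
    result: Dict[str, Any] = {}
    if return_loss:
        result["loss"] = sum(losses) / max(len(losses), 1)
    if return_outputs:
        result["outputs"] = outputs
    return result

# Usage: the default (return_outputs=False) returns only the averaged loss.
print(run_pipeline_step(iter([1.0, 2.0, 3.0]), lambda x: x * 0.5))
```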