10 Commits (184a65370451452bae87d0058bba06563028c4a8)

Author SHA1 Message Date
Edenzzzz 936d0b0f7b
[doc] Update llama + sp compatibility; fix dist optim table
Co-authored-by: Edenzzzz <wtan45@wisc.edu>
2024-07-01 17:07:22 +08:00
flybird11111 773d9f964a
[shardformer]delete xformers (#5859)
* delete xformers

* fix

* fix

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
2024-06-28 11:20:04 +08:00
Hongxin Liu bbb2c21f16
[shardformer] fix chatglm implementation (#5644)
* [shardformer] fix chatglm policy

* [shardformer] fix chatglm flash attn

* [shardformer] update readme

* [shardformer] fix chatglm init

* [shardformer] fix chatglm test

* [pipeline] fix chatglm merge batch
2024-04-25 14:41:17 +08:00
Wenhao Chen bb0a668fee
[hotfix] set return_outputs=False in examples and polish code (#5404)
* fix: simplify merge_batch

* fix: use return_outputs=False to eliminate extra memory consumption

* feat: add return_outputs warning

* style: remove `return_outputs=False` as it is the default value
2024-03-25 12:31:09 +08:00
Wenhao Chen 7172459e74
[shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)
* [shardformer] implement policy for all GPT-J models and test

* [shardformer] support interleaved pipeline parallel for bert finetune

* [shardformer] shardformer support falcon (#4883)

* [shardformer]: fix interleaved pipeline for bert model (#5048)

* [hotfix]: disable seq parallel for gptj and falcon, and polish code (#5093)

* Add Mistral support for Shardformer (#5103)

* [shardformer] add tests to mistral (#5105)

---------

Co-authored-by: Pengtai Xu <henryxu880@gmail.com>
Co-authored-by: ppt0011 <143150326+ppt0011@users.noreply.github.com>
Co-authored-by: flybird11111 <1829166702@qq.com>
Co-authored-by: eric8607242 <e0928021388@gmail.com>
2023-11-28 16:54:42 +08:00
Baizhou Zhang a2db75546d
[doc] polish shardformer doc (#4779)
* fix example format in docstring

* polish shardformer doc
2023-09-26 10:57:47 +08:00
Baizhou Zhang 451c3465fb
[doc] polish shardformer doc (#4735)
* arrange position of chapters

* fix typos in seq parallel doc
2023-09-15 17:39:10 +08:00
Bin Jia 6a03c933a0
[shardformer] update seq parallel document (#4730)
* update doc of seq parallel

* fix typo
2023-09-15 16:09:32 +08:00
Baizhou Zhang 50e5602c2d
[doc] add shardformer support matrix/update tensor parallel documents (#4728)
* add compatibility matrix for shardformer doc

* update tp doc
2023-09-15 13:52:30 +08:00
Baizhou Zhang f911d5b09d
[doc] Add user document for Shardformer (#4702)
* create shardformer doc files

* add docstring for seq-parallel

* update ShardConfig docstring

* add links to llama example

* add outdated message

* finish introduction & supporting information

* finish 'how shardformer works'

* finish shardformer.md English doc

* fix doctest fail

* add Chinese document
2023-09-15 10:56:39 +08:00