Wang Binluo
868afdb311
Dev/zero offload ( #5858 )
* fix llama
* fix llama
5 months ago
Wang Binluo
de3f67d128
fix llama ( #5856 )
5 months ago
Wang Binluo
4c06215dce
Merge pull request #5844 from wangbluo/offload
Update Qwen2 model
5 months ago
Wang Binluo
e893f88a4f
Merge branch 'dev/zero-offload' into offload
5 months ago
wangbluo
d4ff644ef3
update qwen model
5 months ago
wangbluo
dba59354d7
remove vocab_size args
5 months ago
Wang Binluo
35ef72bfd1
Merge pull request #5842 from wangbluo/dev/zero-offload
update llama model
5 months ago
pre-commit-ci[bot]
351a1c269b
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
5 months ago
wangbluo
b12e9a3275
update llama model
5 months ago
wangbluo
52ea64824e
remove 4d attention mask
5 months ago
pre-commit-ci[bot]
df612434c9
[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
5 months ago
Wang Binluo
4c69e2dc91
support qwen model
5 months ago
Wenhao Chen
32e642bf40
revert: enable return_outputs when necessary
5 months ago
Wenhao Chen
856b39f69d
to: add qwen2 auto policy
5 months ago
Wenhao Chen
6fa181ebef
feat: add qwen2 to model_zoo
5 months ago
Wenhao Chen
14305c9449
test: add qwen2 shard test
5 months ago
Wenhao Chen
5512bdf1fc
fix: modify model config and add Qwen2RMSNorm
5 months ago
Wenhao Chen
5c2a47a667
feat: support qwen2 model
5 months ago
Wenhao Chen
61545fcfee
feat: add `sub_dp_size` in plugin
8 months ago
Wenhao Chen
6ceaf4f1f8
tests: add `sub_dp_group` test
8 months ago
Wenhao Chen
9291f07964
feat: add `sub_dp_group`
8 months ago
Wenhao Chen
1aaa453706
perf: use async copy to accelerate memcpy
8 months ago
Wenhao Chen
a53c8c1ade
to: remove MoE temporarily
8 months ago
Wenhao Chen
93aaa21d4a
feat: add `DataPrefetcher`
8 months ago
Wenhao Chen
a1ab2d374e
misc: add offload warning
8 months ago
Wenhao Chen
e614aa34f3
[shardformer, pipeline] add `gradient_checkpointing_ratio` and heterogeneous shard policy for llama ( #5508 )
* feat: add `GradientCheckpointConfig` and `PipelineGradientCheckpointConfig`
* feat: apply `GradientCheckpointConfig` to policy and llama_forward
* feat: move `distribute_layer` and `get_stage_index` to PipelineStageManager
* fix: add optional args for `distribute_layer` and `get_stage_index`
* fix: fix changed API calls
* test: update llama tests
* style: polish `GradientCheckpointConfig`
* fix: fix pipeline utils tests
8 months ago
YeAnbang
df5e9c53cf
[ColossalChat] Update RLHF V2 ( #5286 )
* Add dpo. Fix sft, ppo, lora. Refactor all
* fix and tested ppo
* 2nd round refactor
* add ci tests
* fix ci
* fix ci
* fix readme, style
* fix readme style
* fix style, fix benchmark
* reproduce benchmark result, remove useless files
* rename to ColossalChat
* use new image
* fix ci workflow
* fix ci
* use local model/tokenizer for ci tests
* fix ci
* fix ci
* fix ci
* fix ci timeout
* fix rm progress bar. fix ci timeout
* fix ci
* fix ci typo
* remove 3d plugin from ci temporary
* test environment
* cannot save optimizer
* support chat template
* fix readme
* fix path
* test ci locally
* restore build_or_pr
* fix ci data path
* fix benchmark
* fix ci, move ci tests to 3080, disable fast tokenizer
* move ci to 85
* support flash attention 2
* add all-in-one data preparation script. Fix colossal-llama2-chat chat template
* add hardware requirements
* move ci test data
* fix save_model, add unwrap
* fix missing bos
* fix missing bos; support grad accumulation with gemini
* fix ci
* fix ci
* fix ci
* fix llama2 chat template config
* debug sft
* debug sft
* fix colossalai version requirement
* fix ci
* add sanity check to prevent NaN loss
* fix requirements
* add dummy data generation script
* add dummy data generation script
* add dummy data generation script
* add dummy data generation script
* update readme
* update readme
* update readme and ignore
* fix logger bug
* support parallel_output
* modify data preparation logic
* fix tokenization
* update lr
* fix inference
* run pre-commit
---------
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
8 months ago
Yuanheng Zhao
36c4bb2893
[Fix] Grok-1 use tokenizer from the same pretrained path ( #5532 )
* [fix] use tokenizer from the same pretrained path
* trust remote code
8 months ago
Insu Jang
00525f7772
[shardformer] fix pipeline forward error if custom layer distribution is used ( #5189 )
* Use self.[distribute_layers|get_stage_index] to exploit custom layer distribution
* Change static methods for t5 layer distribution to member functions
* Change static methods for whisper layer distribution to member functions
* Replace whisper policy usage with self one
* Fix test case to use non-static layer distribution methods
* fix: fix typo
---------
Co-authored-by: Wenhao Chen <cwher@outlook.com>
8 months ago
github-actions[bot]
e6707a6e8d
[format] applied code formatting on changed files in pull request 5510 ( #5517 )
Co-authored-by: github-actions <github-actions@github.com>
8 months ago
Hongxin Liu
19e1a5cf16
[shardformer] update colo attention to support custom mask ( #5510 )
* [feature] refactor colo attention (#5462 )
* [extension] update api
* [feature] add colo attention
* [feature] update sdpa
* [feature] update npu attention
* [feature] update flash-attn
* [test] add flash attn test
* [test] update flash attn test
* [shardformer] update modeling to fit colo attention (#5465 )
* [misc] refactor folder structure
* [shardformer] update llama flash-attn
* [shardformer] fix llama policy
* [devops] update tensornvme install
* [test] update llama test
* [shardformer] update colo attn kernel dispatch
* [shardformer] update blip2
* [shardformer] update chatglm
* [shardformer] update gpt2
* [shardformer] update gptj
* [shardformer] update opt
* [shardformer] update vit
* [shardformer] update colo attention mask prep
* [shardformer] update whisper
* [test] fix shardformer tests (#5514 )
* [test] fix shardformer tests
* [test] fix shardformer tests
8 months ago
Edenzzzz
9a3321e9f4
Merge pull request #5515 from Edenzzzz/fix_layout_convert
Fix layout convertor caching
8 months ago
Edenzzzz
18edcd5368
Empty-Commit
8 months ago
Edenzzzz
61da3fbc52
fixed layout converter caching and updated tester
8 months ago
Rocky Duan
cbe34c557c
Fix ColoTensorSpec for py11 ( #5440 )
8 months ago
Hongxin Liu
a7790a92e8
[devops] fix example test ci ( #5504 )
8 months ago
Yuanheng Zhao
131f32a076
[fix] fix grok-1 example typo ( #5506 )
8 months ago
flybird11111
0688d92e2d
[shardformer] Fix lm parallel. ( #5480 )
* fix
* padding vocab_size when using pipeline parallelism
padding vocab_size when using pipeline parallelism
fix
fix
* fix
* fix
fix
fix
* fix gather output
* fix
* fix
* fix
fix resize embedding
fix resize embedding
* fix resize embedding
fix
* revert
* revert
* revert
* fix lm forward distribution
* fix
* test ci
* fix
8 months ago
binmakeswell
34e909256c
[release] grok-1 inference benchmark ( #5500 )
* [release] grok-1 inference benchmark
* [release] grok-1 inference benchmark
* [release] grok-1 inference benchmark
* [release] grok-1 inference benchmark
* [release] grok-1 inference benchmark
8 months ago
Wenhao Chen
bb0a668fee
[hotfix] set return_outputs=False in examples and polish code ( #5404 )
* fix: simplify merge_batch
* fix: use return_outputs=False to eliminate extra memory consumption
* feat: add return_outputs warning
* style: remove `return_outputs=False` as it is the default value
8 months ago
Yuanheng Zhao
5fcd7795cd
[example] update Grok-1 inference ( #5495 )
* revise grok-1 example
* remove unused arg in scripts
* prevent re-installing torch
* update readme
* revert modifying colossalai requirements
* add perf
* trivial
* add tokenizer url
8 months ago
binmakeswell
6df844b8c4
[release] grok-1 314b inference ( #5490 )
* [release] grok-1 inference
* [release] grok-1 inference
* [release] grok-1 inference
8 months ago
Hongxin Liu
848a574c26
[example] add grok-1 inference ( #5485 )
* [misc] add submodule
* remove submodule
* [example] support grok-1 tp inference
* [example] add grok-1 inference script
* [example] refactor code
* [example] add grok-1 readme
* [example] add test ci
* [example] update readme
8 months ago
binmakeswell
d158fc0e64
[doc] update open-sora demo ( #5479 )
* [doc] update open-sora demo
* [doc] update open-sora demo
* [doc] update open-sora demo
8 months ago
binmakeswell
bd998ced03
[doc] release Open-Sora 1.0 with model weights ( #5468 )
* [doc] release Open-Sora 1.0 with model weights
* [doc] release Open-Sora 1.0 with model weights
* [doc] release Open-Sora 1.0 with model weights
8 months ago
flybird11111
5e16bf7980
[shardformer] fix gathering output when using tensor parallelism ( #5431 )
* fix
* padding vocab_size when using pipeline parallelism
padding vocab_size when using pipeline parallelism
fix
fix
* fix
* fix
fix
fix
* fix gather output
* fix
* fix
* fix
fix resize embedding
fix resize embedding
* fix resize embedding
fix
* revert
* revert
* revert
8 months ago
Hongxin Liu
f2e8b9ef9f
[devops] fix compatibility ( #5444 )
* [devops] fix compatibility
* [hotfix] update compatibility test on pr
* [devops] fix compatibility
* [devops] record duration during comp test
* [test] decrease test duration
* fix falcon
8 months ago
digger yu
385e85afd4
[hotfix] fix typo s/keywrods/keywords etc. ( #5429 )
9 months ago
Camille Zhong
da885ed540
fix tensor data update for gemini loss calculation ( #5442 )
9 months ago
Hongxin Liu
8020f42630
[release] update version ( #5411 )
9 months ago