Commit Graph

3080 Commits (6af6d6fc9fe72997af44cdf3cb7b930a365ab915)

Author SHA1 Message Date
ppt0011 c1fab951e7
Merge pull request #4889 from ppt0011/main
[doc] add reminder for issue encountered with hybrid adam
2023-10-12 10:27:10 +08:00
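For context, the HybridAdam optimizer this reminder concerns is used roughly as below (a minimal sketch; assumes ColossalAI is installed with its CUDA extensions, and the import path follows the documentation of this period):

```python
import torch
from colossalai.nn.optimizer import HybridAdam

model = torch.nn.Linear(1024, 1024).cuda()

# HybridAdam keeps optimizer states in CPU memory and runs fused update
# kernels, which is what makes its usage constraints worth a doc reminder.
optimizer = HybridAdam(model.parameters(), lr=1e-3, weight_decay=0.0)

loss = model(torch.randn(8, 1024, device="cuda")).sum()
loss.backward()
optimizer.step()
```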
littsk ffd9a3cbc9
[hotfix] fix bug in sequence parallel test (#4887) 2023-10-11 19:30:41 +08:00
ppt0011 1dcaf249bd [doc] add reminder for issue encountered with hybrid adam 2023-10-11 17:51:14 +08:00
Xu Kai fdec650bb4
fix test llama (#4884) 2023-10-11 17:43:01 +08:00
Bin Jia 08a9f76b2f
[Pipeline Inference] Sync pipeline inference branch to main (#4820)
* [pipeline inference] pipeline inference (#4492)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* fix CI

* add cache clear

* fix code error

* fix typo

* [Pipeline inference] Modify to tie weights (#4599)

* add pp stage manager as circle stage

* fix a bug when create process group

* add ppinfer basic framework

* add micro batch manager and support kvcache-pp gpt2 fwd

* add generate schedule

* use mb size to control mb number

* support generate with kv cache

* add output, remove unused code

* add test

* reuse shardformer to build model

* refactor some code and use the same attribute name of hf

* fix review and add test for generation

* remove unused file

* modify the way of saving new tokens

* modify to tie weights

* modify test

* remove unused file

* solve review

* add docstring

* [Pipeline inference] support llama pipeline inference (#4647)

* support llama pipeline inference

* remove tie weight operation

* [pipeline inference] Fix the blocking of communication when ppsize is 2 (#4708)

* add benchmark verbose

* fix export tokens

* fix benchmark verbose

* add P2POp style to do p2p communication

* modify schedule as p2p type when ppsize is 2

* remove unused code and add docstring

* [Pipeline inference] Refactor code, add docsting, fix bug (#4790)

* add benchmark script

* update argparse

* fix fp16 load

* refactor code style

* add docstring

* polish code

* fix test bug

* [Pipeline inference] Add pipeline inference docs (#4817)

* add readme doc

* add an icon

* Add performance

* update table of contents

* refactor code (#4873)
2023-10-11 11:40:06 +08:00
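One sub-commit above replaces the blocking schedule with `P2POp`-style communication when the pipeline size is 2. For background, this is the generic PyTorch pattern (a sketch, not the PR's actual code; assumes `torch.distributed` is already initialized):

```python
import torch
import torch.distributed as dist

def exchange_hidden_states(tensor: torch.Tensor, prev_rank: int, next_rank: int) -> torch.Tensor:
    """Post send and recv as one batch so neither rank blocks the other.

    With plain dist.send/dist.recv, two stages that both send first (as in
    a 2-stage pipeline) can deadlock; batch_isend_irecv avoids that.
    """
    recv_buf = torch.empty_like(tensor)
    ops = [
        dist.P2POp(dist.isend, tensor, next_rank),
        dist.P2POp(dist.irecv, recv_buf, prev_rank),
    ]
    for req in dist.batch_isend_irecv(ops):
        req.wait()
    return recv_buf
```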
Camille Zhong 652adc2215 Update README.md 2023-10-10 23:19:34 +08:00
Camille Zhong afe10a85fd Update README.md 2023-10-10 23:19:34 +08:00
Camille Zhong d6c4b9b370 Update main README.md
add modelscope model link
2023-10-10 23:19:34 +08:00
Camille Zhong 3043d5d676 Update modelscope link in README.md
add modelscope link
2023-10-10 23:19:34 +08:00
flybird11111 6a21f96a87
[doc] update advanced tutorials, training gpt with hybrid parallelism (#4866)
* [doc] update advanced tutorials, training gpt with hybrid parallelism

* [doc] update advanced tutorials, training gpt with hybrid parallelism

* update vit tutorials

* update vit tutorials

* update vit tutorials

* update vit tutorials

* update en/train_vit_with_hybrid_parallel.py

* fix

* resolve comments

* fix
2023-10-10 08:18:55 +00:00
Blagoy Simandoff 8aed02b957
[nfc] fix minor typo in README (#4846) 2023-10-07 17:51:11 +08:00
Camille Zhong cd6a962e66 [NFC] polish code style (#4799) 2023-10-07 13:36:52 +08:00
Michelle 07ed155e86 [NFC] polish colossalai/inference/quant/gptq/cai_gptq/__init__.py code style (#4792) 2023-10-07 13:36:52 +08:00
littsk eef96e0877 polish code for gptq (#4793) 2023-10-07 13:36:52 +08:00
Hongxin Liu cb3a25a062
[checkpointio] hotfix torch 2.0 compatibility (#4824) 2023-10-07 10:45:52 +08:00
ppt0011 ad23460cf8
Merge pull request #4856 from KKZ20/test/model_support_for_low_level_zero
[test] remove the redundant code of model output transformation in torchrec
2023-10-06 09:32:33 +08:00
ppt0011 81ee91f2ca
Merge pull request #4858 from Shawlleyw/main
[doc] fix typo in documentation of the booster low_level_zero plugin
2023-10-06 09:27:54 +08:00
shaoyuw c97a3523db fix: typo in comment of low_level_zero plugin 2023-10-05 16:30:34 +00:00
Zhongkai Zhao db40e086c8 [test] modify model supporting part of low_level_zero plugin (including corresponding docs) 2023-10-05 15:10:31 +08:00
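For context on the plugin these two commits document and test, a minimal sketch of using it through the booster API (argument names follow the docs of this period and may differ across versions):

```python
import torch
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import LowLevelZeroPlugin

colossalai.launch_from_torch(config={})  # distributed init; signature is version-dependent

plugin = LowLevelZeroPlugin(stage=2, precision="fp16")  # ZeRO stage 1 or 2
booster = Booster(plugin=plugin)

model = torch.nn.Linear(1024, 1024)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = torch.nn.MSELoss()

# boost() wraps the model and optimizer so gradients and optimizer states
# are partitioned across data-parallel ranks
model, optimizer, criterion, _, _ = booster.boost(model, optimizer, criterion)
```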
Xu Kai d1fcc0fa4d
[infer] fix test bug (#4838)
* fix test bug

* delete useless code

* fix typo
2023-10-04 10:01:03 +08:00
Jianghai 013a4bedf0
[inference] fix import bug and delete useless init (#4830)
* fix import bug and remove useless init

* fix

* fix

* fix
2023-10-04 09:18:45 +08:00
Yuanheng Zhao 573f270537
[Infer] Serving example w/ ray-serve (multiple GPU case) (#4841)
* fix imports

* add ray-serve with Colossal-Infer tp

* trivial: send requests script

* add README

* fix worker port

* fix readme

* use app builder and autoscaling

* trivial: input args

* clean code; revise readme

* testci (skip example test)

* use auto model/tokenizer

* revert imports fix (fixed in other PRs)
2023-10-02 17:48:38 +08:00
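The "app builder and autoscaling" items above refer to Ray Serve's deployment pattern, sketched generically below (class and argument values are illustrative, not the example's actual code; Ray Serve 2.x API):

```python
from ray import serve
from starlette.requests import Request

@serve.deployment(
    ray_actor_options={"num_gpus": 1},
    autoscaling_config={"min_replicas": 1, "max_replicas": 2},
)
class InferenceDeployment:
    def __init__(self, model_path: str):
        # In the multi-GPU case, tensor-parallel workers holding the
        # Colossal-Infer engine would be created here.
        self.model_path = model_path

    async def __call__(self, request: Request) -> str:
        prompt = (await request.json())["prompt"]
        return f"generated text for: {prompt}"  # placeholder for real generation

# "app builder" style: bind constructor args, then deploy with serve.run(app)
app = InferenceDeployment.bind("/path/to/model")
```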
Yuanheng Zhao 3a74eb4b3a
[Infer] Colossal-Inference serving example w/ TorchServe (single GPU case) (#4771)
* add Colossal-Inference serving example w/ TorchServe

* add dockerfile

* fix dockerfile

* fix dockerfile: fix commit hash, install curl

* refactor file structure

* revise readme

* trivial

* trivial: dockerfile format

* clean dir; revise readme

* fix comments: fix imports and configs

* fix formats

* remove unused requirements
2023-10-02 17:42:37 +08:00
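TorchServe drives models through a custom handler; the example above would follow this contract (a skeleton only; the real handler loads the Colossal-Inference engine):

```python
from ts.torch_handler.base_handler import BaseHandler

class ColossalInferHandler(BaseHandler):
    """Skeleton handler; method names are TorchServe's standard hooks."""

    def initialize(self, context):
        # context.system_properties carries model_dir, gpu_id, etc.
        self.initialized = True

    def preprocess(self, requests):
        return [req.get("data") or req.get("body") for req in requests]

    def inference(self, inputs):
        return [f"echo: {x}" for x in inputs]  # placeholder for model.generate

    def postprocess(self, outputs):
        return outputs
```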
Tong Li ed06731e00
update Colossal (#4832) 2023-09-28 16:05:05 +08:00
Xu Kai c3bef20478
add autotune (#4822) 2023-09-28 13:47:35 +08:00
binmakeswell 822051d888
[doc] update slack link (#4823) 2023-09-27 17:37:39 +08:00
Yuanchen 1fa8c5e09f
Update Qwen-7B results (#4821)
Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
2023-09-27 17:33:54 +08:00
flybird11111 be400a0936
[chat] fix gemini strategy (#4698)
* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy

* [chat] fix gemini strategy (combination of 2 commits)

* [chat] fix gemini strategy (update llama2 example)

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* [fix] fix gemini strategy

* fix

* fix

* fix

* fix

* fix

* Update train_prompts.py
2023-09-27 13:15:32 +08:00
Tong Li bbbcac26e8
fix format (#4815) 2023-09-27 12:50:22 +08:00
github-actions[bot] fb46d05cdf
[format] applied code formatting on changed files in pull request 4595 (#4602)
Co-authored-by: github-actions <github-actions@github.com>
2023-09-27 10:45:03 +08:00
littsk 11f1e426fe
[hotfix] Correct several erroneous code comments (#4794) 2023-09-27 10:43:03 +08:00
littsk 54b3ad8924
[hotfix] fix norm type error in zero optimizer (#4795) 2023-09-27 10:35:24 +08:00
Hongxin Liu da15fdb9ca
[doc] add lazy init docs (#4808) 2023-09-27 10:24:04 +08:00
Yan haixu a22706337a
[misc] add last_epoch in CosineAnnealingWarmupLR (#4778) 2023-09-26 14:43:46 +08:00
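The `last_epoch` argument follows the standard PyTorch scheduler convention, shown here with the stock cosine scheduler (ColossalAI's `CosineAnnealingWarmupLR` adds a warmup phase on top; its exact constructor is not reproduced here):

```python
import torch

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# last_epoch=-1 (the default) starts a fresh schedule. To resume from a
# checkpoint, record initial_lr and pass the restored step count instead.
for group in optimizer.param_groups:
    group.setdefault("initial_lr", group["lr"])

resumed_step = 42
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
    optimizer, T_max=100, last_epoch=resumed_step
)
```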
Chandler-Bing b6cf0aca55
[hotfix] change llama2 Colossal-LLaMA-2 script filename (#4800)
change filename:
pretraining.py -> trainin.py
there is no file named pretraining.py; the filename was misspelled
2023-09-26 11:44:27 +08:00
Desperado-Jia 62b6af1025
Merge pull request #4805 from TongLi3701/docs/fix
[doc] Update TODO in README of Colossal-LLaMA-2
2023-09-26 11:39:35 +08:00
Tong Li 8cbce6184d update 2023-09-26 11:36:53 +08:00
Hongxin Liu 4965c0dabd
[lazy] support from_pretrained (#4801)
* [lazy] patch from pretrained

* [lazy] fix from pretrained and add tests

* [devops] update ci
2023-09-26 11:04:11 +08:00
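For background, the lazy-init pattern this commit extends to `from_pretrained` looks roughly like this (import paths per the ColossalAI docs of this period; treat details as assumptions):

```python
from colossalai.lazy import LazyInitContext
from transformers import AutoModelForCausalLM

# Inside the context, parameters are recorded as meta tensors rather than
# allocated, so large checkpoints can be described without host memory.
with LazyInitContext():
    model = AutoModelForCausalLM.from_pretrained("gpt2")

# Materialization allocates the real tensors; a booster/plugin normally
# does this after sharding decisions have been made.
model = LazyInitContext.materialize(model)
```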
Tong Li bd014673b0 update readme 2023-09-26 10:58:05 +08:00
Baizhou Zhang 64a08b2dc3
[checkpointio] support unsharded checkpointIO for hybrid parallel (#4774)
* support unsharded saving/loading for model

* support optimizer unsharded saving

* update doc

* support unsharded loading for optimizer

* small fix
2023-09-26 10:58:03 +08:00
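The unsharded path added here is reached through the booster checkpoint API, roughly as below (a sketch assuming a distributed launch has already happened; flag names follow the documented interface):

```python
import torch
from colossalai.booster import Booster
from colossalai.booster.plugin import HybridParallelPlugin

# assumes colossalai.launch_from_torch(...) has initialized distributed state
plugin = HybridParallelPlugin(tp_size=1, pp_size=1)
booster = Booster(plugin=plugin)

model = torch.nn.Linear(8, 8)
optimizer = torch.optim.Adam(model.parameters())
model, optimizer, *_ = booster.boost(model, optimizer)

# shard=False writes a single unsharded file instead of per-rank shards
booster.save_model(model, "model.pt", shard=False)
booster.save_optimizer(optimizer, "optim.pt", shard=False)
```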
Baizhou Zhang a2db75546d
[doc] polish shardformer doc (#4779)
* fix example format in docstring

* polish shardformer doc
2023-09-26 10:57:47 +08:00
flybird11111 26cd6d850c
[fix] fix weekly running example (#4787)
* [fix] fix weekly running example

* [fix] fix weekly running example
2023-09-25 16:19:33 +08:00
binmakeswell d512a4d38d
[doc] add llama2 domain-specific solution news (#4789)
* [doc] add llama2 domain-specific solution news
2023-09-25 10:44:15 +08:00
Yuanchen ce777853ae
[feature] ColossalEval: Evaluation Pipeline for LLMs (#4786)
* Add ColossalEval

* Delete evaluate in Chat

---------

Co-authored-by: Xu Yuanchen <yuanchen.xu00@gmail.com>
Co-authored-by: Tong Li <tong.li352711588@gmail.com>
2023-09-24 23:14:11 +08:00
Tong Li 74aa7d964a
initial commit: add colossal llama 2 (#4784) 2023-09-24 23:12:26 +08:00
Hongxin Liu 4146f1c0ce
[release] update version (#4775)
* [release] update version

* [doc] revert versions
2023-09-22 18:29:17 +08:00
Jianghai ce7ade3882
[inference] chatglm2 infer demo (#4724)
* add chatglm2

* add

* gather needed kernels

* fix some bugs

* finish context forward

* finish context stage

* fix

* add

* pause

* add

* fix bugs

* finish chatglm

* fix bug

* change some logic

* fix bugs

* change some logics

* add

* add

* add

* fix

* fix tests

* fix
2023-09-22 11:12:50 +08:00
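The "context stage" / decode split referenced in these sub-commits mirrors the standard KV-cache generation loop, sketched with plain Hugging Face APIs (generic illustration, not the demo's kernels):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

ids = tok("Hello", return_tensors="pt").input_ids
past = None
with torch.no_grad():
    for _ in range(16):
        # First step is the "context stage" over the full prompt; afterwards
        # only the newest token is fed and the KV cache supplies the rest.
        out = model(ids if past is None else ids[:, -1:],
                    past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=-1)
print(tok.decode(ids[0]))
```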
Xu Kai 946ab56c48
[feature] add gptq for inference (#4754)
* [gptq] add gptq kernel (#4416)

* add gptq

* refactor code

* fix tests

* replace auto-gptq

* rename inference/quant

* refactor test

* add auto-gptq as an option

* reset requirements

* change assert and check auto-gptq

* add import warnings

* change test flash attn version

* remove example

* change requirements of flash_attn

* modify tests

* [skip ci] change requirements-test

* [gptq] faster gptq cuda kernel (#4494)

* [skip ci] add cuda kernels

* add license

* [skip ci] fix max_input_len

* format files & change test size

* [skip ci]

* [gptq] add gptq tensor parallel (#4538)

* add gptq tensor parallel

* add gptq tp

* delete print

* add test gptq check

* add test auto gptq check

* [gptq] combine gptq and kv cache manager (#4706)

* combine gptq and kv cache manager

* add init bits

* delete useless code

* add model path

* delete useless print and update test

* delete useless import

* move option gptq to shard config

* change replace linear to shardformer

* update bloom policy

* delete useless code

* fix import bug and delete useless code

* change colossalai/gptq to colossalai/quant/gptq

* update import linear for tests

* delete useless code and mv gptq_kernel to kernel directory

* fix triton kernel

* add triton import
2023-09-22 11:02:50 +08:00
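As background on what these kernels compute, a toy sketch of 4-bit symmetric weight quantization and dequantization (illustrative math only; GPTQ proper picks quantized values with Hessian-based error compensation, which this omits):

```python
import torch

def quantize_4bit(w: torch.Tensor):
    """Per-output-channel symmetric 4-bit quantization of a weight matrix."""
    scale = w.abs().amax(dim=1, keepdim=True) / 7.0  # int4 range is [-8, 7]
    q = torch.clamp(torch.round(w / scale), -8, 7).to(torch.int8)
    return q, scale

def dequantize_4bit(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4, 16)
q, s = quantize_4bit(w)
print((w - dequantize_4bit(q, s)).abs().max())  # worst-case quantization error
```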
littsk 1e0e080837
[bug] Fix the version check bug in colossalai run when generating the cmd. (#4713)
* Fix the version check bug in colossalai run when generating the cmd.

* polish code
2023-09-22 10:50:47 +08:00
Hongxin Liu 3e05c07bb8
[lazy] support torch 2.0 (#4763)
* [lazy] support _like methods and clamp

* [lazy] pass transformers models

* [lazy] fix device move and requires grad

* [lazy] fix requires grad and refactor api

* [lazy] fix requires grad
2023-09-21 16:30:23 +08:00