Commit Graph

1162 Commits (2014cce87062ab10bedf1dbc9871723ba80ded50)

Author SHA1 Message Date
GuangyaoZhang fe2e74c03a fix precommit
5 months ago
GuangyaoZhang 98da648a4a Fix Code Factor check
5 months ago
GuangyaoZhang f656d61778 change command
5 months ago
Edenzzzz 8795bb2e80 Support 4d parallel + flash attention (#5789)
5 months ago
flybird11111 2ddf624a86 [shardformer] upgrade transformers to 4.39.3 (#5815)
6 months ago
Li Xingjian 8554585a5f [Inference] Fix flash-attn import and add model test (#5794)
6 months ago
Guangyao Zhang aac941ef78 [test] fix qwen2 pytest distLarge (#5797)
6 months ago
Hongxin Liu 587bbf4c6d [test] fix chatglm test kit (#5793)
6 months ago
char-1ee b303976a27 Fix test import
6 months ago
char-1ee 5f398fc000 Pass inference model shard configs for module init
6 months ago
duanjunwen 10a19e22c6 [hotfix] fix testcase in test_fx/test_tracer (#5779)
6 months ago
botbw 80c3c8789b [Test/CI] remove test cases to reduce CI duration (#5753)
6 months ago
Edenzzzz 79f7a7b211 [misc] Accelerate CI for zero and dist optim (#5758)
6 months ago
yuehuayingxueluo b45000f839 [Inference]Add Streaming LLM (#5745)
6 months ago
Haze188 e22b82755d [CI/tests] simplify some test case to reduce testing time (#5755)
6 months ago
duanjunwen 1b76564e16 [test] Fix/fix testcase (#5770)
6 months ago
Hongxin Liu 68359ed1e1 [release] update version (#5752)
6 months ago
botbw 023ea13cb5 Merge pull request #5749 from hpcaitech/prefetch
6 months ago
Yuanheng Zhao b96c6390f4 [inference] Fix running time of test_continuous_batching (#5750)
6 months ago
Edenzzzz 5f8c0a0ac3 [Feature] auto-cast optimizers to distributed version (#5746)
6 months ago
hxwang ca674549e0 [chore] remove unnecessary test & changes
6 months ago
hxwang ff507b755e Merge branch 'main' of github.com:hpcaitech/ColossalAI into prefetch
6 months ago
botbw 2fc85abf43 [gemini] async grad chunk reduce (all-reduce&reduce-scatter) (#5713)
6 months ago
hxwang 15d21a077a Merge remote-tracking branch 'origin/main' into prefetch
6 months ago
botbw 13c06d36a3 [bug] fix early return (#5740)
6 months ago
Yuanheng Zhao 8633c15da9 [sync] Sync feature/colossal-infer with main
6 months ago
genghaozhe 5470e5f94e a commit for fake push test
7 months ago
Edenzzzz 43995ee436 [Feature] Distributed optimizers: Lamb, Galore, CAME and Adafactor (#5694)
7 months ago
Steve Luo 7806842f2d add paged-attetionv2: support seq length split across thread block (#5707)
7 months ago
Runyu Lu 18d67d0e8e [Feat]Inference RPC Server Support (#5705)
7 months ago
傅剑寒 50104ab340 [Inference/Feat] Add convert_fp8 op for fp8 test in the future (#5706)
7 months ago
Wang Binluo a3cc68ca93 [Shardformer] Support the Qwen2 model (#5699)
7 months ago
flybird11111 d4c5ef441e [gemini]remove registered gradients hooks (#5696)
7 months ago
CjhHa1 bc9063adf1 resolve rebase conflicts on Branch feat/online-serving
7 months ago
Jianghai 61a1b2e798 [Inference] Fix bugs and docs for feat/online-server (#5598)
7 months ago
Jianghai c064032865 [Online Server] Chat Api for streaming and not streaming response (#5470)
7 months ago
Jianghai de378cd2ab [Inference] Finish Online Serving Test, add streaming output api, continuous batching test and example (#5432)
7 months ago
Jianghai 69cd7e069d [Inference] ADD async and sync Api server using FastAPI (#5396)
7 months ago
yuehuayingxueluo 9c2fe7935f [Inference]Adapt temperature processing logic (#5689)
7 months ago
Yuanheng Zhao 55cc7f3df7 [Fix] Fix Inference Example, Tests, and Requirements (#5688)
7 months ago
flybird11111 77ec773388 [zero]remove registered gradients hooks (#5687)
7 months ago
Yuanheng Zhao 8754abae24 [Fix] Fix & Update Inference Tests (compatibility w/ main)
7 months ago
Yuanheng Zhao 56ed09aba5 [sync] resolve conflicts of merging main
7 months ago
Yuanheng Zhao 537a3cbc4d [kernel] Support New KCache Layout - Triton Kernel (#5677)
7 months ago
Steve Luo 5cd75ce4c7 [Inference/Kernel] refactor kvcache manager and rotary_embedding and kvcache_memcpy oper… (#5663)
7 months ago
yuehuayingxueluo 5f00002e43 [Inference] Adapt Baichuan2-13B TP (#5659)
7 months ago
Hongxin Liu 7f8b16635b [misc] refactor launch API and tensor constructor (#5666)
7 months ago
linsj20 91fa553775 [Feature] qlora support (#5586)
7 months ago
flybird11111 8954a0c2e2 [LowLevelZero] low level zero support lora (#5153)
7 months ago
Baizhou Zhang 14b0d4c7e5 [lora] add lora APIs for booster, support lora for TorchDDP (#4981)
7 months ago