10 Commits (colossalchat)

Author SHA1 Message Date
Runyu Lu 66abf1c6e8 [HotFix] CI,import,requirements-test for #5838 (#5892) 5 months ago
Runyu Lu cba20525a8 [Feat] Diffusion Model(PixArtAlpha/StableDiffusion3) Support (#5838) 5 months ago
pre-commit-ci[bot] 7c2f79fa98 [pre-commit.ci] pre-commit autoupdate (#5572) 5 months ago
Li Xingjian 8554585a5f [Inference] Fix flash-attn import and add model test (#5794) 5 months ago
char-1ee 5f398fc000 Pass inference model shard configs for module init 6 months ago
char-1ee 04386d9eff Refactor modeling by adding attention backend 6 months ago
Runyu Lu 18d67d0e8e [Feat]Inference RPC Server Support (#5705) 6 months ago
Runyu Lu e37ee2fb65 [Feat]Tensor Model Parallel Support For Inference (#5563) 7 months ago
yuehuayingxueluo f366a5ea1f [Inference/kernel]Add Fused Rotary Embedding and KVCache Memcopy CUDA Kernel (#5418) 8 months ago
yuehuayingxueluo cea9c86e45 add utils.py 10 months ago
Xu Kai fd6482ad8c [inference] Refactor inference architecture (#5057) 1 year ago
Bin Jia b6696beb04 [Pipeline Inference] Merge pp with tp (#4993) 1 year ago
Bin Jia 1db6727678 [Pipeline inference] Combine kvcache with pipeline inference (#4938) 1 year ago
Xu Kai 77a9328304 [inference] add llama2 support (#4898) 1 year ago
Jianghai 013a4bedf0 [inference]fix import bug and delete down useless init (#4830) 1 year ago
Jianghai ce7ade3882 [inference] chatglm2 infer demo (#4724) 1 year ago