Commit Graph

3027 Commits (249644c23b0402ccf9d0908f13ed15b41b95145f)
 

Author SHA1 Message Date
FrankLeeeee 1ded7e81ef [git] fixed rebased files
11 months ago
Yuanheng Zhao 1513f20f4d [kernel] Add flash decoding triton kernel for blocked kv cache (#5249)
11 months ago
Jianghai fded91d049 [Inference] Kernel: no pad rotary embedding (#5252)
11 months ago
yuehuayingxueluo d40eb26029 fix bugs in request_handler.py and engine.py
11 months ago
yuehuayingxueluo 10e3c9f923 rm torch.cuda.synchronize
11 months ago
yuehuayingxueluo fab294c7f4 fix CI bugs
11 months ago
yuehuayingxueluo 2a73e828eb fix bugs related to processing padding mask
11 months ago
Jianghai e545a871b8 [Hotfix] Fix accuracy and align attention method api with Triton kernel (#5229)
11 months ago
yuehuayingxueluo fa4fbdbffb adapted to pad_context_forward
11 months ago
yuehuayingxueluo 47e53eaa1c fix bugs in attention.py and request_handler.py
11 months ago
Jianghai bfd9b1b494 [Inference] Pytorch Attention func, pad&nopad input support (#5219)
11 months ago
yuehuayingxueluo 3ad1f3b78b fix beam_width
11 months ago
yuehuayingxueluo b2eb9cd186 Fixed a typo
11 months ago
yuehuayingxueluo bbfebfb9fc fix bugs in sampler
11 months ago
yuehuayingxueluo 02c1bf8b2a add context_attention_unpadded
11 months ago
Yuanheng Zhao 07b5283b6a [kernel] Add triton kernel for context attention (FAv2) without padding (#5192)
11 months ago
yuehuayingxueluo 4df8876fca Fixed a writing error
11 months ago
yuehuayingxueluo 9489dc64d8 precision alignment
11 months ago
yuehuayingxueluo 62968588d1 fix bugs in request_handler
11 months ago
yuehuayingxueluo 62fd08ee44 Fixed a bug in the inference frame
11 months ago
yuehuayingxueluo 86853a37d5 Add padding llama model
11 months ago
Jianghai 0e616462a7 [Inference] add logit processor and request handler (#5166)
11 months ago
yuehuayingxueluo 8daee26989 [Inference] Add the logic of the inference engine (#5173)
11 months ago
Jianghai 93aeacca34 [Inference]Update inference config and fix test (#5178)
11 months ago
Yuanheng Zhao 3de2e62299 [Inference] Add CacheBlock and KV-Cache Manager (#5156)
11 months ago
yuehuayingxueluo fab9b931d9 [Inference]Add BatchInferState, Sequence and InferConfig (#5149)
11 months ago
Yuanheng Zhao 2bb92243d4 [Inference/NFC] Clean outdated inference tests and deprecated kernels (#5159)
11 months ago
Jianghai 56e75eeb06 [Inference] Add readme (roadmap) and fulfill request handler (#5147)
11 months ago
Jianghai 4cf4682e70 [Inference] First PR for rebuild colossal-infer (#5143)
11 months ago
binmakeswell c174c4fc5f [doc] fix doc typo (#5256)
11 months ago
flybird11111 e830ef917d [ci] fix shardformer tests. (#5255)
11 months ago
digger yu 756c400ad2 fix typo in applications/ColossalEval/README.md (#5250)
11 months ago
Frank Lee 2b83418719 [ci] fixed ddp test (#5254)
11 months ago
Frank Lee d5eeeb1416 [ci] fixed booster test (#5251)
11 months ago
Frank Lee edf94a35c3 [workflow] fixed build CI (#5240)
11 months ago
digger yu 41e52c1c6e [doc] fix typo in Colossal-LLaMA-2/README.md (#5247)
11 months ago
Frank Lee 9102d655ab [hotfix] removed unused flag (#5242)
11 months ago
Hongxin Liu d202cc28c0 [npu] change device to accelerator api (#5239)
11 months ago
Elsa Granger d565df3821 [pipeline] A more general _communicate in p2p (#5062)
11 months ago
Xuanlei Zhao dd2c28a323 [npu] use extension for op builder (#5172)
11 months ago
binmakeswell 7bc6969ce6 [doc] SwiftInfer release (#5236)
11 months ago
github-actions[bot] 4fb4a22a72 [format] applied code formatting on changed files in pull request 5234 (#5235)
11 months ago
binmakeswell b9b32b15e6 [doc] add Colossal-LLaMA-2-13B (#5234)
11 months ago
JIMMY ZHAO ce651270f1 [doc] Make leaderboard format more uniform and good-looking (#5231)
11 months ago
Camille Zhong 915b4652f3 [doc] Update README.md of Colossal-LLAMA2 (#5233)
11 months ago
Tong Li d992b55968 [Colossal-LLaMA-2] Release Colossal-LLaMA-2-13b-base model (#5224)
11 months ago
digger yu b0b53a171c [nfc] fix typo colossalai/shardformer/ (#5133)
11 months ago
flybird11111 451e9142b8 fix flash attn (#5209)
11 months ago
flybird11111 365671be10 fix-test (#5210)
11 months ago
Hongxin Liu 7f3400b560 [devops] update torch versoin in ci (#5217)
11 months ago