15 Commits (1f8c7e70469191610d9536029f624b4f30db8caf)

Author | SHA1 | Message | Date
Jianghai | 1f8c7e7046 | [Inference] User Experience: update the logic of default tokenizer and generation config. (#5337) | 10 months ago
Frank Lee | 58740b5f68 | [inference] added inference template (#5375) | 10 months ago
Frank Lee | f8e456d202 | [inference] simplified config verification (#5346) | 10 months ago
yuehuayingxueluo | e8f0642f28 | [Inference]Add Nopadding Llama Modeling (#5327) | 10 months ago
yuehuayingxueluo | 4f28cb43c0 | [inference]Optimize the usage of the mid tensors space in flash attn (#5304) | 10 months ago
yuehuayingxueluo | d40eb26029 | fix bugs in request_handler.py and engine.py | 11 months ago
yuehuayingxueluo | fa4fbdbffb | adapted to pad_context_forward | 11 months ago
yuehuayingxueluo | 62fd08ee44 | Fixed a bug in the inference frame | 11 months ago
yuehuayingxueluo | 86853a37d5 | Add padding llama model | 11 months ago
Jianghai | 0e616462a7 | [Inference] add logit processor and request handler (#5166) | 11 months ago
yuehuayingxueluo | 8daee26989 | [Inference] Add the logic of the inference engine (#5173) | 11 months ago
Jianghai | 93aeacca34 | [Inference]Update inference config and fix test (#5178) | 11 months ago
yuehuayingxueluo | fab9b931d9 | [Inference]Add BatchInferState, Sequence and InferConfig (#5149) | 11 months ago
Jianghai | 56e75eeb06 | [Inference] Add readme (roadmap) and fulfill request handler (#5147) | 11 months ago
Jianghai | 4cf4682e70 | [Inference] First PR for rebuild colossal-infer (#5143) | 11 months ago