ColossalAI/colossalai/inference
Jianghai 56e75eeb06 [Inference] Add readme (roadmap) and fulfill request handler (#5147)
* request handler

* add readme

---------

Co-authored-by: CjhHa1 <cjh18671720497outlook.com>
2024-01-11 13:39:29 +00:00
core [Inference] Add readme (roadmap) and fulfill request handler (#5147) 2024-01-11 13:39:29 +00:00
__init__.py [Inference] First PR for rebuild colossal-infer (#5143) 2024-01-11 13:39:29 +00:00
config.py [Inference] Add readme (roadmap) and fulfill request handler (#5147) 2024-01-11 13:39:29 +00:00
readme.md [Inference] Add readme (roadmap) and fulfill request handler (#5147) 2024-01-11 13:39:29 +00:00
sequence.py [Inference] First PR for rebuild colossal-infer (#5143) 2024-01-11 13:39:29 +00:00

readme.md

Colossal-Infer

Introduction

Colossal-Infer is a library for inference with LLMs and MLMs, built on top of Colossal-AI.

Structures

Overview

Structure overview (Feishu whiteboard): https://n4fyd3ptax.feishu.cn/docx/MhlmdHsGkoeoslx9fqucPO17n9b?openbrd=1&doc_app_id=501&blockId=WCGBdWI9hobOEsxkW5uc8HM6n3b&blockType=whiteboard&blockToken=Cca3wKWk7hPnJxbkCX6cMxPQnqd#WCGBdWI9hobOEsxkW5uc8HM6n3b

Roadmap

  • [] Design of structures
  • [] Core components
    • [] engine
    • [] request handler
    • [] kv cache manager
    • [] modeling
    • [] custom layers
    • [] online server
  • [] Supported models
    • [] llama2