ColossalAI/colossalai/inference
yuehuayingxueluo 8daee26989 [Inference] Add the logic of the inference engine (#5173)
* add infer_struct and infer_config

* update codes

* change InferConfig

* Add hf_model_config to the engine

* rm _get_hf_model_config

* update codes

* made adjustments according to reviewer feedback

* update codes

* add ci test for config and struct

* Add the logic of the inference engine

* update engine and test

* Recover cache_manager.py

* add logger

* fix conflict

* update codes

* update codes

* update model and tokenizer

* fix add the logic about shardformer

* change kvcache_manager docstring

* add policy

* fix ci bug in test_kvcache_manager.py

* remove codes related to tokenizer and move model_policy

* fix code style

* add ordered_set to requirements-infer.txt

* Delete extra empty lines

* add ordered_set to requirements-test.txt
2024-01-11 13:39:56 +00:00
core [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
kv_cache [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
modeling/policy [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
__init__.py [Inference] First PR for rebuild colossal-infer (#5143) 2024-01-11 13:39:29 +00:00
config.py [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
readme.md [Inference]Update inference config and fix test (#5178) 2024-01-11 13:39:29 +00:00
struct.py [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00

readme.md

Colossal-Infer

Introduction

Colossal-Infer is a library for the inference of LLMs and MLMs, built on top of Colossal-AI.

Structures

Overview

The overall design will be released later.
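Until the design is published, here is a rough sketch of how the pieces in this directory (config.py, struct.py, and the core engine) might fit together. All names and fields below are illustrative assumptions, not the actual API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch -- field names are assumptions, not the real config.py API.
@dataclass
class InferenceConfig:
    max_batch_size: int = 8
    max_input_len: int = 256
    max_output_len: int = 256
    dtype: str = "fp16"

# Minimal stand-in for the per-request struct the engine would track (struct.py).
@dataclass
class Sequence:
    request_id: int
    input_ids: list = field(default_factory=list)
    output_ids: list = field(default_factory=list)

    @property
    def total_len(self) -> int:
        # Prompt tokens plus tokens generated so far.
        return len(self.input_ids) + len(self.output_ids)

config = InferenceConfig()
seq = Sequence(request_id=0, input_ids=[1, 2, 3])
print(seq.total_len)  # 3
```

In a design like this, the engine would hold the config, batch `Sequence` objects from the request handler, and run model forward passes until each sequence finishes or hits `max_output_len`.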

Roadmap

  • [ ] design of structures
  • [ ] Core components
    • [ ] engine
    • [ ] request handler
    • [ ] kv cache manager
    • [ ] modeling
    • [ ] custom layers
    • [ ] online server
  • [ ] supported models
    • [ ] llama2
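The kv cache manager in the roadmap typically hands out fixed-size blocks of cache memory per sequence rather than one contiguous buffer. A self-contained toy illustration of that bookkeeping follows; it is not the actual kv_cache implementation, and the block size and free-list strategy are assumptions for illustration only:

```python
class BlockKVCacheManager:
    """Toy allocator illustrating block-based KV cache bookkeeping.

    Not the real kv_cache manager -- block_size and the free-list
    strategy here are assumptions for illustration only.
    """

    def __init__(self, num_blocks: int, block_size: int = 16):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))
        self.table = {}  # sequence id -> list of allocated block ids

    def blocks_needed(self, num_tokens: int) -> int:
        # Ceiling division: block_size tokens fit per block.
        return -(-num_tokens // self.block_size)

    def allocate(self, seq_id: int, num_tokens: int) -> list:
        n = self.blocks_needed(num_tokens)
        if n > len(self.free_blocks):
            raise RuntimeError("out of KV cache blocks")
        blocks = [self.free_blocks.pop() for _ in range(n)]
        self.table[seq_id] = blocks
        return blocks

    def free(self, seq_id: int) -> None:
        # Return a finished sequence's blocks to the free list.
        self.free_blocks.extend(self.table.pop(seq_id, []))

mgr = BlockKVCacheManager(num_blocks=4, block_size=16)
mgr.allocate(seq_id=0, num_tokens=33)  # 33 tokens -> 3 blocks of 16
print(len(mgr.free_blocks))            # 1
mgr.free(0)
print(len(mgr.free_blocks))            # 4
```

Block-based allocation avoids reserving `max_output_len` worth of cache up front for every request, which is what makes large batch sizes affordable during decoding.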