ColossalAI/colossalai/inference
yuehuayingxueluo 8daee26989 [Inference] Add the logic of the inference engine (#5173)
* add infer_struct and infer_config

* update codes

* change InferConfig

* Add hf_model_config to the engine

* rm _get_hf_model_config

* update codes

* made adjustments according to reviewer feedback

* update codes

* add ci test for config and struct

* Add the logic of the inference engine

* update engine and test

* Recover cache_manager.py

* add logger

* fix conflict

* update codes

* update codes

* update model and tokenizer

* fix add the logic about shardformer

* change kvcache_manager docstring

* add policy

* fix ci bug in test_kvcache_manager.py

* remove codes related to tokenizer and move model_policy

* fix code style

* add ordered_set to requirements-infer.txt

* Delete extra empty lines

* add ordered_set to requirements-test.txt
2024-01-11 13:39:56 +00:00
core [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
kv_cache [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
modeling/policy [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
__init__.py [Inference] First PR for rebuild colossal-infer (#5143) 2024-01-11 13:39:29 +00:00
config.py [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00
readme.md [Inference]Update inference config and fix test (#5178) 2024-01-11 13:39:29 +00:00
struct.py [Inference] Add the logic of the inference engine (#5173) 2024-01-11 13:39:56 +00:00

readme.md

Colossal-Infer

Introduction

Colossal-Infer is a library for the inference of LLMs and MLMs, built on top of Colossal-AI.

Structures

Overview

The overall design will be released later.
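Until the design is published, here is a rough sketch of how the pieces in this directory (config.py, struct.py, and the core engine) might fit together. All names and fields below are illustrative assumptions, not the actual API:

```python
from dataclasses import dataclass, field

# Hypothetical sketch -- field names are assumptions, not the real config.py API.
@dataclass
class InferenceConfig:
    max_batch_size: int = 8
    max_input_len: int = 256
    max_output_len: int = 256
    dtype: str = "fp16"

# Minimal stand-in for the per-request struct the engine would track (struct.py).
@dataclass
class Sequence:
    request_id: int
    input_ids: list = field(default_factory=list)
    output_ids: list = field(default_factory=list)

    @property
    def total_len(self) -> int:
        # Prompt tokens plus tokens generated so far.
        return len(self.input_ids) + len(self.output_ids)

config = InferenceConfig()
seq = Sequence(request_id=0, input_ids=[1, 2, 3])
print(seq.total_len)  # 3
```

In a design like this, the engine would hold the config, batch `Sequence` objects from the request handler, and run model forward passes until each sequence finishes or hits `max_output_len`.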

Roadmap

  • [ ] design of structures
  • [ ] Core components
    • [ ] engine
    • [ ] request handler
    • [ ] kv cache manager
    • [ ] modeling
    • [ ] custom layers
    • [ ] online server
  • [ ] supported models
    • [ ] llama2
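The kv cache manager in the roadmap typically hands out fixed-size blocks of cache memory per sequence rather than one contiguous buffer. A self-contained toy illustration of that bookkeeping follows; it is not the actual kv_cache implementation, and the block size and free-list strategy are assumptions for illustration only:

```python
class BlockKVCacheManager:
    """Toy allocator illustrating block-based KV cache bookkeeping.

    Not the real kv_cache manager -- block_size and the free-list
    strategy here are assumptions for illustration only.
    """

    def __init__(self, num_blocks: int, block_size: int = 16):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))
        self.table = {}  # sequence id -> list of allocated block ids

    def blocks_needed(self, num_tokens: int) -> int:
        # Ceiling division: block_size tokens fit per block.
        return -(-num_tokens // self.block_size)

    def allocate(self, seq_id: int, num_tokens: int) -> list:
        n = self.blocks_needed(num_tokens)
        if n > len(self.free_blocks):
            raise RuntimeError("out of KV cache blocks")
        blocks = [self.free_blocks.pop() for _ in range(n)]
        self.table[seq_id] = blocks
        return blocks

    def free(self, seq_id: int) -> None:
        # Return a finished sequence's blocks to the free list.
        self.free_blocks.extend(self.table.pop(seq_id, []))

mgr = BlockKVCacheManager(num_blocks=4, block_size=16)
mgr.allocate(seq_id=0, num_tokens=33)  # 33 tokens -> 3 blocks of 16
print(len(mgr.free_blocks))            # 1
mgr.free(0)
print(len(mgr.free_blocks))            # 4
```

Block-based allocation avoids reserving `max_output_len` worth of cache up front for every request, which is what makes large batch sizes affordable during decoding.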