mirror of https://github.com/hpcaitech/ColossalAI
# Colossal-Infer

## Introduction

Colossal-Infer is a library for the inference of LLMs and MLMs. It is built on top of Colossal-AI.

## Structures

### Overview

The main design will be released later on.
## Roadmap

- [ ] design of structures
- [ ] Core components
  - [ ] engine
  - [ ] request handler
  - [ ] kv cache manager
  - [ ] modeling
  - [ ] custom layers
- [ ] online server
- [ ] supported models
  - [ ] llama2
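To illustrate how the planned components could relate, the sketch below wires a config, a kv cache manager, and an engine together. It is a minimal toy model, not the library's actual API: the class names (`InferenceConfig`, `KVCacheManager`, `InferenceEngine`), their fields, and the block-based allocation scheme are all assumptions made for illustration only.

```python
from dataclasses import dataclass

# Hypothetical sketch only -- these names and fields are NOT confirmed
# public APIs of Colossal-Infer; they mirror the roadmap's component list.

@dataclass
class InferenceConfig:
    max_batch_size: int = 8
    max_input_len: int = 256
    max_output_len: int = 128
    block_size: int = 16  # tokens held per KV-cache block (assumed)

class KVCacheManager:
    """Toy manager that allocates fixed-size cache blocks per sequence."""
    def __init__(self, config: InferenceConfig):
        self.block_size = config.block_size
        self.blocks = {}  # seq_id -> list of block ids

    def allocate(self, seq_id: int, num_tokens: int):
        # Ceiling division: enough blocks to cover all tokens.
        num_blocks = -(-num_tokens // self.block_size)
        self.blocks[seq_id] = list(range(num_blocks))
        return self.blocks[seq_id]

class InferenceEngine:
    """Toy engine that ties the config and cache manager together."""
    def __init__(self, config: InferenceConfig):
        self.config = config
        self.cache = KVCacheManager(config)

    def add_request(self, seq_id: int, prompt_len: int):
        # A real engine would also run the request handler and model;
        # here we only reserve cache blocks for the prompt.
        return self.cache.allocate(seq_id, prompt_len)

engine = InferenceEngine(InferenceConfig())
print(engine.add_request(0, 40))  # 40 tokens at block_size 16 -> 3 blocks
```

The block-granular allocation shown here is one common design for serving engines (it bounds fragmentation when sequences grow); whether Colossal-Infer adopts it is not stated in this readme.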