
# Colossal-Infer

## Introduction

Colossal-Infer is a library for the inference of LLMs and MLMs. It is built on top of Colossal-AI.

## Structures

### Overview

The main design will be released later.

## Roadmap

- [ ] Design of structures
- [ ] Core components
  - [ ] Engine
  - [ ] Request handler
  - [ ] KV cache manager
  - [ ] Modeling
  - [ ] Custom layers
  - [ ] Online server
- [ ] Supported models
  - [ ] llama2
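
To illustrate how the core components above could fit together, here is a minimal, self-contained sketch of a request handler cooperating with a KV cache manager. All class and method names here are hypothetical and are not taken from Colossal-Infer's actual API; it only shows the general admission-control pattern (a request runs only once cache blocks are available for it).

```python
# Hypothetical sketch of the roadmap's core components; names are illustrative,
# not Colossal-Infer's real API.
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Request:
    """A single generation request tracked by the engine."""
    request_id: int
    prompt_tokens: List[int]
    output_tokens: List[int] = field(default_factory=list)


class KVCacheManager:
    """Hands out fixed-size KV cache blocks to running requests."""

    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.allocated: Dict[int, List[int]] = {}

    def allocate(self, request_id: int, num_needed: int) -> bool:
        if len(self.free_blocks) < num_needed:
            return False  # not enough free cache; the request must wait
        self.allocated[request_id] = [self.free_blocks.pop() for _ in range(num_needed)]
        return True

    def free(self, request_id: int) -> None:
        # Return a finished request's blocks to the free pool.
        self.free_blocks.extend(self.allocated.pop(request_id, []))


class RequestHandler:
    """Admits waiting requests whenever the cache manager has room."""

    def __init__(self, cache: KVCacheManager):
        self.cache = cache
        self.waiting: List[Request] = []
        self.running: List[Request] = []

    def add_request(self, req: Request) -> None:
        self.waiting.append(req)

    def schedule(self) -> List[Request]:
        # Move requests from waiting to running while cache blocks last.
        still_waiting = []
        for req in self.waiting:
            if self.cache.allocate(req.request_id, num_needed=1):
                self.running.append(req)
            else:
                still_waiting.append(req)
        self.waiting = still_waiting
        return self.running
```

Under this sketch, the engine would call `schedule()` each step, run a forward pass for the returned batch, and call `free()` once a request finishes so that queued requests can be admitted.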