ColossalAI/colossalai/shardformer/modeling
yuehuayingxueluo f0aab7f9a8
Add Inference test for llama (#4508)
* add kv cache memory manager
* add state info during inference
* add
* add infer example
* finish
* finish
* format
* format
* rename file
* add kv cache test
* revise BatchInferState
* add inference test for llama
* fix conflict
* feature: add some new features for llama engine
* adapt colossalai triton interface
* change the parent class of llama policy
* add nvtx
* move llama inference code to tensor_parallel
* fix __init__.py
* rm tensor_parallel
* fix: fix bugs in auto_policy.py
* fix: rm some unused code
* mv colossalai/tpinference to colossalai/inference/tensor_parallel
* change __init__.py
* save change
* fix engine
* Bug fix: Fix hang
* remove llama_infer_engine.py

---------

Co-authored-by: yuanheng-zhao <jonathan.zhaoyh@gmail.com>
Co-authored-by: CjhHa1 <cjh18671720497@outlook.com>
2023-08-30 12:10:26 +08:00
Name          Last commit                                                            Date
chatglm2_6b   [pipeline] add chatglm (#4363)                                         2023-08-15 23:25:14 +08:00
__init__.py   [shardformer] added development protocol for standardization (#4149)   2023-07-04 16:05:01 +08:00
bert.py       [misc] resolve code factor issues (#4433)                              2023-08-15 23:25:14 +08:00
blip2.py      [shardformer] update shardformer to use flash attention 2 (#4392)      2023-08-15 23:25:14 +08:00
bloom.py      [misc] resolve code factor issues (#4433)                              2023-08-15 23:25:14 +08:00
chatglm.py    [misc] resolve code factor issues (#4433)                              2023-08-15 23:25:14 +08:00
gpt2.py       [misc] resolve code factor issues (#4433)                              2023-08-15 23:25:14 +08:00
jit.py        [Shardformer] Merge flash attention branch to pipeline branch (#4362)  2023-08-15 23:25:14 +08:00
llama.py      Add Inference test for llama (#4508)                                   2023-08-30 12:10:26 +08:00
opt.py        [misc] resolve code factor issues (#4433)                              2023-08-15 23:25:14 +08:00
sam.py        [Shardformer] Merge flash attention branch to pipeline branch (#4362)  2023-08-15 23:25:14 +08:00
t5.py         [misc] resolve code factor issues (#4433)                              2023-08-15 23:25:14 +08:00
vit.py        [misc] resolve code factor issues (#4433)                              2023-08-15 23:25:14 +08:00
whisper.py    [shardformer] update shardformer to use flash attention 2 (#4392)      2023-08-15 23:25:14 +08:00