mirror of https://github.com/hpcaitech/ColossalAI
Merge dfefbdc8ff into b9e60559b8 (commit 22cceed4b3)
@@ -172,12 +172,14 @@ distributed training and inference in a few lines.

[[blog]](https://www.hpc-ai.tech/blog/one-half-day-of-training-using-a-few-hundred-dollars-yields-similar-results-to-mainstream-large-models-open-source-and-commercial-free-domain-specific-llm-solution)
[[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base)
[[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary)
[[openMind_Hub model weights]](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-7B-base)

- 13B: Construct a refined 13B private model with just $5000 USD.
[[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Colossal-LLaMA-2)
[[blog]](https://hpc-ai.com/blog/colossal-llama-2-13b)
[[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-13b-base)
[[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-13b-base/summary)
[[openMind_Hub model weights]](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-13B-base)

| Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot) | AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
| :-----------------------------: | :--------: | :-------------: | :------------------: | :-----------: | :--------------: | :-------------: | :-------------: |
@@ -25,6 +25,7 @@ Colossal-LLaMA

- [Inference](#inference)
  - [Import from HuggingFace](#import-from-huggingface)
  - [Import from Modelscope](#import-from-modelscope)
  - [Import from openMind_Hub](#import-from-openmind_hub)
  - [Quick Start](#quick-start)
- [Usage](#usage)
  - [Install](#install)
@@ -259,7 +260,30 @@ inputs = inputs.to('cuda:0')
output = model.generate(**inputs, **generation_kwargs)
print(tokenizer.decode(output.cpu()[0], skip_special_tokens=True)[len(input):])
```
You can download model weights from [🤗HuggingFace](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base) or [👾Modelscope](https://modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary).

#### Import from openMind_Hub
You can also load our model from openMind_Hub with the following code:
```Python
from openmind import AutoModelForCausalLM, AutoTokenizer
from openmind_hub import snapshot_download
# Download whichever checkpoint you need; the second call overwrites model_dir, so keep only one.
# Colossal-LLaMA-2-7B-base
model_dir = snapshot_download('HPCAITECH/Colossal-LLaMA-2-7B-base')
# Colossal-LLaMA-2-13B-base
model_dir = snapshot_download('HPCAITECH/Colossal-LLaMA-2-13B-base')

tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True).eval()
generation_kwargs = {
    "max_new_tokens": 256,
    "top_p": 0.95,
    "temperature": 0.3,
}

input = '明月松间照,\n\n->\n\n'
inputs = tokenizer(input, return_token_type_ids=False, return_tensors='pt')
inputs = inputs.to('cuda:0')
output = model.generate(**inputs, **generation_kwargs)
# Decode and strip the prompt so only the newly generated text is printed.
print(tokenizer.decode(output.cpu()[0], skip_special_tokens=True)[len(input):])
```
You can download model weights from [🤗HuggingFace](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base) or [👾Modelscope](https://modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary) or [openMind_Hub](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-7B-base).
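
If you want to fetch the HuggingFace weights ahead of time (for offline use, for example), the snippet below is a minimal sketch using `huggingface_hub.snapshot_download`; the repo id is taken from the link above, and this snippet is our addition rather than part of the official example.

```Python
# Minimal sketch (assumption, not part of the official example): pre-download the
# HuggingFace checkpoint so from_pretrained can later be pointed at a local folder.
from huggingface_hub import snapshot_download

# Repo id taken from the HuggingFace link above; use the 13b repo id for the larger model.
model_dir = snapshot_download("hpcai-tech/Colossal-LLaMA-2-7b-base")
print(model_dir)  # local path usable with AutoTokenizer / AutoModelForCausalLM.from_pretrained
```

The returned directory can then be passed to the same loading and generation code shown in the examples above.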
#### Quick Start
You can run [`inference_example.py`](inference_example.py) to quickly start inference with our base model; it loads the model weights from HuggingFace.
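
If you would rather not clone the repository, the sketch below approximates what the quick-start flow does, loading the weights directly from the HuggingFace hub id shown above. It is an approximation on our part; the exact arguments accepted by `inference_example.py` may differ.

```Python
# Rough equivalent of the quick start (an approximation, not the script itself):
# load the base model straight from the HuggingFace hub id and generate once.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hpcai-tech/Colossal-LLaMA-2-7b-base"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", trust_remote_code=True).eval()

prompt = '明月松间照,\n\n->\n\n'
inputs = tokenizer(prompt, return_token_type_ids=False, return_tensors='pt').to(model.device)
# do_sample=True is added here so that top_p/temperature take effect.
output = model.generate(**inputs, max_new_tokens=256, do_sample=True, top_p=0.95, temperature=0.3)
print(tokenizer.decode(output[0], skip_special_tokens=True)[len(prompt):])
```
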
@@ -151,6 +151,7 @@ Colossal-AI provides you with a series of parallel components. Our goal is to make your

[[blog]](https://hpc-ai.com/blog/colossal-llama-2-13b)
[[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-13b-base)
[[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-13b-base/summary)
[[openMind_Hub model weights]](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-13B-base)

| Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot) | AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
|:------------------------------:|:----------:|:---------------:|:-------------:|:--------------:|:----------------:|:---------------:|:--------------:|