diff --git a/README.md b/README.md
index c0dc10fdf..289cc331c 100644
--- a/README.md
+++ b/README.md
@@ -172,12 +172,14 @@ distributed training and inference in a few lines.
 [[blog]](https://www.hpc-ai.tech/blog/one-half-day-of-training-using-a-few-hundred-dollars-yields-similar-results-to-mainstream-large-models-open-source-and-commercial-free-domain-specific-llm-solution)
 [[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base)
 [[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary)
+[[openMind_Hub model weights]](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-7B-base)
 - 13B: Construct refined 13B private model with just $5000 USD. [[code]](https://github.com/hpcaitech/ColossalAI/tree/main/applications/Colossal-LLaMA-2)
 [[blog]](https://hpc-ai.com/blog/colossal-llama-2-13b)
 [[HuggingFace model weights]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-13b-base)
 [[Modelscope model weights]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-13b-base/summary)
+[[openMind_Hub model weights]](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-13B-base)

 | Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot)| AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
 | :-----------------------------: | :--------: | :-------------: | :------------------: | :-----------: | :--------------: | :-------------: | :-------------: |
diff --git a/applications/Colossal-LLaMA/README.md b/applications/Colossal-LLaMA/README.md
index e62b14390..a6e0c1913 100644
--- a/applications/Colossal-LLaMA/README.md
+++ b/applications/Colossal-LLaMA/README.md
@@ -25,6 +25,7 @@ Colossal-LLaMA
   - [Inference](#inference)
     - [Import from HuggingFace](#import-from-huggingface)
     - [Import from Modelscope](#import-from-modelscope)
+    - [Import from openMind_Hub](#import-from-openmind_hub)
   - [Quick Start](#quick-start)
 - [Usage](#usage)
   - [Install](#install)
@@ -259,7 +260,30 @@ inputs = inputs.to('cuda:0')
 output = model.generate(**inputs, **generation_kwargs)
 print(tokenizer.decode(output.cpu()[0], skip_special_tokens=True)[len(input):])
 ```
-You can download model weights from [🤗HuggingFace](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base) or [👾Modelscope](https://modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary).
+#### Import from openMind_Hub
+You can also load our model from openMind_Hub with the following code:
+```Python
+from openmind import AutoModelForCausalLM, AutoTokenizer
+from openmind_hub import snapshot_download
+# Colossal-LLaMA-2-7B-base
+model_dir = snapshot_download('HPCAITECH/Colossal-LLaMA-2-7B-base')
+# Colossal-LLaMA-2-13B-base
+model_dir = snapshot_download('HPCAITECH/Colossal-LLaMA-2-13B-base')
+
+tokenizer = AutoTokenizer.from_pretrained(model_dir, device_map="auto", trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True).eval()
+generation_kwargs = {"max_new_tokens": 256,
+                     "top_p": 0.95,
+                     "temperature": 0.3
+                     }
+
+input = '明月松间照,\n\n->\n\n'
+inputs = tokenizer(input, return_token_type_ids=False, return_tensors='pt')
+inputs = inputs.to('cuda:0')
+output = model.generate(**inputs, **generation_kwargs)
+print(tokenizer.decode(output.cpu()[0], skip_special_tokens=True)[len(input):])
+```
+You can download model weights from [🤗HuggingFace](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-7b-base), [👾Modelscope](https://modelscope.cn/models/colossalai/Colossal-LLaMA-2-7b-base/summary), or [openMind_Hub](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-7B-base).
 #### Quick Start
 You can run [`inference_example.py`](inference_example.py) to quickly start the inference of our base model by loading model weights from HF.
diff --git a/docs/README-zh-Hans.md b/docs/README-zh-Hans.md
index 0e175afb0..729f2e6ac 100644
--- a/docs/README-zh-Hans.md
+++ b/docs/README-zh-Hans.md
@@ -151,6 +151,7 @@ Colossal-AI 为您提供了一系列并行组件。我们的目标是让您的
 [[博客]](https://hpc-ai.com/blog/colossal-llama-2-13b)
 [[HuggingFace 模型权重]](https://huggingface.co/hpcai-tech/Colossal-LLaMA-2-13b-base)
 [[Modelscope 模型权重]](https://www.modelscope.cn/models/colossalai/Colossal-LLaMA-2-13b-base/summary)
+[[openMind_Hub 模型权重]](https://modelers.cn/models/HPCAITECH/Colossal-LLaMA-2-13B-base)

 | Model | Backbone | Tokens Consumed | MMLU (5-shot) | CMMLU (5-shot) | AGIEval (5-shot) | GAOKAO (0-shot) | CEval (5-shot) |
 |:------------------------------:|:----------:|:---------------:|:-------------:|:--------------:|:----------------:|:---------------:|:--------------:|
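Note on the inference examples in this patch: they strip the echoed prompt from the decoded output by character-slicing with `[len(input):]`. A minimal standalone sketch of that pattern (the strings here are hypothetical, no model required):

```python
def strip_prompt(decoded: str, prompt: str) -> str:
    # The decoded text begins with the echoed prompt; dropping the first
    # len(prompt) characters keeps only the generated continuation.
    return decoded[len(prompt):]

# Hypothetical decoded output: the prompt followed by a generated line.
prompt = '明月松间照,\n\n->\n\n'
decoded = prompt + '清泉石上流。'
print(strip_prompt(decoded, prompt))  # → 清泉石上流。
```

This character-level slice works only when `tokenizer.decode(..., skip_special_tokens=True)` reproduces the prompt verbatim; slicing the output ids at the token level (e.g. `output[0][inputs['input_ids'].shape[1]:]`) is the more robust variant when decoding may normalize whitespace or special characters.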