InternLM/chat
Wenwei Zhang 468982bc76
[Doc]: Resolve comments in documentation (#587)
2024-01-17 10:47:06 +08:00

Chat

English | 简体中文

This document briefly shows how to use Transformers, ModelScope, and Web demos to conduct inference with InternLM2-Chat.

You can also learn more about the chatml format and how to use LMDeploy for inference and model serving.
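The chatml format mentioned above wraps each conversation turn in special markers before the text reaches the model. As an illustration, here is a minimal sketch of assembling such a prompt by hand; the marker strings and turn layout are assumptions based on the InternLM2 chat format documentation, so verify them against chat_format.md for your model version (in normal use, `model.chat()` does this for you):

```python
def build_chatml_prompt(query, history=(), system="You are a helpful assistant."):
    """Assemble an InternLM2-style chatml prompt.

    history is a sequence of (user, assistant) string pairs, matching the
    shape of the history returned by model.chat(). The <|im_start|> and
    <|im_end|> markers follow the chat format docs (an assumption here).
    """
    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]
    for user_turn, assistant_turn in history:
        parts.append(f"<|im_start|>user\n{user_turn}<|im_end|>\n")
        parts.append(f"<|im_start|>assistant\n{assistant_turn}<|im_end|>\n")
    # Append the new query and open an assistant turn for the model to complete.
    parts.append(f"<|im_start|>user\n{query}<|im_end|>\n<|im_start|>assistant\n")
    return "".join(parts)
```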

Import from Transformers

To load the InternLM2 7B Chat model using Transformers, use the following code:

>>> from transformers import AutoTokenizer, AutoModelForCausalLM
>>> tokenizer = AutoTokenizer.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True)
>>> model = AutoModelForCausalLM.from_pretrained("internlm/internlm2-chat-7b", trust_remote_code=True).cuda()
>>> model = model.eval()
>>> response, history = model.chat(tokenizer, "hello", history=[])
>>> print(response)
Hello! How can I help you today?
>>> response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
>>> print(response)
Sure, here are three tips for effective time management:

1. Prioritize tasks based on importance and urgency: Make a list of all your tasks and categorize them into "important and urgent," "important but not urgent," and "not important but urgent." Focus on completing the tasks in the first category before moving on to the others.
2. Use a calendar or planner: Write down deadlines and appointments in a calendar or planner so you don't forget them. This will also help you schedule your time more effectively and avoid overbooking yourself.
3. Minimize distractions: Try to eliminate any potential distractions when working on important tasks. Turn off notifications on your phone, close unnecessary tabs on your computer, and find a quiet place to work if possible.

Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
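As the snippet above shows, `model.chat()` returns the reply together with an updated `history` (a list of (query, response) pairs) that you pass back in on the next turn. A minimal multi-turn loop built on that contract might look like the sketch below; `chat_fn` is a hypothetical stand-in for `model.chat`, which in the real call also takes the tokenizer:

```python
def run_turns(chat_fn, queries):
    """Drive a multi-turn conversation, threading history through each call.

    chat_fn(query, history) -> (response, new_history), mirroring the
    (response, history) contract of InternLM2's model.chat().
    """
    history = []
    for query in queries:
        response, history = chat_fn(query, history)
        print(response)
    return history

# Usage with the real model (sketch):
#   run_turns(lambda q, h: model.chat(tokenizer, q, history=h),
#             ["hello", "please provide three suggestions about time management"])
```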

Import from ModelScope

To load the InternLM2 7B Chat model using ModelScope, use the following code:

from modelscope import snapshot_download, AutoTokenizer, AutoModelForCausalLM
import torch
model_dir = snapshot_download('Shanghai_AI_Laboratory/internlm2-chat-7b')
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto", trust_remote_code=True, torch_dtype=torch.float16)
model = model.eval()
response, history = model.chat(tokenizer, "hello", history=[])
print(response)
response, history = model.chat(tokenizer, "please provide three suggestions about time management", history=history)
print(response)

Dialogue

You can interact with the InternLM2-Chat-7B model through a frontend interface by running the following commands:

pip install streamlit==1.24.0
pip install transformers==4.30.2
streamlit run ./chat/web_demo.py

The effect is similar to the demo below:

(demo screenshot)