diff --git a/README-ja-JP.md b/README-ja-JP.md
index aeb7b02..24918cb 100644
--- a/README-ja-JP.md
+++ b/README-ja-JP.md
@@ -22,7 +22,6 @@
 [🛠️インストール](./doc/en/install.md) |
 [📊トレーニングパフォーマンス](./doc/en/train_performance.md) |
 [👀モデル](#model-zoo) |
-[🤗HuggingFace](https://huggingface.co/internlm) |
 [🆕更新ニュース](./CHANGE_LOG.md) |
 [🤔Issues 報告](https://github.com/InternLM/InternLM/issues/new)
 
@@ -103,6 +102,22 @@ Transformers を使用して InternLM 7B チャットモデルをロードする
 これらの提案を実践することで、時間管理のスキルを向上させ、効果的に日々のタスクをこなしていくことができます。
 ```
 
+ストリーミング生成を行いたい場合は、`stream_chat` 関数を使用できます。
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_path = "internlm/internlm-chat-7b"
+model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+model = model.eval()
+length = 0
+for response, history in model.stream_chat(tokenizer, "你好", history=[]):
+    print(response[length:], flush=True, end="")
+    length = len(response)
+```
+
 ### 対話
 
 以下のコードを実行することで、フロントエンドインターフェースを通して InternLM Chat 7B モデルと対話することができます:
diff --git a/README-zh-Hans.md b/README-zh-Hans.md
index 67946ea..8802bd2 100644
--- a/README-zh-Hans.md
+++ b/README-zh-Hans.md
@@ -22,7 +22,7 @@
 [🛠️安装教程](./doc/install.md) |
 [📊训练性能](./doc/train_performance.md) |
 [👀模型库](#model-zoo) |
-[🤗HuggingFace](https://huggingface.co/internlm) |
+[🤗HuggingFace](https://huggingface.co/spaces/internlm/InternLM-Chat-7B) |
 [🆕Update News](./CHANGE_LOG.md) |
 [🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)
 
@@ -178,6 +178,22 @@ InternLM-7B 包含了一个拥有70亿参数的基础模型和一个为实际场
 3. 集中注意力：避免分心，集中注意力完成任务。关闭社交媒体和电子邮件通知，专注于任务，这将帮助您更快地完成任务，并减少错误的可能性。
 ```
 
+如果想进行流式生成，则可以使用 `stream_chat` 接口:
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_path = "internlm/internlm-chat-7b"
+model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+model = model.eval()
+length = 0
+for response, history in model.stream_chat(tokenizer, "你好", history=[]):
+    print(response[length:], flush=True, end="")
+    length = len(response)
+```
+
 ### 通过 ModelScope 加载
 
 通过以下的代码从 ModelScope 加载 InternLM 模型 (可修改模型名称替换不同的模型)
diff --git a/README.md b/README.md
index c3e4286..b05f7e6 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@
 [🛠️Installation](./doc/en/install.md) |
 [📊Train Performance](./doc/en/train_performance.md) |
 [👀Model](#model-zoo) |
-[🤗HuggingFace](https://huggingface.co/internlm) |
+[🤗HuggingFace](https://huggingface.co/spaces/internlm/InternLM-Chat-7B) |
 [🆕Update News](./CHANGE_LOG.md) |
 [🤔Reporting Issues](https://github.com/InternLM/InternLM/issues/new)
 
@@ -175,6 +175,22 @@ Sure, here are three tips for effective time management:
 Remember, good time management skills take practice and patience. Start with small steps and gradually incorporate these habits into your daily routine.
 ```
 
+The responses can be streamed using `stream_chat`:
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_path = "internlm/internlm-chat-7b"
+model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
+
+model = model.eval()
+length = 0
+for response, history in model.stream_chat(tokenizer, "你好", history=[]):
+    print(response[length:], flush=True, end="")
+    length = len(response)
+```
+
 ### Import from ModelScope
 
 To load the InternLM model using ModelScope, use the following code:
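The streaming loops added above slice with `response[length:]`, which implies that each iteration of `stream_chat` yields the full response accumulated so far, not just the new tokens. A minimal sketch of that delta-printing pattern, using a hypothetical mock generator in place of the model (the `(response, history)` tuple shape is taken from the loops above; `mock_stream_chat` and `collect_deltas` are illustrative names, not part of the InternLM API):

```python
def mock_stream_chat(chunks):
    """Yield (response_so_far, history) tuples, mimicking the shape
    of model.stream_chat: each yield carries the full text so far."""
    response = ""
    for chunk in chunks:
        response += chunk
        yield response, []

def collect_deltas(stream):
    """Replicate the README loop, collecting the incremental pieces
    that print(response[length:], ...) would emit."""
    deltas = []
    length = 0
    for response, _history in stream:
        deltas.append(response[length:])
        length = len(response)
    return deltas

print(collect_deltas(mock_stream_chat(["Hel", "lo", ", world"])))
# → ['Hel', 'lo', ', world']
```

Concatenating the deltas reproduces the final response exactly, which is why the loop can print each slice with `end=""` and still render the complete answer.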