diff --git a/README.md b/README.md
index a3647e3..c223e15 100644
--- a/README.md
+++ b/README.md
@@ -218,6 +218,20 @@ print(response)
 
 Please refer to the [guidance](./chat/lmdeploy.md) for more usage examples of model deployment. For additional deployment tutorials, feel free to explore [here](https://github.com/InternLM/LMDeploy).
 
+### 200K Long-Context Inference
+
+By enabling the Dynamic NTK feature of LMDeploy, you can extrapolate internlm2-chat-7b to a 200K context and perform long-context inference.
+
+```python
+from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig
+
+backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=200000)
+pipe = pipeline('internlm/internlm2-chat-7b', backend_config=backend_config)
+prompt = 'Use a long prompt to replace this sentence'
+response = pipe(prompt)
+print(response)
+```
+
 ## Agent
 
 InternLM2-Chat models have excellent tool utilization capabilities and can work with function calls in a zero-shot manner. See more examples in the [agent section](./agent/).
diff --git a/README_zh-CN.md b/README_zh-CN.md
index bec34f1..6cd17e6 100644
--- a/README_zh-CN.md
+++ b/README_zh-CN.md
@@ -214,6 +214,21 @@ print(response)
 
 Please refer to the [deployment guide](./chat/lmdeploy.md) for more usage examples; further deployment tutorials can be found [here](https://github.com/InternLM/LMDeploy).
 
+
+### 200K Ultra-Long-Context Inference
+
+By enabling the Dynamic NTK capability of LMDeploy, internlm2-chat-7b can easily be extrapolated to a 200K context.
+
+```python
+from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig
+
+backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=160000)
+pipe = pipeline('internlm/internlm2-chat-7b', backend_config=backend_config)
+prompt = 'Use a long prompt to replace this sentence'
+response = pipe(prompt)
+print(response)
+```
+
 ## Fine-tuning & Training
 
 Please refer to the [fine-tuning tutorials](./finetune/) to try continued pre-training or fine-tuning of InternLM2.
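
Note: both added snippets import `GenerationConfig` without using it. As a minimal, hypothetical sketch of how that import could be put to work alongside the Dynamic NTK setup (the sampling values below are illustrative assumptions, not part of this diff), LMDeploy's `pipeline` call accepts a `gen_config` argument:

```python
from lmdeploy import pipeline, GenerationConfig, TurbomindEngineConfig

# Enable Dynamic NTK scaling so the session can cover a very long context.
backend_config = TurbomindEngineConfig(rope_scaling_factor=2.0, session_len=200000)
pipe = pipeline('internlm/internlm2-chat-7b', backend_config=backend_config)

# Illustrative sampling settings; these values are assumptions, tune as needed.
gen_config = GenerationConfig(max_new_tokens=1024, top_p=0.8, temperature=0.8)
response = pipe('Use a long prompt to replace this sentence', gen_config=gen_config)
print(response)
```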