From 40f8e2ed19464744f238ba35989d7a99dec4d3f9 Mon Sep 17 00:00:00 2001
From: ZwwWayne
Date: Wed, 17 Jan 2024 09:43:47 +0800
Subject: [PATCH] clean doc

---
 chat/lmdeploy.md       | 55 ------------------------------------------
 chat/lmdeploy_zh-CN.md | 55 ------------------------------------------
 2 files changed, 110 deletions(-)
 delete mode 100644 chat/lmdeploy.md
 delete mode 100644 chat/lmdeploy_zh-CN.md

diff --git a/chat/lmdeploy.md b/chat/lmdeploy.md
deleted file mode 100644
index 41439d2..0000000
--- a/chat/lmdeploy.md
+++ /dev/null
@@ -1,55 +0,0 @@
-# Inference by LMDeploy
-
-English | [简体中文](lmdeploy_zh-CN.md)
-
-[LMDeploy](https://github.com/InternLM/lmdeploy) is an efficient, user-friendly toolkit for compressing, deploying, and serving LLMs.
-
-This article highlights the basic usage of LMDeploy. For a comprehensive understanding of the toolkit, please refer to [the tutorials](https://lmdeploy.readthedocs.io/en/latest/).
-
-## Installation
-
-Install lmdeploy with pip (Python 3.8+):
-
-```shell
-pip install lmdeploy
-```
-
-## Offline batch inference
-
-With just four lines of code, you can run batch inference over a list of prompts:
-
-```python
-from lmdeploy import pipeline
-pipe = pipeline("internlm/internlm2-chat-7b")
-response = pipe(["Hi, pls intro yourself", "Shanghai is"])
-print(response)
-```
-
-With dynamic NTK scaling, LMDeploy can handle a context length of 200K tokens for `InternLM2`:
-
-```python
-from lmdeploy import pipeline, TurbomindEngineConfig
-engine_config = TurbomindEngineConfig(session_len=200000,
-                                      rope_scaling_factor=2.0)
-pipe = pipeline("internlm/internlm2-chat-7b", backend_config=engine_config)
-prompt = 'Please offer a long prompt here'
-print(pipe(prompt))
-```
-
-For more information about LMDeploy pipeline usage, please refer to [here](https://lmdeploy.readthedocs.io/en/latest/inference/pipeline.html).
-
-## Serving
-
-LMDeploy's `api_server` packs a model into a service with a single command. The provided RESTful APIs are compatible with OpenAI's interfaces. Below is an example of starting the service:
-
-```shell
-lmdeploy serve api_server internlm/internlm2-chat-7b
-```
-
-The default port of `api_server` is `23333`. After the server is launched, you can chat with it in the terminal through `api_client`:
-
-```shell
-lmdeploy serve api_client http://0.0.0.0:23333
-```
-
-Alternatively, you can test the server's APIs online through the Swagger UI at `http://0.0.0.0:23333`. A detailed overview of the API specification is available [here](https://lmdeploy.readthedocs.io/en/latest/serving/restful_api.html).
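Because the RESTful APIs follow OpenAI's interfaces, the running server can also be queried programmatically. Below is a minimal sketch using the official `openai` Python client; the `/v1` base path, the placeholder API key, and the served model name are assumptions based on the OpenAI-compatible convention rather than details confirmed by this document:

```python
# A minimal sketch of querying api_server through its OpenAI-compatible
# RESTful interface. Assumes the `openai` Python package is installed and
# the server started above is listening on port 23333.
from openai import OpenAI

client = OpenAI(
    api_key="none",  # assumption: the key is not validated by default
    base_url="http://0.0.0.0:23333/v1",
)
response = client.chat.completions.create(
    model="internlm/internlm2-chat-7b",  # assumed to match the served model
    messages=[{"role": "user", "content": "Hi, pls intro yourself"}],
)
print(response.choices[0].message.content)
```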
diff --git a/chat/lmdeploy_zh-CN.md b/chat/lmdeploy_zh-CN.md
deleted file mode 100644
index 835aedd..0000000
--- a/chat/lmdeploy_zh-CN.md
+++ /dev/null
@@ -1,55 +0,0 @@
-# Inference by LMDeploy
-
-[English](lmdeploy.md) | 简体中文
-
-[LMDeploy](https://github.com/InternLM/lmdeploy) is an efficient and user-friendly toolkit for deploying LLMs, covering quantization, inference, and serving.
-
-This article introduces the basic usage of LMDeploy, including [installation](#installation), [offline batch inference](#offline-batch-inference), and [serving](#serving). For a more comprehensive introduction, please refer to the [LMDeploy user guide](https://lmdeploy.readthedocs.io/zh-cn/latest/).
-
-## Installation
-
-Install LMDeploy with pip (Python 3.8+):
-
-```shell
-pip install lmdeploy
-```
-
-## Offline batch inference
-
-With just the following four lines of code, you can run batch inference over a list of prompts:
-
-```python
-from lmdeploy import pipeline
-pipe = pipeline("internlm/internlm2-chat-7b")
-response = pipe(["Hi, pls intro yourself", "Shanghai is"])
-print(response)
-```
-
-LMDeploy implements dynamic NTK scaling for long-context extrapolation. With the code below, `InternLM2` can extrapolate to a 200K context length:
-
-```python
-from lmdeploy import pipeline, TurbomindEngineConfig
-engine_config = TurbomindEngineConfig(session_len=200000,
-                                      rope_scaling_factor=2.0)
-pipe = pipeline("internlm/internlm2-chat-7b", backend_config=engine_config)
-prompt = 'Please offer a long prompt here'
-print(pipe(prompt))
-```
-
-For more information about pipeline usage, please refer to [here](https://lmdeploy.readthedocs.io/zh-cn/latest/inference/pipeline.html).
-
-## Serving
-
-LMDeploy's `api_server` packs a model into a service with a single command, exposing RESTful APIs compatible with OpenAI's interfaces. Below is an example of starting the service:
-
-```shell
-lmdeploy serve api_server internlm/internlm2-chat-7b
-```
-
-The default port of the service is `23333`. After the server is launched, you can chat with it in the terminal through `api_client`:
-
-```shell
-lmdeploy serve api_client http://0.0.0.0:23333
-```
-
-In addition, you can read and try out the `api_server` interfaces online through the Swagger UI at `http://0.0.0.0:23333`, or consult the [documentation](https://lmdeploy.readthedocs.io/zh-cn/latest/serving/restful_api.html) for the definition and usage of each interface.
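As a complement to the offline batch inference examples above, here is a minimal sketch of adjusting the pipeline's sampling behavior through LMDeploy's `GenerationConfig`; the specific parameter values are illustrative assumptions, not recommendations from this document:

```python
# A minimal sketch of controlling sampling in the pipeline via
# GenerationConfig. The parameter values below are assumed for
# illustration only, not tuned recommendations.
from lmdeploy import GenerationConfig, pipeline

pipe = pipeline("internlm/internlm2-chat-7b")
gen_config = GenerationConfig(top_p=0.8,
                              temperature=0.7,
                              max_new_tokens=512)
response = pipe(["Hi, pls intro yourself"], gen_config=gen_config)
print(response)
```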