[Docs] Update Agent docs (#590)

Co-authored-by: BIGWangYuDong <yudongwang1226@gmain.com>
Co-authored-by: ZwwWayne <wayne.zw@outlook.com>
pull/611/head
BigDong 2024-01-17 19:37:51 +08:00 committed by GitHub
parent 69db8d4574
commit 70478bfc61
7 changed files with 314 additions and 0 deletions

agent/README.md

@@ -0,0 +1,22 @@
# InternLM-Chat Agent
English | [简体中文](README_zh-CN.md)
## Introduction
InternLM-Chat-7B v1.1 was released as the first open-source model with code interpreter capabilities, supporting external tools such as a Python code interpreter and search engines.
InternLM2-Chat, open sourced on January 17, 2024, further enhances its capabilities in code interpreter and general tool utilization. With improved and more generalized instruction understanding, tool selection, and reflection abilities, InternLM2-Chat can more reliably support complex agents and multi-step tool calling for more intricate tasks. InternLM2-Chat exhibits decent computational and reasoning abilities even without external tools, surpassing ChatGPT in mathematical performance. When combined with a code interpreter, InternLM2-Chat-20B obtains comparable results to GPT-4 on GSM8K and MATH. Leveraging strong foundational capabilities in mathematics and tools, InternLM2-Chat provides practical data analysis capabilities.
The results of InternLM2-Chat-20B with the math code interpreter are shown below:
| Model | GSM8K | MATH |
| :---: | :---: | :--: |
| InternLM2-Chat-20B | 79.6 | 32.5 |
| InternLM2-Chat-20B with Code Interpreter | 84.5 | 51.2 |
| ChatGPT (GPT-3.5) | 78.2 | 28.0 |
| GPT-4 | 91.4 | 45.8 |
## Usage
We offer examples of using [Lagent](lagent.md) to build agents based on InternLM2-Chat that call a code interpreter or a search API. Additionally, we provide example code using [PAL to evaluate GSM8K math problems](pal_inference.md) with InternLM-Chat-7B.

agent/README_zh-CN.md

@@ -0,0 +1,22 @@
# InternLM-Chat Agent
[English](README.md) | Simplified Chinese
## Introduction
InternLM-Chat-7B v1.1 was the first open-source chat model with code interpreter capabilities, supporting external tools such as a Python interpreter and search engines.
InternLM2-Chat further improves its capabilities in code interpretation and general tool calling. With stronger and more generalizable abilities in instruction understanding, tool selection, and result reflection, the new model can more reliably support the construction of complex agents and make effective multi-step tool calls to complete more complicated tasks. Even without external tools, the model already shows decent computational and reasoning abilities, and its mathematical performance surpasses ChatGPT; when paired with a code interpreter, InternLM2-Chat-20B reaches a level comparable to GPT-4 on GSM8K and MATH. Building on these strong foundational capabilities in mathematics and tool use, InternLM2-Chat provides practical data analysis capabilities.
The results of InternLM2-Chat-20B with the math code interpreter are shown below.
| Model | GSM8K | MATH |
| :---: | :---: | :--: |
| InternLM2-Chat-20B (intrinsic capability only) | 79.6 | 32.5 |
| InternLM2-Chat-20B with Code Interpreter | 84.5 | 51.2 |
| ChatGPT (GPT-3.5) | 78.2 | 28.0 |
| GPT-4 | 91.4 | 45.8 |
## Usage
We provide examples of using [Lagent](lagent_zh-CN.md) to build InternLM2-Chat-based agents that call tools such as a code interpreter or search. We also provide an example of evaluating InternLM-Chat-7B on [GSM8K math problems with PAL](pal_inference_zh-CN.md).

agent/lagent.md

@@ -0,0 +1,73 @@
# Lagent
English | [简体中文](lagent_zh-CN.md)
## What's Lagent?
Lagent is a lightweight open-source framework that allows users to efficiently build large language model (LLM)-based agents. It also provides some typical tools to augment LLMs. An overview of the framework is shown below:
![image](https://github.com/InternLM/lagent/assets/24351120/cefc4145-2ad8-4f80-b88b-97c05d1b9d3e)
This document primarily highlights the basic usage of Lagent. For a comprehensive understanding of the toolkit, please refer to [examples](https://github.com/InternLM/lagent/tree/main/examples) for more details.
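Before diving into the API, it may help to see the pattern that frameworks like Lagent automate. Below is a minimal, framework-free sketch of a ReAct-style loop; `call_llm` and the `TOOLS` table are hypothetical stand-ins for illustration only, not part of the Lagent API:
```python
# Hypothetical sketch of a ReAct-style loop; not the Lagent implementation.
TOOLS = {
    # Toy "Python interpreter" that evaluates a single expression.
    "python": lambda expr: str(eval(expr)),
}

def call_llm(transcript: str) -> str:
    """Stand-in for a real LLM call. It should return either
    'Action: <tool>: <input>' or 'Final: <answer>'."""
    raise NotImplementedError

def react_agent(question: str, max_turns: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_turns):
        step = call_llm(transcript)              # 1. the LLM reasons and picks an action
        if step.startswith("Final:"):            # 2. or decides it can answer directly
            return step[len("Final:"):].strip()
        _, tool, tool_input = step.split(":", 2)
        observation = TOOLS[tool.strip()](tool_input.strip())  # 3. execute the tool
        transcript += f"{step}\nObservation: {observation}\n"  # 4. feed the result back
    return "No answer within the turn limit."
```
Lagent packages this pattern behind the `ReAct` agent shown later in this document, together with prompt templates and a catalog of ready-made tools.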
## Installation
Install with pip (Recommended).
```bash
pip install lagent
```
Optionally, you could also build Lagent from source in case you want to modify the code:
```bash
git clone https://github.com/InternLM/lagent.git
cd lagent
pip install -e .
```
## Run ReAct Web Demo
```bash
# You need to install streamlit first
# pip install streamlit
streamlit run examples/react_web_demo.py
```
Then you can chat through the UI, as shown below:
![image](https://github.com/InternLM/lagent/assets/24622904/3aebb8b4-07d1-42a2-9da3-46080c556f68)
## Run a ReAct agent with InternLM2-Chat
**NOTE:** If you want to run a HuggingFace model, please run `pip install -e .[all]` first.
```python
# Import necessary modules and classes from the "lagent" library.
from lagent.agents import ReAct
from lagent.actions import ActionExecutor, GoogleSearch, PythonInterpreter
from lagent.llms import HFTransformer
# Initialize the HFTransformer-based Language Model (llm) and provide the model name.
llm = HFTransformer('internlm/internlm2-chat-7b')
# Initialize the Google Search tool and provide your API key.
search_tool = GoogleSearch(api_key='Your SERPER_API_KEY')
# Initialize the Python Interpreter tool.
python_interpreter = PythonInterpreter()
# Create a chatbot by configuring the ReAct agent.
chatbot = ReAct(
llm=llm, # Provide the Language Model instance.
action_executor=ActionExecutor(
actions=[search_tool, python_interpreter] # Specify the actions the chatbot can perform.
),
)
# Ask the chatbot a mathematical question, written in LaTeX.
# Use a raw string so that backslashes in the LaTeX (e.g. \frac) are not
# interpreted as Python escape sequences.
response = chatbot.chat(r'若$z=-1+\sqrt{3}i$,则$\frac{z}{{z\overline{z}-1}}=\left(\ \ \right)$')
# Print the chatbot's response.
print(response.response)
# Expected output: $-\frac{1}{3}+\frac{\sqrt{3}}{3}i$
```
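For reference, the expected answer can be verified by hand: since $z\bar{z}=|z|^2=(-1)^2+(\sqrt{3})^2=4$, the expression reduces to $\frac{z}{z\bar{z}-1}=\frac{-1+\sqrt{3}i}{3}=-\frac{1}{3}+\frac{\sqrt{3}}{3}i$, which matches the output above.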

agent/lagent_zh-CN.md

@@ -0,0 +1,73 @@
# Lagent
[English](lagent.md) | Simplified Chinese
## Introduction
[Lagent](https://github.com/InternLM/lagent) is a lightweight, open-source agent framework based on large language models. It allows users to quickly turn an LLM into several types of agents and provides some typical tools to empower the model. An overview of the framework is shown below:
![image](https://github.com/InternLM/lagent/assets/24351120/cefc4145-2ad8-4f80-b88b-97c05d1b9d3e)
This document mainly introduces the basic usage of Lagent. For a more comprehensive introduction, please refer to the [examples](https://github.com/InternLM/lagent/tree/main/examples) provided with Lagent.
## Installation
Install with pip (recommended).
```bash
pip install lagent
```
Optionally, if you want to modify the code, you can also build Lagent from source:
```bash
git clone https://github.com/InternLM/lagent.git
cd lagent
pip install -e .
```
## Run a ReAct Agent Web Demo
```bash
# Make sure the streamlit package is installed first
# pip install streamlit
streamlit run examples/react_web_demo.py
```
Then you can chat with the agent through the web page, as shown below:
![image](https://github.com/InternLM/lagent/assets/24622904/3aebb8b4-07d1-42a2-9da3-46080c556f68)
## Build a ReAct Agent with InternLM-Chat
**Note:** If you want to run a HuggingFace model, please run `pip install -e .[all]` first.
```python
# Import necessary modules and classes from the "lagent" library.
from lagent.agents import ReAct
from lagent.actions import ActionExecutor, GoogleSearch, PythonInterpreter
from lagent.llms import HFTransformer
# Initialize the HFTransformer-based Language Model (llm) and provide the model name.
llm = HFTransformer('internlm/internlm-chat-7b-v1_1')
# Initialize the Google Search tool and provide your API key.
search_tool = GoogleSearch(api_key='Your SERPER_API_KEY')
# Initialize the Python Interpreter tool.
python_interpreter = PythonInterpreter()
# Create a chatbot by configuring the ReAct agent.
chatbot = ReAct(
llm=llm, # Provide the Language Model instance.
action_executor=ActionExecutor(
actions=[search_tool, python_interpreter] # Specify the actions the chatbot can perform.
),
)
# Ask the chatbot a mathematical question, written in LaTeX.
# Use a raw string so that backslashes in the LaTeX (e.g. \frac) are not
# interpreted as Python escape sequences.
response = chatbot.chat(r'若$z=-1+\sqrt{3}i$,则$\frac{z}{{z\overline{z}-1}}=\left(\ \ \right)$')
# Print the chatbot's response.
print(response.response)
# Expected output: $-\frac{1}{3}+\frac{\sqrt{3}}{3}i$
```

agent/pal_inference.md

@@ -0,0 +1,62 @@
# Inference on GSM8K with PAL and InternLM-Chat
English | [简体中文](pal_inference_zh-CN.md)
This script runs inference on the [GSM8K](https://huggingface.co/datasets/gsm8k) dataset with the [PAL](https://github.com/reasoning-machines/pal) paradigm, which lets the model write code and execute it through a Python interpreter to solve mathematical problems. The usage is as follows:
```bash
python pal_inference.py \
<model> \
<out_dir> \
[--dataset <dataset>] \
[--max_length <length>] \
[--top_p <threshold>] \
[--eoh <end token>] \
[--eoa <end token>] \
[--eos <end token>] \
[--temperature <temp>] \
[--time_out <time>] \
[--verbose, -v] \
[--append, -a]
```
Parameter explanation:
| Parameter | Description |
| :--------: | :--------------------: |
| \<model\> | Path to the model used for inference |
| \<out_dir\> | Output folder in which the generated code will be saved |
| --dataset \<dataset\> | Name of the dataset used for code generation (defaults to gsm8k) |
| --max_length \<length\> | Maximum input token length for the model (defaults to 2048) |
| --top_p \<threshold\> | Cumulative probability threshold for candidate tokens (defaults to 0.8) |
| --eoh \<end token\> | End-of-human (user input) token (defaults to "") |
| --eoa \<end token\> | End-of-assistant (model response) token (defaults to "") |
| --eos \<end token\> | End-of-system token (defaults to "") |
| --temperature, -t \<temp\> | Sampling temperature during generation (defaults to 1.0) |
| --time_out \<time\> | Maximum time (in seconds) for executing the generated code (defaults to 100) |
| --verbose, -v | Print code error messages (optional) |
| --append, -a | Append output to historical results (optional) |
A simple usage example is as follows:
```bash
python tools/pal_inference.py internlm/internlm-chat-7b ./output -v
```
Each line in the output file includes the input question, correct answer, executed answer, score, and the Python code block generated by the model:
````json
{
"question": "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
"target": 18.0,
"answer": 18.0,
"score": 1,
"generation": ["```python\ndef solution():\n eggs_per_day = 16\n eggs_per_breakfast = 3\n eggs_per_muffin = 4\n eggs_used = eggs_per_day - eggs_per_breakfast - eggs_per_muffin\n eggs_sold = eggs_used\n price_per_egg = 2\n eggs_made = eggs_sold * price_per_egg\n result = eggs_made\n return result\n```"]
}
````
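For intuition about how the `score` field can be produced, the sketch below extracts the fenced code from `generation`, executes it, and compares the return value of `solution()` against `target`. This is a simplified, hypothetical illustration, not the actual `pal_inference.py` logic, which may handle sandboxing and timeouts differently; the output path used here is a placeholder:
````python
import json
import re

def score_record(record: dict) -> int:
    """Simplified sketch: run the generated solution() and compare with target."""
    generation = record["generation"][0]
    # Strip the markdown fence around the model-generated code.
    match = re.search(r"```python\n(.*?)```", generation, re.DOTALL)
    if match is None:
        return 0
    namespace = {}
    try:
        # WARNING: exec of model-generated code is unsafe outside a sandbox.
        exec(match.group(1), namespace)
        answer = namespace["solution"]()
    except Exception:
        return 0
    return int(abs(float(answer) - float(record["target"])) < 1e-6)

# Score the example record shown above ("output/results.jsonl" is a placeholder path).
with open("output/results.jsonl", encoding="utf-8") as f:
    record = json.loads(f.readline())
    print(score_record(record))  # prints 1 for the example above
````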
The performance of InternLM-Chat-7B on the GSM8K dataset with and without tools is shown in the table below.
| Method | **InternLM-Chat-7B** |
| -------- | -------------------- |
| w/o tool | 34.5 |
| w/ tool | 39.2 |

agent/pal_inference.py

agent/pal_inference_zh-CN.md

@@ -0,0 +1,62 @@
# Evaluating GSM8K with PAL on InternLM-Chat
[English](pal_inference.md) | Simplified Chinese
This runs inference on the [GSM8K](https://huggingface.co/datasets/gsm8k) dataset with the [PAL](https://github.com/reasoning-machines/pal) paradigm, letting the model write code and execute it through a Python interpreter to solve mathematical problems. The usage is as follows:
```bash
python pal_inference.py \
<model> \
<out_dir> \
[--dataset <dataset>] \
[--max_length <length>] \
[--top_p <threshold>] \
[--eoh <end token>] \
[--eoa <end token>] \
[--eos <end token>] \
[--temperature <temp>] \
[--time_out <time>] \
[--verbose, -v] \
[--append, -a]
```
Parameter explanation:
| Parameter | Description |
| :--------: | :--------------------: |
| \<model\> | Path to the model used for inference |
| \<out_dir\> | Output folder in which the generated code will be saved |
| --dataset \<dataset\> | Name of the dataset used for code generation (default: gsm8k) |
| --max_length \<length\> | Maximum input token length for the model (default: 2048) |
| --top_p \<threshold\> | Cumulative probability threshold for candidate tokens (default: 0.8) |
| --eoh \<end token\> | End-of-human (user input) token (default: "") |
| --eoa \<end token\> | End-of-assistant (model response) token (default: "") |
| --eos \<end token\> | End-of-system token (default: "") |
| --temperature, -t \<temp\> | Sampling temperature during generation (default: 1.0) |
| --time_out \<time\> | Maximum time in seconds for executing the generated code (default: 100) |
| --verbose, -v | Print code error messages (optional) |
| --append, -a | Append output to historical results (optional) |
A simple usage example is as follows:
```bash
python tools/pal_inference.py internlm/internlm-chat-7b ./output -v
```
Each line in the output file includes the input question, the correct answer, the executed answer, the score, and the Python code block generated by the model:
````json
{
"question": "Janet\u2019s ducks lay 16 eggs per day. She eats three for breakfast every morning and bakes muffins for her friends every day with four. She sells the remainder at the farmers' market daily for $2 per fresh duck egg. How much in dollars does she make every day at the farmers' market?",
"target": 18.0,
"answer": 18.0,
"score": 1,
"generation": ["```python\ndef solution():\n eggs_per_day = 16\n eggs_per_breakfast = 3\n eggs_per_muffin = 4\n eggs_used = eggs_per_day - eggs_per_breakfast - eggs_per_muffin\n eggs_sold = eggs_used\n price_per_egg = 2\n eggs_made = eggs_sold * price_per_egg\n result = eggs_made\n return result\n```"]
}
````
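Because each line of the output file is a self-contained JSON record, overall accuracy can be computed directly from it. Below is a small sketch under the assumption that the results were written to `output/results.jsonl` (a placeholder path; substitute your own \<out_dir\> file):
```python
import json

# Compute overall accuracy from the JSON-lines output produced above.
total, correct = 0, 0
with open("output/results.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        total += 1
        correct += record["score"]  # score is 1 when the executed answer matches
print(f"accuracy: {correct / total:.2%} ({correct}/{total})")
```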
The performance of InternLM on the GSM8K dataset with and without tools is shown in the table below.
| Method | **InternLM-Chat-7B** |
| -------- | -------------------- |
| w/o tool | 34.5 |
| w/ tool | 39.2 |