mirror of https://github.com/InternLM/InternLM
update docs
parent 17199b8678 · commit d4f9c2f49e
@ -242,7 +242,7 @@ To learn more about data contamination assessment, please check the [contaminati
### Agent Evaluation
- To evaluate tool utilization, please refer to [T-Eval](https://github.com/open-compass/T-Eval).
- For code interpreter evaluation, use the [gsm-8k-agent](https://github.com/open-compass/opencompass/blob/main/configs/datasets/gsm8k/gsm8k_agent_gen_be1606.py) provided in the repository. Additionally, you need to install [Lagent](https://github.com/InternLM/lagent).
- For code interpreter evaluation, use the [Math Agent Evaluation](agent/streaming_inference.py) provided in the repository. Additionally, you need to install [Lagent](https://github.com/InternLM/lagent).
### Subjective Evaluation
@ -19,4 +19,69 @@ The results of InternLM2-Chat-20B on the math code interpreter task are as below:

## Usage

We offer examples using [Lagent](lagent.md) to build agents based on InternLM2-Chat that call the code interpreter or a search API; please see [Inference with Streaming Agents in Lagent](streaming_inference.md). Additionally, we provide example code that uses [PAL to evaluate GSM8K math problems](pal_inference.md) with InternLM-Chat-7B.
We offer an example using [Lagent](lagent.md) to build agents based on InternLM2-Chat that call the code interpreter. First, install the extra dependencies:

```bash
pip install -r requirements.txt
```

Run the following script to perform inference and evaluation on the GSM8K and MATH test sets:

```bash
# Use --backend=hf for HuggingFace models.
python streaming_inference.py \
    --backend=lmdeploy \
    --model_path=internlm/internlm2-chat-20b \
    --tp=2 \
    --temperature=0.0 \
    --dataset=math \
    --output_path=math_lmdeploy.jsonl \
    --do_eval
```

`output_path` is a file in JSON Lines (jsonl) format that stores the inference results. Each line is like:

```json
{
    "idx": 41,
    "query": "The point $(a, b)$ lies on the line with the equation $3x + 2y = 12.$ When $a = 4$, what is the value of $b$?",
    "gt": "0",
    "pred": ["0"],
    "steps": [
        {
            "role": "language",
            "content": ""
        },
        {
            "role": "tool",
            "content": {
                "name": "IPythonInteractive",
                "parameters": {
                    "command": "```python\nfrom sympy import symbols, solve\n\ndef find_b():\n    x, y = symbols('x y')\n    equation = 3*x + 2*y - 12\n    b = solve(equation.subs(x, 4), y)[0]\n\n    return b\n\nresult = find_b()\nprint(result)\n```"
                }
            },
            "name": "interpreter"
        },
        {
            "role": "environment",
            "content": "0",
            "name": "interpreter"
        },
        {
            "role": "language",
            "content": "The value of $b$ when $a = 4$ is $\\boxed{0}$."
        }
    ],
    "error": null
}
```
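
Because the results file is plain JSON Lines, it is easy to post-process. The sketch below (not part of the repository; the field names `gt`, `pred`, and `error` are taken from the sample above) loads such a file and reports the fraction of problems whose ground truth appears among the predictions. Note that this is a simple exact-match check; the script's own `--do_eval` pass may normalize answers before comparing.

```python
import json

def accuracy(path: str) -> float:
    """Fraction of records whose ground truth appears among the predictions."""
    total = correct = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            total += 1
            if record["gt"] in record["pred"]:
                correct += 1
    return correct / total if total else 0.0

print(f"accuracy: {accuracy('math_lmdeploy.jsonl'):.4f}")
```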

Once the file is prepared, you can skip the inference stage and run evaluation only:

```bash
python streaming_inference.py \
    --output_path=math_lmdeploy.jsonl \
    --no-do_infer \
    --do_eval
```

Please refer to [`streaming_inference.py`](streaming_inference.py) for more information about the arguments.
@ -19,4 +19,69 @@ InternLM2-Chat further improves its capabilities in code interpretation and general tool calling

## Usage

We provide examples of using [Lagent](lagent_zh-CN.md) to build agents based on InternLM2-Chat that call tools such as the code interpreter or search; please refer to [Inference with Streaming Agents in Lagent](streaming_inference_zh-CN.md). We also provide an example of using [PAL to evaluate GSM8K math problems](pal_inference_zh-CN.md) with InternLM-Chat-7B.
We provide an example of using [Lagent](lagent_zh-CN.md) to build agents based on InternLM2-Chat that call the code interpreter. First, install the extra dependencies:

```bash
pip install -r requirements.txt
```

Run the following script to perform inference and evaluation on the GSM8K and MATH test sets:

```bash
# Use --backend=hf for HuggingFace models.
python streaming_inference.py \
    --backend=lmdeploy \
    --model_path=internlm/internlm2-chat-20b \
    --tp=2 \
    --temperature=0.0 \
    --dataset=math \
    --output_path=math_lmdeploy.jsonl \
    --do_eval
```

`output_path` is a file in JSON Lines (jsonl) format that stores the inference results. Each line is like:

```json
{
    "idx": 41,
    "query": "The point $(a, b)$ lies on the line with the equation $3x + 2y = 12.$ When $a = 4$, what is the value of $b$?",
    "gt": "0",
    "pred": ["0"],
    "steps": [
        {
            "role": "language",
            "content": ""
        },
        {
            "role": "tool",
            "content": {
                "name": "IPythonInteractive",
                "parameters": {
                    "command": "```python\nfrom sympy import symbols, solve\n\ndef find_b():\n    x, y = symbols('x y')\n    equation = 3*x + 2*y - 12\n    b = solve(equation.subs(x, 4), y)[0]\n\n    return b\n\nresult = find_b()\nprint(result)\n```"
                }
            },
            "name": "interpreter"
        },
        {
            "role": "environment",
            "content": "0",
            "name": "interpreter"
        },
        {
            "role": "language",
            "content": "The value of $b$ when $a = 4$ is $\\boxed{0}$."
        }
    ],
    "error": null
}
```
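
The `error` field in each record makes it easy to triage failures. Below is a minimal sketch (not part of the repository; the field names follow the sample above) that prints the index, query, and error message of every record whose inference failed:

```python
import json

def failed_records(path: str):
    """Yield (idx, query, error) for records whose error field is set."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            if record.get("error") is not None:
                yield record["idx"], record["query"], record["error"]

for idx, query, error in failed_records("math_lmdeploy.jsonl"):
    print(f"[{idx}] {query[:60]} -> {error}")
```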

If the file is already prepared, you can skip the inference stage and run evaluation directly:

```bash
python streaming_inference.py \
    --output_path=math_lmdeploy.jsonl \
    --no-do_infer \
    --do_eval
```

Please refer to [`streaming_inference.py`](streaming_inference.py) for more information about the arguments.
@ -0,0 +1,10 @@

lmdeploy>=0.2.2
datasets
tqdm
numpy
pebble
jsonlines
sympy==1.12
antlr4-python3-runtime==4.11.0
lagent
einops
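
To confirm that the environment is set up, a sanity check such as the following may help (a sketch, not part of the repository; it assumes antlr4-python3-runtime is imported under its standard module name `antlr4`):

```python
# Verify that the agent dependencies import cleanly.
import importlib

MODULES = [
    "lmdeploy", "datasets", "tqdm", "numpy", "pebble",
    "jsonlines", "sympy", "antlr4", "lagent", "einops",
]

for name in MODULES:
    try:
        importlib.import_module(name)
        print(f"ok: {name}")
    except ImportError as exc:
        print(f"missing: {name} ({exc})")
```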
@ -1,66 +0,0 @@
# Inference with Streaming Agents in Lagent

English | [简体中文](streaming_inference_zh-CN.md)

[Lagent](https://github.com/InternLM/lagent) is strongly recommended for agent construction. It supports multiple types of agents and is integrated with commonly used tools, including code interpreters.

We provide a script for inference and evaluation on the GSM8K and MATH test sets. The usage is as follows:

```bash
# Use --backend=hf for HuggingFace models.
python streaming_inference.py \
    --backend=lmdeploy \
    --model_path=internlm/internlm2-chat-20b \
    --tp=2 \
    --temperature=0.0 \
    --dataset=math \
    --output_path=math_lmdeploy.jsonl \
    --do_eval
```

`output_path` is a file in JSON Lines (jsonl) format that stores the inference results. Each line is like:

```json
{
    "idx": 41,
    "query": "The point $(a, b)$ lies on the line with the equation $3x + 2y = 12.$ When $a = 4$, what is the value of $b$?",
    "gt": "0",
    "pred": ["0"],
    "steps": [
        {
            "role": "language",
            "content": ""
        },
        {
            "role": "tool",
            "content": {
                "name": "IPythonInteractive",
                "parameters": {
                    "command": "```python\nfrom sympy import symbols, solve\n\ndef find_b():\n    x, y = symbols('x y')\n    equation = 3*x + 2*y - 12\n    b = solve(equation.subs(x, 4), y)[0]\n\n    return b\n\nresult = find_b()\nprint(result)\n```"
                }
            },
            "name": "interpreter"
        },
        {
            "role": "environment",
            "content": "0",
            "name": "interpreter"
        },
        {
            "role": "language",
            "content": "The value of $b$ when $a = 4$ is $\\boxed{0}$."
        }
    ],
    "error": null
}
```

Once the file is prepared, you can skip the inference stage as follows:

```bash
python streaming_inference.py \
    --output_path=math_lmdeploy.jsonl \
    --no-do_infer \
    --do_eval
```

Please refer to [`streaming_inference.py`](streaming_inference.py) for more information about the arguments.
@ -1,66 +0,0 @@
# Inference with Streaming Agents in Lagent
[English](streaming_inference.md) | 简体中文

The [Lagent](https://github.com/InternLM/lagent) framework is strongly recommended; this toolkit implements multiple types of agents and integrates common tools, including a Python code interpreter.

Based on it, we provide a script for inference and evaluation on the GSM8K and MATH test sets. The usage is as follows:

```bash
# Use --backend=hf for HuggingFace models.
python streaming_inference.py \
    --backend=lmdeploy \
    --model_path=internlm/internlm2-chat-20b \
    --tp=2 \
    --temperature=0.0 \
    --dataset=math \
    --output_path=math_lmdeploy.jsonl \
    --do_eval
```

`output_path` is a file in JSON Lines (jsonl) format that stores the inference results. Each line is like:

```json
{
    "idx": 41,
    "query": "The point $(a, b)$ lies on the line with the equation $3x + 2y = 12.$ When $a = 4$, what is the value of $b$?",
    "gt": "0",
    "pred": ["0"],
    "steps": [
        {
            "role": "language",
            "content": ""
        },
        {
            "role": "tool",
            "content": {
                "name": "IPythonInteractive",
                "parameters": {
                    "command": "```python\nfrom sympy import symbols, solve\n\ndef find_b():\n    x, y = symbols('x y')\n    equation = 3*x + 2*y - 12\n    b = solve(equation.subs(x, 4), y)[0]\n\n    return b\n\nresult = find_b()\nprint(result)\n```"
                }
            },
            "name": "interpreter"
        },
        {
            "role": "environment",
            "content": "0",
            "name": "interpreter"
        },
        {
            "role": "language",
            "content": "The value of $b$ when $a = 4$ is $\\boxed{0}$."
        }
    ],
    "error": null
}
```

If the file is already prepared, you can skip the inference stage and run evaluation directly:

```bash
python streaming_inference.py \
    --output_path=math_lmdeploy.jsonl \
    --no-do_infer \
    --do_eval
```

Please refer to [`streaming_inference.py`](streaming_inference.py) for more information about the arguments.
@ -1,13 +1,3 @@

sentencepiece
streamlit
transformers>=4.34
lmdeploy>=0.2.2
datasets
tqdm
numpy
pebble
jsonlines
sympy==1.12
antlr4-python3-runtime==4.11.0
lagent
einops