Add LazyLLM and its usage to ecosystem/README.md (#795)

Co-authored-by: wangzhihong <scse1082@126.com>
pull/802/head
SunXiaoye 2024-09-06 15:21:22 +08:00 committed by GitHub
parent 4fbc98912c
commit 10b97b7a41
2 changed files with 105 additions and 0 deletions

@@ -244,3 +244,57 @@ LlamaIndex is a framework for building context-augmented LLM applications.
It chooses ollama as the LLM inference engine locally. An example can be found in the [Starter Tutorial (Local Models)](https://docs.llamaindex.ai/en/stable/getting_started/starter_example_local/).
Therefore, you can integrate InternLM2 or InternLM2.5 models into LlamaIndex smoothly if you can deploy them with `ollama` as guided in the [ollama section](#ollama).
### [LazyLLM](https://github.com/LazyAGI/LazyLLM)
LazyLLM is a framework that supports the easiest and laziest way of building multi-agent LLM applications, offering extremely high flexibility and ease of use compared to LangChain and LlamaIndex.
Once you have installed `lazyllm` via `pip3 install lazyllm` and `lazyllm install standard`, you can use the following code to build chatbots based on InternLM at very low cost, without worrying about the dialogue model's special tokens (such as `<|im_start|>system` and `<|im_end|>`). Don't worry about not having the weight files: as long as you have an internet connection, the code below will automatically download them and deploy the service for you. Enjoy the convenience that LazyLLM brings.
```python
from lazyllm import TrainableModule, WebModule
# The model will be downloaded automatically if you have an internet connection
m = TrainableModule('internlm2_5-7b-chat')
# Launch a chatbot server
WebModule(m).start().wait()
```
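If you prefer programmatic access over the web UI, LazyLLM modules can also be queried directly. The snippet below is a minimal sketch assuming that, as in the LazyLLM documentation, a started module can be called like a function; the prompt text is only an illustrative example.
```python
from lazyllm import TrainableModule

m = TrainableModule('internlm2_5-7b-chat')
# start() deploys the inference service without a web UI; calling the started
# module like a function is an assumption based on the LazyLLM module API
m.start()
print(m('Briefly introduce the InternLM model family.'))
```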
You can use the following code to fine-tune the model if needed. When the `trainset` of the `TrainableModule` is set (the dataset needs to be downloaded locally, e.g. [alpaca_gpt4_zh](https://huggingface.co/datasets/llamafactory/alpaca_gpt4_zh)), calling the `WebModule`'s `update` function automatically fine-tunes the `TrainableModule` and then deploys both the `TrainableModule` and the `WebModule` separately.
```python
from lazyllm import TrainableModule, WebModule
# Point trainset at your local fine-tuning data and switch to fine-tuning mode
m = TrainableModule('internlm2-chat-7b').trainset('/path/to/your_data.json').mode('finetune')
# update() fine-tunes the model, then deploys both the model service and the web UI
WebModule(m).update().wait()
```
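For reference, [alpaca_gpt4_zh](https://huggingface.co/datasets/llamafactory/alpaca_gpt4_zh) follows the common alpaca schema: a single JSON file of `instruction`/`input`/`output` records. The sketch below writes a minimal file of that shape; treat it as an illustration of the schema rather than a guaranteed requirement of the fine-tuning backend.
```python
import json

# One record in the alpaca style used by alpaca_gpt4_zh (illustrative content)
samples = [
    {
        "instruction": "Introduce the InternLM series in one sentence.",
        "input": "",
        "output": "InternLM is a family of open-source large language models.",
    },
]

# Write the dataset to the path later passed to .trainset(...)
with open('your_data.json', 'w', encoding='utf-8') as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```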
It is worth mentioning that whichever InternLM-series model you use, you can run inference and fine-tuning with LazyLLM without worrying about the model's splitting strategy or its special tokens.<br>
If you want to build your own RAG application, you don't need to start an inference service first and then configure its IP and port to launch the application, as you would with LangChain. With LazyLLM and the code below, you can use InternLM-series models to build a highly customized RAG application, complete with a document management service, in just ten lines of code (the document path must be a local absolute path; you can download example documents from [rag_master](https://huggingface.co/datasets/Jing0o0Xin/rag_master)):
<details>
<summary>Click here to get imports and prompts</summary>

```python
import lazyllm
from lazyllm import pipeline, parallel, bind, SentenceSplitter, Document, Retriever, Reranker

prompt = 'You will play the role of an AI Q&A assistant and complete a dialogue task. In this task, you need to provide your answer based on the given context and question.'
```
</details>

```python
# Load local documents and embed them for dense retrieval
documents = Document(dataset_path='/file/to/yourpath', embed=lazyllm.TrainableModule('bge-large-zh-v1.5'), create_ui=False)
# Create a sentence-level node group for fine-grained retrieval
documents.create_node_group(name="sentences", transform=SentenceSplitter, chunk_size=1024, chunk_overlap=100)
with pipeline() as ppl:
    # Run a dense (cosine) retriever and a BM25 retriever in parallel and merge their results
    with parallel().sum as ppl.prl:
        prl.retriever1 = Retriever(documents, group_name="sentences", similarity="cosine", topk=3)
        prl.retriever2 = Retriever(documents, "CoarseChunk", "bm25_chinese", 0.003, topk=3)
    # Rerank the merged nodes against the original query and keep the best one
    ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) | bind(query=ppl.input)
    # Pack the retrieved context and the query into the prompt's input fields
    ppl.formatter = (lambda nodes, query: dict(context_str="".join([node.get_content() for node in nodes]), query=query)) | bind(query=ppl.input)
    ppl.llm = lazyllm.TrainableModule("internlm2_5-7b-chat").prompt(lazyllm.ChatPrompter(prompt, extro_keys=["context_str"]))
lazyllm.WebModule(ppl, port=23456).start().wait()
```
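In this pipeline, `parallel().sum` merges the results of the dense and BM25 retrievers before reranking, and `bind(query=ppl.input)` forwards the user's original query to the later stages. If you want to query the pipeline without the web UI, the sketch below assumes, based on the LazyLLM README's description of `ActionModule`, that a flow can be wrapped, started, and called directly; the question is only an example.
```python
import lazyllm

# Assumption: ActionModule wraps a flow into a startable, callable module
rag = lazyllm.ActionModule(ppl)
rag.start()
print(rag('What is the main topic of the documents?'))
```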
LazyLLM documentation: https://docs.lazyllm.ai/

@@ -244,3 +244,54 @@ LlamaIndex is a framework for building context-augmented LLM applications.
It chooses ollama as the local LLM inference engine. You can find an example in the [Starter Tutorial (Local Models)](https://docs.llamaindex.ai/en/stable/getting_started/starter_example_local/).
Therefore, if you can deploy InternLM models with ollama as guided in the [ollama section](#ollama), you can smoothly integrate them into LlamaIndex.
### [LazyLLM](https://github.com/LazyAGI/LazyLLM)
LazyLLM is a low-code development tool for building multi-agent LLM applications. Compared with LangChain and LlamaIndex, it offers extremely high flexibility and ease of use.
Once you have installed LazyLLM via `pip3 install lazyllm` followed by `lazyllm install standard`, you can use the following code to build chatbots based on InternLM at very low cost. For both inference and fine-tuning, you don't need to consider the dialogue model's special tokens (such as `<|im_start|>system` and `<|im_end|>`). Don't worry about not having the weight files: as long as you have an internet connection, the code below will automatically download them and deploy the service for you, so you can simply enjoy the convenience that LazyLLM brings.
```python
from lazyllm import TrainableModule, WebModule
# The model will be downloaded automatically if you have an internet connection
m = TrainableModule('internlm2_5-7b-chat')
# Launch a chatbot server
WebModule(m).start().wait()
```
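If you prefer programmatic access over the web UI, LazyLLM modules can also be queried directly. The snippet below is a minimal sketch assuming that, as in the LazyLLM documentation, a started module can be called like a function; the prompt text is only an illustrative example.
```python
from lazyllm import TrainableModule

m = TrainableModule('internlm2_5-7b-chat')
# start() deploys the inference service without a web UI; calling the started
# module like a function is an assumption based on the LazyLLM module API
m.start()
print(m('Briefly introduce the InternLM model family.'))
```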
If you need to fine-tune the model further, refer to the code below. Once the `trainset` of the `TrainableModule` is set (the dataset needs to be downloaded locally, e.g. [alpaca_gpt4_zh](https://huggingface.co/datasets/llamafactory/alpaca_gpt4_zh)), calling the `WebModule`'s `update` function automatically fine-tunes the `TrainableModule` and then deploys both the `TrainableModule` and the `WebModule` separately.
```python
from lazyllm import TrainableModule, WebModule
# Point trainset at your local fine-tuning data and switch to fine-tuning mode
m = TrainableModule('internlm2-chat-7b').trainset('/path/to/your_data.json').mode('finetune')
# update() fine-tunes the model, then deploys both the model service and the web UI
WebModule(m).update().wait()
```
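For reference, [alpaca_gpt4_zh](https://huggingface.co/datasets/llamafactory/alpaca_gpt4_zh) follows the common alpaca schema: a single JSON file of `instruction`/`input`/`output` records. The sketch below writes a minimal file of that shape; treat it as an illustration of the schema rather than a guaranteed requirement of the fine-tuning backend.
```python
import json

# One record in the alpaca style used by alpaca_gpt4_zh (illustrative content)
samples = [
    {
        "instruction": "Introduce the InternLM series in one sentence.",
        "input": "",
        "output": "InternLM is a family of open-source large language models.",
    },
]

# Write the dataset to the path later passed to .trainset(...)
with open('your_data.json', 'w', encoding='utf-8') as f:
    json.dump(samples, f, ensure_ascii=False, indent=2)
```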
It is worth mentioning that whichever InternLM-series model you use, you can run inference and fine-tuning with LazyLLM without worrying about the model's splitting strategy or its special tokens.<br>
If you want to build your own RAG application, you don't need to first start an inference service and then configure its IP and port to launch the application, as you would with LangChain. With LazyLLM and the code below, you can use InternLM-series models to build a highly customized RAG application in just ten lines of code, complete with a document management service (the document path must be a local absolute path; example documents can be downloaded from [rag_master](https://huggingface.co/datasets/Jing0o0Xin/rag_master)):
<details>
<summary>Click here to get imports and prompts</summary>

```python
import lazyllm
from lazyllm import pipeline, parallel, bind, SentenceSplitter, Document, Retriever, Reranker

prompt = 'You will play the role of an AI Q&A assistant and complete a dialogue task. In this task, you need to provide your answer based on the given context and question.'
```
</details>

```python
# Load local documents and embed them for dense retrieval
documents = Document(dataset_path='/file/to/yourpath', embed=lazyllm.TrainableModule('bge-large-zh-v1.5'), create_ui=False)
# Create a sentence-level node group for fine-grained retrieval
documents.create_node_group(name="sentences", transform=SentenceSplitter, chunk_size=1024, chunk_overlap=100)
with pipeline() as ppl:
    # Run a dense (cosine) retriever and a BM25 retriever in parallel and merge their results
    with parallel().sum as ppl.prl:
        prl.retriever1 = Retriever(documents, group_name="sentences", similarity="cosine", topk=3)
        prl.retriever2 = Retriever(documents, "CoarseChunk", "bm25_chinese", 0.003, topk=3)
    # Rerank the merged nodes against the original query and keep the best one
    ppl.reranker = Reranker("ModuleReranker", model="bge-reranker-large", topk=1) | bind(query=ppl.input)
    # Pack the retrieved context and the query into the prompt's input fields
    ppl.formatter = (lambda nodes, query: dict(context_str="".join([node.get_content() for node in nodes]), query=query)) | bind(query=ppl.input)
    ppl.llm = lazyllm.TrainableModule("internlm2_5-7b-chat").prompt(lazyllm.ChatPrompter(prompt, extro_keys=["context_str"]))
lazyllm.WebModule(ppl, port=23456).start().wait()
```
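In this pipeline, `parallel().sum` merges the results of the dense and BM25 retrievers before reranking, and `bind(query=ppl.input)` forwards the user's original query to the later stages. If you want to query the pipeline without the web UI, the sketch below assumes, based on the LazyLLM README's description of `ActionModule`, that a flow can be wrapped, started, and called directly; the question is only an example.
```python
import lazyllm

# Assumption: ActionModule wraps a flow into a startable, callable module
rag = lazyllm.ActionModule(ppl)
rag.start()
print(rag('What is the main topic of the documents?'))
```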
LazyLLM official documentation: https://docs.lazyllm.ai/