diff --git a/README.md b/README.md
index 1f7366c..53f69fc 100644
--- a/README.md
+++ b/README.md
@@ -254,6 +254,39 @@ curl http://localhost:23333/v1/chat/completions \
 
 Find more details in the [LMDeploy documentation](https://lmdeploy.readthedocs.io/en/latest/)
 
+#### SGLang inference
+
+##### Installation
+```bash
+pip3 install "sglang[srt]>=0.4.1.post6" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
+```
+
+##### OpenAI Compatible Server
+
+```bash
+python3 -m sglang.launch_server --model internlm/internlm3-8b-instruct --trust-remote-code --chat-template internlm2-chat
+```
+
+##### OpenAI client
+
+```python3
+import openai
+client = openai.Client(
+    base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
+
+# Chat completion
+response = client.chat.completions.create(
+    model="default",
+    messages=[
+        {"role": "system", "content": "You are a helpful AI assistant"},
+        {"role": "user", "content": "List 3 countries and their capitals."},
+    ],
+    temperature=0,
+    max_tokens=64,
+)
+print(response)
+```
+
 #### Ollama inference
 
 TODO
@@ -401,6 +434,15 @@ response = pipe(messages, gen_config=GenerationConfig(max_new_tokens=2048))
 print(response)
 ```
 
+#### SGLang inference
+
+##### Installation
+```bash
+pip3 install "sglang[srt]>=0.4.1.post6" --find-links https://flashinfer.ai/whl/cu124/torch2.4/flashinfer/
+```
+
+For offline engine API usage, please refer to the [Offline Engine API](https://docs.sglang.ai/backend/offline_engine_api.html) documentation.
+
 #### Ollama inference
 
 TODO
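For reference, below is a minimal sketch of the offline path the linked docs describe. It assumes SGLang's documented `sgl.Engine` entry point with dict-style sampling params; treat it as illustrative and check the linked Offline Engine API page for the exact signature in your installed version.

```python
# A sketch of SGLang offline (server-less) generation, based on the
# Offline Engine API docs linked above; exact kwargs may vary by version.
import sglang as sgl

if __name__ == "__main__":  # the engine spawns subprocesses, so guard the entry point
    llm = sgl.Engine(
        model_path="internlm/internlm3-8b-instruct",
        trust_remote_code=True,
    )

    prompts = ["List 3 countries and their capitals."]
    sampling_params = {"temperature": 0, "max_new_tokens": 64}

    # generate() takes a list of prompts and returns one result dict per prompt
    outputs = llm.generate(prompts, sampling_params)
    for prompt, output in zip(prompts, outputs):
        print(prompt, "->", output["text"])

    llm.shutdown()
```

Unlike the server workflow above, this runs generation in-process with no HTTP round trip, which suits batch jobs and quick experiments.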