However, if you want to build the PyTorch extensions during installation, you can set `BUILD_EXT=1`.

```bash
BUILD_EXT=1 pip install colossalai
```
**Otherwise, CUDA kernels will be built during runtime when you actually need them.**
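Whether you build ahead of time or rely on the runtime build, the extensions need a CUDA toolkit that matches the one your PyTorch install was compiled against; a mismatch is a common cause of build failures. A quick sanity check (only assuming a working PyTorch install and `nvcc` on your `PATH`):

```bash
# The CUDA version reported by PyTorch should match nvcc's release.
python -c "import torch; print(torch.version.cuda)"
nvcc --version
```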
By default, we do not compile CUDA/C++ kernels. ColossalAI will build them during runtime when you actually need them.
If you want to install and enable CUDA kernel fusion (compulsory when using the fused optimizer):
```shell
BUILD_EXT=1 pip install .
```
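Note that `pip install .` builds from a local checkout, so this assumes you have already cloned the repository and are in its root directory; a minimal sketch:

```shell
# Clone the repository and build with the extensions enabled.
git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI
BUILD_EXT=1 pip install .
```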
For users with CUDA 10.2, you can still build ColossalAI from source. However, you need to manually download the cub library and copy it to the corresponding directory.
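A sketch of that manual step, assuming cub 1.8.0; the destination directory below is an assumption, so verify it against the kernel sources in your checkout:

```shell
# Download and unpack a cub release (version is illustrative).
wget https://github.com/NVIDIA/cub/archive/refs/tags/1.8.0.zip
unzip 1.8.0.zip
# Copy the headers next to the CUDA kernel sources
# (path is an assumption; check your source tree).
cp -r cub-1.8.0/cub/ colossalai/kernel/cuda_native/csrc/kernels/include/
```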
If you want to build PyTorch extensions during installation, you can use the command below. Otherwise, the PyTorch extensions will be built during runtime.
```shell
BUILD_EXT=1 pip install colossalai
```
```shell
cd ColossalAI
pip install -r requirements/requirements.txt
# install colossalai
BUILD_EXT=1 pip install .
```
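To confirm the package installed correctly (a minimal check, assuming the package exposes `__version__`, as most do):

```shell
python -c "import colossalai; print(colossalai.__version__)"
```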
If you don't want to install and enable CUDA kernel fusion (compulsory when using the fused optimizer), just don't specify `BUILD_EXT`: