ColossalAI/examples/inference/stable_diffusion
Runyu Lu bcf0181ecd
[Feat] Distrifusion Acceleration Support for Diffusion Inference (#5895)
2024-07-30 10:43:26 +08:00

README.md

File Structure

|- sd3_generation.py: an example of how to use the ColossalAI Inference Engine to generate images by loading a diffusion model
|- compute_metric.py: compares the quality of images generated with and without an acceleration method such as Distrifusion
|- benchmark_sd3.py: benchmarks the performance of our InferenceEngine
|- run_benchmark.sh: runs the benchmark command

Note: compute_metric.py requires extra dependencies; install them with pip install -r requirements.txt (requirements.txt is located in examples/inference/stable_diffusion/).
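The exact metric computed by compute_metric.py is not detailed here; as a toy illustration of this kind of image-quality comparison, a minimal PSNR check between two image arrays might look like the following (the psnr helper is a hypothetical sketch, not the script's actual API):

```python
import numpy as np

def psnr(img_a: np.ndarray, img_b: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two images; higher means closer."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# compare a baseline image against an accelerated-inference image
baseline = np.zeros((64, 64, 3), dtype=np.uint8)
accelerated = baseline.copy()
accelerated[0, 0, 0] = 10  # one slightly different pixel
print(psnr(baseline, accelerated))
```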

Run Inference

The provided example sd3_generation.py shows how to configure the engine, initialize it, and run inference on a given model. We've added DiffusionPipeline as a supported model class, so the script works out of the box for inference with Stable Diffusion 3.

For a basic setting, you could run the example by:

colossalai run --nproc_per_node 1 sd3_generation.py -m PATH_MODEL -p "hello world"

To run multi-GPU inference (Patched Parallelism), use 2 GPUs as in the following example:

colossalai run --nproc_per_node 2 sd3_generation.py -m PATH_MODEL
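Patched Parallelism (as in DistriFusion) splits the spatial dimensions of the latent across GPUs so that each rank denoises its own patch, with communication to exchange boundary activations. A toy single-process numpy sketch of just the split/gather step is shown below; the helper names are illustrative and are not ColossalAI's API:

```python
import numpy as np

def split_patches(latent: np.ndarray, world_size: int, axis: int = 2):
    # each "rank" receives a contiguous slice along the height dimension
    return np.array_split(latent, world_size, axis=axis)

def gather_patches(patches, axis: int = 2) -> np.ndarray:
    # all-gather the per-rank patches back into a full latent
    return np.concatenate(patches, axis=axis)

latent = np.random.rand(1, 16, 64, 64)  # (batch, channels, height, width)
patches = split_patches(latent, world_size=2)
restored = gather_patches(patches)
assert restored.shape == latent.shape
```

In the real implementation, the per-patch denoising runs concurrently on each GPU and the halo exchange overlaps communication with computation; this sketch only shows the data layout.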