ColossalAI/examples/inference/stable_diffusion
Runyu Lu bcf0181ecd
[Feat] Distrifusion Acceleration Support for Diffusion Inference (#5895)
2024-07-30 10:43:26 +08:00

README.md

File Structure

|- sd3_generation.py: an example of how to use the ColossalAI Inference Engine to generate images by loading a diffusion model
|- compute_metric.py: compares the quality of images generated with and without an acceleration method such as Distrifusion
|- benchmark_sd3.py: benchmarks the performance of our InferenceEngine
|- run_benchmark.sh: runs the benchmark command

Note: compute_metric.py requires extra dependencies; install them with pip install -r requirements.txt (requirements.txt is located in examples/inference/stable_diffusion/).
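The exact metric computed by compute_metric.py is not detailed here; as a toy illustration of this kind of image-quality comparison, a minimal PSNR check between two image arrays might look like the following (the psnr helper is a hypothetical sketch, not the script's actual API):

```python
import numpy as np

def psnr(img_a: np.ndarray, img_b: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between two images; higher means closer."""
    mse = np.mean((img_a.astype(np.float64) - img_b.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# compare a baseline image against an accelerated-inference image
baseline = np.zeros((64, 64, 3), dtype=np.uint8)
accelerated = baseline.copy()
accelerated[0, 0, 0] = 10  # one slightly different pixel
print(psnr(baseline, accelerated))
```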

Run Inference

The provided example sd3_generation.py shows how to configure the engine, initialize it, and run inference on a given model. We've added DiffusionPipeline as a supported model class, so the script works out of the box for inference with Stable Diffusion 3.

For a basic setting, you could run the example by:

colossalai run --nproc_per_node 1 sd3_generation.py -m PATH_MODEL -p "hello world"

To run multi-GPU inference (Patched Parallelism), use 2 GPUs as in the following example:

colossalai run --nproc_per_node 2 sd3_generation.py -m PATH_MODEL
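Patched Parallelism (as in DistriFusion) splits the spatial dimensions of the latent across GPUs so that each rank denoises its own patch, with communication to exchange boundary activations. A toy single-process numpy sketch of just the split/gather step is shown below; the helper names are illustrative and are not ColossalAI's API:

```python
import numpy as np

def split_patches(latent: np.ndarray, world_size: int, axis: int = 2):
    # each "rank" receives a contiguous slice along the height dimension
    return np.array_split(latent, world_size, axis=axis)

def gather_patches(patches, axis: int = 2) -> np.ndarray:
    # all-gather the per-rank patches back into a full latent
    return np.concatenate(patches, axis=axis)

latent = np.random.rand(1, 16, 64, 64)  # (batch, channels, height, width)
patches = split_patches(latent, world_size=2)
restored = gather_patches(patches)
assert restored.shape == latent.shape
```

In the real implementation, the per-patch denoising runs concurrently on each GPU and the halo exchange overlaps communication with computation; this sketch only shows the data layout.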