Making large AI models cheaper, faster and more accessible



Basic MNIST Example with Optional FP8 via TransformerEngine

TransformerEngine is a library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.

Thanks to NVIDIA for contributing this tutorial.

python main.py            # Baseline: standard PyTorch Linear layers
python main.py --use-te   # Linear layers from TransformerEngine
python main.py --use-fp8  # FP8 + TransformerEngine for Linear layers
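
The following is a minimal sketch of how the two flags might be wired up. It is not the contents of main.py. The helper names parse_flags, make_linear and forward_maybe_fp8 are hypothetical, and the rule that --use-fp8 also turns on TransformerEngine layers is an assumption based on the command comments above. The TransformerEngine calls (te.Linear, te.fp8_autocast, recipe.DelayedScaling) are imported lazily, so the flag parsing runs on a machine without a GPU or TransformerEngine installed.

```python
import argparse


def parse_flags(argv=None):
    """Parse the two optional flags shown in the commands above."""
    parser = argparse.ArgumentParser(
        description="MNIST with optional TransformerEngine (sketch)"
    )
    parser.add_argument("--use-te", action="store_true",
                        help="use Linear layers from TransformerEngine")
    parser.add_argument("--use-fp8", action="store_true",
                        help="run TransformerEngine Linear layers in FP8")
    args = parser.parse_args(argv)
    # Assumption: FP8 execution needs TransformerEngine layers,
    # so --use-fp8 implies --use-te.
    if args.use_fp8:
        args.use_te = True
    return args


def make_linear(in_features, out_features, use_te):
    """Return a TransformerEngine Linear layer or a plain PyTorch one."""
    if use_te:
        # Imported lazily: requires an NVIDIA GPU build of TransformerEngine.
        import transformer_engine.pytorch as te
        return te.Linear(in_features, out_features)
    import torch.nn as nn
    return nn.Linear(in_features, out_features)


def forward_maybe_fp8(layer, x, use_fp8):
    """Run one forward pass, inside an FP8 autocast region if requested."""
    if not use_fp8:
        return layer(x)
    import transformer_engine.pytorch as te
    from transformer_engine.common import recipe
    # Default delayed-scaling recipe; tune it for real workloads.
    fp8_recipe = recipe.DelayedScaling()
    with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
        return layer(x)
```

With this sketch, parse_flags(["--use-fp8"]) returns a namespace with both use_fp8 and use_te set. The FP8 path needs FP8-capable hardware such as the Hopper GPUs mentioned above, and tensor dimensions that meet TransformerEngine's FP8 alignment requirements.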

We are working to integrate TransformerEngine with Colossal-AI and expect to finish soon.