Frank Lee
1beb85cc25
[checkpoint] refactored the API and added safetensors support ( #3427 )
* [checkpoint] refactored the API and added safetensors support
* polish code
2 years ago
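For context, a minimal sketch of safetensors-backed checkpointing in plain PyTorch; this illustrates the storage format the commit adopts, not the refactored ColossalAI checkpoint API itself.

```python
# Minimal sketch of safetensors checkpointing in plain PyTorch; the
# API refactored in this commit presumably wraps logic along these lines.
import torch
from safetensors.torch import save_file, load_file

model = torch.nn.Linear(4, 2)

# safetensors stores a flat {name: tensor} mapping and never unpickles
# arbitrary objects, which is the main safety win over torch.save.
save_file(model.state_dict(), "model.safetensors")

# Loading returns a plain dict of tensors to feed back into the model.
model.load_state_dict(load_file("model.safetensors"))
```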
アマデウス
e78a1e949a
fix torch 2.0 compatibility ( #3346 )
2 years ago
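The commit title does not say which API broke; one very common torch-2.0 breakage is the removal of torch._six, which is typically shimmed as in this hedged sketch.

```python
# Hedged sketch of a typical torch-2.0 compatibility shim; the actual
# fix in #3346 may target a different API.
from packaging import version
import torch

if version.parse(torch.__version__) >= version.parse("2.0.0"):
    # torch 2.0 removed torch._six; fall back to stdlib equivalents.
    string_classes = (str, bytes)
else:
    from torch._six import string_classes  # pre-2.0 location
```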
CsRic
052b03e83f
limit torch version ( #3213 )
Co-authored-by: csric <richcsr256@gmail.com>
2 years ago
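A version ceiling like this is usually a one-line constraint in the packaging metadata; a sketch with illustrative bounds (the exact range pinned in #3213 is not quoted here).

```python
# Illustrative setup.py fragment; the bounds shown are assumptions,
# not copied from the commit.
from setuptools import setup

setup(
    name="example",
    install_requires=[
        # Cap torch below the next major release until it is verified.
        "torch>=1.12,<2.0.0",
    ],
)
```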
Frank Lee
93fdd35b5e
[build] fixed the doc build process ( #2618 )
2 years ago
Jiarui Fang
bc0e271e71
[builder] use builder() for cpu adam and fused optim in setup.py ( #2187 )
2 years ago
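The builder() pattern the title refers to typically has each kernel expose a function returning a setuptools extension, so setup.py only collects them; a hedged sketch with illustrative names and source paths.

```python
# Hedged sketch of the builder() pattern; names and source paths are
# illustrative, not ColossalAI's actual layout.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension, CUDAExtension

def cpu_adam_builder():
    # CPU kernel compiled as a plain C++ extension
    return CppExtension(name="cpu_adam", sources=["csrc/cpu_adam.cpp"])

def fused_optim_builder():
    # GPU kernel compiled against the local CUDA toolkit
    return CUDAExtension(name="fused_optim", sources=["csrc/fused_optim.cu"])

setup(
    ext_modules=[cpu_adam_builder(), fused_optim_builder()],
    cmdclass={"build_ext": BuildExtension},
)
```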
Frank Lee
81e0da7fa8
[setup] supported conda-installed torch ( #2048 )
* [setup] supported conda-installed torch
* polish code
2 years ago
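Conda-installed torch ships its CUDA toolkit under the environment prefix rather than /usr/local/cuda, so setup scripts have to look there too; a hedged sketch of that detection (paths and logic are assumptions).

```python
# Hedged sketch of locating CUDA for a conda-installed torch; the
# actual detection in #2048 may differ.
import os

def find_cuda_home():
    cuda_home = os.environ.get("CUDA_HOME")
    if cuda_home is None and "CONDA_PREFIX" in os.environ:
        # conda's cudatoolkit places headers under the env prefix
        candidate = os.environ["CONDA_PREFIX"]
        if os.path.exists(os.path.join(candidate, "include", "cuda.h")):
            cuda_home = candidate
    return cuda_home
```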
Jiarui Fang
504419d261
[FAW] add cache manager for the cached embedding ( #1419 )
2 years ago
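A cache manager for a cached embedding keeps hot rows on device while the full table stays on host; a hedged LRU sketch in that spirit (class and field names are illustrative, and a CUDA device is assumed).

```python
# Hedged LRU sketch in the spirit of a cached-embedding manager; all
# names are illustrative and a CUDA device is assumed.
from collections import OrderedDict
import torch

class EmbeddingRowCache:
    def __init__(self, cpu_weight: torch.Tensor, capacity: int):
        self.cpu_weight = cpu_weight   # full embedding table on host
        self.capacity = capacity       # max rows resident on device
        self.cache: "OrderedDict[int, torch.Tensor]" = OrderedDict()

    def lookup(self, idx: int) -> torch.Tensor:
        if idx in self.cache:
            self.cache.move_to_end(idx)          # hit: refresh LRU order
            return self.cache[idx]
        row = self.cpu_weight[idx].to("cuda")    # miss: fetch from host
        if len(self.cache) >= self.capacity:
            self.cache.popitem(last=False)       # evict least-recently used
        self.cache[idx] = row
        return row
```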
Frank Lee
cf6d1c9284
[CLI] refactored the launch CLI and fixed bugs in multi-node launching ( #844 )
* [cli] fixed multi-node job launching
* [cli] fixed a bug in version comparison
* [cli] support launching with env var
* added docstring
* [cli] added extra launch arguments
* [cli] added default launch rdzv args
* [cli] fixed version comparison
* [cli] added docstring examples and requirement
* polish docstring
* polish code
* polish code
3 years ago
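The "default launch rdzv args" bullet refers to rendezvous settings for elastic multi-node launching; a hedged sketch of assembling a torchrun-style command with such defaults (flag values are illustrative, not the CLI's actual defaults).

```python
# Hedged sketch of a launcher that fills in default rendezvous args;
# values are illustrative.
import subprocess

def launch(nnodes: int, nproc_per_node: int, script: str) -> None:
    cmd = [
        "torchrun",
        f"--nnodes={nnodes}",
        f"--nproc_per_node={nproc_per_node}",
        "--rdzv_backend=c10d",              # default rendezvous backend
        "--rdzv_endpoint=localhost:29500",  # default rendezvous endpoint
        script,
    ]
    subprocess.run(cmd, check=True)
```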
Frank Lee
01e9f834f5
[dependency] removed torchvision ( #833 )
* [dependency] removed torchvision
* fixed transforms
3 years ago
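Dropping the torchvision dependency means re-expressing any transforms it supplied with plain tensor ops; a hedged sketch of a Normalize equivalent (the transforms actually replaced in #833 are not named in the commit message).

```python
# Hedged sketch of a torchvision-free Normalize; only an example, since
# the commit does not list which transforms it fixed.
import torch

def normalize(img: torch.Tensor, mean, std) -> torch.Tensor:
    mean = torch.as_tensor(mean, dtype=img.dtype).view(-1, 1, 1)
    std = torch.as_tensor(std, dtype=img.dtype).view(-1, 1, 1)
    return (img - mean) / std
```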
Frank Lee
05d9ae5999
[cli] add missing requirement ( #805 )
3 years ago
Jiarui Fang
54229cd33e
[log] better logging display with rich ( #426 )
* better logger using rich
* remove deepspeed in zero requirements
3 years ago
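Rich-based display usually means routing the stdlib logger through rich's handler; a minimal sketch of the core wiring (the project's own logger wrapper adds more on top).

```python
# Minimal sketch of rich-backed logging; logger name is illustrative.
import logging
from rich.logging import RichHandler

logging.basicConfig(
    level=logging.INFO,
    format="%(message)s",
    handlers=[RichHandler()],  # colored, nicely formatted records
)
logging.getLogger("example").info("hello from rich")
```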
BoxiangW
a2f1565672
Update GitHub action and pre-commit settings ( #196 )
* Update GitHub action and pre-commit settings
* Update GitHub action and pre-commit settings (#198)
3 years ago
Frank Lee
3defa32aee
Support TP-compatible Torch AMP and Update trainer API ( #27 )
* Add gradient accumulation, fix lr scheduler
* fix FP16 optimizer and adapted torch amp with tensor parallel (#18)
* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
* fixed trainer
* Revert "fixed trainer"
This reverts commit 2e0b0b7699.
* improved consistency between trainer, engine and schedule (#23)
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: ver217 <lhx0217@gmail.com>
3 years ago
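Plain torch AMP, which this commit makes tensor-parallel compatible, follows the autocast-plus-GradScaler pattern; a single-GPU sketch (the TP integration itself is ColossalAI-specific and not shown).

```python
# Single-GPU sketch of torch-native AMP; the tensor-parallel plumbing
# added in #27 is not reproduced here.
import torch

model = torch.nn.Linear(8, 8).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():   # forward in reduced precision
        loss = model(torch.randn(4, 8, device="cuda")).sum()
    scaler.scale(loss).backward()     # scale loss to avoid fp16 underflow
    scaler.step(optimizer)            # unscales grads, then steps
    scaler.update()                   # adapt the scale factor
```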
zbian
404ecbdcc6
Migrated project
3 years ago