Frank Lee
cf6d1c9284
[CLI] refactored the launch CLI and fixed bugs in multi-node launching ( #844 )
...
* [cli] fixed multi-node job launching
* [cli] fixed a bug in version comparison
* [cli] support launching with env var
* [cli] fixed multi-node job launching
* [cli] fixed a bug in version comparison
* [cli] support launching with env var
* added docstring
* [cli] added extra launch arguments
* [cli] added default launch rdzv args
* [cli] fixed version comparison
* [cli] added docstring examples and requierment
* polish docstring
* polish code
* polish code
3 years ago
Frank Lee
01e9f834f5
[dependency] removed torchvision ( #833 )
...
* [dependency] removed torchvision
* fixed transforms
3 years ago
Frank Lee
05d9ae5999
[cli] add missing requirement ( #805 )
3 years ago
Frank Lee
6f7d1362c9
[doc] removed outdated installation command ( #730 )
3 years ago
ver217
70e8dd418b
[hotfix] update requirements-test ( #701 )
3 years ago
Jiarui Fang
54229cd33e
[log] better logging display with rich ( #426 )
...
* better logger using rich
* remove deepspeed in zero requirements
3 years ago
ver217
578ea0583b
update setup and workflow ( #222 )
3 years ago
BoxiangW
a2f1565672
Update GitHub action and pre-commit settings ( #196 )
...
* Update GitHub action and pre-commit settings
* Update GitHub action and pre-commit settings (#198 )
3 years ago
Frank Lee
3defa32aee
Support TP-compatible Torch AMP and Update trainer API ( #27 )
...
* Add gradient accumulation, fix lr scheduler
* fix FP16 optimizer and adapted torch amp with tensor parallel (#18 )
* fixed bugs in compatibility between torch amp and tensor parallel and performed some minor fixes
* fixed trainer
* Revert "fixed trainer"
This reverts commit 2e0b0b7699
.
* improved consistency between trainer, engine and schedule (#23 )
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: 1SAA <c2h214748@gmail.com>
Co-authored-by: ver217 <lhx0217@gmail.com>
3 years ago
zbian
404ecbdcc6
Migrated project
3 years ago