History

csric e355144375 [chatgpt] Detached PPO Training (#3195 ) * run the base * working on dist ppo * sync * detached trainer * update detached trainer. no maker update function * facing init problem * 1 maker 1 trainer detached run. but no model update * facing cuda problem * fix save functions * verified maker update * nothing * add ignore * analyize loss issue * remove some debug codes * facing 2m1t stuck issue * 2m1t verified * do not use torchrun * working on 2m2t * working on 2m2t * initialize strategy in ray actor env * facing actor's init order issue * facing ddp model update issue (need unwarp ddp) * unwrap ddp actor * checking 1m2t stuck problem * nothing * set timeout for trainer choosing. It solves the stuck problem! * delete some debug output * rename to sync with upstream * rename to sync with upstream * coati rename * nothing * I am going to detach the replaybuffer from trainer and make it a Ray Actor. Two benefits: 1. support TP trainer. 2. asynchronized buffer operations * experience_maker_holder performs target-revolving _send_experience() instead of length comparison. * move code to ray subfolder * working on pipeline inference * apply comments --------- Co-authored-by: csric <richcsr256@gmail.com>	2023-04-17 14:46:50 +08:00
..
Chat	[chatgpt] Detached PPO Training (#3195 )	2023-04-17 14:46:50 +08:00
README.md	[doc] hide diffusion in application path (#3519 )	2023-04-10 17:52:24 +08:00

[chatgpt] Detached PPO Training (#3195 )

* run the base

* working on dist ppo

* sync

* detached trainer

* update detached trainer. no maker update function

* facing init problem

* 1 maker 1 trainer detached run. but no model update

* facing cuda problem

* fix save functions

* verified maker update

* nothing

* add ignore

* analyize loss issue

* remove some debug codes

* facing 2m1t stuck issue

* 2m1t verified

* do not use torchrun

* working on 2m2t

* working on 2m2t

* initialize strategy in ray actor env

* facing actor's init order issue

* facing ddp model update issue (need unwarp ddp)

* unwrap ddp actor

* checking 1m2t stuck problem

* nothing

* set timeout for trainer choosing. It solves the stuck problem!

* delete some debug output

* rename to sync with upstream

* rename to sync with upstream

* coati rename

* nothing

* I am going to detach the replaybuffer from trainer and make it a Ray Actor. Two benefits: 1. support TP trainer. 2. asynchronized buffer operations

* experience_maker_holder performs target-revolving _send_experience() instead of length comparison.

* move code to ray subfolder

* working on pipeline inference

* apply comments

---------

Co-authored-by: csric <richcsr256@gmail.com>

2023-04-17 14:46:50 +08:00

Chat

[chatgpt] Detached PPO Training (#3195 )

2023-04-17 14:46:50 +08:00

README.md

[doc] hide diffusion in application path (#3519 )

2023-04-10 17:52:24 +08:00

README.md

Applications

This directory contains the applications that are powered by Colossal-AI.

The list of applications include:

Chatbot
FastFold: Optimizing AlphaFold (Biomedicine) Training and Inference on GPU Clusters

Please note that the Chatbot application is migrated from the original ChatGPT folder.

You can find more example code for base models and functions in the Examples directory.