Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>
* [autoparallel] integrate device mesh initialization into autoparallelize
* add megatron solution
* update gpt autoparallel examples with latest api
* adapt beta value to fit the current computation cost