* [autoparallel] integrate device mesh initialization into autoparallelize
* add Megatron solution
* update GPT autoparallel examples with latest API
* adapt beta value to fit the current computation cost
* [device] alpha-beta profiler
* add usage
* fix variable name