* clean requirements
* modify example inference struct
* add test ci scripts
* mark test_infer as submodule
* rm deprecated cls & deps
* import of HAS_FLASH_ATTN
* prune inference tests to be run
* prune triton kernel tests
* increment pytest timeout mins
* revert import path in openmoe
* Adapted to the baichuan2-7B model
* modified according to the review comments.
* Modified the method of obtaining random weights.
* modified according to the review comments.
* change mlp layewr 'NOTE'