* Added CPU Adam
* finished the cpu adam
* updated the license
* delete useless parameters, removed resnet
* modified the method off cpu adam unittest
* deleted some useless codes
* removed useless codes
Co-authored-by: ver217 <lhx0217@gmail.com>
Co-authored-by: Frank Lee <somerlee.9@gmail.com>
Co-authored-by: jiaruifang <fangjiarui123@gmail.com>
added branch context;
added vocab parallel layers;
moved split_batch from load_batch to tensor parallel embedding layers;
updated gpt model;
updated unit test cases;
fixed few collective communicator bugs