* first v of vit shardformer
* keep vit
* update
* vit shard add vitattention vitlayer
* update num head shard para
* finish test for vit
* add new_model_class & postprocess
* add vit readme
* delete old files & fix the conflict
* fix sth
* add layernorm to bert
* add layernorm test
* add layernorm test with load state dict
* add use_mixedfusedLN in shard config
* refactor policy to support fused_layernorm
* add dropout layer, add dropout test
* modify seed manager as context manager
* add a copy of col_nn.layer
* add dist_crossentropy loss; separate module test
* polish the code
* fix dist crossentropy loss
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example
* add share weight and train example
* add train
* add docstring and readme
* add docstring for other files
* pre-commit
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example
* add dropout layer, add dropout test
* modify seed manager as context manager
* add a copy of col_nn.layer
* add dist_crossentropy loss; separate module test
* polish the code
* fix dist crossentropy loss
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example
* add share weight and train example
* add train
* add docstring and readme
* add docstring for other files
* pre-commit
* init shardformer code structure
* add implement of sharder (inject and replace)
* add implement of replace layer to colossal layer
* separate different layer policy, add some notion
* implement 1d and 2d slicer, can tell col or row
* fix bug when slicing and inject model
* fix some bug; add inference test example