552 Commits (d16671da75ccb6aecacb2a15826a6d34367373d1)

Author SHA1 Message Date
Jiarui Fang d16671da75
[Tensor] initialize the ColoOptimizer (#898) 3 years ago
Jiarui Fang 676f191532
[Tensor] activation is an attr of ColoTensor (#897) 3 years ago
Jiarui Fang e76f76c08b
[Tensor] test parameters() as member function (#896) 3 years ago
Ziyue Jiang cb182da7c5
[tensor] refine linear and add gather for laynorm (#893) 3 years ago
Jiarui Fang 26c49639d8
[Tensor] overriding paramters() for Module using ColoTensor (#889) 3 years ago
ver217 daf59ff72e
[setup] add local version label (#890) 3 years ago
Ziyue Jiang 1d0aba4153
[tensor] add ColoTensor 1Dcol (#888) 3 years ago
Jiarui Fang a0e5971692
[Tensor] test model check results for a simple net (#887) 3 years ago
Jiarui Fang 72cdc06875
[Tensor] make ColoTensor more robust for getattr (#886) 3 years ago
Ziyue Jiang 9bc5a77c31
[tensor] wrap function in the torch_tensor to ColoTensor (#881) 3 years ago
ver217 4df6471f5d
fix import error (#880) 3 years ago
Jiarui Fang 7f76517a85
[Tensor] make a simple net works with 1D row TP (#879) 3 years ago
ver217 c4d903e64a
[gemini] accelerate adjust_layout() (#878) 3 years ago
Jiarui Fang 909211453b
[Tensor] Add some attributes to ColoTensor (#877) 3 years ago
HELSON 425b4a96b8
[gemini] polish stateful_tensor_mgr (#876) 3 years ago
Jiarui Fang e43f83aa5c
[Tensor] get named parameters for model using ColoTensors (#874) 3 years ago
LuGY 2883040286
[example] change qkv processing (#870) 3 years ago
Jiarui Fang 96211c2cc8
[tensor] customized op returns ColoTensor (#875) 3 years ago
Ziyue Jiang 26d4ab8b03
[Tensor] Add function to spec and update linear 1Drow and unit tests (#869) 3 years ago
Frank Lee 11f54c7b6b
[doc] improved docstring and assertion messages for the engine module (#871) 3 years ago
Frank Lee 1c34382678
[doc] improved assertion messages in trainer (#873) 3 years ago
Frank Lee 7a64fae33a
[doc] improved error messages in initialize (#872) 3 years ago
Jiarui Fang 1190b2c4a4
[tensor] add cross_entrophy_loss (#868) 3 years ago
HELSON 3107817172
[gemini] add stateful tensor container (#867) 3 years ago
Jiarui Fang d01d3b8cb0
colo init context add device attr. (#866) 3 years ago
Frank Lee 2238758c2e
[usability] improved error messages in the context module (#856) 3 years ago
Frank Lee 9fdebadd69
[doc] improved docstring in the amp module (#857) 3 years ago
Frank Lee b862d89d00
[doc] improved docstring in the logging module (#861) 3 years ago
Frank Lee 8004c8e938
[doc] improved docstring in the communication module (#863) 3 years ago
Jiarui Fang 8af5f7423d
[tensor] an initial dea of tensor spec (#865) 3 years ago
Jiarui Fang 126ba573a8
[Tensor] add layer norm Op (#852) 3 years ago
Frank Lee a82da26f7e
[cli] refactored micro-benchmarking cli and added more metrics (#858) 3 years ago
Frank Lee ee222dfbf3
[usability] added assertion message in registry (#864) 3 years ago
HELSON f0e654558f
[gemini] polish code (#855) 3 years ago
Jiarui Fang 29159d9b5b
hotfix tensor unittest bugs (#862) 3 years ago
Frank Lee 1258af71cc
[ci] cache cuda extension (#860) 3 years ago
YuliangLiu0306 c6930d8ddf
[pipelinable]use ColoTensor to replace dummy tensor. (#853) 3 years ago
Ziyue Jiang bcc8655021
[Tensor ] Add 1Drow weight reshard by spec (#854) 3 years ago
ver217 d7e0303d1e
[zero] use GeminiMemoryManager when sampling model data (#850) 3 years ago
ver217 232142f402
[utils] refactor profiler (#837) 3 years ago
Jiarui Fang 62f059251b
[Tensor] init a tp network training unittest (#849) 3 years ago
ver217 0dea140760
[hotfix] add deconstructor for stateful tensor (#848) 3 years ago
ver217 0f7ed8c192
fix _post_init_method of zero init ctx (#847) 3 years ago
Ziyue Jiang 2a0a427e04
[tensor]add assert for colo_tensor 1Drow (#846) 3 years ago
Ziyue Jiang 05023ecfee
[Tensor] TP Linear 1D row (#843) 3 years ago
Frank Lee cf6d1c9284
[CLI] refactored the launch CLI and fixed bugs in multi-node launching (#844) 3 years ago
HELSON e5ea3fdeef
[gemini] add GeminiMemoryManger (#832) 3 years ago
YuliangLiu0306 35ea6e1023
[pipelinable]use pipelinable context to initialize non-pipeline model (#816) 3 years ago
Jiarui Fang ea0a2ed25f
[hotfix] the bug of numel() in ColoTensor (#845) 3 years ago
LuGY c1e8d2001e
modefied the pp build for ckpt adaptation (#803) 3 years ago