Making large AI models cheaper, faster and more accessible
Jiarui Fang 9f4fb3f28a
[ColoTensor] ColoInitContext initialize parameters in shard mode. (#1937)
2 years ago
components_to_test [NFC] polish test component gpt code style (#1567) 2 years ago
test_amp [amp] add torch amp test (#1860) 2 years ago
test_auto_parallel [autoparallel] fix linear logical convert issue (#1857) 2 years ago
test_comm [communication] add p2p_v2.py to support communication with List[Any] (#1407) 2 years ago
test_config [pipeline] refactor the pipeline module (#1087) 2 years ago
test_context [test] refactored with the new rerun decorator (#763) 3 years ago
test_data [unittest] refactored unit tests for change in dependency (#838) 3 years ago
test_data_pipeline_tensor_parallel [engin/schedule] use p2p_v2 to recontruct pipeline_schedule (#1408) 2 years ago
test_ddp [zero] add chunk init function for users (#1729) 2 years ago
test_device [tensor] support runtime ShardingSpec apply (#1453) 2 years ago
test_engine [hotfix] remove potiential circle import (#1307) 2 years ago
test_fx [hotfix] pass test_complete_workflow (#1877) 2 years ago
test_gemini [hotfix] fix zero's incompatibility with checkpoint in torch-1.12 (#1786) 2 years ago
test_layers [inference] overlap comm and compute in Linear1D_Row when stream_chunk_num > 1 (#1876) 2 years ago
test_moe [moe] initialize MoE groups by ProcessGroup (#1640) 2 years ago
test_ops [FAW] export FAW in _ops (#1438) 2 years ago
test_optimizer [hotfix] fix CPUAdam kernel nullptr (#1410) 2 years ago
test_pipeline [fx/meta/rpc] move _meta_registration.py to fx folder / register fx functions with compatibility checks / remove color debug (#1710) 2 years ago
test_tensor [ColoTensor] ColoInitContext initialize parameters in shard mode. (#1937) 2 years ago
test_trainer [pipeline] refactor the pipeline module (#1087) 2 years ago
test_utils [CheckpointIO] a uniform checkpoint I/O module (#1689) 2 years ago
test_zero [zero] migrate zero1&2 (#1878) 2 years ago
__init__.py [zero] Update sharded model v2 using sharded param v2 (#323) 3 years ago