Commit Graph

72 Commits (30412866e0c6860de0787cd9c5e9a5bfffdc712c)

Author SHA1 Message Date
Jiarui Fang 5bec3b2168
[Gemini] open grad checkpoint when model building (#1984)
2 years ago
Jiarui Fang 3712ac7f90
[Gemini] add bert for MemtracerWrapper unintests (#1982)
2 years ago
Jiarui Fang e481489aa6
[Gemini] MemtracerWrapper unittests (#1981)
2 years ago
Jiarui Fang f7e276fa71
[Gemini] add GeminiAdamOptimizer (#1960)
2 years ago
HELSON c6a1a62636
[hotfix] fix zero's incompatibility with checkpoint in torch-1.12 (#1786)
2 years ago
HELSON f69f9bf223
[zero] add chunk init function for users (#1729)
2 years ago
HELSON 1468e4bcfc
[zero] add constant placement policy (#1705)
2 years ago
HELSON b28991dd0a
[feature] A new ZeRO implementation (#1644)
2 years ago
Jiarui Fang c5d39215f6
Revert "[feature] new zero implementation (#1623)" (#1643)
2 years ago
HELSON 5be118f405
[feature] new zero implementation (#1623)
2 years ago
HELSON b80340168e
[zero] add chunk_managerV2 for all-gather chunk (#1441)
2 years ago
HELSON 9056677b13
[zero] add chunk size searching algorithm for parameters in different groups (#1436)
2 years ago
HELSON 039b7ed3bc
[polish] add update directory in gemini; rename AgChunk to ChunkV2 (#1432)
2 years ago
HELSON 0d212183c4
[zero] add has_inf_or_nan in AgChunk; enhance the unit test of AgChunk (#1426)
2 years ago
HELSON 4fb3c52cf0
[zero] add unit test for AgChunk's append, close, access (#1423)
2 years ago
Jiarui Fang bd71e2a88b
[hotfix] add missing file (#1308)
2 years ago
ver217 c4d903e64a
[gemini] accelerate adjust_layout() (#878)
3 years ago
HELSON 3107817172
[gemini] add stateful tensor container (#867)
3 years ago
HELSON e5ea3fdeef
[gemini] add GeminiMemoryManger (#832)
3 years ago
Jiarui Fang 0ce8924ceb
[tensor] reorganize files (#820)
3 years ago
Jiarui Fang ab962b9735
[gemini] a new tensor structure (#818)
3 years ago
Jiarui Fang 4d9332b4c5
[refactor] moving memtracer to gemini (#801)
3 years ago