ColossalAI/colossalai/zero/gemini
botbw 3f7e3131d9
[gemini] optimize reduce scatter d2h copy (#5760)
* [gemini] optimize reduce scatter d2h copy

* [fix] fix missing reduce variable

* [refactor] remove legacy async reduce scatter code

* [gemini] missing sync

* Revert "[refactor] remove legacy async reduce scatter code"

This reverts commit 58ad76d466.

* [gemini] further optimize with async all reduce

* [fix] pass flag from manager to chunk
2024-06-05 14:23:13 +08:00
chunk                2024-06-05 14:23:13 +08:00  [gemini] optimize reduce scatter d2h copy (#5760)
memory_tracer        2024-01-09 10:20:05 +08:00  [npu] change device to accelerator api (#5239)
__init__.py          2023-11-28 16:54:42 +08:00  [shardformer]: support gpt-j, falcon, Mistral and add interleaved pipeline for bert (#5088)
gemini_ddp.py        2024-06-05 14:23:13 +08:00  [gemini] optimize reduce scatter d2h copy (#5760)
gemini_hook.py       2024-05-21 14:21:58 +08:00  [bug] fix early return (#5740)
gemini_mgr.py        2024-05-28 05:16:02 +00:00  [chore] remove unnecessary assert since compute list might not be recorded
gemini_optimizer.py  2024-05-24 10:31:16 +08:00  [gemini] async grad chunk reduce (all-reduce&reduce-scatter) (#5713)
placement_policy.py  2024-05-28 02:41:23 +00:00  [bug] continue fix
utils.py             2024-01-09 10:20:05 +08:00  [npu] change device to accelerator api (#5239)