ColossalAI/colossalai/autochunk
oahzxl 05671fcb42
[autochunk] support multi outputs chunk search (#2538)
Support multi-output chunk search. Previously, only single-output chunk search was supported. The new search is more flexible and improves performance by a large margin. For transformers, it reduces memory by 40% compared with the previous search strategy.

1. rewrite the search strategy to support multi-output chunk search
2. fix many, many bugs
3. update tests
2023-02-01 13:18:51 +08:00
autochunk_codegen.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00
estimate_memory.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00
reorder_graph.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00
search_chunk.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00
select_chunk.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00
trace_flow.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00
trace_indice.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00
utils.py [autochunk] support multi outputs chunk search (#2538) 2023-02-01 13:18:51 +08:00