Commit Graph

  • 3d3b31298f Merge branch 'develop' of https://github.com/InternLM/InternLM into improve_unitest lijiaxing 2023-11-16 14:09:13 +0800
  • e8cf27b8c0 Feat(QA): Check init model weights (#502) jiaxingli 2023-11-16 11:03:19 +0800
  • dd5fbd2edf check_init lijiaxing 2023-11-16 10:15:49 +0800
  • e05751e0c6 check_init lijiaxing 2023-11-15 17:37:30 +0800
  • 721d64d097 check_init lijiaxing 2023-11-15 17:26:52 +0800
  • 1f33c211bb check_init lijiaxing 2023-11-15 17:14:35 +0800
  • a1fd877828 fix(train.py): clear memory pool before optim step huangting4201 2023-11-15 14:40:06 +0800
  • aad3288d0b feat(train): fix/support_rampup_batch_size 877825076@qq.com 2023-11-10 17:37:54 +0800
  • b65e8cb802 init_seed lijiaxing 2023-11-14 17:29:06 +0800
  • be5b9ea2fa feat(train): update get_train_data_loader to make logic clearer (#498) YWMditto 2023-11-14 17:05:15 +0800
  • e5a1ff2336 update get_train_data_loader, del old doc YWMditto 2023-11-14 16:36:18 +0800
  • f656ff08a6 update get_train_data_loader YWMditto 2023-11-14 15:39:43 +0800
  • 3c07423151 feat(model/overlap_handler.py): release weight huangting4201 2023-11-14 11:30:26 +0800
  • 0c94e429bb bind seed lijiaxing 2023-11-14 10:14:27 +0800
  • 74754397df feat(model/overlap_handler.py): add memory_pool switch and refactor overlap handler huangting4201 2023-11-13 21:09:59 +0800
  • c53667d70c bind seed lijiaxing 2023-11-13 20:03:19 +0800
  • 343732b4f9 unify time format JiaoPL 2023-11-13 18:53:57 +0800
  • 2b984ffa58 test(workflow): add ci workflow for acc test (#485) kkscilife 2023-11-13 18:04:01 +0800
  • 815678f73a larger initialize timeout JiaoPL 2023-11-13 15:59:17 +0800
  • 626ed0fc5e fix(train): unify the exp paths (#492) jiaopenglong 2023-11-11 20:15:59 +0800
  • 3418898cbe fix(alert): send exception of all ranks (#491) jiaopenglong 2023-11-10 19:04:31 +0800
  • 48aff41716 Unify the paths for logs, tensorboard and trace. JiaoPL 2023-11-10 18:31:00 +0800
  • d680876a9a monitor task only if DO_ALERT is True JiaoPL 2023-11-10 16:39:59 +0800
  • 1180f5e618 add --kill-on-bad-exit=1 and change always to !cancelled wangmengke 2023-11-10 16:23:45 +0800
  • a8ff9acbfd Merge branch 'develop' into fix/send_exception_of_all_ranks JiaoPL 2023-11-10 16:02:30 +0800
  • 3f4c3bd94f catch exception of all ranks JiaoPL 2023-11-10 16:01:40 +0800
  • 8ada074cfd fix(docs): fix 20B demo log (#490) huangting4201 2023-11-10 15:57:11 +0800
  • a09b4d7d00 feat(docs): fix demo log huangting4201 2023-11-10 15:21:04 +0800
  • ec614497c9 feat(docs): fix demo log huangting4201 2023-11-10 15:18:28 +0800
  • 07026d1821 fix dataset types when using random dataset (#489) Yang Gao 2023-11-10 15:08:22 +0800
  • a399d74363 fix dataset types when using random dataset gaoyang07 2023-11-10 14:59:52 +0800
  • 5d3242027a docs(code-docs): add 20b training demo (#488) huangting4201 2023-11-10 14:00:27 +0800
  • 162eaa4287 feat(docs): change 30B demo to 20B huangting4201 2023-11-10 12:02:24 +0800
  • 0f2a39ec93 feat(docs): change 30B demo to 20B huangting4201 2023-11-10 11:50:35 +0800
  • b7ecdba617 feat(ckpt): save ckpt when reach total step count (#486) Guoteng 2023-11-09 21:07:16 +0800
  • 8fdc3f025f feat(ckpt): save ckpt when reach total step count 877825076@qq.com 2023-11-09 20:49:40 +0800
  • 5b67db33d0 fix(metric): use float32 to compute ppl (#481) Pryest 2023-11-09 20:26:46 +0800
  • a435980e0c rename vars (#468) jiaopenglong 2023-11-09 20:04:35 +0800
  • 0763bf3972 init light monitoring on all ranks (#462) jiaopenglong 2023-11-09 20:04:21 +0800
  • 0218e3131c feat(tools): support origin internlm architecture in web_demo (#478) YWMditto 2023-11-09 20:01:55 +0800
  • acbf84f353 change train script wangmengke 2023-11-09 18:29:26 +0800
  • 40b65d0553 add ci workflow for acc test wangmengke 2023-11-09 18:24:16 +0800
  • 5defb9e597 unitest_only_forward lijiaxing 2023-11-09 18:07:02 +0800
  • a058258fc0 fix some info YWMditto 2023-11-09 17:01:50 +0800
  • 8f8fe84c03 fix some info YWMditto 2023-11-09 16:58:27 +0800
  • 18bd6429f5 del private info in load_internlm_model.py YWMditto 2023-11-09 16:44:28 +0800
  • 3dab742b75 update tools/load_internlm_model YWMditto 2023-11-09 16:39:23 +0800
  • 7a462c7d3f update apis/inference.py YWMditto 2023-11-09 16:33:20 +0800
  • 47c82aa223 update apis/inference.py YWMditto 2023-11-09 16:17:32 +0800
  • bd7e501b69 Feat(QA): Check model weights for acc (#476) jiaxingli 2023-11-09 16:16:29 +0800
  • 745fb33ca5 fix(metric): fix metric behavior when ppl exceeds float16 Pryest 2023-11-09 16:01:27 +0800
  • a38af602bc feat(doc): add torch_dtype to examples in README (#479) x54-729 2023-11-09 15:58:58 +0800
  • 79e84fade3 feat(doc): add dynamic ntk example (#480) YWMditto 2023-11-09 13:12:38 +0800
  • 0fb8dbab3a update InternLM/tools/load_internlm_model.py YWMditto 2023-11-07 23:11:39 +0800
  • 1706ae2eaa fix(tools): set bos, eos, pad in convert2hf to fix improper generation (#471) x54-729 2023-11-07 23:10:06 +0800
  • 1a97169fab update web_demo.py YWMditto 2023-11-07 23:07:27 +0800
  • eaaa749328 add dynamic ntk compare example YWMditto 2023-11-07 23:01:56 +0800
  • 49f386238c add dynamic ntk compare example YWMditto 2023-11-07 23:01:27 +0800
  • a3946235b2 update readme.md YWMditto 2023-11-07 20:42:07 +0800
  • 7b831a6776 typo x54-729 2023-11-07 20:23:14 +0800
  • 08b6567ab5 add torch_dtype to README examples x54-729 2023-11-07 20:17:03 +0800
  • 7efd96502a support web_demo_internlm YWMditto 2023-11-07 19:57:12 +0800
  • 6f69bd2087 feat(data): walk folder to get dataset_type_ids_map (#477) Yang Gao 2023-11-07 19:21:10 +0800
  • 2f1812e8c7 fix a bug gaoyang07 2023-11-07 17:38:46 +0800
  • 61f953bb7b walk folder to get dataset_type_ids_map gaoyang07 2023-11-07 17:13:45 +0800
  • 8c8883367a check_weights lijiaxing 2023-11-07 15:34:34 +0800
  • 25604ed040 check_weights lijiaxing 2023-11-07 14:55:55 +0800
  • 4d1b1cd5f1 fix(data): broadcast list when walking folders (#475) Yang Gao 2023-11-07 13:12:35 +0800
  • 535b0f795e broadcast list when walking folders gaoyang07 2023-11-06 23:22:04 +0800
  • 095ebfff9d feat(tools): support dynamic ntk rope in transformers (#470) YWMditto 2023-11-06 23:15:06 +0800
  • ec88e35306 add rotary config in configuration_internlm.py YWMditto 2023-11-06 20:30:34 +0800
  • 94fdd178ba debug for web_demo_internlm YWMditto 2023-11-06 19:54:45 +0800
  • 42ad9cc786 fix(readme): fix model path in readme (#474) x54-729 2023-11-06 19:26:48 +0800
  • 8038a8985e fix model path in readme x54-729 2023-11-06 18:34:34 +0800
  • b5e4d04a9a fix conflicts yingtongxiong 2023-11-06 12:08:31 +0800
  • b80e6cdcf3 merge origin yingtongxiong 2023-11-06 12:05:53 +0800
  • 7c6d2936b3 reset the sp allreduce in optimizer yingtongxiong 2023-11-06 12:04:01 +0800
  • c517ec5b8c feat(model/overlap_handler.py): delete reduce_scatter_overlap switch huangting4201 2023-11-06 11:57:14 +0800
  • 9b1265c591 modify the sp allreduce and support tf32 for fstp linear yingtongxiong 2023-11-06 10:45:08 +0800
  • 3a1dd36d05 set pos eos pad in convert2hf to fix improper generation x54-729 2023-11-03 22:30:23 +0800
  • 845cccd756 add rope doc YWMditto 2023-11-03 17:19:37 +0800
  • f2d9b63545 support dynamic ntk in transformers YWMditto 2023-11-03 16:46:14 +0800
  • c196825551 support dynamic ntk in transformers YWMditto 2023-11-03 16:41:55 +0800
  • 139b754f29 support dynamic ntk in transformers YWMditto 2023-11-03 16:34:35 +0800
  • 42dfbbebb3 Set bos eos pad in convert2hf to fix improper generation x54-729 2023-11-03 16:26:09 +0800
  • b9c813a972 fix(tools): fix streaming_chat and update docs (#467) x54-729 2023-11-03 16:12:37 +0800
  • d9bf269402 fix huggingface url in readme x54-729 2023-11-03 15:44:27 +0800
  • 7e123f1e00 rename vars JiaoPL 2023-11-03 15:22:11 +0800
  • debb7e77b9 refactor grad norm profiling (#466) jiaopenglong 2023-11-03 10:55:26 +0800
  • d537e45456 send exception to light monitor only if the server is available (#465) jiaopenglong 2023-11-03 10:55:16 +0800
  • cccd216977 fix import of tools/tokenizer.py x54-729 2023-11-02 23:17:30 +0800
  • e5bdfd1892 Add hf link x54-729 2023-11-02 22:12:41 +0800
  • 12442cbf47 Add __init__ to internlm_model x54-729 2023-11-02 22:07:53 +0800
  • 6b2ea75ca9 fix import x54-729 2023-11-02 22:05:54 +0800
  • 3418427083 Add stream_chat example x54-729 2023-11-02 22:05:09 +0800
  • a61bbd84a2 fix stream_chat x54-729 2023-11-02 22:04:50 +0800
  • 3f4ec9bacf move hf model to tools/transformers/internlm_model x54-729 2023-11-02 22:04:22 +0800
  • fce20c9221 refactor grad norm profiling JiaoPL 2023-11-02 18:38:02 +0800
  • 5a18b3b651 fix(model/overlap_handler.py): fix last block hook when pp with activation huangting4201 2023-11-02 16:05:07 +0800
  • efa2b618d1 send exception to light monitor only if the server is available JiaoPL 2023-11-02 14:38:03 +0800