InternLM/internlm/monitor
jiaopenglong 3418898cbe
fix(alert): send exception of all ranks (#491)
* catch exception of all ranks

* monitor task only if DO_ALERT is True
2023-11-10 19:04:31 +08:00
__init__.py feat(monitor): add light monitor (#275) 2023-09-05 19:24:01 +08:00
alert.py init light monitoring on all ranks (#462) 2023-11-09 20:04:21 +08:00
monitor.py fix(alert): send exception of all ranks (#491) 2023-11-10 19:04:31 +08:00
utils.py feat(monitor): send exception to light monitor (#420) 2023-10-18 21:00:21 +08:00