ColossalAI/colossalai/nn/loss/loss_3d.py

from colossalai.constants import INPUT_GROUP_3D, WEIGHT_GROUP_3D
from colossalai.nn.layer.parallel_3d import reduce_by_batch_3d
from colossalai.nn.layer.parallel_3d._utils import get_parallel_mode_from_env
from colossalai.registry import LOSSES
from torch.nn.functional import cross_entropy
from torch.nn.modules.loss import _Loss

@LOSSES.register_module
class CrossEntropyLoss3D(_Loss):
    """
    Cross entropy loss for 3D parallelism

    :param depth: depth for 3D parallelism
    :type depth: int
    :param reduction: whether to average the loss, defaults to True
    :type reduction: bool, optional
    """
    def __init__(self, reduction=True, *args, **kwargs):
        super().__init__()
        self.input_parallel_mode = get_parallel_mode_from_env(INPUT_GROUP_3D)
        self.weight_parallel_mode = get_parallel_mode_from_env(WEIGHT_GROUP_3D)
        self.reduction_mean = reduction
        self.loss_args = args
        self.loss_kwargs = kwargs

    def forward(self, logits, targets):
        # Compute per-sample losses; the mean and the cross-rank reduction are applied below.
        loss = cross_entropy(logits, targets, *self.loss_args, reduction='none', **self.loss_kwargs)
        if self.reduction_mean:
            loss = loss.mean()
            loss = reduce_by_batch_3d.apply(loss, self.input_parallel_mode, self.weight_parallel_mode, True)
        return loss
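
The forward pass above computes per-sample losses, averages them locally, and then reduces the result across the 3D input and weight groups. The pure-Python sketch below illustrates why that ordering gives the global batch mean. It rests on two assumptions not confirmed by this file: that the final `True` passed to `reduce_by_batch_3d.apply` requests an average across ranks, and that every rank holds an equally sized shard of the batch. The helper names (`cross_entropy_per_sample`, `local_mean_loss`, `simulated_batch_reduce_mean`) are hypothetical and are not part of ColossalAI.

```python
import math


def cross_entropy_per_sample(logits, targets):
    """Per-sample cross entropy, mirroring reduction='none'."""
    losses = []
    for row, target in zip(logits, targets):
        # Numerically stable log-sum-exp over the class dimension.
        row_max = max(row)
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        losses.append(log_sum_exp - row[target])
    return losses


def local_mean_loss(logits, targets):
    """Mean loss over the samples held by a single rank."""
    losses = cross_entropy_per_sample(logits, targets)
    return sum(losses) / len(losses)


def simulated_batch_reduce_mean(local_values):
    """Hypothetical stand-in for an all-reduce followed by averaging over ranks."""
    return sum(local_values) / len(local_values)


# Two hypothetical ranks, each holding an equally sized shard of the batch.
shards = [
    ([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]], [0, 2]),
    ([[1.0, 1.0, 1.0], [-0.5, 2.5, 0.0]], [1, 1]),
]

local_losses = [local_mean_loss(logits, targets) for logits, targets in shards]
global_loss = simulated_batch_reduce_mean(local_losses)

# Reference: the mean over the full, unsharded batch.
all_logits = [row for logits, _ in shards for row in logits]
all_targets = [t for _, targets in shards for t in targets]
reference = local_mean_loss(all_logits, all_targets)
```

With equal shard sizes, `global_loss` matches `reference` up to floating-point rounding. If shards differed in size, averaging the local means would weight samples unevenly, so this sketch does not cover that case.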