
Manually computing cross-entropy loss in PyTorch

I am trying to compute the cross-entropy loss manually in PyTorch for an encoder-decoder model.

I used the code posted here to compute it: Cross Entropy in PyTorch

I updated the code to discard padded tokens (-100). The final code is this:

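(The original snippet isn't preserved on this page; below is a minimal sketch of a masked manual cross-entropy along those lines. The function name manual_cross_entropy and the explicit loop are illustrative assumptions, with n_batch named as in the answer below.)

```python
import torch
import torch.nn.functional as F

def manual_cross_entropy(logits, targets, ignore_index=-100):
    # logits:  (n_batch, n_classes) raw scores
    # targets: (n_batch,) gold class indices, with ignore_index marking padding
    log_probs = F.log_softmax(logits, dim=-1)
    n_batch = logits.size(0)            # total number of positions, padded ones included
    total = 0.0
    for i in range(n_batch):
        if targets[i] == ignore_index:  # discard padded tokens
            continue
        total = total - log_probs[i, targets[i]]
    return total / n_batch              # average over all n_batch positions
```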

To verify that it works correctly, I tested it on a text generation task and computed the loss both with the torch.nn implementation and with this code.

The loss values are not identical:

using nn.CrossEntropyLoss:

[screenshot of the loss values]

Using the code from the link above:

[screenshot of the loss values]

Am I missing something?

I tried to look at the source code of nn.CrossEntropyLoss but wasn’t able to. In this link, nn/functional.py at line 2955, you will see that the function dispatches to another function, torch._C._nn.cross_entropy_loss; I can’t find that function in the repo.

Edit:

I noticed that the differences appear only when there are -100 tokens in the gold labels.

Demo example:

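(The original demo snippet isn't preserved either; here is a small reproduction in the same spirit, reusing the manual_cross_entropy sketch from above with made-up shapes and labels.)

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(4, 5)            # 4 token positions, 5 classes
gold = torch.tensor([1, 3, -100, 2])  # one padded position

print(nn.CrossEntropyLoss(ignore_index=-100)(logits, gold))  # mean over the 3 kept tokens
print(manual_cross_entropy(logits, gold))                    # sum over the 3 kept tokens / 4
```

The two printed values differ, which matches the behaviour described above.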

And when we don’t have -100 tokens:

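(Continuing the same reproduction, with no -100 labels this time:)

```python
gold = torch.tensor([1, 3, 0, 2])     # no padded positions
print(nn.CrossEntropyLoss(ignore_index=-100)(logits, gold))
print(manual_cross_entropy(logits, gold))                    # same value as the line above
```

Here both lines print the same value (up to floating-point rounding).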


Answer

I solved the problem by updating the code. I was already discarding the -100 tokens (the if-statement above), but I forgot to also reduce the token count used for averaging (called n_batch in the code above). After doing that, the loss numbers are identical to the nn.CrossEntropyLoss values. The final code:

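(As before, the exact snippet isn't preserved; below is a sketch of the corrected function, applying the fix just described and keeping the hypothetical names from the earlier sketch.)

```python
import torch
import torch.nn.functional as F

def manual_cross_entropy(logits, targets, ignore_index=-100):
    log_probs = F.log_softmax(logits, dim=-1)
    total = 0.0
    n_batch = 0                          # count only the tokens that are kept
    for i in range(logits.size(0)):
        if targets[i] == ignore_index:   # discard padded tokens
            continue
        total = total - log_probs[i, targets[i]]
        n_batch += 1                     # increment the count for kept tokens only
    return total / n_batch               # average over the kept tokens only
```

With the count reduced to the non-padded tokens, this matches nn.CrossEntropyLoss, whose default reduction='mean' averages over the non-ignored targets only.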