PyTorch – AssertionError: Size mismatch between tensors

I am trying to adapt a PyTorch script that was written for linear regression. It originally took a set of random values (created with np.random) as features and targets.

I have now created a dataframe of actual data for analysis:

df = pd.read_csv('file_name.csv')

The df looks like this:

      X1     X2     X3     X4    X5   X6   X7  X8    Y1     Y2
0    0.98  514.5  294.0  110.25  7.0   2  0.0   0  15.55  21.33
1    0.98  514.5  294.0  110.25  7.0   3  0.0   0  15.55  21.33
2    0.98  514.5  294.0  110.25  7.0   4  0.0   0  15.55  21.33
3    0.98  514.5  294.0  110.25  7.0   5  0.0   0  15.55  21.33
4    0.90  563.5  318.5  122.50  7.0   2  0.0   0  20.84  28.28

…and I am currently extracting just two columns (X1 and X2) as my features, and one column (Y1) as my targets, like this:

x = df[['X1', 'X2']]
y = df['Y1']

So features look like this:

      X1     X2
0    0.98  514.5
1    0.98  514.5
2    0.98  514.5
3    0.98  514.5
4    0.90  563.5

and targets look like this:

        Y1
0      15.55
1      15.55
2      15.55
3      15.55
4      20.84

However, when I attempt to convert the features (X1 and X2) and targets (Y1) to tensors in order to feed them to the NN, the code fails at the line:

dataset = TensorDataset(x_tensor_flat, y_tensor_flat)

I get the error:

line 45, in <module> dataset = TensorDataset(x_tensor, y_tensor)
AssertionError: Size mismatch between tensors

There’s clearly some shape issue at play, but I can’t work out what it is. I have tried flattening as well as transposing the tensors, but I get the same error. Any help would be hugely appreciated.
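For reference, here is a minimal sketch of the check that raises this message. It uses random stand-in tensors rather than the CSV data (an assumption made purely for illustration): TensorDataset requires every tensor to have the same size along dimension 0, so flattening a (5, 2) feature tensor into 10 elements no longer matches 5 targets.

```python
import torch
from torch.utils.data import TensorDataset

# Stand-in tensors (not the real data): 5 samples with 2 features, 5 targets
x_tensor = torch.rand(5, 2)
y_tensor = torch.rand(5)

# Sizes agree along dimension 0 (5 and 5), so this succeeds
dataset = TensorDataset(x_tensor, y_tensor)

# Flattening the features gives 10 elements against 5 targets
mismatch_message = None
try:
    TensorDataset(x_tensor.flatten(), y_tensor)
except AssertionError as err:
    mismatch_message = str(err)

print(len(dataset))
print(mismatch_message)
```

Printing x_tensor.shape and y_tensor.shape just before the failing line is usually the quickest way to see which dimension disagrees.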

Here’s the full section of code that is causing the issue:

import numpy as np
import pandas as pd
import torch
import torch.optim as optim
import torch.nn as nn
from torch.utils.data import Dataset, TensorDataset, DataLoader
from torch.utils.data.dataset import random_split

device = 'cuda' if torch.cuda.is_available() else 'cpu'


df = pd.read_csv('file_name.csv')
x = df[['X1', 'X2']]
y = df['Y1']


x_tensor = torch.from_numpy(np.array(x)).float()
y_tensor = torch.from_numpy(np.array(y)).float()


train_loader = DataLoader(dataset=train_dataset, batch_size=10)
val_loader = DataLoader(dataset=val_dataset, batch_size=10)


class ManualLinearRegression(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(2, 1)

    def forward(self, x):
        return self.linear(x)

def make_train_step(model, loss_fn, optimizer):
    def train_step(x, y):
        model.train()
        yhat = model(x)
        loss = loss_fn(y, yhat)
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()
    return train_step


torch.manual_seed(42)

model = ManualLinearRegression().to(device) 
loss_fn = nn.MSELoss(reduction='mean')
optimizer = optim.SGD(model.parameters(), lr=1e-1)
train_step = make_train_step(model, loss_fn, optimizer)

n_epochs = 50
training_losses = []
validation_losses = []
print(model.state_dict())

for epoch in range(n_epochs):
    batch_losses = []
    for x_batch, y_batch in train_loader:
        x_batch = x_batch.to(device)
        y_batch = y_batch.to(device)
        loss = train_step(x_batch, y_batch)
        batch_losses.append(loss)
    training_loss = np.mean(batch_losses)
    training_losses.append(training_loss)

    with torch.no_grad():
        val_losses = []
        for x_val, y_val in val_loader:
            x_val = x_val.to(device)
            y_val = y_val.to(device)
            model.eval()
            yhat = model(x_val)
            val_loss = loss_fn(y_val, yhat).item()
            val_losses.append(val_loss)
        validation_loss = np.mean(val_losses)
        validation_losses.append(validation_loss)

    print(f"[{epoch+1}] Training loss: {training_loss:.3f}\t Validation loss: {validation_loss:.3f}")

print(model.state_dict())

Answer

The problem is with how you called the random_split function. It takes the lengths of the splits as input, not percentages or ratios. The error is similar in nature: the sum of the lengths you specified (80 + 20) is not the same as the length of the data (5).
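As a rough illustration of that point, the sketch below calls random_split with lengths [80, 20] on a 5-row stand-in dataset. The random tensors are an assumption standing in for the real data, and the exact wording of the error may differ between PyTorch versions.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Stand-in dataset with 5 rows, like the printed head of the DataFrame
dataset = TensorDataset(torch.rand(5, 2), torch.rand(5))

# 80 + 20 = 100, which does not equal len(dataset) == 5
split_error = None
try:
    random_split(dataset, [80, 20])
except ValueError as err:
    split_error = str(err)

print(split_error)
```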

The snippet below should fix your problem. Also, I don't think you need to flatten the tensors.

dataset = TensorDataset(x_tensor, y_tensor)
val_size = int(len(dataset)*0.2)
train_size = len(dataset) - val_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])
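
To check the split end to end, here is a self-contained sketch under stated assumptions: 100 random rows stand in for the CSV data, and the batch size of 10 is taken from the question. It is not a drop-in replacement for the full script.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader, random_split

torch.manual_seed(42)

# Stand-in data (assumption): 100 samples with 2 features and 1 target each
x_tensor = torch.rand(100, 2)
y_tensor = torch.rand(100)

dataset = TensorDataset(x_tensor, y_tensor)

# Derive both lengths from the dataset so they always sum to len(dataset)
val_size = int(len(dataset) * 0.2)
train_size = len(dataset) - val_size
train_dataset, val_dataset = random_split(dataset, [train_size, val_size])

train_loader = DataLoader(dataset=train_dataset, batch_size=10)
val_loader = DataLoader(dataset=val_dataset, batch_size=10)

# Inspect the shapes of the first training batch
x_batch, y_batch = next(iter(train_loader))
print(len(train_dataset), len(val_dataset))
print(x_batch.shape, y_batch.shape)
```

Note that y_batch has shape (10,) here, while nn.Linear(2, 1) returns shape (10, 1). It may be worth calling y_tensor.unsqueeze(1) before building the dataset so that the loss compares tensors of the same shape.
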
User contributions licensed under: CC BY-SA