I cannot figure out why mat1 from the convolutional network is 128×4 and not 4×128. mat1 should be the output of the convolutional network after it is flattened, and mat2 is the linear layer that follows it. I would appreciate any help. Thanks!
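The usual cause of a swapped mat1 shape is flattening with a hard-coded view instead of preserving the batch dimension. A minimal sketch of that failure mode; the conv stack and sizes here are hypothetical, not the question's actual network:

```python
import torch
import torch.nn as nn

# Hypothetical conv stack standing in for the question's network
conv = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.MaxPool2d(4))
x = torch.randn(4, 1, 16, 16)        # batch of 4 images
feat = conv(x)                       # shape (4, 8, 4, 4)

bad = feat.view(-1, 4)               # (128, 4): mixes batch and feature dims
good = feat.view(feat.size(0), -1)   # (4, 128): batch dimension preserved

fc = nn.Linear(128, 10)
out = fc(good)                       # works: mat1 is (4, 128), mat2 is (128, 10)
```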
PyTorch TransformerEncoderLayer different input order gets different results
Before I start: I'm very new to Transformers, and sorry for my bad sentence structure, I have a fever right now. Any time I use nn.TransformerEncoderLayer in any way with a saved model, I get different results if the data is in a different order. Is there a way to save the encode table (or whatever this would be called)? This would…
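nn.TransformerEncoderLayer has no lookup table to save; it is just attention, linear, norm, and dropout layers. One common source of order-dependent results is dropout left in training mode. A minimal sketch of that check (shapes and seed are arbitrary):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.TransformerEncoderLayer(d_model=16, nhead=4, batch_first=True)
layer.eval()                      # disables dropout; in train mode, results vary

x = torch.randn(2, 5, 16)         # (batch, seq, d_model)
with torch.no_grad():
    out_a = layer(x)
    out_b = layer(x[[1, 0]])      # same data, batch rows swapped

# With dropout off, each sample is processed independently of batch order
print(torch.allclose(out_a[[1, 0]], out_b, atol=1e-6))
```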
Reading files with .h5 format and using it in dataset
I have two folders (one for training and one for testing), and each one contains around 10 files in h5 format. I want to read them and use them in a dataset. I have a function to read them, but I don't know how to use it to read the files inside my Dataset class. Do you have a suggestion?
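A minimal sketch of one way to wire h5py reading into a torch.utils.data.Dataset. The "data" and "label" key names and the folder names are assumptions about the files' layout, not details from the question:

```python
import glob
import h5py
import torch
from torch.utils.data import Dataset

class H5Dataset(Dataset):
    # Assumes every .h5 file stores arrays under "data" and "label" keys
    def __init__(self, folder):
        tensors, labels = [], []
        for path in sorted(glob.glob(f"{folder}/*.h5")):
            with h5py.File(path, "r") as f:
                tensors.append(torch.from_numpy(f["data"][:]))
                labels.append(torch.from_numpy(f["label"][:]))
        self.data = torch.cat(tensors)
        self.labels = torch.cat(labels)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

train_set = H5Dataset("train")   # folder name is a placeholder
```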
Select items from a matrix or tensor using 1-D array of indexes to get 1D or 2D tensors in torch
I have a tensor of n sampled predictions. In this example I sample a (10, 2) result 4 times, where each result represents a batch of graphs with 3+4+3=10 nodes in total, each node having 2 coordinates. I have a way to select the best sample for each graph, and I have an index telling me which one it is (idx_test). I want to select in…
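One way to do this is to expand the per-graph index into a per-node index and use advanced indexing. A sketch with hypothetical idx_test values:

```python
import torch

samples = torch.randn(4, 10, 2)        # 4 sampled predictions of 10 nodes, 2 coords
graph_sizes = torch.tensor([3, 4, 3])  # nodes per graph (3 + 4 + 3 = 10)
idx_test = torch.tensor([2, 0, 3])     # hypothetical: best sample per graph

# Expand the per-graph choice to a per-node choice, then index the sample dim
node_choice = torch.repeat_interleave(idx_test, graph_sizes)   # shape (10,)
best = samples[node_choice, torch.arange(10)]                  # shape (10, 2)
```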
How to input embeddings directly to a huggingface model instead of tokens?
I'm going over the huggingface tutorial where they showed how tokens can be fed into a model to generate hidden representations. But how can I input word embeddings directly instead of tokens? That is, I have another model that generates word embeddings, and I need to feed those into the model. Answer: Most (every?) huggingface encoder model supports that with…
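The mechanism the answer is presumably referring to is the inputs_embeds argument, which transformers encoder models accept in place of input_ids. A minimal sketch:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("bert-base-uncased")
tok = AutoTokenizer.from_pretrained("bert-base-uncased")

ids = tok("hello world", return_tensors="pt")["input_ids"]
embeds = model.get_input_embeddings()(ids)   # (1, seq_len, hidden_size)

with torch.no_grad():
    out = model(inputs_embeds=embeds)        # skips the internal embedding lookup
print(out.last_hidden_state.shape)
```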
Pytorch model object has no attribute ‘predict’ BERT
I trained a BertClassifier model using PyTorch. After creating my best.pt, I would like to put my model into production and use it to predict and classify, starting from a sample, so I restore it from the checkpoint. Then, after putting it in evaluation mode and freezing the model, I call .predict to make it work on my sample, but I'm…
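A plain nn.Module does not define .predict(); you call the module itself, which runs forward(). A minimal sketch with a stand-in classifier, since the question's real BertClassifier and checkpoint are not shown:

```python
import torch
import torch.nn as nn

class BertClassifier(nn.Module):           # stand-in for the question's model
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 2)

    def forward(self, x):
        return self.fc(x)

model = BertClassifier()
# model.load_state_dict(torch.load("best.pt"))  # restore the trained weights
model.eval()                               # evaluation mode (no dropout etc.)
with torch.no_grad():
    logits = model(torch.randn(1, 8))      # call the model; there is no .predict
    pred = logits.argmax(dim=-1)
```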
MultiHeadAttention giving very different values between versions (PyTorch/TensorFlow)
I'm trying to recreate a transformer that was written in PyTorch and port it to TensorFlow. Everything was going pretty well until each version of MultiHeadAttention started giving extremely different outputs. Both methods are implementations of multi-headed attention as described in the paper "Attention Is All You Need", so they should be able to achieve the same output. I'm converting…
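One frequent source of mismatch when porting: torch.nn.MultiheadAttention packs the Q, K, and V input projections into a single matrix, while Keras keeps separate per-head kernels, so the weights have to be split (and typically transposed and reshaped) before copying. A sketch of the PyTorch side only, with arbitrary sizes:

```python
import torch
import torch.nn as nn

# torch packs Q, K, V input projections into one (3*d_model, d_model) matrix;
# split it before copying the pieces into another framework's layer.
d_model, nhead = 8, 2
mha = nn.MultiheadAttention(d_model, nhead, batch_first=True)

w_q, w_k, w_v = mha.in_proj_weight.chunk(3, dim=0)   # each (d_model, d_model)
b_q, b_k, b_v = mha.in_proj_bias.chunk(3, dim=0)
w_out = mha.out_proj.weight                          # (d_model, d_model)
```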
RuntimeError: Attempted to set the storage of a tensor on device “cuda:0” to a storage on different device “cpu”
Earlier, I had configured the following project https://github.com/zllrunning/face-makeup.PyTorch using PyTorch with CUDA 10.2. PyTorch with CUDA 10.2 support is no longer available for Windows, so when I configure the same project using PyTorch with CUDA 11.3, I get the following error. Please help me solve this problem. Answer: I solved this by adding map_location=lambda storage, loc: storage.cuda() in the…
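The fix remaps the checkpoint's storages at load time via torch.load's map_location argument. A minimal sketch (the filename is a placeholder):

```python
import torch

# map_location remaps every storage in the checkpoint as it is loaded
state = torch.load("checkpoint.pth",
                   map_location=lambda storage, loc: storage.cuda())

# Equivalent, simpler forms for a fixed target device:
state_cpu = torch.load("checkpoint.pth", map_location="cpu")
state_gpu = torch.load("checkpoint.pth", map_location="cuda:0")
```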
How to rearrange the sample order of a torch dataloader?
I have a torch.utils.data.DataLoader. I want to rearrange the order of the samples. Is it possible? Answer: Yes, you can use torch.utils.data.Subset and specify the indices.
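A minimal sketch of the Subset approach with a toy dataset and a hypothetical ordering:

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

dataset = TensorDataset(torch.arange(10).float())     # toy dataset: samples 0..9
order = [9, 3, 0, 1, 2, 4, 5, 6, 7, 8]                # hypothetical new order

loader = DataLoader(Subset(dataset, order), batch_size=4, shuffle=False)
for (batch,) in loader:
    print(batch)                                      # yields samples in `order`
```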
What does Tensor[batch_mask, …] do?
I saw this line of code in an implementation of a BiLSTM. I assume this is some kind of "masking" operation, but I found little information on Google about the meaning of …. Please help :). Answer: I assume that batch_mask is a boolean tensor. In that case, batch_output[batch_mask] performs boolean indexing that selects the elements corresponding to True in…
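A small sketch of what that indexing does, assuming a (batch, seq, features) output and a boolean mask over the batch dimension:

```python
import torch

batch_output = torch.randn(4, 3, 5)                   # (batch, seq, features)
batch_mask = torch.tensor([True, False, True, False])

kept = batch_output[batch_mask, ...]                  # same as batch_output[batch_mask]
print(kept.shape)                                     # torch.Size([2, 3, 5])
# `...` (Ellipsis) stands for all remaining dimensions: the indexing keeps the
# batch elements whose mask entry is True and leaves the other dims intact.
```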