Tag: nearest-neighbor

Eliminate for loop when indexing into array

I have two arrays: vals has shape (N,m), where N is ~1 million and m is 3. The values are floats. I have another array, indices, with shape (N,4). All values in indices are row indices in vals. (Additionally, unlike the example here, every row of indices contains unique values.) I would like to replace the following for loop when creating
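The excerpt is cut off before the loop itself, so the following is only a minimal sketch under an assumption: that the loop gathers, for each row i, the four rows of vals named by indices[i]. If so, NumPy integer-array indexing can do the gather in one step. The small N, the random data, and the names gathered_loop and gathered are illustrative and not from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
N, m = 6, 3                      # the question has N ~ 1 million; kept tiny here
vals = rng.random((N, m))

# Each row of indices holds 4 distinct row indices into vals.
indices = np.stack([rng.permutation(N)[:4] for _ in range(N)])

# Assumed loop version: gather the 4 referenced rows for every i.
gathered_loop = np.empty((N, 4, m))
for i in range(N):
    gathered_loop[i] = vals[indices[i]]

# Vectorized version: indexing with an (N, 4) integer array
# returns an array of shape (N, 4, m) in a single call.
gathered = vals[indices]
```

Any per-row reduction the real loop performs (a sum or mean over the four rows, say) could then be applied along axis 1 of gathered.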

How to perform operations on very big torch tensors without splitting them

faiss knn nearest-neighbor python torch

My task: I’m trying to calculate the pairwise distance between every pair of samples in two big tensors (for k-Nearest-Neighbours). That is, given a tensor test with shape (b1,c,h,w) and a tensor train with shape (b2,c,h,w), I need || test[i]-train[j] || for every i,j (where both test[i] and train[j] have shape (c,h,w), as those are samples in the batch). The problem: both
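One possible approach, sketched here under the assumption that the norm is the Euclidean (L2) norm over all c*h*w entries, is to flatten each sample to a vector and call torch.cdist. This returns the (b1, b2) distance matrix without writing out the broadcasted (b1, b2, c, h, w) difference by hand. The helper name pairwise_l2 and the tensor sizes are hypothetical, and whether this fits in memory for the asker's real sizes is not something the excerpt allows checking.

```python
import torch

def pairwise_l2(test: torch.Tensor, train: torch.Tensor) -> torch.Tensor:
    """Return D with D[i, j] = || test[i] - train[j] ||_2 (hypothetical helper)."""
    # Flatten every sample from (c, h, w) to a single vector of length c*h*w.
    a = test.reshape(test.shape[0], -1)
    b = train.reshape(train.shape[0], -1)
    # torch.cdist computes all row-to-row p-norm distances; p=2 is Euclidean.
    return torch.cdist(a, b, p=2)

torch.manual_seed(0)
test = torch.randn(4, 3, 5, 5)    # (b1, c, h, w), sizes chosen for illustration
train = torch.randn(6, 3, 5, 5)   # (b2, c, h, w)
dist = pairwise_l2(test, train)   # shape (b1, b2)
```

For k-nearest-neighbour lookup, torch.topk(dist, k, largest=False) would then pick the k closest training samples for each test sample.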

Using NearestNeighbors and word2vec to detect sentence similarity

nearest-neighbor python scikit-learn word2vec

I have trained a word2vec model on my corpus using Python and gensim. Then I calculated the mean word2vec vector for each sentence (averaging the vectors of all the words in the sentence) and stored it in a pandas data frame. The columns of the pandas data frame df are: sentence, Book title (the book where the sentence comes
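As a rough sketch of the pipeline described, the snippet below averages word vectors per sentence and queries scikit-learn's NearestNeighbors with cosine distance. The word_vecs dictionary is a hypothetical stand-in for a trained gensim model, the toy sentences are invented for illustration, and the choice of cosine distance is an assumption that the excerpt does not confirm.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical toy embeddings standing in for a trained word2vec model.
word_vecs = {
    "cat": np.array([1.0, 0.0, 0.0]),
    "dog": np.array([0.9, 0.1, 0.0]),
    "car": np.array([0.0, 1.0, 0.0]),
    "truck": np.array([0.0, 0.9, 0.1]),
    "runs": np.array([0.0, 0.0, 1.0]),
}
sentences = ["cat runs", "dog runs", "car runs", "truck runs"]

def sentence_vector(sentence: str) -> np.ndarray:
    """Mean of the word vectors of all words in the sentence."""
    return np.mean([word_vecs[w] for w in sentence.split()], axis=0)

X = np.vstack([sentence_vector(s) for s in sentences])

# Ask for 2 neighbours: here the closest match of each sentence is itself.
nn = NearestNeighbors(n_neighbors=2, metric="cosine").fit(X)
distances, neighbors = nn.kneighbors(X)

# Column 0 is the query sentence itself, column 1 its most similar other sentence.
nearest_other = neighbors[:, 1]
```

With real data, X would come from the vector column of df, and the neighbour indices could be mapped back to the sentence and book-title columns.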