Tag: numpy

Numba: indexing a vector is giving an error

I started using Python and Numba recently. My problem is this: I have a matrix (n rows and m columns). In a for loop I have to change the values of specific columns. Without Numba, the code runs fine, but when I use njit(), it just crashes. Note: in my real project, the rows don’t all have the same values. This is

reshape the array with specific form

numpy python

I have a specific array in which each row holds two arrays, and I want to reshape it, but I don’t know how to reshape it into a 2D array. Here is my array: Here is the desired output: Any help appreciated. Thanks. I’ve tried to use reshape, but it does not solve the problem. Answer: It’s ragged, it’s not concatenated. Is it

Resolving conflicts in Pandas dataframe

dataframe numpy pandas python record-linkage

I am performing record linkage on a dataframe such as: When my model overpredicts and links the same ID_1 to more than one ID_2 (indicated by a 1 in Predicted Link) I want to resolve the conflicts based on the Probability-value. If one predicted link has a higher probability than the other I want to keep a 1 for that,
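The rule described in this excerpt (keep only the most probable link for each ID_1) could be sketched with a pandas groupby. This is only an illustration under assumptions: the column names "Probability" and "Predicted Link" are guesses based on the wording of the question, the helper name resolve_conflicts is hypothetical, and ties are resolved by keeping the first row, which the excerpt does not specify.

```python
import pandas as pd

def resolve_conflicts(df):
    """Keep a 1 only on the highest-probability predicted link per ID_1.

    Assumes columns "ID_1", "Probability" and "Predicted Link"
    (names guessed from the question text, not confirmed by it).
    """
    out = df.copy()
    # Only rows the model marked as links compete with each other.
    linked = out[out["Predicted Link"] == 1]
    # Index label of the most probable link within each ID_1 group
    # (on ties, idxmax keeps the first occurrence).
    best = linked.groupby("ID_1")["Probability"].idxmax()
    out["Predicted Link"] = 0
    out.loc[best, "Predicted Link"] = 1
    return out

# Hypothetical example data: ID_1 == 1 is linked to two different ID_2 values.
example = pd.DataFrame({
    "ID_1": [1, 1, 2],
    "ID_2": [10, 11, 12],
    "Probability": [0.9, 0.6, 0.8],
    "Predicted Link": [1, 1, 1],
})
resolved = resolve_conflicts(example)
```

The function works on a copy, so the original dataframe is left unchanged.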

Creating a function to standardize categorical variables (python)

function loops numpy pandas python

I don’t know if it is right to say that I want to “standardize” a categorical string variable, but basically I want to create a function that sets all observations of F or f in the column below to 0 and M or m to 1: I tried this: But I got an error: Any ideas? Thanks! Answer: There is no replace function defined in your
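One possible way to write such a function is to normalise the case first and then map the two letters to integers. This is a minimal sketch, not the answer from the thread: the function name encode_f_m and the sample values are hypothetical, and it assumes the column contains only the strings F, f, M and m.

```python
import pandas as pd

def encode_f_m(series):
    """Map F/f to 0 and M/m to 1 (hypothetical helper).

    Any value other than F/f/M/m would become NaN, which is an
    assumption about the desired behaviour, not something the
    question states.
    """
    return series.str.strip().str.upper().map({"F": 0, "M": 1})

# Hypothetical sample column.
codes = encode_f_m(pd.Series(["F", "m", "f", "M"]))
```

Mapping through a dictionary avoids calling replace on individual strings inside a loop.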

np.where for 2d array, manipulate whole rows

array-broadcasting arrays numpy python

I want to rebuild the following logic with a NumPy broadcasting function such as np.where: from a 2D array, check per row whether the first element satisfies a condition. If the condition is true, return the first three elements as a row; otherwise return the last three elements. A short MWE in the form of a for-loop, which I want to circumvent:
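The row-wise selection described here can be sketched by broadcasting a per-row boolean against two equally shaped slices. The concrete condition (first element greater than zero), the function name pick_rows, and the sample array are assumptions for illustration, since the excerpt does not show the original condition.

```python
import numpy as np

def pick_rows(arr, threshold=0):
    """Per row: first three elements if arr[row, 0] > threshold, else last three.

    The "> threshold" test is a stand-in for the unspecified condition.
    """
    cond = arr[:, 0] > threshold  # shape (n,)
    # cond[:, None] has shape (n, 1) and broadcasts across the 3 columns.
    return np.where(cond[:, None], arr[:, :3], arr[:, -3:])

# Hypothetical input: the first row passes the condition, the second does not.
a = np.array([[1, 2, 3, 4, 5],
              [-1, 2, 3, 4, 5]])
picked = pick_rows(a)
```

Adding the trailing axis to the condition is what lets np.where pick whole rows rather than individual elements.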

Images Have Grey Values of True and False

numpy python

I’m planning to process some images using PyCharm. However, I found a bug and started looking for the cause. Eventually I found that the images have grey values of True and False, but they should be 1 and 0. Is there any way to change this? The image is generated in PyCharm using: The Python version is 3.8.12. Answer: You
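If the image is held as a boolean NumPy array, converting it to 1 and 0 is a dtype cast. The sketch below assumes that is the case; the array name mask and its contents are made up for illustration, and the choice of uint8 as the target dtype is also an assumption.

```python
import numpy as np

# Hypothetical boolean image standing in for the one from the question.
mask = np.array([[True, False],
                 [False, True]])

# astype creates a new array in which True becomes 1 and False becomes 0.
as_int = mask.astype(np.uint8)
```

The original boolean array is left untouched, so both representations remain available.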

pandas rename multiple columns using regex pattern

dataframe numpy pandas python series

I have a dataframe like the one shown below. I would like to remove the keyword US – from all my column names. I tried the approach below, but there should be a better way to do this. My real data has more than 70 columns, so this is not efficient. Is there a regex approach to rename columns based on a pattern to exclude the
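A regex-based rename of all columns at once might look like the following sketch. The column names are invented for illustration, and the pattern assumes the keyword appears as a prefix written with either an en dash or a plain hyphen; neither detail is confirmed by the excerpt.

```python
import pandas as pd

# Hypothetical dataframe; the real column names are not shown in the excerpt.
df = pd.DataFrame(columns=["US – price", "US - volume", "date"])

# Strip a leading "US", optional spaces, a dash (en dash or hyphen), optional spaces.
df.columns = df.columns.str.replace(r"^US\s*[–-]\s*", "", regex=True)
```

Because the replacement runs on the whole column index, the same line works for 3 columns or for 70.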

Linear sum assignment (SciPy) and balancing the costs

assignment-problem numpy optimization python scipy

I am having difficulty using scipy.optimize.linear_sum_assignment to evenly distribute tasks (costs) to workers, where each worker can be assigned multiple tasks. The cost matrix represents the workload of each task for each worker. We want to minimize the total costs of all workers, while evenly distributing the costs of each worker. In this example, we have 3 workers named a,

Import large .tiff file as sparse matrix

numpy python sparse-matrix tiff

I have a large .tiff file (4.4 GB, 79530 x 54980 values) with 1 band. Since only 16% of the values are valid, I was thinking it’s better to import the file as a sparse matrix, to save RAM. When I first open it as np.array and then transform it into a sparse matrix using csr_matrix(), my kernel already crashes. See code

Deduplicate numpy array by another array

numpy python

I have two NumPy arrays: a is the index of items, and b is the score of the corresponding items. Now I want to sort these items in descending order by the scores in b while keeping only the largest score for each item. The results should be the de-duplicated item index a_new and the scores of these items b_new. In the
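One way to approach this is to sort by score first and then keep the first occurrence of each item. This is a sketch under assumptions: the function name dedupe_by_best_score and the sample data are hypothetical, and negating b to sort in descending order assumes the scores are signed numbers such as floats.

```python
import numpy as np

def dedupe_by_best_score(a, b):
    """Return items and scores sorted by descending score, one entry per item."""
    # Sort positions by descending score (assumes b can be negated safely).
    order = np.argsort(-b, kind="stable")
    a_sorted, b_sorted = a[order], b[order]
    # After sorting, the first occurrence of each item carries its largest score.
    _, first = np.unique(a_sorted, return_index=True)
    keep = np.sort(first)  # restore descending-score order
    return a_sorted[keep], b_sorted[keep]

# Hypothetical example: items 1 and 2 each appear twice.
a = np.array([2, 1, 2, 3, 1])
b = np.array([0.5, 0.9, 0.7, 0.1, 0.3])
a_new, b_new = dedupe_by_best_score(a, b)
```

np.unique returns its indices ordered by item value, so the final np.sort is needed to get back to score order.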