My df has information about US states. I want to rank the states based on their contribution. My code: Expected answer: Compute state_capacity by summing each state's values across all years, then rank the states by state capacity. My approach: I am able to compute the state capacity using groupby. I ran int…
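A minimal sketch of the groupby-then-rank idea described in this excerpt. The column names `state`, `year` and `value` and the sample numbers are assumptions for illustration, since the original data is not shown.

```python
import pandas as pd

# Hypothetical data: one row per state and year.
df = pd.DataFrame({
    "state": ["TX", "CA", "TX", "NY", "CA", "NY"],
    "year": [2019, 2019, 2020, 2019, 2020, 2020],
    "value": [10, 7, 12, 3, 8, 4],
})

# Sum each state's values across all years.
state_capacity = df.groupby("state")["value"].sum()

# Rank 1 goes to the largest total; "dense" leaves no gaps after ties.
state_rank = state_capacity.rank(ascending=False, method="dense").astype(int)

result = pd.DataFrame(
    {"state_capacity": state_capacity, "rank": state_rank}
).sort_values("rank")
print(result)
```

Whether ties should share a rank (`method="dense"`) or not is a design choice the excerpt does not settle.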
Tag: numpy
Add new column with specific increasing of a quarter using python
I have a dataframe, df, with a quarters column. I would like to add a new column next to it containing each quarter increased by 2. Data Desired Doing However, this is not adding 2 consistently to the entire column. I am still troubleshooting; any suggestion is appreciated. Answer Reformat the s…
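One possible approach, sketched under the assumption that the quarters are stored as strings like `Q1 2022` (the real format is not visible in the excerpt): convert them to pandas quarterly periods, add 2, and format the result back.

```python
import pandas as pd

# Hypothetical input format: "Q<quarter> <year>".
df = pd.DataFrame({"quarters": ["Q1 2022", "Q3 2022", "Q4 2022"]})

# Reorder "Q1 2022" into "2022Q1", which pandas can parse as a quarterly period.
reordered = df["quarters"].str.replace(r"(Q\d) (\d{4})", r"\2\1", regex=True)
periods = pd.PeriodIndex(reordered, freq="Q")

# Adding an integer to a PeriodIndex shifts by that many periods (quarters here).
shifted = periods + 2

# Format back into the assumed original layout.
df["increased"] = [f"Q{p.quarter} {p.year}" for p in shifted]
print(df)
```

Period arithmetic rolls over the year boundary, so `Q4 2022` becomes `Q2 2023` without manual carry logic.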
Pandas conditional counting by date
I want to count all orders placed by each customer up to each order date, to find out how many orders had been placed at the time of each order. Input: Expected output: The following code works but is extremely slow, taking upwards of 10 hours for 100k+ rows. There is certainly a better way. Answer Try sort_values to ge…
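A vectorized sketch of one reading of this question: a running count of orders per customer in date order. The column names `customer` and `order_date` are assumptions, and orders sharing the same date are counted in whatever order the sort leaves them.

```python
import pandas as pd

# Hypothetical orders table.
df = pd.DataFrame({
    "customer": ["A", "B", "A", "A", "B"],
    "order_date": pd.to_datetime(
        ["2021-01-03", "2021-01-01", "2021-01-01", "2021-01-02", "2021-01-05"]
    ),
})

# Sort by customer and date, then number the rows within each customer.
# cumcount() starts at 0, so add 1 to include the current order.
# The assignment aligns on the index, so the original row order is kept.
df["orders_so_far"] = (
    df.sort_values(["customer", "order_date"]).groupby("customer").cumcount() + 1
)
print(df)
```

This avoids a Python-level loop over rows, which is usually what makes such counts slow on 100k+ rows.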
comparing numpy arrays with tolerance
I’m trying to compare floating-point numbers that are stored in numpy arrays. I would like them to be compared with a tolerance, and every number in one array should be compared with every number in the other array. My attempt is shown below. I used two simple arrays as examples, but it has the problem tha…
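A sketch of an all-pairs comparison using broadcasting with `np.isclose`. The sample arrays and the absolute tolerance of 0.01 are assumptions for illustration.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([1.001, 2.5, 3.0, 0.9995])

# a[:, None] has shape (3, 1) and b[None, :] has shape (1, 4), so the
# comparison broadcasts to a (3, 4) table: close[i, j] compares a[i] with b[j].
close = np.isclose(a[:, None], b[None, :], rtol=0.0, atol=0.01)

# Index pairs (i, j) where the two values agree within the tolerance.
pairs = np.argwhere(close)
print(pairs)
```

Setting `rtol=0.0` makes the check purely absolute; leaving the default relative tolerance in place would scale the allowed difference with the magnitude of `b`.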
Python DataFrame: Map two dataframes based on day of month?
I have two dataframes. The month_data dataframe has the days from the start of the month to the end. student_df holds only the days on which each student was present. I’m trying to map both dataframes so that the remaining days for each student are marked as absent. month_data month_data = pd.DataFrame({'day_…
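A rough sketch of one way to fill in the missing days: build every student/day combination, left-join the attendance records, and mark the gaps. The column names `day`, `student` and `status` are hypothetical, since the original definitions are cut off.

```python
import pandas as pd

# Hypothetical stand-ins for the two dataframes in the question.
month_days = pd.DataFrame({"day": [1, 2, 3, 4]})
present = pd.DataFrame({
    "student": ["A", "A", "B"],
    "day": [1, 3, 2],
    "status": ["present", "present", "present"],
})

# Every (student, day) combination; how="cross" needs pandas 1.2 or newer.
students = present[["student"]].drop_duplicates()
grid = students.merge(month_days, how="cross")

# Left-join the attendance rows; days with no match get NaN, then "absent".
full = grid.merge(present, on=["student", "day"], how="left")
full["status"] = full["status"].fillna("absent")
print(full)
```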
How to optimize time while converting list to dataframe?(Part II)
I didn’t get any proper answers to my previous question: How to optimize time while converting list to dataframe? Let me explain the example in more detail. Let’s consider the data frame more precisely as I want the output dataframe when converted to csv as The characters PH, AG, AD, N should not be mapped. It…
Numpy append 2D array in for loop over rows
I want to vertically append a 2D array created within a for-loop. I tried the append method, but this won’t stack vertically (I want to avoid reshaping the result later), and I tried the vstack() function, but this won’t work on an empty array. Does anyone know how to solve this? I can think of…
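A common pattern for this situation, sketched with made-up block shapes: collect the 2D arrays in a Python list inside the loop and call `np.vstack` once at the end, which sidesteps the empty-array problem.

```python
import numpy as np

blocks = []
for i in range(3):
    # Placeholder for whatever 2D array the loop body really produces.
    block = np.full((2, 4), i)
    blocks.append(block)

# One vertical stack at the end: three (2, 4) blocks become one (6, 4) array.
stacked = np.vstack(blocks)
print(stacked.shape)
```

Stacking once is also cheaper than growing an array on every iteration, because each `np.vstack` or `np.append` call copies all the data accumulated so far.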
Numpy condition on a vector rather than an element
I have a numpy array that represents an image. Its dimensions are (w, h, 4), where 4 is for RGBA. Now I want to replace all white pixels with transparent pixels. I wish I could do something like np.where(pic == np.array([255, 255, 255, 255]), np.array([0, 0, 0, 0]), pic) but this exact code obviously doesn…
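A sketch of a per-pixel condition: compare all four channels at once, reduce over the channel axis so the mask has one entry per pixel, then assign through the mask. The tiny sample image is an assumption for illustration.

```python
import numpy as np

# Hypothetical 2x3 RGBA image: opaque black everywhere, one opaque white pixel.
pic = np.zeros((2, 3, 4), dtype=np.uint8)
pic[..., 3] = 255
pic[0, 1] = 255

white = np.array([255, 255, 255, 255], dtype=np.uint8)

# (pic == white) has shape (2, 3, 4); all(axis=-1) collapses the channel
# axis, so mask[i, j] is True only when every channel of that pixel matches.
mask = (pic == white).all(axis=-1)

out = pic.copy()
out[mask] = 0  # matching pixels become fully transparent [0, 0, 0, 0]
print(mask)
```

The element-wise `np.where` form in the question tests each channel separately, which is why a reduction over the last axis is needed to treat a pixel as a unit.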
How to set a numpy array in a pandas data frame cell?
I have a pandas dataframe. I want to fill some of the cells with a numpy array, but I get the following ValueError. In real life I will not fill the cells with a zero array; this is a simplified example that replicates the error. ValueError: could not broadcast input array from shape (10,) into shape (1,) Answer One worka…
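The answer in this excerpt is cut off, so the following is only one possible workaround, not necessarily the one it goes on to describe: give the column `object` dtype and set the single cell with `.at`, so pandas stores the array as one value instead of trying to broadcast it. The column names are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3]})

# An object-dtype column can hold arbitrary Python objects, arrays included.
df["vec"] = pd.Series([None] * len(df), dtype=object, index=df.index)

# .at addresses exactly one cell, so the whole array is stored in it.
df.at[1, "vec"] = np.zeros(10)
print(df)
```

Storing arrays in cells gives up most vectorized pandas operations on that column, so it tends to suit small lookups better than heavy computation.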
ValueError: operands could not be broadcast together with shapes (31,2) (2,31)
I’m trying to plot 2 arrays, but I’m receiving this error when passing them to the function. I’m not really sure what is causing it. I’m using the function like this: plotModel(x,y,theta), but it looks like the error is between x and theta. Also, these are my 2 arrays: How can I solve this problem…
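The body of plotModel is not shown, so this is only an illustration of how the shapes in the error message interact, using placeholder arrays rather than the asker's data. An element-wise operation between shapes (31, 2) and (2, 31) fails, while transposing one operand or using a matrix product gives compatible shapes.

```python
import numpy as np

# Placeholder arrays with the shapes from the error message.
a = np.ones((31, 2))
b = np.ones((2, 31))

# Element-wise multiplication cannot broadcast (31, 2) against (2, 31).
try:
    a * b
except ValueError as exc:
    message = str(exc)
    print(message)

# Transposing one operand makes the shapes match for element-wise work.
elementwise = a * b.T   # shape (31, 2)

# A matrix product contracts the shared dimension of size 2 instead.
product = a @ b         # shape (31, 31)
```

Which of the two is appropriate depends on what the function is meant to compute, which the excerpt does not show.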