Let’s say I have a kind of list, and I want to repeatedly fill a new column called “category” for as many rows as the list has. I want the first row to have ‘a’, the second row ‘b’, and the third row ‘ab’, with the cycle repeating until the last row, like the example below: What I have
Tag: pandas
What would be the most efficient way to do this in pandas?
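One straightforward way to cycle a fixed pattern down the rows is `itertools.cycle`. A minimal sketch, with a hypothetical single-column dataframe standing in for the question’s list:

```python
import itertools

import pandas as pd

# Hypothetical dataframe standing in for the question's list
df = pd.DataFrame({"name": ["w", "x", "y", "z", "v", "u", "t"]})

# Repeat the pattern a, b, ab down the rows, cycling as long as needed
pattern = ["a", "b", "ab"]
df["category"] = list(itertools.islice(itertools.cycle(pattern), len(df)))
print(df["category"].tolist())  # ['a', 'b', 'ab', 'a', 'b', 'ab', 'a']
```

`islice` caps the infinite cycle at exactly `len(df)` items, so this works for any row count, not just multiples of three.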
I’m trying to figure out the most efficient way to join two dataframes such as the ones below. I’ve tried pd.merge, and maybe using the rank function, but cannot seem to figure out a way. Thanks in advance. df1 What I’m trying to achieve is this: df2 Answer You might want to use groupby with unstack, as advised in this answer:
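The original frames aren’t shown, so the following is only a sketch of the `groupby` + `unstack` pattern on a hypothetical long-format frame (the `id`/`key`/`value` names are assumptions, not the question’s actual columns):

```python
import pandas as pd

# Hypothetical long-format frame; the question's actual columns are not shown
df1 = pd.DataFrame({
    "id":    [1, 1, 2, 2],
    "key":   ["x", "y", "x", "y"],
    "value": [10, 20, 30, 40],
})

# groupby + unstack pivots the distinct 'key' values out into columns,
# turning the long frame into a wide one
df2 = df1.groupby(["id", "key"])["value"].first().unstack()
```

After `unstack`, `df2` has one row per `id` and one column per `key` value, which is often what a “join” of repeated rows is really after.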
Remove rows in pandas dataframe if any of specific columns contains a specific value
I have the following df: Data Frame I have not been able to figure out how to delete a row if any of the columns containing the word “test” has a value less than 95. For example, I would have to delete the entire index row 1 because the column “heat.test” is 80 (the same goes for rows 0 and 3). In other
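One way to do this is `df.filter(like="test")` to select the relevant columns, then a row-wise `all` mask. A minimal sketch with made-up data that follows the question’s “row 1 fails on heat.test; rows 0 and 3 also fail” shape:

```python
import pandas as pd

# Hypothetical frame; column names follow the question's "*.test" pattern
df = pd.DataFrame({
    "heat.test": [94, 80, 97, 99],
    "cold.test": [96, 98, 99, 90],
    "other":     [1, 2, 3, 4],
})

# Keep only rows where every column whose name contains "test" is >= 95
mask = (df.filter(like="test") >= 95).all(axis=1)
out = df[mask]
print(out.index.tolist())  # [2]
```

`filter(like="test")` matches on the column name, so unrelated columns such as `other` never affect the mask.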
Change structure of dictionary in Python Pandas
Is there a way of changing the structure of a nested dictionary? I have a column in a dataframe with many rows of dictionaries, which looks like this: Is there a way of modifying the structure so that it looks like this, without changing the actual values? Answer You should read about the function apply() in pandas. You build a function that essentially does your
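The question’s actual dict layout isn’t shown, so the following is only an illustration of the `apply()` pattern, assuming a nested `{"outer": {...}}` shape that gets flattened one level (the `outer` key and the `restructure` helper are hypothetical):

```python
import pandas as pd

# Hypothetical nested-dict column; the real structure is not shown
df = pd.DataFrame({"data": [{"outer": {"a": 1}}, {"outer": {"a": 2}}]})

def restructure(d):
    # pull the inner dict up one level, leaving the values untouched
    return d["outer"]

df["data"] = df["data"].apply(restructure)
print(df["data"].tolist())  # [{'a': 1}, {'a': 2}]
```

Whatever the real reshaping is, the recipe is the same: write a plain function that transforms one dict, then `apply` it to the column.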
How to organise multiple stock data in pandas dataframe for plotting
I have over a hundred stocks (actually crypto, but that does not matter) I wish to plot, all on the same line plot. I end up with a dataframe that looks like this: I don’t know how to make a line plot from this dataframe; I don’t even know if it is possible. Is there a way? Or is there
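If the data is in long format (one row per date/symbol observation — an assumption, since the frame isn’t shown), pivoting each symbol into its own column lets a single `plot()` call draw one line per symbol:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the example runs anywhere
import pandas as pd

# Hypothetical long-format frame: one row per (date, symbol) observation
df = pd.DataFrame({
    "date":   pd.to_datetime(["2024-01-01", "2024-01-02"] * 2),
    "symbol": ["BTC", "BTC", "ETH", "ETH"],
    "price":  [100.0, 110.0, 10.0, 11.0],
})

# Pivot so each symbol becomes its own column, then one plot() call
wide = df.pivot(index="date", columns="symbol", values="price")
ax = wide.plot()  # one line per symbol, all on the same axes
```

With a hundred symbols the legend becomes unreadable, so `wide.plot(legend=False)` is often the practical choice.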
I want to select data from different dfs; how can I speed it up?
I want to take the last data point before a specified time from several dfs covering different time intervals; my code is as follows: On my computer, get_result_df() takes 204 ms to run. How can I speed it up? I optimized it, and the running time was reduced to 53 ms. Is there any room for improvement? Answers to
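The original code isn’t shown, but one common speed-up for “last row at or before a cutoff” is `Index.searchsorted` on a sorted `DatetimeIndex`, which is a binary search rather than a full boolean scan. A sketch on hypothetical minute bars:

```python
import pandas as pd

# Hypothetical minute-bar frame with a sorted DatetimeIndex
idx = pd.date_range("2024-01-01 09:00", periods=5, freq="min")
df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=idx)

cutoff = pd.Timestamp("2024-01-01 09:02:30")

# On a sorted index, searchsorted is O(log n) -- usually much faster
# than a boolean mask like df[df.index <= cutoff].iloc[-1]
pos = df.index.searchsorted(cutoff, side="right") - 1
last_row = df.iloc[pos]
print(last_row["price"])  # 3.0
```

For matching many cutoffs against many frames at once, `pd.merge_asof` applies the same as-of logic in a single vectorized call.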
Pandas – partition a dataframe into two groups with an approximate mean value
I want to split all rows into two groups that have similar means. I have a dataframe of about 50 rows, but this could grow to several thousand, with a column of interest called ‘value’. So far I tried using a cumulative sum, for which a total column was created; then I essentially made the split based on where the mid-point of
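A simple alternative to the cumulative-sum midpoint (not the questioner’s exact approach) is to sort by value and deal rows alternately into the two groups — a greedy heuristic that keeps the means close. A sketch on random data:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"value": rng.integers(1, 100, size=50)})

# Sort descending and deal rows alternately into two groups --
# a greedy heuristic that tends to balance the group means
order = df["value"].sort_values(ascending=False).index
df["group"] = 0
df.loc[order[1::2], "group"] = 1

means = df.groupby("group")["value"].mean()
```

Because adjacent sorted values go to opposite groups, the difference in group sums is bounded by the value range, so the two means stay close even for skewed data; an exact balanced split is the NP-hard partition problem, so a heuristic is usually the right trade-off.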
How to randomly add elements to a column of a dataframe (equally distributed across groups)
Suppose I have the following dataframe: I want to group the dataset by “Type” and then add a new column named “Sampled”, randomly assigning yes/no to each row; the yes/no values should be distributed equally. The expected dataframe can be: Answer You can use numpy.random.choice: output: equal probability per group: For each group, get an arbitrary column (here
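Note that `numpy.random.choice` gives equal *probability*, not an exact 50/50 count. If the split should be exactly balanced within each group, one sketch (with a hypothetical frame following the question’s “Type” column) is to build a half-yes/half-no array per group and shuffle it:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the "Type" column follows the question
df = pd.DataFrame({"Type": ["A"] * 4 + ["B"] * 4, "val": range(8)})

rng = np.random.default_rng(42)

df["Sampled"] = ""
for _, idx in df.groupby("Type").groups.items():
    n = len(idx)
    # exactly half yes / half no (yes gets the extra slot if n is odd),
    # shuffled so the assignment within the group is random
    labels = np.array(["yes", "no"] * ((n + 1) // 2))[:n]
    df.loc[idx, "Sampled"] = rng.permutation(labels)
```

Shuffling a balanced array guarantees the equal distribution per group, which independent coin flips only achieve on average.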
New rows based on a string – Pandas, Python
I have this pandas df. I need to be able to break down the ‘cast’ field in such a way that it spans several rows. Example: I understand that I should do it with pandas, but it is very complicated; can you help me? Answer You can use split and explode:
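A minimal sketch of the split-and-explode pattern, assuming the ‘cast’ field is a comma-separated string (the sample data is made up):

```python
import pandas as pd

# Hypothetical frame with a comma-separated "cast" column
df = pd.DataFrame({
    "title": ["Movie 1", "Movie 2"],
    "cast":  ["Alice, Bob", "Carol"],
})

# split the string into a list, then explode to one cast member per row
out = df.assign(cast=df["cast"].str.split(", ")).explode("cast")
print(out["cast"].tolist())  # ['Alice', 'Bob', 'Carol']
```

The other columns (`title` here) are repeated automatically for each exploded row.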
Iterating through a column and mapping values
Here is what I am trying to do. I want to substitute the values of this data frame — for example, Bernard substituted as 1, Drake as 2, and so on and so forth. How do I iterate through the column to write a function that can do this? Answer The function already exists – pd.factorize. It
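A minimal sketch of `pd.factorize` on made-up data following the question’s names:

```python
import pandas as pd

df = pd.DataFrame({"name": ["Bernard", "Drake", "Bernard", "Eve"]})

# factorize assigns an integer code to each distinct value,
# in order of first appearance; no explicit iteration is needed
codes, uniques = pd.factorize(df["name"])
df["code"] = codes + 1  # start at 1 rather than 0, as in the question
print(df["code"].tolist())  # [1, 2, 1, 3]
```

`uniques` holds the values in code order, so the mapping can be inverted later with `uniques[df["code"] - 1]`.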