I would like to know if there is some way of replacing all negative numbers in a DataFrame with zeros? Answer If all your columns are numeric, you can use boolean indexing: For the more general case, this answer shows the private method _get_numeric_data: With timedelta type, boolean indexing seems to work on separate columns, but not on the whole dataframe. So you
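A minimal sketch of the boolean-indexing idea mentioned in the answer. The frames, column names, and values are made up for illustration. For the mixed-type case it swaps in the public select_dtypes method rather than the private _get_numeric_data, which is an assumption about what is acceptable here and not the answer's own code.

```python
import pandas as pd

# Hypothetical all-numeric frame; names and values are illustrative only.
df = pd.DataFrame({"a": [1, -2, 3], "b": [-4.5, 5.0, -6.0]})

# Boolean indexing: assign 0 wherever the element-wise mask is True.
df[df < 0] = 0

# Mixed dtypes: restrict the operation to the numeric columns.
# select_dtypes is public API, used here instead of _get_numeric_data.
mixed = pd.DataFrame({"a": [1, -2], "label": ["x", "y"]})
num_cols = mixed.select_dtypes(include="number").columns
mixed[num_cols] = mixed[num_cols].clip(lower=0)  # clip raises values below 0 to 0
```

Both forms leave non-negative values untouched; clip(lower=0) is simply another way to express the same replacement.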
Tag: dataframe
How to loop over grouped Pandas dataframe?
DataFrame: Code: I’m trying to just loop over the aggregated data, but I get the error: ValueError: too many values to unpack @EdChum, here’s the expected output: The output is not the problem, I wish to loop over every group. Answer df.groupby('l_customer_id_i').agg(lambda x: ','.join(x)) already returns a dataframe, so you cannot loop over the groups anymore. In general: df.groupby(…)
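A short sketch of the distinction the answer draws, assuming a small made-up frame. Only l_customer_id_i comes from the question; the item column is hypothetical. Iterating the GroupBy object yields (key, sub-frame) pairs, while the aggregated result is an ordinary DataFrame that has to be iterated row by row.

```python
import pandas as pd

df = pd.DataFrame({
    "l_customer_id_i": [1, 1, 2],
    "item": ["a", "b", "c"],  # hypothetical column for illustration
})

# Loop over the groups themselves: each step gives the key and a sub-frame.
groups = {}
for customer_id, group in df.groupby("l_customer_id_i"):
    groups[customer_id] = ",".join(group["item"])

# After .agg the groups are gone; what remains is a plain DataFrame,
# so iterate its rows (index value, row) instead.
agg = df.groupby("l_customer_id_i").agg(lambda x: ",".join(x))
rows = [(idx, row["item"]) for idx, row in agg.iterrows()]
```

Unpacking two values from a plain DataFrame iteration is what typically triggers the "too many values to unpack" error, because iterating a DataFrame yields column labels.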
Pandas Replace NaN with blank/empty string
I have a Pandas DataFrame as shown below: I want to replace the NaN values with an empty string so that it looks like so: Answer This might help. It will replace all NaNs with an empty string.
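The answer's code is not shown in this excerpt, so the following is only one common way to get the described result, using fillna on a made-up frame of strings.

```python
import numpy as np
import pandas as pd

# Hypothetical frame of strings with missing entries.
df = pd.DataFrame({"x": ["a", np.nan], "y": [np.nan, "b"]})

# fillna returns a new frame with every NaN replaced by the given value.
filled = df.fillna("")
```

Keep in mind that a column holding both numbers and empty strings can no longer have a purely numeric dtype, so this is best suited to display or export.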
How to split a dataframe by unique groups and save to a csv
I have a pandas dataframe I would like to iterate over. A simplified example of my dataframe: I would like to iterate over each unique gene and create a new file named: For the above example I should get three iterations with 3 outfiles and 3 dataframes: The resulting data frame contents split up by chunks will be sent to
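A minimal sketch of one way to do this with groupby, assuming a made-up frame with a gene column. The file naming scheme and the temporary output directory are hypothetical, since the naming pattern is not shown in the excerpt.

```python
import os
import tempfile

import pandas as pd

# Hypothetical data: three unique genes, so three output files.
df = pd.DataFrame({
    "gene": ["A", "A", "B", "C"],
    "value": [1, 2, 3, 4],
})

out_dir = tempfile.mkdtemp()  # stand-in for the real output directory
written = []
for gene, chunk in df.groupby("gene"):
    # Hypothetical naming scheme: one CSV per gene.
    path = os.path.join(out_dir, f"{gene}.csv")
    chunk.to_csv(path, index=False)
    written.append(path)
```

Each chunk is itself a DataFrame, so it can be passed on to further processing as well as written to disk.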
Find the column name of the second largest value of each row in a Pandas DataFrame
I am trying to find the column names associated with the largest and second largest values in a DataFrame, here’s a simplified example (the real one has over 500 columns): Needs to become: I can find the column name with the largest value (i.e., 1larg above) with idxmax, but how can I find the second largest? Answer (You don’t have any
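The answer is cut off in this excerpt, so this is not necessarily its method. It is a sketch of one approach that sorts each row with NumPy and maps the positions back to column names. The data is made up, 1larg follows the question, and 2larg is an assumed name for the second column.

```python
import numpy as np
import pandas as pd

# Hypothetical data with distinct values in every row.
df = pd.DataFrame(
    {"A": [3, 9], "B": [7, 8], "C": [5, 2]},
    index=["r1", "r2"],
)

# Sort column positions per row in descending order of value.
order = np.argsort(-df.to_numpy(), axis=1)
cols = df.columns.to_numpy()

result = pd.DataFrame(
    {
        "1larg": cols[order[:, 0]],  # column of the largest value
        "2larg": cols[order[:, 1]],  # column of the second largest (assumed name)
    },
    index=df.index,
)
```

With ties in a row, the order among the tied columns is not guaranteed by this sketch.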
Using Python Pandas to bin data in one df according to bins defined in a second df
I am attempting to bin data in one dataframe according to bins defined in a second dataframe. I am thinking that some combination of pd.bin and pd.merge might get me there? This is basically the form each dataframe is currently in: df: And this is the table with the bins, df2: I would like to match the bin, and find
python pandas flatten a dataframe to a list
I have a df like so: I want to flatten the df so it is one continuous list like so: ['1/2/2014', 'a', '6', 'z1', '1/2/2014', 'a', '3', 'z1', '1/3/2014', 'c', '1', 'x3'] I can loop through the rows and extend to a list, but is there a much easier way to do it? Answer You can use .flatten() on the DataFrame converted
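A small sketch of the flatten idea, assuming the frame is first converted to a NumPy array. The values mirror the list in the question, while the column names are hypothetical because the original frame is not shown.

```python
import pandas as pd

# Values taken from the expected list; column names are made up.
df = pd.DataFrame({
    "date": ["1/2/2014", "1/2/2014", "1/3/2014"],
    "name": ["a", "a", "c"],
    "num": ["6", "3", "1"],
    "code": ["z1", "z1", "x3"],
})

# Convert to a 2-D array, flatten it row by row, then turn it into a list.
flat = df.to_numpy().flatten().tolist()
```

flatten works in row-major order by default, which is why the values of each row stay together.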
Pandas – Compute z-score for all columns
I have a dataframe containing a single column of IDs and all other columns are numerical values for which I want to compute z-scores. Here’s a subsection of it: Some of my columns contain NaN values which I do not want to include in the z-score calculations so I intend to use a solution offered to this question: how to
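One possible way to do this with pandas alone, not necessarily the solution the question refers to. It relies on pandas' mean and std skipping NaN by default. The column names, the data, and the choice of the population standard deviation (ddof=0) are assumptions.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one ID column, the rest numeric, with a NaN.
df = pd.DataFrame({
    "ID": ["x", "y", "z"],
    "v1": [1.0, 2.0, 3.0],
    "v2": [10.0, np.nan, 30.0],
})

num_cols = df.columns.drop("ID")  # every column except the IDs

z = df.copy()
# mean() and std() ignore NaN, so missing cells do not affect the statistics
# and simply stay NaN in the result.
z[num_cols] = (df[num_cols] - df[num_cols].mean()) / df[num_cols].std(ddof=0)
```

Pass ddof=1 instead if the sample standard deviation is wanted.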
How do I create test and train samples from one dataframe with pandas?
I have a fairly large dataset in the form of a dataframe and I was wondering how I would be able to split the dataframe into two random samples (80% and 20%) for training and testing. Thanks! Answer I would just use numpy’s rand: And just to see this has worked:
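A sketch of a random mask split on made-up data. It draws from np.random.rand, which is uniform on [0, 1), so that the 0.8 threshold keeps roughly 80% of the rows. The seed and the frame are assumptions added for reproducibility.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the split is repeatable

# Hypothetical dataset of 100 rows.
df = pd.DataFrame(np.random.randn(100, 2), columns=["a", "b"])

# One uniform draw per row; True marks a training row.
msk = np.random.rand(len(df)) < 0.8

train = df[msk]
test = df[~msk]
```

The split is only approximately 80/20 because each row is assigned independently. For an exact split, df.sample(frac=0.8) followed by dropping the sampled index is an alternative.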
pandas apply function that returns multiple values to rows in pandas dataframe
I have a dataframe with a time index and 3 columns containing the coordinates of a 3D vector: I would like to apply a transformation to each row that also returns a vector but if I do: I end up with a Pandas series whose elements are tuples. This is because apply will take the result of myfunc without unpacking it.
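A minimal sketch of the behaviour described, with a made-up frame and a hypothetical body for myfunc. It then shows one way to get separate columns back, using the result_type="expand" option of DataFrame.apply.

```python
import pandas as pd

# Hypothetical 3D vectors on a time index.
df = pd.DataFrame(
    {"x": [1.0, 2.0], "y": [3.0, 4.0], "z": [5.0, 6.0]},
    index=pd.to_datetime(["2014-01-01", "2014-01-02"]),
)

def myfunc(row):
    # Hypothetical transformation: cyclically permute the coordinates.
    return row["y"], row["z"], row["x"]

# Default behaviour: a Series whose elements are tuples.
as_tuples = df.apply(myfunc, axis=1)

# result_type="expand" spreads each returned tuple across columns.
expanded = df.apply(myfunc, axis=1, result_type="expand")
expanded.columns = ["x", "y", "z"]
```

Returning a pd.Series from the function instead of a tuple is another way to get a DataFrame back.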