I have this DataFrame and want only the records whose EPS column is not NaN: …i.e. something like df.drop(...) to get this resulting dataframe: How do I do that? Answer: Don’t drop, just take the rows where EPS is not NA:
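A minimal sketch of the approach named in the answer, keeping rows where EPS is not NA instead of dropping. The sample data and the `ticker` column are hypothetical; only the `EPS` column name comes from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the EPS column name comes from the question.
df = pd.DataFrame({
    "ticker": ["AAA", "BBB", "CCC", "DDD"],
    "EPS": [np.nan, 0.45, np.nan, 1.20],
})

# Boolean mask: True where EPS holds a real value, False where it is NaN.
kept = df[df["EPS"].notna()]

# An equivalent spelling: dropna restricted to the EPS column.
kept_alt = df.dropna(subset=["EPS"])

print(kept)
```

Both forms return a new DataFrame and leave `df` unchanged.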
Tag: pandas
T-test in Pandas
If I want to calculate the mean of two categories in Pandas, I can do it like this: I have a lot of data formatted this way, and now I need to do a T-test to see whether the means of cat1 and cat2 are statistically different. How can I do that? Answer: It depends what sort of t-test you
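Since the answer is cut off, the following is only one possible sketch. It assumes an independent two-sample test with unequal variances (Welch's t-test) via `scipy.stats.ttest_ind`. The `Category` and `score` column names and the numbers are hypothetical.

```python
import pandas as pd
from scipy import stats

# Hypothetical data: two categories with one numeric measurement each.
df = pd.DataFrame({
    "Category": ["cat1"] * 5 + ["cat2"] * 5,
    "score": [1.0, 2.0, 3.0, 2.5, 1.5, 4.0, 5.0, 6.0, 5.5, 4.5],
})

# Split the measurements by category.
cat1 = df.loc[df["Category"] == "cat1", "score"]
cat2 = df.loc[df["Category"] == "cat2", "score"]

# Welch's t-test: does not assume the two groups share a variance.
result = stats.ttest_ind(cat1, cat2, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```

If the two samples are paired measurements of the same subjects, `scipy.stats.ttest_rel` would be the call to look at instead.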
pandas pivot dataframe to 3d data
There seem to be a lot of possibilities to pivot flat table data into a 3d array but I’m somehow not finding one that works: Suppose I have some data with columns=['name', 'type', 'date', 'value']. When I try to pivot via I get Am I reading docs from dev pandas maybe? It seems like this is the usage described there.
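The call the asker tried is not shown, so this is a sketch of one possible route rather than a fix for it: pivot to a wide table with a two-level column index, then reshape the values into a NumPy array. The sample rows are hypothetical, and the choice of axis order (date, name, type) is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical flat data with the four columns named in the question.
df = pd.DataFrame({
    "name": ["a", "a", "b", "b", "a", "a", "b", "b"],
    "type": ["x", "y", "x", "y", "x", "y", "x", "y"],
    "date": ["2020-01-01"] * 4 + ["2020-01-02"] * 4,
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
})

# Wide table: one row per date, one column per (name, type) pair.
wide = df.pivot_table(index="date", columns=["name", "type"], values="value")

# Reindex to the full name x type grid so missing pairs become NaN
# and the reshape below is always valid.
names = sorted(df["name"].unique())
types = sorted(df["type"].unique())
full_cols = pd.MultiIndex.from_product([names, types], names=["name", "type"])
wide = wide.reindex(columns=full_cols)

# 3D array indexed as cube[date, name, type].
cube = wide.to_numpy().reshape(len(wide.index), len(names), len(types))

print(cube.shape)
```

Note that `pivot_table` averages duplicate (date, name, type) entries by default.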
How to add a new column to an existing DataFrame?
I have the following indexed DataFrame with named columns and non-continuous row numbers: I would like to add a new column, ‘e’, to the existing data frame and do not want to change anything in the data frame (i.e., the new column always has the same length as the DataFrame). How can I add column e to the above
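A minimal sketch of one way to do this, assuming the new values are available as a list in row order. The sample frame and values are hypothetical. The point it illustrates is that pandas aligns a Series on index labels, so with a non-continuous index the new column should carry the frame's own index.

```python
import pandas as pd

# Hypothetical frame with a non-continuous integer index.
df = pd.DataFrame({"a": [0.1, 0.2, 0.3], "b": [1, 2, 3]}, index=[2, 3, 5])

# Build the new column on df's own index so values line up row by row.
# A Series with the default 0..n-1 index would be aligned by label instead
# and leave NaN wherever the labels do not match.
e = pd.Series([10, 20, 30], index=df.index)

# assign returns a new DataFrame and leaves the original untouched.
with_e = df.assign(e=e)

print(with_e)
```

Writing `df["e"] = e` does the same thing in place.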
Remove duplicates by column A, keeping the row with the highest value in column B
I have a dataframe with repeated values in column A. I want to drop duplicates, keeping the row with the highest value in column B. So this: Should turn into this: I’m guessing there’s probably an easy way to do this, maybe as easy as sorting the DataFrame before dropping duplicates, but I don’t know groupby’s internal logic well enough to figure
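The sorting idea from the question can be sketched as follows, together with a groupby-based alternative. The sample values are hypothetical.

```python
import pandas as pd

# Hypothetical data with repeated values in column A.
df = pd.DataFrame({"A": [1, 1, 2, 2, 3], "B": [10, 20, 30, 40, 10]})

# Sort by B so the largest B per A comes last, keep that last row,
# then restore the original row order.
by_sort = (
    df.sort_values("B")
      .drop_duplicates(subset="A", keep="last")
      .sort_index()
)

# Alternative: look up the index label of the maximum B within each A group.
by_group = df.loc[df.groupby("A")["B"].idxmax()]

print(by_sort)
```

When several rows tie for the highest B within one A, the two variants may pick different rows.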
Pandas: create two new columns in a dataframe with values calculated from a pre-existing column
I am working with the pandas library and I want to add two new columns to a dataframe df with n columns (n > 0). These new columns result from the application of a function to one of the columns in the dataframe. The function to apply is like: One method for creating a new column for a function returning
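The function from the question is not shown, so the sketch below uses a hypothetical `calculate` that returns two values per input. It shows one common pattern: map the function over the column, then unzip the resulting pairs into two new columns.

```python
import pandas as pd

def calculate(x):
    """Hypothetical stand-in for the asker's function: returns two values."""
    return x * 2, x ** 2

df = pd.DataFrame({"a": [1, 2, 3]})

# map yields one (double, square) tuple per row; zip(*...) transposes
# those pairs into two sequences, one per new column.
df["double"], df["square"] = zip(*df["a"].map(calculate))

print(df)
```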
pandas: filter rows of DataFrame with operator chaining
Most operations in pandas can be accomplished with operator chaining (groupby, aggregate, apply, etc), but the only way I’ve found to filter rows is via normal bracket indexing. This is unappealing as it requires that I assign df to a variable before being able to filter on its values. Is there something more like the following? Answer: I’m not entirely sure
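The answer is cut off, so this is not necessarily what it went on to suggest. As a sketch, two existing pandas features allow filtering inside a chain without naming the intermediate frame: `.loc` with a callable, and `.query` with an expression string. The data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical data.
df = pd.DataFrame({"group": ["a", "a", "b", "b", "c"], "x": [1, 2, 3, 4, 5]})

result = (
    df.groupby("group", as_index=False)["x"].sum()
      # .loc accepts a callable that receives the intermediate frame.
      .loc[lambda d: d["x"] > 3]
      # .query filters with an expression evaluated against the columns.
      .query("group != 'c'")
)

print(result)
```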
Create a Pandas Dataframe by appending one row at a time
How do I create an empty DataFrame, then add rows, one by one? I created an empty DataFrame: Then I can add a new row at the end and fill a single field with: It works for only one field at a time. What is a better way to add a new row to df? Answer: You can use df.loc[i], where
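A minimal sketch of the `df.loc[i]` approach the answer starts to describe, assuming `i` is the next unused row label. The column names and values are hypothetical. A second variant, which is not from the answer, collects the rows first and builds the frame once.

```python
import pandas as pd

# Hypothetical column names.
df = pd.DataFrame(columns=["lib", "qty1", "qty2"])

# Assigning to a label that does not exist yet appends a full row.
for i in range(3):
    df.loc[i] = [f"name{i}", i, i * 10]

# Alternative (not from the answer): gather rows as dicts, build once.
rows = [{"lib": f"name{i}", "qty1": i, "qty2": i * 10} for i in range(3)]
df_from_rows = pd.DataFrame(rows)

print(df)
```

Growing a frame row by row copies data on each step, so building from a list tends to scale better for many rows.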
How to add hovering annotations to a plot
I am using matplotlib to make scatter plots. Each point on the scatter plot is associated with a named object. I would like to be able to see the name of an object when I hover my cursor over the point on the scatter plot associated with that object. In particular, it would be nice to be able to quickly
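One way this could be sketched with matplotlib's own event handling: keep a single hidden annotation and, on every mouse move, show it at whichever scatter point the cursor is over. The point names and coordinates are hypothetical, and the handler name `on_move` is made up for this example.

```python
import matplotlib.pyplot as plt

# Hypothetical named points.
names = ["alpha", "beta", "gamma"]
x = [1.0, 2.0, 3.0]
y = [2.0, 1.0, 3.0]

fig, ax = plt.subplots()
sc = ax.scatter(x, y)

# One reusable annotation, hidden until the cursor is over a point.
annot = ax.annotate(
    "", xy=(0, 0), xytext=(10, 10), textcoords="offset points",
    bbox=dict(boxstyle="round", fc="w"),
)
annot.set_visible(False)

def on_move(event):
    """Show the name of the point under the cursor, hide it otherwise."""
    if event.inaxes is not ax:
        return
    hit, info = sc.contains(event)
    if hit:
        i = info["ind"][0]              # first point under the cursor
        annot.xy = sc.get_offsets()[i]  # anchor the label at that point
        annot.set_text(names[i])
        annot.set_visible(True)
    else:
        annot.set_visible(False)
    fig.canvas.draw_idle()

fig.canvas.mpl_connect("motion_notify_event", on_move)

# In an interactive session, call plt.show() to open the window.
```

This needs an interactive backend to be useful; with a non-interactive backend the figure is built but no hover events arrive.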