I have a correlation matrix (in the form of a DataFrame) from which I return a Series containing the top n correlated pairs of columns and their correlation values: See this for an example of what I mean. I take the resulting Series object and then cast it to a dictionary like so: The resulting keys of this
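The code from the question is not shown in this excerpt, so the following is only a minimal sketch of the setup it describes. The frame, its column names, and n = 2 are assumptions made for illustration. It keeps the upper triangle of the correlation matrix so that each pair appears once, then casts the top pairs to a dictionary.

```python
import numpy as np
import pandas as pd

# Hypothetical data; the original DataFrame is not shown in the excerpt.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 3, 4, 1, 2],
})

corr = df.corr()

# Keep only the strict upper triangle so each pair is listed once
# and self-correlations on the diagonal are dropped.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# stack() gives a Series indexed by (column, column); dropna() removes
# the masked cells regardless of how the pandas version handles NaN in stack().
top_pairs = upper.stack().dropna().sort_values(ascending=False).head(2)

# Casting to a dict turns each MultiIndex entry into a tuple key.
pairs_dict = top_pairs.to_dict()
print(pairs_dict)
```

In this sketch the dictionary keys are tuples of column names, such as `("a", "b")`.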
Tag: pandas
Using result_type with pandas apply function
I want to use apply on a pandas.DataFrame that I created and return, for each row, a list of values, where each value becomes a column of its own. I wrote the following code: When I add result_type='expand' in order to split the returned array into separate columns, I get the following error: However, if I drop the result_type argument it
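The original code and error message are missing from this excerpt, so this is not a diagnosis of that error. It is a small sketch of how `result_type="expand"` is typically used. The frame and the `row_stats` function are hypothetical. Note that `result_type` only takes effect when `axis=1`.

```python
import pandas as pd

# Hypothetical frame; the question's own data is not shown.
df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

def row_stats(row):
    # Return one list per row; with result_type="expand",
    # each list element becomes its own column.
    return [row["a"] + row["b"], row["a"] * row["b"]]

# axis=1 applies the function row by row.
expanded = df.apply(row_stats, axis=1, result_type="expand")
expanded.columns = ["total", "product"]
print(expanded)
```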
Assigning a string (from a list of string) to a dataframe name pandas
I have a list of names, ['name1', 'name2', …, 'nameN'], that I would like to use as names for the resulting dataframes after filtering the original dataframe by each name in a for loop. Is there a function in Python, similar to assign in R, that I can use to accomplish this? Any other solutions are welcome. As requested. Here is
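One common alternative to creating variables dynamically is to store the filtered frames in a dictionary keyed by name. The sketch below assumes a column called `name` and a `value` column, neither of which is shown in the excerpt.

```python
import pandas as pd

# Hypothetical data; the column names "name" and "value" are assumptions.
df = pd.DataFrame({
    "name": ["name1", "name2", "name1", "name3"],
    "value": [1, 2, 3, 4],
})
names = ["name1", "name2", "name3"]

# One filtered DataFrame per name, looked up by its string key.
frames = {name: df[df["name"] == name] for name in names}

print(frames["name1"])
```

A dictionary keeps the frames addressable by string without touching the module namespace, which is usually easier to loop over and debug than dynamically created variables.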
Is it possible to input values for confidence intervals/error bars on a seaborn barplot?
I’m used to doing my barplots in seaborn and I like its layout for showing confidence bars, but I have a special case in a dataset where I already have the confidence interval, like this: Is there a way to manually input the values for seaborn’s confidence interval lines? Or to set it to “None” and use some matplotlib function
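When the interval bounds are already known, one option is to draw the bars and error lines with matplotlib directly. The sketch below swaps seaborn out for plain matplotlib, which is a substitution rather than the question's own approach. The column names `group`, `mean`, `ci_low`, and `ci_high` and all values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical precomputed means and confidence bounds.
data = pd.DataFrame({
    "group": ["A", "B", "C"],
    "mean": [2.0, 3.5, 1.5],
    "ci_low": [1.5, 3.0, 1.0],
    "ci_high": [2.8, 4.2, 1.9],
})

# matplotlib expects error bar lengths relative to the bar height,
# as a (2, N) array: first row below the bar top, second row above it.
yerr = np.array([
    (data["mean"] - data["ci_low"]).to_numpy(),
    (data["ci_high"] - data["mean"]).to_numpy(),
])

fig, ax = plt.subplots()
ax.bar(list(data["group"]), data["mean"].to_numpy(), yerr=yerr, capsize=4)
ax.set_ylabel("mean")
fig.savefig("barplot_with_ci.png")
```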
Why does Pandas give AttributeError: ‘SeriesGroupBy’ object has no attribute ‘pct’?
I’m trying to pass a user-defined function pct to the Pandas agg method. It works if I pass only that function, but not when I use the dictionary format for defining the functions. Does anyone know why? This returns the expected result: But this returns the following error: Answer: You passed the string ‘pct’, but you need the variable pct (the lambda function), by removing
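The distinction the answer draws can be sketched as follows. Strings passed to `agg` are looked up as built-in method names, so a user-defined function has to be passed as the object itself. The frame and the body of `pct` here are hypothetical, since the original definitions are not shown.

```python
import pandas as pd

# Hypothetical data; the question's frame is not shown.
df = pd.DataFrame({"group": ["x", "x", "y"], "value": [1, 3, 4]})

def pct(s):
    # Hypothetical definition: the group's share of the grand total.
    return s.sum() / df["value"].sum()

# "sum" is a string because it names a built-in groupby method;
# pct is passed as a callable, not as the string "pct".
result = df.groupby("group").agg({"value": ["sum", pct]})
print(result)
```

Passing `"pct"` in quotes instead would make pandas look for a method named `pct` on the grouped object, which is what produces the AttributeError in the title.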
How to move a column in a pandas dataframe
I want to take a column indexed ‘length’ and make it my second column. It currently exists as the 5th column. I have tried: I see the following error: TypeError: must be str, not list I’m not sure how to interpret this error because it actually should be a list, right? Also, is there a general method to move any
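A general way to move any column is to reorder the list of column names and reindex the frame with it. In this sketch the helper `move_column` and every column name except `length` are hypothetical.

```python
import pandas as pd

# Hypothetical frame in which "length" is the 5th column.
df = pd.DataFrame({
    "id": [1], "width": [2], "height": [3], "depth": [4], "length": [5],
})

def move_column(frame, name, position):
    """Return a copy of frame with column `name` moved to `position` (0-based)."""
    cols = list(frame.columns)
    cols.insert(position, cols.pop(cols.index(name)))
    return frame[cols]

# Position 1 is the second column.
moved = move_column(df, "length", 1)
print(list(moved.columns))
```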
Convert dataframe to a rec array (and objects to strings)
I have a pandas dataframe with a mix of datatypes (dtypes) that I wish to convert to a numpy structured array (or record array, basically the same thing in this case). For purely numeric dataframes, this is easy to do with the to_records() method. I also need the dtypes of pandas columns to be converted to strings rather than objects
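One possible approach, sketched here under assumptions, is the `column_dtypes` argument of `to_records`, which lets individual columns be given a fixed-width string dtype instead of `object`. The frame and its column names are hypothetical, and the string width is computed from the data.

```python
import pandas as pd

# Hypothetical mixed-dtype frame.
df = pd.DataFrame({
    "id": [1, 2],
    "score": [0.5, 1.5],
    "label": ["ab", "cde"],
})

# Fixed-width unicode dtype wide enough for the longest string.
width = int(df["label"].str.len().max())

# Numeric columns keep their dtypes; "label" is stored as a string field.
rec = df.to_records(index=False, column_dtypes={"label": f"U{width}"})
print(rec.dtype)
```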
Delete rows that do not contain specific text
I have a tabular file that looks like this: I’m trying to create a script to go through and delete the entire row if column 2 (‘KEGG_KOs’) does not begin with ‘K0’. I’m trying to create an output of: Previous responses have referred people to pandas DataFrame but I’ve had no luck using those responses to help. Any would be
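The file itself is not shown in this excerpt, so the sketch below uses a made-up tab-separated table. Only the column name `KEGG_KOs` and the `K0` prefix come from the question. It keeps the rows whose `KEGG_KOs` value begins with `K0` and drops the rest.

```python
import io
import pandas as pd

# Hypothetical tab-separated content standing in for the real file.
raw = "gene\tKEGG_KOs\ng1\tK00001\ng2\t-\ng3\tK01234\ng4\tko:K00002\n"
table = pd.read_csv(io.StringIO(raw), sep="\t")

# na=False treats missing values as non-matching instead of propagating NaN.
kept = table[table["KEGG_KOs"].str.startswith("K0", na=False)]
print(kept)
```

With a real file, `pd.read_csv("path/to/file", sep="\t")` would replace the `io.StringIO` stand-in, and `kept.to_csv(..., sep="\t", index=False)` would write the filtered rows back out.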
How to find last occurrence index matching a certain value in a Pandas Series?
How do I find the last occurrence index for a certain value in a Pandas Series? For example, let’s say I have a Series that looks as follows: And I want to find the last index for a True value (i.e. index 3), how would you go about it? Answer: Use last_valid_index: Output: Using @user3483203’s example: Output
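The answer's code is missing from this excerpt, so this is only a sketch of how `last_valid_index` can be applied to the problem. The Series is a hypothetical one whose last True sits at index 3, matching the description.

```python
import pandas as pd

# Hypothetical Series; the last True value is at index 3.
s = pd.Series([True, False, False, True, False])

# where(s) replaces the False entries with NaN, so the last
# valid (non-NaN) index is the last position holding True.
last_true = s.where(s).last_valid_index()

# Alternative: filter to the True entries and take the final index label.
alt = s[s].index[-1]

print(last_true, alt)
```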
Pandas: Conditionally replace values based on other columns values
I have a dataframe (df) that looks like this: Now my goal is for each add_rd in the event column, the associated NaN value in the environment column should be replaced with the string RD. What I did so far: I stumbled across df['environment'] = df['environment'].fillna('RD'), which replaces every NaN (which is not what I am looking for), and pd.isnull(df['environment']), which is
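A conditional fill like this is often written with a boolean mask and `.loc`. The sketch below uses made-up rows. Only the column names `event` and `environment` and the values `add_rd` and `RD` come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the question's actual frame is not shown.
df = pd.DataFrame({
    "event": ["add_rd", "add_env", "add_rd", "other"],
    "environment": [np.nan, "PROD", "TEST", np.nan],
})

# Select only rows where the event is add_rd AND the environment is missing.
mask = (df["event"] == "add_rd") & df["environment"].isna()

# Assign through .loc so the change is made on df itself.
df.loc[mask, "environment"] = "RD"
print(df)
```

Unlike a plain `fillna('RD')`, the mask leaves NaN values in rows with other events untouched.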