Tag: pandas

Add random element from list to existing dataframe string

I have a dataframe with df['Skill'] = python, sql, java. For each string I want to add a random element from (high, low, medium). For example: df['Skill'] = python:high, sql:low, java:medium. I have tried some code, but it adds score['low', 'high', 'medium'] at the en…
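One possible approach, sketched under the assumption that each row of the Skill column holds a single skill string (the excerpt does not show the actual frame), is to draw one level per row and join it with a colon. The seeded generator `rng` is only there to make the illustration repeatable.

```python
import random

import pandas as pd

# Assumed sample data: one skill per row.
df = pd.DataFrame({"Skill": ["python", "sql", "java"]})
levels = ["high", "low", "medium"]

# Seeded generator so the illustration is repeatable; drop the seed for real use.
rng = random.Random(0)

# Draw one level per row and append it after a colon.
df["Skill"] = [f"{skill}:{rng.choice(levels)}" for skill in df["Skill"]]
```

Drawing inside the comprehension gives each row its own choice, instead of attaching the whole list to every row.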

Count number of consecutive True in column, restart when False

pandas python

I work with the following column in a pandas df: I want to add column B that counts the number of consecutive “True” in A. I want to restart every time a “False” comes up. Desired output: Answer: Using cumsum, identify the blocks of rows where the values in column A stay True, then group…
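A minimal sketch of the cumsum-and-group idea, using made-up sample data because the original column is not shown in the excerpt. It may differ in detail from the answer's code: here every False starts a new block, and a running sum of A within each block gives the count.

```python
import pandas as pd

# Assumed sample data standing in for the column in the question.
df = pd.DataFrame({"A": [True, True, False, True, True, True, False, True]})

# Each False increments the block id, so a run of True values shares one id
# with the False that precedes it.
blocks = (~df["A"]).cumsum()

# Running sum of A within each block: False contributes 0, each True adds 1.
df["B"] = df["A"].astype(int).groupby(blocks).cumsum()

print(df["B"].tolist())  # [1, 2, 0, 1, 2, 3, 0, 1]
```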

seaborn: ‘rows’ and ‘x_vars’ at the same time

dataframe matplotlib pandas python seaborn

I want a seaborn multiplot that varies the x-axis variable by column, but varies the subset of data shown by row. I can use PairGrid to vary the variables graphed, and I can use FacetGrid to vary the subsets graphed, but I don’t see any facility to do both at once, even though it seems like a natural ex…

Pandas apply() with axis=0 unexpected behaviour

apply pandas python

I’m using the .apply() method in pandas. I get the same results when using axis=0 and axis=1. When using axis=0 I’d expect a series with four elements (indexed A, B, C, D) as a result. Can anyone tell me why the axis argument doesn’t work in this case? I’m adding a reproducible example…
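For reference, a small sketch of how the axis argument usually behaves, with invented data since the reproducible example is cut off in the excerpt. It does not diagnose the asker's case; it only shows the result shapes one would normally expect.

```python
import pandas as pd

# Assumed sample frame with columns A to D and three rows.
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9], "D": [10, 11, 12]})

# axis=0: the function receives each column, so the result is indexed A, B, C, D.
col_sums = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function receives each row, so the result is indexed 0, 1, 2.
row_sums = df.apply(lambda row: row.sum(), axis=1)

print(col_sums.tolist())  # [6, 15, 24, 33]
print(row_sums.tolist())  # [22, 26, 30]
```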

Turn cell into False based on another row/column in Pandas

pandas python

I have the following table of boolean values:
index  val1   val2   val3   val4   val5   val6
1      True   False  True   True   True   False
2      False  True   True   False  True   False
3      False  False  False  True   False  True
4      True   True   True   False  False  True
I also have the following dictionary: How do I change the table so for every key column

Cleaner way to selectively multiply pandas DataFrame values

dataframe pandas python

Given this example: Where the values in df are multiplied by non-NaN values from factors, is there a cleaner way to do this with pandas? (or numpy for that matter) I had a look at .mul(), but that doesn’t appear to allow me to do what’s required here. Additionally, what if factors contains rows wi…
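One way this is often written, sketched with invented frames because the original example is not included in the excerpt: replace the NaN entries in factors with 1 so that those cells are left unchanged by the multiplication. This assumes df and factors share the same index and columns.

```python
import numpy as np
import pandas as pd

# Assumed sample data with matching index and columns.
df = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]})
factors = pd.DataFrame({"a": [2.0, np.nan, 3.0], "b": [np.nan, 10.0, np.nan]})

# NaN factors become 1, the multiplicative identity, so those cells keep their value.
result = df * factors.fillna(1)

print(result["a"].tolist())  # [2.0, 2.0, 9.0]
print(result["b"].tolist())  # [4.0, 50.0, 6.0]
```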

Pandas “A value is trying to be set on a copy of a slice from a DataFrame”

dataframe pandas python

Having a bit of trouble understanding the documentation. See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy dfbreed['x'] = dfbreed.apply(testbreed, axis=1) C:/Users/erasmuss/PycharmProjects/Sarah/farmdata.py:38:…

Pandas: how can I cast a column to float when it holds floats, strings, and strings that cannot be converted?

pandas python

Let’s say I have the following pandas df: And I wish to remove all rows in which the value in the column floats cannot be cast to float, and cast all remaining values to float: Answer: use to_numeric() + dropna(): OR in 2 steps: output:
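The answer's code is not included in the excerpt, so the following is only a sketch of the to_numeric() plus dropna() combination it names, written as two steps. The sample values are invented; the column name floats is taken from the question.

```python
import pandas as pd

# Assumed sample data mixing floats, numeric strings, and unconvertible strings.
df = pd.DataFrame({"floats": [1.5, "2.5", "abc", 4, "x1"]})

# Step 1: convert what can be converted; everything else becomes NaN.
df["floats"] = pd.to_numeric(df["floats"], errors="coerce")

# Step 2: drop the rows that could not be converted.
df = df.dropna(subset=["floats"])

print(df["floats"].tolist())  # [1.5, 2.5, 4.0]
```

Note that rows which already held NaN before the conversion would be dropped as well.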

How to clean data so that the correct arrival code is there for the city pair?

dataframe pandas python sql

From the picture, the CSV has three columns: column 1 is the City Pair (Departure – Arrival), column 2 is meant to be the Departure Code, and column 3 is meant to be the Arrival Code. As you can see for row 319 in the first column,

Best way to iterate through elements of pandas Series

pandas python

All of the following seem to work for iterating through the elements of a pandas Series. I’m sure there are more ways of doing it. What are the differences and which is the best way? Answer (TL;DR): Iterating in pandas is an antipattern and can usually be avoided by vectorizing, applying, aggrega…
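To make the contrast concrete, here is a small sketch with an invented Series, since the asker's own snippets are not shown in the excerpt. It puts two common ways of looping next to the vectorized form of the same operation.

```python
import pandas as pd

# Assumed sample Series.
s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# Explicit iteration over the values.
looped = [value * 2 for value in s]

# Iteration over (index, value) pairs.
pairs = [(label, value) for label, value in s.items()]

# Vectorized equivalent of the first loop: no Python-level loop at all.
vectorized = s * 2

print(looped)               # [2, 4, 6]
print(vectorized.tolist())  # [2, 4, 6]
```

The vectorized form keeps the index and is usually the faster choice on large Series.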