Tag: pandas

How to create dummies for certain columns with pandas.get_dummies()

I only want columns A and D to be converted to dummies, not column B. If I use pd.get_dummies(df), all columns are turned into dummies. I want the final result to contain all of the columns, which means columns C and B still exist, like 'A_x', 'A_y', 'B', 'C', 'D_j', 'D_l'. Answer: It can be done without concatenation, using get_dummies() with the required parameters
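
As a rough illustration of the approach the answer hints at, the sketch below passes the `columns` parameter of `pd.get_dummies()` so that only A and D are encoded. The sample values in the frame are made up for the example; only the column names come from the question.

```python
import pandas as pd

# Hypothetical data; the question only names the columns A, B, C and D.
df = pd.DataFrame({
    "A": ["x", "y", "x"],
    "B": [1, 2, 3],
    "C": [0.5, 1.5, 2.5],
    "D": ["j", "l", "j"],
})

# Only the columns listed in `columns` are encoded; B and C pass through unchanged.
encoded = pd.get_dummies(df, columns=["A", "D"])
print(encoded.columns.tolist())
```

The untouched columns are kept as they are, and each encoded column is replaced by one indicator column per distinct value, named with the original column as prefix.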

pandas dataframe str.contains() AND operation

I have a df (pandas DataFrame) with three rows: The function df.col_name.str.contains("apple|banana") will catch all of the rows: How do I apply an AND operator to the str.contains() method, so that it only grabs strings that contain BOTH "apple" and "banana"? I’d like to grab strings that contain 10-20 different words (grape, watermelon, berry, orange, …, etc.) Answer: You can do
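
The answer is cut off in this excerpt, so the following is only one possible way to get AND semantics, not necessarily what the answer proposed. It sketches two options: combining one boolean mask per word, and building a regex of lookaheads. The three sample rows and the `words` list are assumptions made for the example.

```python
import re
import pandas as pd

# Hypothetical rows; the original three rows are not shown in the excerpt.
df = pd.DataFrame({"col_name": [
    "apple is delicious",
    "banana is delicious",
    "apple and banana both are delicious",
]})
words = ["apple", "banana"]  # extend to 10-20 words as needed

# Option 1: AND together one literal-substring mask per word.
mask = pd.Series(True, index=df.index)
for word in words:
    mask &= df["col_name"].str.contains(word, regex=False)
both = df[mask]

# Option 2: a single regex made of lookaheads, one per word.
pattern = "".join(f"(?=.*{re.escape(w)})" for w in words)
both_regex = df[df["col_name"].str.contains(pattern)]

print(both)
```

Both options match substrings, so "pineapple" would also count as containing "apple"; word boundaries would need to be added to the pattern if that matters.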

Pandas replace all items in a row with NaN if one value is NaN

I want to get rid of some records with NaNs. This works perfectly: However, it changes the shape of my dataframe, and the index is no longer uniformly spaced. Therefore, I’d like to replace all items in these rows with np.nan. Is there a simple way to do this? I was thinking about resampling the dataframe after dropna, but that
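
A minimal sketch of one way to do this, assuming a small made-up frame: build a row mask with `isna().any(axis=1)` and assign `np.nan` to those rows, which keeps the shape and index intact.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one incomplete row.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# True for every row that contains at least one NaN.
has_nan = df.isna().any(axis=1)

# Blank out those rows entirely; the index and shape are unchanged.
df.loc[has_nan, :] = np.nan
print(df)
```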

Fill empty cells in column with value of other columns

I have an HC list in which every entry should have an ID, but some entries do not have one. I would like to fill those empty cells by combining the first name column and the last name column. How would I go about this? I tried googling for fillna and the like but couldn’t get it to
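
One way this could look, assuming the missing IDs are stored as NaN and that the columns are called `ID`, `first_name` and `last_name` (all three names and the sample rows are hypothetical): `fillna()` accepts a Series, so the concatenated names can be passed in directly.

```python
import numpy as np
import pandas as pd

# Hypothetical table; the real column names are not given in the question.
hc = pd.DataFrame({
    "ID": ["a01", np.nan, "c03"],
    "first_name": ["Ann", "Bob", "Cy"],
    "last_name": ["Lee", "Ray", "Fox"],
})

# fillna aligns on the index, so only the missing IDs take the combined name.
hc["ID"] = hc["ID"].fillna(hc["first_name"] + hc["last_name"])
print(hc)
```

If the empty cells are empty strings rather than NaN, they would need to be converted to NaN first for `fillna()` to see them.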

label-encoder encoding missing values

I am using the label encoder to convert categorical data into numeric values. How does LabelEncoder handle missing values? Output: For the above example, the label encoder changed NaN values to a category. How would I know which category represents missing values? Answer: Don’t use LabelEncoder with missing values. I don’t know which version of scikit-learn you’re using, but in 0.17.1
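
The answer is truncated here, so what it recommends instead is not known. As one alternative outside scikit-learn, `pd.factorize()` assigns integer codes and marks missing values with the sentinel -1, which makes them easy to identify. The sample values below are made up.

```python
import numpy as np
import pandas as pd

# Hypothetical categorical column with one missing value.
colors = pd.Series(["red", "blue", np.nan, "red"])

# codes holds one integer per row; uniques holds the categories in order of appearance.
codes, uniques = pd.factorize(colors)
print(codes)
print(list(uniques))
```

Here the missing entry gets the code -1 and does not appear in `uniques`, so it is never confused with a real category.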

dataframe to long format

I have the following df: I would like to change it so that it looks like this: The reason is that I have a df that is similarly shaped and I need to merge the two dfs. I have recently had similar df shaping issues for which I have been unable to find simple, quick solutions in Python. Does anyone know
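
The frames from the question are not shown in this excerpt, so the following is only a generic sketch of wide-to-long reshaping with `DataFrame.melt()`. The frame, its column names, and the `variable`/`value` output names are all assumptions for illustration.

```python
import pandas as pd

# Hypothetical wide frame: one identifier column and two measurement columns.
wide = pd.DataFrame({"id": [1, 2], "x": [10, 20], "y": [30, 40]})

# Each (id, column) pair becomes its own row in the long frame.
long = wide.melt(id_vars="id", var_name="variable", value_name="value")
print(long)
```

A long frame shaped this way can then be merged with another frame on the identifier and variable columns.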
