I have a df where column A is either blank or has a string in it. I tried to write the if statement (all columns are strings) below. Basically, if there is something (any value) in df[A], then the new column value will be a concatenation of columns A, B and C. If there is no value in df[A], then it will concatenate columns B and C.
The part where it's if df[A] returns a true or false value, right? Just like if I were to write bool(df[A]). So if the value is true, it should execute the first block; if not, it should execute the 'else' block.
if df[A]:
    df[new_column] = df[column_A] + df[column_B] + df[column_C]
else:
    df[new_column] = df[column_B] + df[column_C]
I get this error: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Answer
As far as I understand your question, you want to perform the IF-condition for each element. The "+" seems to be a string concatenation, since there are strings in df['A'].
In this case, you don’t need the IF-condition at all, because adding an empty string to another leads to the same result as not adding the string.
import pandas as pd

d = {'A': ['Mr ', '', 'Mrs '],
     'B': ['Max ', 'John ', 'Marie '],
     'C': ['Power', 'Doe', 'Curie']}
df = pd.DataFrame(data=d)
df['new'] = df['A'] + df['B'] + df['C']
Results in:
>>> df
      A       B      C              new
0   Mr     Max   Power     Mr Max Power
1         John     Doe         John Doe
2  Mrs   Marie   Curie  Mrs Marie Curie
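If you still want an explicit per-element condition (for example, because the two branches differ by more than an empty string), one possible sketch is a boolean mask combined with numpy.where. Using numpy.where here is my suggestion rather than something from the question, and the mask name has_a is purely illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['Mr ', '', 'Mrs '],
    'B': ['Max ', 'John ', 'Marie '],
    'C': ['Power', 'Doe', 'Curie'],
})

# Element-wise mask (illustrative name): True where A holds a non-empty string.
has_a = df['A'].ne('')

# np.where takes the first expression where the mask is True,
# and the second expression everywhere else.
df['new'] = np.where(has_a,
                     df['A'] + df['B'] + df['C'],
                     df['B'] + df['C'])
```

The mask is a whole Series of True/False values, one per row, which is why a plain if statement cannot decide on a single branch and raises the "truth value of a Series is ambiguous" error.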
In the case that “blank” refers to NaN and not to an empty string you can do the following:
df['new'] = df[['A', 'B', 'C']].apply(lambda x: ''.join(x.dropna().astype(str)), axis=1)
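To make the NaN case concrete, here is a small sketch with made-up data in which the blank in column A is NaN. It shows the row-wise approach next to a vectorized alternative based on fillna(''), which is my own addition and assumes only column A can be missing. The column names new_apply and new_fillna are just for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['Mr ', np.nan, 'Mrs '],
    'B': ['Max ', 'John ', 'Marie '],
    'C': ['Power', 'Doe', 'Curie'],
})

cols = ['A', 'B', 'C']

# Row-wise: drop missing values in each row, then join what is left.
df['new_apply'] = df[cols].apply(
    lambda row: ''.join(row.dropna().astype(str)), axis=1)

# Vectorized alternative (assumes only A can be NaN):
# treat a missing A as an empty string before concatenating.
df['new_fillna'] = df['A'].fillna('') + df['B'] + df['C']
```

Both columns should hold the same strings for this data. The fillna variant avoids a Python-level function call per row, which tends to matter on larger frames.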
Have a look at this question, which seems to be similar: questions 33098383