I have CSV files like the following:
    ID  Result1  Result2  Result3
    0   1        1        0
    1   0        1        0
    2   0        0        0
    3   0        0        0
    4   1        0        0
    5   0        0        0
I want to set every value in a Result column (Result1, Result2, Result3) to zero if the sum of that column's values is greater than 1. I tried this script:
    new = []
    df1 = pd.read_csv(r"C:Userstest.csv")
    df = df1.loc[:, df1.columns != 'ID']
    l = list(df.columns.values)

    for l in l:
        if sum(df[l]) > 1:
            df[l].loc[(df[l] > 0)] = 0
            new.append(l)
    df2 = df[new]
When I run this, the Result3 column is dropped because all of its values are zero. How can I apply this operation only to the columns that satisfy the condition, without affecting the columns that don't?
Answer
    import pandas as pd

    # Sample data matching the Result columns from the question
    data = {'Result1': {0: 1, 1: 0, 2: 0, 3: 0, 4: 1, 5: 0},
            'Result2': {0: 1, 1: 1, 2: 0, 3: 0, 4: 0, 5: 0},
            'Result3': {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0}}

    df = pd.DataFrame.from_dict(data)

    # Zero out a column in place when its sum exceeds 1;
    # every other column is left untouched and stays in df
    for col in df.columns:
        if df[col].sum() > 1:
            df[col] = 0

    print(df)
gives
       Result1  Result2  Result3
    0        0        0        0
    1        0        0        0
    2        0        0        0
    3        0        0        0
    4        0        0        0
    5        0        0        0
In your example dataframe, this makes all columns zero. To see that this is not always the case, put a 1 in the last column and verify that it survives the operation.
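That check can be sketched as follows. This is only an illustration under a few assumptions: the 1 placed in the last row of Result3 is made up for the test, the ID column from the question is kept in the frame and excluded by name, and `result_cols` is a hypothetical helper name that is not in the original code.

```python
import pandas as pd

# Same data as the question, except Result3 has a single 1 in the last row
# (an assumption for this check), so its column sum is 1, which is not > 1.
df = pd.DataFrame({
    "ID": [0, 1, 2, 3, 4, 5],
    "Result1": [1, 0, 0, 0, 1, 0],
    "Result2": [1, 1, 0, 0, 0, 0],
    "Result3": [0, 0, 0, 0, 0, 1],
})

# Only the result columns are candidates; ID is never modified.
# (result_cols is a hypothetical name used for illustration.)
result_cols = [col for col in df.columns if col != "ID"]

# Overwrite a column with zeros only when its sum exceeds 1.
for col in result_cols:
    if df[col].sum() > 1:
        df[col] = 0

print(df)
```

Result1 and Result2 each sum to 2, so they are overwritten with zeros, while Result3 sums to 1 and keeps its value. Because the loop assigns into the existing frame instead of building a new one from a list of matching columns, no column is dropped.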