col != ['SourceFile','Label']
df['FileDescription']=df[col].apply(lambda row:'_'.join(row.values.astype(str)),axis=1)
I want to combine the values of all columns except two, ‘SourceFile’ and ‘Label’. I tried the code above, which resulted in a ValueError. There are too many columns to list by hand, so I can’t use:
col=['SourceFile','AggregationType','APP14Flags0','APP14Flags1','Application','ArchivedFileName','Artist', ..]
df['FileDescription']=df[col].apply(lambda row:'_'.join(row.values.astype(str)),axis=1)
Answer
col != ['SourceFile','Label']
is a comparison, not an assignment, so it never defines col. If col has not been defined earlier, this line raises a NameError, not a ValueError.
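As a quick check of that point, here is a minimal sketch. The name `undefined_cols` is hypothetical and is deliberately left unassigned to stand in for `col`:

```python
# `undefined_cols` is a hypothetical name that has never been assigned.
try:
    undefined_cols != ['SourceFile', 'Label']  # a comparison, not an assignment
    error_name = None
except NameError as exc:
    # Python fails while looking up the name, before any comparison happens.
    error_name = type(exc).__name__

print(error_name)  # NameError
```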
First, put the columns you don’t want into a set.
col = set(['SourceFile','Label'])
Now get all the columns as a set:
allCols = set(df.columns.to_list())
Finally take the set difference and assign back as a list:
cols = list(set.difference(allCols, col))
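One caveat: sets are unordered, so the list produced by a set difference may not follow the original column order, and the joined string can come out in a different order than the columns appear in the frame. If order matters, a list comprehension is one possible alternative. The sketch below uses a small made-up frame, and the name `exclude` is an assumption for illustration only:

```python
import pandas as pd

# Hypothetical frame standing in for the real data.
df = pd.DataFrame({
    'SourceFile': ['a.jpg', 'b.jpg'],
    'Artist': ['Ann', 'Bob'],
    'Application': ['X', 'Y'],
    'Label': [0, 1],
})

exclude = {'SourceFile', 'Label'}

# Iterating over df.columns keeps the original left-to-right order.
cols = [c for c in df.columns if c not in exclude]
print(cols)  # ['Artist', 'Application']
```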
Now you can use the aggregate method, row by row:
df[cols].astype(str).agg('_'.join, axis=1)
See the sample execution:
df
     0    1    2    3    4    5     6     7     8     9
0  0.0  1.0  2.0  3.0  4.0  5.0   6.0   7.0   8.0   9.0
1  1.0  2.0  3.0  4.0  5.0  6.0   7.0   8.0   9.0  10.0
2  2.0  3.0  4.0  5.0  6.0  7.0   8.0   9.0  10.0  11.0
3  3.0  4.0  5.0  6.0  7.0  8.0   9.0  10.0  11.0  12.0
4  4.0  5.0  6.0  7.0  8.0  9.0  10.0  11.0  12.0  13.0

col = set([0])
allCols = set(df.columns.to_list())

col = list(set.difference(allCols, col))
df[col].astype(str).agg('_'.join, axis=1)
0    1.0_2.0_3.0_4.0_5.0_6.0_7.0_8.0_9.0
1    2.0_3.0_4.0_5.0_6.0_7.0_8.0_9.0_10.0
2    3.0_4.0_5.0_6.0_7.0_8.0_9.0_10.0_11.0
3    4.0_5.0_6.0_7.0_8.0_9.0_10.0_11.0_12.0
4    5.0_6.0_7.0_8.0_9.0_10.0_11.0_12.0_13.0
dtype: object
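To tie this back to the question, here is a minimal end-to-end sketch that writes the result into a ‘FileDescription’ column. The frame and its values are made up for illustration, and it uses `DataFrame.drop` instead of the set difference, which is a different route to the same column selection:

```python
import pandas as pd

# Hypothetical data; only the column names 'SourceFile' and 'Label'
# come from the question.
df = pd.DataFrame({
    'SourceFile': ['a.jpg', 'b.jpg'],
    'Artist': ['Ann', 'Bob'],
    'Application': ['X', 'Y'],
    'Label': [0, 1],
})

# drop() returns a new frame without the excluded columns, in original order.
parts = df.drop(columns=['SourceFile', 'Label']).astype(str)

# Join the remaining values of each row with an underscore.
df['FileDescription'] = parts.agg('_'.join, axis=1)

print(df['FileDescription'].tolist())  # ['Ann_X', 'Bob_Y']
```

Because `drop` works on a copy, the original ‘SourceFile’ and ‘Label’ columns stay in `df` untouched.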