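I have a DataFrame with 4 columns and want to merge the first 3 columns into a single column of a new DataFrame, keeping the value from the fourth column next to each entry.
The three columns hold the same kind of data, the order is irrelevant, and any duplicates must remain.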
import pandas as pd

data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]]

df = pd.DataFrame(data, columns=['col1', 'col2', 'col3', 'col4'])
Desired DataFrame
+-----+-----+
|col_a|col_b|
+-----+-----+
|tom  |10   |
|nick |10   |
|john |10   |
|bob  |15   |
|jane |15   |
|nick |15   |
+-----+-----+
How do I get this done?
Answer
Here is one way of merging the first three columns with the help of numpy:
import numpy as np

a = df.values  # all rows as a 2-D array
pd.DataFrame({'col_a': np.ravel(a[:, :3]), 'col_b': np.repeat(a[:, 3], 3)})
Output:

  col_a col_b
0   tom    10
1  nick    10
2  john    10
3   bob    15
4  jane    15
5  nick    15
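For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The helper name `stack_columns` and its `n_merge` parameter are made up for illustration and are not part of pandas or numpy. It relies on `ravel` reading the array row by row, so each row contributes `n_merge` consecutive entries, and `np.repeat` repeats that row's last value the same number of times, which keeps the two columns aligned. The `melt` call at the end is a pandas-only alternative that is not part of the answer above; it returns the same pairs in a different order, which should be fine since order is irrelevant here.

```python
import numpy as np
import pandas as pd

data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]]
df = pd.DataFrame(data, columns=['col1', 'col2', 'col3', 'col4'])


def stack_columns(frame, n_merge=3):
    """Hypothetical helper: flatten the first n_merge columns into col_a
    and repeat the next column alongside them as col_b."""
    names = frame.iloc[:, :n_merge].to_numpy()   # shape (rows, n_merge)
    values = frame.iloc[:, n_merge].to_numpy()   # shape (rows,), keeps its own dtype
    return pd.DataFrame({
        # ravel walks the array row by row: row 0's cells, then row 1's, ...
        'col_a': names.ravel(),
        # each row's value is repeated once per merged cell of that row
        'col_b': np.repeat(values, n_merge),
    })


result = stack_columns(df)
print(result)

# Alternative (assumption: pandas-only route, not from the answer above).
# melt stacks column by column, so the row order differs from `result`.
melted = (
    df.melt(id_vars='col4', value_vars=['col1', 'col2', 'col3'], value_name='col_a')
      .rename(columns={'col4': 'col_b'})[['col_a', 'col_b']]
)
print(melted)
```

Slicing the numeric column separately, rather than going through `df.values`, keeps `col_b` as an integer column instead of a generic object column.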