Join columns in a single Pandas DataFrame

I have a DataFrame with 4 columns and want to merge the first 3 columns into a single column of a new DataFrame.

The three columns hold the same kind of data, the order is irrelevant, and any duplicates must remain.

import pandas as pd 
   
data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]] 

df = pd.DataFrame(data, columns = ['col1', 'col2', 'col3','col4'])

Desired DataFrame

+-----+-----+
|col_a|col_b|
+-----+-----+
|tom  |10   |
|nick |10   |
|john |10   |
|bob  |15   |
|jane |15   |
|nick |15   |
+-----+-----+

How do I get this done?

Answer

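Here is one way of merging the first three columns with the help of numpy. np.ravel flattens the first three columns row by row, and np.repeat repeats each col4 value three times so that it lines up with the flattened names: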

import numpy as np

a = df.values  # all rows as a 2-D object array
pd.DataFrame({'col_a': np.ravel(a[:, :3]), 'col_b': np.repeat(a[:, 3], 3)})

  col_a col_b
0   tom    10
1  nick    10
2  john    10
3   bob    15
4  jane    15
5  nick    15
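
For comparison, here is a minimal sketch of a pandas-only alternative using DataFrame.melt. It is not part of the original answer. The name `result` is only illustrative, and the rows come out column by column rather than row by row, which should be acceptable since the question says the order is irrelevant.

```python
import pandas as pd

data = [['tom', 'nick', 'john', 10], ['bob', 'jane', 'nick', 15]]
df = pd.DataFrame(data, columns=['col1', 'col2', 'col3', 'col4'])

# melt stacks col1..col3 into one value column and repeats col4 next to each value
result = (
    df.melt(id_vars='col4', value_vars=['col1', 'col2', 'col3'], value_name='col_a')
      .rename(columns={'col4': 'col_b'})   # match the desired column names
      [['col_a', 'col_b']]                 # drop the helper 'variable' column
)
print(result)
```

One difference from the numpy version: melt keeps col_b with its integer dtype, whereas going through df.values on mixed string and integer columns yields object arrays.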
User contributions licensed under: CC BY-SA