Skip to content
Advertisement

Taking the 1st and 2nd, 4th and 5th etc rows from a single Pandas column and put in two new columns, Python

Below is a sample of a pandas dataframe, a single column with 1000’s of rows.

I need second/third columns putting data in rows 1 and 2, 4 and 5 etc in the second/third column

                       0
 0    \path_1file1.csv                                                                                                                                                     
 1    23082022 DS
 2    520i2146002
 3    \path_2file2.csv                                                                                                                                
 4    230822 NS
 5    520i2146002
 4    \path_3file3.csv                                                                                                                                             
 5    24082022 DS
 6    520i2146002

Desired Output

                       0        1            2     
 0    \path_1file1.csv      23082022 DS    520i2146002                                                                                                                                           
 1    \path_2file2.csv      230822 NS      520i2146002                                                                                                                             
 2    \path_3file3.csv      24082022 DS    520i2146002                                                                                                                                    
        

Can only manage to pull out the odds with:

df = pd.DataFrame({0: df[0].iloc[::2].values, 'value': df[0].iloc[1::2].values})

Suggestions?

Advertisement

Answer

Make three subsets by taking every third value – starting at 0, 1, and 2. Reset each subset’s index, and concat them all together.

df = pd.concat([df.loc[::3].reset_index(drop=True),
                df.loc[1::3].reset_index(drop=True), 
                df.loc[2::3].reset_index(drop=True)], axis=1, ignore_index=True)
print(df)

Output:

                    0            1            2
0  \path_1file1.csv  23082022 DS  520i2146002
1  \path_2file2.csv    230822 NS  520i2146002
2  \path_3file3.csv  24082022 DS  520i2146002


Slightly more concise:

df = pd.concat([df.loc[i::3].reset_index(drop=True) for i in range(3)],
               axis=1, ignore_index=True)

User contributions licensed under: CC BY-SA
9 People found this is helpful
Advertisement