Skip to content
Advertisement

python – Append dataframe but remove null column from each table

I have two dataframes in python:

df1

Column1 Column2 Column3 Column4 Column5
1a 123 RJ 12.1 test1
2a 432 MT 23.2 test3
3a 234 DR 34.3 test5

df2

Column1 Column3 Column6
1a RJ 695
2a MT 568
3a DR 232

And I wish to append them together and save them as CSV, separated by pipe.

But when I simply append, there are lots of columns with nulls:

df3 = df1.append(df2, ignore_index=True)
Column1 Column2 Column3 Column4 Column5 Column6
1a 123 RJ 12.1 test1
2a 432 MT 23.2 test3
3a 234 DR 34.3 test5
1a RJ 695
2a MT 568
3a DR 232

Then when write into CSV give me this result:

df3.to_csv('df3.csv', sep='|', index=False) #I will also remove header ',header=False'

Column1 | Column2 | Column3 | Column4 | Column5|Column6|

1a| 123|RJ| 12.1| test1||

2a| 432|MT| 23.2| test3||

3a|234|DR| 34.3|test5||

1a| |RJ|| |695|

2a| |MT|| |568|

3a| |DR|| |232 |

But what I need as results is this output, ignoring nulls (don’t worry about header):

Column1 | Column2 | Column3 | Column4 | Column5|Column6|

1a|123|RJ|12.1|test1

2a|432|MT|23.2|test3

3a|234|DR|34.3|test5

1a|RJ|695

2a|MT|568

3a|DR|232

Any Ideas? Thanks in advance.

I’ve tried this code below but still got ‘NaN’

pipe_delim_rows = df3.apply(lambda x: '|'.join([str(v) for v in x.values if v not in (np.nan, '', None)]), axis=1)

with open('file.txt', 'w') as f:
    for item in pipe_delim_rows:
        f.write(item + 'n')

Advertisement

Answer

Unless I’m misunderstanding, you just want df1 on top of df2.

If that’s the case, just write them both out to the same file. No need to create a third DataFrame if you don’t care about headers or column spacing.

import pandas as pd

df1 = pd.DataFrame({'Column1': {0: '1a', 1: '2a', 2: '3a'},
                    'Column2': {0: 123, 1: 432, 2: 234},
                    'Column3': {0: 'RJ', 1: 'MT', 2: 'DR'},
                    'Column4': {0: 12.1, 1: 23.2, 2: 34.3},
                    'Column5': {0: 'test1', 1: 'test3', 2: 'test5'}})
df2 = pd.DataFrame({'Column1': {0: '1a', 1: '2a', 2: '3a'},
                    'Column3': {0: 'RJ', 1: 'MT', 2: 'DR'},
                    'Column6': {0: 695, 1: 568, 2: 232}})

with open('file.txt', 'w', newline='') as f:
    df1.to_csv(f, sep='|', index=False, header=False)
    df2.to_csv(f, sep='|', index=False, header=False)

File.txt

1a|123|RJ|12.1|test1
2a|432|MT|23.2|test3
3a|234|DR|34.3|test5
1a|RJ|695
2a|MT|568
3a|DR|232
User contributions licensed under: CC BY-SA
10 People found this is helpful
Advertisement