I have two dataframes in python:
df1
Column1 | Column2 | Column3 | Column4 | Column5 |
---|---|---|---|---|
1a | 123 | RJ | 12.1 | test1 |
2a | 432 | MT | 23.2 | test3 |
3a | 234 | DR | 34.3 | test5 |
df2
Column1 | Column3 | Column6 |
---|---|---|
1a | RJ | 695 |
2a | MT | 568 |
3a | DR | 232 |
And I wish to append them together and save them as CSV, separated by pipe.
But when I simply append, there are lots of columns with nulls:
df3 = df1.append(df2, ignore_index=True)
Column1 | Column2 | Column3 | Column4 | Column5 | Column6 |
---|---|---|---|---|---|
1a | 123 | RJ | 12.1 | test1 | |
2a | 432 | MT | 23.2 | test3 | |
3a | 234 | DR | 34.3 | test5 | |
1a | RJ | 695 | |||
2a | MT | 568 | |||
3a | DR | 232 |
Then when write into CSV give me this result:
df3.to_csv('df3.csv', sep='|', index=False) #I will also remove header ',header=False'
Column1 | Column2 | Column3 | Column4 | Column5|Column6|
1a| 123|RJ| 12.1| test1||
2a| 432|MT| 23.2| test3||
3a|234|DR| 34.3|test5||
1a| |RJ|| |695|
2a| |MT|| |568|
3a| |DR|| |232 |
But what I need as results is this output, ignoring nulls (don’t worry about header):
Column1 | Column2 | Column3 | Column4 | Column5|Column6|
1a|123|RJ|12.1|test1
2a|432|MT|23.2|test3
3a|234|DR|34.3|test5
1a|RJ|695
2a|MT|568
3a|DR|232
Any Ideas? Thanks in advance.
I’ve tried this code below but still got ‘NaN’
pipe_delim_rows = df3.apply(lambda x: '|'.join([str(v) for v in x.values if v not in (np.nan, '', None)]), axis=1) with open('file.txt', 'w') as f: for item in pipe_delim_rows: f.write(item + 'n')
Advertisement
Answer
Unless I’m misunderstanding, you just want df1 on top of df2.
If that’s the case, just write them both out to the same file. No need to create a third DataFrame if you don’t care about headers or column spacing.
import pandas as pd df1 = pd.DataFrame({'Column1': {0: '1a', 1: '2a', 2: '3a'}, 'Column2': {0: 123, 1: 432, 2: 234}, 'Column3': {0: 'RJ', 1: 'MT', 2: 'DR'}, 'Column4': {0: 12.1, 1: 23.2, 2: 34.3}, 'Column5': {0: 'test1', 1: 'test3', 2: 'test5'}}) df2 = pd.DataFrame({'Column1': {0: '1a', 1: '2a', 2: '3a'}, 'Column3': {0: 'RJ', 1: 'MT', 2: 'DR'}, 'Column6': {0: 695, 1: 568, 2: 232}}) with open('file.txt', 'w', newline='') as f: df1.to_csv(f, sep='|', index=False, header=False) df2.to_csv(f, sep='|', index=False, header=False)
File.txt
1a|123|RJ|12.1|test1 2a|432|MT|23.2|test3 3a|234|DR|34.3|test5 1a|RJ|695 2a|MT|568 3a|DR|232