I have an text file as below format. I want to read through all the records in the file and output in a dataframe.
JavaScript
x
10
10
1
NEW ACCOUNT ABC COMPANY 00123
2
3
CCY/BALANCE USD 3,600
4
5
ACCOUNT APPROVAL ABC COMPANY 00123
6
7
NEW ACCOUNT BBC COMPANY 00124
8
9
CCY/BALANCE USD 5,600
10
Expected output:
JavaScript
1
5
1
TRAN DESCRIPTION CUSTOMER NAME A/C NO. CCY BALANCE
2
NEW ACCOUNT ABC COMPANY 00123 USD 3,600.00
3
ACCOUNT APPROVAL ABC COMPANY 00123
4
NEW ACCOUNT BBC COMPANY 00124 USD 5,600.00
5
There will be two types of trans description. Code I am trying as below, but it only works for one line of the text file. How can I modify to read through all the records in the files? Thanks!
JavaScript
1
9
1
text = ‘NEW ACCOUNT ABC COMPANY 00123’
2
sep = ' '
3
lst = text.split(sep)
4
while(' ' in lst) :
5
lst.remove(' ')
6
lst = np.array(lst).reshape(1,3)
7
8
df = pd.DataFrame(lst,columns =['TRAN DESCRIPTION', 'CUSTOMER NAME', 'A/C NO.'])
9
Advertisement
Answer
Try this :
JavaScript
1
23
23
1
import pandas as pd
2
import numpy as np
3
from io import StringIO
4
5
t = """NEW ACCOUNT ABC COMPANY 00123
6
7
CCY/BALANCE USD 3,600
8
9
ACCOUNT APPROVAL ABC COMPANY 00123
10
11
NEW ACCOUNT BBC COMPANY 00124
12
13
CCY/BALANCE USD 5,600"""
14
15
names=['TRAN DESCRIPTION', 'CUSTOMER NAME', 'A/C NO.']
16
df = pd.read_fwf(StringIO(t), header=None, names=names)
17
# or df = pd.read_fwf(r'path_to_your_textfile.txt', header=None, names=names)
18
19
df['CCY/BALANCE'] = np.where(df['CUSTOMER NAME'] == 'CCY/BALANCE', df['A/C NO.'], np.nan)
20
df['CCY/BALANCE'] = df['CCY/BALANCE'].shift(-1)
21
22
out = df[df['TRAN DESCRIPTION'].notna()].reset_index(drop=True)
23