I am new to coding and recently started learning. I am currently stuck trying to split a column. Please help.
I have this dataframe
import pandas as pd

data = ['TOOK22JAN1515100HG', 'BOOK22FEB1643200GH', 'TOOK22MAR1742200HG']
df = pd.DataFrame(data)
and I want to split it into
0  TOOK22JAN1515100HG  TOOK  22-01-15  15100  HG
1  BOOK22FEB1643200GH  BOOK  22-02-16  43200  GH
2  TOOK22MAR1742200HG  TOOK  22-03-17  42200  HG
I really appreciate you taking the time to answer my question.
PS: This is just an example of an option symbol, which is a combination of index + date + strike + type (stock market).
Answer
Use str.extract with named capture groups to split your string into columns:
# Named groups become column names: 4 letters, 7 date characters, digits, 2 letters
pattern = r'(?P<id>[A-Z]{4})(?P<date>\w{7})(?P<val>\d+)(?P<misc>[A-Z]{2})'
df = df.join(df[0].str.extract(pattern))
df['date'] = pd.to_datetime(df['date'])
df['val'] = df['val'].astype(int)
print(df)

# Output
                    0    id       date    val misc
0  TOOK22JAN1515100HG  TOOK 2015-01-22  15100   HG
1  BOOK22FEB1643200GH  BOOK 2016-02-22  43200   GH
2  TOOK22MAR1742200HG  TOOK 2017-03-22  42200   HG
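The desired output in the question (22-01-15) suggests the date field may be meant as year, month, day rather than day, month, year. If that is the case, passing an explicit format to pd.to_datetime avoids relying on inference. The following is a minimal sketch under that assumption; the stricter date group, the `parts` and `result` variables, and the `date_str` column are illustrative names that do not appear in the original answer, and `%b` assumes English month abbreviations.

```python
import pandas as pd

data = ['TOOK22JAN1515100HG', 'BOOK22FEB1643200GH', 'TOOK22MAR1742200HG']
df = pd.DataFrame(data)

# Same idea as the answer, but the date group is spelled out as
# two digits, three letters, two digits (an assumption about the symbol layout).
pattern = (
    r'(?P<id>[A-Z]{4})'
    r'(?P<date>\d{2}[A-Z]{3}\d{2})'
    r'(?P<val>\d+)'
    r'(?P<misc>[A-Z]{2})'
)
parts = df[0].str.extract(pattern)

# Assumption: the field is yy + month abbreviation + dd, e.g. 22JAN15 = 2022-01-15.
parts['date'] = pd.to_datetime(parts['date'], format='%y%b%d')
parts['val'] = parts['val'].astype(int)

# Hypothetical extra column holding the yy-mm-dd text form shown in the question.
parts['date_str'] = parts['date'].dt.strftime('%y-%m-%d')

result = df.join(parts)
print(result)
```

If the field is actually day, month, year, swapping the format to '%d%b%y' should give the dates shown in the answer's output.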