How to split a column based on the index of the string in the column, using an efficient method to parse the whole DataFrame

Question

I have a column filled with string values:

col_1
10500
25020
35640
45440
50454
62150
75410

I want to create two other columns with string values that have been split from the first one. I also want an efficient way to do that.

Expected result:

col_1	col_2	col_3
10500	10	500
25020	25	020
35640	35

Accepted Answer

Use the str accessor with str.extract; the named groups become the new column names:

    df = df.join(df['col_1'].astype(str).str.extract(r'(?P<col_2>\d{2})(?P<col_3>\d{3})'))
    print(df)

    # Output:
       col_1 col_2 col_3
    0  10500    10   500
    1  25020    25   020
    2  35640    35   640
    3  45440    45   440
    4  50454    50   454
    5  62150    62   150
    6  75410    75   410

Or simply, in a few steps:

    df['col_1'] = df['col_1'].astype(str)
    df['col_2'] = df['col_1'].str[:2]
    df['col_3'] = df['col_1'].str[2:]
    print(df)

    # Output:
       col_1 col_2 col_3
    0  10500    10   500
    1  25020    25   020
    2  35640    35   640
    3  45440    45   440
    4  50454    50   454
    5  62150    62   150
    6  75410    75   410

Another example, joining the two parts with a separator:

    df['col_1'] = df['col_1'].astype(str)
    df['col_4'] = df['col_1'].str[:2] + '-' + df['col_1'].str[2:]
    print(df)

    # Output:
       col_1   col_4
    0  10500  10-500
    1  25020  25-020
    2  35640  35-640
    3  45440  45-440
    4  50454  50-454
    5  62150  62-150
    6  75410  75-410
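For anyone who wants to try this end to end, here is a minimal self-contained sketch that builds the sample data and runs both approaches side by side. It assumes pandas is installed and that col_1 starts out as integers (the question does not show the dtype); the names by_regex, by_slice and result are only illustrative.

```python
import pandas as pd

# Sample data from the question. Storing it as integers is an assumption;
# astype(str) below makes the rest work for string input as well.
df = pd.DataFrame({"col_1": [10500, 25020, 35640, 45440, 50454, 62150, 75410]})

s = df["col_1"].astype(str)

# Approach 1: regex with named groups; the group names become column names.
by_regex = s.str.extract(r"(?P<col_2>\d{2})(?P<col_3>\d{3})")

# Approach 2: positional slicing through the str accessor.
by_slice = pd.DataFrame({"col_2": s.str[:2], "col_3": s.str[2:]})

# Attach the new columns to the original frame (index-aligned join).
result = df.join(by_slice)
print(result)

# Check that both approaches give the same values.
print(by_regex["col_2"].tolist() == by_slice["col_2"].tolist())
print(by_regex["col_3"].tolist() == by_slice["col_3"].tolist())
```

The slicing version only looks at positions, so it fits when every value has the same layout. The regex version also validates the shape: rows that do not match the pattern come back as NaN instead of being silently cut.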

col_1	col_2	col_3
10500	10	500
25020	25	020
35640	35	640
45440	45	440
50454	50	454
62150	62	150
75410	75	410
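Note that the table above keeps the leading zero in col_3 (020), which only works while the new columns stay strings. A small sketch of the difference, assuming pandas is installed and using just the first three rows (as_text and as_int are illustrative names):

```python
import pandas as pd

df = pd.DataFrame({"col_1": ["10500", "25020", "35640"]})
df["col_2"] = df["col_1"].str[:2]
df["col_3"] = df["col_1"].str[2:]

# Kept as strings, the leading zero of "020" survives.
as_text = df["col_3"].tolist()

# Converted to integers, "020" becomes 20 and the zero is lost.
as_int = df["col_3"].astype(int).tolist()

print(as_text)  # ['500', '020', '640']
print(as_int)   # [500, 20, 640]
```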

Answer