Python pandas .str.split regex=True ValueError: Columns must be same length as key

I need help with this pandas split with regex. I’m getting the error ValueError: Columns must be same length as key.

My column of data looks like this:

PURCHASE AUTHORIZED ON 03/30 UOFU BOOKSTORE 1 …
PURCHASE AUTHORIZED ON 03/29 WM SUPERC Wal-Mart Sup …
PURCHASE AUTHORIZED ON 03/29 KFC/AW #526 …
PURCHASE AUTHORIZED ON 03/31 UU VISITOR PARKING …
ATM WITHDRAWAL AUTHORIZED ON 04/03 Main Street …

My code is:

df[['Auth_date', 'Description']] = df['Description'].str.split('(?<=\d{2}/\d{2}).', regex=True)

The desired result would be:

Auth_date                            Description
PURCHASE AUTHORIZED ON 03/30         UOFU BOOKSTORE 1 …
PURCHASE AUTHORIZED ON 03/29         WM SUPERC Wal-Mart Sup …
PURCHASE AUTHORIZED ON 03/29         KFC/AW #526 …
PURCHASE AUTHORIZED ON 03/31         UU VISITOR PARKING …
ATM WITHDRAWAL AUTHORIZED ON 04/03   Main Street …

Answer

Given:

                                         Description
0    PURCHASE AUTHORIZED ON 03/30 UOFU BOOKSTORE 1 …
1  PURCHASE AUTHORIZED ON 03/29 WM SUPERC Wal-Mar...
2         PURCHASE AUTHORIZED ON 03/29 KFC/AW #526 …
3  PURCHASE AUTHORIZED ON 03/31 UU VISITOR PARKING …
4   ATM WITHDRAWAL AUTHORIZED ON 04/03 Main Street …

Doing:

df[['Auth_date', 'Description']] = df['Description'].str.split(r'(?<=\d{2}/\d{2}).', expand=True, regex=True)
print(df)

Output:

                Description                           Auth_date
0        UOFU BOOKSTORE 1 …        PURCHASE AUTHORIZED ON 03/30
1  WM SUPERC Wal-Mart Sup …        PURCHASE AUTHORIZED ON 03/29
2             KFC/AW #526 …        PURCHASE AUTHORIZED ON 03/29
3      UU VISITOR PARKING …        PURCHASE AUTHORIZED ON 03/31
4             Main Street …  ATM WITHDRAWAL AUTHORIZED ON 04/03

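This works fine for me. The difference from the code in the question is expand=True. Without it, .str.split returns a single Series whose values are lists, so there is only one column to assign to the two names on the left, and pandas raises ValueError: Columns must be same length as key. With expand=True the pieces come back as separate DataFrame columns, one per name.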
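
For anyone who wants to reproduce this, here is a minimal sketch. The sample rows are modelled on the question, but the descriptions there are truncated, so the trailing text is shortened here. It also passes n=1, which is my own addition rather than part of the code in the question or the answer, as a guard in case a description happens to contain a second MM/DD date, which would otherwise produce a third column and bring the same error back.

```python
import pandas as pd

# Sample rows modelled on the question; the trailing text is shortened here.
df = pd.DataFrame({
    "Description": [
        "PURCHASE AUTHORIZED ON 03/30 UOFU BOOKSTORE 1",
        "PURCHASE AUTHORIZED ON 03/29 KFC/AW #526",
        "ATM WITHDRAWAL AUTHORIZED ON 04/03 Main Street",
    ]
})

# Split on the single character that follows an MM/DD date.
pattern = r"(?<=\d{2}/\d{2})."

# Without expand=True the result is one Series whose values are lists,
# which cannot be spread over two columns.
as_lists = df["Description"].str.split(pattern, regex=True)

# expand=True returns a DataFrame with one column per piece;
# n=1 caps the number of splits, so there are at most two columns.
parts = df["Description"].str.split(pattern, n=1, expand=True, regex=True)
df[["Auth_date", "Description"]] = parts
print(df)
```

If some rows might not contain a date at all, the second column will hold missing values for those rows, so it may be worth checking parts for nulls before assigning.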

User contributions licensed under: CC BY-SA