Skip to content
Advertisement

Pandas: Split and/or update columns, based on inconsistent data?

So I have a column that contains baseball team names, and I want to split it into the 2 new columns, that will contain separately city name and team name.

Team
New York Giants
Atlanta Braves
Chicago Cubs
Chicago White Sox

I would like to get something like this:

Team City Franchise
New York Giants New York Giants
Atlanta Braves Atlanta Braves
Chicago Cubs Chicago Cubs
Chicago White Sox Chicago White Sox

What I have tried so far?

  • using split and rsplit –> it gets the job done, but can’t unify it.
  • did the count df['cnt'] = df.asc.apply(lambda x: len(str(x).split(' '))) to get number of strings, so I know what kind of cases I have

There are 3 different cases:

  1. Standard one (e.g. Atlanta Braves)
  2. City with 2 strings (e.g. New York Giants)
  3. Franchise name with 2 strings (e.g. Chicago White Sox )

What I would like to do?

  • Split based on conditions (if cnt=2 then split on 1st occurence). Can’t find syntax for this, how this would go?
  • Update based on names (e.g. if ['Col_name'].str.contains("York" or "Angeles") then split on 2nd occurence . Also, can’t find working syntax, example for this?

What would be a good approach to solve this?

Thanks!

Advertisement

Answer

Use:

#part of cities with space
cities = ['York','Angeles']

#test rows
m = df['Team'].str.contains('|'.join(cities))

#first split by first space to 2 new columns
df[['City','Franchise']] = df['Team'].str.split(n=1, expand=True)
#split by second space only filtered rows
s = df.loc[m, 'Team'].str.split(n=2)
 
#update values
df.update(pd.concat([s.str[:2].str.join(' '), s.str[2]], axis=1, ignore_index=True).set_axis(['City','Franchise'], axis=1))
print (df)
                Team      City  Franchise
0    New York Giants  New York     Giants
1     Atlanta Braves   Atlanta     Braves
2       Chicago Cubs   Chicago       Cubs
3  Chicago White Sox   Chicago  White Sox
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement