How do I split a column into two in Python based on the data in it?

For instance, the column I want to split is duration. It has data points like 110 or 2 Seasons. I want to make a different column for the seasons values, and in place of those values my current column should say null, since that would change the column's type from string to int.

I tried the split function, but that is for splitting within a single data point, not for separating different kinds of data points into their own columns.



I have tried to replicate a portion of your dataframe in order to provide the solution below. Note that it will also change the NaN values to ‘Null’, as requested.

Creating the sample dataframe off of your screenshot:

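import numpy as np
import pandas as pd

# np.nan is used here because the np.NaN alias was removed in NumPy 2.0
movies_dic = {'release_year': [2021, 2020, 2021, 2021, 2021, 1940, 2018, 2008, 2021],
              'duration': [np.nan, 94, 108, 97, 104, 60, '4 Seasons', 90, '1 Season']}
stack_df = pd.DataFrame(movies_dic)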
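Before splitting anything, it can help to confirm what the column actually holds. The following is a small sketch that rebuilds the same sample data and counts the Python types in ‘duration’; the variable name type_counts is only illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
stack_df = pd.DataFrame({
    'release_year': [2021, 2020, 2021, 2021, 2021, 1940, 2018, 2008, 2021],
    'duration': [np.nan, 94, 108, 97, 104, 60, '4 Seasons', 90, '1 Season'],
})

# A column that mixes numbers and strings is stored with object dtype
print(stack_df['duration'].dtype)

# Count how many values of each Python type the column contains
# (the missing value np.nan is a float)
type_counts = stack_df['duration'].map(lambda v: type(v).__name__).value_counts()
print(type_counts)
```

On this sample the count should show six int values, two str values and one float (the missing value).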

The issue is likely that the ‘duration’ column has object dtype, meaning it contains both string and integer values. I have made two small functions that check the type of each value and allocate it to the appropriate column. The first takes all the string rows and places them in the ‘series_duration’ column:

def series(x):
    # Keep string values such as '4 Seasons'; everything else becomes 'Null'
    if isinstance(x, str):
        return x
    return 'Null'

Then the movies function keeps the integer values (i.e. those without the word ‘Season’ in them) as is:

def movies(x):
    # Keep integer durations; strings and NaN become 'Null'
    if isinstance(x, int):
        return x
    return 'Null'

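# Fill the new column first, while 'duration' still holds the season strings
stack_df['series_duration'] = stack_df['duration'].apply(series)

# Then overwrite 'duration', keeping only the integer values
stack_df['duration'] = stack_df['duration'].apply(movies)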

   release_year duration series_duration
0          2021     Null            Null
1          2020       94            Null
2          2021      108            Null
3          2021       97            Null
4          2021      104            Null
5          1940       60            Null
6          2018     Null       4 Seasons
7          2008       90            Null
8          2021     Null        1 Season
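Because ‘Null’ is itself a string, the ‘duration’ column in the result above still has object dtype rather than an integer one. If the goal is a genuinely numeric column, one possible alternative (a sketch that swaps the two helper functions for pd.to_numeric and the nullable Int64 dtype, not the approach used above) is to leave the gaps as missing values instead of the text ‘Null’. The names df and is_text are only illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as above, rebuilt so this snippet runs on its own
df = pd.DataFrame({
    'release_year': [2021, 2020, 2021, 2021, 2021, 1940, 2018, 2008, 2021],
    'duration': [np.nan, 94, 108, 97, 104, 60, '4 Seasons', 90, '1 Season'],
})

# True for rows whose duration is text such as '4 Seasons'
is_text = df['duration'].map(lambda v: isinstance(v, str))

# Copy only the text rows into the new column; other rows become NaN
df['series_duration'] = df['duration'].where(is_text)

# Values that cannot be parsed as numbers become NaN, then the column is
# cast to the nullable integer dtype so missing rows show as <NA>
df['duration'] = pd.to_numeric(df['duration'], errors='coerce').astype('Int64')

print(df)
print(df.dtypes)
```

With this variant the missing entries are real missing values, so calls such as df['duration'].mean() work directly. The trade-off is that the cells display as <NA> or NaN rather than the literal word ‘Null’.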
User contributions licensed under: CC BY-SA