
How do I convert pandas.core.series.Series back to a Dataframe following a groupby?

I tried manipulating a Dataframe and the output was (unexpectedly) of a pandas.core.series.Series type while I was aiming for another Dataframe output.

For reference, the original Dataframe looked like this –

      Character                                               Line
0  Leslie Knope                                             Hello.
1  Leslie Knope                                                Hi.
2  Leslie Knope  My name is Leslie Knope, and I work for the Pa...
3  Leslie Knope                     Can I ask you a few questions?
4  Leslie Knope  Would you say that you are, "Enjoying yourself...
5  Leslie Knope                        I'm gonna put a lot of fun.
6         Child     Ms. Knope, there's a drunk stuck in the slide.
7  Leslie Knope                   Sir, this is a children's slide.
8  Leslie Knope               You're not allowed to sleep in here.
9         Extra                                           What is?

I was hoping to combine all consecutive rows with the same Character value. So, all ‘Leslie Knope’ lines from “Hello” to “I’m gonna put a lot of fun” would be rolled into one row, the “Child” line would stay as is and then the next two “Leslie Knope” lines would be rolled into one.

This is the code I used to achieve that (to an extent):

df['key'] = (df['Character'] != df['Character'].shift(1)).astype(int).cumsum()
print(df.head(5))
df2 = df.groupby(['key', 'Character'])['Line'].apply(' '.join)

This is the df2 output –

key  Character   
1    Leslie Knope    Hello. Hi. My name is Leslie Knope, and I work...
2    Child              Ms. Knope, there's a drunk stuck in the slide.
3    Leslie Knope    Sir, this is a children's slide. You're not al...
4    Extra                                                    What is?
5    Leslie Knope    You know, when I first tell people that I work...
                       
210  Ann Perkins     I'm really fired up. You know they say that de...
211  Leslie Knope    Soul sista, soul sista Gonna get your phone, s...
212  Ann Perkins                                                 Yeah.
213  Leslie Knope                                 Sweet Lady Marmalard
214  Ron Swanson     I've created this office as a symbol of how I ...
Name: Line, Length: 214, dtype: object

I was hoping to get df2 as another Dataframe that collapses the consecutive lines spoken by the character as I wanted and have the lines in a Lines column. Not really sure what’s happening here since df2 is of pandas.core.series.Series type, so I would appreciate help with either of the following –

  1. An alternative approach to collapsing the consecutive lines spoken by the character
  2. A way to convert df2 to a Dataframe with the Key, Character, and Lines columns.

Thanks in advance!


Answer

All you need to do is chain .reset_index() to your last line.

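The result is a Series because the groupby selected a single column (Line) before the function was applied. The grouping columns (key and Character) then became the index of that Series, and .reset_index() turns them back into regular columns, which gives you a DataFrame.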

Edit: to get rid of the 'key' column, just add `.drop('key', axis=1)` after `.reset_index()`.
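
For illustration, here is a minimal sketch of the whole flow on a shortened version of the sample data from the question. The six rows and the names `joined`, `df2` and `df3` are assumptions made for this example, not the asker's real data.

```python
import pandas as pd

# Shortened stand-in for the question's data (illustrative rows only).
df = pd.DataFrame({
    "Character": ["Leslie Knope", "Leslie Knope", "Child",
                  "Leslie Knope", "Leslie Knope", "Extra"],
    "Line": ["Hello.", "Hi.",
             "Ms. Knope, there's a drunk stuck in the slide.",
             "Sir, this is a children's slide.",
             "You're not allowed to sleep in here.",
             "What is?"],
})

# A new key starts every time the speaker changes from the previous row.
df["key"] = (df["Character"] != df["Character"].shift(1)).astype(int).cumsum()

# Selecting the single column 'Line' makes the result a Series whose
# index is built from the grouping columns (key, Character).
joined = df.groupby(["key", "Character"])["Line"].apply(" ".join)

# reset_index() moves the index levels back into columns -> DataFrame.
df2 = joined.reset_index()

# Optionally drop the helper column once the grouping is done.
df3 = df2.drop("key", axis=1)

print(type(joined).__name__)  # Series
print(type(df2).__name__)     # DataFrame
print(df3)
```

Keeping `key` in the groupby is what stops the separate Leslie Knope runs from being merged into one row, so drop it only after the join.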

User contributions licensed under: CC BY-SA