I tried manipulating a DataFrame and the output was (unexpectedly) a pandas.core.series.Series, while I was aiming for another DataFrame.
For reference, the original Dataframe looked like this –
       Character     Line
    0  Leslie Knope  Hello.
    1  Leslie Knope  Hi.
    2  Leslie Knope  My name is Leslie Knope, and I work for the Pa...
    3  Leslie Knope  Can I ask you a few questions?
    4  Leslie Knope  Would you say that you are, "Enjoying yourself...
    5  Leslie Knope  I'm gonna put a lot of fun.
    6  Child         Ms. Knope, there's a drunk stuck in the slide.
    7  Leslie Knope  Sir, this is a children's slide.
    8  Leslie Knope  You're not allowed to sleep in here.
    9  Extra         What is?
I was hoping to combine all consecutive rows with the same Character value. So, all ‘Leslie Knope’ lines from “Hello” to “I’m gonna put a lot of fun” would be rolled into one row, the “Child” line would stay as is and then the next two “Leslie Knope” lines would be rolled into one.
This is the code I used to achieve that (to an extent):
    df['key'] = (df['Character'] != df['Character'].shift(1)).astype(int).cumsum()
    print(df.head(5))
    df2 = df.groupby(['key', 'Character'])['Line'].apply(' '.join)
This is the df2 output –
    key  Character
    1    Leslie Knope    Hello. Hi. My name is Leslie Knope, and I work...
    2    Child           Ms. Knope, there's a drunk stuck in the slide.
    3    Leslie Knope    Sir, this is a children's slide. You're not al...
    4    Extra           What is?
    5    Leslie Knope    You know, when I first tell people that I work...
    210  Ann Perkins     I'm really fired up. You know they say that de...
    211  Leslie Knope    Soul sista, soul sista Gonna get your phone, s...
    212  Ann Perkins     Yeah.
    213  Leslie Knope    Sweet Lady Marmalard
    214  Ron Swanson     I've created this office as a symbol of how I ...
    Name: Line, Length: 214, dtype: object
I was hoping to get df2 as another DataFrame that collapses the consecutive lines spoken by each character, with the combined text in a Lines column. I'm not really sure what’s happening here, since df2 is a pandas.core.series.Series, so I would appreciate help with either of the following –
- An alternative approach to collapsing the consecutive lines spoken by the character
- A way to convert df2 to a DataFrame with Key, Character, and Lines columns.
Thanks in advance!
Answer
All you need to do is chain .reset_index() onto your last line.
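It became a Series because you selected a single column (Line) from the groupby before applying your function. The columns you grouped by (key and Character) then became the index of that Series, and .reset_index() turns them back into ordinary columns, which gives you a DataFrame.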
Edit: to get rid of the 'key' column, just add `.drop('key', axis=1)` after `.reset_index()`.
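Putting the pieces together, here is a minimal sketch of the whole thing. The sample data is a shortened, made-up subset of the rows in the question, and the names `joined` and `df3` as well as the rename of Line to Lines are my own additions for illustration, so adjust them to your real data.

```python
import pandas as pd

# Shortened sample modelled on the question's data (assumption: same column names).
df = pd.DataFrame({
    "Character": ["Leslie Knope", "Leslie Knope", "Child",
                  "Leslie Knope", "Leslie Knope", "Extra"],
    "Line": ["Hello.", "Hi.",
             "Ms. Knope, there's a drunk stuck in the slide.",
             "Sir, this is a children's slide.",
             "You're not allowed to sleep in here.",
             "What is?"],
})

# A new run starts whenever the speaker differs from the previous row;
# cumsum() numbers the runs 1, 2, 3, ...
df["key"] = (df["Character"] != df["Character"].shift(1)).astype(int).cumsum()

# Selecting the single column 'Line' yields a Series indexed by (key, Character).
joined = df.groupby(["key", "Character"])["Line"].apply(" ".join)

# reset_index() moves both index levels back into columns, giving a DataFrame.
# The rename to 'Lines' is optional and only matches the column name asked for.
df2 = joined.reset_index().rename(columns={"Line": "Lines"})

# Optionally drop the helper column once the rows are collapsed.
df3 = df2.drop("key", axis=1)
print(df3)
```

Keeping `key` in the groupby is what stops the two separate Leslie Knope runs from being merged into one row; it is only dropped at the very end.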