Tag: pandas

Performance tuning: string wordcount in df

I have a DataFrame with a column named “free text”. I want to count how many characters and words each cell has. Currently, I do it like this: The problem is that it is pretty slow. I thought about using np.where, but I wasn’t sure how. I would appreciate your help here. Answer IIUC: you can try via str.len() an…
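
The answer excerpt is cut off after str.len(), so the following is only a minimal sketch of one vectorized approach, not necessarily the answer's full method. The sample rows are hypothetical, and using str.split() for the word count is an assumption that words are separated by whitespace.

```python
import pandas as pd

# Hypothetical sample data; only the column name "free text" comes from the question.
df = pd.DataFrame({"free text": ["hello world", "pandas is fast", ""]})

# Character count per cell, computed with the vectorized string accessor.
df["char_count"] = df["free text"].str.len()

# Word count per cell: split on whitespace, then take the length of each resulting list.
# (Assumption: a "word" is any whitespace-separated token.)
df["word_count"] = df["free text"].str.split().str.len()
```

Both columns avoid a Python-level loop over rows, which is usually the source of the slowdown with apply-style code.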

Group by Issue with Years Pandas

I’m following the answer for this StackOverflow post to group a column of years by decades to make it easier for me to visualize later, but I’m not getting the same results. It seems like when DSM did it, it yielded integers for years, while mine is yielding floats for years. I’ve implemente…
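
The excerpt does not show the data, so the cause of the float years is unknown. One common cause, assumed here, is that missing values force the year column to a float dtype. A minimal sketch with hypothetical data that drops the missing values, casts to integers, and then groups by decade using floor division:

```python
import pandas as pd

# Hypothetical data: the None forces the "year" column to float64.
df = pd.DataFrame({"year": [1987.0, 1992.0, 1995.0, 2003.0, None]})

# Drop missing values, then cast to int so the decade labels are integers.
years = df["year"].dropna().astype(int)

# Floor-divide by 10 and multiply back to get the decade (1987 -> 1980).
decade = (years // 10) * 10

# Count how many years fall into each decade.
counts = years.groupby(decade).count()
```

If the missing rows need to be kept, pandas' nullable integer dtype ("Int64") is an alternative to dropping them.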

stacked chart combine with alluvial plot – python

There is surprisingly little info out there regarding python and the pyalluvial package. I’m hoping to combine stacked bars and a corresponding alluvial in the same figure. Using the data below, I have three unique groups, which are outlined in Group. I want to display the proportion of each Group for each unique Point. I…
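
The question's data is not shown in the excerpt, so the following is only a sketch of the proportion step, using hypothetical rows with the Point and Group columns named in the question. It uses pd.crosstab with normalize="index" and does not cover the pyalluvial part.

```python
import pandas as pd

# Hypothetical data; only the column names "Point" and "Group" come from the question.
df = pd.DataFrame({
    "Point": [1, 1, 1, 1, 2, 2, 2, 2],
    "Group": ["A", "A", "B", "C", "A", "B", "B", "B"],
})

# Rows are Points, columns are Groups; normalize="index" makes each row sum to 1,
# so every cell is the share of that Group within that Point.
proportions = pd.crosstab(df["Point"], df["Group"], normalize="index")
```

A table in this shape can be passed to proportions.plot(kind="bar", stacked=True) to draw the stacked bars, assuming matplotlib is installed.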

How to sum a sequence in pandas?

I need to write some code in Python and I can’t get this to work: I need to get something like this as the result: For me, the sequence matters most in my analysis. It’s a sum of the results of interviews. Thanks, guys, for the help! Answer Here is another approach using reindex and unstack: