Tag: dataframe

I’m trying to create a table from text

I want to create a table with two columns separated by “:”, with the capitalized words as the first column and everything after the “:” as the second column. I originally tried to do this from a PDF, but that wasn’t working, so I copied it to a text file thinking it might be easi…
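
A minimal sketch of one way to approach this, assuming each line of the text file holds one record and that the first “:” on the line is the separator. The sample text and the column names `field` and `value` are hypothetical, since the original file is not shown.

```python
import pandas as pd

# Hypothetical sample text; the real file's contents are not shown in the post.
raw_text = """NAME: Jane Doe
ADDRESS: 12 Main St: Apt 4
NOTES: none"""

rows = []
for line in raw_text.splitlines():
    # partition splits on the first ":" only, so later colons stay in the value.
    key, sep, value = line.partition(":")
    if not sep:
        continue  # skip lines that contain no ":" at all
    rows.append({"field": key.strip(), "value": value.strip()})

# Two-column table: text before the first ":" and everything after it.
table = pd.DataFrame(rows, columns=["field", "value"])
print(table)
```

Splitting on the first colon only is a design choice here; it keeps values such as addresses or times intact when they contain further colons.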

How to delete duplicates pandas

I need to check whether there are duplicate values in one column of a dataframe using Pandas and, if there are any duplicates, delete the entire row. I need to check just the first column. Example: What I need is: I can delete the ‘object’ duplicates with the following code, but I can’t delete…
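
A hedged sketch of how this might look with `DataFrame.drop_duplicates`, checking only the first column. The sample data and the column names `name` and `value` are hypothetical, and since the excerpt does not say whether the first occurrence should survive, both readings are shown.

```python
import pandas as pd

# Hypothetical data; the post's own example is not shown.
df = pd.DataFrame({"name": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]})

first_col = df.columns[0]  # only the first column is checked for duplicates

# Reading 1: keep the first occurrence of each value and drop later repeats.
deduped = df.drop_duplicates(subset=first_col, keep="first")

# Reading 2: drop every row whose first-column value appears more than once.
no_repeats = df.drop_duplicates(subset=first_col, keep=False)

print(deduped)
print(no_repeats)
```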

PySpark Incremental Count on Condition

Given a Spark dataframe with the following columns, I am trying to construct an incremental/running count for each id based on when the contents of the event column evaluate to True. Here a new column called results would be created that contains the incremental count. I’ve tried using window functions …