Tag: pandas

Apply strip() to all cells in dataframe with multiple data types

I have a dataframe with multiple data types. As part of my processing, I apply the strip() function before I work on the df. My example df: Here is my code: It doesn’t seem to strip every string, though. I’m still seeing leading and trailing spaces in some of my output cells. Questi…
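
The asker's dataframe and code are not shown in this excerpt, so the following is only a minimal sketch of one way to strip every string cell while leaving other types alone. The helper name `strip_strings` and the sample data are hypothetical. It checks each cell individually with `isinstance`, which matters when an object column mixes strings with numbers or missing values.

```python
import pandas as pd


def strip_strings(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with whitespace stripped from every str cell.

    Hypothetical helper: non-string cells (numbers, None, NaN) pass through
    unchanged, so mixed-type columns are handled cell by cell.
    """
    def _strip(value):
        return value.strip() if isinstance(value, str) else value

    out = df.copy()
    for col in out.columns:
        # Series.map applies the function to each element of the column.
        out[col] = out[col].map(_strip)
    return out


# Made-up sample data, not the asker's dataframe.
df = pd.DataFrame(
    {
        "name": ["  alice ", "bob  ", None],
        "age": [30, 41, 25],
        "mixed": [" x ", 7, 2.5],
    }
)
clean = strip_strings(df)
print(clean)
```

Checking the type per cell, rather than calling `.str.strip()` on a whole column, is a deliberate choice here: in a column that mixes strings and numbers, the `.str` accessor turns the non-string cells into NaN.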

How to drop duplicates in pandas but keep more than the first

Let’s say I have a pandas DataFrame: I want to drop duplicates if they exceed a certain threshold n and replace them with that minimum. Let’s say that n=3. Then, my target dataframe is EDIT: Each set of consecutive repetitions is considered separately. In this example, rows 8 and 9 should be kept.…
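
The example dataframe is not shown in this excerpt, so the sketch below uses made-up data and assumes the goal is to keep at most n rows from each run of consecutive equal values. The function name `cap_consecutive_runs` and the column name `val` are hypothetical. Runs are labelled by comparing each value with the previous one, and rows are then numbered within their run.

```python
import pandas as pd


def cap_consecutive_runs(df: pd.DataFrame, column: str, n: int) -> pd.DataFrame:
    """Keep at most n rows from each run of consecutive equal values.

    Hypothetical helper; each run is treated separately, so a value that
    reappears later starts a fresh run with its own allowance of n rows.
    """
    # A new run starts wherever the value differs from the previous row.
    run_id = (df[column] != df[column].shift()).cumsum()
    # Number the rows inside each run: 0, 1, 2, ...
    position_in_run = df.groupby(run_id).cumcount()
    return df[position_in_run < n]


# Made-up data: a run of four 1s, two 2s, then a run of five 1s.
df = pd.DataFrame({"val": [1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1]})
capped = cap_consecutive_runs(df, "val", n=3)
print(capped)
```

With n=3, the fourth row of the first run and the last two rows of the final run are dropped, while the short run of 2s is left untouched. The original index labels are preserved; call `reset_index(drop=True)` on the result if a fresh index is wanted.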

How to save a trained model

I have written an intent classification program. It is first trained on training data and then evaluated on test data. Training takes a few seconds. What is the best way to save the result of training, so that the model does not have to be retrained on every call? Is it enough to save train_X and train…
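
The asker's classifier is not shown in this excerpt, so the following is only a sketch under stated assumptions. It uses a deliberately tiny, hypothetical word-count model (`train` and `predict` are made-up names) to show the general idea: persist the fitted model state with the standard-library `pickle` module and load it later, rather than saving the training data and retraining. This assumes the real model object can be pickled.

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def train(texts, labels):
    """Hypothetical stand-in for the real training step.

    Returns the fitted state: a dict mapping each label to word counts.
    """
    model = {}
    for text, label in zip(texts, labels):
        model.setdefault(label, Counter()).update(text.lower().split())
    return model


def predict(model, text):
    """Pick the label whose training words overlap most with the text."""
    words = text.lower().split()
    return max(model, key=lambda label: sum(model[label][w] for w in words))


# Made-up training data.
texts = ["turn on the light", "switch the lamp on", "play some music", "play a song"]
labels = ["lights", "lights", "music", "music"]
model = train(texts, labels)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "intent_model.pkl"  # hypothetical file name

    # Save the fitted model once, after training.
    with open(path, "wb") as f:
        pickle.dump(model, f)

    # On later calls, load it instead of training again.
    with open(path, "rb") as f:
        loaded = pickle.load(f)

print(predict(loaded, "play music"))
```

Saving only the training data would still require rerunning the training step on every call; saving the fitted object avoids that. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.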