Looking for the difference between occurrences in a dataframe

Question

I have a dataframe like this (the real one has 7 million records and 345 features). The following image shows only a small fraction of it, indicating whether a client made an operation in a given month. What I want to do is create a column at the end with the mean difference between consecutive operations. For example in the first record

Accepted Answer

numpy.flatnonzero: Identify where the non-zero values are.
numpy.diff: Find the difference between adjacent values. When passed results from flatnonzero, it finds the differences between positions.
numpy.mean: Find the average of values.

Produce a new column 'MD' with the average positional distance between non-zero values:

    df.assign(MD=[np.diff(np.flatnonzero(a)).mean() for a in df.to_numpy()])
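
To see the approach end to end, here is a minimal sketch on a small made-up frame. The column names (m1 to m6), the sample values, and the mean_gap helper are assumptions for illustration and are not from the question. The helper also returns NaN for rows with fewer than two non-zero entries. That guard is an addition to the one-liner, which would otherwise take the mean of an empty array and emit a warning.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row per client, one column per month,
# 1 if the client made an operation that month, 0 otherwise.
df = pd.DataFrame(
    [[1, 0, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 0],
     [0, 0, 1, 0, 0, 0]],
    columns=["m1", "m2", "m3", "m4", "m5", "m6"],
)

def mean_gap(row):
    """Average positional distance between non-zero entries of one row."""
    positions = np.flatnonzero(row)   # indices of the non-zero months
    if positions.size < 2:            # no gap can be measured (added guard)
        return np.nan
    return np.diff(positions).mean()  # mean of the gaps between positions

# Same pattern as the one-liner: iterate over the rows as NumPy arrays.
result = df.assign(MD=[mean_gap(a) for a in df.to_numpy()])
print(result)
```

In the first row the non-zero positions are 0, 3 and 5, so the gaps are 3 and 2 and MD is 2.5. The third row has a single operation, so MD is NaN. If the real frame has non-month columns, select only the month columns before calling to_numpy().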

Answer