How to compare each date in a cell with all the dates in a column

Question

I have a dataframe with three columns. I want to compare each date in the Date column with all the other dates in that column, and keep only the rows whose date lies within 6 months of at least one of the other dates. I have tried a couple of approaches, such as nested loops, but I got

Accepted Answer

IIUC, this should work for you:

    import pandas as pd
    import itertools
    from io import StringIO

    # Sample data; the rows are joined with explicit newlines
    data = StringIO(
        "Name;Address;Date\n"
        "faraz;xyz;2022-01-01\n"
        "Abdul;abc;2022-06-06\n"
        "Zara;qrs;2021-02-25\n"
    )
    df = pd.read_csv(data, sep=';', parse_dates=['Date'])

    # Every unordered pair of dates, with the later date first
    df_date = pd.DataFrame(
        [sorted(l, reverse=True) for l in itertools.combinations(df['Date'], 2)],
        columns=['Date1', 'Date2'],
    )
    df_date['diff'] = (df_date['Date1'] - df_date['Date2']).dt.days

    # Keep rows whose date appears in any pair at most 180 days apart
    df[df.Date.isin(df_date[df_date['diff'] <= 180].iloc[:, :-1].to_numpy().ravel())]

Output:

        Name Address       Date
    0  faraz     xyz 2022-01-01
    1  Abdul     abc 2022-06-06
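Building every pair with itertools.combinations grows quadratically with the number of rows. As a possible alternative for larger frames, here is a minimal sketch that sorts the dates and only compares each one with its immediate neighbours, since the nearest other date is always adjacent in sorted order. The function name keep_rows_near_another_date and its parameters are hypothetical, and "6 months" is approximated as 180 days, as in the answer above.

```python
import numpy as np
import pandas as pd


def keep_rows_near_another_date(df, date_col="Date", max_days=180):
    """Keep rows whose date is within max_days of at least one other row's date.

    Hypothetical helper: sorts the dates once and checks only adjacent gaps.
    """
    dates = df[date_col].to_numpy()
    order = np.argsort(dates, kind="stable")   # positions that sort the dates
    gaps = np.diff(dates[order])               # gap between sorted neighbours
    close = gaps <= np.timedelta64(max_days, "D")

    near_sorted = np.zeros(len(dates), dtype=bool)
    near_sorted[:-1] |= close   # close enough to the next date
    near_sorted[1:] |= close    # close enough to the previous date

    # Map the flags from sorted order back to the original row order
    keep = np.empty(len(dates), dtype=bool)
    keep[order] = near_sorted
    return df[keep]


# Same sample data as in the answer
df = pd.DataFrame(
    {
        "Name": ["faraz", "Abdul", "Zara"],
        "Address": ["xyz", "abc", "qrs"],
        "Date": pd.to_datetime(["2022-01-01", "2022-06-06", "2021-02-25"]),
    }
)
result = keep_rows_near_another_date(df)
print(result)  # keeps the faraz and Abdul rows (156 days apart)
```

The mask is applied by position, so this sketch does not depend on the frame having a unique or sorted index.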

