df:
| country | year | index |
|---|---|---|
| Turkiye | 1992 | NaN |
| Spain | 1992 | NaN |
| US | 1992 | 1 |
| Turkiye | 1993 | 1 |
| Spain | 1993 | 1 |
| US | 1993 | 0 |
| Turkiye | 1994 | 1 |
| France | 1994 | 0 |
| Italy | 1994 | NaN |
| Turkiye | 1995 | 0 |
Here, for example, in 1992 the index is NaN for Turkiye and Spain but exists for the US. I am only interested in the earliest year for which the index exists; the country does not matter.
My code is:
a = np.where(df["Index"] != None)
a["year"].min()
a is not a DataFrame, and I think that is why I am having a problem. How can I solve this issue?
Answer
Use .loc with .idxmin after .dropna:

df.loc[df.dropna()['year'].idxmin()]

country      US
year       1992
index       1.0
Name: 2, dtype: object
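For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the table from the question by hand and assumes the column is named `index` (lowercase, as in the table, not `Index` as in the question's code). The variable names `valid`, `earliest_year` and `earliest_row` are only illustrative. It also restricts `dropna` to the `index` column via `subset`, which is an optional tweak so that NaNs in other columns would not drop rows.

```python
import numpy as np
import pandas as pd

# Rebuild the example table from the question.
df = pd.DataFrame({
    "country": ["Turkiye", "Spain", "US", "Turkiye", "Spain",
                "US", "Turkiye", "France", "Italy", "Turkiye"],
    "year": [1992, 1992, 1992, 1993, 1993, 1993, 1994, 1994, 1994, 1995],
    "index": [np.nan, np.nan, 1, 1, 1, 0, 1, 0, np.nan, 0],
})

# Keep only the rows where the index is present.
valid = df.dropna(subset=["index"])

# Earliest year with a non-NaN index (the country does not matter).
earliest_year = valid["year"].min()

# Full row for that year, looked up by its label in the original frame.
earliest_row = df.loc[valid["year"].idxmin()]

print(earliest_year)
print(earliest_row)
```

If only the year is needed, `valid["year"].min()` is enough. `idxmin` is useful when you also want the rest of the row. As for the original attempt, `np.where` called with a single condition returns a tuple of position arrays rather than a DataFrame, which is why `a["year"]` fails.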