How can I check the value that exists for the earliest date?

df:

country	year	index
Turkiye	1992	NaN
Spain	1992	NaN
US	1992	1
Turkiye	1993	1
Spain	1993	1
US	1993	0
Turkiye	1994	1
France	1994	0
Italy	1994	NaN
Turkiye	1995	0

Here, for example, Turkiye and Spain are NaN in 1992, but the index exists for the US. I am only interested in the earliest year for which the index exists; the country does not matter in this case.

My code is:

a = np.where(df["Index"]!= None)
a["year"].min()

a is not a DataFrame, and I think that is why I am having a problem. How can I solve this issue?

Answer

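Use .loc with .idxmin after .dropna. dropna() removes the rows where index is NaN, idxmin() on the year column returns the label of the earliest remaining row, and .loc fetches that row: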

df.loc[df.dropna()['year'].idxmin()]


country      US
year       1992
index       1.0
Name: 2, dtype: object
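
For reference, here is a minimal, self-contained sketch of the same idea. The DataFrame is rebuilt by hand from the table in the question, and the variable names (valid, earliest_year, earliest_row) are only illustrative. It uses notna() on the index column instead of dropna() on the whole frame, which is an assumption about what you want: only missing values in index should disqualify a row.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "country": ["Turkiye", "Spain", "US", "Turkiye", "Spain", "US",
                "Turkiye", "France", "Italy", "Turkiye"],
    "year": [1992, 1992, 1992, 1993, 1993, 1993, 1994, 1994, 1994, 1995],
    "index": [np.nan, np.nan, 1, 1, 1, 0, 1, 0, np.nan, 0],
})

# Keep only the rows where "index" has a value.
valid = df[df["index"].notna()]

# Earliest year that has a value, regardless of country.
earliest_year = valid["year"].min()

# Full row for that year; idxmin() returns the first label if years tie.
earliest_row = df.loc[valid["year"].idxmin()]

print(earliest_year)
print(earliest_row)
```

As for the original attempt: np.where called with only a condition returns a tuple of position arrays, not a DataFrame, so a["year"] cannot work. In addition, comparing with != None does not detect NaN, so the filter would not drop the missing rows; notna() or dropna() is the usual way to do that. Note also that the table shows the column as index, while the code uses "Index".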

Answer