Skip to content
Advertisement

How to select rightmost column with a value?

I have a DataFrame df with some country statistics for years from 2014 to 2018. Some of the countries have values for each of the years, while some countries are missing some. The DataFrame looks like this:

countryName   2014    2015    2016    2017    2018
Afghanistan     ..      ..     281      ..      ..
Albania         ..      ..      ..     891     901
Algeria        791     801     804     817     820
...

I want to keep only the most recent data value, so for the DataFrame above, the result should be:

countryName   value
Afghanistan     281 # the most recent value from 2016
Albania         901 # the most recent value from 2018
Algeria         820 # the most recent value from 2018
...

Advertisement

Answer

You could use the forward fill on the columns axis to get the last numeric values

data = [{'countryName': 'Afghanistan',
  '2014': nan,
  '2015': nan,
  '2016': 281.0,
  '2017': nan,
  '2018': nan},
 {'countryName': 'Albania',
  '2014': nan,
  '2015': nan,
  '2016': nan,
  '2017': 891.0,
  '2018': 901.0},
 {'countryName': 'Algeria',
  '2014': 791.0,
  '2015': 801.0,
  '2016': 804.0,
  '2017': 817.0,
  '2018': 820.0}]

df = pd.DataFrame(data)

    df.ffill(1).iloc[:, [0, -1]].set_axis(['countryName', 'value'], axis=1)

   countryName  value
0  Afghanistan  281.0
1      Albania  901.0
2      Algeria  820.0
User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement