Here is the original data:
   Name       Wine  Year
0  Mark     Volnay  1983
1  Mark     Volnay  1979
3  Mary     Volnay  1979
4  Mary     Volnay  1999
5  Mary  Champagne  1993
6  Mary  Champagne  1989
I would like to look up the values of Year based on the values of Name and Wine. The lookup should return every value in the Year column for the rows whose Name and Wine columns match the given key.
For example, with the key ['Mark', 'Volnay'] I would get the values [1983, 1979].
I tried manipulating the data and here is the best I could get.
Keep one instance of each key:
   Name       Wine  Year
1  Mark     Volnay  1979
4  Mary     Volnay  1999
6  Mary  Champagne  1989
Remove the Year column:
   Name       Wine
1  Mark     Volnay
4  Mary     Volnay
6  Mary  Champagne
Get the values in a list
[['Mark', 'Volnay'], ['Mary', 'Volnay'], ['Mary', 'Champagne']]
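The three steps above can be sketched as follows. This is a reconstruction, not the asker's actual code: the variable names `unique_rows` and `key_list` are hypothetical, and `keep="last"` is an assumption chosen because it reproduces the row indices (1, 4, 6) shown above.

```python
import pandas as pd

# Rebuild the sample data from the question, including its non-contiguous index
df = pd.DataFrame(
    {
        "Name": ["Mark", "Mark", "Mary", "Mary", "Mary", "Mary"],
        "Wine": ["Volnay", "Volnay", "Volnay", "Volnay", "Champagne", "Champagne"],
        "Year": [1983, 1979, 1979, 1999, 1993, 1989],
    },
    index=[0, 1, 3, 4, 5, 6],
)

# Keep one row per (Name, Wine) pair; keep="last" is an assumption (see above)
unique_rows = df.drop_duplicates(subset=["Name", "Wine"], keep="last")

# Remove the Year column, then convert the remaining rows to a list of lists
key_list = unique_rows.drop(columns="Year").values.tolist()
print(key_list)
```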
I now have the keys I need, but I can't retrieve the Year values from the original dataframe for a given key.
Answer
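You can also use groupby with get_group:
def getyear(dataframe, key: list):
    # Group by (Name, Wine) and pick the group matching the key; get_group needs a tuple
    values = dataframe.groupby(['Name', 'Wine']).get_group(tuple(key))['Year']
    # dict.fromkeys drops duplicates while preserving the original order
    dedupvalues = [*dict.fromkeys(values).keys()]
    return dedupvalues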
keys = ['Mark', 'Volnay']
print(getyear(df, keys))
[1983, 1979]
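If many keys need to be looked up, one possible variation (not part of the answer above) is to build the whole mapping once and index into it. The sketch below assumes the sample data from the question; `years_by_key` is a hypothetical name, and unlike the function above it does not remove duplicate years.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {
        "Name": ["Mark", "Mark", "Mary", "Mary", "Mary", "Mary"],
        "Wine": ["Volnay", "Volnay", "Volnay", "Volnay", "Champagne", "Champagne"],
        "Year": [1983, 1979, 1979, 1999, 1993, 1989],
    },
    index=[0, 1, 3, 4, 5, 6],
)

# Collect the Year values of every (Name, Wine) group into a list,
# then turn the result into a dict keyed by (Name, Wine) tuples
years_by_key = df.groupby(["Name", "Wine"])["Year"].agg(list).to_dict()

print(years_by_key[("Mark", "Volnay")])
print(years_by_key[("Mary", "Champagne")])
```

A key that does not occur in the data raises a KeyError here, so `years_by_key.get(key, [])` may be preferable when missing keys are expected.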