Check column for variable and get value from another column in matched row




How can I get the values of one column in a csv-file by matching attributes in another column?

The CSV file looks like this:

One,Two,Three
x,car,5
x,bus,7
x,car,9
x,car,6

I only want the values of column 3 from rows that have the value “car” in column 2. I do not want them summed; I want them printed as a list, or like this:

5
9
6

My approach looks like this, but it doesn’t really work:

import pandas as pd

df = pd.read_csv(r"example.csv")

ITEMS = ['car']  # I will need more items, this is just an example

for item in df.Two:
    if item in ITEMS:
        print(df.Three)

How can I get the exact value for a matched item?

Answer

You can do it in one line:

print(df['Three'][df['Two']=='car'].values)

Output:

[5 9 6]
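
If the values should be printed one per line, as the question asks, the same selection can be written with .loc and converted to a plain list. This is only a minimal sketch: the DataFrame below is an in-memory stand-in for example.csv, and the name matched is introduced here for illustration.

```python
import pandas as pd

# In-memory stand-in for example.csv from the question (assumption)
df = pd.DataFrame({
    'One': ['x', 'x', 'x', 'x'],
    'Two': ['car', 'bus', 'car', 'car'],
    'Three': [5, 7, 9, 6],
})

# Select column Three for the rows where column Two equals 'car'
matched = df.loc[df['Two'] == 'car', 'Three'].tolist()

# Print one value per line instead of a NumPy array
print(*matched, sep='\n')
```

Using .loc with a row mask and a column label does the row and column selection in a single indexing step.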

For multiple items try:

df = pd.DataFrame({'One': ['x','x','x','x', 'x'],'Two': ['car','bus','car','car','jeep'],'Three': [5,7,9,6,10]})

myitems = ['car', 'bus']
res_list = []

for item in myitems:
    res_list += df['Three'][df['Two']==item].values.tolist()

print(*sorted(res_list), sep='\n')

Output:

5
6
7
9
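
As an alternative to the loop, Series.isin can build one mask for all items at once. This is a sketch under the same sample data as above, not a replacement for the answer's approach; it reuses the names myitems and res_list from the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'One': ['x', 'x', 'x', 'x', 'x'],
    'Two': ['car', 'bus', 'car', 'car', 'jeep'],
    'Three': [5, 7, 9, 6, 10],
})

myitems = ['car', 'bus']

# isin returns True for every row whose value in Two is one of myitems
res_list = df.loc[df['Two'].isin(myitems), 'Three'].tolist()

# Sort as in the answer and print one value per line
print(*sorted(res_list), sep='\n')
```

Without sorted, the values come back in row order rather than grouped by item, which may or may not be what you want.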

Explanation

  1. df['Two']=='car' returns a boolean Series that is True at the row positions where the value in column Two of df is car; for the four-row CSV in the question that is [True False True True]
  2. We can filter the values in column Three by indexing it with this boolean Series, like so: df['Three'][<Boolean_Series>]
  3. .values returns the filtered values as a numpy.ndarray, here [5 9 6]
  4. To combine the resulting arrays we convert each numpy.ndarray to a Python list using tolist() and extend res_list with it using +=
  5. Then we use sorted to sort res_list and print it one value per line
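
The steps above can be traced one at a time. This is an illustrative sketch on the four-row data from the question; the intermediate names mask, filtered, values and as_list are introduced here and do not appear in the answer.

```python
import pandas as pd

# Four-row data from the question, built in memory (assumption)
df = pd.DataFrame({
    'One': ['x', 'x', 'x', 'x'],
    'Two': ['car', 'bus', 'car', 'car'],
    'Three': [5, 7, 9, 6],
})

# Element-wise comparison yields a boolean mask, one entry per row
mask = df['Two'] == 'car'

# Indexing column Three with the mask keeps only the True rows
filtered = df['Three'][mask]

# .values exposes the filtered data as a NumPy array
values = filtered.values

# tolist() converts the array to a plain Python list
as_list = values.tolist()

print(mask.tolist())
print(as_list)
```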


Source: stackoverflow