How can I get the values of one column in a csv-file by matching attributes in another column?
The CSV file looks like this:
One,Two,Three
x,car,5
x,bus,7
x,car,9
x,car,6
I only want the values of column 3 for rows that have the value “car” in column 2. I also do not want them summed, but rather printed as a list, or like this:
5 9 6
My approach looks like this, but it doesn’t really work:
import pandas as pd
df = pd.read_csv(r"example.csv")
ITEMS = ['car'] # I will need more items, this is just an example
for item in df.Two:
    if item in ITEMS:
        print(df.Three)
How can I get the exact value for a matched item?
Answer
In one line, you can do it like this:
print(df['Three'][df['Two']=='car'].values)
Output:
[5 9 6]
For multiple items try:
df = pd.DataFrame({'One': ['x','x','x','x', 'x'],'Two': ['car','bus','car','car','jeep'],'Three': [5,7,9,6,10]})
myitems = ['car', 'bus']
res_list = []
for item in myitems:
    res_list += df['Three'][df['Two']==item].values.tolist()
print(*sorted(res_list), sep='\n')
Output:
5
6
7
9
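As a possible alternative to the loop (not part of the original answer), pandas' Series.isin can build a single mask for all items at once. This is a minimal sketch using the same sample data; the name matched is only illustrative.

```python
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({
    'One': ['x', 'x', 'x', 'x', 'x'],
    'Two': ['car', 'bus', 'car', 'car', 'jeep'],
    'Three': [5, 7, 9, 6, 10],
})
myitems = ['car', 'bus']

# isin() builds one boolean mask covering every item in myitems,
# so no explicit loop is needed; rows keep their original order.
matched = df.loc[df['Two'].isin(myitems), 'Three'].tolist()

# Print one value per line; wrap in sorted() if sorted output is wanted.
print(*matched, sep='\n')
```

Unlike the loop, this keeps the values in file order rather than grouping them by item, which may or may not be what you want.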
Explanation
- df['Two']=='car' returns a boolean Series that is True at the row positions where the value in column Two of df is car.
- .values gets these boolean values as a numpy.ndarray; the result would be [True False True True].
- We can filter the values in column Three by using this list of booleans like so: df['Three'][<Boolean_list>]
- To combine the resulting arrays, we convert each numpy.ndarray to a Python list using tolist() and append it to res_list.
- Then we use sorted to sort res_list.
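The steps in the explanation can be traced one at a time. The following is a rough sketch that assumes the four-row data from the question; the intermediate names mask, filtered and as_list are made up for illustration.

```python
import pandas as pd

# The four rows from the question's CSV file.
df = pd.DataFrame({
    'One': ['x', 'x', 'x', 'x'],
    'Two': ['car', 'bus', 'car', 'car'],
    'Three': [5, 7, 9, 6],
})

# Step 1: compare column Two with 'car' to get one boolean per row.
mask = df['Two'] == 'car'
print(mask.tolist())   # [True, False, True, True]

# Step 2: use the booleans to keep only the matching rows of column Three.
filtered = df['Three'][mask]

# Step 3: convert the underlying numpy.ndarray to a plain Python list.
as_list = filtered.values.tolist()
print(as_list)         # [5, 9, 6]
```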