How can I get the values of one column in a csv-file by matching attributes in another column?
The CSV file looks like this:
One,Two,Three
x,car,5
x,bus,7
x,car,9
x,car,6
I only want to get the values of column 3 if the same row has the value “car” in column 2. I also do not want them summed; I want them printed as a list, like this:
5
9
6
My approach looks like this, but it doesn’t really work:
import pandas as pd

df = pd.read_csv(r"example.csv")

ITEMS = [car] #I will need more items, this is just examplified

for item in df.Two:
    if item in ITEMS:
        print(df.Three)
How can I get the exact value for a matched item?
Answer
In one line, you can do it like this:
print(df['Three'][df['Two']=='car'].values)
Output:
[5 9 6]
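If you want the values printed one per line, as in the question, something like the following should work. This is only a sketch: `io.StringIO` and the `csv_text` string stand in for `example.csv` so the snippet runs on its own, and `car_values` is a name chosen here for illustration.

```python
import io
import pandas as pd

# Stand-in for example.csv: the same rows as in the question, held in memory.
csv_text = """One,Two,Three
x,car,5
x,bus,7
x,car,9
x,car,6
"""

df = pd.read_csv(io.StringIO(csv_text))

# The boolean mask selects the rows where column Two equals 'car';
# .loc then picks column Three for just those rows.
car_values = df.loc[df['Two'] == 'car', 'Three'].tolist()

# Print one value per line instead of as a NumPy array.
for value in car_values:
    print(value)
```

With a real file, replace `io.StringIO(csv_text)` with the path to the CSV.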
For multiple items try:
import pandas as pd

df = pd.DataFrame({'One': ['x','x','x','x', 'x'],'Two': ['car','bus','car','car','jeep'],'Three': [5,7,9,6,10]})

myitems = ['car', 'bus']
res_list = []

# Collect the values of column Three for each wanted item in column Two
for item in myitems:
    res_list += df['Three'][df['Two']==item].values.tolist()

print(*sorted(res_list), sep='\n')
Output:
5
6
7
9
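As an alternative to the loop, `Series.isin` can test all items in a single pass. This is a different technique from the one above, sketched on the same sample data; `matched` is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'One': ['x', 'x', 'x', 'x', 'x'],
    'Two': ['car', 'bus', 'car', 'car', 'jeep'],
    'Three': [5, 7, 9, 6, 10],
})
myitems = ['car', 'bus']

# isin() builds one boolean mask that is True for every row
# whose value in column Two is any of the wanted items.
matched = df.loc[df['Two'].isin(myitems), 'Three'].tolist()

# matched keeps the original row order; sort only if you need it.
print(*sorted(matched), sep='\n')
```

Unlike the loop, this keeps the values in the order they appear in the file, which may or may not be what you want before sorting.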
Explanation
- df['Two']=='car' returns a boolean Series with True at the row positions where the value in column Two of df is car. Calling .values on it gives these booleans as a numpy.ndarray; for the CSV in the question the result would be [True False True True]
- We can filter the values in column Three by using this list of booleans like so: df['Three'][<Boolean_list>]
- To combine the resulting arrays we convert each numpy.ndarray to a Python list using tolist() and extend res_list with it using +=
- Then we use sorted to sort res_list
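
To see the intermediate results, the steps can be run one at a time. This is a minimal sketch on the four rows from the question; the variable names (`mask`, `mask_list`, `filtered_list`) are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'One': ['x', 'x', 'x', 'x'],
    'Two': ['car', 'bus', 'car', 'car'],
    'Three': [5, 7, 9, 6],
})

# Step 1: compare column Two against 'car' to get a boolean Series.
mask = df['Two'] == 'car'
mask_list = mask.values.tolist()

# Step 2: index column Three with the mask to keep only the True rows.
filtered_list = df['Three'][mask].values.tolist()

print(mask_list)
print(filtered_list)
```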