Good day.
If I have the following array:
[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "bannana", 18, 11], [14, "pear", 17, 11]
How can I change the array to only show data from user pear? I want to collect all the values from column 1 for user pear (12, 14).
Or alternatively, how can I find the unique values in column 2 (e.g. apples, pear and bannana) and then filter by pear to get only the rows for pear: [12, "pear", 24, 11], [14, "pear", 17, 11]?
What I have tried, in various forms:
uniqueRows = np.unique(array, axis=:,1)
This is what I can use to filter if I have the unique values.
new_arr = np.array([[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "bannana", 18, 11], [14, "pear", 17, 11]])
new_val = np.array(["pear"])
result = np.in1d(new_arr[:, 1], new_val)
z = new_arr[result]
Answer
Pandas Way
import numpy as np
import pandas as pd

new_arr = np.array([[11, "apples", 22, 11], [12, "pear", 24, 11], [13, "banana", 18, 11], [14, "pear", 17, 11]])
df = pd.DataFrame(new_arr, columns=['A', 'B', 'C', 'D'])

result = df[df.B == 'pear']
print(result)
'''
    A     B   C   D
1  12  pear  24  11
3  14  pear  17  11
'''

# or
result_2 = df['B'].drop_duplicates()
print(result_2)
'''
0    apples
1      pear
2    banana
'''
Instead of drop_duplicates() you can also use unique(), but the drop_duplicates() approach above should be faster.
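NumPy Way

If you would rather stay in NumPy, a boolean mask on the second column may be enough. The following is only a minimal sketch, assuming the same data as above; the names arr, users, mask, pear_rows and pear_ids are made up for illustration. Note that NumPy stores mixed rows like these as strings, so the first-column values are converted back to integers at the end.

```python
import numpy as np

# Same data as in the answer above; mixed int/str rows become a string array.
arr = np.array([[11, "apples", 22, 11],
                [12, "pear", 24, 11],
                [13, "banana", 18, 11],
                [14, "pear", 17, 11]])

# Unique values in the second column (index 1); np.unique returns them sorted.
users = np.unique(arr[:, 1])

# Boolean mask that is True for rows whose second column equals "pear".
mask = arr[:, 1] == "pear"

# All rows belonging to pear.
pear_rows = arr[mask]

# First-column values for pear, converted from strings back to integers.
pear_ids = arr[mask, 0].astype(int)

print(users)      # unique names in column 2
print(pear_rows)  # rows for pear only
print(pear_ids)   # column 1 values for pear
```

The mask is the same idea as the np.in1d filter in the question, just written as a direct comparison because there is only one value to match.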