
using set() with pandas

Can set() be used to read the data in a specific column of a pandas DataFrame? For example, I have the following DataFrame df1:

    df1= [    
           0 -10 2 5 
           1  24 5 10 
           2  30 3 6 
           3  30 2 1 
           4  30 4 5
                     ]

where the first column is the index. I first tried to isolate the second column

                                       [-10 
                                         24 
                                         30 
                                         30 
                                         30] 

using the following: x = pd.DataFrame(df1, columns=[0]). Then I transposed the column with XX = x.T and called set() on the result.

However, instead of obtaining [-10 24 30], I got [0 1 2 3 4].

So set() read the index instead of the first data column.


Answer

set() takes an iterable.

Iterating over a pandas DataFrame yields its column names, not its values.

Since you've transposed the DataFrame, your index values are now the column names, so iterating over the transposed DataFrame gives you those index values.
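To see this behaviour directly, here is a minimal sketch. It rebuilds a frame with the values shown in the question; the default integer column labels 0, 1, 2 and the variable names are assumptions made for illustration.

```python
import pandas as pd

# Values copied from the question; column labels 0, 1, 2 are assumed.
df1 = pd.DataFrame([[-10, 2, 5],
                    [24, 5, 10],
                    [30, 3, 6],
                    [30, 2, 1],
                    [30, 4, 5]])

x = df1[[0]]   # one-column DataFrame holding column 0
xx = x.T       # transposed: the row labels 0..4 become the column labels

# Iterating a DataFrame yields column labels, so set() collects labels.
labels_plain = set(x)
labels_transposed = set(xx)

print(labels_plain)
print(labels_transposed)
```

Before the transpose, set() sees only the single column label; after it, set() sees the five former index labels, which matches the result in the question.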

If you want to use set() to get the values in the column, you can use:

x = pd.DataFrame(df1, columns=[0])
# .iloc[:, 0] selects the first column by position; .values gives its data as an array
set(x.iloc[:, 0].values)

But if you just want the unique values in column 0, you can select it as a Series with single brackets (a DataFrame has no unique() method) and use:

df1[0].unique()
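The two approaches can be compared side by side. This is a sketch under the same assumptions as the question's data (integer column labels, values as shown); the variable names are made up for illustration.

```python
import pandas as pd

# Values copied from the question; column labels 0, 1, 2 are assumed.
df1 = pd.DataFrame([[-10, 2, 5],
                    [24, 5, 10],
                    [30, 3, 6],
                    [30, 2, 1],
                    [30, 4, 5]])

# Iterating a Series (unlike a DataFrame) yields its values.
as_set = set(df1[0])

# Series.unique() returns a NumPy array in order of first appearance.
as_unique = df1[0].unique()

print(as_set)
print(as_unique)
```

A set has no guaranteed order, while unique() keeps the order in which values first occur, so unique() is usually the better fit when the order matters.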
User contributions licensed under: CC BY-SA