
pandas dataframe loc usage: what does supplying length of index to loc actually mean?

I have read about dataframe loc. I could not understand why the length of dataframe(indexPD) is being supplied to loc as a first argument. Basically what does this loc indicate?

tp_DataFrame = pd.DataFrame(columns=list(props_file_data["PART_HEADER"].split("|")))

indexPD = len(tp_DataFrame)

tp_DataFrame.loc[indexPD, 'item_id'] = something


Answer

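That does not target every row: it adds a new row. loc is label-based, and with the default index (0, 1, 2, ...) the labels run from 0 to len(df) - 1, so len(df) is the first label that does not exist yet. Assigning to a label that does not exist makes pandas enlarge the DataFrame with a row carrying that label. In your code tp_DataFrame starts out empty, so indexPD is 0 and the assignment creates row 0 with item_id filled in; if the statement is run repeatedly (for example in a loop), each run appends one more row. Consider this pandas DataFrame:

df = pd.DataFrame(zip([1,2,3], [4,5,6]), columns=['a', 'b'])

   a  b
0  1  4
1  2  5
2  3  6

Here len(df) is 3, so df.loc[len(df), 'b'] = -1 appends a row labelled 3 with b set to -1. No value was given for a, so it is NaN in the new row, and the integer columns are typically upcast to float as a side effect (the exact dtypes can depend on the pandas version):

     a    b
0  1.0  4.0
1  2.0  5.0
2  3.0  6.0
3  NaN -1.0

To apply the assignment to all existing rows of the column instead, use a slice as the first argument, df.loc[:, 'b'] = -1, which yields:

   a  b
0  1 -1
1  2 -1
2  3 -1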
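
As a minimal sketch of what assigning to df.loc[len(df), column] does on a default integer index, the snippet below first adds a row to a three-row frame and then builds a frame row by row from an empty one, as in the question. The column names item_id and name and the values A1 and B2 are hypothetical, chosen only for illustration.

```python
import pandas as pd

# Three-row frame with the default labels 0, 1, 2.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
new_label = len(df)            # 3: one past the last existing label
df.loc[new_label, "b"] = -1    # label 3 is missing, so a row is added
print(len(df))                 # 4

# The pattern from the question: start empty and add rows one by one.
# Column names and values here are hypothetical.
parts = pd.DataFrame(columns=["item_id", "name"])
for item_id in ["A1", "B2"]:
    parts.loc[len(parts), "item_id"] = item_id
print(parts["item_id"].tolist())  # ['A1', 'B2']
```

Growing a DataFrame this way is convenient for a handful of rows. For many rows it is usually faster to collect the records in a list and build the DataFrame once at the end.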

The first argument to loc selects which row labels the assignment applies to. For instance, to change only the first two rows of the original three-row DataFrame, pass a list of labels:

df.loc[[0,1], 'b'] = -1

   a  b
0  1 -1
1  2 -1
2  3  6
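
Because loc works on labels rather than positions, the effect of passing len(df) depends on the index. The sketch below, which assumes a hypothetical custom index of 1, 2, 3 for the second frame, contrasts selecting rows by a list of labels with the case where len(df) happens to match an existing label.

```python
import pandas as pd

# Default index: labels 0, 1, 2. A list of labels updates only those rows.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
df.loc[[0, 1], "b"] = -1
print(df["b"].tolist())        # [-1, -1, 6]

# Hypothetical custom index 1, 2, 3: the label 3 already exists, so
# custom.loc[len(custom), ...] overwrites that row instead of appending one.
custom = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]}, index=[1, 2, 3])
custom.loc[len(custom), "b"] = -1
print(len(custom))             # still 3
print(custom["b"].tolist())    # [4, 5, -1]
```

So the len(df) idiom only appends reliably when the index is the default 0..n-1 range.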
User contributions licensed under: CC BY-SA