I have read about DataFrame loc. I could not understand why the length of the dataframe (indexPD) is being supplied to loc as the first argument. Basically, what does this loc indicate?
    tp_DataFrame = pd.DataFrame(columns=list(props_file_data["PART_HEADER"].split("|")))
    indexPD = len(tp_DataFrame)
    tp_DataFrame.loc[indexPD, 'item_id'] = something
Answer
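The first argument to loc is a row label, and the second is a column label. Because indexPD is len(tp_DataFrame), it is the first label not yet used by a default 0, 1, 2, ... index, so the assignment does not touch the existing rows: pandas adds a new row with that label and sets item_id in it. Consider this pandas DataFrame:

    df = pd.DataFrame(zip([1,2,3], [4,5,6]), columns=['a', 'b'])

       a  b
    0  1  4
    1  2  5
    2  3  6

df.loc[len(df), 'b'] = -1 is not equivalent to df.loc[:, 'b'] = -1. Here len(df) is 3, and no row has the label 3, so the first statement appends a row labelled 3 in which b is -1 and a is NaN (missing). The second statement uses the slice :, which selects every existing row, so it applies -1 to all rows of column b and yields:

       a  b
    0  1 -1
    1  2 -1
    2  3 -1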
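To see the difference between a label that does not exist yet and a full slice, here is a minimal sketch. It assumes a default integer index; the names enlarged and overwritten are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

# Label len(df) == 3 does not exist yet, so .loc adds a new row for it.
enlarged = df.copy()
enlarged.loc[len(enlarged), "b"] = -1

# A full slice selects every existing row and adds nothing.
overwritten = df.copy()
overwritten.loc[:, "b"] = -1

print(enlarged.shape)     # one more row than df
print(overwritten.shape)  # same shape as df
```

Only the second assignment changes the existing values of column b; the first one leaves them alone and grows the frame by one row.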
The purpose of the first argument is to specify which row labels in that column are assigned. For instance, if you only want to change the first 2 rows, you can pass a list of labels:
    df.loc[[0,1], 'b'] = -1

       a  b
    0  1 -1
    1  2 -1
    2  3  6
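Applied to the code in the question, the pattern looks roughly like the sketch below. The header string and the item values are hypothetical stand-ins, since the contents of props_file_data["PART_HEADER"] are not shown.

```python
import pandas as pd

# Hypothetical header standing in for props_file_data["PART_HEADER"].
header = "item_id|name"
tp = pd.DataFrame(columns=header.split("|"))

for item_id in ["A1", "B2", "C3"]:  # made-up values
    next_label = len(tp)            # 0, then 1, then 2
    # The label is not in the index yet, so a new row is created.
    tp.loc[next_label, "item_id"] = item_id

print(tp)
```

This relies on the index being the default 0, 1, 2, ... sequence. If rows have been dropped or the index uses other labels, len(tp) may match an existing label, and the assignment would overwrite that row instead of adding one.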