Couldn't come up with a better title, so here we are. I am running the following code:
dow_23457 = df
dow_23457 = dow_23457.set_index('date', inplace = True)
dow_23457 = dow_23457.shift(24)
dow_23457 = dow_23457.reset_index()
As far as I understand, I first make a copy of ‘df’ and then I change the copy. What confuses me is that when I run the second line, the ‘date’ column becomes the index even in the ‘df’ data frame. The changes from the two following lines, though, only apply to the copied (dow_23457) data frame. How can this happen?
Answer
I first make a copy of ‘df’ and then I change the copy
Nope! When you do dow_23457 = df, you’re making dow_23457 look at the same underlying object that df has been looking at. Direct assignment doesn’t copy data in Python; it only binds another name to the same object.
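To see this for yourself, here is a small sketch. The frame below is a made-up stand-in for your df (its column names and values are assumptions, not your data), and alias is just an illustrative name:

```python
import pandas as pd

# Hypothetical stand-in for the question's df
df = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=3),
    "value": [1, 2, 3],
})

alias = df                  # no copy: a second name for the same object
same_object = alias is df   # True, both names refer to one DataFrame

# inplace=True mutates the shared object and returns None
result = alias.set_index("date", inplace=True)

index_name_via_df = df.index.name   # the change is visible through df too
```

Since set_index with inplace=True returns None, assigning its result back to a name leaves that name pointing at None rather than at a dataframe.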
You need to be explicit:
dow_23457 = df.copy()
which makes dow_23457 now look at an entirely different, newly made dataframe object that is independent of what df looks at. (Well, unless you have lists, dicts etc. in the cells of the dataframe: those objects are not duplicated by the copy and stay shared between the two frames… but you shouldn’t have them in the cells of a dataframe in the first place!)
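A minimal sketch of both points, again with made-up frames (df, nested and their contents are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-in for the question's df
df = pd.DataFrame({
    "date": pd.date_range("2021-01-01", periods=3),
    "value": [1, 2, 3],
})

independent = df.copy()                       # a separate DataFrame object
independent = independent.set_index("date")   # without inplace, a new frame is returned
df_still_has_date = "date" in df.columns      # df is untouched

# The caveat: Python objects stored in cells are not copied recursively
nested = pd.DataFrame({"items": [[1, 2], [3]]})
nested_copy = nested.copy()
nested_copy.loc[0, "items"].append(99)        # mutates the list object itself
shared_list = nested.loc[0, "items"]          # the original frame sees the append
```

Here the two frames are distinct objects, but the list in the first cell is the very same Python list in both, which is why mutating it through one frame shows up in the other.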
For more on this “naming” subject, you might want to see here (it is available both as a video and as plain text).