
Pandas “A value is trying to be set on a copy of a slice from a DataFrame”

Having a bit of trouble understanding the documentation

C:/Users/erasmuss/PycharmProjects/Sarah/farmdata.py:38: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  dfbreed['x'] = dfbreed.apply(testbreed, axis=1)

The code re-arranges and cleans some data to make analysis easier. The data is given row by row for each animal, but has repetitions, blanks, and other sparse values. The idea is to stack rows into columns and grab the useful data (weight by date and final BCS) per animal.

Initial DF few snippets of the dataframe

Output Format Output DF/csv

[code block not preserved]

I want to take the BCS and Breed dataframes, whose columns are multi-indexed first by Breed or BCS and then by date, pick the first non-NaN value across the date columns in each row, and set that into a column named breed.
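As a rough illustration of that goal, here is a minimal sketch. The frame, the animal IDs, and the breed and BCS values are all made up for the example, and it assumes the columns are a two-level MultiIndex like the one shown later in the question. Back-filling along the row axis is one possible way to pull out each row's first non-NaN value; it is not necessarily what the linked 2015 answer does.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: columns are a two-level
# MultiIndex (measure, date) and there is one row per animal.
columns = pd.MultiIndex.from_product(
    [["Breed", "BCS"], ["1/28/2021", "2/4/2021", "2/12/2021"]],
    names=[None, "Date"],
)
df = pd.DataFrame(
    [
        [np.nan, "Angus", "Angus", np.nan, 3.0, 3.5],
        ["Hereford", np.nan, np.nan, 2.5, np.nan, np.nan],
        [np.nan, np.nan, "Brahman", np.nan, np.nan, 4.0],
    ],
    index=["A1", "A2", "A3"],
    columns=columns,
)

# df["Breed"] selects every date column under the "Breed" top level.
# bfill(axis=1) copies each row's first non-NaN value leftwards, so the
# first column then holds that value (or NaN if the whole row is empty).
first_breed = df["Breed"].bfill(axis=1).iloc[:, 0]
first_bcs = df["BCS"].bfill(axis=1).iloc[:, 0]

# Build the summary as a new frame instead of writing into a slice.
summary = pd.DataFrame({"breed": first_breed, "bcs": first_bcs})
print(summary)
```

Building `summary` as a fresh DataFrame sidesteps the view-versus-copy question, because nothing is assigned into a subset of `df`.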

I had a lot of trouble getting the columns to pick the first unique values in situ on the DF. I found a workaround in a 2015 answer:

2015 Answer

which defined the function at the top. Reading through the documentation, the idea of setting a value on a copy of a slice makes sense intuitively, but I can’t think of a way to make it work as a direct replacement or as an index-based assignment.

Should I be looping through?

Trying the approach from the second answer here, I get

[code block not preserved]

which returns

ValueError: Must have equal len keys and value when setting with an iterable

I’m thinking this has something to do with how the multi-index keys come up:

MultiIndex([('Breed', '1/28/2021'), ('Breed', '2/12/2021'), ('Breed', '2/4/2021'), ('Breed', '3/18/2021'), ('Breed', '7/30/2021')], names=[None, 'Date'])
MultiIndex([('BCS', '1/28/2021'), ('BCS', '2/12/2021'), ('BCS', '2/4/2021'), ('BCS', '3/18/2021'), ('BCS', '7/30/2021')], names=[None, 'Date'])
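Since the failing code is not shown, the following is only a guess at the kind of shape mismatch involved, using a made-up frame with the same two-level column layout. It illustrates that selecting a top-level key returns several date columns, while a full two-element tuple names exactly one column. The `("Summary", "breed")` key is a hypothetical name chosen for the example.

```python
import numpy as np
import pandas as pd

# Made-up frame with the same column layout as the MultiIndex above.
columns = pd.MultiIndex.from_tuples(
    [("Breed", "1/28/2021"), ("Breed", "2/4/2021"),
     ("BCS", "1/28/2021"), ("BCS", "2/4/2021")],
    names=[None, "Date"],
)
df = pd.DataFrame(
    [[np.nan, "Angus", 3.0, np.nan],
     ["Hereford", np.nan, np.nan, 2.5]],
    columns=columns,
)

# Selecting only the top level returns a frame with one column per date.
breed_block = df["Breed"]

# Reduce that block to a single Series (first non-NaN value per row)
# before assigning it anywhere.
first_breed = breed_block.bfill(axis=1).iloc[:, 0]

# A full (level 0, level 1) tuple names exactly one column, so the key
# and the assigned value have matching shapes.
df[("Summary", "breed")] = first_breed
print(df)
```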

Sorry for the long question(s?) Can anyone help me out?

Thanks.


Answer

You created dfbreed as:

[code block not preserved]

So it may be a view of the original DataFrame (limited to just this one column) or a copy of it; pandas cannot always tell which.

Remember that a view has no data buffer of its own; it is only a tool to “view” a fragment of the original DataFrame.

When you perform dfbreed['x'] = dfbreed.apply(...), pandas cannot guarantee whether the write stays in dfbreed or also reaches the original DataFrame, so it issues SettingWithCopyWarning.

To avoid this warning, create dfbreed as an “independent” DataFrame:

[code block not preserved]

Now dfbreed has its own data buffer and you are free to change the data.
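A minimal sketch of that pattern follows. It assumes a small made-up frame, and `first_valid` is a hypothetical stand-in for the question's `testbreed` function, whose body is not shown.

```python
import pandas as pd

# Made-up data; the column names are illustrative only.
df = pd.DataFrame(
    {
        "breed_jan": [None, "Angus", None],
        "breed_feb": ["Hereford", "Angus", None],
        "weight": [510, 480, 530],
    }
)


def first_valid(row):
    """Hypothetical stand-in for testbreed: first non-missing value in a row."""
    valid = row.dropna()
    return valid.iloc[0] if len(valid) else None


# .copy() makes dfbreed an independent DataFrame with its own data,
# so adding a column to it is unambiguous.
dfbreed = df[["breed_jan", "breed_feb"]].copy()
dfbreed["x"] = dfbreed.apply(first_valid, axis=1)

print(dfbreed)
print(df.columns.tolist())  # the original frame gains no "x" column
```

The trade-off is memory: the copy duplicates the selected columns, which is usually negligible for a handful of columns.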

User contributions licensed under: CC BY-SA