I'm new to programming and I'm working on a Python project using pandas. I want to change values in each row of a dataset using .loc, but it doesn't seem to work. The idea is for a row to take the EOL value if the row is equal to 0. The code doesn't raise an error, but my dataset is unchanged after the iterations. Here is the code:
for machines in telemetry_days['machineID']:
    EOL = 365
    i = 0
    for row in telemetry_days['failure_comp1'].loc[(telemetry_days['machineID'] == machines)]:
        if (row != 0):
            EOL = row
        elif (row == 0):
            telemetry_days['failure_comp1'].loc[(telemetry_days['machineID'] == machines)].iloc[i] = EOL
        i = i + 1
I think it's because I'm using .iloc, so the value of 'failure_comp1' in the dataset is never changed. But I can't figure out how to get a specific row from .loc without using .iloc. If anyone has any suggestions I'd be very grateful, thanks.
Here is what I have, for example (for one 'machine'):
index failure_comp1
67 0
254 150
568 0
850 0
998 345
I want it to become this:
index failure_comp1
67 365
254 150
568 150
850 150
998 345
It's a time series dataset and I want to label each component of each machine with its End Of Life time (number of days). I've already got it labeled at the date where it fails, but I want to have it labeled for every row of that specific component.
So I wouldn't iterate through the rows (although you could if you want; I'll show that solution too). What I would do instead is use a .groupby('machineID'): 1) convert all the 0s to NaN; 2) forward fill the NaNs within each machine; 3) this leaves any leading 0s as NaN, so finally fillna with 365.
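Given this sample data set:
import pandas as pd
# NOTE: the 'machineID' column and the start of this literal were lost from the post;
# the three-machine grouping below is an assumption so that the example runs.
telemetry_days = pd.DataFrame({
    'machineID': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3],
    'failure_comp1': [0, 0, 0, 0,
                      0, 12, 0, 0,
                      345, 12, 0, 0]})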
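import numpy as np
# 1) turn the 0 placeholders into NaN so they count as missing values
telemetry_days['failure_comp1'] = telemetry_days['failure_comp1'].replace(0, np.nan)
# 2) forward fill within each machine, 3) give the remaining leading NaNs the default of 365
telemetry_days['failure_comp1'] = telemetry_days.groupby('machineID')['failure_comp1'].ffill().fillna(365)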
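To check the three steps against the example in the question, here is a small sketch that runs them on the five rows shown for one machine. The helper name fill_eol and the machineID value of 1 are made up for illustration; the question does not give either.

```python
import numpy as np
import pandas as pd

# The five rows from the question, with their original index labels.
sample = pd.DataFrame(
    {
        "machineID": [1, 1, 1, 1, 1],  # assumed ID, not given in the question
        "failure_comp1": [0, 150, 0, 0, 345],
    },
    index=[67, 254, 568, 850, 998],
)


def fill_eol(df, default_eol=365):
    """Replace zeros with the last non-zero value seen for the same machine."""
    s = df["failure_comp1"].replace(0, np.nan)  # 1) zeros become NaN
    s = s.groupby(df["machineID"]).ffill()      # 2) forward fill per machine
    return s.fillna(default_eol).astype(int)    # 3) leading NaNs get the default


sample["failure_comp1"] = fill_eol(sample)
print(sample["failure_comp1"].tolist())  # [365, 150, 150, 150, 345]
```

The result matches the "I want it to become this" table, and because the forward fill is done per group, values never leak from one machine into the next.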
If you want to use the .loc or .iloc:
Here's how I would do it. I would loop through each unique machineID, filter the dataframe to get just that machine's rows, then iterate through that sub-group. I also would not hard-code the i (index): .items() (called .iteritems() before pandas 2.0) or .iterrows() will return the index value for you, so just use that.
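for machines in telemetry_days['machineID'].unique():
    EOL = 365
    # .items() yields (index label, value) pairs; it replaces .iteritems(), which was removed in pandas 2.0
    for i, row in telemetry_days[telemetry_days['machineID'] == machines]['failure_comp1'].items():
        if (row != 0):
            EOL = row
        elif (row == 0):
            # i is an index label, so write through .loc on the dataframe itself
            telemetry_days.loc[i, 'failure_comp1'] = EOL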
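As for why the code in the question runs without an error but changes nothing: as far as I can tell, selecting with a boolean mask gives you a new object, so the chained .loc[...].iloc[i] = EOL writes to that temporary copy instead of the dataframe. Here is a minimal sketch on a made-up toy frame (not your data) that shows the difference, and one way to turn "the first matching row" into an index label you can pass to .loc:

```python
import pandas as pd

# Hypothetical toy frame for illustration only.
df = pd.DataFrame({"machineID": [1, 1, 2], "failure_comp1": [0, 150, 0]})
mask = df["machineID"] == 1

# Boolean-mask selection hands back a new Series, so this write lands on
# that separate object (pandas may warn about it) and df is left untouched.
subset = df["failure_comp1"].loc[mask]
subset.iloc[0] = 365
after_chained = df["failure_comp1"].tolist()

# Convert the position within the filtered rows into an index label,
# then assign in a single .loc call on the dataframe itself.
first_label = df.index[mask.to_numpy()][0]
df.loc[first_label, "failure_comp1"] = 365
after_loc = df["failure_comp1"].tolist()

print(after_chained)  # [0, 150, 0]
print(after_loc)      # [365, 150, 0]
```

The rule of thumb is to do the row selection and the column selection in one df.loc[rows, column] = value statement, rather than chaining two indexers.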