Below is the Dataframe i’m working. I want to replace NaN values in ‘Score’ columns using values from column ‘Country’ and ‘Sectors’
Country Sectors Score 0 USA MECH NaN 1 IND ELEC 10.0 2 USA CHEM NaN 3 RUS ENT 34.0 4 PAK MATH 45.0 5 SL LAN 23.0 6 USA CHEM 56.0 7 IND ELEC 32.0 8 USA CHEM NaN 9 RUS ENT 45.0 10 PAK MATH 45.0
Below is the code which I’ve tried
import pandas as pd
import numpy as np
df = pd.read_csv('../Data/Data.csv')
df['Score'] = df[(df['Country'] == 'USA') & (df['Sectors'] == 'CHEM') & (df['Score'].isnull())]['Score'].fillna(10)
print(df)
```but I am getting below result```
Country Sectors Score
0 USA MECH NaN
1 IND ELEC NaN
2 USA CHEM 10.0
3 RUS ENT NaN
4 PAK MATH NaN
5 SL LAN NaN
6 USA CHEM NaN
7 IND ELEC NaN
8 USA CHEM 10.0
9 RUS ENT NaN
10 PAK MATH NaN
I want to replace only NaN values specific to country == ‘USA’ and Sectors == ‘CHEM’ and keep all values as it is. Could anyone please help?“`
Advertisement
Answer
You can use np.where:
>>> df = pd.DataFrame({'Country':['USA', 'IND','USA'], 'Sectors':['MECH', 'ELEC','CHEM'], 'Score':[45.0, 30.0, np.NaN]})
>>> df["Score"] = np.where((df["Country"]=='USA') & (df['Sectors'] == 'CHEM'), 10, df["Score"])
>>> df
Country Sectors Score
0 USA MECH 45.0
1 IND ELEC 30.0
2 USA CHEM 10.0
If df["Country"]=='USA' and df['Sectors'] == 'CHEM', the df['Score'] is set to 10, else, the original value in df['Score'] is set.