How to assign an item in a pandas dataframe after checking for conditions?

I am iterating through a pandas dataframe (originally a CSV file) and checking for specific keywords in each row of a certain column. If a keyword appears at least once, I add 1 to a score. There are about 7 keywords, and if the score is >= 6, I would like to assign a string (here “Software and application developer”) to another column in the same row and save the score. Unfortunately, the score ends up the same in every row, which I find hard to believe. This is my code so far:

for row in data.iterrows():
    devScore=0
    if row[1].str.contains("developer").any() | row[1].str.contains("developpeur").any():
        devScore=devScore+1
    if row[1].str.contains("symfony").any():
        devScore=devScore+1
    if row[1].str.contains("javascript").any():
        devScore=devScore+1
    if row[1].str.contains("java").any() | row[1].str.contains("jee").any():
        devScore=devScore+1
    if row[1].str.contains("php").any():
        devScore=devScore+1
    if row[1].str.contains("html").any() | row[1].str.contains("html5").any():
        devScore=devScore+1
    if row[1].str.contains("application").any() | row[1].str.contains("applications").any():
        devScore=devScore+1
    if devScore>=6:
        data["occupation"]="Software and application developer"
        data["score"]=devScore


Answer

You are assigning a constant to the whole column here, so every row gets the same value:

data["occupation"]="Software and application developer"
data["score"]=devScore

Instead, assign to the current row only, using data.loc with the row index:

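for idx, row in data.iterrows():
    devScore = 0
    # ... keyword checks from the question go here ...
    if devScore >= 6:
        # write to this row only, not to the whole column
        data.loc[idx, "occupation"] = "Software and application developer"
        data.loc[idx, "score"] = devScore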
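
For reference, here is a minimal self-contained sketch of the row-wise assignment. It is not the asker's exact code: the sample data, the "description" column name, and the KEYWORD_GROUPS list are assumptions made for illustration, and the keyword test uses plain substring matching on one column rather than str.contains on the whole row.

```python
import pandas as pd

# Hypothetical sample data; the "description" column name is an assumption.
data = pd.DataFrame({
    "description": [
        "developer symfony javascript java php html application",
        "nurse in a hospital",
    ]
})

# Each tuple is one keyword group from the question; a group adds at most 1.
KEYWORD_GROUPS = [
    ("developer", "developpeur"),
    ("symfony",),
    ("javascript",),
    ("java", "jee"),
    ("php",),
    ("html", "html5"),
    ("application", "applications"),
]

# Create the target columns up front so rows below the threshold stay empty.
data["occupation"] = None
data["score"] = 0

for idx, row in data.iterrows():
    text = str(row["description"]).lower()
    # Count how many keyword groups have at least one match in this row.
    dev_score = sum(any(k in text for k in group) for group in KEYWORD_GROUPS)
    if dev_score >= 6:
        # .loc[idx, column] writes to a single cell, not the whole column.
        data.loc[idx, "occupation"] = "Software and application developer"
        data.loc[idx, "score"] = dev_score
```

With this sample data only the first row reaches the threshold, so the second row keeps its initial values.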
User contributions licensed under: CC BY-SA