So I’m trying to look down a specific column of my csv file for a partial string. If that meets a certain condition, it’ll write something else in a different column.
For example:
Letter Grade Percentage A Ninety Five Percent C Seventy Three Percent B Eighty Two Percent
The “Percentage” column will always have the same format of “Ninety Five Percent” that is number in words + Percent.
I want to be able to open up a csv file, have python parse the percentage column for the first word such if it contains “Ninety” then it’ll assign “A” in the letter grade column, if it contains “Eighty” then B in the letter grade and so on and so forth.
This is my current code:
import pandas as pd df = pd.DataFrame(pd.read_csv(r'file.csv', dtype=str)) A = 'Ninety' B = 'Eighty' C = 'Seventy' D = 'Sixty' if df[df['Percentage'].isin(A)]: df['Letter Grade'] = df['Letter Grade'].str.replace['', 'A', regex=False] elif df[df['Percentage'].isin(B)]: df['Letter Grade'] = df['Letter Grade'].str.replace['', 'B', regex=False] elif df[df['Percentage'].isin(C)]: df['Letter Grade'] = df['Letter Grade'].str.replace['', 'C', regex=False] elif df[df['Percentage'].isin(D)]: df['Letter Grade'] = df['Letter Grade'].str.replace['', 'D', regex=False] else: df['Letter Grade'] = df['Letter Grade'].str.replace['', 'F', regex=False] df.to_csv(r'file.csv', index=False)
It is currently giving me this error: The truth value of a DataFrame is ambiguous.
Advertisement
Answer
Try with numpy.select
:
import numpy as np df["Letter Grade"] = np.select([df["Percentage"].str.contains(A), df["Percentage"].str.contains(B), df["Percentage"].str.contains(C), df["Percentage"].str.contains(D)], ["A","B","C","D"])
Alternatively with findall
and map
:
mapper = {"Ninety": "A", "Eighty": "B", "Seventy": "C", "Sixty": "D"} df["Letter Grade"] = df["Percentage"].str.findall("|".join(mapper.keys())).str[0].map(mapper)