I have a list of terms in Python that look like this.
Fruit
apple
banana
grape
orange
I also have a data frame in which each row holds a list of individual sentences that may contain the name of one of those fruits. Something similar to this:
Customer Review
1 ['the banana was delicious','he called the firetruck','I had only half an orange']
2 ['I liked the banana','there was a worm in my apple','Cantaloupes are better then melons']
3 ['It could use some more cheese','the grape and orange was sour']
And I want to take the sentences in the review column, match them with the fruit mentioned in the text and print out a data frame of that as a final result. So, something like this:
Fruit Review
apple ['there was a worm in my apple']
banana ['the banana was delicious','I liked the banana']
grape ['the grape and orange was sour']
orange ['the grape and orange was sour','I had only half an orange']
How could I go about doing this?
Answer
While the exact answer depends on how you’re storing the data, I think the methodology is the same:
- Create and store an empty list for every fruit name to store its reviews
- For each review, check each of the fruits to see if they appear. If a fruit appears in the comment at all, add the review to that fruit’s list
Here’s an example of what that would look like:
#The list of fruits
fruits = ['apple', 'banana', 'grape', 'orange']
#The collection of reviews (based on the way it was presented, I'm assuming it was in a dictionary)
reviews = {
    '1':['the banana was delicious','he called the firetruck','I had only half an orange'],
    '2':['I liked the banana','there was a worm in my apple','Cantaloupes are better then melons'],
    '3':['It could use some more cheese','the grape and orange was sour']
}
fruitDictionary = {}
#1. Create and store an empty list for every fruit name to store its reviews
for fruit in fruits:
    fruitDictionary[fruit] = []
for customerReviews in reviews.values():
    #2. For each review,...
    for review in customerReviews:
        #...check each of the fruits to see if they appear.
        for fruit in fruits:
            # If a fruit appears in the comment at all (ignoring case),...
            if fruit.lower() in review.lower():
                #...add the review to that fruit's list
                fruitDictionary[fruit].append(review)
This differs from previous answers in that a sentence like “I enjoyed this grape. I thought the grape was very juicy” is only added to the grape list once, even though it mentions the fruit twice.
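To see why a repeated mention doesn't produce a duplicate, here's a small sketch. The helper name group_by_fruit is just something I made up for illustration; it isn't from the question. The point is that the `in` check only answers yes or no for each sentence, so the sentence is appended at most once per fruit.

```python
def group_by_fruit(fruits, sentences):
    """Hypothetical helper: map each fruit to the sentences that mention it."""
    grouped = {fruit: [] for fruit in fruits}
    for sentence in sentences:
        for fruit in fruits:
            # Substring membership is only True or False, no matter how
            # many times the fruit appears, so we append at most once.
            if fruit.lower() in sentence.lower():
                grouped[fruit].append(sentence)
    return grouped


sentence = 'I enjoyed this grape. I thought the grape was very juicy'
grouped = group_by_fruit(['apple', 'grape'], [sentence])
print(grouped['grape'])
```

Keep in mind this is plain substring matching, so a fruit name that is part of a longer word would also count as a match.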
If your data is stored as a list of lists, the process is very similar:
#The list of fruits
fruits = ['apple', 'banana', 'grape', 'orange']
#The collection of reviews
reviews = [
    ['the banana was delicious','he called the firetruck','I had only half an orange'],
    ['I liked the banana','there was a worm in my apple','Cantaloupes are better then melons'],
    ['It could use some more cheese','the grape and orange was sour']
]
fruitDictionary = {}
#1. Create and store an empty list for every fruit name to store its reviews
for fruit in fruits:
    fruitDictionary[fruit] = []
for customerReviews in reviews:
    #2. For each review,...
    for review in customerReviews:
        #...check each of the fruits to see if they appear.
        for fruit in fruits:
            # If a fruit appears in the comment at all (ignoring case),...
            if fruit.lower() in review.lower():
                #...add the review to that fruit's list
                fruitDictionary[fruit].append(review)
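Since the question asks for a data frame as the final result, here's a rough sketch of the same idea with pandas. I'm assuming pandas is installed and that the input columns are called 'Customer' and 'Review'; those names are my guess from the question, so adjust them to match your actual data.

```python
import pandas as pd

fruits = ['apple', 'banana', 'grape', 'orange']

# Assumed input layout: one row per customer, each holding a list of sentences.
df = pd.DataFrame({
    'Customer': [1, 2, 3],
    'Review': [
        ['the banana was delicious', 'he called the firetruck', 'I had only half an orange'],
        ['I liked the banana', 'there was a worm in my apple', 'Cantaloupes are better then melons'],
        ['It could use some more cheese', 'the grape and orange was sour'],
    ],
})

# Same two steps as above: one empty list per fruit, then fill it.
fruit_reviews = {fruit: [] for fruit in fruits}
for sentences in df['Review']:
    for sentence in sentences:
        for fruit in fruits:
            if fruit.lower() in sentence.lower():
                fruit_reviews[fruit].append(sentence)

# Build the final data frame: one row per fruit, with its list of sentences.
result = pd.DataFrame({
    'Fruit': list(fruit_reviews.keys()),
    'Review': list(fruit_reviews.values()),
})
print(result)
```

Each cell in the Review column of result is a Python list, which matches the layout in the question. Sentences appear in the order they were encountered while looping over the rows.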