
Searching over a list of individual sentences by a specific term in Python

I have a list of terms in Python that looks like this:

Fruit
apple
banana
grape
orange

I also have a data frame whose Review column holds a list of individual sentences, some of which mention one of those fruits. Something similar to this:

Customer     Review
1            ['the banana was delicious','he called the firetruck','I had only half an orange']
2            ['I liked the banana','there was a worm in my apple','Cantaloupes are better then melons']
3            ['It could use some more cheese','the grape and orange was sour']

I want to match each sentence in the Review column with the fruit it mentions and produce a data frame with the result. Something like this:

Fruit     Review
apple     ['there was a worm in my apple']
banana    ['the banana was delicious','I liked the banana']
grape     ['the grape and orange was sour']
orange    ['the grape and orange was sour','I had only half an orange']

How could I go about doing this?


Answer

While the exact answer depends on how you’re storing the data, I think the approach is the same either way:

  1. Create an empty list for every fruit name to hold its reviews
  2. For each review, check whether each fruit appears in it. If a fruit appears in the review at all, add the review to that fruit’s list

Here’s an example of what that would look like:

#The list of fruits
fruits = ['apple', 'banana', 'grape', 'orange']

#The collection of reviews (based on the way it was presented, I'm assuming it was in a dictionary)
reviews = {
    '1':['the banana was delicious','he called the firetruck','I had only half an orange'],
    '2':['I liked the banana','there was a worm in my apple','Cantaloupes are better then melons'],
    '3':['It could use some more cheese','the grape and orange was sour']
}

fruitDictionary = {}
#1. Create and store an empty list for every fruit name to store its reviews
for fruit in fruits:
    fruitDictionary[fruit] = []
for customerReviews in reviews.values():
    #2. For each review,...
    for review in customerReviews:
        #...check each of the fruits to see if they appear.
        for fruit in fruits: 
            # If a fruit appears in the comment at all,...
            if fruit.lower() in review.lower():
                #...add the review to that fruit's list
                fruitDictionary[fruit].append(review) 

Unlike the previous answers, this adds a review such as “I enjoyed this grape. I thought the grape was very juicy” to the grape list only once, because each fruit is tested once per review rather than once per occurrence.
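
One caveat: the in test is a substring check, so 'grape' would also match 'grapefruit'. Below is a minimal sketch of a whole-word, case-insensitive variant using the standard re module. The helper name mentions and the sample sentences are made up for illustration and are not part of the original answer:

```python
import re

# Hypothetical helper: True if `fruit` occurs in `sentence` as a whole word,
# ignoring case. re.escape guards against special characters in the term.
def mentions(fruit, sentence):
    pattern = r'\b' + re.escape(fruit) + r'\b'
    return re.search(pattern, sentence, flags=re.IGNORECASE) is not None

fruits = ['apple', 'banana', 'grape', 'orange']

# Made-up sample sentences, for illustration only
sentences = [
    'I enjoyed this grape. I thought the grape was very juicy',
    'The grapefruit was bitter',
    'Banana bread is great',
]

# Each sentence is tested once per fruit, so repeated mentions add it only once
matches = {fruit: [s for s in sentences if mentions(fruit, s)] for fruit in fruits}
```

Note that a strict whole-word match also skips plurals such as "grapes", so whether this is preferable to the plain substring check depends on your data.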

If your data is stored as a list of lists, the process is very similar:

#The list of fruits
fruits = ['apple', 'banana', 'grape', 'orange']

#The collection of reviews
reviews = [
    ['the banana was delicious','he called the firetruck','I had only half an orange'],
    ['I liked the banana','there was a worm in my apple','Cantaloupes are better then melons'],
    ['It could use some more cheese','the grape and orange was sour']
]

fruitDictionary = {}
#1. Create and store an empty list for every fruit name to store its reviews
for fruit in fruits:
    fruitDictionary[fruit] = []
for customerReviews in reviews:
    #2. For each review,...
    for review in customerReviews:
        #...check each of the fruits to see if they appear.
        for fruit in fruits: 
            # If a fruit appears in the comment at all,...
            if fruit.lower() in review.lower():
                #...add the review to that fruit's list
                fruitDictionary[fruit].append(review) 
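
Since the question asks for a table with one row per fruit, here is a rough sketch of flattening the dictionary into rows and printing them in that layout. The dictionary is written out by hand (assumed to be what the loops above produce for the sample data) so the snippet runs on its own, and the name rows is only illustrative:

```python
# Assumed result of the loops above, written out so this sketch is self-contained
fruitDictionary = {
    'apple': ['there was a worm in my apple'],
    'banana': ['the banana was delicious', 'I liked the banana'],
    'grape': ['the grape and orange was sour'],
    'orange': ['I had only half an orange', 'the grape and orange was sour'],
}

# One (fruit, reviews) row per fruit, skipping fruits that nobody mentioned
rows = [(fruit, matched) for fruit, matched in fruitDictionary.items() if matched]

# Print in the Fruit / Review layout used in the question
print(f"{'Fruit':<10}Review")
for fruit, matched in rows:
    print(f"{fruit:<10}{matched}")
```

If pandas is installed, passing the same rows to pandas.DataFrame(rows, columns=['Fruit', 'Review']) should give an actual data frame.
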
User contributions licensed under: CC BY-SA