Python, NLP: How to find all trigrams from text files with adjectives as the middle term

Question

I think the question is self-explanatory, but here are the details. Using the nltk library, I want to extract all trigrams from text files in which the middle term is an adjective.

Example text - A red ball was with the good boy.

Example of output - and so on

Accepted Answer

This code should do it:

    import nltk
    from nltk.tokenize import word_tokenize

    # Newer NLTK releases may ask for 'punkt_tab' and
    # 'averaged_perceptron_tagger_eng' instead of these resource names.
    nltk.download('punkt')
    nltk.download('averaged_perceptron_tagger')

    text = word_tokenize("He is a very handsome man. Her children are funny. She has a lovely voice")
    text_tags = nltk.pos_tag(text)

    results = list()
    for i, (txt, tag) in enumerate(text_tags):
        # JJ, JJR and JJS are the Penn Treebank adjective tags
        if tag in ["JJ", "JJR", "JJS"]:
            # skip an adjective at the very start or end: it has no neighbour on one side
            if (i > 0) and (i < len(text_tags)-1):
                results.append((text_tags[i-1][0], txt, text_tags[i+1][0]))

    # output: [('very', 'handsome', 'man'), ('are', 'funny', '.'), ('a', 'lovely', 'voice')]
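Since the question asks about text files, here is one possible variation on the answer, offered as a sketch rather than the answerer's own method. The names `adjective_trigrams`, `adjective_trigrams_from_file` and `ADJECTIVE_TAGS` are hypothetical, and the file is assumed to be UTF-8 English text. Unlike the loop in the answer, this version tags one sentence at a time, so a trigram never spans two sentences.

```python
# Penn Treebank tags for adjectives (base, comparative, superlative)
ADJECTIVE_TAGS = {"JJ", "JJR", "JJS"}


def adjective_trigrams(tagged_sentence):
    """Return (previous, adjective, next) word triples from (word, tag) pairs."""
    pairs = list(tagged_sentence)
    return [
        (prev_word, word, next_word)
        for (prev_word, _), (word, tag), (next_word, _)
        in zip(pairs, pairs[1:], pairs[2:])  # sliding window of three tokens
        if tag in ADJECTIVE_TAGS             # keep only adjective-centred windows
    ]


def adjective_trigrams_from_file(path, encoding="utf-8"):
    """Tag a text file sentence by sentence and collect the matching trigrams."""
    import nltk  # imported here so the helper above also works without NLTK

    with open(path, encoding=encoding) as handle:
        text = handle.read()

    results = []
    for sentence in nltk.sent_tokenize(text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sentence))
        results.extend(adjective_trigrams(tagged))
    return results


# Hand-tagged version of the sentence from the question, so no model download is needed
example = [
    ("A", "DT"), ("red", "JJ"), ("ball", "NN"), ("was", "VBD"), ("with", "IN"),
    ("the", "DT"), ("good", "JJ"), ("boy", "NN"), (".", "."),
]
print(adjective_trigrams(example))
# [('A', 'red', 'ball'), ('the', 'good', 'boy')]
```

To process several files, call `adjective_trigrams_from_file` once per path and concatenate the returned lists. The tags actually assigned depend on the NLTK tagger model, so results on real text may differ from a hand-tagged example.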
