
How do I remove one instance of each of x characters from a string and find the words the remaining letters make in Python 3?

This is what I have so far, but I’m stuck. I’m using nltk for the word list and trying to find all the words with the letters in “sand”. From this list I want to find all the words I can make from the remaining letters.

import nltk
from nltk.corpus import words

wordlist = words.words()
pwordlist = []

# Keep every word that contains all four letters of "sand".
for w in wordlist:
    if all(letter in w for letter in 'sand'):
        pwordlist.append(w)

In this case I have to use all the letters to find the words possible. I think this will work for finding the possible words with the remaining letters, but I can’t figure out how to remove only 1 instance of the letters in ‘sand’.

# x: the letters left over after removing 'sand' (I don't know how to get this yet)
puzzle_letters = nltk.FreqDist(x)

[w for w in pwordlist if len(w) == len(x) and nltk.FreqDist(w) == puzzle_letters]


Answer

I would separate the logic into four sections:

  1. A function contains(word, letters), which we’ll use to detect whether a word contains “sand”.
  2. A function subtract(word, letters), which we’ll use to remove “sand” from the word.
  3. A function get_anagrams(word), which finds all of the anagrams of a word.
  4. The main algorithm that combines all of the above to find words that are anagrams of other words once you remove “sand”.

 

from collections import Counter

# TODO: load a list of every English word here,
# e.g. nltk.corpus.words.words() as in the question.
words = []

def contains(word, letters):
    # True if word has every letter in letters, counting duplicates.
    return not Counter(letters) - Counter(word)

def subtract(word, letters):
    # Remove one instance of each letter in letters from word.
    remaining = Counter(word) - Counter(letters)
    return "".join(remaining.elements())

# Index every word by its sorted letters so that anagrams share a key.
anagrams = {}
for word in words:
    base = "".join(sorted(word))
    anagrams.setdefault(base, []).append(word)

def get_anagrams(word):
    return anagrams.get("".join(sorted(word)), [])

for word in words:
    if contains(word, "sand"):
        reduced_word = subtract(word, "sand")
        matches = get_anagrams(reduced_word)
        if matches:
            print(word, matches)
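
To see why the Counter arithmetic handles "only one instance" of each letter, here is a small standalone sketch. It repeats contains and subtract so it runs on its own; the sample words are just illustrations. Counter subtraction drops any letter whose count falls to zero or below, so an empty result from Counter(letters) - Counter(word) means the word covers every required letter.

```python
from collections import Counter

def contains(word, letters):
    # An empty Counter is falsy, so "not" turns "nothing missing" into True.
    return not Counter(letters) - Counter(word)

def subtract(word, letters):
    # Each letter in `letters` removes exactly one occurrence from `word`.
    remaining = Counter(word) - Counter(letters)
    return "".join(remaining.elements())

print(contains("sandal", "sand"))            # every letter of "sand" is present
print(contains("sad", "sand"))               # "n" is missing
print(sorted(subtract("dalesman", "sand")))  # one "a" survives because the word has two
```

Note that subtract returns the leftover letters in whatever order the Counter yields them, which is fine here because get_anagrams sorts them before the lookup.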

Running the above code on the Words With Friends dictionary, I get a lot of results, including:

...
cowhands ['chow']
credentials ['reticle', 'tiercel']
cyanids ['icy']
daftness ['efts', 'fest', 'fets']
dahoons ['oho', 'ooh']
daikons ['koi']
daintiness ['seniti']
daintinesses ['sienites']
dalapons ['opal']
dalesman ['alme', 'lame', 'male', 'meal']
...
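
If you would rather not rely on module-level globals, the same steps can be wrapped in a single function. This is only a sketch: find_matches and the tiny sample list are made-up names for illustration, and in practice you would pass in your full word list (for example the nltk one from the question).

```python
from collections import Counter, defaultdict

def find_matches(words, letters):
    # Group words by their sorted letters so anagrams share one key.
    by_signature = defaultdict(list)
    for w in words:
        by_signature["".join(sorted(w))].append(w)

    need = Counter(letters)
    results = {}
    for w in words:
        have = Counter(w)
        if need - have:
            # Some required letter is missing from this word.
            continue
        # Remove one instance of each required letter, then look up anagrams.
        remaining = "".join(sorted((have - need).elements()))
        if remaining in by_signature:
            results[w] = by_signature[remaining]
    return results

sample = ["dalesman", "lame", "male", "meal", "cowhands", "chow", "icy", "sand"]
for word, matches in find_matches(sample, "sand").items():
    print(word, matches)
```

A word that is exactly an anagram of the letters (such as "sand" itself) leaves nothing behind, so it is skipped unless the word list contains an empty string.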
User contributions licensed under: CC BY-SA