I’m working on my first Python project: a device that reads text via OCR and outputs it in braille. The braille device can only display 6 characters at a time. I’m stuck on how to go through each character in a list of 6-character strings.
For simplicity’s sake, for now I only want to print “this is (insert character)” for every character in each string. In the real device, each character would instead trigger the code that drives two motors to form that braille character: the first two motors for the first character, and the remaining 10 motors for the other 5 characters, with a short delay between each 6-character string. How do I go through each 6-character string and repeat that for the rest of the strings in the list?
Here’s where I’m at so far:
from PIL import Image
import pytesseract

img = Image.open('img file path')
text = [item for item in (pytesseract.image_to_string(img, lang='eng', config='--psm 6')).split('\n')]
oneLineStr = ' '.join(text)
# displays: The quick brown fox jumps over the lazy dog.
print(oneLineStr)

arr6elem = []
for idx in range(0, len(oneLineStr), 6):
    arr6elem.append(oneLineStr[idx:idx + 6])
# displays: ['The qu', 'ick br', 'own fo', 'x jump', 's over', ' the l', 'azy do', 'g.']
print(arr6elem)

# Don't know what to do from this point
# Want to scan each 6-element string in list and for each string, see which elements it consists of
# (capital/lower case characters, numbers, spaces, commas, apostrophes, periods, etc.)
# Then, print "this is a" for letter a, or "this is a colon" for :, etc.
# So that output looks like:
# ["'this is T', 'this is h', 'this is e', 'this is a space', 'this is q', 'this is u'", "'this is i', 'this is c'...]
Answer
A dictionary should do the trick:
punctuation = {
    ' ': 'a space',
    ',': 'a comma',
    "'": 'an apostrophe',
    '.': 'a period'
}

for word in arr6elem:
    for char in word:
        print('This is {}'.format(punctuation.get(char, char)))
Once you’ve built your punctuation dict with all the items you need, the loop looks up each character with punctuation.get(char, char): if the character is a key in the dict, you get its description back; otherwise it defaults to the character itself.
Output:
# This is T
# This is h
# This is e
# This is a space
# This is q
# This is u
# This is i
# This is c
# This is k
# This is a space
# This is b
# This is r
# This is o
# This is w
# This is n
# This is a space
# This is f
# ...
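If you want the messages grouped per 6-character string, as in the output sketched in the question, something like the following might work. It is only a sketch: the names PUNCTUATION, describe_chunk, describe_chunks and delay_seconds are made up for illustration, and the time.sleep call is just a stand-in for whatever pause the motors need between strings.

```python
import time

# Hypothetical lookup table; extend it with every symbol the device must name.
PUNCTUATION = {
    ' ': 'a space',
    ',': 'a comma',
    "'": 'an apostrophe',
    '.': 'a period',
    ':': 'a colon',
}


def describe_chunk(chunk):
    """Return one 'this is ...' message per character in a single string."""
    return ['this is {}'.format(PUNCTUATION.get(char, char)) for char in chunk]


def describe_chunks(chunks, delay_seconds=0.0):
    """Describe every string in order, pausing after each one."""
    described = []
    for chunk in chunks:
        described.append(describe_chunk(chunk))
        time.sleep(delay_seconds)  # assumed placeholder for the motor delay
    return described


arr6elem = ['The qu', 'ick br', 'own fo']
result = describe_chunks(arr6elem)
print(result[0])
# ['this is T', 'this is h', 'this is e', 'this is a space', 'this is q', 'this is u']
```

Keeping one inner list per string means that, later on, the print could be swapped for the motor code while the pause between strings stays in one place.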