How do you scan characters in multi-character multi-string arrays in Python?

I’m working on my first Python project, for a device that reads a string from OCR and outputs braille. The braille device can only output 6 letters at a time. I’m stuck on how to go through each character of every 6-character string in a list of strings.

For simplicity’s sake, for now I only want to print “this is (insert character)” for every character in each string. In the real device, the code would instead tell the first two motors to form the first character in braille, then do the same for the remaining 5 characters with the remaining 10 motors, with a short delay between each 6-character string. How do I go through one 6-character string character by character, and repeat that for the rest of the strings in the list?

Here’s where I’m at so far:

from PIL import Image
import pytesseract


img = Image.open('img file path')
# Split the OCR result into lines on the newline character
text = pytesseract.image_to_string(img, lang='eng', config='--psm 6').split('\n')
oneLineStr = ' '.join(text)
# displays: The quick brown fox jumps over the lazy dog.
print(oneLineStr)

arr6elem = []
for idx in range(0, len(oneLineStr), 6):
    arr6elem.append(oneLineStr[idx:idx + 6])
# displays: ['The qu', 'ick br', 'own fo', 'x jump', 's over', ' the l', 'azy do', 'g.']
print(arr6elem)

# Don't know what to do from this point
# Want to scan each 6-element string in list and for each string, see which elements it consists of
# (capital/lower case characters, numbers, spaces, commas, apostrophes, periods, etc.)
# Then, print "this is a" for letter a, or "this is a colon" for :, etc.
# So that output looks like:
# ["'this is T', 'this is h', 'this is e', 'this is a space', 'this is q', 'this is u'", "'this is i', 'this is c'...]

Answer

A dictionary should do the trick:

punctuation = {
    ' ': 'a space',
    ',': 'a comma',
    "'": 'an apostrophe',
    '.': 'a period'
}

for word in arr6elem:
    for char in word:
        print('This is {}'.format(punctuation.get(char, char)))

Once your punctuation dict contains every character you want to name, punctuation.get(char, char) returns the matching description, or falls back to the character itself when there is no entry for it.

Output:
# This is T
# This is h
# This is e
# This is a space
# This is q
# This is u
# This is i
# This is c
# This is k
# This is a space
# This is b
# This is r
# This is o
# This is w
# This is n
# This is a space
# This is f
# ...
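
The question also asks for the phrases to stay grouped per 6-character string, with a short delay between groups. Here is a minimal sketch of one way to do that with the same dictionary approach. The names describe, describe_chunks and PAUSE_SECONDS are hypothetical (not from the original code), the 0.1 second pause is an arbitrary placeholder, and the print call only stands in for whatever code drives the motors.

```python
import time

# Descriptions for characters that should not be printed as-is.
# Extend this with whatever characters the OCR text can contain.
PUNCTUATION = {
    ' ': 'a space',
    ',': 'a comma',
    "'": 'an apostrophe',
    '.': 'a period',
    ':': 'a colon',
}

# Hypothetical delay between 6-character groups; tune for the real device.
PAUSE_SECONDS = 0.1


def describe(char):
    """Return a phrase such as 'this is T' or 'this is a space'."""
    return 'this is {}'.format(PUNCTUATION.get(char, char))


def describe_chunks(chunks):
    """Return one list of phrases per chunk, keeping the grouping."""
    return [[describe(char) for char in chunk] for chunk in chunks]


# Same chunks as printed in the question.
arr6elem = ['The qu', 'ick br', 'own fo', 'x jump',
            's over', ' the l', 'azy do', 'g.']

for phrases in describe_chunks(arr6elem):
    print(phrases)             # placeholder for driving the motors
    time.sleep(PAUSE_SECONDS)  # short delay before the next group
```

The last chunk can be shorter than 6 characters ('g.' here), so the real device code would need to decide what the unused motors do in that case.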

Answer