How to combine multiple lines of text from Textract blocks into a single string in Python?

As a hobby, I started a project with Amazon Textract, which extracts text from a photo or a PDF. Now I have run into a problem. According to what I read in its docs, every piece of detected text in the photo is returned as a small "block". Printing the blocks one by one works fine, but if I want to send that text somewhere, such as in an email, I need the whole text as a single string. So I need the text of all the blocks stored in a single value that I can use afterwards. I have been stuck on this for a few days. Help appreciated. Thank you.

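import boto3

# bucketName is assumed to be defined elsewhere (the name of the S3 bucket)
def processor(name):
    textract = boto3.client('textract')
    response = textract.detect_document_text(
        Document={
            'S3Object': {
                'Bucket': bucketName,
                'Name': name
            }
        }
    )
    # Print the text of every LINE block in the response
    for item in response["Blocks"]:
        if item["BlockType"] == "LINE":
            print(item["Text"])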


Answer

The one-liner below should do the job. It joins the text of every LINE block into a single string, with the lines separated by spaces:

single_response = ' '.join(item["Text"] for item in response["Blocks"] if item["BlockType"] == "LINE")
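
If you would rather keep the line breaks from the document, or reuse this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: `blocks_to_text` and `sample_response` are names made up for illustration, and the sample dictionary is a hand-written stand-in that assumes the same "Blocks" / "BlockType" / "Text" layout used in the question, not real Textract output.

```python
def blocks_to_text(response, separator="\n"):
    """Join the text of all LINE blocks in a Textract-style response dict."""
    return separator.join(
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    )


# Hand-made stand-in for a response (assumption: same keys as in the question).
# WORD blocks repeat the text of the LINE blocks, so only LINE blocks are joined.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Hello world"},
        {"BlockType": "WORD", "Text": "Hello"},
        {"BlockType": "WORD", "Text": "world"},
        {"BlockType": "LINE", "Text": "Second line"},
    ]
}

full_text = blocks_to_text(sample_response)
print(full_text)
```

In `processor`, you could then `return blocks_to_text(response)` instead of printing, and pass the returned string to whatever sends the email.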
User contributions licensed under: CC BY-SA