I’m loading the following JSON into my program:
d = [ { "extraction_method": "lattice", "top": 18.0, "left": 18.0, "width": 575.682861328125, "height": 756.0, "right": 593.68286, "bottom": 774.0, "data": [ [ { "top": 108.0, "left": 18.0, "width": 575.682861328125, "height": 53.67376708984375, "text": "apple foo text\rhello world" }, ...
I want to extract the substring “apple foo text” with:
print(d[0][0]['data'][0][0]['text'])
But it only prints hello world. I know this is because of the carriage return (\r), but I’m not sure how to print the substring before it. How would I get just the text before the carriage return? Any help would be appreciated.
Answer
To navigate to the string, you’re using:
string = d[0][0]['data'][0][0]['text']
To get the desired substring, split the text on the carriage return character (\r) and take the first piece:
substring = string.split('\r')[0]
print(substring)
# Result is apple foo text
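If it helps to see what is going on, here is a small self-contained sketch. The string still contains both parts; the terminal just moves the cursor back to the start of the line when it hits \r, so "hello world" gets drawn over "apple foo text". The sample JSON below is a trimmed-down, hypothetical version of yours (most fields omitted), and the index path follows that trimmed structure, so it may differ from the path you are using. It also shows str.partition and str.splitlines as alternatives to split.

```python
import json

# Hypothetical sample mirroring the question's structure (most fields omitted).
# The doubled backslash keeps \r as a JSON escape; json.loads decodes it
# into a real carriage return character.
raw = '[{"data": [[{"text": "apple foo text\\rhello world"}]]}]'
d = json.loads(raw)

# Index path for this trimmed sample; adjust it to match your real data.
text = d[0]["data"][0][0]["text"]

# repr() shows the escape sequence instead of letting the terminal act on it,
# which makes the hidden carriage return visible.
print(repr(text))

# partition() splits on the first separator only and returns
# a (before, separator, after) tuple.
before, sep, after = text.partition("\r")
print(before)

# splitlines() treats \r, \n, and \r\n all as line boundaries.
lines = text.splitlines()
print(lines)
```

partition is handy when you only care about the first carriage return, and it returns the whole string as the first element if there is no \r at all, so it won't raise an error on clean input.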