I’m loading the following JSON into my program:
d =
[
{
"extraction_method": "lattice",
"top": 18.0,
"left": 18.0,
"width": 575.682861328125,
"height": 756.0,
"right": 593.68286,
"bottom": 774.0,
"data": [
[
{
"top": 108.0,
"left": 18.0,
"width": 575.682861328125,
"height": 53.67376708984375,
"text": "apple foo text\rhello world"
},
...
I want to extract the substring “apple foo text” with:
print(d[0][0]['data'][0][0]['text'])
But it only prints hello world. I know it is because of the carriage return character (\r), but I’m not sure how to print the substring before it. How would I get just the text before the carriage return? Any help would be appreciated.
Answer
To navigate to the string, you’re using:
string = d[0][0]['data'][0][0]['text']
To get the desired substring, split the text on the carriage return character ('\r') and take the first piece.
substring = string.split('\r')[0]
print(substring) # Result is apple foo text
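For reference, here is a minimal, self-contained sketch of the same idea. The variable text is a hypothetical stand-in for the value read from the JSON, not the actual loaded data. Printing repr(text) can help confirm that the string really contains a carriage return, since a terminal usually moves the cursor back to the start of the line on '\r' and lets later text overwrite earlier text. The sketch also shows two alternatives that should give the same result here: str.partition and str.splitlines.

```python
# Hypothetical stand-in for the string stored under the "text" key.
text = "apple foo text\rhello world"

# repr() shows the raw characters, so the carriage return is visible as \r
# instead of being acted on by the terminal.
print(repr(text))

# Option 1: split on the carriage return (at most once) and keep the first piece.
before_split = text.split("\r", 1)[0]

# Option 2: partition returns (head, separator, tail); keep only the head.
before_partition, _, _ = text.partition("\r")

# Option 3: splitlines() treats \r, \n and \r\n as line boundaries.
# Note: this raises IndexError if text is an empty string.
first_line = text.splitlines()[0]

print(before_split)
```

If the string contains no carriage return at all, split and partition both return the whole string unchanged, so either is safe to use without checking first.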