If this is the HTML of a WhatsApp message ("π how πΉ are you π"), how can I iterate through the elements of this message and print them in their original order using Selenium?
<span dir="ltr" class="i0jNr selectable-text copyable-text">
  <span>
    <img crossorigin="anonymous"
         src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt="π"
         draggable="false" class="b75 emoji wa i0jNr selectable-text copyable-text" data-plain-text="π"
         style="background-position: -60px -40px;">
    " how "
    <img crossorigin="anonymous"
         src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt="πΉ"
         draggable="false" class="b60 emoji wa i0jNr selectable-text copyable-text" data-plain-text="πΉ"
         style="background-position: -60px -40px;">
    " are you"
    <img crossorigin="anonymous"
         src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt="π"
         draggable="false" class="b25 emoji wa i0jNr selectable-text copyable-text" data-plain-text="π"
         style="background-position: -40px -40px;">
  </span>
</span>
The output should be:
π
how
πΉ
are you
π
Alternatively, the output could be a single line like this:
π how πΉ are you π
I tried this:
chats = driver.find_elements_by_class_name("message-in")
for i in range(0, len(chats)):
    messages = chats[i].find_elements_by_class_name("i0jNr")
    for j in range(0, len(messages)):
        if messages[j].text == "":
            emojis = chats[i].find_elements_by_class_name("emoji")
            for emoji in emojis:
                print(emoji.get_attribute('alt'))
            break
        else:
            print(messages[j].text)
This gives the following output, with the text first and the emojis afterwards:
how
are you
π
πΉ
π
So how can I get the elements of this message in the order in which they appear?
Answer
You can iterate over the children of the inner span element and print the text when the child is a string, or the alt text when the child is an img tag.
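from bs4 import BeautifulSoup, NavigableString, Tag

# `html` is assumed to hold the message markup shown in the question,
# e.g. obtained from Selenium via element.get_attribute("outerHTML").
soup = BeautifulSoup(html, 'html.parser')

# The outer span carries the class; the inner span holds the text and emoji nodes.
outer = soup.find('span', attrs={'class': 'i0jNr'})
inner = outer.find('span')
for child in inner.children:
    if isinstance(child, NavigableString):
        text = child.strip()
        if text:  # skip whitespace-only nodes between tags
            print(text)
    elif isinstance(child, Tag):
        # Emoji are rendered as <img>; the character itself is in `alt`.
        print(child.attrs['alt'])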
That is a code sample for your use case. For the message in the question, its output is:
π
how
πΉ
are you
π
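If installing BeautifulSoup is not an option, the same in-order walk can be sketched with only the standard library. This is a minimal sketch, not a definitive implementation: the names `MessageParts` and `message_parts` are hypothetical, the sample markup is a shortened stand-in for the real message, and the emoji in it are placeholders rather than the ones from the question.

```python
from html.parser import HTMLParser


class MessageParts(HTMLParser):
    """Collect text nodes and emoji alt texts in document order."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Each emoji is assumed to be an <img> whose `alt` holds the character.
        if tag == "img":
            alt = (dict(attrs).get("alt") or "").strip()
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        text = data.strip()
        if text:  # ignore whitespace-only nodes between tags
            self.parts.append(text)


def message_parts(html):
    """Return the message's text and emoji pieces in the order they appear."""
    parser = MessageParts()
    parser.feed(html)
    parser.close()
    return parser.parts


# Shortened sample markup with placeholder emoji (hypothetical data).
sample = (
    '<span class="i0jNr"><span>'
    '<img class="emoji" alt="\U0001F600"> how '
    '<img class="emoji" alt="\U0001F339"> are you'
    '<img class="emoji" alt="\U0001F60A">'
    '</span></span>'
)

parts = message_parts(sample)
for part in parts:
    print(part)           # one piece per line
print(" ".join(parts))    # or the whole message on a single line
```

With Selenium, the markup for one message could be passed in as `message_parts(element.get_attribute("innerHTML"))`, assuming `element` is the WebElement for the message's outer span. Parsing the HTML once avoids the ordering problem in the question, because separate `find_elements` calls for text and for emoji each return their own list and lose the relative order.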