
Selenium: how to print the elements of this HTML in the order they appear?

If this is the HTML of a WhatsApp message ("😅 how 👹 are you 🎂"), how can I iterate through the elements of the message with Selenium and print them in the order they appear?

   <span dir="ltr" class="i0jNr selectable-text copyable-text">
    <span>
        <img crossorigin="anonymous"
            src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt="😅"
            draggable="false" class="b75 emoji wa i0jNr selectable-text copyable-text" data-plain-text="😅"
            style="background-position: -60px -40px;">
        " how "
        <img crossorigin="anonymous"
            src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt="👹"
            draggable="false" class="b60 emoji wa i0jNr selectable-text copyable-text" data-plain-text="👹"
            style="background-position: -60px -40px;">
        " are you"
        <img crossorigin="anonymous"
            src="data:image/gif;base64,R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7" alt="🎂"
            draggable="false" class="b25 emoji wa i0jNr selectable-text copyable-text" data-plain-text="🎂"
            style="background-position: -40px -40px;">
    </span>
</span>

The output should be:

😅
 how
👹
 are you
🎂

Alternatively, the output can be on a single line:

😅 how 👹 are you 🎂

I tried this:

chats = driver.find_elements_by_class_name("message-in")
for i in range(0,len(chats)):
    messages = chats[i].find_elements_by_class_name("i0jNr")
    for j in range(0,len(messages)):
        if messages[j].text == "" :        
            emojis = chats[i].find_elements_by_class_name("emoji")
            for emoji in emojis:
                print(emoji.get_attribute('alt'))
                break
        else:
            print(messages[j].text)

This gives the following output, with the text first and the emojis after it:

 how
 are you
😅
👹
🎂

So how can I get the elements in the order they appear?


Answer

You can iterate over the children of the inner span element, printing the text for string nodes and the alt attribute for img tags:

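from bs4 import BeautifulSoup, NavigableString, Tag

# html holds the message markup, e.g. element.get_attribute('outerHTML') from Selenium
soup = BeautifulSoup(html, 'html.parser')

# the outer span carries the i0jNr class; the inner span holds the text and emoji nodes
s = soup.find('span', attrs={'class': 'i0jNr'})
s = s.find('span')
for i in s.children:
    if isinstance(i, NavigableString):
        text = i.strip()
        if text:  # skip whitespace-only nodes between tags
            print(text)
    elif isinstance(i, Tag) and i.name == 'img':
        print(i.get('alt'))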

For the message above, this prints:

😅
how
👹
are you
🎂
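
If installing BeautifulSoup is not an option, the same in-order walk can be sketched with the standard library's html.parser. This is only a sketch under assumptions: the class name MessageParts, the helper message_parts, and the shortened sample markup are made up for illustration and are not part of WhatsApp or Selenium. It collects text nodes and img alt values in document order, so both output formats from the question are possible.

```python
from html.parser import HTMLParser


class MessageParts(HTMLParser):
    """Collect text nodes and emoji alt texts in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Emojis are rendered as <img> tags; the alt attribute holds the character.
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.parts.append(alt)

    def handle_data(self, data):
        # Skip whitespace-only nodes that come from indentation between tags.
        text = data.strip()
        if text:
            self.parts.append(text)


def message_parts(html):
    """Return the message pieces (text and emojis) in the order they appear."""
    parser = MessageParts()
    parser.feed(html)
    parser.close()
    return parser.parts


# Shortened stand-in for the markup in the question (assumption: same nesting).
sample = (
    '<span class="i0jNr"><span>'
    '<img alt="😅" class="emoji"> how '
    '<img alt="👹" class="emoji"> are you'
    '<img alt="🎂" class="emoji">'
    '</span></span>'
)

parts = message_parts(sample)
print("\n".join(parts))  # one piece per line
print(" ".join(parts))   # everything on a single line
```

With Selenium, the html argument could come from something like element.get_attribute("outerHTML") on the message element, so the page itself does not have to be parsed as a whole.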
User contributions licensed under: CC BY-SA