
Selenium: Get the text inside an element but not the text inside its nested tags

Let's say I have an element:

<div class="ProductVariants__PriceContainer-sc-1unev4j-9 jjiIua">
    ₹199 
    <span class="ProductVariants__MRPText-sc-1unev4j-10 jEinXG">
        ₹690
    </span>
    <div class="Product__Dicount">
        No discount available for this product
    </div>
</div>

When I fetch the element by class name:

div_containing_radio = driver.find_element(by=By.XPATH, value="//div[starts-with(@class, 'ProductVariants__RadioButtonInner')]//ancestor::div[starts-with(@class, 'ProductVariants__VariantCard')]")
div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua").text

This gives me

'₹199 ₹690 No discount available for this product'

What I wanted was just ₹199.

Note that I can’t just split the text on spaces and take the first item, because the structure of the page keeps changing.


Answer

Using a little bit of JS:

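js_query = """
            // All child nodes of the price container: text nodes plus the nested <span> and <div>.
            var x = document.querySelector('.ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua').childNodes;
            var l = "";

            // Keep only the direct text nodes; element nodes are skipped.
            x.forEach(i => {
                if (i.nodeName === '#text') {
                    l += ' ' + i.textContent;
                }
            });
            return l;
"""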

price = driver.execute_script(js_query).strip()
print(price)

Output:

₹199

The JS fetches all the child nodes of the target div element, iterates over them, and appends the textContent of each text node to the string variable l. Element nodes, such as the nested span and div, are skipped, so their text never reaches l. The script returns l, and Python strips the leading and trailing whitespace. That’s it.
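One thing to watch: document.querySelector returns the first match on the whole page, while the question locates the price container inside one specific variant card. A minimal sketch of the same idea scoped to an element you have already found is below. It relies on execute_script passing extra arguments to the script as arguments[0], arguments[1], and so on. The helper name own_text and the constant OWN_TEXT_JS are made up for illustration and are not part of Selenium.

```python
# JavaScript that collects only the direct text-node children of the element
# passed in as arguments[0]; nested tags such as <span> or <div> are skipped.
OWN_TEXT_JS = """
    var parts = [];
    arguments[0].childNodes.forEach(function (node) {
        if (node.nodeType === Node.TEXT_NODE) {
            parts.push(node.textContent);
        }
    });
    return parts.join(' ');
"""


def own_text(driver, element):
    """Return the text directly inside `element`, ignoring nested tags.

    `own_text` is a hypothetical helper, not a Selenium API. `driver` is
    assumed to be a Selenium WebDriver and `element` a WebElement that has
    already been located on the page.
    """
    raw = driver.execute_script(OWN_TEXT_JS, element)
    # Collapse the newlines and indentation that surround each text node.
    return " ".join(raw.split())
```

With the elements from the question, the call would look like own_text(driver, div_containing_radio.find_element(by=By.CSS_SELECTOR, value=".ProductVariants__PriceContainer-sc-1unev4j-9.jjiIua")). Because the element comes from Python, the selector can change without touching the JS.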
