XPath: How to check if a tag comes before or after text?

Assume I have the following two example pieces of HTML:

<p>This is some text: <b>ABCD12345</b></p>

<p><b>Name:</b> John Doe</p>

I’m able to separate the <b> and non-<b> parts, but I also want to determine whether the <b> part is at the start or at the end of the text (in other words, whether it has text before it or after it). How can I do that?

I’m using Python (with lxml) if it matters (I don’t think it really does).

Answer

This XPath,

not(/p/b/following-sibling::text())

will return true iff there are no text nodes following b in p, as in your first case:

<p>This is some text: <b>ABCD12345</b></p>

This XPath,

not(/p/b/preceding-sibling::text())

will return true iff there are no text nodes preceding b in p, as in your second case:

<p><b>Name:</b> John Doe</p>
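
Since the question mentions lxml, here is a minimal sketch of evaluating both expressions from Python. It assumes each snippet is parsed on its own with etree.fromstring, so that p is the document root and the absolute path /p/b matches; inside a full HTML document the path would need adjusting (for example //p/b, or a path relative to the p element). The variable names are only illustrative.

```python
from lxml import etree

# Parse each snippet as its own small document, so <p> is the root element.
first = etree.fromstring("<p>This is some text: <b>ABCD12345</b></p>")
second = etree.fromstring("<p><b>Name:</b> John Doe</p>")

# The two expressions from the answer, unchanged.
NO_TEXT_AFTER = "not(/p/b/following-sibling::text())"
NO_TEXT_BEFORE = "not(/p/b/preceding-sibling::text())"

# lxml returns a Python bool for XPath expressions that evaluate to a boolean.
print(first.xpath(NO_TEXT_AFTER), first.xpath(NO_TEXT_BEFORE))    # True False
print(second.xpath(NO_TEXT_AFTER), second.xpath(NO_TEXT_BEFORE))  # False True
```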

If it’s not the absence but the presence of text before/after the b element that’s of interest, you can change the not() to boolean() in those XPath expressions.
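
A sketch of the boolean() variant with lxml follows. Two things here are assumptions that go beyond the answer: the path is written relative to the p element (b/... instead of /p/b/...), and a [normalize-space()] predicate is added so that whitespace-only text nodes, such as indentation in pretty-printed markup, are not counted as text. The helper name describe is hypothetical.

```python
from lxml import etree

# Relative to a <p> element; the predicate drops whitespace-only text nodes.
HAS_TEXT_BEFORE = "boolean(b/preceding-sibling::text()[normalize-space()])"
HAS_TEXT_AFTER = "boolean(b/following-sibling::text()[normalize-space()])"


def describe(markup):
    """Report whether the <b> child of a <p> snippet has text before or after it."""
    p = etree.fromstring(markup)
    return {
        "text_before": p.xpath(HAS_TEXT_BEFORE),
        "text_after": p.xpath(HAS_TEXT_AFTER),
    }


print(describe("<p>This is some text: <b>ABCD12345</b></p>"))
# {'text_before': True, 'text_after': False}

# Indented markup: the whitespace before <b> is ignored by the predicate.
print(describe("<p>\n  <b>Name:</b> John Doe\n</p>"))
# {'text_before': False, 'text_after': True}
```

Without the predicate, the indented second example would report text before the b element, because the newline and spaces form a text node of their own.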

User contributions licensed under: CC BY-SA