
spaCy Matcher returns the right answer only when the two words are set as separate ‘TEXT’ conditions. Why is that?

I’m trying to set up a matcher that finds the phrase ‘iPhone X’.

The sample code says I should write the pattern as below.
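
A minimal sketch of such a two-token pattern, assuming spaCy v3 (where `Matcher.add` takes a key and a list of patterns) and a blank English pipeline. The rule name `IPHONE_X` and the example sentence are illustrative assumptions, not taken from the question.

```python
import spacy
from spacy.matcher import Matcher

# Tokenizer-only English pipeline; no trained model is needed for TEXT matching.
nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# One dictionary per token: the token "iPhone" followed by the token "X".
two_token_pattern = [{"TEXT": "iPhone"}, {"TEXT": "X"}]
matcher.add("IPHONE_X", [two_token_pattern])  # assumed spaCy v3 signature

doc = nlp("Upcoming iPhone X release date leaked")  # illustrative sentence
matched_spans = [doc[start:end].text for _, start, end in matcher(doc)]
print(matched_spans)
```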


I tried another approach, writing the pattern as below.
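
A sketch of what this second attempt presumably looks like, under the same assumptions (spaCy v3, blank English pipeline, illustrative rule name and sentence): both words are placed in a single ‘TEXT’ condition.

```python
import spacy
from spacy.matcher import Matcher

nlp = spacy.blank("en")
matcher = Matcher(nlp.vocab)

# A single dictionary, so the pattern describes exactly one token
# whose text would have to be the whole string "iPhone X".
single_token_pattern = [{"TEXT": "iPhone X"}]
matcher.add("IPHONE_X_SINGLE", [single_token_pattern])

doc = nlp("Upcoming iPhone X release date leaked")  # illustrative sentence
matches = matcher(doc)
print(matches)  # no matches are found for this pattern
```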


Why is the second approach not working? I assumed that if I put the two words ‘iPhone’ and ‘X’ together, it would work the same way, because the matcher would treat the string with a space in the middle as one long, unique word. But it didn’t.

The only reason I can think of is that each matcher condition must be a single word with no spaces. Am I right, or is there another reason the second approach doesn’t work?

Thank you.


Answer

The answer is in how spaCy tokenizes the string:
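
A short sketch that prints the tokens, assuming a blank English pipeline and an illustrative sentence (a trained pipeline such as `en_core_web_sm` should split this text the same way, but that is not verified here):

```python
import spacy

nlp = spacy.blank("en")  # tokenizer only
doc = nlp("Upcoming iPhone X release date leaked")  # illustrative sentence

# The default tokenizer splits on whitespace first, so each word is its own token.
tokens = [token.text for token in doc]
print(tokens)
# ['Upcoming', 'iPhone', 'X', 'release', 'date', 'leaked']
```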


As you can see, iPhone and X are separate tokens. See the Matcher reference:

A pattern added to the Matcher consists of a list of dictionaries. Each dictionary describes one token and its attributes.

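Thus, you cannot put both words in one token definition: a dictionary with ‘TEXT’ set to ‘iPhone X’ is compared against one token at a time, and no single token has that text. Each word needs its own dictionary.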
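
If the goal is to write the phrase as one string, one possible alternative (an assumption about the asker's intent, not part of the answer above) is spaCy's `PhraseMatcher`. It takes `Doc` objects as patterns, so the phrase is tokenized the same way as the text being searched. The sketch assumes spaCy v3, and the rule name and sentence are again illustrative.

```python
import spacy
from spacy.matcher import PhraseMatcher

nlp = spacy.blank("en")
phrase_matcher = PhraseMatcher(nlp.vocab)

# make_doc only runs the tokenizer, turning "iPhone X" into a two-token Doc.
phrase_matcher.add("IPHONE_X", [nlp.make_doc("iPhone X")])

doc = nlp("Upcoming iPhone X release date leaked")  # illustrative sentence
phrase_spans = [doc[start:end].text for _, start, end in phrase_matcher(doc)]
print(phrase_spans)
```

By default `PhraseMatcher` compares the verbatim token text, so this behaves like the two-dictionary ‘TEXT’ pattern while letting the phrase be written as a single string.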
