I’m following the Transformers example for the pretrained model xlm-roberta-large-xnli:
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")
and I get the following error:
ValueError: Couldn't instantiate the backend tokenizer from one of: (1) a `tokenizers` library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
I’m using Transformers version 4.1.1.
Answer
As of the Transformers v4.0.0 release, sentencepiece was removed as a required dependency. This means that “the tokenizers that depend on the SentencePiece library will not be available with a standard transformers installation”, including the XLMRobertaTokenizer used by this model. However, sentencepiece can still be installed as an extra dependency:
pip install transformers[sentencepiece]
or, if you already have transformers installed,

pip install sentencepiece
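
Once sentencepiece is installed (restart the Python session if it was already running), the pipeline should load. Below is a minimal sketch of the zero-shot classifier in use; the sequence and candidate labels are illustrative, not from the original post.

from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="joeddav/xlm-roberta-large-xnli")

# Illustrative inputs; any sequence and label set will do.
sequence = "One day I will see the world."
candidate_labels = ["travel", "cooking", "politics"]

result = classifier(sequence, candidate_labels)
print(result["labels"][0], result["scores"][0])  # top label and its score

The pipeline returns a dict with the keys sequence, labels, and scores, with labels sorted from highest to lowest score.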