I’m following the Transformers pretrained model xlm-roberta-large-xnli example:
from transformers import pipeline
classifier = pipeline("zero-shot-classification",
                      model="joeddav/xlm-roberta-large-xnli")
and I get the following error:
ValueError: Couldn't instantiate the backend tokenizer from one of: (1) a `tokenizers` library serialization file, (2) a slow tokenizer instance to convert or (3) an equivalent slow tokenizer class to instantiate and convert. You need to have sentencepiece installed to convert a slow tokenizer to a fast one.
I’m using Transformers version 4.1.1.
Answer
According to the Transformers v4.0.0 release notes, sentencepiece was removed as a required dependency. This means that "the tokenizers that depend on the SentencePiece library will not be available with a standard transformers installation", including the XLMRobertaTokenizer. However, sentencepiece can be installed as an extra dependency:
pip install transformers[sentencepiece]
or
pip install sentencepiece
if you have transformers already installed.