I am following this tutorial: https://huggingface.co/transformers/training.html. However, I am running into an error. I think the tutorial is missing an import, but I do not know which one.
These are my current imports:
# Transformers installation
! pip install transformers
# To install from source instead of the last release, comment the command above and uncomment the following one.
# ! pip install git+https://github.com/huggingface/transformers.git

! pip install datasets transformers

from transformers import pipeline
Current code:
from datasets import load_dataset

raw_datasets = load_dataset("imdb")

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

inputs = tokenizer(sentences, padding="max_length", truncation=True)
The error:
NameError                                 Traceback (most recent call last)
<ipython-input-9-5a234f114e2e> in <module>()
----> 1 inputs = tokenizer(sentences, padding="max_length", truncation=True)

NameError: name 'sentences' is not defined
Answer
The error says that no variable called sentences exists in the current scope, so this is not a missing import. I believe the tutorial presumes you already have a list of sentences and are tokenizing it.
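One way to get past the NameError is to define sentences yourself before calling the tokenizer. The sketch below is only an illustration: take_texts and tokenize_texts are hypothetical helper names, and toy_split is made-up data that mimics the "text" and "label" columns of the IMDB splits so the snippet runs without downloading anything.

```python
from typing import List


def take_texts(split, n: int = 8) -> List[str]:
    """Return the first n entries of the "text" column as a plain list of strings."""
    return list(split["text"][:n])


def tokenize_texts(tokenizer, texts: List[str]):
    """Call the tokenizer on a list of strings with the arguments from the question."""
    return tokenizer(texts, padding="max_length", truncation=True)


# Made-up stand-in with the same "text"/"label" columns as the IMDB splits.
toy_split = {"text": ["A great movie.", "Not my favourite film."], "label": [1, 0]}

# Defining the variable is what removes the NameError.
sentences = take_texts(toy_split)
print(sentences)
```

In the notebook from the question, where raw_datasets and tokenizer are already defined, sentences = take_texts(raw_datasets["train"]) followed by inputs = tokenize_texts(tokenizer, sentences) should then run without the NameError.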
Have a look at the documentation: the first argument can be a string, a list of strings, or a list of lists of strings.
__call__(text: Union[str, List[str], List[List[str]]], )
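To make the three accepted shapes concrete, here is a small sketch that classifies a value according to that annotation. describe_text_input is a hypothetical helper written for this answer, not part of transformers, and it only checks the Python types; it says nothing about how the tokenizer interprets each shape.

```python
from typing import List, Union

# Mirrors the annotation of the `text` parameter shown above.
TextInput = Union[str, List[str], List[List[str]]]


def describe_text_input(text: TextInput) -> str:
    """Name which of the three annotated shapes `text` has, or raise TypeError."""
    if isinstance(text, str):
        return "single string"
    if isinstance(text, list) and all(isinstance(item, str) for item in text):
        return "list of strings"
    if isinstance(text, list) and all(
        isinstance(item, list) and all(isinstance(word, str) for word in item)
        for item in text
    ):
        return "list of lists of strings"
    raise TypeError(f"unsupported input of type {type(text).__name__}")


print(describe_text_input("One sentence."))
print(describe_text_input(["First sentence.", "Second sentence."]))
print(describe_text_input([["First", "sentence"], ["Second", "sentence"]]))
```

An undefined name such as sentences never reaches this check at all: Python raises the NameError before the tokenizer is called.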