I trained a BertClassifier model using PyTorch. After creating my best.pt, I would like to put my model into production and use it to predict and classify starting from a sample, so I resume it from the checkpoint. Then, after putting it in evaluation mode and freezing the model, I use .predict to make it work on my sample, but I’m
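A minimal sketch of the resume-and-predict pattern being described, using BertForSequenceClassification as a stand-in since the question’s BertClassifier class isn’t shown; the layout of best.pt and the tokenizer name are assumptions:

```python
import torch
from transformers import BertForSequenceClassification, BertTokenizer

# Stand-in for the question's custom BertClassifier (its definition isn't shown)
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Assumes best.pt holds the model's state_dict; adjust if it wraps it in a larger dict
state_dict = torch.load("best.pt", map_location="cpu")
model.load_state_dict(state_dict)

model.eval()                      # disable dropout for inference
for p in model.parameters():      # freeze the weights
    p.requires_grad = False

inputs = tokenizer("a sample sentence to classify", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits        # a plain nn.Module has no .predict(); call forward
    prediction = logits.argmax(dim=-1)
```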
Tag: bert-language-model
Stacking an LSTM layer on top of a BERT encoder in Keras
I have been trying to stack a single LSTM layer on top of BERT embeddings, but while my model starts to train, it fails on the last batch and throws the following error message: This is how I build the model, and I honestly cannot figure out what is going wrong here: This is the full output: The code runs
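For reference, a hedged sketch of the kind of architecture being described (a TFBertModel feeding its token-level hidden states into a single Keras LSTM layer); the sequence length, LSTM units, and output head are illustrative choices, not the asker’s exact code:

```python
import tensorflow as tf
from transformers import TFBertModel

MAX_LEN = 128  # illustrative fixed sequence length

bert = TFBertModel.from_pretrained("bert-base-uncased")

input_ids = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="input_ids")
attention_mask = tf.keras.Input(shape=(MAX_LEN,), dtype=tf.int32, name="attention_mask")

# Token-level embeddings from BERT: shape (batch, MAX_LEN, hidden_size)
sequence_output = bert(input_ids, attention_mask=attention_mask).last_hidden_state
x = tf.keras.layers.LSTM(64)(sequence_output)           # single LSTM layer on top of BERT
output = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model(inputs=[input_ids, attention_mask], outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```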
BERT get sentence embedding
I am replicating code from this page. I have downloaded the BERT model to my local system and am getting sentence embeddings. I have around 500,000 sentences for which I need sentence embeddings, and it is taking a lot of time. Is there a way to expedite the process? Would sending batches of sentences rather than one sentence at a time
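One common way to speed this up is exactly that: tokenize and encode the sentences in batches instead of one at a time. A sketch, assuming plain BERT with mean pooling over the tokens (the model name, batch size, and pooling choice are illustrative):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

sentences = ["first sentence", "second sentence"]   # in practice, the full list of ~500,000
batch_size = 64
all_embeddings = []

with torch.no_grad():
    for i in range(0, len(sentences), batch_size):
        batch = sentences[i:i + batch_size]
        enc = tokenizer(batch, padding=True, truncation=True, return_tensors="pt")
        hidden = model(**enc).last_hidden_state        # (batch, seq_len, hidden_size)
        mask = enc["attention_mask"].unsqueeze(-1)     # ignore padding when pooling
        all_embeddings.append((hidden * mask).sum(1) / mask.sum(1))

embeddings = torch.cat(all_embeddings)                 # (num_sentences, hidden_size)
```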
Load a model as DPRQuestionEncoder in HuggingFace
I would like to load BERT’s weights (or those of whatever transformer) into a DPRQuestionEncoder architecture, such that I can use the HuggingFace save_pretrained method and plug the saved model into the RAG architecture for end-to-end fine-tuning. But I got the following error. I am using the latest version of Transformers. Answer: As already mentioned in the comments, DPRQuestionEncoder does
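Not the accepted answer itself (it is cut off above), but a hedged sketch of one way BERT weights can be copied into a DPRQuestionEncoder, assuming the encoder exposes its backbone as question_encoder.bert_model and that strict=False is enough to skip BERT’s pooler weights:

```python
from transformers import BertModel, DPRConfig, DPRQuestionEncoder

bert = BertModel.from_pretrained("bert-base-uncased")

# Mirror the BERT backbone's dimensions; projection_dim=0 keeps the raw pooled output
config = DPRConfig(
    vocab_size=bert.config.vocab_size,
    hidden_size=bert.config.hidden_size,
    num_hidden_layers=bert.config.num_hidden_layers,
    num_attention_heads=bert.config.num_attention_heads,
    intermediate_size=bert.config.intermediate_size,
    max_position_embeddings=bert.config.max_position_embeddings,
    type_vocab_size=bert.config.type_vocab_size,
    projection_dim=0,
)

dpr = DPRQuestionEncoder(config)
# Copy BERT weights into the DPR backbone; strict=False skips the pooler, which DPR drops
dpr.question_encoder.bert_model.load_state_dict(bert.state_dict(), strict=False)

dpr.save_pretrained("./dpr-question-encoder-from-bert")   # can then be plugged into RAG
```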
Hugging Face: NameError: name ‘sentences’ is not defined
I am following this tutorial here: https://huggingface.co/transformers/training.html – though I am coming across an error, and I think the tutorial is missing an import, but I do not know which one. These are my current imports: Current code: The error: Answer: The error states that you do not have a variable called sentences in scope. I believe the tutorial presumes
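If the missing piece is simply that sentences is never defined, defining any list of strings before the tokenization step resolves the NameError; a minimal sketch (the model name and example strings are placeholders):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# The tutorial's snippet assumes a list of raw strings named `sentences` already exists
sentences = ["Hello, this is the first sentence.", "And here is a second one."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
```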
How do I interpret my BERT output from Huggingface Transformers for Sequence Classification and tensorflow?
Short TL;DR: I am using BERT for a sequence classification task and don’t understand the output I get. This is my first post, so please bear with me: I am using BERT for a sequence classification task with 3 labels. To do this, I am using Hugging Face Transformers with TensorFlow, more specifically the TFBertForSequenceClassification class with the bert-base-german-cased model (yes,
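For context, TFBertForSequenceClassification returns raw, unnormalized logits, one per label; a sketch of how they are usually turned into probabilities and a predicted class (the example sentence is a placeholder):

```python
import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-german-cased")
model = TFBertForSequenceClassification.from_pretrained("bert-base-german-cased", num_labels=3)

inputs = tokenizer(["Ein Beispielsatz."], return_tensors="tf", padding=True)
outputs = model(inputs)

probs = tf.nn.softmax(outputs.logits, axis=-1)   # logits shape: (batch_size, 3)
predicted_label = tf.argmax(probs, axis=-1)
```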
BERT DataLoader: Difference between shuffle=True vs Sampler?
I trained a DistilBERT model with DistilBertForTokenClassification on CoNLL data for predicting NER. Training seems to have completed with no problems, but I have 2 problems during the evaluation phase: (1) I’m getting a negative loss value. (2) During training, I used shuffle=True for the DataLoader, but during evaluation, when I use shuffle=True for the DataLoader, I get very poor metric results (f_1, accuracy, recall, etc.). But
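The usual pattern behind that question: shuffling (or a RandomSampler) only matters for training, while evaluation is normally done with a SequentialSampler (or shuffle=False) so predictions stay aligned with their labels. A sketch with tiny placeholder datasets standing in for the tokenized CoNLL splits:

```python
import torch
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler, TensorDataset

# Placeholder datasets (random token ids and NER tags) standing in for the real splits
train_dataset = TensorDataset(torch.randint(0, 100, (64, 32)), torch.randint(0, 9, (64, 32)))
eval_dataset = TensorDataset(torch.randint(0, 100, (64, 32)), torch.randint(0, 9, (64, 32)))

# Training: visit the examples in a new random order each epoch
train_loader = DataLoader(train_dataset, batch_size=16, sampler=RandomSampler(train_dataset))
# (equivalent to DataLoader(train_dataset, batch_size=16, shuffle=True))

# Evaluation: keep the original order so predictions line up with the gold labels
eval_loader = DataLoader(eval_dataset, batch_size=16, sampler=SequentialSampler(eval_dataset))
```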
Removing the SEP token in BERT for text classification
Given a sentiment classification dataset, I want to fine-tune BERT. As you know, BERT was created to predict the next sentence given the current sentence. Thus, to make the network aware of this, they inserted a [CLS] token at the beginning of the first sentence, then added a [SEP] token to separate the first sentence from the second, and finally
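For a single-sentence classification input there is no second segment, so the tokenizer simply wraps the text as [CLS] … [SEP]; a quick illustration (the model name and sentence are placeholders):

```python
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer("the movie was great")
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
# ['[CLS]', 'the', 'movie', 'was', 'great', '[SEP]']
```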