
How do I check if a tokenizer/model is already saved?

I am using HuggingFace Transformers with PyTorch. My modus operandi is to download a pre-trained model and save it in a local project folder.

While doing so, I can see that a .bin file is saved locally, which holds the model weights. However, I am also downloading and saving a tokenizer, for which I cannot see any associated file.

So, how do I check whether a tokenizer is saved locally before downloading it? Secondly, apart from the usual os.path.isfile(...) check, is there a better way to prefer a local copy at a given location over downloading?


Answer

I’ve used this code in the past for this purpose. You can adapt it to your setting.
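A minimal sketch of one way to do this in Python, assuming the tokenizer and model were stored with save_pretrained() in the same folder. The helper names is_saved and load_tokenizer_and_model are hypothetical, and using tokenizer_config.json and config.json as marker files is a heuristic assumption, not an exhaustive validity check. Passing local_files_only=True tells from_pretrained not to contact the Hub.

```python
import os

# Marker files written by save_pretrained(). Their presence is only a
# heuristic (an assumption of this sketch), not proof of a complete save.
TOKENIZER_MARKER = "tokenizer_config.json"
MODEL_MARKER = "config.json"


def is_saved(local_dir, marker):
    """Return True if local_dir contains the given marker file."""
    return os.path.isfile(os.path.join(local_dir, marker))


def load_tokenizer_and_model(model_name, local_dir):
    """Load from local_dir when present, otherwise download and save there."""
    # Imported lazily so is_saved() can be used without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    if is_saved(local_dir, TOKENIZER_MARKER):
        # local_files_only=True prevents any download attempt.
        tokenizer = AutoTokenizer.from_pretrained(local_dir, local_files_only=True)
    else:
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        tokenizer.save_pretrained(local_dir)

    if is_saved(local_dir, MODEL_MARKER):
        model = AutoModel.from_pretrained(local_dir, local_files_only=True)
    else:
        model = AutoModel.from_pretrained(model_name)
        model.save_pretrained(local_dir)

    return tokenizer, model
```

For example, load_tokenizer_and_model("bert-base-uncased", "./models/bert") would download on the first call and read from ./models/bert afterwards; both the model name and the path here are placeholders. The tokenizer does not produce a .bin file, which is why only the model seems to show up: the tokenizer is stored as JSON and vocabulary files in the same folder.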

User contributions licensed under: CC BY-SA