Add --text-only option to the download script

oobabooga 2023-02-12 00:42:56 -03:00
parent 42cc307409
commit bf9dd8f8ee
2 changed files with 13 additions and 11 deletions

@@ -105,14 +105,12 @@ After downloading the model, follow these steps:
1. Place the files under `models/gpt4chan_model_float16` or `models/gpt4chan_model`.
2. Place GPT-J 6B's config.json file in that same folder: [config.json](https://huggingface.co/EleutherAI/gpt-j-6B/raw/main/config.json).
-3. Download GPT-J 6B under `models/gpt-j-6B`:
+3. Download GPT-J 6B's tokenizer files (they will be automatically detected when you attempt to load GPT-4chan):
```
-python download-model.py EleutherAI/gpt-j-6B
+python download-model.py EleutherAI/gpt-j-6B --text-only
```
-You don't really need all of GPT-J 6B's files, just the tokenizer files, but you might as well download the whole thing. Those files will be automatically detected when you attempt to load GPT-4chan.
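As a rough illustration of what a flag like `--text-only` could do, the sketch below filters a file listing down to small text files (configuration and tokenizer data) and skips the weights. It is not the code from `download-model.py`: the names `TEXT_SUFFIXES`, `select_files`, and `parse_args`, and the choice of `.json` and `.txt` as the "text" extensions, are assumptions made for this example.

```python
import argparse

# Assumption: files with these suffixes count as "text" files.
# The real script may use a different rule.
TEXT_SUFFIXES = (".json", ".txt")


def select_files(filenames, text_only=False):
    """Return the files to download; with text_only, keep only text files."""
    if not text_only:
        return list(filenames)
    return [name for name in filenames if name.endswith(TEXT_SUFFIXES)]


def parse_args(argv=None):
    """Parse a model name plus an optional --text-only switch (hypothetical CLI)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("MODEL", help="model identifier, e.g. EleutherAI/gpt-j-6B")
    parser.add_argument("--text-only", action="store_true",
                        help="download only text files, skipping the weights")
    return parser.parse_args(argv)
```

For example, `select_files(["config.json", "merges.txt", "pytorch_model.bin"], text_only=True)` keeps the first two names and drops the weights file, which is the behaviour step 3 relies on when only the tokenizer files are needed.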
#### Converting to pytorch (optional)
The script `convert-to-torch.py` allows you to convert models to .pt format, which is sometimes 10x faster to load to the GPU: