From a04266161db6994838f014559e65e3e5c394bec9 Mon Sep 17 00:00:00 2001
From: oobabooga <112222186+oobabooga@users.noreply.github.com>
Date: Thu, 25 May 2023 01:23:46 -0300
Subject: [PATCH] Update README.md

---
 README.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 57d4cad..fd5b122 100644
--- a/README.md
+++ b/README.md
@@ -101,7 +101,7 @@ pip install -r requirements.txt
 
 The base installation covers [transformers](https://github.com/huggingface/transformers) models (`AutoModelForCausalLM` and `AutoModelForSeq2SeqLM` specifically) and [llama.cpp](https://github.com/ggerganov/llama.cpp) (GGML) models.
 
-To use 4-bit GPU models, the additional installation steps below are necessary:
+To use GPTQ models, the additional installation steps below are necessary:
 
 [GPTQ models (4 bit mode)](https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md)
 
@@ -223,6 +223,8 @@ Optionally, you can use the following command-line flags:
 
 #### Accelerate 4-bit
 
+⚠️ Not supported on Windows at the moment.
+
 | Flag                                        | Description |
 |---------------------------------------------|-------------|
 | `--load-in-4bit`                            | Load the model with 4-bit precision (using bitsandbytes). |
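The `--load-in-4bit` flag documented in the patched table maps a CLI option onto a bitsandbytes-backed loading path in transformers. As a minimal sketch of that plumbing (the helper `build_model_kwargs` is hypothetical and not part of the repository; the `load_in_4bit` keyword to `from_pretrained` exists in recent transformers releases), the flag might be translated into model-loading keyword arguments like so:

```python
# Hypothetical sketch: translating a --load-in-4bit CLI flag into keyword
# arguments for transformers' from_pretrained(). Helper name is assumed.
import argparse


def build_model_kwargs(args: argparse.Namespace) -> dict:
    """Collect model-loading kwargs from parsed CLI flags."""
    kwargs = {}
    if args.load_in_4bit:
        # bitsandbytes-backed 4-bit quantization in transformers
        kwargs["load_in_4bit"] = True
    return kwargs


parser = argparse.ArgumentParser()
# argparse exposes --load-in-4bit as the attribute `load_in_4bit`
parser.add_argument("--load-in-4bit", action="store_true")

args = parser.parse_args(["--load-in-4bit"])
print(build_model_kwargs(args))  # {'load_in_4bit': True}
```

The resulting dictionary would then be splatted into a call such as `AutoModelForCausalLM.from_pretrained(model_name, **kwargs)`, which is why the README note ties the flag to bitsandbytes availability (and hence the Windows caveat added by this patch).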