Add cache_8bit option
parent 42f816312d
commit c0655475ae
7 changed files with 32 additions and 5 deletions
@@ -337,6 +337,7 @@ Optionally, you can use the following command-line flags:
|`--max_seq_len MAX_SEQ_LEN` | Maximum sequence length. |
|`--cfg-cache` | ExLlama_HF: Create an additional cache for CFG negative prompts. Necessary to use CFG with that loader, but not necessary for CFG with base ExLlama. |
|`--no_flash_attn` | Force flash-attention to not be used. |
|`--cache_8bit` | Use 8-bit cache to save VRAM. |
#### AutoGPTQ
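As a rough illustration of how the flags documented in this hunk might be declared and read, here is a minimal sketch using Python's standard `argparse` module. It is not taken from the commit. The `build_parser` helper, the parser description, and the `--max_seq_len` default of 2048 are assumptions made for the example, and the help strings are shortened copies of the table rows.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags in the documented table."""
    parser = argparse.ArgumentParser(description="Illustrative loader options (sketch)")
    # Default of 2048 is an assumption for this example, not from the commit.
    parser.add_argument("--max_seq_len", type=int, default=2048,
                        help="Maximum sequence length.")
    # argparse stores "--cfg-cache" under the attribute name "cfg_cache".
    parser.add_argument("--cfg-cache", action="store_true",
                        help="ExLlama_HF: create an additional cache for CFG negative prompts.")
    parser.add_argument("--no_flash_attn", action="store_true",
                        help="Force flash-attention to not be used.")
    # The flag this commit documents: a boolean switch that is off by default.
    parser.add_argument("--cache_8bit", action="store_true",
                        help="Use 8-bit cache to save VRAM.")
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(["--cache_8bit", "--max_seq_len", "4096"])
print(args.cache_8bit, args.max_seq_len, args.cfg_cache)
```

With `action="store_true"`, each switch stays `False` unless it is passed on the command line, which matches the opt-in wording of the `--cache_8bit` description.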