Remove flexgen 2

oobabooga 2023-07-25 15:18:25 -07:00
parent 75c2dd38cf
commit 77d2e9f060
4 changed files with 1 addition and 16 deletions

@@ -178,7 +178,7 @@ Optionally, you can use the following command-line flags:
| Flag | Description |
|--------------------------------------------|-------------|
-| `--loader LOADER` | Choose the model loader manually, otherwise, it will get autodetected. Valid options: transformers, autogptq, gptq-for-llama, exllama, exllama_hf, llamacpp, rwkv, flexgen |
+| `--loader LOADER` | Choose the model loader manually, otherwise, it will get autodetected. Valid options: transformers, autogptq, gptq-for-llama, exllama, exllama_hf, llamacpp, rwkv |
#### Accelerate/transformers
@@ -255,14 +255,6 @@ Optionally, you can use the following command-line flags:
| `--warmup_autotune` | (triton) Enable warmup autotune. |
| `--fused_mlp` | (triton) Enable fused mlp. |
-#### FlexGen
-| Flag | Description |
-|------------------|-------------|
-| `--percent PERCENT [PERCENT ...]` | FlexGen: allocation percentages. Must be 6 numbers separated by spaces (default: 0, 100, 100, 0, 100, 0). |
-| `--compress-weight` | FlexGen: Whether to compress weight (default: False).|
-| `--pin-weight [PIN_WEIGHT]` | FlexGen: whether to pin weights (setting this to False reduces CPU memory by 20%). |
#### DeepSpeed
| Flag | Description |