Reorganize model loading UI completely (#2720)
This commit is contained in:
parent
57be2eecdf
commit
7ef6a50e84
16 changed files with 365 additions and 243 deletions
|
@ -211,6 +211,12 @@ Optionally, you can use the following command-line flags:
|
|||
| `--extensions EXTENSIONS [EXTENSIONS ...]` | The list of extensions to load. If you want to load more than one extension, write the names separated by spaces. |
|
||||
| `--verbose` | Print the prompts to the terminal. |
|
||||
|
||||
#### Model loader
|
||||
|
||||
| Flag | Description |
|
||||
|--------------------------------------------|-------------|
|
||||
| `--loader LOADER` | Choose the model loader manually, otherwise, it will get autodetected. Valid options: autogptq, gptq-for-llama, transformers, llamacpp, rwkv, flexgen |
|
||||
|
||||
#### Accelerate/transformers
|
||||
|
||||
| Flag | Description |
|
||||
|
@ -265,7 +271,6 @@ Optionally, you can use the following command-line flags:
|
|||
|
||||
| Flag | Description |
|
||||
|---------------------------|-------------|
|
||||
| `--gptq-for-llama` | Use GPTQ-for-LLaMa to load the GPTQ model instead of AutoGPTQ. |
|
||||
| `--wbits WBITS` | Load a pre-quantized model with specified precision in bits. 2, 3, 4 and 8 are supported. |
|
||||
| `--model_type MODEL_TYPE` | Model type of pre-quantized model. Currently LLaMA, OPT, and GPT-J are supported. |
|
||||
| `--groupsize GROUPSIZE` | Group size. |
|
||||
|
@ -280,7 +285,6 @@ Optionally, you can use the following command-line flags:
|
|||
|
||||
| Flag | Description |
|
||||
|------------------|-------------|
|
||||
| `--flexgen` | Enable the use of FlexGen offloading. |
|
||||
| `--percent PERCENT [PERCENT ...]` | FlexGen: allocation percentages. Must be 6 numbers separated by spaces (default: 0, 100, 100, 0, 100, 0). |
|
||||
| `--compress-weight` | FlexGen: Whether to compress weight (default: False).|
|
||||
| `--pin-weight [PIN_WEIGHT]` | FlexGen: whether to pin weights (setting this to False reduces CPU memory by 20%). |
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue