Add HQQ quant loader (#4888)
---------
Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
parent 64a57d9dc2
commit 674be9a09a
16 changed files with 79 additions and 0 deletions
@@ -305,6 +305,12 @@ List of command-line flags
|-------------|-------------|
| `--model_type MODEL_TYPE` | Model type of pre-quantized model. Currently gpt2, gptj, gptneox, falcon, llama, mpt, starcoder (gptbigcode), dollyv2, and replit are supported. |
#### HQQ
| Flag | Description |
|-------------|-------------|
| `--hqq-backend` | Backend for the HQQ loader. Valid options: PYTORCH, PYTORCH_COMPILE, ATEN. |
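
As a rough illustration of how a flag restricted to these three values could be parsed, here is a minimal sketch using only the standard-library `argparse`. It is not the code added by this commit: the helper name `build_parser`, the case-insensitive handling, and the default value are all assumptions made for the example.

```python
import argparse

# The three backend names listed in the flag description above.
HQQ_BACKENDS = ("PYTORCH", "PYTORCH_COMPILE", "ATEN")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical helper: a parser that exposes only --hqq-backend."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--hqq-backend",
        type=str.upper,             # assumption: accept lowercase input as well
        choices=HQQ_BACKENDS,       # reject anything outside the documented options
        default="PYTORCH_COMPILE",  # assumption: the default is not stated in this diff
        help="Backend for the HQQ loader. Valid options: PYTORCH, PYTORCH_COMPILE, ATEN.",
    )
    return parser


# argparse exposes the flag as args.hqq_backend (dashes become underscores).
args = build_parser().parse_args(["--hqq-backend", "aten"])
print(args.hqq_backend)
```

With `choices` set, an unsupported value such as `--hqq-backend cuda` makes `argparse` exit with a usage error instead of passing an unknown backend name on to the loader.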
#### DeepSpeed
| Flag | Description |