Add HQQ quant loader (#4888)

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
Water 2023-12-18 19:23:16 -05:00 committed by GitHub
parent 64a57d9dc2
commit 674be9a09a
16 changed files with 79 additions and 0 deletions


@@ -305,6 +305,12 @@ List of command-line flags
|-------------|-------------|
| `--model_type MODEL_TYPE` | Model type of pre-quantized model. Currently gpt2, gptj, gptneox, falcon, llama, mpt, starcoder (gptbigcode), dollyv2, and replit are supported. |

#### HQQ

| Flag | Description |
|-------------|-------------|
| `--hqq-backend` | Backend for the HQQ loader. Valid options: PYTORCH, PYTORCH_COMPILE, ATEN. |
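
A minimal sketch of how a flag like `--hqq-backend` could be declared and validated with `argparse`. The flag name and the three backend names come from the table above. The `build_parser` helper, the case-insensitive handling, and the default value are assumptions made for illustration, not the project's actual argument setup.

```python
import argparse

# Backend names as listed in the flag description above.
HQQ_BACKENDS = ("PYTORCH", "PYTORCH_COMPILE", "ATEN")


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; not the project's actual argument setup."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--hqq-backend",
        type=str.upper,             # normalize input such as "aten" before validation
        choices=HQQ_BACKENDS,       # reject anything outside the documented options
        default="PYTORCH_COMPILE",  # assumed default, chosen for illustration only
        help="Backend for the HQQ loader. Valid options: PYTORCH, PYTORCH_COMPILE, ATEN.",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--hqq-backend", "aten"])
    print(args.hqq_backend)  # prints "ATEN"
```

Restricting the value with `choices` makes a mistyped backend name fail at argument parsing time instead of later, when the model is loaded.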
#### DeepSpeed

| Flag | Description |