Add ExLlama support (#2444)
parent dea43685b0
commit 9f40032d32
12 changed files with 156 additions and 47 deletions
docs/ExLlama.md (new file, 16 additions)
@@ -0,0 +1,16 @@
# ExLlama
## About
ExLlama is an extremely optimized GPTQ backend for LLaMA models. It offers much lower VRAM usage and much higher speeds because it does not rely on the unoptimized transformers code path.
## Installation
1) Clone the ExLlama repository into your `repositories` folder:
```
cd repositories
git clone https://github.com/turboderp/exllama
```
2) Follow the remaining setup instructions in the official README: https://github.com/turboderp/exllama#exllama (a sketch of the typical steps is shown below).
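
As a rough guide, the remaining steps usually amount to installing ExLlama's Python dependencies inside the cloned folder. The sketch below assumes the packages listed in the ExLlama README around the time of this commit (`safetensors`, `sentencepiece`, `ninja`); treat the linked README as authoritative, since the CUDA extension is built on first use and the exact requirements may change.

```
# Assumed layout: run from the text-generation-webui root.
# Package names follow the ExLlama README and may change;
# defer to https://github.com/turboderp/exllama#exllama.
cd repositories/exllama
pip install safetensors sentencepiece ninja
```

Once the dependencies are in place, ExLlama should become selectable as the loader for GPTQ LLaMA models in the web UI.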