text-generation-webui/docs/ExLlama.md
2023-06-16 20:35:38 -03:00

# ExLlama
## About
ExLlama is a highly optimized GPTQ backend for LLaMA models. It uses much less VRAM and generates text much faster because it does not rely on the unoptimized transformers code.
## Installation
1) Clone the ExLlama repository into your `repositories` folder:
```
cd repositories
git clone https://github.com/turboderp/exllama
```
2) Follow the remaining setup instructions in the official README: https://github.com/turboderp/exllama#exllama
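
If you expect to re-run step 1 (for example from a setup script), the clone can be wrapped so that it creates `repositories` when it is missing and skips the clone when the checkout already exists. This is only a minimal sketch: the function name `clone_exllama` and its optional argument (the root of your text-generation-webui checkout, defaulting to the current directory) are assumptions for illustration and are not part of the web UI or of ExLlama.

```shell
# Hypothetical helper (not part of text-generation-webui or ExLlama).
# Clones ExLlama into <root>/repositories/exllama unless it is already there.
clone_exllama() {
    # $1: root of the text-generation-webui checkout (assumed; defaults to ".")
    root="${1:-.}"
    target="$root/repositories/exllama"

    # Create the repositories folder if it does not exist yet.
    mkdir -p "$root/repositories"

    if [ -d "$target" ]; then
        # Nothing to do: a previous run already cloned the repository.
        echo "exllama already present at $target"
    else
        git clone https://github.com/turboderp/exllama "$target"
    fi
}
```

Calling `clone_exllama` from the text-generation-webui folder has the same effect as the two commands in step 1. Step 2 still applies afterwards.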