Add ExLlama support (#2444)
parent dea43685b0
commit 9f40032d32
12 changed files with 156 additions and 47 deletions
docs/ExLlama.md (new file, 16 additions)
@@ -0,0 +1,16 @@
# ExLlama
## About
ExLlama is an extremely optimized GPTQ backend for LLaMA models. It offers much lower VRAM usage and much higher speeds because it does not rely on the unoptimized transformers code path.
## Installation
1) Clone the ExLlama repository into your `repositories` folder:
```
cd repositories
git clone https://github.com/turboderp/exllama
```
2) Follow the remaining setup instructions in the official README: https://github.com/turboderp/exllama#exllama (a sketch of the typical steps is shown below).
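
As a rough guide, the remaining steps usually amount to installing ExLlama's Python dependencies inside the cloned folder. The sketch below assumes the packages listed in the ExLlama README around the time of this commit (`safetensors`, `sentencepiece`, `ninja`); treat the linked README as authoritative, since the CUDA extension is built on first use and the exact requirements may change.

```
# Assumed layout: run from the text-generation-webui root.
# Package names follow the ExLlama README and may change;
# defer to https://github.com/turboderp/exllama#exllama.
cd repositories/exllama
pip install safetensors sentencepiece ninja
```

Once the dependencies are in place, ExLlama should become selectable as the loader for GPTQ LLaMA models in the web UI.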