Add RoPE scaling support for transformers (including dynamic NTK)
https://github.com/huggingface/transformers/pull/24653
parent f4caaf337a
commit d8fb506aff
5 changed files with 16 additions and 9 deletions
@@ -299,12 +299,12 @@ Optionally, you can use the following command-line flags:

 | `--rwkv-strategy RWKV_STRATEGY` | RWKV: The strategy to use while loading the model. Examples: "cpu fp32", "cuda fp16", "cuda fp16i8". |
 | `--rwkv-cuda-on` | RWKV: Compile the CUDA kernel for better performance. |

-#### RoPE (for llama.cpp and ExLlama only)
+#### RoPE (for llama.cpp, ExLlama, and transformers)

 | Flag | Description |
 |------------------|-------------|
 | `--compress_pos_emb COMPRESS_POS_EMB` | Positional embeddings compression factor. Should typically be set to max_seq_len / 2048. |
-| `--alpha_value ALPHA_VALUE` | Positional embeddings alpha factor for NTK RoPE scaling. Scaling is not identical to embedding compression. Use either this or compress_pos_emb, not both. |
+| `--alpha_value ALPHA_VALUE` | Positional embeddings alpha factor for NTK RoPE scaling. Use either this or compress_pos_emb, not both. |

 #### Gradio
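The two flags in the table above rescale RoPE in different ways: `--compress_pos_emb` divides position indices (linear interpolation), while `--alpha_value` enlarges the rotary base following the common NTK-aware formula `base * alpha ** (dim / (dim - 2))`. The sketch below is illustrative only — it is not the repository's actual code, and the function names are made up for the example:

```python
import numpy as np

def rope_inverse_freqs(dim: int, base: float = 10000.0,
                       alpha: float = 1.0) -> np.ndarray:
    """Per-pair inverse frequencies for RoPE.

    alpha > 1 applies NTK-aware scaling by enlarging the rotary base
    as base * alpha ** (dim / (dim - 2)) — a hypothetical sketch of the
    scaling the --alpha_value flag refers to, not the repo's implementation.
    """
    scaled_base = base * alpha ** (dim / (dim - 2))
    return 1.0 / scaled_base ** (np.arange(0, dim, 2) / dim)

def rope_angles(position: int, dim: int,
                compress: float = 1.0, alpha: float = 1.0) -> np.ndarray:
    # --compress_pos_emb divides the position index (linear interpolation);
    # --alpha_value rescales the base instead. The table advises using
    # one or the other, not both.
    return (position / compress) * rope_inverse_freqs(dim, alpha=alpha)

# At position 4096 with head dim 128, both methods shrink the rotation
# angles relative to the unscaled baseline, but not identically:
baseline = rope_angles(4096, 128)
compressed = rope_angles(4096, 128, compress=2.0)
ntk = rope_angles(4096, 128, alpha=2.0)
```

This makes the removed sentence's point concrete: compression scales every angle by the same factor, while NTK alpha scaling leaves the lowest-frequency pair nearly untouched and shrinks higher indices progressively more.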