Generalize multimodality (llava/minigpt4 7b and 13b now supported) (#1741)
This commit is contained in:
parent
a2b25322f0
commit
e9e75a9ec7
22 changed files with 812 additions and 371 deletions
|
@ -33,7 +33,7 @@ Most of these have been created by the extremely talented contributors that you
|
|||
|[send_pictures](https://github.com/oobabooga/text-generation-webui/blob/main/extensions/send_pictures/)| Creates an image upload field that can be used to send images to the bot in chat mode. Captions are automatically generated using BLIP. |
|
||||
|[whisper_stt](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/whisper_stt)| Allows you to enter your inputs in chat mode using your microphone. |
|
||||
|[sd_api_pictures](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/sd_api_pictures)| Allows you to request pictures from the bot in chat mode, which will be generated using the AUTOMATIC1111 Stable Diffusion API. See examples [here](https://github.com/oobabooga/text-generation-webui/pull/309). |
|
||||
|[llava](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/llava) | Adds LLaVA multimodal model support. For detailed description see [README.md](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/llava/README.md) in the extension directory. |
|
||||
|[multimodal](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/multimodal) | Adds multimodality support (text+images). For detailed description see [README.md](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/multimodal/README.md) in the extension directory. |
|
||||
|[openai](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/openai)| Creates an API that mimics the OpenAI API and can be used as a drop-in replacement. |
|
||||
|[superbooga](https://github.com/oobabooga/text-generation-webui/tree/main/extensions/superbooga)| An extension that uses ChromaDB to create an arbitrarily large pseudocontext, taking as input text files, URLs, or pasted text. Based on https://github.com/kaiokendev/superbig. |
|
||||
|
||||
|
@ -50,7 +50,8 @@ Most of these have been created by the extremely talented contributors that you
|
|||
| `def bot_prefix_modifier(string)` | Applied in chat mode to the prefix for the bot's reply (more on that below). |
|
||||
| `def custom_generate_reply(...)` | Overrides the main text generation function. |
|
||||
| `def custom_generate_chat_prompt(...)` | Overrides the prompt generator in chat mode. |
|
||||
| `def tokenizer_modifier(state, prompt, input_ids, input_embeds)` | Modifies the `input_ids`/`input_embeds` fed to the model. Should return `prompt`, `input_ids`, `input_embeds`. See `llava` extension for an example |
|
||||
| `def tokenizer_modifier(state, prompt, input_ids, input_embeds)` | Modifies the `input_ids`/`input_embeds` fed to the model. Should return `prompt`, `input_ids`, `input_embeds`. See `multimodal` extension for an example |
|
||||
| `def custom_tokenized_length(prompt)` | Used in conjunction with `tokenizer_modifier`, returns the length in tokens of `prompt`. See `multimodal` extension for an example |
|
||||
|
||||
Additionally, the script may define two special global variables:
|
||||
|
||||
|
@ -78,7 +79,7 @@ input_hijack = {
|
|||
```
|
||||
This is only relevant in chat mode. If your extension sets `input_hijack['state']` to `True` at any moment, the next call to `modules.chat.chatbot_wrapper` will use the values inside `input_hijack['value']` as the user input for text generation. See the `send_pictures` extension above for an example.
|
||||
|
||||
Additionally, your extension can set the value to be a callback, in the form of `def cb(text: str, visible_text: str) -> [str, str]`. See the `llava` extension above for an example.
|
||||
Additionally, your extension can set the value to be a callback, in the form of `def cb(text: str, visible_text: str) -> [str, str]`. See the `multimodal` extension above for an example.
|
||||
|
||||
## The `bot_prefix_modifier`
|
||||
|
||||
|
@ -100,13 +101,22 @@ Marie Antoinette will become very enthusiastic in all her messages.
|
|||
|
||||
In order to use your extension, you must start the web UI with the `--extensions` flag followed by the name of your extension (the folder under `text-generation-webui/extension` where `script.py` resides).
|
||||
|
||||
You can activate more than one extension at a time by providing their names separated by spaces. The input, output and bot prefix modifiers will be applied in the specified order. For `custom_generate_chat_prompt`, only the first declaration encountered will be used and the rest will be ignored.
|
||||
You can activate more than one extension at a time by providing their names separated by spaces. The input, output and bot prefix modifiers will be applied in the specified order.
|
||||
|
||||
|
||||
```
|
||||
python server.py --extensions enthusiasm translate # First apply enthusiasm, then translate
|
||||
python server.py --extensions translate enthusiasm # First apply translate, then enthusiasm
|
||||
```
|
||||
|
||||
Do note, that for:
|
||||
- `custom_generate_chat_prompt`
|
||||
- `custom_generate_reply`
|
||||
- `tokenizer_modifier`
|
||||
- `custom_tokenized_length`
|
||||
|
||||
only the first declaration encountered will be used and the rest will be ignored.
|
||||
|
||||
## `custom_generate_reply` example
|
||||
|
||||
Once defined in a `script.py`, this function is executed in place of the main generation functions. You can use it to connect the web UI to an external API, or to load a custom model that is not supported yet.
|
||||
|
@ -167,7 +177,7 @@ def custom_generate_chat_prompt(user_input, state, **kwargs):
|
|||
|
||||
# Building the prompt
|
||||
i = len(shared.history['internal']) - 1
|
||||
while i >= 0 and len(encode(''.join(rows))[0]) < max_length:
|
||||
while i >= 0 and get_encoded_length(''.join(rows)) < max_length:
|
||||
if _continue and i == len(shared.history['internal']) - 1:
|
||||
rows.insert(1, bot_turn_stripped + shared.history['internal'][i][1].strip())
|
||||
else:
|
||||
|
@ -190,7 +200,7 @@ def custom_generate_chat_prompt(user_input, state, **kwargs):
|
|||
# Adding the Character prefix
|
||||
rows.append(apply_extensions("bot_prefix", bot_turn_stripped.rstrip(' ')))
|
||||
|
||||
while len(rows) > min_rows and len(encode(''.join(rows))[0]) >= max_length:
|
||||
while len(rows) > min_rows and get_encoded_length(''.join(rows)) >= max_length:
|
||||
rows.pop(1)
|
||||
|
||||
prompt = ''.join(rows)
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue