[extension/openai] add edits & image endpoints & fix prompt return in non --chat modes (#1935)

2023-05-11 10:06:39 -04:00 · 2023-05-11 10:06:39 -04:00 · 309b72e549
commit 309b72e549
parent 23d3f6909a
5 changed files with 203 additions and 11 deletions
--- a/extensions/openai/README.md
+++ b/extensions/openai/README.md
@@ -11,6 +11,15 @@ Optional (for flask_cloudflared, embeddings):
 pip3 install -r requirements.txt
 ```

+The server listens on TCP port 5001 by default. You can use the OPENEDAI_PORT environment variable to change this.
+
+To enable the bare-bones image generation (txt2img), set SD_WEBUI_URL to point to your Stable Diffusion API ([Automatic1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui)).
+
+Example:
+```
+SD_WEBUI_URL=http://127.0.0.1:7861
+```
+
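The two environment variables above could be read along these lines. This is a minimal sketch, not the extension's actual code: the helper name `read_settings` and the choice to treat an unset or empty SD_WEBUI_URL as "image generation disabled" are assumptions made for illustration.

```python
import os

DEFAULT_PORT = 5001  # default port stated in the README


def read_settings(env=None):
    """Return (port, sd_webui_url) from an environment mapping.

    `read_settings` is a hypothetical helper, not part of the extension.
    """
    env = os.environ if env is None else env
    port = int(env.get("OPENEDAI_PORT", DEFAULT_PORT))
    # Assumption: an unset or empty SD_WEBUI_URL means image generation is off.
    sd_webui_url = env.get("SD_WEBUI_URL") or None
    return port, sd_webui_url


if __name__ == "__main__":
    port, sd_webui_url = read_settings()
    print(f"port={port} sd_webui_url={sd_webui_url}")
```

Passing a plain dict instead of `os.environ` makes the defaults easy to check without touching the real environment.
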
 ### Embeddings (alpha)

 The embeddings endpoint requires ```sentence-transformers``` to be installed, but chat and completions will function without it loaded. The endpoint currently uses the HuggingFace model ```sentence-transformers/all-mpnet-base-v2```, which produces 768-dimensional embeddings (the same as the text-davinci-002 embeddings). This differs from OpenAI's current default model, ```text-embedding-ada-002```, which produces 1536-dimensional embeddings. The model is small-ish and fast-ish. This model and embedding size may change in the future.
@@ -67,17 +76,22 @@ const api = new ChatGPTAPI({

 ## Compatibility & not so compatibility

-What's working:
-
 | API endpoint | tested with | notes |
 | --- | --- | --- |
 | /v1/models | openai.Model.list() | returns the currently loaded model_name and some mock compatibility options |
 | /v1/models/{id} | openai.Model.get() | returns whatever you ask for, model does nothing yet anyways |
 | /v1/text_completion | openai.Completion.create() | the most tested, only supports single string input so far |
 | /v1/chat/completions | openai.ChatCompletion.create() | depending on the model, this may add leading linefeeds |
+| /v1/edits | openai.Edit.create() | Assumes an instruction-following model, but may work with others |
+| /v1/images/generations | openai.Image.create() | Bare bones, no model configuration, response_format='b64_json' only. |
 | /v1/embeddings | openai.Embedding.create() | Using Sentence Transformer, dimensions are different and may never be directly comparable to openai embeddings. |
 | /v1/moderations | openai.Moderation.create() | does nothing. successfully. |
 | /v1/engines/\*/... completions, embeddings, generate | python-openai v0.25 and earlier | Legacy engines endpoints |
+| /v1/images/edits | openai.Image.create_edit() | not supported |
+| /v1/images/variations | openai.Image.create_variation() | not supported |
+| /v1/audio/\* | openai.Audio.\* | not supported |
+| /v1/files\* | openai.File.\* | not supported |
+| /v1/fine-tunes\* | openai.FineTune.\* | not supported |
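
As a rough illustration of the two new endpoints in the table, the sketch below builds (but does not send) POST requests using only the standard library. The host `127.0.0.1`, the helper name `build_request`, the placeholder model name and the sample prompts are assumptions; the paths, the default port, the dummy API key and `response_format='b64_json'` come from this README, and the JSON field names follow the OpenAI API.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5001"  # assumed host; 5001 is the documented default port


def build_request(path, payload, api_key="dummy"):
    """Build a JSON POST request for the given endpoint (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# /v1/edits: an input text plus an instruction describing the change.
edit_req = build_request(
    "/v1/edits",
    {
        "model": "placeholder",  # assumption: sent for client compatibility only
        "input": "Teh quick brown fox",
        "instruction": "Fix the spelling mistakes",
    },
)

# /v1/images/generations: only response_format='b64_json' is supported.
image_req = build_request(
    "/v1/images/generations",
    {"prompt": "a lighthouse at dusk", "response_format": "b64_json"},
)

# To actually send one (requires the server to be running):
# with urllib.request.urlopen(image_req) as resp:
#     result = json.load(resp)
```
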

 The model name setting is ignored in completions, but you may need to adjust the maximum token length to fit the model (i.e. set it to <2048 tokens instead of 4096, 8k, etc.). To mitigate some of this, the max_tokens value is halved until it is less than truncation_length for the model (typically 2k).
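
The halving rule described above can be sketched as follows. This is an illustration under stated assumptions, not the extension's actual implementation: the function name `clamp_max_tokens`, the use of integer division, and the default of 2048 for truncation_length are all hypothetical.

```python
def clamp_max_tokens(max_tokens, truncation_length=2048):
    """Halve max_tokens until it is less than truncation_length.

    Hypothetical helper; integer halving is an assumption.
    """
    if truncation_length < 1:
        raise ValueError("truncation_length must be at least 1")
    while max_tokens >= truncation_length:
        max_tokens //= 2
    return max_tokens


if __name__ == "__main__":
    for requested in (1000, 4096, 8192):
        print(requested, "->", clamp_max_tokens(requested))
```

Under these assumptions a request for 4096 tokens against a 2048-token model ends up at 1024, because 2048 itself is not less than truncation_length and is halved once more.
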

@@ -99,6 +113,10 @@ Some hacky mappings:

 Defaults are mostly taken from OpenAI, so they differ from the webui defaults. I use the OpenAI defaults where I can and try to scale them to the webui defaults with the same intent.

+### Models
+
+This has been successfully tested with Koala, Alpaca, gpt4-x-alpaca, GPT4all-snoozy, wizard-vicuna, stable-vicuna and Vicuna 1.1, i.e. instruction-following models. If you test with other models, please let me know how it goes. Less than satisfying results (so far): RWKV-4-Raven, llama, mpt-7b-instruct/chat.
+
 ### Applications

 Everything needs OPENAI_API_KEY=dummy set.
@@ -120,4 +138,7 @@ Everything needs OPENAI_API_KEY=dummy set.
 * model changing, esp. something for swapping loras or embedding models
 * consider switching to FastAPI + starlette for SSE (openai SSE seems non-standard)
 * do something about rate limiting or locking requests for completions, most systems will only be able to handle a single request at a time before OOM
-* the whole api, images (stable diffusion), audio (whisper), fine-tunes (training), edits, files, etc.
+
+## Bugs? Feedback? Comments? Pull requests?
+
+Are all appreciated. Please mention @matatonic and I'll try to get back to you as soon as possible.