[extension/openai] add edits & image endpoints & fix prompt return in non --chat modes (#1935)

matatonic 2023-05-11 10:06:39 -04:00 committed by GitHub
parent 23d3f6909a
commit 309b72e549
5 changed files with 203 additions and 11 deletions


@ -11,6 +11,15 @@ Optional (for flask_cloudflared, embeddings):
pip3 install -r requirements.txt
```
It listens on TCP port 5001 by default. You can use the OPENEDAI_PORT environment variable to change this.
To enable the bare-bones image generation (txt2img), set SD_WEBUI_URL to point to your Stable Diffusion API ([Automatic1111](https://github.com/AUTOMATIC1111/stable-diffusion-webui)).
Example:
```
SD_WEBUI_URL=http://127.0.0.1:7861
```
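A minimal sketch of what a client call to the image endpoint might look like, using only the standard library. The request fields (`prompt`, `n`, `size`) and the `data[].b64_json` response shape are assumptions taken from the OpenAI images API that this endpoint imitates; the helper names `build_image_request` and `decode_images` are hypothetical and not part of the extension.

```python
import base64
import json
import urllib.request
from typing import Any, Dict, List


def build_image_request(prompt: str,
                        base_url: str = "http://127.0.0.1:5001/v1",
                        size: str = "512x512") -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/images/generations."""
    body = {
        "prompt": prompt,
        "n": 1,
        "size": size,
        # The only response format listed as supported.
        "response_format": "b64_json",
    }
    return urllib.request.Request(
        f"{base_url}/images/generations",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer dummy"},
        method="POST",
    )


def decode_images(response_body: Dict[str, Any]) -> List[bytes]:
    """Decode every base64 image in an OpenAI-style response (assumed shape)."""
    return [base64.b64decode(item["b64_json"]) for item in response_body["data"]]


if __name__ == "__main__":
    # Requires the extension to be running with SD_WEBUI_URL set.
    request = build_image_request("a watercolor painting of a lighthouse")
    with urllib.request.urlopen(request) as response:
        images = decode_images(json.load(response))
    with open("out.png", "wb") as handle:
        handle.write(images[0])
```

The request is built separately from sending it so the payload can be inspected without a running server.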
### Embeddings (alpha)
Embeddings require ```sentence-transformers``` to be installed, but chat and completions will function without it. The embeddings endpoint currently uses the HuggingFace model ```sentence-transformers/all-mpnet-base-v2```. This produces 768-dimensional embeddings (the same as the text-davinci-002 embeddings), which differs from OpenAI's current default ```text-embedding-ada-002``` model, which produces 1536-dimensional embeddings. The model is small-ish and fast-ish. This model and embedding size may change in the future.
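To illustrate why the dimension difference matters, here is a small sketch of a cosine-similarity helper that refuses to compare vectors of different sizes. The function name `cosine_similarity` is hypothetical; it is not part of the extension, and it only shows that a 768-dimensional vector cannot be scored against a 1536-dimensional one.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors with the same dimensionality."""
    if len(a) != len(b):
        # e.g. 768 (all-mpnet-base-v2) vs 1536 (text-embedding-ada-002)
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm
```

In practice this means an index built with one embedding model likely has to be rebuilt when switching to the other, even if the vectors happened to have the same length.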
@ -67,17 +76,22 @@ const api = new ChatGPTAPI({
## Compatibility & not so compatibility
What's working:
| API endpoint | tested with | notes |
| --- | --- | --- |
| /v1/models | openai.Model.list() | returns the currently loaded model_name and some mock compatibility options |
| /v1/models/{id} | openai.Model.get() | returns whatever you ask for; the model does nothing yet anyway |
| /v1/text_completion | openai.Completion.create() | the most tested, only supports single string input so far |
| /v1/chat/completions | openai.ChatCompletion.create() | depending on the model, this may add leading linefeeds |
| /v1/edits | openai.Edit.create() | Assumes an instruction following model, but may work with others |
| /v1/images/generations | openai.Image.create() | Bare bones, no model configuration, response_format='b64_json' only. |
| /v1/embeddings | openai.Embedding.create() | Using Sentence Transformer, dimensions are different and may never be directly comparable to openai embeddings. |
| /v1/moderations | openai.Moderation.create() | does nothing. successfully. |
| /v1/engines/\*/... completions, embeddings, generate | python-openai v0.25 and earlier | Legacy engines endpoints |
| /v1/images/edits | openai.Image.create_edit() | not supported |
| /v1/images/variations | openai.Image.create_variation() | not supported |
| /v1/audio/\* | openai.Audio.\* | not supported |
| /v1/files\* | openai.Files.\* | not supported |
| /v1/fine-tunes\* | openai.FineTune.\* | not supported |
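As a rough sketch of how one of the endpoints in the table could be called without the `openai` package, the following builds a request for `/v1/edits`. The body fields (`model`, `instruction`, `input`) are assumptions based on the OpenAI edits API, the `"ignored"` model value is a placeholder, and `build_edit_request` is a hypothetical helper.

```python
import json
import os
import urllib.request
from typing import Optional


def build_edit_request(instruction: str, text: str,
                       base_url: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for /v1/edits."""
    if base_url is None:
        # OPENEDAI_PORT changes the listening port; 5001 is the default.
        port = os.environ.get("OPENEDAI_PORT", "5001")
        base_url = f"http://127.0.0.1:{port}/v1"
    body = {"model": "ignored", "instruction": instruction, "input": text}
    return urllib.request.Request(
        f"{base_url}/edits",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Authorization": "Bearer dummy"},
        method="POST",
    )


if __name__ == "__main__":
    # Requires the extension to be running with a model loaded.
    request = build_edit_request("Fix the spelling mistakes", "helo wrld")
    with urllib.request.urlopen(request) as response:
        print(json.load(response))
```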
The model name setting is ignored in completions, but you may need to adjust the maximum token length to fit the model (i.e. set it to <2048 tokens instead of 4096, 8k, etc.). To mitigate some of this, the max_tokens value is halved until it is less than truncation_length for the model (typically 2k).
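The halving rule described above can be sketched as follows. This is an illustration of the stated behaviour, not the extension's actual code; the function name `clamp_max_tokens` and the default of 2048 are assumptions.

```python
def clamp_max_tokens(max_tokens: int, truncation_length: int = 2048) -> int:
    """Halve max_tokens until it is strictly below truncation_length."""
    if truncation_length <= 0:
        raise ValueError("truncation_length must be positive")
    while max_tokens >= truncation_length:
        max_tokens //= 2
    return max_tokens
```

Under this reading, a request for 4096 tokens against a 2048-token model ends up at 1024 rather than 2048, because the loop stops only once the value is strictly less than the limit.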
@ -99,6 +113,10 @@ Some hacky mappings:
Defaults mostly come from OpenAI, so they differ from the webui's. I use the OpenAI defaults where I can and try to scale them to the webui defaults with the same intent.
### Models
This has been successfully tested with Koala, Alpaca, gpt4-x-alpaca, GPT4all-snoozy, wizard-vicuna, stable-vicuna and Vicuna 1.1, i.e. instruction-following models. If you test with other models please let me know how it goes. Less than satisfying results (so far): RWKV-4-Raven, llama, mpt-7b-instruct/chat
### Applications
Everything needs OPENAI_API_KEY=dummy set.
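A small sketch of the environment a client application might need. `OPENAI_API_KEY=dummy` comes from the text above; `OPENAI_API_BASE` is an assumption (it is the variable read by the 0.x `openai` Python package, and other applications may use a different setting), and `client_env` is a hypothetical helper.

```python
import os
from typing import Dict, Optional


def client_env(port: Optional[int] = None, host: str = "127.0.0.1") -> Dict[str, str]:
    """Return environment variables pointing an OpenAI client at the local server."""
    # Fall back to OPENEDAI_PORT, then to the default port 5001.
    resolved_port = port or os.environ.get("OPENEDAI_PORT", "5001")
    return {
        "OPENAI_API_KEY": "dummy",  # any non-empty value; the server ignores it
        "OPENAI_API_BASE": f"http://{host}:{resolved_port}/v1",
    }


if __name__ == "__main__":
    # Apply the settings to the current process before importing a client library.
    os.environ.update(client_env())
    print(os.environ["OPENAI_API_BASE"])
```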
@ -120,4 +138,7 @@ Everything needs OPENAI_API_KEY=dummy set.
* model changing, esp. something for swapping loras or embedding models
* consider switching to FastAPI + starlette for SSE (openai SSE seems non-standard)
* do something about rate limiting or locking requests for completions; most systems will only be able to handle a single request at a time before OOM
* the whole api, images (stable diffusion), audio (whisper), fine-tunes (training), edits, files, etc.
## Bugs? Feedback? Comments? Pull requests?
All are appreciated. Please @matatonic and I'll try to get back to you as soon as possible.