New universal API with streaming/blocking endpoints (#990)
Previous title: Add api_streaming extension and update api-example-stream to use it

* Merge with latest main
* Add parameter capturing encoder_repetition_penalty
* Change some defaults, minor fixes
* Add --api, --public-api flags
* remove unneeded/broken comment from blocking API startup. The comment is already correctly emitted in try_start_cloudflared by calling the lambda we pass in.
* Update on_start message for blocking_api, it should say 'non-streaming' and not 'streaming'
* Update the API examples
* Change a comment
* Update README
* Remove the gradio API
* Remove unused import
* Minor change
* Remove unused import

---------

Co-authored-by: oobabooga <112222186+oobabooga@users.noreply.github.com>
parent 459e725af9
commit 654933c634
12 changed files with 346 additions and 286 deletions
@@ -269,6 +269,13 @@ Optionally, you can use the following command-line flags:
| `--auto-launch` | Open the web UI in the default browser upon launch. |
| `--gradio-auth-path GRADIO_AUTH_PATH` | Set the gradio authentication file path. The file should contain one or more user:password pairs in this format: "u1:p1,u2:p2,u3:p3" |

#### API

| Flag | Description |
|---------------------------------------|-------------|
| `--api` | Enable the API extension. |
| `--public-api` | Create a public URL for the API using Cloudflare. |

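As a rough illustration of what enabling `--api` makes possible, the sketch below builds and sends a JSON request to the blocking (non-streaming) endpoint using only the Python standard library. The URL, port, and payload field names are assumptions that do not appear in this diff, and the helper names `build_generate_request` and `generate` are hypothetical; check the API examples updated by this commit for the real values.

```python
import json
import urllib.request

# Assumed values, not stated in this diff: the host, port, and path of the
# blocking endpoint, and the payload field names used below.
ASSUMED_BLOCKING_URL = "http://localhost:5000/api/v1/generate"


def build_generate_request(prompt, url=ASSUMED_BLOCKING_URL, max_new_tokens=200):
    """Build (but do not send) a JSON POST request for the blocking endpoint."""
    payload = {"prompt": prompt, "max_new_tokens": max_new_tokens}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the decoded JSON reply as a dict.

    Needs a server started with --api listening at the assumed URL.
    """
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a server running, `generate("Hello, my name is")` would return whatever JSON the endpoint replies with; the shape of that reply is not described here, so the sketch returns it unmodified.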
Out of memory errors? [Check the low VRAM guide](docs/Low-VRAM-guide.md).
## Presets