You can use an OpenAI-compatible SDK with the AI Proxy Advanced plugin in multiple ways, depending on the required use case.
| You want to… | Then use… |
|---|---|
| Allow the user to select their target model, based on some header or request parameter | OpenAI SDK with multiple models on the same provider |
| Proxy the same request to an LLM provider of the user's choosing | OpenAI SDK with multiple providers |
| Use the OpenAI SDK for Azure, and allow the user to choose the Azure Deployment ID | Multiple Azure OpenAI deployments on one route |
| Proxy an unsupported model, like Whisper-2 | OpenAI-compatible SDK for unsupported models |
Templated model parameters
The plugin enables you to substitute values in the config.model.name field, and in any config.model.options.* field, with specific placeholders, similar to those in the Request Transformer Advanced templating system.

Available templated parameters:

- $(headers.name)
- $(uri_captures.name)
- $(query_params.name)

where name is the header name, the URI named capture (in the route path), or the query parameter name, respectively.
Use case examples
Use OpenAI SDK with multiple models on the same provider
To let the user choose the model, rather than hard-coding it into the plugin config for each route, you can read it from a path parameter. For example:
- name: openai-chat
  paths:
    - "~/(?<model>[^#?/]+)/chat/completions$"
  methods:
    - POST
  plugins:
    - name: ai-proxy-advanced
      config:
        route_type: "llm/v1/chat"
        auth:
          header_name: "Authorization"
          header_value: "{vault://env/OPENAI_AUTH_HEADER}"
        logging:
          log_statistics: true
          log_payloads: false
        model:
          provider: "openai"
          name: "$(uri_captures.model)"
You can now target different models with an OpenAI-compatible SDK simply by changing the base URL. The api_key value below is a placeholder, since Kong injects the real Authorization header upstream:
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/gpt-4",
    api_key="ignored"
)
The Python SDK (in standard OpenAI mode) expands this to http://localhost:8000/gpt-4/chat/completions, so Kong Gateway matches the route and resolves the model as gpt-4.
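For an end-to-end check, you can then issue a chat completion with that client. This is only a sketch; the prompt is an example, and the model argument is still required by the SDK even though Kong Gateway resolves the model from the URL:

response = client.chat.completions.create(
    model="gpt-4",  # required by the SDK; Kong resolves the model from the URL capture
    messages=[{"role": "user", "content": "Say hello!"}],
)
print(response.choices[0].message.content)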
Use OpenAI SDK with multiple providers
Kong AI Proxy Advanced lets you use the same OpenAI SDK with multiple LLM providers.
For example, set up two routes:
- name: cohere-chat
  paths:
    - "~/cohere/(?<model>[^#?/]+)/chat/completions$"
  methods:
    - POST
  plugins:
    - name: ai-proxy-advanced
      config:
        route_type: "llm/v1/chat"
        auth:
          header_name: "Authorization"
          header_value: "{vault://env/COHERE_AUTH_HEADER}"
        logging:
          log_statistics: true
          log_payloads: false
        model:
          provider: "cohere"
          name: "$(uri_captures.model)"
- name: mistral-chat
  paths:
    - "~/mistral/(?<model>[^#?/]+)/chat/completions$"
  methods:
    - POST
  plugins:
    - name: ai-proxy-advanced
      config:
        route_type: "llm/v1/chat"
        auth:
          header_name: "Authorization"
          header_value: "{vault://env/MISTRAL_AUTH_HEADER}"
        logging:
          log_statistics: true
          log_payloads: false
        model:
          provider: "mistral"
          name: "$(uri_captures.model)"
Now you can select your desired provider, and the model, through the SDK's base URL. Both routes expect a model segment in the path (command and mistral-tiny below are only example model names), and the api_key is a placeholder, since Kong injects the provider credentials upstream:

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8000/cohere/command",
    api_key="ignored"
)

or

client = OpenAI(
    base_url="http://127.0.0.1:8000/mistral/mistral-tiny",
    api_key="ignored"
)
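Because both routes expose the same OpenAI-compatible chat interface, the same request code then works unchanged against either client. A minimal sketch (the prompt is an example, and the model argument should match the model segment in your base_url):

response = client.chat.completions.create(
    model="command",  # match the model segment used in base_url
    messages=[{"role": "user", "content": "Say hello!"}],
)
print(response.choices[0].message.content)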
Use multiple Azure OpenAI deployments on one route
With AI Proxy Advanced, you can create two routes to point to two different deployments of an Azure OpenAI model:
- name: azure-chat-gpt-3-5
  paths:
    - "~/openai/deployments/azure-gpt-3-5/chat/completions$"
  methods:
    - POST
  plugins:
    - name: ai-proxy-advanced
      config:
        route_type: "llm/v1/chat"
        auth:
          header_name: "api-key"
          header_value: "{vault://env/AZURE_AUTH_HEADER}"
        logging:
          log_statistics: true
          log_payloads: false
        model:
          provider: "azure"
          name: "gpt-35-turbo"
          options:
            azure_instance: "my-openai-instance"
            azure_deployment_id: "my-gpt-3-5"
- name: azure-chat-gpt-4
  paths:
    - "~/openai/deployments/azure-gpt-4/chat/completions$"
  methods:
    - POST
  plugins:
    - name: ai-proxy-advanced
      config:
        route_type: "llm/v1/chat"
        auth:
          header_name: "api-key"
          header_value: "{vault://env/AZURE_AUTH_HEADER}"
        logging:
          log_statistics: true
          log_payloads: false
        model:
          provider: "azure"
          name: "gpt-4"
          options:
            azure_instance: "my-openai-instance"
            azure_deployment_id: "my-gpt-4"
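On the client side, a rough sketch using the Azure flavor of the OpenAI Python SDK might look like the following. The api_version and api_key values are placeholders; Kong injects the real api-key header upstream:

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="http://localhost:8000",
    api_version="2024-02-01",  # placeholder version string
    api_key="ignored"          # placeholder; Kong injects the real api-key header
)

# "azure-gpt-3-5" or "azure-gpt-4" selects one of the two routes above
response = client.chat.completions.create(
    model="azure-gpt-3-5",
    messages=[{"role": "user", "content": "Say hello!"}],
)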
If you want to use a single route to proxy multiple models deployed on the same Azure instance, you can use a templated path parameter instead:
- name: azure-chat
  paths:
    - "~/openai/deployments/(?<azure_instance>[^#?/]+)/chat/completions"
  methods:
    - POST
  plugins:
    - name: ai-proxy-advanced
      config:
        route_type: "llm/v1/chat"
        auth:
          header_name: "api-key"
          header_value: "{vault://env/AZURE_AUTH_HEADER}"
        logging:
          log_statistics: true
          log_payloads: false
        model:
          provider: "azure"
          name: "$(uri_captures.azure_instance)"
          options:
            azure_instance: "my-openai-instance"
            azure_deployment_id: "$(uri_captures.azure_instance)"
Now you can set the SDK endpoint to http://localhost:8000. When the deployment segment in the request path is "my-gpt-3-5", the Python SDK produces the URL http://localhost:8000/openai/deployments/my-gpt-3-5/chat/completions, and the request is directed to the corresponding Azure deployment ID and model.
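With the same client setup as in the previous sketch, switching deployments is then just a matter of changing the model argument; for example:

# the captured path segment fills both the model name and azure_deployment_id
response = client.chat.completions.create(
    model="my-gpt-3-5",  # or "my-gpt-4"
    messages=[{"role": "user", "content": "Say hello!"}],
)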
Use an unsupported model
Kong Gateway can make a best-effort attempt to proxy models that do not have built-in to/from format transformers, or that are untested.
Caution: The following use cases are unsupported, but may work depending on your setup. Use at your own discretion.
For the following examples, you must set the route_type to preserve mode.
For example, you could use the Whisper-2 audio transcription model with a route:
- name: openai-any
  paths:
    - "~/openai/(?<op_path>[^#?]+)"
  methods:
    - POST
  plugins:
    - name: ai-proxy-advanced
      config:
        route_type: "preserve"
        auth:
          header_name: "Authorization"
          header_value: "{vault://env/OPENAI_AUTH_HEADER}"
        logging:
          log_statistics: true
          log_payloads: false
        model:
          name: "whisper-2"
          provider: "openai"
          options:
            upstream_path: "$(uri_captures.op_path)"
Now you can POST a file for transcription, using multipart/form-data formatting:
curl --location 'http://localhost:8000/openai/v1/audio/transcriptions' \
--form 'model=whisper-2' \
--form 'file=@"me_saying_hello.m4a"'
The response comes back unaltered:
{
    "text": "Hello!"
}
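The same transcription request can also be sent through the OpenAI Python SDK. A sketch, assuming the base_url includes the /openai/v1 prefix so that the route's op_path capture resolves to v1/audio/transcriptions, with a placeholder api_key since Kong injects the real Authorization header:

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/openai/v1",
    api_key="ignored"
)

with open("me_saying_hello.m4a", "rb") as audio:
    transcript = client.audio.transcriptions.create(model="whisper-2", file=audio)

print(transcript.text)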