Get started with AI Gateway

With Kong’s AI Gateway, you can deploy AI infrastructure for traffic sent to one or more LLMs. A set of AI plugins bundled with Kong Gateway distributions lets you semantically route, secure, observe, accelerate, and govern that traffic.

Kong AI Gateway is a set of AI plugins. To use them, install Kong Gateway and follow the documented configuration instructions for each plugin. The AI plugins are supported in all deployment modes, including Konnect, self-hosted traditional, hybrid, and DB-less, and on Kubernetes via the Kong Ingress Controller.

AI plugins are fully supported by Konnect, both in hybrid mode and as a fully managed service.

You can enable most Kong Gateway AI capabilities with one of the following plugins:

AI Proxy: The open source AI proxy plugin.
AI Proxy Advanced: The enterprise version, which offers more advanced load balancing, routing, and retries.

These plugins enable upstream connectivity to the LLMs and direct Kong Gateway to proxy traffic to the intended models. Once one of these plugins is installed and your AI traffic is being proxied, you can use any other Kong Gateway plugin to add further capabilities.

The main difference between simply placing an LLM’s API behind Kong Gateway and using the AI plugins is that, with the former, you can only interact with internal traffic at the API level. With the AI plugins, Kong can understand the prompts that are sent through the Gateway. The plugins can introspect the request body and provide AI-specific capabilities for your traffic, beyond treating the LLMs as “just APIs”.

Prerequisites

Run Kong Gateway in Konnect, or use your distribution of choice:

The easiest way to get started is to run Kong Gateway for free on Konnect.
To run Kong Gateway locally, use the quickstart script, or see all installation options.

Set up AI Gateway

1. Create an ingress route

Create a service and a route to define the ingress route to consume your LLMs.

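The following examples create the same entities using the Kong Gateway Admin API, the Konnect API, and decK (YAML), in that order.

Kong Gateway Admin API

Create a Gateway service: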

curl -i -X POST http://localhost:8001/services \
  --data name="llm_service" \
  --data url="http://fake.host.internal"

Then, create a route for the service:

curl -i -X POST http://localhost:8001/services/llm_service/routes \
  --data name="openai-llm" \
  --data paths="/openai"

Konnect API

Create a Gateway service:

curl -i -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/ \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer TOKEN" \
  --data '
    {
      "name": "llm_service",
      "url": "http://fake.host.internal"
    }'

Then, create a route for the service, replacing {serviceID} with the ID of the service you just created:

curl -i -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceID}/routes \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer TOKEN" \
  --data '
    {
      "name": "openai-llm",
      "paths": [
        "/openai"
      ]
    }'

Take note of the route ID in the response.

decK (YAML)

Add a service and a route to your decK state file:

services:
  - name: llm_service
    url: http://fake.host.internal
routes:
  - name: openai-llm
    paths:
    - "/openai"
    service:
      name: llm_service

Adding this route entity creates an ingress rule on the /openai path.

These examples use the fake URL http://fake.host.internal as the upstream URL for the service, and you don’t need to replace it with a real one. The service’s upstream URL doesn’t matter: once the AI Proxy plugin (or AI Proxy Advanced) is installed, the upstream destination is determined dynamically from your plugin configuration.
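To confirm that the entities exist before moving on, you can read them back from the Admin API. The following is a minimal sketch that assumes the Admin API is reachable at http://localhost:8001, as in the Admin API examples above. The ADMIN_URL variable and the show_route and list_service_routes helper names are introduced here for illustration only.

```shell
# Assumption: the Admin API listens on localhost:8001, as in the examples above.
ADMIN_URL="${ADMIN_URL:-http://localhost:8001}"

# Fetch the route by name. An HTTP 200 response whose JSON body lists
# "/openai" under "paths" means the route exists.
show_route() {
  curl -s -i "$ADMIN_URL/routes/openai-llm"
}

# List every route attached to the llm_service service.
list_service_routes() {
  curl -s "$ADMIN_URL/services/llm_service/routes"
}
```

Run show_route after creating the route. A 404 response suggests that the route name does not match the one used when the route was created.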

2. Install the AI Proxy plugin

Configure your destination LLM using either AI Proxy or AI Proxy Advanced so that all traffic sent to your route is redirected to the correct LLM.

This example uses the AI Proxy plugin.

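The following examples apply the same plugin configuration using the Kong Gateway Admin API, the Konnect API, and decK (YAML), in that order.

Kong Gateway Admin API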

curl -i -X POST http://localhost:8001/routes/openai-llm/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --data '
  {
    "name": "ai-proxy",
    "config": {
      "route_type": "llm/v1/chat",
      "model": {
        "provider": "openai"
      }
    }
  }'

Konnect API

Replace {routeId} in the URL with the ID of the route created in the previous step:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/routes/{routeId}/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer TOKEN" \
  --data '
  {
    "name": "ai-proxy",
    "config": {
      "route_type": "llm/v1/chat",
      "model": {
        "provider": "openai"
      }
    }
  }'

decK (YAML)

Add an AI Proxy plugin entry to your decK state file:

plugins:
  - name: ai-proxy
    route: openai-llm
    config:
      route_type: "llm/v1/chat"
      model:
        provider: openai

In this simple example, the client can consume all models offered by the openai provider. You can restrict the models that can be consumed by specifying the model name explicitly with the config.model.name parameter. You can also manage the LLM credentials in Kong Gateway itself so that the client doesn’t have to send them.
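As a sketch of both options, the plugin configuration could pin a single model and hold the provider credential in Kong Gateway. This assumes the ai-proxy fields config.model.name, config.auth.header_name, and config.auth.header_value, and it is meant as an alternative to the minimal configuration above rather than a second plugin on the same route. The AI_PROXY_PAYLOAD variable and the apply_ai_proxy helper are illustrative names.

```shell
# Assumptions: the Admin API listens on localhost:8001, and OPENAI_API_KEY
# holds your provider key (a placeholder is used if it is unset).
ADMIN_URL="${ADMIN_URL:-http://localhost:8001}"
OPENAI_API_KEY="${OPENAI_API_KEY:-<your-openai-key>}"

# Pin the model to gpt-4o-mini and let Kong attach the Authorization header.
AI_PROXY_PAYLOAD=$(cat <<EOF
{
  "name": "ai-proxy",
  "config": {
    "route_type": "llm/v1/chat",
    "auth": {
      "header_name": "Authorization",
      "header_value": "Bearer ${OPENAI_API_KEY}"
    },
    "model": {
      "provider": "openai",
      "name": "gpt-4o-mini"
    }
  }
}
EOF
)

# Create the plugin on the openai-llm route with the payload above.
apply_ai_proxy() {
  curl -i -X POST "$ADMIN_URL/routes/openai-llm/plugins" \
    --header "Content-Type: application/json" \
    --data "$AI_PROXY_PAYLOAD"
}
```

With a configuration along these lines, a client would no longer need to send its own Authorization header, because Kong Gateway supplies the credential on the way to the provider.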

3. Validate the connection to the LLM

Make your first request to OpenAI via Kong Gateway:

curl --http1.1 http://localhost:8000/openai \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --data '{
     "model": "gpt-4o-mini",
     "messages": [{"role": "user", "content": "Say this is a test!"}]
   }'

The response body should contain the text This is a test!:

{
  "id": "chatcmpl-AIm1TMhTkcH1sf67GYXIM5fsfu94X9Gdk",
  "object": "chat.completion",
  "created": 1729037867,
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "This is a test! How can I assist you today?",
        "refusal": null
      },
      "logprobs": null,
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 13,
    "completion_tokens": 12,
    "total_tokens": 25,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  },
  "system_fingerprint": "fp_r0bdr52e6e"
}

Now, your traffic is being properly proxied to OpenAI via Kong Gateway and the AI Proxy plugin.

Installing other AI plugins

The AI Proxy and AI Proxy Advanced plugins are able to understand the incoming OpenAI protocol. This allows you to:

Route to all supported LLMs, even the ones that don’t natively support the OpenAI specification, as Kong will automatically transform the request. Write once, and use all LLMs.
Extract observability metrics for AI.
Cache traffic using the AI Semantic Cache plugin.
Secure traffic with the AI Prompt Guard and AI Semantic Prompt Guard plugins.
Provide prompt templates with AI Prompt Template.
Programmatically inject system or assistant prompts to all incoming prompts with the AI Prompt Decorator.
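As an illustration of two of these capabilities, the sketch below adds a system prompt with AI Prompt Decorator and blocks prompts that match a pattern with AI Prompt Guard. It assumes the config.prompts.prepend field of ai-prompt-decorator and the config.deny_patterns field of ai-prompt-guard. The prompt text, the regular expression, and the helper and variable names are placeholders chosen for this example.

```shell
# Assumption: the Admin API listens on localhost:8001 and the openai-llm
# route from step 1 exists.
ADMIN_URL="${ADMIN_URL:-http://localhost:8001}"

# Prepend a system message to every incoming chat request.
DECORATOR_PAYLOAD='{
  "name": "ai-prompt-decorator",
  "config": {
    "prompts": {
      "prepend": [
        { "role": "system", "content": "You are a concise assistant. Answer in one sentence." }
      ]
    }
  }
}'

# Reject prompts that match an example regular expression.
GUARD_PAYLOAD='{
  "name": "ai-prompt-guard",
  "config": {
    "deny_patterns": [".*(password|secret).*"]
  }
}'

# Attach a plugin, given as a JSON payload, to the openai-llm route.
add_route_plugin() {
  curl -i -X POST "$ADMIN_URL/routes/openai-llm/plugins" \
    --header "Content-Type: application/json" \
    --data "$1"
}
```

For example, add_route_plugin "$DECORATOR_PAYLOAD" followed by add_route_plugin "$GUARD_PAYLOAD" would apply both plugins to the route that AI Proxy already serves.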

See all the AI plugins for more capabilities.

For example, you can rate limit AI traffic based on the number of tokens that are being sent (as opposed to the number of API requests) using the AI Rate Limiting Advanced plugin:

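The following examples apply the same plugin configuration using the Kong Gateway Admin API, the Konnect API, and decK (YAML), in that order.

Kong Gateway Admin API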

curl -i -X POST http://localhost:8001/services/llm_service/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --data '
  {
    "name": "ai-rate-limiting-advanced",
    "config": {
      "llm_providers": [
        {
          "name": "openai",
          "limit": 5,
          "window_size": 60
        }
      ]
    }
  }'

Konnect API

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer TOKEN" \
  --data '
  {
    "name": "ai-rate-limiting-advanced",
    "config": {
      "llm_providers": [
        {
          "name": "openai",
          "limit": 5,
          "window_size": 60
        }
      ]
    }
  }'

decK (YAML)

plugins:
  - name: ai-rate-limiting-advanced
    service: llm_service
    config:
      llm_providers:
      - name: openai
        limit: 5
        window_size: 60
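To see the limit take effect, you could send a few requests in a row and watch the status codes. This is a sketch under assumptions: it reuses the proxy address and request body from step 3, expects OPENAI_API_KEY to be set, and assumes that Kong rejects requests with HTTP 429 once the token budget for the window is used up. The PROXY_URL variable, the send_prompt and run_rate_limit_check helpers, and the default count of 5 are illustrative.

```shell
# Assumption: the Kong proxy listens on localhost:8000, as in step 3.
PROXY_URL="${PROXY_URL:-http://localhost:8000}"

# Send one chat request and print only the HTTP status code.
send_prompt() {
  curl --http1.1 -s -o /dev/null -w '%{http_code}\n' "$PROXY_URL/openai" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $OPENAI_API_KEY" \
    --data '{
      "model": "gpt-4o-mini",
      "messages": [{"role": "user", "content": "Say this is a test!"}]
    }'
}

# Repeat the request a given number of times (default 5),
# labelling each status code with its request number.
run_rate_limit_check() {
  count="${1:-5}"
  i=1
  while [ "$i" -le "$count" ]; do
    printf 'request %s: ' "$i"
    send_prompt
    i=$((i + 1))
  done
}
```

Because a single response in step 3 already consumed 25 tokens, a limit of 5 tokens per 60-second window would be expected to reject follow-up requests until the window resets.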

You can also use any other Kong Gateway plugin alongside the AI plugins for advanced access control, authentication and authorization, security, observability, and more.