Set up AI Semantic Cache with OpenAI - Plugin

Prerequisites

OpenAI account and subscription
Redis configured as a vector database
Redis configured as a cache

A service and a route for the LLM provider. You need a service to contain the route for the LLM provider. Create a service first:

curl -X POST http://localhost:8001/services \
--data "name=ai-semantic-cache" \
--data "url=http://localhost:32000"

Remember that the upstream URL can point anywhere empty, as it won’t be used by the plugin.

Then, create a route:

curl -X POST http://localhost:8001/services/ai-semantic-cache/routes \
--data "name=openai-semantic-cache" \
--data "paths[]=~/openai-semantic-cache$"

OpenAI Example

routeで有効にする

Kong Admin API

Konnect API

Kubernetes

Declarative (YAML)

Konnect Terraform

次のリクエストを行います。

curl -X POST http://localhost:8001/routes/{routeName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
    {
  "name": "ai-semantic-cache",
  "config": {
    "embeddings": {
      "auth": {
        "header_name": "Authorization",
        "header_value": "Bearer OPENAI_API_KEY"
      },
      "model": {
        "provider": "openai",
        "name": "text-embedding-3-large",
        "options": {
          "upstream_url": "https://api.openai.com/v1/embeddings"
        }
      }
    },
    "vectordb": {
      "dimensions": 3072,
      "distance_metric": "cosine",
      "strategy": "redis",
      "threshold": 0.1,
      "redis": {
        "host": "redis-stack.redis.svc.cluster.local",
        "port": 6379
      }
    }
  }
}
    '

ROUTE_NAME

IDを、このプラグイン構成が対象とするルートのid またはnameに置き換えてください。

独自のアクセストークン、リージョン、コントロールプレーン（CP）ID、ルートIDを代入して、次のリクエストをしてください。

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/routes/{routeId}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-semantic-cache","config":{"embeddings":{"auth":{"header_name":"Authorization","header_value":"Bearer OPENAI_API_KEY"},"model":{"provider":"openai","name":"text-embedding-3-large","options":{"upstream_url":"https://api.openai.com/v1/embeddings"}}},"vectordb":{"dimensions":3072,"distance_metric":"cosine","strategy":"redis","threshold":0.1,"redis":{"host":"redis-stack.redis.svc.cluster.local","port":6379}}}}'

地域固有のURLと個人アクセストークンの詳細については、 Konnect API referenceをご参照ください。

まず、KongPlugin リソースを作成します：

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-semantic-cache-example
plugin: ai-semantic-cache
config:
  embeddings:
    auth:
      header_name: Authorization
      header_value: Bearer OPENAI_API_KEY
    model:
      provider: openai
      name: text-embedding-3-large
      options:
        upstream_url: https://api.openai.com/v1/embeddings
  vectordb:
    dimensions: 3072
    distance_metric: cosine
    strategy: redis
    threshold: 0.1
    redis:
      host: redis-stack.redis.svc.cluster.local
      port: 6379
" | kubectl apply -f -

次に、次のようにingressに注釈を付けて、KongPluginリソースをイングレスに適用します。

kubectl annotate ingress INGRESS_NAME konghq.com/plugins=ai-semantic-cache-example

INGRESS_NAMEを、このプラグイン構成がターゲットとするイングレス名に置き換えます。 kubectl get ingressを実行すると、利用可能なイングレスを確認できます。

注： KongPluginリソースは一度だけ定義するだけで、ネームスペース内の任意のサービス、コンシューマー、またはルートに適用できます。プラグインをクラスター全体で利用可能にしたい場合は、KongPluginの代わりにKongClusterPluginとしてリソースを作成してください。

このセクションを宣言型構成ファイルに追加します。

plugins:
- name: ai-semantic-cache
  route: ROUTE_NAME|ID
  config:
    embeddings:
      auth:
        header_name: Authorization
        header_value: Bearer OPENAI_API_KEY
      model:
        provider: openai
        name: text-embedding-3-large
        options:
          upstream_url: https://api.openai.com/v1/embeddings
    vectordb:
      dimensions: 3072
      distance_metric: cosine
      strategy: redis
      threshold: 0.1
      redis:
        host: redis-stack.redis.svc.cluster.local
        port: 6379

ROUTE_NAME

IDを、このプラグイン構成が対象とするルートのid またはnameに置き換えてください。

前提条件： パーソナルアクセストークンの設定

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Kong Konnectゲートウェイプラグインを作成するには、Terraform 構成に以下を追加します。

resource "konnect_gateway_plugin_ai_semantic_cache" "my_ai_semantic_cache" {
  enabled = true

  config = {
    embeddings = {
      auth = {
        header_name = "Authorization"
        header_value = "Bearer OPENAI_API_KEY"
      }
      model = {
        provider = "openai"
        name = "text-embedding-3-large"
        options = {
          upstream_url = "https://api.openai.com/v1/embeddings"
        }
      }
    }
    vectordb = {
      dimensions = 3072
      distance_metric = "cosine"
      strategy = "redis"
      threshold = 0.1
      redis = {
        host = "redis-stack.redis.svc.cluster.local"
        port = 6379
      }
    }
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  route = {
    id = konnect_gateway_route.my_route.id
  }
}

This configures the following:

embeddings.auth.header_value: The API key for OpenAI. This uses OpenAI’s API Key explicitly, but you can use an environment variable instead if you want.
model.provider: The model provider you want to use. In this example, OpenAI.
model.name: The AI model to use for generating embeddings. This example is configured with text-embedding-3-large, but you can also choose text-embedding-3-small for OpenAI.
model.options.upstream_url: The upstream URL for the LLM provider.
vectordb.dimensions: The dimensionality for the vectors. Since this example uses text-embedding-3-large, OpenAI uses 3072 as the default embedding dimension.
vectordb.distance_metric: The distance metric to use for vectors. This example uses cosine because OpenAI recommends it.
vectordb.strategy: Defines the vector database, in this case, Redis.
vectordb.threshold: Defines the similarity threshold for accepting semantic search results. In the example, this is configured to as a low threshold, meaning it would include results that are only somewhat similar.
vectordb.redis.host: The host of your vector database.
vectordb.redis.port: The port to use for your vector database.
config.embeddings.name: The AI model to use for generating embeddings. This example is configured with text-embedding-3-large, but you can also choose text-embedding-3-small for OpenAI.

This uses OpenAI’s API Key explicitly, but you can use an environment variable instead if you want.

More information

Redis Documentation: Vectors - Learn how to use vector fields and perform vector searches in Redis
Redis Documentation: How to Perform Vector Similarity Search Using Redis in NodeJS

前へ Set up AI Semantic Cache with Mistral

次へ AI Semantic Cache Changelog