Groundfloor Docs

LLM Gateway

Workspace-scoped LLM virtual keys and LiteLLM proxy usage.

Audience: App developers building AI features
Base path: /v1/workspaces/{workspace_id}/llm
Auth: Keycloak bearer token + workspace permissions
Inference path: App → LiteLLM directly (Control Plane is not in the data path)

Control Plane mints workspace virtual keys, exposes the model catalog, and records Compute Unit (CU) usage via webhooks. Your app calls LiteLLM with an OpenAI-compatible client.


Architecture

1. Admin: POST /v1/workspaces/{id}/llm/virtual-key  (administer)
2. Store key in Secrets as LITELLM_API_KEY
3. App: POST {LITELLM_URL}/v1/chat/completions
         Authorization: Bearer {LITELLM_API_KEY}
4. LiteLLM → webhook → Control Plane → billing events

Control Plane endpoints

| Method | Path | Permission | Purpose |
| --- | --- | --- | --- |
| GET | /v1/workspaces/{id}/llm/models | read | Groundfloor catalog + BYO models |
| GET | /v1/workspaces/{id}/llm/usage | read | Recent CU rollup |
| POST | /v1/workspaces/{id}/llm/virtual-key | administer | Mint or rotate virtual key (shown once) |

Step 1 — Mint a virtual key

POST /v1/workspaces/{workspace_id}/llm/virtual-key
Authorization: Bearer {keycloak_access_token}

Response:

{ "virtual_key": "sk-…" }

The key is returned once. Store it immediately:

PUT /v1/workspaces/{workspace_id}/secrets/LITELLM_API_KEY
Authorization: Bearer {keycloak_access_token}
Content-Type: application/json

{ "value": "sk-…", "description": "LiteLLM workspace virtual key" }

Requires administer on the workspace (workspace owner / admin role).
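The mint-then-store sequence above can be sketched in TypeScript as follows. This is a minimal sketch, not a reference client: the helper names, the `controlPlaneUrl` parameter, and the `HttpRequest` type are hypothetical, while the paths, headers, and JSON bodies are taken from the two requests shown in this step.

```typescript
// Hypothetical request shape used only by this sketch.
interface HttpRequest {
  method: "POST" | "PUT";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Workspace-scoped base path; `controlPlaneUrl` is an assumed Control Plane origin.
function workspaceBase(controlPlaneUrl: string, workspaceId: string): string {
  const origin = controlPlaneUrl.replace(/\/+$/, "");
  return `${origin}/v1/workspaces/${encodeURIComponent(workspaceId)}`;
}

// POST …/llm/virtual-key: mints or rotates the workspace virtual key.
function buildMintRequest(controlPlaneUrl: string, workspaceId: string, token: string): HttpRequest {
  return {
    method: "POST",
    url: `${workspaceBase(controlPlaneUrl, workspaceId)}/llm/virtual-key`,
    headers: { Authorization: `Bearer ${token}` },
  };
}

// PUT …/secrets/LITELLM_API_KEY: stores the key, which is only shown once.
function buildStoreKeyRequest(
  controlPlaneUrl: string,
  workspaceId: string,
  token: string,
  virtualKey: string,
): HttpRequest {
  return {
    method: "PUT",
    url: `${workspaceBase(controlPlaneUrl, workspaceId)}/secrets/LITELLM_API_KEY`,
    headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
    body: JSON.stringify({ value: virtualKey, description: "LiteLLM workspace virtual key" }),
  };
}

// Sends a request and fails loudly on any non-2xx status.
async function send(req: HttpRequest): Promise<Response> {
  const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
  if (!res.ok) throw new Error(`${req.method} ${req.url} failed with status ${res.status}`);
  return res;
}

// Mint, then store immediately so the one-time key is never lost or logged.
async function mintAndStoreKey(controlPlaneUrl: string, workspaceId: string, token: string): Promise<void> {
  const minted = await send(buildMintRequest(controlPlaneUrl, workspaceId, token));
  const { virtual_key } = (await minted.json()) as { virtual_key: string };
  await send(buildStoreKeyRequest(controlPlaneUrl, workspaceId, token, virtual_key));
}
```

Keeping the request builders separate from `send` makes the URL and body construction easy to check without touching the network.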


Step 2 — List models

GET /v1/workspaces/{workspace_id}/llm/models
Authorization: Bearer {keycloak_access_token}

Returns Groundfloor default catalog entries plus BYO models discovered from LiteLLM when provider keys exist in Secrets.

Default Groundfloor model ids (use these in LiteLLM calls):

| gf_id | Role | Upstream (via LiteLLM) |
| --- | --- | --- |
| gf-chat-default | chat | openai/gpt-4o-mini |
| gf-chat-pro | chat | anthropic/claude-sonnet-4-5 |
| gf-code-default | code | anthropic/claude-sonnet-4-5 |
| gf-embed-default | embedding | openai/text-embedding-3-small |
| gf-vision-default | vision | openai/gpt-4o |

Pass gf_id as the model field when calling LiteLLM — routing resolves to the upstream provider.
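As an illustration of choosing a `gf_id` by role, the sketch below hardcodes a snapshot of the default catalog listed above. The type and function names are hypothetical, and the live GET …/llm/models response remains the authoritative source; this table copy is only an assumption for offline lookups.

```typescript
type ModelRole = "chat" | "code" | "embedding" | "vision";

interface CatalogEntry {
  gfId: string;     // value to pass as `model` when calling LiteLLM
  role: ModelRole;
  upstream: string; // provider/model that LiteLLM routes to
}

// Snapshot of the documented defaults; may drift from the live catalog.
const DEFAULT_CATALOG: readonly CatalogEntry[] = [
  { gfId: "gf-chat-default", role: "chat", upstream: "openai/gpt-4o-mini" },
  { gfId: "gf-chat-pro", role: "chat", upstream: "anthropic/claude-sonnet-4-5" },
  { gfId: "gf-code-default", role: "code", upstream: "anthropic/claude-sonnet-4-5" },
  { gfId: "gf-embed-default", role: "embedding", upstream: "openai/text-embedding-3-small" },
  { gfId: "gf-vision-default", role: "vision", upstream: "openai/gpt-4o" },
];

// Returns every default gf_id registered for a role, in table order.
function defaultModelsForRole(role: ModelRole): string[] {
  return DEFAULT_CATALOG.filter((entry) => entry.role === role).map((entry) => entry.gfId);
}

// Resolves a gf_id to its upstream model, or undefined if it is not a default.
function upstreamFor(gfId: string): string | undefined {
  return DEFAULT_CATALOG.find((entry) => entry.gfId === gfId)?.upstream;
}
```

Application code only ever sends the `gf_id`; `upstreamFor` is useful for logging or debugging which provider a call is expected to reach.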


Step 3 — Call LiteLLM (OpenAI-compatible)

Local dev (deploy/PHASE2-DEPS.md):

LITELLM_URL=http://localhost:4000
curl -s "${LITELLM_URL}/v1/chat/completions" \
  -H "Authorization: Bearer ${LITELLM_API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gf-chat-default",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'

TypeScript (any OpenAI SDK):

import OpenAI from "openai";

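const client = new OpenAI({
  // LITELLM_URL is the proxy root without /v1 (as in the curl example); the SDK base URL includes /v1.
  baseURL: `${process.env.LITELLM_URL ?? "http://localhost:4000"}/v1`,
  apiKey: process.env.LITELLM_API_KEY, // from Secrets, server-side only
});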

const completion = await client.chat.completions.create({
  model: "gf-chat-default",
  messages: [{ role: "user", content: "Hello" }],
});

Production LITELLM_URL is environment-specific — inject via Shell/host env, not the federated bundle.
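Because LITELLM_URL is injected per environment, its exact form (with or without a trailing slash or a /v1 suffix) may vary. One way to guard against that is a small normalizer applied before the value is handed to an OpenAI-compatible client. The function name below is hypothetical, and the rule that the base URL should end in /v1 is an assumption based on the examples in this step.

```typescript
// Normalizes a LiteLLM proxy URL into an OpenAI-style base URL ending in /v1.
function toOpenAiBaseUrl(litellmUrl: string): string {
  // Drop surrounding whitespace and any trailing slashes.
  const trimmed = litellmUrl.trim().replace(/\/+$/, "");
  // Append /v1 only when the URL does not already end with it.
  return trimmed.endsWith("/v1") ? trimmed : `${trimmed}/v1`;
}
```

For example, both `http://localhost:4000` and `http://localhost:4000/v1/` normalize to `http://localhost:4000/v1`.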


BYO provider keys

Customers can add their own provider keys to Secrets (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY). LiteLLM then routes to those providers, and the additional models appear in GET …/llm/models with "origin": "byo".

Always route through LiteLLM — do not call OpenAI/Anthropic directly with hardcoded keys. Direct calls bypass workspace spend caps and CU attribution.
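To separate BYO models from the rest of the catalog, an app could partition the model list on the `origin` field. The sketch below assumes the response can be read as an array of objects carrying an `id` and an `origin`; only the `"origin": "byo"` marker is documented here, so the `id` field name and the array shape are assumptions.

```typescript
// Assumed minimal shape of one catalog entry; only `origin: "byo"` is documented.
interface ModelEntry {
  id: string;      // hypothetical field name for the model identifier
  origin?: string; // "byo" for customer-provided provider keys
}

// Splits a model list into BYO entries and everything else, preserving order.
function splitByOrigin(models: readonly ModelEntry[]): { byo: ModelEntry[]; other: ModelEntry[] } {
  const byo: ModelEntry[] = [];
  const other: ModelEntry[] = [];
  for (const model of models) {
    (model.origin === "byo" ? byo : other).push(model);
  }
  return { byo, other };
}
```

Either group is still called through LiteLLM with the workspace virtual key, so spend caps and CU attribution continue to apply.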


Usage and billing

GET /v1/workspaces/{workspace_id}/llm/usage?since_seconds=86400&limit=100
Authorization: Bearer {keycloak_access_token}

Returns token counts and CU estimates per completion. For full account billing, use GET /v1/accounts/{id}/billing/summary.

LiteLLM enforces max_budget per virtual key (spend cap at workspace level).
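The usage query above takes its window and page size as query parameters. A small builder keeps those values validated and correctly encoded. The function name and the `controlPlaneUrl` parameter are hypothetical; the path and the `since_seconds` and `limit` parameters come from the request shown above, and the defaults simply mirror that example.

```typescript
// Builds the usage URL for a workspace; defaults mirror the documented example (24 hours, 100 rows).
function buildUsageUrl(
  controlPlaneUrl: string,
  workspaceId: string,
  sinceSeconds: number = 86400,
  limit: number = 100,
): string {
  // Reject values that cannot describe a lookback window or a row count.
  for (const [name, value] of [["sinceSeconds", sinceSeconds], ["limit", limit]] as const) {
    if (!Number.isInteger(value) || value <= 0) {
      throw new RangeError(`${name} must be a positive integer, got ${value}`);
    }
  }
  const origin = controlPlaneUrl.replace(/\/+$/, "");
  const params = new URLSearchParams({ since_seconds: String(sinceSeconds), limit: String(limit) });
  return `${origin}/v1/workspaces/${encodeURIComponent(workspaceId)}/llm/usage?${params.toString()}`;
}
```

The resulting URL is requested with the same Keycloak bearer token as the other Control Plane endpoints.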


Environment variables

| Variable | Where | Purpose |
| --- | --- | --- |
| LITELLM_URL | Control Plane API .env | Admin client for mint/models |
| LITELLM_MASTER_KEY | Control Plane API .env | LiteLLM admin API |
| LITELLM_API_KEY | Workspace secret | App inference (per workspace) |

Bring up LiteLLM locally:

docker compose -f deploy/docker-compose.phase2-deps.yml --profile litellm up -d
curl -s http://localhost:4000/health/liveness

Provider keys in Secrets are required before models respond (empty model list until keys exist).


Errors

| Status | Meaning |
| --- | --- |
| 403 | Need administer to mint virtual key |
| 502 | LiteLLM unreachable from Control Plane |
| LiteLLM 401 | Invalid or expired virtual key |
| LiteLLM empty/error | Missing provider keys in Secrets |
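The statuses above can be turned into actionable messages with a small lookup. This is a sketch under stated assumptions: the `ErrorSource` type and function name are hypothetical, the messages restate the table, and any status not listed falls through to a generic message rather than a guess.

```typescript
// Which service returned the status: the Control Plane API or the LiteLLM proxy.
type ErrorSource = "control-plane" | "litellm";

// Maps a documented status to a hint; undocumented statuses get a generic message.
function explainError(source: ErrorSource, status: number): string {
  if (source === "control-plane" && status === 403) {
    return "Caller needs the administer permission to mint a virtual key.";
  }
  if (source === "control-plane" && status === 502) {
    return "LiteLLM is unreachable from the Control Plane.";
  }
  if (source === "litellm" && status === 401) {
    return "The virtual key is invalid or expired; mint or rotate it and update LITELLM_API_KEY.";
  }
  return `Unexpected status ${status} from ${source}; if models are empty or erroring, check provider keys in Secrets.`;
}
```

A caller would pass `"litellm"` for responses from the inference path and `"control-plane"` for the /v1/workspaces endpoints.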
