Why Edge Functions for AI
Three reasons Edge Functions are the right choice for AI API calls:
1. **Security**: Your OpenAI/Anthropic API key lives as a server-side secret and is never shipped to the client, so nobody inspecting your frontend code can extract it.
2. **Database access**: Edge Functions run inside Supabase's infrastructure and have direct, low-latency access to your PostgreSQL database. You can fetch user context, store results, and log usage in the same function that calls the AI.
3. **Streaming support**: Edge Functions support Response streaming, which lets you send AI output to the client word-by-word — dramatically improving perceived performance for long AI responses.
Basic OpenAI Proxy Function
The simplest Edge Function: receive a prompt, call OpenAI, return the response.
```typescript
import OpenAI from "npm:openai"

const openai = new OpenAI({ apiKey: Deno.env.get("OPENAI_API_KEY") })

Deno.serve(async (req) => {
  const { prompt } = await req.json()

  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
    max_tokens: 500,
  })

  return new Response(
    JSON.stringify({ result: chat.choices[0].message.content }),
    { headers: { "Content-Type": "application/json" } },
  )
})
```
Deploy: `supabase functions deploy openai-proxy`. Secrets: `supabase secrets set OPENAI_API_KEY=sk-...`.
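From the client (a WeWeb workflow, or any browser code), the deployed function can be called with a plain `fetch`. A minimal sketch — the project URL is a placeholder, and `anonKey` is your project's public anon key, which Supabase expects in the `Authorization` header:

```typescript
// Placeholder URL — substitute your own project ref.
const FUNCTION_URL = "https://YOUR_PROJECT.supabase.co/functions/v1/openai-proxy"

// POST the prompt to the Edge Function and return the AI's reply.
async function askAI(prompt: string, anonKey: string): Promise<string> {
  const res = await fetch(FUNCTION_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${anonKey}`,
    },
    body: JSON.stringify({ prompt }),
  })
  if (!res.ok) throw new Error(`Edge Function error: ${res.status}`)
  const { result } = await res.json()
  return result
}
```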
Adding Authentication and Rate Limiting
Every AI Edge Function should verify the user is authenticated and check their usage limits:
```typescript
import { createClient } from "npm:@supabase/supabase-js"

Deno.serve(async (req) => {
  // Verify the caller's JWT
  const token = req.headers.get("Authorization")?.replace("Bearer ", "")
  const supabase = createClient(
    Deno.env.get("SUPABASE_URL")!,
    Deno.env.get("SUPABASE_SERVICE_ROLE_KEY")!,
  )
  const { data: { user }, error } = await supabase.auth.getUser(token)
  if (error || !user) return new Response("Unauthorized", { status: 401 })

  // Check rate limit (max 20 requests/hour).
  // head: true returns only the count, without fetching the rows themselves.
  const oneHourAgo = new Date(Date.now() - 3600000).toISOString()
  const { count } = await supabase
    .from("ai_usage_log")
    .select("*", { count: "exact", head: true })
    .eq("user_id", user.id)
    .gte("created_at", oneHourAgo)

  if ((count ?? 0) >= 20) return new Response("Rate limit exceeded", { status: 429 })

  // ... call OpenAI and log usage
})
```
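The elided final step — calling OpenAI and logging usage — could look like the sketch below. The clients are passed in untyped so the fragment stands alone; inside the function they would be the `openai` and `supabase` clients created earlier. The `tokens_used` column is an assumption — adapt it to your `ai_usage_log` schema:

```typescript
// Sketch: complete the prompt, then insert a usage row so the
// rate-limit query above counts this request next time.
async function completeAndLog(
  openai: any,
  supabase: any,
  userId: string,
  prompt: string,
): Promise<string> {
  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [{ role: "user", content: prompt }],
    max_tokens: 500,
  })
  // tokens_used is an assumed column for tracking per-request spend.
  await supabase.from("ai_usage_log").insert({
    user_id: userId,
    tokens_used: chat.usage?.total_tokens ?? 0,
  })
  return chat.choices[0].message.content ?? ""
}
```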
Streaming Responses to WeWeb
Streaming sends AI output to the client progressively — users see text appearing word by word instead of waiting for the full response.
In the Edge Function:
```typescript
const stream = await openai.chat.completions.create({
  model: "gpt-4o",
  messages,
  stream: true,
})

const readable = new ReadableStream({
  async start(controller) {
    for await (const chunk of stream) {
      const text = chunk.choices[0]?.delta?.content || ""
      controller.enqueue(new TextEncoder().encode(text))
    }
    controller.close()
  },
})

// The body here is raw text chunks, not SSE-formatted events, so text/plain
// is the accurate content type; use text/event-stream only if you emit
// "data: ..." lines.
return new Response(readable, {
  headers: { "Content-Type": "text/plain; charset=utf-8" },
})
```
In WeWeb: use a custom JavaScript action to fetch the stream URL and update a page variable character by character as chunks arrive.
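A sketch of that custom JavaScript action, assuming WeWeb's `wwLib.wwVariable.updateValue` API for writing to a page variable from custom code (the variable id comes from your WeWeb editor):

```typescript
declare const wwLib: any // provided by WeWeb inside custom code actions

// Fetch the streaming Edge Function and push each decoded chunk into a
// WeWeb page variable, so bound UI elements re-render as text arrives.
async function streamToVariable(
  url: string,
  token: string,
  variableId: string,
  prompt: string,
) {
  const res = await fetch(url, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${token}`,
    },
    body: JSON.stringify({ prompt }),
  })
  if (!res.body) throw new Error("No response body to stream")

  const reader = res.body.getReader()
  const decoder = new TextDecoder()
  let text = ""
  while (true) {
    const { done, value } = await reader.read()
    if (done) break
    text += decoder.decode(value, { stream: true })
    wwLib.wwVariable.updateValue(variableId, text)
  }
}
```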
Building a RAG Pipeline (Retrieval Augmented Generation)
RAG improves AI answers by injecting relevant knowledge into the prompt at query time. Architecture:
1. **Knowledge ingestion** (run once): For each document in your knowledge base, call OpenAI's embedding API to get a 1536-dimensional vector. Store vectors in Supabase using the pgvector extension.
2. **Query time**: When a user asks a question, embed the question (same embedding API), then run a similarity search in Supabase: `SELECT content, 1 - (embedding <=> query_embedding) AS similarity FROM documents ORDER BY similarity DESC LIMIT 3`.
3. **Augmented prompt**: Inject the top 3 matching documents into the system prompt: "Answer using only the following context: [docs]. If the answer isn't in the context, say you don't know."
Result: the AI grounds its answers in your documentation, dramatically reducing hallucination about things you haven't documented — the "say you don't know" instruction is not a hard guarantee, but combined with retrieval it keeps answers on-topic.
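The three query-time steps can be sketched as one function. The clients are passed in untyped so the sketch stands alone; `match_documents` is an assumed Postgres RPC wrapping the similarity SQL above (a common pgvector pattern), and all names are illustrative:

```typescript
// Query-time RAG: embed the question, retrieve context, answer from it.
async function answerWithRag(
  openai: any,
  supabase: any,
  question: string,
): Promise<string> {
  // 1. Embed the question with the same model used at ingestion time
  //    (text-embedding-3-small returns 1536-dimensional vectors by default).
  const embedding = (await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: question,
  })).data[0].embedding

  // 2. Similarity search via an assumed match_documents RPC.
  const { data: docs } = await supabase.rpc("match_documents", {
    query_embedding: embedding,
    match_count: 3,
  })

  // 3. Inject the retrieved documents into the system prompt.
  const context = (docs ?? [])
    .map((d: { content: string }) => d.content)
    .join("\n---\n")
  const chat = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content: `Answer using only the following context:\n${context}\nIf the answer isn't in the context, say you don't know.`,
      },
      { role: "user", content: question },
    ],
  })
  return chat.choices[0].message.content ?? ""
}
```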