Build a chatbot
A "chat with your documents" knowledge base with source citations and an embeddable widget.
A TurfAI chatbot is a deployable RAG front end. It builds directly on the knowledge base & RAG loop (index documents, then query them with cited answers) and wraps it in a browser-safe public endpoint plus an embeddable widget. A chatbot is composed from ordinary workflow blocks, so you can customize ingestion and answering later.
What you'll build
A chatbot you create once with your JWT, then chat with from a browser using a separate
public key — no JWT exposed client-side. The bot answers from documents you've indexed, and
each answer carries a sources array you can render as citations.
Prerequisites
- A JWT in $TURFAI_JWT for the owner-side calls (create / manage the chatbot); see Authentication. The public chat endpoint does not use the JWT.
- One or more indexed documents you own. You index by following enable RAG → poll status → completed; a chatbot can only answer from documents whose processing_status is completed.
- Base URL: https://apisandbox.turfai.in/api (used as $BASE below).
Two credentials, two audiences. The JWT is yours and stays server-side — it creates and
manages the chatbot. The chatbot key (cb_…) is what the browser sends as X-Chatbot-Key;
it's scoped to one chatbot's public chat endpoint and is safe to ship in page markup.
1. Ingest documents
Index the documents the bot should know. Enable RAG on each (directly, or via a rag_enable
task in an ingestion workflow), then wait until indexing finishes — a chatbot only sees
documents whose processing_status is completed.
BASE="https://apisandbox.turfai.in/api"
DOC_ID=45
# Queue the document for embedding (returns immediately)
curl -X POST "$BASE/documents/$DOC_ID/enable-rag" \
-H "Authorization: Bearer $TURFAI_JWT" \
-H "Content-Type: application/json" \
-d '{ "force_reprocess": false }'
# Poll until processing_status == "completed"
curl -s "$BASE/documents/$DOC_ID/rag-status" \
-H "Authorization: Bearer $TURFAI_JWT"

import os, time, requests

BASE = "https://apisandbox.turfai.in/api"
HEAD = {"Authorization": f"Bearer {os.environ['TURFAI_JWT']}", "Content-Type": "application/json"}
DOC_ID = 45

# Queue the document for embedding (returns immediately)
requests.post(f"{BASE}/documents/{DOC_ID}/enable-rag", headers=HEAD,
              json={"force_reprocess": False}).raise_for_status()

# Poll until processing_status == "completed"
while True:
    s = requests.get(f"{BASE}/documents/{DOC_ID}/rag-status", headers=HEAD).json()
    if s["processing_status"] == "completed":
        break
    if s["processing_status"] == "failed":
        raise RuntimeError(s.get("error") or "RAG indexing failed")
    time.sleep(3)

const BASE = "https://apisandbox.turfai.in/api";
const HEAD = {
  Authorization: `Bearer ${process.env.TURFAI_JWT}`,
  "Content-Type": "application/json",
};
const DOC_ID = 45;

// Queue the document for embedding (returns immediately)
await fetch(`${BASE}/documents/${DOC_ID}/enable-rag`, {
  method: "POST",
  headers: HEAD,
  body: JSON.stringify({ force_reprocess: false }),
});

// Poll until processing_status == "completed"
for (;;) {
  const s = await (
    await fetch(`${BASE}/documents/${DOC_ID}/rag-status`, { headers: HEAD })
  ).json();
  if (s.processing_status === "completed") break;
  if (s.processing_status === "failed") throw new Error(s.error ?? "RAG indexing failed");
  await new Promise((r) => setTimeout(r, 3000));
}

See Knowledge base & RAG for the full indexing model, the wait task for automating the poll inside a workflow, and the Documents API for the endpoint catalog.
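The polling loops above run until the status changes, which can hang a script if indexing stalls. A minimal sketch of a bounded wait follows, assuming the rag-status response carries processing_status as shown. The helper name wait_for_indexing, its fetch_status callable, and the timeout and interval defaults are illustrative choices, not part of the TurfAI API.

```python
import time
from typing import Callable, Mapping


def wait_for_indexing(
    fetch_status: Callable[[], Mapping],
    timeout_s: float = 300.0,   # assumed default, tune for your document sizes
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Mapping:
    """Poll fetch_status() until processing_status is "completed".

    Raises RuntimeError on "failed" and TimeoutError once timeout_s has elapsed.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        state = status.get("processing_status")
        if state == "completed":
            return status
        if state == "failed":
            raise RuntimeError(status.get("error") or "RAG indexing failed")
        if clock() >= deadline:
            raise TimeoutError(f"indexing still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

With requests, fetch_status could be `lambda: requests.get(f"{BASE}/documents/{DOC_ID}/rag-status", headers=HEAD).json()`. Injecting sleep and clock keeps the helper testable without real delays.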
2. Create the chatbot
POST /chatbots with your JWT. The response contains a cb_-prefixed api_key — this is the
public chat key.
The api_key is shown exactly once. The server stores only a SHA-256 hash and never returns
the plaintext again — not from GET /chatbots/:id, not from GET /chatbots. Capture it from
this response and store it securely. If you lose it,
regenerate the key (which invalidates the old one).
curl -X POST "$BASE/chatbots" \
-H "Authorization: Bearer $TURFAI_JWT" \
-H "Content-Type: application/json" \
-d '{
"data": {
"name": "Support KB",
"slug": "support-kb",
"welcome_message": "Hi! Ask me anything about our policies.",
"branding": { "primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg" },
"allowed_origins": ["https://www.example.com"],
"rate_limit": 60
}
}'

res = requests.post(f"{BASE}/chatbots", headers=HEAD, json={
    "data": {
        "name": "Support KB",
        "slug": "support-kb",
        "welcome_message": "Hi! Ask me anything about our policies.",
        "branding": {"primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg"},
        "allowed_origins": ["https://www.example.com"],
        "rate_limit": 60,
    },
})
chatbot = res.json()["data"]
chatbot_key = chatbot["api_key"]  # shown once: store it now
print(chatbot["slug"], chatbot_key)

const res = await fetch(`${BASE}/chatbots`, {
  method: "POST",
  headers: HEAD,
  body: JSON.stringify({
    data: {
      name: "Support KB",
      slug: "support-kb",
      welcome_message: "Hi! Ask me anything about our policies.",
      branding: { primary_color: "#2563eb", logo_url: "https://www.example.com/logo.svg" },
      allowed_origins: ["https://www.example.com"],
      rate_limit: 60,
    },
  }),
});
const { data: chatbot } = await res.json();
const chatbotKey = chatbot.api_key; // shown once: store it now
console.log(chatbot.slug, chatbotKey);

The 201 response (note the one-time api_key):
{
  "data": {
    "id": 7,
    "name": "Support KB",
    "slug": "support-kb",
    "active": true,
    "welcome_message": "Hi! Ask me anything about our policies.",
    "branding": { "primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg" },
    "allowed_origins": ["https://www.example.com"],
    "rate_limit": 60,
    "query_count": 0,
    "api_key_rotated_at": "2026-06-19T10:00:00.000Z",
    "createdAt": "2026-06-19T10:00:00.000Z",
    "api_key": "cb_3f9a8c1d4b2e6f7a9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f"
  }
}

| Field | Notes |
|---|---|
| slug | URL-safe, unique. Lowercase alphanumeric + hyphens. Used in the public URL and embed code. |
| api_key | cb_-prefixed, 48 hex chars. Returned only here and on regenerate. Sent by the browser as X-Chatbot-Key. |
| branding | JSON, e.g. { "primary_color": "#2563eb", "logo_url": "…" }. Surfaced by the public config endpoint and the widget. |
| allowed_origins | JSON array of exact origins allowed to call the public endpoint. Empty array (the default) = no origin restriction. |
| rate_limit | Max requests per minute per chatbot (default 60, range 1–1000). Over-limit requests get 429. |
Documents searched are the chatbot owner's documents — the public chat runs under the
owner's context. To scope a bot to a specific document set, attach a collection (via
PUT /chatbots/:id); the bot then answers only from indexed documents in that collection.
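The constraints in the field table can be checked client-side before calling POST /chatbots, which gives earlier and friendlier errors. The sketch below is one reading of those constraints, not the server's validator: the slug pattern is an assumption derived from "lowercase alphanumeric + hyphens", the lowercase-hex key pattern is inferred from the sample key, and the function names are hypothetical.

```python
import re
from typing import List

# Assumption: any non-empty mix of lowercase alphanumerics and hyphens is accepted.
SLUG_RE = re.compile(r"[a-z0-9-]+")
# "cb_" prefix followed by 48 hex characters, per the field table.
API_KEY_RE = re.compile(r"cb_[0-9a-f]{48}")


def validate_chatbot_fields(data: dict) -> List[str]:
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    if not SLUG_RE.fullmatch(data.get("slug", "")):
        problems.append("slug must be lowercase alphanumeric plus hyphens")
    rate_limit = data.get("rate_limit", 60)  # 60 is the documented default
    if not isinstance(rate_limit, int) or not 1 <= rate_limit <= 1000:
        problems.append("rate_limit must be an integer from 1 to 1000")
    origins = data.get("allowed_origins", [])
    if not isinstance(origins, list) or not all(isinstance(o, str) for o in origins):
        problems.append("allowed_origins must be a list of origin strings")
    return problems


def looks_like_chatbot_key(key: str) -> bool:
    """True if key has the documented shape; this says nothing about whether it is valid."""
    return API_KEY_RE.fullmatch(key) is not None
```

The server remains the authority; a check like this only catches obvious mistakes (a slug with spaces, a JWT pasted where the cb_ key belongs) before a request is made.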
3. Chat over the public API
The public endpoint authenticates with the X-Chatbot-Key header, not a JWT, so it is safe to call
from a browser. Pass a session_id to carry context across turns.
# CHATBOT_KEY is the cb_ key captured at create time
curl -X POST "$BASE/chatbots/support-kb/chat" \
-H "X-Chatbot-Key: $CHATBOT_KEY" \
-H "Content-Type: application/json" \
-d '{ "query": "What is the remote work policy?", "session_id": "sess-1" }'

import requests

BASE = "https://apisandbox.turfai.in/api"
CHATBOT_KEY = "cb_…"  # captured at create time (not the JWT)
r = requests.post(f"{BASE}/chatbots/support-kb/chat",
                  headers={"X-Chatbot-Key": CHATBOT_KEY, "Content-Type": "application/json"},
                  json={"query": "What is the remote work policy?", "session_id": "sess-1"})
data = r.json()
print(data["answer"])
for s in data["sources"]:
    print(s["document_title"], round(s["similarity_score"], 2), s.get("signed_url"))

const BASE = "https://apisandbox.turfai.in/api";
const CHATBOT_KEY = "cb_…"; // captured at create time (not the JWT)
const res = await fetch(`${BASE}/chatbots/support-kb/chat`, {
  method: "POST",
  headers: { "X-Chatbot-Key": CHATBOT_KEY, "Content-Type": "application/json" },
  body: JSON.stringify({ query: "What is the remote work policy?", session_id: "sess-1" }),
});
const data = await res.json();
console.log(data.answer);
data.sources.forEach((s: any) => console.log(s.document_title, s.similarity_score));

Response
{
  "answer": "Based on the company handbook, employees may work remotely up to 3 days per week, subject to manager approval…",
  "sources": [
    {
      "document_id": 42,
      "document_title": "Company Handbook 2025",
      "chunk_text": "…employees may work from home up to three days per week with manager approval…",
      "page_number": 15,
      "similarity_score": 0.92,
      "file_url": "gs://turfai-docs/handbook.pdf",
      "signed_url": "https://storage.googleapis.com/turfai-docs/handbook.pdf?X-Goog-Signature=…"
    }
  ],
  "confidence": 0.88,
  "session_id": "sess-1"
}

Render the sources as citation badges so users can verify answers. Each entry carries the
document_title it came from, the matched chunk_text (with page_number), a similarity_score
(0–1), and a short-lived signed_url (valid ~1 hour) you can link straight to. The public
chat endpoint returns the same response shape as the underlying RAG query.
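As one way to turn the sources array into citations, the sketch below builds a display string per source, best match first. It assumes the field names shown in the sample response; the function name format_citations, the min_score cutoff, and the output layout are illustrative choices only.

```python
from typing import List


def format_citations(response: dict, min_score: float = 0.0) -> List[str]:
    """Build one display string per source, ordered by similarity_score descending."""
    sources = sorted(
        (s for s in response.get("sources", []) if s["similarity_score"] >= min_score),
        key=lambda s: s["similarity_score"],
        reverse=True,
    )
    lines = []
    for n, s in enumerate(sources, start=1):
        label = f"[{n}] {s['document_title']}"
        if s.get("page_number") is not None:
            label += f", p. {s['page_number']}"
        label += f" (score {s['similarity_score']:.2f})"
        if s.get("signed_url"):
            # signed_url is short-lived, so render it at answer time rather than caching it.
            label += f" <{s['signed_url']}>"
        lines.append(label)
    return lines
```

Because signed_url expires after roughly an hour, a UI that keeps old conversations around would need to re-query rather than reuse stored links.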
Multi-turn memory. Reuse the same session_id across turns and prior turns inform the next
answer (so a follow-up like "and for managers?" resolves in context). The first call can
auto-create a session; the value is echoed back in the response. Omit session_id to start
fresh. The widget manages this for you via localStorage.
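Outside the widget, a client has to carry session_id itself. The sketch below shows one way to do that, assuming the endpoint echoes session_id back as described. ChatSession and its post callable are hypothetical names; post stands in for whatever HTTP call you use and is expected to take the JSON body and return the parsed response.

```python
from typing import Callable, Optional


class ChatSession:
    """Remembers the session_id echoed by the chat endpoint and resends it on later turns."""

    def __init__(self, post: Callable[[dict], dict], session_id: Optional[str] = None):
        self._post = post              # hypothetical transport: JSON body in, parsed response out
        self.session_id = session_id

    def ask(self, query: str) -> dict:
        body = {"query": query}
        if self.session_id is not None:
            body["session_id"] = self.session_id
        response = self._post(body)
        # Keep whatever session_id the server echoed so follow-ups share context.
        self.session_id = response.get("session_id", self.session_id)
        return response

    def reset(self) -> None:
        """Forget the session so the next question starts fresh."""
        self.session_id = None
```

With requests, post could be `lambda body: requests.post(f"{BASE}/chatbots/support-kb/chat", headers={"X-Chatbot-Key": CHATBOT_KEY}, json=body).json()`.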
If the bot is scoped to a collection that has no indexed documents, the endpoint returns early with a safe non-answer rather than hallucinating:
{ "answer": "No indexed documents found in the assigned collection. Please index documents first.", "sources": [], "confidence": 0, "session_id": "sess-1" }
4. Embed the widget
Drop the script on any allowed-origin page. The data-chatbot attribute is the chatbot slug.
<script src="https://app.turfai.com/widget.js"
  data-chatbot="support-kb"
  data-api="https://apisandbox.turfai.in"></script>

Or initialize programmatically:

<script src="https://app.turfai.com/widget.js"></script>
<script>
  TurfAIChat.init({
    chatbotSlug: "support-kb",
    apiUrl: "https://apisandbox.turfai.in",
  });
</script>

The widget runs in a shadow DOM (CSS-isolated from the host page), persists sessions in
localStorage (keyed by slug), and pulls branding from the public config endpoint
(GET /chatbots/:slug/config, no auth), which returns:
{
  "name": "Support KB",
  "slug": "support-kb",
  "welcome_message": "Hi! Ask me anything about our policies.",
  "branding": { "primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg" }
}

Customization. Set branding.primary_color and branding.logo_url (and the
welcome_message) via create / PUT /chatbots/:id — the widget reads them live from the config
endpoint, so changes apply without re-embedding. To restrict where the widget may run, set
allowed_origins. The widget itself is open-source and rebuildable from the
platform repo (vite build --mode widget emits the IIFE widget.js) if you need a self-hosted
or custom-styled build.
Hardening
In v0.5, chatbot public chat is not tokenised by Data Shield — PII in indexed documents (or in questions) can reach the LLM. Don't expose a public chatbot over documents whose PII must never leave your boundary until that path is covered.
- allowed_origins / CORS. Set this to the exact origins that may embed the widget (e.g. "https://www.example.com": scheme + host, no trailing slash). The OPTIONS preflight reflects the request origin into Access-Control-Allow-Origin, allows the Content-Type and X-Chatbot-Key headers, and caches for 24h. An empty array (the default) allows any origin, so lock it down before going to production.
- Rate limiting. rate_limit is the max requests per minute per chatbot (default 60, range 1–1000). Over-limit requests get HTTP 429 with { "error": "Rate limit exceeded", "message": "Maximum N requests per minute exceeded.", "retryAfter": 60 }. Limiting is tracked in memory, per chatbot, over a rolling 60-second window.
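To make "rolling 60-second window, per chatbot, in memory" concrete, here is an illustrative limiter with those properties. It is a generic sketch of the technique, not TurfAI's implementation; the class name, the deque-based bookkeeping, and the injectable clock are all assumptions made for the example.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class RollingWindowLimiter:
    """Allow at most `limit` requests per key within any `window_s`-second span."""

    def __init__(self, limit: int = 60, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self._clock = clock
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False  # a server would answer this request with HTTP 429
        hits.append(now)
        return True
```

Unlike a fixed per-minute counter, a rolling window frees capacity gradually as individual requests age out, so a burst at the end of one minute cannot be followed immediately by a second full burst.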
Rotating the key
Rotate a leaked or lost key with POST /chatbots/:id/regenerate-key (JWT). It returns a fresh
one-time api_key and immediately invalidates the old one — any embed still using the old key
starts getting 401, so update your embeds promptly.
curl -X POST "$BASE/chatbots/7/regenerate-key" \
-H "Authorization: Bearer $TURFAI_JWT"
# -> { "data": { …, "api_key": "cb_…new…", "api_key_rotated_at": "…" } }

new_key = requests.post(f"{BASE}/chatbots/7/regenerate-key", headers=HEAD).json()["data"]["api_key"]
print(new_key)  # update your embeds with this; the old key is now dead

const out = await (await fetch(`${BASE}/chatbots/7/regenerate-key`, {
  method: "POST",
  headers: HEAD,
})).json();
const newKey = out.data.api_key; // update your embeds; the old key is now dead

Regeneration may require a step-up (re-authentication) depending on your tenant's security policy. See regenerate-key in the reference.
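Hash-only storage explains both why a lost key cannot be recovered and why rotation kills the old key at once. The sketch below illustrates the idea with Python's standard library; it is not TurfAI's server code. The function names and the constant-time comparison are assumptions, and only the cb_ prefix, the 48 hex characters, and the use of SHA-256 come from this page.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def new_key() -> Tuple[str, str]:
    """Return (plaintext_key, sha256_hex). Only the hash would be persisted."""
    key = "cb_" + secrets.token_hex(24)  # 24 random bytes -> 48 hex characters
    return key, hashlib.sha256(key.encode()).hexdigest()


def key_matches(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_hash)
```

Rotation in this model replaces the single stored hash, so the previous plaintext no longer hashes to anything on file. The same mechanism explains why a key copied with stray whitespace fails: the hash of the padded string differs.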
Troubleshooting
CORS / preflight failure (browser blocks the request, no response body). The page's origin
isn't in allowed_origins. Add the exact origin (scheme + host, no path or trailing slash), or
clear allowed_origins to [] while testing. Server-to-server calls (curl, Python, Node) aren't
subject to CORS — if curl works but the browser doesn't, it's an origin mismatch.
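A common cause of origin mismatch is pasting a full page URL into allowed_origins. The sketch below reduces a URL to its origin and mimics the exact-match rule described above. The helper names are hypothetical, and the matching logic is an approximation of the documented behavior, not the server's code.

```python
from typing import List
from urllib.parse import urlsplit


def to_origin(url: str) -> str:
    """Reduce a URL to scheme://host[:port], dropping path, query and trailing slash."""
    parts = urlsplit(url.strip())
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError(f"not an http(s) URL: {url!r}")
    origin = f"{parts.scheme}://{parts.hostname}"
    if parts.port is not None:
        origin += f":{parts.port}"
    return origin


def origin_allowed(page_url: str, allowed_origins: List[str]) -> bool:
    """Empty list means no restriction; otherwise require an exact origin match."""
    if not allowed_origins:
        return True
    return to_origin(page_url) in allowed_origins
```

Note that https://example.com and https://www.example.com are different origins, so a site served from both needs both entries.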
429 Rate limit exceeded. You exceeded rate_limit requests in the rolling 60s window. Back
off and retry after retryAfter seconds, or raise rate_limit via PUT /chatbots/:id (max
1000).
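Backing off on 429 can be sketched as a small wrapper that reads retryAfter from the error body. The wrapper below assumes the 429 payload shown under Hardening. The name post_with_retry, the send callable (which returns a status code and parsed body), and the max_attempts default are illustrative, not part of any TurfAI client.

```python
import time
from typing import Callable, Tuple


def post_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,   # assumed default for the example
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, dict]:
    """Call send(); on HTTP 429, wait retryAfter seconds and try again."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    attempts = 1
    while status == 429 and attempts < max_attempts:
        # Fall back to 60 seconds, the value in the documented 429 payload.
        sleep(float(body.get("retryAfter", 60)))
        status, body = send()
        attempts += 1
    return status, body
```

In a browser widget a long wait is rarely acceptable, so a UI might instead show a "try again shortly" message and reserve this kind of loop for server-side callers.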
Answers are empty / "no documents found". The bot can only see documents whose
processing_status is completed. Confirm indexing finished
(poll rag-status) and chunk_count > 0. If the bot is
scoped to a collection, make sure that collection contains indexed documents — otherwise it
returns the canned "no indexed documents" answer with sources: [].
401 Invalid or missing API key. The X-Chatbot-Key doesn't match the stored hash — usually
a stale key after a rotation, a missing header, or a key copied with whitespace. Re-capture it
from create / regenerate (it's only shown then). A bot with no key on file (legacy row) also
401s with a "regenerate the key" message — fix it via regenerate-key.
404 Chatbot not found or inactive. Wrong slug, or the chatbot's active is false. Check
the slug and set active to true via PUT /chatbots/:id.
Reference
- Full chatbot endpoints (config, public chat, CRUD, regenerate-key): Chatbot API.
- The indexing and query model behind it: Knowledge base & RAG.
- Governance coverage for the chat path: Data Shield.