Build a chatbot

A "chat with your documents" knowledge base with source citations and an embeddable widget.

A TurfAI chatbot is a deployable RAG front end: you ingest documents into a knowledge base, and the bot answers questions over them with source citations. It builds directly on the knowledge base & RAG loop (index documents, then query them with cited answers) and wraps that loop in a browser-safe public endpoint plus a widget you can embed on any site. Because a chatbot is composed from ordinary workflow blocks, you can customize ingestion and answering later.

What you'll build

A chatbot that you create once with your JWT and then chat with from a browser using a separate public key, so no JWT is exposed client-side. The bot answers from documents you've indexed, and each answer carries a sources array you can render as citations.

Prerequisites

A JWT in $TURFAI_JWT for the owner-side calls (create / manage the chatbot) — see Authentication. The public chat endpoint does not use the JWT.
One or more indexed documents you own. Indexing follows the sequence enable RAG → poll status → completed; a chatbot can only answer from documents whose processing_status is completed.
Base URL: https://apisandbox.turfai.in/api (used as $BASE below).

Two credentials, two audiences. The JWT is yours and stays server-side — it creates and manages the chatbot. The chatbot key (cb_…) is what the browser sends as X-Chatbot-Key; it's scoped to one chatbot's public chat endpoint and is safe to ship in page markup.
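
One way to keep the two credentials from being mixed up is to build request headers through two small helpers. This is a minimal sketch: the helper names are hypothetical, and only the header names and the cb_ prefix come from this guide.

```python
def owner_headers(jwt: str) -> dict:
    """Headers for owner-side calls (create / manage). Keep these server-side."""
    if jwt.startswith("cb_"):
        raise ValueError("got a chatbot key where a JWT was expected")
    return {"Authorization": f"Bearer {jwt}", "Content-Type": "application/json"}


def public_headers(chatbot_key: str) -> dict:
    """Headers for the public chat endpoint. These never carry the JWT."""
    if not chatbot_key.startswith("cb_"):
        raise ValueError("public chat expects a cb_-prefixed chatbot key")
    return {"X-Chatbot-Key": chatbot_key, "Content-Type": "application/json"}
```

Failing fast on the wrong prefix makes it harder to ship a JWT into page markup by accident.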

1. Ingest documents

Index the documents the bot should know. Enable RAG on each (directly, or via a rag_enable task in an ingestion workflow), then wait until indexing finishes — a chatbot only sees documents whose processing_status is completed.

BASE="https://apisandbox.turfai.in/api"
DOC_ID=45

# Queue the document for embedding (returns immediately)
curl -X POST "$BASE/documents/$DOC_ID/enable-rag" \
  -H "Authorization: Bearer $TURFAI_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "force_reprocess": false }'

# Poll until processing_status == "completed"
curl -s "$BASE/documents/$DOC_ID/rag-status" \
  -H "Authorization: Bearer $TURFAI_JWT"

import os, time, requests

BASE = "https://apisandbox.turfai.in/api"
HEAD = {"Authorization": f"Bearer {os.environ['TURFAI_JWT']}", "Content-Type": "application/json"}
DOC_ID = 45

requests.post(f"{BASE}/documents/{DOC_ID}/enable-rag", headers=HEAD,
              json={"force_reprocess": False}).raise_for_status()

while True:
    r = requests.get(f"{BASE}/documents/{DOC_ID}/rag-status", headers=HEAD)
    r.raise_for_status()  # surface HTTP errors here instead of a KeyError below
    s = r.json()
    if s["processing_status"] == "completed":
        break
    if s["processing_status"] == "failed":
        raise RuntimeError(s.get("error") or "RAG indexing failed")
    time.sleep(3)  # poll every 3 seconds

const BASE = "https://apisandbox.turfai.in/api";
const HEAD = {
  Authorization: `Bearer ${process.env.TURFAI_JWT}`,
  "Content-Type": "application/json",
};
const DOC_ID = 45;

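const enable = await fetch(`${BASE}/documents/${DOC_ID}/enable-rag`, {
  method: "POST",
  headers: HEAD,
  body: JSON.stringify({ force_reprocess: false }),
});
// fetch does not reject on HTTP error statuses, so check explicitly
if (!enable.ok) throw new Error(`enable-rag failed: ${enable.status}`);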

// Poll until processing_status == "completed"
for (;;) {
  const s = await (
    await fetch(`${BASE}/documents/${DOC_ID}/rag-status`, { headers: HEAD })
  ).json();
  if (s.processing_status === "completed") break;
  if (s.processing_status === "failed") throw new Error(s.error ?? "RAG indexing failed");
  await new Promise((r) => setTimeout(r, 3000));
}

See Knowledge base & RAG for the full indexing model, the wait task for automating the poll inside a workflow, and the Documents API for the endpoint catalog.
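
The polling loops above run until the status changes, which can hang a script if indexing stalls. The sketch below adds a deadline. The function name, the 300-second default, and the injected get_status callable are assumptions for illustration; only the processing_status values completed and failed come from this guide.

```python
import time
from typing import Callable


def wait_until_indexed(
    get_status: Callable[[], dict],
    timeout_s: float = 300.0,   # assumed default; tune to your document sizes
    interval_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll get_status() until it reports completed or failed, or the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        status = get_status()
        state = status.get("processing_status")
        if state == "completed":
            return status
        if state == "failed":
            raise RuntimeError(status.get("error") or "RAG indexing failed")
        if clock() >= deadline:
            raise TimeoutError(f"indexing still {state!r} after {timeout_s}s")
        sleep(interval_s)
```

In practice get_status would wrap the rag-status request, for example `lambda: requests.get(f"{BASE}/documents/{DOC_ID}/rag-status", headers=HEAD).json()`. Passing it in as a callable keeps the loop testable without network access.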

2. Create the chatbot

POST /chatbots with your JWT. The response contains a cb_-prefixed api_key — this is the public chat key.

The api_key is shown exactly once. The server stores only a SHA-256 hash and never returns the plaintext again — not from GET /chatbots/:id, not from GET /chatbots. Capture it from this response and store it securely. If you lose it, regenerate the key (which invalidates the old one).

curl -X POST "$BASE/chatbots" \
  -H "Authorization: Bearer $TURFAI_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "name": "Support KB",
      "slug": "support-kb",
      "welcome_message": "Hi! Ask me anything about our policies.",
      "branding": { "primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg" },
      "allowed_origins": ["https://www.example.com"],
      "rate_limit": 60
    }
  }'

res = requests.post(f"{BASE}/chatbots", headers=HEAD, json={
    "data": {
        "name": "Support KB",
        "slug": "support-kb",
        "welcome_message": "Hi! Ask me anything about our policies.",
        "branding": {"primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg"},
        "allowed_origins": ["https://www.example.com"],
        "rate_limit": 60,
    },
})
res.raise_for_status()  # stop here on an HTTP error instead of failing on a missing "data" key
chatbot = res.json()["data"]
chatbot_key = chatbot["api_key"]  # shown once — store it now
print(chatbot["slug"], chatbot_key)

const res = await fetch(`${BASE}/chatbots`, {
  method: "POST",
  headers: HEAD,
  body: JSON.stringify({
    data: {
      name: "Support KB",
      slug: "support-kb",
      welcome_message: "Hi! Ask me anything about our policies.",
      branding: { primary_color: "#2563eb", logo_url: "https://www.example.com/logo.svg" },
      allowed_origins: ["https://www.example.com"],
      rate_limit: 60,
    },
  }),
});
if (!res.ok) throw new Error(`create chatbot failed: ${res.status}`);
const { data: chatbot } = await res.json();
const chatbotKey = chatbot.api_key; // shown once — store it now
console.log(chatbot.slug, chatbotKey);

The 201 response (note the one-time api_key):

{
  "data": {
    "id": 7,
    "name": "Support KB",
    "slug": "support-kb",
    "active": true,
    "welcome_message": "Hi! Ask me anything about our policies.",
    "branding": { "primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg" },
    "allowed_origins": ["https://www.example.com"],
    "rate_limit": 60,
    "query_count": 0,
    "api_key_rotated_at": "2026-06-19T10:00:00.000Z",
    "createdAt": "2026-06-19T10:00:00.000Z",
    "api_key": "cb_3f9a8c1d4b2e6f7a9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f"
  }
}

Field	Notes
`slug`	URL-safe, unique. Lowercase alphanumeric + hyphens. Used in the public URL and embed code.
`api_key`	`cb_`-prefixed, 48 hex chars. Returned only here and on regenerate. Sent by the browser as `X-Chatbot-Key`.
`branding`	JSON, e.g. `{ "primary_color": "#2563eb", "logo_url": "…" }`. Surfaced by the public config endpoint and the widget.
`allowed_origins`	JSON array of exact origins allowed to call the public endpoint. Empty array (the default) = no origin restriction.
`rate_limit`	Max requests per minute per chatbot (default `60`, range `1–1000`). Over-limit requests get `429`.
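
The constraints in the table can be checked locally before sending the create request. The sketch below is illustrative only: the function names are hypothetical, and the regular expressions are a guess at the rules described above (the server's exact validation, including whether key hex is always lowercase, is not specified here).

```python
import re

# Assumed reading of "lowercase alphanumeric + hyphens"; the server may be stricter or looser.
SLUG_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# cb_ prefix followed by 48 hex characters, matching the sample key in this guide.
KEY_RE = re.compile(r"cb_[0-9a-f]{48}")


def check_chatbot_payload(data: dict) -> list:
    """Return a list of human-readable problems; an empty list means the payload looks fine."""
    problems = []
    if not SLUG_RE.fullmatch(str(data.get("slug") or "")):
        problems.append("slug: use lowercase letters, digits, and hyphens")
    rate = data.get("rate_limit", 60)  # 60 is the documented default
    if isinstance(rate, bool) or not isinstance(rate, int) or not 1 <= rate <= 1000:
        problems.append("rate_limit: must be an integer from 1 to 1000")
    return problems


def looks_like_chatbot_key(value: str) -> bool:
    """Shape check only; it cannot tell whether the key is still valid."""
    return KEY_RE.fullmatch(value) is not None
```

A local check like this only catches obvious mistakes early; the API response remains the authority on what is accepted.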

Documents searched are the chatbot owner's documents — the public chat runs under the owner's context. To scope a bot to a specific document set, attach a collection (via PUT /chatbots/:id); the bot then answers only from indexed documents in that collection.

3. Chat over the public API

The public endpoint authenticates with the X-Chatbot-Key header, not a JWT — safe to call from a browser. Pass a session_id to carry multi-turn context across turns.

# CHATBOT_KEY is the cb_ key captured at create time
curl -X POST "$BASE/chatbots/support-kb/chat" \
  -H "X-Chatbot-Key: $CHATBOT_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "query": "What is the remote work policy?", "session_id": "sess-1" }'

import requests

BASE = "https://apisandbox.turfai.in/api"
CHATBOT_KEY = "cb_…"  # captured at create time (not the JWT)

r = requests.post(f"{BASE}/chatbots/support-kb/chat",
                  headers={"X-Chatbot-Key": CHATBOT_KEY, "Content-Type": "application/json"},
                  json={"query": "What is the remote work policy?", "session_id": "sess-1"})
r.raise_for_status()  # 401, 404, and 429 responses raise here
data = r.json()
print(data["answer"])
for s in data["sources"]:
    print(s["document_title"], round(s["similarity_score"], 2), s.get("signed_url"))

const BASE = "https://apisandbox.turfai.in/api";
const CHATBOT_KEY = "cb_…"; // captured at create time (not the JWT)

const res = await fetch(`${BASE}/chatbots/support-kb/chat`, {
  method: "POST",
  headers: { "X-Chatbot-Key": CHATBOT_KEY, "Content-Type": "application/json" },
  body: JSON.stringify({ query: "What is the remote work policy?", session_id: "sess-1" }),
});
if (!res.ok) throw new Error(`chat failed: ${res.status}`);
const data = await res.json();
console.log(data.answer);
data.sources.forEach((s: any) => console.log(s.document_title, s.similarity_score));

Response

{
  "answer": "Based on the company handbook, employees may work remotely up to 3 days per week, subject to manager approval…",
  "sources": [
    {
      "document_id": 42,
      "document_title": "Company Handbook 2025",
      "chunk_text": "…employees may work from home up to three days per week with manager approval…",
      "page_number": 15,
      "similarity_score": 0.92,
      "file_url": "gs://turfai-docs/handbook.pdf",
      "signed_url": "https://storage.googleapis.com/turfai-docs/handbook.pdf?X-Goog-Signature=…"
    }
  ],
  "confidence": 0.88,
  "session_id": "sess-1"
}

Render the sources as citation badges so users can verify answers. Each entry carries the document_title it came from, the matched chunk_text (with page_number), a similarity_score (0–1), and a short-lived signed_url (valid ~1 hour) you can link straight to. The public chat passes through the same shape as the underlying RAG query.
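
As a rough illustration of turning the sources array into citation labels, the sketch below formats each entry as a numbered line. The function name and label layout are hypothetical, and treating page_number and signed_url as optional is an assumption; the field names themselves are taken from the response above.

```python
def format_citations(sources: list) -> list:
    """Build one display string per source, numbered in the order returned."""
    lines = []
    for n, src in enumerate(sources, start=1):
        label = f"[{n}] {src['document_title']}"
        if src.get("page_number") is not None:
            label += f", p. {src['page_number']}"
        label += f" (score {src['similarity_score']:.2f})"
        if src.get("signed_url"):
            # The signed URL is short-lived, so render it at answer time rather than caching it.
            label += f" <{src['signed_url']}>"
        lines.append(label)
    return lines
```

For the handbook source in the sample response, without its signed_url, this yields "[1] Company Handbook 2025, p. 15 (score 0.92)".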

Multi-turn memory. Reuse the same session_id across turns and prior turns inform the next answer (so a follow-up like "and for managers?" resolves in context). The first call can auto-create a session; the value is echoed back in the response. Omit session_id to start fresh. The widget manages this for you via localStorage.
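
If you call the public API without the widget, you need to carry session_id yourself. A minimal sketch of that bookkeeping follows; ChatSession and its send callable are hypothetical names, and the only behavior taken from this guide is that the response echoes session_id and that omitting it starts fresh.

```python
from typing import Callable, Optional


class ChatSession:
    """Remembers the session_id echoed by the chat endpoint and resends it on later turns."""

    def __init__(self, send: Callable[[dict], dict], session_id: Optional[str] = None):
        self._send = send  # performs the POST and returns the parsed JSON body
        self.session_id = session_id

    def ask(self, query: str) -> dict:
        payload = {"query": query}
        if self.session_id:
            payload["session_id"] = self.session_id
        reply = self._send(payload)
        # Keep whatever the server echoed so the next turn continues the same thread.
        self.session_id = reply.get("session_id", self.session_id)
        return reply

    def reset(self) -> None:
        """Forget the session so the next question starts a fresh conversation."""
        self.session_id = None
```

Here send would wrap the POST to /chatbots/:slug/chat with the X-Chatbot-Key header; injecting it keeps the session logic independent of the HTTP library.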

If the bot is scoped to a collection that has no indexed documents, the endpoint returns early with a safe non-answer rather than hallucinating:

{ "answer": "No indexed documents found in the assigned collection. Please index documents first.", "sources": [], "confidence": 0, "session_id": "sess-1" }
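
A client can tell this non-answer apart from a grounded one by looking at sources and confidence instead of matching the message text. The helper below is a sketch under that assumption; the name is_grounded and the min_confidence knob are hypothetical and not part of the API.

```python
def is_grounded(reply: dict, min_confidence: float = 0.0) -> bool:
    """True when the reply cites at least one source and its confidence exceeds min_confidence."""
    return bool(reply.get("sources")) and reply.get("confidence", 0) > min_confidence
```

A UI might hide the citation area, or show an "index documents first" hint, whenever this returns False.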

4. Embed the widget

Drop the script on any allowed-origin page. The data-chatbot attribute is the chatbot slug.

<script src="https://app.turfai.com/widget.js"
        data-chatbot="support-kb"
        data-api="https://apisandbox.turfai.in"></script>

Or initialize programmatically:

<script src="https://app.turfai.com/widget.js"></script>
<script>
  TurfAIChat.init({
    chatbotSlug: "support-kb",
    apiUrl: "https://apisandbox.turfai.in",
  });
</script>

The widget runs in a shadow DOM (CSS-isolated from the host page), persists sessions in localStorage (keyed by slug), and pulls branding from the public config endpoint (GET /chatbots/:slug/config, no auth), which returns:

{
  "name": "Support KB",
  "slug": "support-kb",
  "welcome_message": "Hi! Ask me anything about our policies.",
  "branding": { "primary_color": "#2563eb", "logo_url": "https://www.example.com/logo.svg" }
}

Customization. Set branding.primary_color and branding.logo_url (and the welcome_message) via create / PUT /chatbots/:id — the widget reads them live from the config endpoint, so changes apply without re-embedding. To restrict where the widget may run, set allowed_origins. The widget itself is open-source and rebuildable from the platform repo (vite build --mode widget emits the IIFE widget.js) if you need a self-hosted or custom-styled build.

Hardening

In v0.5, chatbot public chat is not tokenised by Data Shield — PII in indexed documents (or in questions) can reach the LLM. Don't expose a public chatbot over documents whose PII must never leave your boundary until that path is covered.

allowed_origins / CORS. Set this to the exact origins that may embed the widget (e.g. "https://www.example.com", scheme + host, no trailing slash). The OPTIONS preflight reflects the request origin into Access-Control-Allow-Origin, allows the Content-Type and X-Chatbot-Key headers, and caches for 24h. An empty array (the default) allows any origin — lock it down before going to production.
Rate limiting. rate_limit is the max requests per minute per chatbot (default 60, range 1–1000). Over-limit requests get HTTP 429 with { "error": "Rate limit exceeded", "message": "Maximum N requests per minute exceeded.", "retryAfter": 60 }. Limiting is in-memory and per-chatbot, and requests are counted over a rolling 60-second window.
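
To make "rolling 60-second window" concrete, here is a generic in-memory limiter of that kind. It is not TurfAI's implementation, which this guide does not show; the class name and structure are assumptions chosen to illustrate the behavior.

```python
import time
from collections import deque
from typing import Callable


class RollingWindowLimiter:
    """Allow at most `limit` requests in any trailing `window_s` seconds."""

    def __init__(self, limit: int = 60, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self._clock = clock
        self._hits = deque()  # timestamps of accepted requests, oldest first

    def allow(self) -> bool:
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._hits and now - self._hits[0] >= self.window_s:
            self._hits.popleft()
        if len(self._hits) >= self.limit:
            return False  # a server built this way would answer 429 here
        self._hits.append(now)
        return True
```

With a rolling window, capacity comes back gradually as individual requests age out, not all at once at the top of each minute.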

Rotating the key

Rotate a leaked or lost key with POST /chatbots/:id/regenerate-key (JWT). It returns a fresh one-time api_key and immediately invalidates the old one — any embed still using the old key starts getting 401, so update your embeds promptly.

curl -X POST "$BASE/chatbots/7/regenerate-key" \
  -H "Authorization: Bearer $TURFAI_JWT"
# -> { "data": { …, "api_key": "cb_…new…", "api_key_rotated_at": "…" } }

res = requests.post(f"{BASE}/chatbots/7/regenerate-key", headers=HEAD)
res.raise_for_status()  # fail loudly if the rotation was rejected
new_key = res.json()["data"]["api_key"]
print(new_key)  # update your embeds with this — the old key is now dead

const out = await (await fetch(`${BASE}/chatbots/7/regenerate-key`, {
  method: "POST",
  headers: HEAD,
})).json();
const newKey = out.data.api_key; // update your embeds — old key is now dead

Regeneration may require a step-up (re-authentication) depending on your tenant's security policy. See regenerate-key in the reference.
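
Because the server keeps only a SHA-256 hash of the key, it can check a presented key but cannot show the plaintext again, which is why a lost key has to be rotated. The sketch below illustrates that idea in general terms. The function names, the UTF-8 encoding, and the constant-time comparison are assumptions; the guide states only that a SHA-256 hash is stored.

```python
import hashlib
import hmac


def key_digest(api_key: str) -> str:
    """Hex SHA-256 digest of the key; the plaintext cannot be recovered from it."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def key_matches(presented: str, stored_digest: str) -> bool:
    # Constant-time comparison avoids leaking how much of the digest matched.
    return hmac.compare_digest(key_digest(presented), stored_digest)
```

Under this model any change to the key, including a trailing newline picked up while copying, produces a different digest and therefore a 401.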

Troubleshooting

CORS / preflight failure (browser blocks the request, no response body). The page's origin isn't in allowed_origins. Add the exact origin (scheme + host, no path or trailing slash), or clear allowed_origins to [] while testing. Server-to-server calls (curl, Python, Node) aren't subject to CORS — if curl works but the browser doesn't, it's an origin mismatch.
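
When debugging an origin mismatch, it can help to reduce the page URL to the exact string a browser would send as its origin and compare that against allowed_origins. This is a sketch with hypothetical helper names; the real matching is done by the server, and the empty-list rule is the one stated in this guide.

```python
from urllib.parse import urlsplit


def to_origin(url: str) -> str:
    """Reduce a page URL to scheme://host[:port], with no path or trailing slash."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an http(s) URL: {url!r}")
    return f"{parts.scheme}://{parts.netloc}".lower()


def origin_allowed(page_url: str, allowed_origins: list) -> bool:
    # Per this guide, an empty list means no origin restriction.
    return not allowed_origins or to_origin(page_url) in allowed_origins
```

Note that http and https count as different origins, and so do www.example.com and example.com, so each variant you embed on needs its own entry.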

429 Rate limit exceeded. You exceeded rate_limit requests in the rolling 60s window. Back off and retry after retryAfter seconds, or raise rate_limit via PUT /chatbots/:id (max 1000).
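
A client-side retry for 429 might look like the sketch below. The function name, the three-attempt cap, and the do_request callable (which returns a status code and parsed body) are assumptions; the retryAfter field is taken from the 429 body shown under Hardening.

```python
import time
from typing import Callable


def call_with_backoff(do_request: Callable[[], tuple],
                      max_attempts: int = 3,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call do_request() -> (status_code, body); retry only when the status is 429."""
    for attempt in range(1, max_attempts + 1):
        status, body = do_request()
        if status != 429:
            return body  # other statuses are left for the caller to handle
        if attempt < max_attempts:
            # Wait as long as the server asked; fall back to 60s if the field is absent.
            sleep(body.get("retryAfter", 60))
    raise RuntimeError("still rate limited after retries")
```

Since the limit applies per chatbot, a busy public bot may need a higher rate_limit; client retries alone will not help if every visitor is retrying at once.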

Answers are empty / "no documents found". The bot can only see documents whose processing_status is completed. Confirm indexing finished (poll rag-status) and chunk_count > 0. If the bot is scoped to a collection, make sure that collection contains indexed documents — otherwise it returns the canned "no indexed documents" answer with sources: [].

401 Invalid or missing API key. The X-Chatbot-Key doesn't match the stored hash — usually a stale key after a rotation, a missing header, or a key copied with whitespace. Re-capture it from create / regenerate (it's only shown then). A bot with no key on file (legacy row) also 401s with a "regenerate the key" message — fix it via regenerate-key.

404 Chatbot not found or inactive. Either the slug is wrong or the chatbot's active flag is false. Check the slug, and set active to true via PUT /chatbots/:id if the bot was deactivated.

Reference

Full chatbot endpoints (config, public chat, CRUD, regenerate-key): Chatbot API.
The indexing and query model behind it: Knowledge base & RAG.
Governance coverage for the chat path: Data Shield.