Knowledge base & RAG

The TurfAI knowledge base lets you index documents as vector embeddings and query them in natural language, with answers that cite their sources. This is the RAG pattern: retrieve relevant chunks, then generate an answer grounded in them.

What you'll build

A working "ask your documents" loop: enable RAG on an uploaded document, wait for it to index, then ask questions and get an answer with cited sources. By the end you'll have run the three calls that power every RAG feature in TurfAI — including chatbots and the rag_query workflow task.

Prerequisites

A JWT in $TURFAI_JWT — see Authentication. All examples send Authorization: Bearer $TURFAI_JWT.
One or more uploaded documents you own. You index by document id (a number), so have an id ready — e.g. 45 below. Uploading is out of scope here; see the Documents API.
Base URL: https://apisandbox.turfai.in/api.

Indexing is asynchronous. You enable RAG, the document is queued and embedded in the background, and only once its status is completed will queries return its chunks. The flow below is enable → poll → query.

The indexing flow

rag_processing_status (reported as processing_status by the rag-status endpoint) moves through not_started → queued → processing → completed, or ends in failed if embedding errors out. Don't query until it's completed.
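
The lifecycle above can be captured in a few lines of client-side code. The following is a minimal sketch, not part of any TurfAI SDK: the names RagStatus, is_terminal and can_query are hypothetical helpers, and only the five status strings come from this guide.

```python
from enum import Enum


class RagStatus(str, Enum):
    """The five status values this guide lists for a document."""
    NOT_STARTED = "not_started"
    QUEUED = "queued"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Terminal states: once one of these is reached, polling can stop.
TERMINAL = {RagStatus.COMPLETED, RagStatus.FAILED}


def is_terminal(status: str) -> bool:
    """True when indexing has finished, successfully or not."""
    return RagStatus(status) in TERMINAL


def can_query(status: str) -> bool:
    """Only a completed document contributes chunks to queries."""
    return RagStatus(status) is RagStatus.COMPLETED
```

Passing a string outside the five listed values raises ValueError, which makes an unexpected status visible instead of silently looping on it.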

1. Enable RAG on a document

POST /documents/:id/enable-rag queues the document for embedding and returns immediately. Pass force_reprocess: true to re-index a document that's already completed.

BASE="https://apisandbox.turfai.in/api"
DOC_ID=45

curl -X POST "$BASE/documents/$DOC_ID/enable-rag" \
  -H "Authorization: Bearer $TURFAI_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "force_reprocess": false }'

import os, requests

BASE = "https://apisandbox.turfai.in/api"
HEAD = {"Authorization": f"Bearer {os.environ['TURFAI_JWT']}", "Content-Type": "application/json"}
DOC_ID = 45

r = requests.post(f"{BASE}/documents/{DOC_ID}/enable-rag", headers=HEAD,
                  json={"force_reprocess": False})
r.raise_for_status()
print(r.json())  # {"status": "queued", "job_id": "...", "document_id": "45", ...}

const BASE = "https://apisandbox.turfai.in/api";
const HEAD = {
  Authorization: `Bearer ${process.env.TURFAI_JWT}`,
  "Content-Type": "application/json",
};
const DOC_ID = 45;

const res = await fetch(`${BASE}/documents/${DOC_ID}/enable-rag`, {
  method: "POST",
  headers: HEAD,
  body: JSON.stringify({ force_reprocess: false }),
});
console.log(await res.json()); // { status: "queued", job_id: "...", document_id: "45" }

{
  "status": "queued",
  "job_id": "rag-embed-45-1718800000000",
  "document_id": "45",
  "message": "Document queued for RAG processing"
}

Calling enable-rag on a document that's already processing (or already completed without force_reprocess) returns a 400. That's expected — treat it as "already indexing / indexed".
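
One way to encode that rule is to classify the HTTP result before deciding what to do next. This is a sketch under assumptions: enable_rag_outcome is a hypothetical helper, the success case is assumed to be any 2xx status carrying "status": "queued", and a 400 is read the way this guide describes it. Since a 400 can in principle also mean a malformed request, confirming with rag-status afterwards is the safer follow-up.

```python
def enable_rag_outcome(status_code: int, body: dict) -> str:
    """Classify an enable-rag result.

    Returns "queued" for a freshly queued document and
    "already_indexing_or_indexed" for the expected 400 case.
    Anything else raises, because the guide does not describe it.
    """
    if 200 <= status_code < 300 and body.get("status") == "queued":
        return "queued"
    if status_code == 400:
        # Expected per the guide: the document is processing or completed.
        return "already_indexing_or_indexed"
    raise RuntimeError(f"enable-rag failed: HTTP {status_code}: {body}")
```

With the requests example above, this could be called as enable_rag_outcome(r.status_code, r.json()) in place of r.raise_for_status(); in both non-error outcomes the next step is to poll rag-status.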

2. Poll the RAG status

GET /documents/:id/rag-status returns the live processing_status. Poll it until it's completed, then query. On failed, read the error field.

# Poll once; re-run until processing_status == "completed"
curl -s "$BASE/documents/$DOC_ID/rag-status" \
  -H "Authorization: Bearer $TURFAI_JWT"

import time

while True:
    resp = requests.get(f"{BASE}/documents/{DOC_ID}/rag-status", headers=HEAD)
    resp.raise_for_status()  # surface HTTP errors (e.g. an expired JWT) before parsing
    s = resp.json()
    status = s["processing_status"]
    print(status, s.get("chunk_count"))
    if status == "completed":
        break
    if status == "failed":
        raise RuntimeError(s.get("error") or "RAG indexing failed")
    time.sleep(3)  # pause between polls

async function waitForRag(docId: number) {
  while (true) {
    const res = await fetch(`${BASE}/documents/${docId}/rag-status`, { headers: HEAD });
    // Surface HTTP errors (e.g. an expired JWT) before parsing the body.
    if (!res.ok) throw new Error(`rag-status returned HTTP ${res.status}`);
    const s = await res.json();
    if (s.processing_status === "completed") return s;
    if (s.processing_status === "failed") throw new Error(s.error ?? "RAG indexing failed");
    await new Promise((r) => setTimeout(r, 3000)); // pause between polls
  }
}
await waitForRag(DOC_ID);

{
  "document_id": "45",
  "rag_enabled": true,
  "processing_status": "completed",
  "chunk_count": 128,
  "processed_at": "2026-06-19T10:21:44.000Z",
  "error": null,
  "embedding_model": "text-embedding-004"
}

processing_status is one of not_started, queued, processing, completed, failed.
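
A polling loop with no upper bound will spin forever if a document never leaves processing. The sketch below adds a deadline and keeps the loop independent of the HTTP client. wait_for_rag, fetch_status, and the 600-second and 3-second defaults are illustrative assumptions, not TurfAI-defined values.

```python
import time
from typing import Callable


def wait_for_rag(fetch_status: Callable[[], dict],
                 timeout_s: float = 600.0,
                 interval_s: float = 3.0,
                 sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until processing_status is completed.

    fetch_status is any zero-argument callable that returns the parsed
    rag-status JSON. Raises RuntimeError on failed and TimeoutError if
    the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        s = fetch_status()
        status = s["processing_status"]
        if status == "completed":
            return s
        if status == "failed":
            raise RuntimeError(s.get("error") or "RAG indexing failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout_s}s")
        sleep(interval_s)
```

With the requests setup from step 1, fetch_status could be lambda: requests.get(f"{BASE}/documents/{DOC_ID}/rag-status", headers=HEAD).json(). Injecting the sleep function makes the loop easy to exercise in tests without real delays.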

Indexing inside a workflow

Enabling RAG mid-workflow uses the rag_enable task. Because indexing is async, follow it with a wait task that polls the rag-status endpoint until it's completed before any rag_query step runs:

{
  "type": "wait",
  "config": {
    "endpoint": "/api/internal/documents/{{document_id}}/rag-status",
    "method": "GET",
    "watch_field": "processing_status",
    "success_value": "completed",
    "failure_values": ["failed"]
  }
}

3. Query the knowledge base

POST /rag/query retrieves the most relevant chunks across the documents you own and generates a cited answer.

curl -X POST "$BASE/rag/query" \
  -H "Authorization: Bearer $TURFAI_JWT" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the remote work policy?",
    "top_k": 5,
    "use_reranking": true,
    "similarity_threshold": 0.3
  }'

r = requests.post(f"{BASE}/rag/query", headers=HEAD, json={
    "query": "What is the remote work policy?",
    "top_k": 5,
    "use_reranking": True,
    "similarity_threshold": 0.3,
})
r.raise_for_status()  # fail loudly on HTTP errors instead of a KeyError below
data = r.json()
print(data["answer"])
# One line per cited chunk: title, rounded score, and a browser-openable link.
for s in data["sources"]:
    print(s["document_title"], round(s["similarity_score"], 2), s.get("signed_url"))

const res = await fetch(`${BASE}/rag/query`, {
  method: "POST",
  headers: HEAD,
  body: JSON.stringify({
    query: "What is the remote work policy?",
    top_k: 5,
    use_reranking: true,
    similarity_threshold: 0.3,
  }),
});
const data = await res.json();
console.log(data.answer);
data.sources.forEach((s: any) => console.log(s.document_title, s.similarity_score));

Response

{
  "answer": "Employees may work remotely up to 3 days per week, subject to manager approval…",
  "sources": [
    {
      "document_id": 45,
      "document_title": "Remote Work Policy.pdf",
      "chunk_text": "…employees may work from home up to three days per week with manager approval…",
      "chunk_index": 12,
      "similarity_score": 0.92,
      "page_number": 3,
      "file_url": "gs://turfai-docs/policy.pdf",
      "signed_url": "https://storage.googleapis.com/turfai-docs/policy.pdf?X-Goog-Signature=…",
      "metadata": {}
    }
  ],
  "confidence": 0.88,
  "processing_time_ms": 742,
  "query": "What is the remote work policy?",
  "session_id": "sess-abc123",
  "timestamp": "2026-06-19T10:25:01.000Z"
}

Each entry in sources carries the document_id and document_title it came from, the matched chunk_text (with chunk_index / page_number), and a similarity_score (0–1). file_url is the raw gs:// path; the API also returns a short-lived signed_url (valid ~1 hour) you can link users straight to.
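
To show how those fields might be turned into user-facing citations, here is a small sketch. format_citations and its min_score parameter are hypothetical; the function reads only fields that appear in the sample response above and prefers signed_url over the raw gs:// file_url.

```python
def format_citations(response: dict, min_score: float = 0.0) -> list:
    """Render one human-readable citation line per source.

    Sources scoring below min_score are dropped before numbering.
    """
    kept = [s for s in response.get("sources", [])
            if s["similarity_score"] >= min_score]
    lines = []
    for i, src in enumerate(kept, start=1):
        # signed_url opens in a browser; file_url is only a storage path.
        link = src.get("signed_url") or src.get("file_url") or ""
        page = src.get("page_number")
        where = f", p. {page}" if page is not None else ""
        line = (f"[{i}] {src['document_title']}{where} "
                f"(score {src['similarity_score']:.2f}) {link}")
        lines.append(line.rstrip())
    return lines
```

Because the signed URL is short-lived, it is probably best to render these lines at answer time rather than store them.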

Request knobs

Field	Default	What it does
`top_k`	`5`	Number of chunks to retrieve (1–20). Raise for broad questions, lower for precision.
`similarity_threshold`	`0.3`	Drop chunks below this cosine score. Raise to cut noise, lower to recover misses (e.g. multi-lingual docs).
`use_reranking`	`false`	Re-rank retrieved chunks with a cross-encoder before generating — better ordering at a small latency cost.
`filters`	—	Scope the search: `{ "document_ids": [45, 46] }`, `collection_ids`, or `tenant_id`.
`session_id`	—	Multi-turn context (see below). Omit to start fresh.
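
A request builder can apply the table's defaults and reject out-of-range values before anything is sent. This is a sketch, not an official client: build_rag_query is a hypothetical name, the 1–20 bound on top_k comes from the table, and the 0–1 bound on similarity_threshold and the empty-query check are assumptions about sensible client-side validation.

```python
from typing import Optional


def build_rag_query(query: str,
                    top_k: int = 5,
                    similarity_threshold: float = 0.3,
                    use_reranking: bool = False,
                    document_ids: Optional[list] = None,
                    session_id: Optional[str] = None) -> dict:
    """Assemble a /rag/query body using the documented defaults."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if not 1 <= top_k <= 20:
        raise ValueError("top_k must be between 1 and 20")
    if not 0.0 <= similarity_threshold <= 1.0:  # assumed range
        raise ValueError("similarity_threshold must be between 0 and 1")
    body = {
        "query": query,
        "top_k": top_k,
        "similarity_threshold": similarity_threshold,
        "use_reranking": use_reranking,
    }
    # Optional fields are omitted entirely when unset.
    if document_ids:
        body["filters"] = {"document_ids": list(document_ids)}
    if session_id:
        body["session_id"] = session_id
    return body
```

For example, build_rag_query("What is the remote work policy?", use_reranking=True, document_ids=[45, 46]) yields a body that can be passed as the json= argument in the requests call above.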

Multi-turn sessions

For conversational Q&A, pass a session_id so prior turns inform the next answer. The first query auto-creates a session (returned as session_id); reuse it on follow-ups. You can also pre-create one with POST /rag/sessions, and list / rename / delete sessions under /rag/sessions.

# Turn 1 — note the session_id in the response
curl -s -X POST "$BASE/rag/query" \
  -H "Authorization: Bearer $TURFAI_JWT" -H "Content-Type: application/json" \
  -d '{ "query": "How many remote days are allowed?" }'

# Turn 2 — pass that session_id so "and for managers?" resolves in context
curl -s -X POST "$BASE/rag/query" \
  -H "Authorization: Bearer $TURFAI_JWT" -H "Content-Type: application/json" \
  -d '{ "query": "And for managers?", "session_id": "sess-abc123" }'

first = requests.post(f"{BASE}/rag/query", headers=HEAD,
                      json={"query": "How many remote days are allowed?"}).json()
sid = first["session_id"]

followup = requests.post(f"{BASE}/rag/query", headers=HEAD,
                         json={"query": "And for managers?", "session_id": sid}).json()
print(followup["answer"])

const first = await (await fetch(`${BASE}/rag/query`, {
  method: "POST", headers: HEAD,
  body: JSON.stringify({ query: "How many remote days are allowed?" }),
})).json();

const followup = await (await fetch(`${BASE}/rag/query`, {
  method: "POST", headers: HEAD,
  body: JSON.stringify({ query: "And for managers?", session_id: first.session_id }),
})).json();
console.log(followup.answer);
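
For longer conversations, threading session_id by hand gets repetitive. The sketch below wraps that bookkeeping in a small class. RagConversation and post_query are hypothetical names; post_query stands for any callable that sends a request body to /rag/query and returns the parsed JSON.

```python
from typing import Callable, Optional


class RagConversation:
    """Carries session_id from one /rag/query turn to the next."""

    def __init__(self, post_query: Callable[[dict], dict],
                 session_id: Optional[str] = None):
        self._post = post_query        # callable: request body -> parsed JSON
        self.session_id = session_id   # None until the first response arrives

    def ask(self, query: str, **knobs) -> dict:
        """Send one turn; extra keyword arguments become request fields."""
        body = {"query": query, **knobs}
        if self.session_id:
            body["session_id"] = self.session_id
        data = self._post(body)
        # Remember the session the API created (first turn) or echoed back.
        self.session_id = data.get("session_id", self.session_id)
        return data
```

With the requests setup from step 1, post_query could be lambda body: requests.post(f"{BASE}/rag/query", headers=HEAD, json=body).json(). Creating a new RagConversation starts a fresh session.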

In workflows and chatbots

rag_query task — query the knowledge base mid-workflow (e.g. employee Q&A that emails the answer).
Chatbots — a chatbot is a deployable RAG front end with an embeddable widget and source citations; see Build a chatbot.

The RAG chat path is not tokenised by Data Shield in v0.5 — RAG chat and chatbot public chat are explicitly out of Data Shield's coverage. Don't index documents whose PII must never reach the LLM until that path is covered.

Troubleshooting

No results, or the answer says it can't find anything. Your similarity_threshold may be too high — lower it (e.g. 0.2) to recover near-misses, especially for multi-lingual documents. Also confirm the document's processing_status is completed and chunk_count > 0.
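
Lowering the threshold can also be automated as a fallback. The sketch below assumes that a miss shows up as an empty sources list, which this guide does not state explicitly. query_with_fallback, post_query, and the threshold ladder are illustrative choices, and used_threshold is a local annotation rather than an API field.

```python
from typing import Callable, Sequence


def query_with_fallback(post_query: Callable[[dict], dict],
                        query: str,
                        thresholds: Sequence[float] = (0.3, 0.25, 0.2)) -> dict:
    """Retry with progressively lower similarity_threshold until sources appear.

    Returns the first response that has sources, or the last response
    if every threshold comes back empty.
    """
    if not thresholds:
        raise ValueError("thresholds must not be empty")
    data: dict = {}
    for t in thresholds:
        data = post_query({"query": query, "similarity_threshold": t})
        if data.get("sources"):
            data["used_threshold"] = t  # record which threshold produced hits
            break
    return data
```

Each retry costs a full query, so a short ladder is probably enough; if the lowest threshold still returns nothing, checking processing_status and chunk_count is the next step.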

Irrelevant or poorly-ordered sources. Set use_reranking: true to re-order chunks with a cross-encoder, and raise similarity_threshold to cut weak matches. Narrow the search with filters.document_ids when you know which documents should answer the question.

Document stuck in processing. Embedding runs in the background; large PDFs take longer. Keep polling rag-status. If it doesn't advance, re-run enable-rag with force_reprocess: true.

Status failed. Read the error field from rag-status for the cause. Fix the input (or backend config) and re-enable with force_reprocess: true.

Scanned PDFs / images. PDFs are processed with the default Google File Search backend's native vision, which extracts text from scanned pages — but OCR behaviour for scanned PDFs isn't formally guaranteed, so verify with a real sample. Standalone image files (JPEG/PNG) aren't a supported file type; convert them to PDF before indexing. See Google File Search file handling for the supported-type matrix.

Reference

Full endpoints — enable-rag, disable-rag, rag-status, /rag/query, and sessions: RAG API.
The dedicated RAG Query Service API (the /rag/query call proxies to it).
Backend setup (Google File Search vs pgvector) is operator-side and lives in the platform docs.
