Resources · Informational
Defending text-to-SQL agents from prompt injection
Chion is an AI SQL workforce. This post explains how its 13-stage SQL pipeline defends against prompt injection.
Text-to-SQL agents present an unusual attack surface. A user-controlled string flows into two trust boundaries inside the same request: first into the LLM prompt that generates SQL, then into the database that executes it. Either path, exploited, gets an attacker further than they should be. We treat both as adversarial by default (even for our own logged-in customers) because the cost of a single rule slipping past validation is unacceptably high for a tool that connects to production PostgreSQL.
This post walks through how Chion defends the 13-step SQL pipeline against prompt injection, what each layer actually catches, and the failure modes we still consider open work.
These defenses apply across Chion's entire surface: the verified SQL generator, the multi-turn conversational analytics layer, and the SQL Skills Generator that auto-distills verified queries into reusable CHION.md skills for Claude Code, Codex, and Cursor. Every skill the generator emits has already been validated by the same triplet described below, so an attacker cannot inject through a generated skill any more than they can through a fresh question.
Defense 1: User input is data, never instructions
Every user-controlled value that gets interpolated into an LLM prompt (the natural-language question, an entity name, a follow-up turn, a keyword filter, a column value rendered for grounding) passes through a single sanitizer with a bounded length cap:
escapePromptDisplay(userInput, { maxLength: 400 })The sanitizer does three things. It strips markdown constructs (##, **, __, ---, backticks, {, }, [, ]) and any XML-like tag via regex; these are the substring shapes that LLMs treat as prompt directives or system-prompt boundaries. It truncates at 400 characters, a hard ceiling that prevents an attacker from burying a directive deep in a long prefix. And it returns a tagged string that the prompt builder always wraps in delimited quotes, so the LLM sees user input as a fenced literal, not as continuation of our instructions.
The sanitization does not persist through stringification. If a sanitized turn gets serialized into conversation history and the history then gets re-interpolated into a future prompt, the protective wrapping is gone. Every re-injection re-runs the sanitizer. The discipline is encoded in the prompt-builder type signature: a raw string cannot reach the LLM without first being run through escapePromptDisplay, and history turns are escaped on every reload.
Defense 2: The SQL validator triplet
Even if a prompt-injection attempt somehow steers the LLM into emitting a destructive query, the two-layer validator (L1 + L2) plus execution-time enforcement (LIMIT wrap + statement_timeout) block execution before the database sees it.
L1: assertReadOnlySelect. The first gate parses the SQL with a Postgres-aware AST and asserts the top-level statement type is SELECT. INSERT, UPDATE, DELETE, DROP, TRUNCATE, ALTER, GRANT, CREATE, and CALL all reject. WITH CTEs are allowed but only when the head statement remains a SELECT; write-side CTEs (WITH … INSERT) are rejected at parse time. Pure read-only set operations like UNION ALL SELECT are correctly allowed; we've had to remind ourselves on review that not every SELECT-shaped surface is a write.
L2: validateQuery regex sweep. The second gate runs a forbidden-keyword regex against the raw text after L1 has cleared the AST. This is deliberately belt-and-suspenders. If an obscure SQL dialect feature lets a mutation past the AST parser, the regex catches the keyword. The duplication caught a real bug once: a CTE interpolation introduced a trailing semicolon that broke L1's single-statement contract. The fix was wrapWithBase(), a 7-step CTE wrapper that strips trailing semicolons and handles WITH / WITH RECURSIVE / leading comments / duplicate-CTE guards before concatenation, preventing single-statement contract violations across the board.
L3: statement_timeout + LIMIT wrap. The third gate is the Postgres connection itself. Every query runs with statement_timeout set per request, against a read-only role with no DDL privileges, with a LIMIT wrapped around the outermost SELECT that caps results at 1,000 rows or 12,000 cells (whichever hits first). Even if both software gates failed, the database refuses to execute long-running or unbounded queries.
Defense 3: Re-escape history on every turn
Multi-turn conversations are where most production prompt-injection attacks live. A user can put a benign question in turn 1, an injection payload in turn 2, and rely on the LLM's tendency to weight recent context heavily. Our discovery layer reloads the entire conversation history on every turn, and the sanitization re-runs from scratch on every reload. There is no cached "already-safe" history. The cost is a few extra milliseconds per request; the benefit is that no malicious payload survives a turn boundary.
Chion's conversational analytics surface applies this on every multi-turn exchange. The fact that the user has been chatting cleanly for ten turns earns them no escape from sanitization on turn eleven.
What this does not catch
A few attack classes survive the layers above and we want to be honest about them.
Schema-shape exfiltration. An attacker who can ask plain-English questions can probe the schema: "what tables have an email column?", "list every table whose name contains 'audit'". This is by design: natural-language schema exploration is a feature, not a bug. The mitigation is enforcing per-user RLS at the database role level, which the validator pipeline cannot bypass.
Column-name hallucination. An attacker who knows your schema can craft a question that nudges the LLM to reference a column that exists but the attacker shouldn't see. Our defense is the typed SQL contract: phase 08 binds columns, roles (x/y/y2/series), aggregation, and grain to a typed shape before SQL generation in phase 09. Columns outside the contract reject at compile time. But the contract is informed by the LLM, which means a sufficiently clever question can broaden the contract's column set. RLS is again the floor: a column outside the user's row-level scope rejects at execution regardless of what the contract permits.
Cell-value leakage through narration. Phase 12 produces a grounded narrative summary, which is the only place LLM output sees actual cell values. We escape every value before interpolation, but the LLM still sees them. For workloads where individual cells are sensitive (PII, health, regulated finance), the narrative phase should be disabled and only the chart + SQL surfaced. We're working on a per-customer toggle.
The defense in one sentence
Treat every user-controlled string as data, sanitize on every entry into a prompt (including history re-injection), validate generated SQL with the two-layer validator (L1 + L2) before execution, and run against a database role with RLS and statement timeouts. None of these is sufficient alone; all of them together constitute the production posture.
The complete security model (vault-encrypted credentials, audit logging, the trust boundaries between LLM and database) is documented on the Chion trust center. The pipeline architecture, including all 13 steps, is on the How it works page. Analysts evaluating the SQL generator specifically should read the analyst-focused walkthrough.