Most text-to-SQL is a prompt wrapper. We built a pipeline.

When you type a question, we don't just ask an LLM to write SQL and hope it works. We profile your schema first — every column, every data type, every relationship. That profile becomes a contract. The SQL must conform to the contract or it gets rejected and repaired. Then we run it read-only. You get the chart. You get the SQL. If the number looks wrong, you can check what ran.

How it works: three steps, no guesswork

1. Ask

Type "Revenue by region last quarter" or "Top 10 customers by churn risk." No SQL syntax. No table names.

2. Verify

Before anything touches your database, we check that the SQL is read-only, within budget, and structurally valid against a contract built from your schema.

3. Chart

Every result shows the interactive D3.js chart and the exact SQL that produced it. If the number looks wrong, you can check the query.

Schema profiling: we read your database before writing a single query

Before the LLM sees anything, we run a profiling pass against your database. We read every table, every column, every data type. We compute cardinality — how many distinct values each column has. We sample actual values so the system knows what "Acme Corp" or "Q3 2024" looks like in your data.
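The profiling pass amounts to a per-column probe: row count, distinct-value cardinality, and a sample of real values. A minimal sketch below uses SQLite so it runs self-contained (the product targets PostgreSQL); the function and field names are illustrative, not the shipped API.

```python
import sqlite3

def profile_column(conn, table: str, col: str, sample_size: int = 20) -> dict:
    """One probe per column: row count, cardinality, and a value sample.
    Table/column names come from the schema catalog, not user input."""
    cur = conn.execute(f"SELECT COUNT(*), COUNT(DISTINCT {col}) FROM {table}")
    row_count, cardinality = cur.fetchone()
    cur = conn.execute(f"SELECT DISTINCT {col} FROM {table} LIMIT {int(sample_size)}")
    samples = [r[0] for r in cur.fetchall()]
    return {"table": table, "column": col, "row_count": row_count,
            "cardinality": cardinality, "samples": samples}
```

One probe per column keeps the pass cheap enough to rerun whenever the schema changes.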

Each column gets classified into one of four types: temporal (dates, timestamps), quantitative (numbers you aggregate), categorical (things you group by), or identifier (primary keys, foreign keys). This classification is stored in column_profiles alongside semantic attributes extracted by a separate AI pass that labels each column with a human-readable description.
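As an illustration, the four-way classification might look like the heuristic below. The type lists and the 0.95 distinct-ratio threshold are assumptions made for this sketch.

```python
from enum import Enum

class ColumnType(Enum):
    TEMPORAL = "temporal"          # dates, timestamps
    QUANTITATIVE = "quantitative"  # numbers you aggregate
    CATEGORICAL = "categorical"    # things you group by
    IDENTIFIER = "identifier"      # primary keys, foreign keys

def classify_column(data_type: str, is_key: bool, distinct_ratio: float) -> ColumnType:
    """Heuristic classification from profiled metadata."""
    if is_key:
        return ColumnType.IDENTIFIER
    if data_type in ("date", "timestamp", "timestamptz", "time"):
        return ColumnType.TEMPORAL
    if data_type in ("smallint", "integer", "bigint", "numeric", "real", "double precision"):
        # A numeric column distinct on almost every row behaves like an ID,
        # not a measure (hypothetical 0.95 threshold).
        return ColumnType.IDENTIFIER if distinct_ratio > 0.95 else ColumnType.QUANTITATIVE
    return ColumnType.CATEGORICAL
```

The cardinality computed during profiling is what feeds `distinct_ratio` here.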

The profile also includes value samples — the actual strings and numbers that appear in high-cardinality columns. These samples power entity resolution later in the pipeline. When you say "show me Acme Corp," we need to know that "Acme Corp" exists in your customers.company_name column — not hallucinate a table that doesn't exist.

SQL contract enforcement

The schema profile becomes a contract. Think of it like a type system for SQL generation. The contract defines exactly which tables and columns the SQL is allowed to reference, which aggregations are valid for each column type, and what the row budget is.

Our row budget is 1,000 rows and 12,000 cells. If the query would return more than that, the pipeline coarsens the time grain first (daily → weekly → monthly), then applies TopK ranking, but it never auto-filters date ranges. That's a deliberate design choice — silently dropping dates changes the answer.
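The coarsen-then-TopK logic can be sketched as follows. The budgets and the ordering come from the description above; the grain-to-days factors and the function name are assumptions for the sketch.

```python
GRAIN_DAYS = {"daily": 1, "weekly": 7, "monthly": 30}
COARSEN_ORDER = ["daily", "weekly", "monthly"]

def plan_for_budget(est_rows: int, n_cols: int, grain: str,
                    row_budget: int = 1_000, cell_budget: int = 12_000) -> dict:
    def within(rows):
        return rows <= row_budget and rows * n_cols <= cell_budget

    rows, idx = est_rows, COARSEN_ORDER.index(grain)
    # Step 1: coarsen the time grain (daily -> weekly -> monthly).
    while not within(rows) and idx < len(COARSEN_ORDER) - 1:
        cur, nxt = COARSEN_ORDER[idx], COARSEN_ORDER[idx + 1]
        rows = max(1, rows * GRAIN_DAYS[cur] // GRAIN_DAYS[nxt])
        idx += 1
    # Step 2: still over budget? Rank and keep the top K.
    # Date ranges are never auto-filtered: dropping dates changes the answer.
    top_k = None
    if not within(rows):
        top_k = min(row_budget, cell_budget // n_cols)
        rows = top_k
    return {"grain": COARSEN_ORDER[idx], "top_k": top_k, "est_rows": rows}
```

For example, a 10,000-row daily series over five columns coarsens twice and lands at a monthly grain with no TopK needed.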

If the LLM generates SQL that violates the contract — references a column that doesn't exist, uses an aggregation on a categorical column, or produces a structurally invalid query — the contract rejects it. The pipeline then enters a repair loop of up to 3 automatic cycles. Each cycle feeds the error message, the failing SQL, and the contract back to the model, which tries again. If all 3 cycles fail, the query is blocked and the user gets a clear explanation of what went wrong.
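A minimal sketch of the repair loop, with the LLM call and the contract check injected as callables; the names `generate`, `validate`, and `QueryBlocked` are illustrative.

```python
class QueryBlocked(Exception):
    """Raised when the repair loop exhausts its cycles."""

def generate_with_repair(question, contract, generate, validate, max_cycles=3):
    """generate(question, contract, error, failing_sql) wraps the LLM call;
    validate(sql, contract) returns an error message or None."""
    sql = generate(question, contract, error=None, failing_sql=None)
    error = validate(sql, contract)
    for _ in range(max_cycles):
        if error is None:
            return sql
        # Feed the error, the failing SQL, and the contract back to the model.
        sql = generate(question, contract, error=error, failing_sql=sql)
        error = validate(sql, contract)
    if error is not None:
        raise QueryBlocked(f"blocked after {max_cycles} repair cycles: {error}")
    return sql
```

Injecting both callables keeps the loop itself model-agnostic and easy to test without a live database or LLM.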

Three layers of SQL safety

L1 — Pre-validation

Before SQL reaches the database, we block anything that isn't a SELECT statement. INSERT, UPDATE, DELETE, DROP, ALTER — all rejected at the contract level. This runs in code, not in the LLM.

L2 — Runtime lint

The query gets linted for structural issues: SELECT * is blocked (columns must be explicit), LIMIT is enforced (no unbounded queries), and JOIN conditions are validated against the schema profile.
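For illustration, the three lint checks could be approximated as below. A production linter would parse the SQL with a real parser rather than regexes, and `known_columns` stands in for the schema profile.

```python
import re

def lint(sql: str, known_columns: set) -> list:
    """L2 structural lint sketch: explicit columns, enforced LIMIT,
    JOIN conditions checked against the profiled schema."""
    issues = []
    if re.search(r"select\s+\*", sql, re.IGNORECASE):
        issues.append("SELECT * is blocked: columns must be explicit")
    if not re.search(r"\blimit\s+\d+", sql, re.IGNORECASE):
        issues.append("LIMIT is required: unbounded queries are rejected")
    # Every JOIN ... ON a = b must reference columns from the schema profile.
    for lhs, rhs in re.findall(r"\bon\s+([\w.]+)\s*=\s*([\w.]+)", sql, re.IGNORECASE):
        for ref in (lhs, rhs):
            if ref not in known_columns:
                issues.append(f"JOIN references unknown column: {ref}")
    return issues
```

An empty issue list means the query proceeds to execution; anything else routes back into the repair loop.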

L3 — Post-execution truncation

After execution, results are truncated to the row budget with a disclosure flag. If we cut rows, the chart shows it. The user always knows if they're seeing a subset.
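The truncation step is small enough to show whole; the field names here are illustrative.

```python
def truncate_result(rows: list, row_budget: int = 1_000) -> dict:
    """L3: cut the result to the row budget and set a disclosure flag
    so the chart can tell the user they are seeing a subset."""
    truncated = len(rows) > row_budget
    return {
        "rows": rows[:row_budget],
        "truncated": truncated,   # surfaced in the chart UI
        "total_rows": len(rows),
    }
```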

Read our full security architecture →

Supported databases

PostgreSQL — fully supported, connect in under 2 minutes
BigQuery — on the roadmap
Snowflake — on the roadmap
MySQL — on the roadmap

Ready to try text-to-SQL that works?

Connect your PostgreSQL database and ask your first question in plain English. Free to start — no credit card required.