Insights Generator · built on your verified SQL skills

Answers your team can defend

Every insight reuses a query you approved

Chion's Insights Generator answers questions from your team's verified SQL skills and saved queries — it routes each question to logic your team already approved instead of improvising new SQL. Every insight traces back to a verified query, so there are no hallucinations and nothing alters your source-of-truth logic.

No new SQL is generated against your source of truth — only verified skills run.

Most insight tools invent SQL or hide it in a black box. Chion only answers with queries your team has verified.

Your Skills Generator builds the verified library; the Insights Generator turns it into answers. Here is every phase between connecting a database and reading the answer, labeled user-action, LLM, internal, or output.

What you ask

"Revenue by region last quarter." Plain English. No SQL, no schema knowledge required.

What Chion runs

Schema profile → semantic retrieval → typed SQL contract → read-only execution. Same input → same SQL, every time.

What you can audit

The exact SQL appears under every chart. Every claim cites the row that proved it.

Auto-generate skills by uploading SQL queries from your enterprise.

Chion's semantic pipeline distills every verified query into a reusable skill — tagged by department, domain, and role.

How a question routes to a verified query — not a fresh guess

From plain-English question to typed contract to verified SELECT to grounded narrative.

Action Legend

  Setup

    Steps 00

    One-time configuration by the user.

    00

    Connect with read-only credentials. That’s the first constraint we enforce — not the last. We scan your schema, not your data. We never see a row until you ask a question that requires one. PostgreSQL is live today. BigQuery, Snowflake, and MySQL are on the roadmap.

    Row-Level Security (RLS) is honored end-to-end. Chion connects with read-only credentials stored in an encrypted vault. Your tables are never copied out of the database; only schema metadata and query results pass through the pipeline.

  Automatic Profiling

    Steps 01–04

    Runs once per data source. No user involvement after setup.

    01

    Connects to your database and scans every table, column, and constraint. Builds a complete structural map before any AI touches the data.

    All scans respect your RLS policies. Foreign keys, indexes, and constraint metadata are captured to inform join paths downstream.

    02

    Measures cardinality, null rates, min/max ranges, and value distributions for every column. Classifies each as temporal, metric, categorical, or identifier.
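
A minimal sketch of this profiling step, written over in-memory values rather than SQL aggregates and leaving out value distributions. The `ColumnProfile` type, the `profile_column` function, and the classification rules (an `_id` name suffix, Python value types) are assumptions made for illustration; the text does not say how Chion actually classifies columns.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class ColumnProfile:
    cardinality: int   # number of distinct non-null values
    null_rate: float   # share of NULLs, from 0.0 to 1.0
    minimum: object
    maximum: object
    kind: str          # "temporal", "metric", "categorical", or "identifier"


def profile_column(name, values):
    """Profile one column; assumes non-null values share a comparable type."""
    non_null = [v for v in values if v is not None]
    null_rate = 1 - len(non_null) / len(values) if values else 0.0

    # Hypothetical classification rules, checked in a fixed order.
    if name == "id" or name.endswith("_id"):
        kind = "identifier"
    elif non_null and all(isinstance(v, (date, datetime)) for v in non_null):
        kind = "temporal"
    elif non_null and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    ):
        kind = "metric"
    else:
        kind = "categorical"

    return ColumnProfile(
        cardinality=len(set(non_null)),
        null_rate=null_rate,
        minimum=min(non_null) if non_null else None,
        maximum=max(non_null) if non_null else None,
        kind=kind,
    )
```

Calling `profile_column("amount", [10.0, None, 30.0, 20.0])` yields a metric column with three distinct values and a null rate of 0.25.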

    03

    An LLM reads each column in context and assigns human-readable labels through a semantic layer that maps physical columns (revenue_q) to business-friendly names (Revenue by Quarter). Embeddings are generated for RAG retrieval; this is also where verified queries auto-promote into reusable skills.

    Currently powered by Anthropic Claude. The architecture is model-agnostic: OpenAI, Google, and on-premise options are on the roadmap. The model receives column names, types, and statistical profiles. No raw row data is sent.

    04

    Top values per column are sampled, ranked by frequency, and embedded into a vector index. This powers fuzzy entity matching at query time. The full semantic layer is exportable as a portable CHION.md skillpack that drops into Claude Code, Codex, or Cursor.

    Embeddings are generated via the same LLM provider used in Step 3. Vectors are stored in Supabase pgvector for sub-50ms retrieval.
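
The frequency ranking in this step can be sketched in a few lines. The `top_values` function and its tie-breaking rule are hypothetical, and the embedding call is left out because it depends on the provider.

```python
from collections import Counter


def top_values(values, k=3):
    """Return the k most frequent non-null values, most frequent first."""
    counts = Counter(v for v in values if v is not None)
    # Sort by count descending, then by value text so ties stay deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], str(item[0])))
    return [value for value, _ in ranked[:k]]
```

For a `region` column holding EMEA three times, APAC twice, and AMER once, `top_values(values, k=2)` returns EMEA followed by APAC; those are the strings that would then be embedded.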

  Query Time

    Steps 05–10

    Triggered each time the user asks a question.

    05

    You type a plain-English question, such as "Revenue by region last quarter" or "Top 10 customers by churn risk." This is the second and final user interaction.

    06

    Natural-language intent extraction through 7 discovery strategies: entity lookup, comparison, top-K ranking, time-bounded analysis, and more. Follow-up questions stay in conversational context so you can refine results without restating the data shape.

    The LLM classifies query type, extracts hard constraints, and identifies entities. All strategies run in parallel for speed.

    07

    User terms are resolved to physical columns via vector search. An adaptive scorer fills a 1,200-token context window: no hard thresholds, just natural score breaks.

    Vector search uses the embeddings built in Steps 3–4. The adaptive scorer uses distribution analysis and normalization, not a fixed cutoff.
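
One way to read "natural score breaks" is a largest-gap cut: rank the candidates, normalize their scores, and stop at the biggest drop between neighbours. The sketch below does that and then applies the 1,200-token budget. The `Candidate` type, the min-max normalization, and the largest-gap rule are assumptions; only the budget figure comes from the text.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    column: str
    score: float   # similarity returned by vector search
    tokens: int    # estimated prompt cost of including this column


def select_context(candidates, budget=1200):
    """Keep candidates above the largest score gap, within the token budget."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    if not ranked:
        return []

    # Min-max normalize so the gap comparison ignores the raw score scale.
    high, low = ranked[0].score, ranked[-1].score
    span = (high - low) or 1.0
    norm = [(c.score - low) / span for c in ranked]

    # The largest drop between neighbours marks the break; no fixed cutoff.
    cut = len(ranked)
    if len(ranked) > 1:
        gaps = [norm[i] - norm[i + 1] for i in range(len(norm) - 1)]
        if max(gaps) > 0:
            cut = gaps.index(max(gaps)) + 1

    chosen, used = [], 0
    for candidate in ranked[:cut]:
        if used + candidate.tokens > budget:
            break
        chosen.append(candidate)
        used += candidate.tokens
    return chosen
```

With scores of 0.91, 0.88, 0.85, 0.40, and 0.35, the break falls after the third candidate; the budget then decides how many of those three fit.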

    08

    The system evaluates the query structure, result shape, and number of series to select the best chart type. Compatibility scoring ranks 8 chart types: single line, multi-series line, dual-axis line, stacked area, standard bar, grouped bar, stacked bar, and animated bar race.

    Each chart type has a compatibility function that scores the match (≥0.7 strong, ≥0.4 acceptable). The fallback chain is line → bar → scatter → table. Selection is deterministic: same data shape always produces the same chart type. Dual-axis and multi-series layouts are assigned automatically based on series count and scale divergence.
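
A sketch of how deterministic selection with these thresholds might look. The 0.7 and 0.4 thresholds and the fallback order come from the text; the `ResultShape` fields, the four scorers shown (out of eight chart types), and the fallback eligibility rules are hypothetical.

```python
from dataclasses import dataclass

STRONG, ACCEPTABLE = 0.7, 0.4                          # thresholds from the text
FALLBACK_CHAIN = ("line", "bar", "scatter", "table")   # order from the text


@dataclass(frozen=True)
class ResultShape:
    has_time_axis: bool
    series_count: int
    row_count: int


def score_single_line(s):
    return 0.9 if s.has_time_axis and s.series_count == 1 else 0.1


def score_multi_line(s):
    return 0.9 if s.has_time_axis and s.series_count > 1 else 0.1


def score_standard_bar(s):
    if s.has_time_axis or s.series_count != 1:
        return 0.3
    return 0.8 if s.row_count <= 30 else 0.5  # many bars are harder to read


def score_grouped_bar(s):
    return 0.8 if not s.has_time_axis and s.series_count > 1 else 0.2


# A tuple keeps evaluation order fixed, so ties always resolve the same way.
SCORERS = (
    ("single line", score_single_line),
    ("multi-series line", score_multi_line),
    ("standard bar", score_standard_bar),
    ("grouped bar", score_grouped_bar),
)

# Hypothetical minimum requirements for each fallback chart.
FALLBACK_OK = {
    "line": lambda s: s.has_time_axis and s.row_count >= 2,
    "bar": lambda s: s.row_count >= 2,
    "scatter": lambda s: s.series_count >= 2 and s.row_count >= 2,
    "table": lambda s: True,
}


def select_chart(shape):
    """Return (chart_type, match_level) for a result shape."""
    scored = [(name, scorer(shape)) for name, scorer in SCORERS]
    best_name, best_score = max(scored, key=lambda pair: pair[1])
    if best_score >= STRONG:
        return best_name, "strong"
    if best_score >= ACCEPTABLE:
        return best_name, "acceptable"
    for name in FALLBACK_CHAIN:
        if FALLBACK_OK[name](shape):
            return name, "fallback"
```

Because the scorers are pure functions of the result shape and the order is fixed, the same shape always maps to the same chart type.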

    09

    A typed SQL contract is generated (allowed columns, filters, aggregations, grain, and ordering) bound to the live schema and RLS policy. The SQL generator cannot introduce out-of-contract elements.

    The contract enforces RLS-aware column access. Out-of-contract references are a hard failure, not a silent omission. This contract layer is what makes the pipeline model-agnostic: the LLM generates SQL against the contract, not against your data.
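
To make the idea of a typed contract concrete, here is a minimal sketch. The `SqlContract` fields mirror the elements listed above (columns, aggregations, grain, ordering; filters are omitted for brevity), but the class, the `check` method, and the `ContractViolation` exception are hypothetical names, not Chion's API.

```python
from dataclasses import dataclass


class ContractViolation(Exception):
    """Raised when generated SQL references something outside the contract."""


@dataclass(frozen=True)
class SqlContract:
    table: str
    allowed_columns: frozenset
    allowed_aggregations: frozenset
    grain: tuple = ()      # GROUP BY columns
    order_by: tuple = ()   # ORDER BY columns

    def check(self, columns, aggregations=()):
        """Fail hard on any out-of-contract column or aggregation."""
        bad_columns = sorted(set(columns) - self.allowed_columns)
        bad_aggs = sorted(
            {a.upper() for a in aggregations} - self.allowed_aggregations
        )
        if bad_columns or bad_aggs:
            raise ContractViolation(
                f"out of contract: columns={bad_columns}, aggregations={bad_aggs}"
            )


contract = SqlContract(
    table="orders",
    allowed_columns=frozenset({"region", "revenue", "order_date"}),
    allowed_aggregations=frozenset({"SUM", "COUNT"}),
    grain=("region",),
    order_by=("revenue",),
)
contract.check(["region", "revenue"], ["sum"])  # inside the contract: passes
```

A reference to a column such as `email` raises `ContractViolation` instead of being dropped silently, which matches the hard-failure behaviour described above.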

    10

    Read-only SQL generation and verification. SQL is generated against the typed contract, lint-checked for safety, and verified. A 3-layer defense enforces row budgets, blocks SELECT *, and caps output at 1,000 rows / 12,000 cells. Every chart you see is rendered from the SQL underneath, never paraphrased.

    Pre-validation enforces a 200-bucket budget. Runtime lint blocks SELECT * and enforces LIMIT. Post-execution truncation surfaces a disclosure to the user. All queries are SELECT-only, enforced by a parser-level read-only check. The same verification stack runs regardless of which model generated the SQL. Provider choice affects latency and cost, not security posture.
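
The lint and truncation layers can be sketched as below. This version uses regular expressions as a stand-in for the parser-level check described above, so it can misfire on keywords inside string literals and only catches `*` directly after SELECT. The function names, the `UnsafeSql` exception, and the choice to append a LIMIT when one is missing are assumptions; the row and cell caps come from the text.

```python
import re

MAX_ROWS, MAX_CELLS = 1_000, 12_000  # caps from the text
FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|DROP|TRUNCATE|ALTER|GRANT|REVOKE)\b", re.IGNORECASE
)


class UnsafeSql(Exception):
    """Raised when a statement fails the read-only lint."""


def lint_sql(sql):
    """Return a single SELECT statement with a LIMIT, or raise UnsafeSql."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise UnsafeSql("multiple statements are not allowed")
    if not re.match(r"SELECT\b", statement, re.IGNORECASE):
        raise UnsafeSql("only SELECT statements are allowed")
    if FORBIDDEN.search(statement):
        raise UnsafeSql("write or DDL keyword found")
    if re.search(r"\bSELECT\s+(DISTINCT\s+)?(\w+\.)?\*", statement, re.IGNORECASE):
        raise UnsafeSql("SELECT * is blocked; list columns explicitly")
    limit = re.search(r"\bLIMIT\s+(\d+)\s*$", statement, re.IGNORECASE)
    if limit is None:
        return f"{statement} LIMIT {MAX_ROWS}"
    if int(limit.group(1)) > MAX_ROWS:
        raise UnsafeSql(f"LIMIT exceeds {MAX_ROWS} rows")
    return statement


def truncate_result(rows, max_rows=MAX_ROWS, max_cells=MAX_CELLS):
    """Return (kept_rows, truncated) so the caller can disclose truncation."""
    kept, cells = [], 0
    for row in rows[:max_rows]:
        if cells + len(row) > max_cells:
            break
        kept.append(row)
        cells += len(row)
    return kept, len(kept) < len(rows)
```

A 20-column result hits the 12,000-cell cap at 600 rows, well before the 1,000-row cap, and the `truncated` flag is what would drive the disclosure shown to the user.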

  Results

    Steps 11–12

    Delivered back to the user.

    11

    SQL runs against your live database with RLS enforced. If it fails, up to 3 repair cycles rewrite the query with error context. Still failing? A hard failure with actionable suggestions.

    Repairs are deterministic rewrites using the error message and contract, not random retries. Credentials are vault-resolved per execution.
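
The bounded repair loop can be sketched as follows. The `execute` and `repair` callables are placeholders supplied by the caller, and `run_with_repairs` and `QueryFailed` are hypothetical names; only the limit of 3 repair cycles and the use of error context come from the text.

```python
class QueryFailed(Exception):
    """Raised when the query still fails after every repair cycle."""


def run_with_repairs(sql, execute, repair, max_repairs=3):
    """Run sql; on failure, rewrite it with the error text up to max_repairs times.

    Returns (result, attempts), where attempts lists every SQL string tried.
    """
    attempts = [sql]
    for cycle in range(max_repairs + 1):
        try:
            return execute(attempts[-1]), attempts
        except Exception as error:  # a real system would catch driver errors only
            if cycle == max_repairs:
                raise QueryFailed(
                    f"failed after {max_repairs} repair cycles: {error}"
                ) from error
            # The repair step sees the failing SQL and the database error text.
            attempts.append(repair(attempts[-1], str(error)))
```

Keeping the list of attempts makes each rewrite auditable alongside the error that triggered it.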

    12

    Results render as interactive D3.js charts with zoom, pan, and dual-axis support. A grounded narrative references actual data. No hallucinated summaries.

    The narrative is generated by an LLM but grounded in the actual query results. Every claim traces back to a row. Chart type was selected in Step 8; this step renders it with full interactivity.
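
One simple way to test that a narrative is grounded is to check every figure it quotes against the result rows. The sketch below is a hypothetical check, not Chion's method: it only matches literal values, so derived figures such as sums or percentages would be flagged unless they appear in the result set.

```python
import re

# A number not glued to a preceding letter or digit, e.g. "1,250,000" or "12.5".
NUMBER = re.compile(r"(?<![A-Za-z\d])\d[\d,]*(?:\.\d+)?")


def ungrounded_figures(narrative, rows):
    """Return figures quoted in the narrative that appear in no result row."""
    known = {
        float(value)
        for row in rows
        for value in row
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
    quoted = [float(m.replace(",", "")) for m in NUMBER.findall(narrative)]
    return [figure for figure in quoted if figure not in known]
```

An empty return value means every quoted figure was found in at least one row; anything else is a candidate for rejection or regeneration.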

Every insight reuses a query your team already verified

No invented SQL — it reuses logic you approved

Three-layer SQL defense. 1k-row / 12k-cell caps. Read-only, always.

Grounded narratives

Every figure traces to a row. No hallucinated summaries, ever.

Reusable skills

Verified queries compound into a shared, RLS-aware library your team owns.

Want to see this run on your data? Start a free trial →

Common questions

Quick answers, each tied to a pipeline phase described above.

How does Chion turn a question into SQL?

During the SQL contract step (09), a typed SQL contract is generated against your live schema. The LLM emits SQL bound to that contract; out-of-contract column references are rejected before generation completes.

What stops Chion from running a destructive query?

Three layers of read-only enforcement: a parser-level SELECT-only check, a runtime lint that blocks SELECT * and forces LIMIT, and a LIMIT wrap plus statement_timeout at execution. Any INSERT, UPDATE, DELETE, DROP, TRUNCATE, ALTER, GRANT, or REVOKE statement aborts before reaching your database. Detail in the SQL generation and verification step.

Where do skills come from?

Auto-generated from every query you run. The semantic analysis step reads each touched column, assigns business-friendly labels via a semantic layer, and embeds them for RAG retrieval. Every verified query becomes a candidate skill in your library, tagged by department, domain, and role. Promotion frequency surfaces as confidence badges (✓ / ✓✓ / ✓✓✓) in the auto-generated index.

Does Chion respect Row-Level Security?

Yes. RLS policies are honored end-to-end on every query, enforced during execution. If a row is hidden from the role Chion connects with, it is hidden from Chion. We recommend pointing Chion at a read replica with a dedicated read-only role.

Can I see the SQL?

Yes. Every chart shows the verified SQL underneath. You can copy it into any PostgreSQL client and re-run it yourself. If a number looks wrong, the SQL is the first thing to inspect. No SQL is hidden from the user.

Schema profiling
Schema ingestion works against supported PostgreSQL providers (Aurora, Azure, GCP, Neon, Supabase). Provider-specific setup notes and connection limits live on each integration page.

Validator & credential safety
Credentials never leave the read-only credential vault and AES-256-GCM encryption boundary; the validator enforces SELECT-only access on every query.