Why PostgreSQL first
PostgreSQL has the richest SQL dialect of any production database: CTEs, window functions, JSONB operators, array aggregations, FILTER clauses, LATERAL joins, recursive queries, and PERCENTILE_CONT. Supporting one dialect deeply — with full schema profiling and validation — produces better results than supporting five dialects superficially.
What Chion generates natively
- CTEs: WITH clauses for cohort analysis, ranked subsets, running totals, recursive hierarchy traversals
- Window functions: ROW_NUMBER, RANK, DENSE_RANK, LAG, LEAD, NTILE, PERCENTILE_CONT, aggregate windows
- JSONB: jsonb_extract_path_text, jsonb_array_elements, containment operators
- Array aggregations: array_agg, unnest, ANY/ALL operators
- Date arithmetic: DATE_TRUNC, EXTRACT, INTERVAL arithmetic, generate_series for time buckets
- FILTER clauses: COUNT(*) FILTER (WHERE condition) for conditional aggregations
Schema profiling for PostgreSQL
Chion reads PostgreSQL-specific metadata: pg_catalog for column types, pg_constraint for foreign keys, pg_index for index information, and information_schema for table structure. This metadata drives the SQL contract — the typed boundary that constrains what the LLM can generate.
What's on the roadmap
MySQL, BigQuery, and Snowflake are planned. Each requires dialect-specific profiling, validation rules, and SQL contract templates. We'd rather ship one dialect that works correctly than four that hallucinate.