Toggling ANSI SQL
Problem
You want to ensure your SQL statements are evaluated using the ANSI-compliant SQL dialect.
Solution
If you have a SQL notebook, add the following statement to the beginning of your notebook:
SET ANSI_MODE = true;
If you have a Python notebook, add the following statement:
spark.conf.set("spark.sql.ansi.enabled", "true")
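In either case, the setting applies to the current session. You can confirm it took effect from Python with a minimal check (assuming the spark session that Databricks notebooks provide automatically):
# Should print "true" once ANSI mode is enabled for this session.
print(spark.conf.get("spark.sql.ansi.enabled"))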
Discussion
If you're running queries that rely on ANSI SQL functions, make sure ANSI mode is enabled; otherwise, invalid inputs to SQL operators and functions return NULL instead of raising an error, which can silently produce incorrect results in your Spark/Databricks queries.
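For example, here is a minimal sketch (assuming a Python notebook and the built-in spark session) of the silent failure mode that ANSI mode prevents:
# With ANSI mode off, an invalid cast quietly becomes NULL
# instead of failing, which can corrupt downstream results.
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT CAST('hello' AS INT) AS value").show()  # value is NULL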
Important
- On Databricks Clusters, the spark.sql.ansi.enabled configuration is not enabled by default.
- On Databricks SQL Warehouses, spark.sql.ansi.enabled is enabled by default.
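If you want ANSI mode on by default for an entire cluster rather than per session, the Spark configuration documentation linked below describes adding properties to the cluster's Spark config, one space-separated key-value pair per line, for example:
spark.sql.ansi.enabled true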
Note
What's the difference between a Databricks Cluster and a Databricks SQL Warehouse?
Databricks Clusters are general-purpose compute resources that support multiple languages and use cases, while Databricks SQL warehouses are optimized specifically for SQL analytics.
Additionally, Databricks SQL Warehouses use the Photon runtime, which is a rewrite of the Spark engine in C++ instead of Scala/JVM. The initial launch of Photon focused mainly on improving the performance of Spark SQL.
When spark.sql.ansi.enabled is set to true, Spark SQL uses an ANSI-compliant dialect instead of being Hive compliant. For example, Spark will throw an exception at runtime instead of returning null results if the inputs to a SQL operator/function are invalid. Some ANSI dialect features may not come directly from the ANSI SQL standard, but their behaviors align with ANSI SQL's style.
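As a quick illustration of the runtime difference, here is a minimal sketch for a Python notebook (the exact exception class varies by Spark/Databricks Runtime version):
# Division by zero returns NULL with ANSI mode off...
spark.conf.set("spark.sql.ansi.enabled", "false")
spark.sql("SELECT 1 / 0 AS result").show()  # result is NULL

# ...but raises a DIVIDE_BY_ZERO error with ANSI mode on.
spark.conf.set("spark.sql.ansi.enabled", "true")
try:
    spark.sql("SELECT 1 / 0 AS result").show()
except Exception as e:
    print(type(e).__name__)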
See Also
- Azure Databricks Documentation: ANSI_MODE
- Azure Databricks Documentation: ANSI compliance in Databricks Runtime
- Azure Databricks Documentation: Set Spark configuration properties on Azure Databricks
- Reddit: r/dataengineering: "What the different between Databricks SQL vs Databricks cluster with Photon runtime?"
- Chaos Genius: Databricks SQL Warehouse—Serverless vs Pro vs Classic (2025)