Loading Data from Unity Catalog Into a Databricks Notebook

Problem

You want to load data stored in Unity Catalog into your Databricks notebook.

Solution

Via Spark/Databricks SQL

-- Load a CSV file stored in a Unity Catalog volume
SELECT * FROM csv.`/Volumes/my_catalog/my_schema/my_volume/data.csv`;
-- List the contents of a volume
LIST '/Volumes/my_catalog/my_schema/my_volume/';
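
The same query can also be run from a Python cell; a minimal sketch using spark.sql with the volume path above:

# Run the same volume query from Python and preview the result
df = spark.sql(
    "SELECT * FROM csv.`/Volumes/my_catalog/my_schema/my_volume/data.csv`"
)
display(df)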

Via PySpark and dbutils

# Read a CSV with Spark, inferring column types from the data
df = spark.read.format('csv').load(
  '/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv',
  header=True,
  inferSchema=True
)
# Show summary statistics for the DataFrame
dbutils.data.summarize(df)
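
dbutils can also list the files in a volume directly; a short sketch, assuming the same volume path as in the SQL example:

# List the files in a Unity Catalog volume with dbutils
for f in dbutils.fs.ls('/Volumes/my_catalog/my_schema/my_volume/'):
    print(f.name, f.size)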

Python

import os

# Volume paths are FUSE-mounted, so standard file APIs work on them
os.listdir('/Volumes/my_catalog/my_schema/my_volume/path/to/directory')
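
Because volumes are FUSE-mounted, plain Python file I/O also works on /Volumes paths; a minimal sketch reading the first line of the CSV from the SQL example:

# Read the first line of a CSV in a volume with standard file I/O
with open('/Volumes/my_catalog/my_schema/my_volume/data.csv') as fh:
    print(fh.readline())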

or via pandas

import pandas as pd

df = pd.read_csv('/Volumes/my_catalog/my_schema/my_volume/data.csv')

or by pip-installing a Python package stored in a Unity Catalog volume

%pip install /Volumes/my_catalog/my_schema/my_volume/my_library.whl
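
After installing, the Python process typically needs a restart before the new package can be imported; a minimal sketch, where my_library is a hypothetical module name provided by the wheel:

# Restart Python so the freshly installed wheel becomes importable
dbutils.library.restartPython()

# Then, in a subsequent cell, import the (hypothetical) package
import my_library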

R

library(SparkR)

# Read a CSV with SparkR, inferring column types from the data
df <- read.df(
  "/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv",
  source = "csv",
  header = "true",
  inferSchema = "true"
)
dbutils.data.summarize(df)

Scala

// Read a CSV with Spark, inferring column types from the data
val df = spark.read.format("csv")
  .option("inferSchema", "true")
  .option("header", "true")
  .load("/databricks-datasets/Rdatasets/data-001/csv/ggplot2/diamonds.csv")
// Show summary statistics for the DataFrame
dbutils.data.summarize(df)

Discussion

In general, external data should be placed in Unity Catalog volumes. See the discussion in the recipe "Saving Results from a Databricks Notebook to a File" for more information about Unity Catalog.
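
If the target volume does not exist yet, it can be created with SQL and populated with dbutils; a minimal sketch in Python, assuming the catalog and schema already exist:

# Create the volume if needed (catalog and schema assumed to exist)
spark.sql("CREATE VOLUME IF NOT EXISTS my_catalog.my_schema.my_volume")

# Copy a file from driver-local storage into the volume
dbutils.fs.cp(
    "file:/tmp/data.csv",
    "/Volumes/my_catalog/my_schema/my_volume/data.csv"
)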

See Also

Azure References

