Automated Notebooks Tutorial

This tutorial provides guidance and suggested practices for making sure a notebook is ready to be used in an automation.

Notebook Parameters

  • notebook widgets should be used to define which catalog and schema the code targets
  • development work and testing should be done in the sandbox catalog
  • automated pipelines should typically point to the curated catalog
  • notebook parameters allow the automation to point the code to the correct catalog and schema at run time
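
The points above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: it assumes the widgets are named `catalog` and `schema`, that the default schema is called `default`, and that the notebook reads a table called `orders`. All of these names are hypothetical, as are the helpers `read_targets` and `qualified_table` and the `_LocalWidgets` class, which only stands in for `dbutils.widgets` so the sketch can run outside Databricks.

```python
import re

# Conservative check so widget values can be placed safely into a table name.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


class _LocalWidgets:
    """Hypothetical stand-in for dbutils.widgets when running outside Databricks."""

    def __init__(self):
        self._values = {}

    def text(self, name, defaultValue, label=None):
        # Keep a value that was already supplied, otherwise use the default.
        self._values.setdefault(name, defaultValue)

    def get(self, name):
        return self._values[name]


def read_targets(widgets, default_catalog="sandbox", default_schema="default"):
    """Declare the catalog and schema widgets and return their current values."""
    widgets.text("catalog", default_catalog)
    widgets.text("schema", default_schema)
    return widgets.get("catalog"), widgets.get("schema")


def qualified_table(catalog, schema, table):
    """Build a catalog.schema.table name after validating each part."""
    for part in (catalog, schema, table):
        if not _IDENTIFIER.match(part):
            raise ValueError(f"unexpected identifier: {part!r}")
    return f"{catalog}.{schema}.{table}"


try:
    widgets = dbutils.widgets  # provided by the Databricks notebook runtime
except NameError:
    widgets = _LocalWidgets()  # local fallback, for illustration only

catalog, schema = read_targets(widgets)
orders_table = qualified_table(catalog, schema, "orders")  # hypothetical table
```

With the defaults shown, interactive development targets the sandbox catalog. An automation would then supply `catalog` (for example, the curated catalog) and `schema` as notebook parameters, so the same code runs against a different location without being edited.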

Notebook Location

  • ensure the notebook is located in the appropriate git repository
  • it is important that the path of the notebook does not change
  • automations always execute code from the main branch of a repository
  • pull requests should be enforced when merging code into the main branch
  • code reviews during the pull request ensure code quality
  • keep the code in sync with the remote repository through frequent pulls and/or rebasing your working branch onto main

Scheduling and Performance

When building a notebook that will execute on a recurring basis, it is important to consider how the data will be managed.

  • ensure the notebook chooses the correct strategy for querying the data
    • look for watermark columns (for example, a last-modified timestamp or an increasing ID) that can be used to limit the amount of data returned from the source
    • prefer delta/incremental data loads so the same data is not reloaded on every run
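
A watermark-based incremental load can be sketched as below. To keep the example self-contained it uses Python's built-in `sqlite3` module as a stand-in for the real source; in a notebook the same filter would normally be applied in a Spark or SQL query. The table `source_orders`, the column `updated_at`, and the helper `load_increment` are hypothetical names chosen for illustration.

```python
import sqlite3


def load_increment(conn, last_watermark):
    """Return rows changed after last_watermark, plus the new watermark."""
    rows = conn.execute(
        "SELECT id, updated_at FROM source_orders "
        "WHERE updated_at > ? ORDER BY updated_at",
        (last_watermark,),
    ).fetchall()
    # Advance the watermark only when new rows were actually read.
    new_watermark = rows[-1][1] if rows else last_watermark
    return rows, new_watermark


# In-memory stand-in for the source system (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE source_orders (id INTEGER, updated_at TEXT)")
conn.executemany(
    "INSERT INTO source_orders VALUES (?, ?)",
    [(1, "2025-01-01"), (2, "2025-01-02"), (3, "2025-01-03")],
)

# First run: only rows newer than the stored watermark are read.
first_batch, watermark = load_increment(conn, "2025-01-01")

# Second run: nothing has changed at the source, so nothing is reloaded.
second_batch, watermark = load_increment(conn, watermark)
```

The sketch assumes the watermark value is persisted between runs (for example, in a small control table) and that `updated_at` is stored in a format that sorts chronologically, such as ISO 8601 dates.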

Updated on August 7, 2025