Data Warehouse Transition Meeting

2025-01-17

  • NiFi migration
    • Migrate code from NiFi processors to Databricks notebooks
    • Scala vs Python - Suhas to explore pros/cons of migrating Scala processors or rewriting Java code as Python
    • Suhas to provide overview
      • DW & a few PE members
    • "Local" dev?
  • Moving Config db to data lake
    • Any blockers for Dev, QA and Prod?
      • Pipeline moves to "local" catalog
    • Meeting with PE to review the output of the config pipeline
    • Proposing after v1.29
    • Sequences were an issue for auto-increment columns
    • Current NiFi uses two different sequence columns
    • Databricks doesn't support sequences
      • Need to create table in local to mimic current sequence config
    • Alternative approach: ETL keeps using the config in psql
      • If config is kept in psql, explore a live connection from Databricks
  • Databasin testing to move OMOP ETL staging
    • Waiting on a few features
  • Team to decide if Scala/Java is worth investing in
  • Final task is to move PSQL code to Spark SQL
  • MDClone
    • Release-based notebook that references a specific branch to find the SQL file
  • Working session to review MDClone script execution process
  • Next Friday 10-3
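
Sketch for the sequence item above (a table that mimics the current sequence config). This is only a rough illustration, not the agreed design. Assumptions: Python's built-in sqlite3 stands in for the "local" catalog so the snippet runs anywhere; the table name config_sequences, its columns, and the sequence names seq_a and seq_b are hypothetical placeholders for the two NiFi sequence columns. The idea is one row per sequence holding the next free value, with callers reserving a block of ids in a single transaction.

```python
import sqlite3


def create_sequence_table(conn):
    # Hypothetical table: one row per emulated sequence.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS config_sequences ("
        "name TEXT PRIMARY KEY, next_value INTEGER NOT NULL)"
    )


def register_sequence(conn, name, start=1):
    # No-op if the sequence already exists, so reruns are safe.
    conn.execute(
        "INSERT OR IGNORE INTO config_sequences (name, next_value) VALUES (?, ?)",
        (name, start),
    )


def reserve_ids(conn, name, count):
    """Reserve `count` consecutive ids from the named sequence and return them."""
    if count < 1:
        raise ValueError("count must be at least 1")
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT next_value FROM config_sequences WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown sequence: {name}")
        first = row[0]
        # Advance the counter past the block being handed out.
        conn.execute(
            "UPDATE config_sequences SET next_value = ? WHERE name = ?",
            (first + count, name),
        )
    return list(range(first, first + count))


# Example: two independent sequences, as in the current NiFi setup.
conn = sqlite3.connect(":memory:")
create_sequence_table(conn)
register_sequence(conn, "seq_a")
register_sequence(conn, "seq_b", start=1000)
batch_a = reserve_ids(conn, "seq_a", 3)
batch_b = reserve_ids(conn, "seq_b", 2)
```

In Databricks the same pattern would need a Delta table and an equivalent read-then-update step; how concurrent writers are handled there is an open question for the working session.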

Updated on August 7, 2025