RDC ETL v2 Workflow

RDC_ETL_Notebook (main notebook)

  • Import libraries
  • Set notebook path
  • Get paths and parameters
  • Notebook definitions
  • Start the RDC ETL process
    • Get configuration info from config_main table
      • Check the ETL run status (etl_status_sql_string)

        • Get this run's
          • run_type (full, transform)
          • main_id
          • Department_name
          • TargetOMOPEnv (dev, qa, prod)
      • Increment Job_id for a new job (unless reuse_job_id = true)

      • If the configuration is active and not currently running:

        • Update the config_main table to mark the configuration as running
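
The configuration check and job-id handling described so far might look roughly like the sketch below. It is only an illustration: the ConfigRow class, its field names (is_active, is_running, and so on), and the helper functions are hypothetical, and the in-memory flag stands in for the real UPDATE against config_main. It also assumes that a skipped run does not persist the incremented job id, which the outline does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConfigRow:
    """Hypothetical shape of one config_main row; all field names are assumptions."""
    main_id: int
    department_name: str
    run_type: str          # "full" or "transform"
    target_omop_env: str   # "dev", "qa", or "prod"
    is_active: bool
    is_running: bool
    job_id: int
    reuse_job_id: bool = False


def next_job_id(cfg: ConfigRow) -> int:
    """Keep the current job_id when reuse_job_id is set, otherwise increment it."""
    return cfg.job_id if cfg.reuse_job_id else cfg.job_id + 1


def can_start(cfg: ConfigRow) -> bool:
    """A run may start only if the configuration is active and not already running."""
    return cfg.is_active and not cfg.is_running


def begin_run(cfg: ConfigRow) -> Optional[int]:
    """Return the job_id for this run, or None when the run must be skipped."""
    job_id = next_job_id(cfg)
    if not can_start(cfg):
        return None
    cfg.job_id = job_id
    cfg.is_running = True  # stands in for the UPDATE against config_main
    return job_id
```

Calling begin_run a second time on the same row returns None, because the first call already marked the configuration as running.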

        Stage Process

        Step 1: Populate the metadata table

        Step 2: Truncate all relevant stage_omop tables

        Step 3: Set up variables

        • Get list of stage functions
        • Get the last run date
        • For each enabled record, run the named notebook
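
As a rough sketch of the last bullet, the loop over stage records could be written as follows. The record keys ("notebook_name", "enabled"), the last_run_date argument name, and the run_notebook callable are all assumptions made for illustration; the outline does not give the actual column names or the notebook-launch mechanism.

```python
from typing import Callable, Dict, Iterable, Mapping


def run_stage_notebooks(
    stage_functions: Iterable[Mapping[str, object]],
    last_run_date: str,
    run_notebook: Callable[[str, dict], str],
) -> Dict[str, str]:
    """Run the named notebook for every enabled record and collect the results."""
    results: Dict[str, str] = {}
    for record in stage_functions:
        if not record.get("enabled"):
            continue  # disabled records are skipped
        name = str(record["notebook_name"])
        # Pass the last run date so the notebook can limit itself to newer data.
        results[name] = run_notebook(name, {"last_run_date": last_run_date})
    return results
```

Passing the launcher in as a callable keeps the loop testable without a notebook runtime. If the notebooks run on Databricks (an assumption; the outline does not name the platform), run_notebook could be a thin wrapper around dbutils.notebook.run.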

        OMOP Process

        • RDC definitions
        • Get delta rules (job_def_sql_string)
        • For each enabled record, call omop_stagings
          • Apply NK
          • Apply Concept
          • Error Recording
          • Delta Process
          • Merge Process
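
The OMOP process above can be read as a fixed sequence of five steps applied to each enabled delta rule. A minimal sketch of that control flow follows, assuming each step is a function that takes and returns a state dictionary. The step keys, the "enabled" field, and the signature of omop_stagings are hypothetical; only the step order comes from the outline.

```python
from typing import Callable, Dict, List

Step = Callable[[dict], dict]

# Order taken from the outline: NK, concept, error recording, delta, merge.
STEP_ORDER = [
    "apply_nk",
    "apply_concept",
    "error_recording",
    "delta_process",
    "merge_process",
]


def omop_stagings(rule: dict, steps: Dict[str, Step]) -> dict:
    """Run the five steps in order for one rule, threading a state dict through them."""
    state = {"rule": rule, "completed": []}
    for name in STEP_ORDER:
        state = steps[name](state)
        state["completed"].append(name)
    return state


def run_omop_process(delta_rules: List[dict], steps: Dict[str, Step]) -> List[dict]:
    """Call omop_stagings once for every enabled delta rule."""
    return [omop_stagings(rule, steps) for rule in delta_rules if rule.get("enabled")]
```

Keeping the order in one list makes it explicit that the delta and merge steps only run after natural keys and concepts have been applied and errors recorded.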

Updated on August 12, 2025