All items in the SOW are accounted for in the data pull
No extra items are included (extras might fall outside the IRB approval)
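The two SOW checks above can be sketched as a set comparison. This is a minimal illustration, assuming the SOW items and the pulled column names can each be written out as a list of strings; the function name and the sample column names are hypothetical.

```python
def compare_to_sow(sow_items, pulled_columns):
    """Return (missing, extra) names, compared case-insensitively."""
    sow = {name.strip().lower() for name in sow_items}
    pulled = {name.strip().lower() for name in pulled_columns}
    missing = sorted(sow - pulled)  # requested in the SOW but not in the pull
    extra = sorted(pulled - sow)    # in the pull but not requested in the SOW
    return missing, extra


# Hypothetical SOW items and pulled columns, for illustration only.
missing, extra = compare_to_sow(
    ["mrn", "zip", "birth_date"],
    ["MRN", "zip", "encounter_date"],
)
print("Missing from pull:", missing)
print("Not in SOW:", extra)
```

Anything reported as "Not in SOW" is worth a second look, since it may fall outside what the IRB approved.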
Patient count -- double check that the patient count doesn't exceed the IRB-approved limit
The regulatory reviewer should list the max number of approved records in the Billable Epic (BE)
If the max record count is missing from the BE, reach out to the assigned regulatory reviewer to verify it
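The patient count check can be sketched as follows. This assumes the output is a CSV with one patient identifier column (called `patient_id` here, a hypothetical name) and that the approved limit is counted in distinct patients; adjust if the limit refers to rows instead.

```python
import csv
import io


def check_patient_count(rows, max_approved, id_column="patient_id"):
    """Return (distinct_count, within_limit) for the pulled rows."""
    if max_approved is None:
        # Mirrors the checklist: a missing limit has to be verified, not guessed.
        raise ValueError("max record count is missing; verify with the regulatory reviewer")
    distinct_ids = {row[id_column] for row in rows if row[id_column]}
    return len(distinct_ids), len(distinct_ids) <= max_approved


# Hypothetical three-row extract with two distinct patients.
sample = io.StringIO("patient_id,zip\nP1,02115\nP2,02116\nP1,02115\n")
count, ok = check_patient_count(csv.DictReader(sample), max_approved=2)
print(count, ok)
```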
Look at the data itself -- does it make sense?
e.g. are zip codes numeric?
are dates formatted as dates?
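A minimal sketch of these two sanity checks, assuming US-style zip codes (five digits, optionally followed by a four-digit extension) and ISO-formatted dates. Both the pattern and the default date format are assumptions; swap in whatever the study's output actually uses.

```python
import re
from datetime import datetime

# Assumed format: 5 digits, optionally followed by "-" and 4 more digits.
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def is_valid_zip(value):
    """True if the whole value looks like a US zip code."""
    return ZIP_PATTERN.fullmatch(value.strip()) is not None


def is_valid_date(value, fmt="%Y-%m-%d"):
    """True if the value parses as a real calendar date in the given format."""
    try:
        datetime.strptime(value.strip(), fmt)
        return True
    except ValueError:
        return False


for zip_code in ["02115", "0211A", "02115-1234"]:
    print(zip_code, is_valid_zip(zip_code))
for text in ["2023-02-28", "2023-02-30", "02/28/2023"]:
    print(text, is_valid_date(text))
```

Parsing the date rather than pattern-matching it also catches impossible dates such as February 30.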
Inclusion criteria:
Are test patients and invalid patients being excluded?
Are study-specific criteria being appropriately applied?
e.g. age range is correctly applied, date shifts are in the right direction if comparing dates
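The inclusion checks above can be sketched as a pass over the output that reports rows which should have been excluded. The field names (`is_test_patient`, `birth_date`, `index_date`) and the 18-65 age range are hypothetical, and age is assumed to be measured on the index date.

```python
from datetime import date


def age_on(birth_date, as_of):
    """Age in whole years on the as_of date."""
    had_birthday = (as_of.month, as_of.day) >= (birth_date.month, birth_date.day)
    return as_of.year - birth_date.year - (0 if had_birthday else 1)


def inclusion_violations(rows, min_age, max_age):
    """Return (patient_id, reason) for every row that should have been excluded."""
    problems = []
    for row in rows:
        if row["is_test_patient"]:
            problems.append((row["patient_id"], "test patient"))
            continue
        age = age_on(row["birth_date"], row["index_date"])
        if not (min_age <= age <= max_age):
            problems.append((row["patient_id"], f"age {age} outside {min_age}-{max_age}"))
    return problems


# Hypothetical rows, for illustration only.
rows = [
    {"patient_id": "A1", "is_test_patient": False,
     "birth_date": date(1980, 6, 15), "index_date": date(2020, 6, 14)},
    {"patient_id": "A2", "is_test_patient": True,
     "birth_date": date(1990, 1, 1), "index_date": date(2020, 1, 1)},
    {"patient_id": "A3", "is_test_patient": False,
     "birth_date": date(2005, 1, 1), "index_date": date(2020, 3, 1)},
]
violations = inclusion_violations(rows, min_age=18, max_age=65)
print(violations)
```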
Time frames: are data elements from the correct time window?
e.g. from the correct date range around some sort of index event
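A minimal sketch of the time-window check, assuming the window is defined as a number of days before and after the index event and that both endpoints are inclusive. The function name and the 30-day window are hypothetical.

```python
from datetime import date, timedelta


def outside_window(event_dates, index_date, days_before, days_after):
    """Return the event dates that fall outside the window around index_date."""
    start = index_date - timedelta(days=days_before)
    end = index_date + timedelta(days=days_after)
    return [d for d in event_dates if not (start <= d <= end)]


# Hypothetical index event and lab dates, for illustration only.
index_event = date(2022, 3, 10)
lab_dates = [date(2022, 2, 10), date(2022, 3, 10), date(2022, 4, 20), date(2021, 12, 1)]
out_of_range = outside_window(lab_dates, index_event, days_before=30, days_after=30)
print(out_of_range)
```

An empty result means every data element sits inside the window; anything returned deserves a closer look at how the date filter was written.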
When the files are small enough, it can be useful to load them into a spreadsheet (if they are not in one already) and use the filters to see the various values presented
Always spot check the output -- even if you're not loading the entire file (e.g. the file is very large and you only sample a portion)
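For files too large to open in full, one way to pull a portion for spot checking is reservoir sampling, which reads the file once and keeps a uniform random sample without loading everything into memory. This is a sketch of that general technique, not a prescribed method; the function name and sample size are hypothetical.

```python
import random


def sample_lines(lines, k, seed=0):
    """Return up to k lines chosen uniformly at random from an iterable."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    sample = []
    for i, line in enumerate(lines):
        if i < k:
            sample.append(line)      # fill the reservoir with the first k lines
        else:
            j = rng.randint(0, i)    # inclusive on both ends
            if j < k:
                sample[j] = line     # replace an entry with probability k / (i + 1)
    return sample


# Stand-in for a large file: any iterable of lines works, including an open file.
fake_file = (f"row {n}" for n in range(10_000))
picked = sample_lines(fake_file, k=5)
print(picked)
```

With a real file, pass the open file object after reading the header line, so the header is not part of the sample.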
Joins to ZC tables:
Is it the right ZC table?
If applying a filter to any of the options -- are all appropriate options selected?
Joins -- is each one the appropriate type (inner vs left vs union)?
NO RIGHT JOINS
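The no-right-joins rule can be checked mechanically. The sketch below scans query text for `RIGHT JOIN` or `RIGHT OUTER JOIN` and reports the line each one starts on. It is a plain text search, so it assumes the phrase does not appear inside comments or string literals; the table names in the sample query are hypothetical.

```python
import re

# Matches RIGHT JOIN and RIGHT OUTER JOIN, in any case, even split across lines.
RIGHT_JOIN = re.compile(r"\bRIGHT\s+(?:OUTER\s+)?JOIN\b", re.IGNORECASE)


def find_right_joins(sql_text):
    """Return the 1-based line numbers where a right join starts."""
    return [sql_text.count("\n", 0, m.start()) + 1 for m in RIGHT_JOIN.finditer(sql_text)]


# Hypothetical query, for illustration only.
query = """SELECT p.pat_id
FROM patient_example p
LEFT JOIN zc_example z ON z.code_c = p.code_c
right outer
  join enc_example e ON e.pat_id = p.pat_id
"""
print(find_right_joins(query))
```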
Best practices: is code commented to explain what's going on?
A comment at the start of every temp table explaining what it is for
Is there a query header that documents the overall goal or project purpose?