Data Warehousing Roles

The purpose of this document is to outline the roles that exist within the I2DB Data Warehousing team. Each section provides the title, description, required skills, and tools for a role on the team.

NOTE: Job titles do not necessarily match the roles described below.

Project Manager

The Project Manager engages with stakeholders to guide each project to a successful completion and is responsible for overseeing the planning, execution, and delivery of projects within the I2DB Data Warehousing team. They work closely with team members and clients to ensure project objectives are met, and they facilitate backlog management and planning in keeping with the team's Agile methodologies.

Skills

  • Strong leadership and communication skills to effectively collaborate with stakeholders and team members.
  • Proven experience in project management, including planning, scheduling, and resource allocation.
  • Familiarity with Agile methodologies and the ability to guide the team through the Agile process.
  • Excellent problem-solving and decision-making abilities to address project challenges and risks.
  • Proficiency in project management tools and software for tracking progress and managing project documentation.
  • Ability to prioritize tasks, manage timelines, and ensure project deliverables are met.
  • Strong organizational skills to handle multiple projects simultaneously.
  • Knowledge of project management best practices.

Tools

  • Azure DevOps
  • Microsoft Teams & Office 365
  • Zoom
  • REDCap
  • OneTrust
  • DocuSign
  • Box
  • Google Docs

Technical Lead

The Technical Lead is responsible for providing technical guidance and leadership to the I2DB Data Warehousing team. They play a crucial role in overseeing the technical aspects of projects, ensuring adherence to best practices, and driving the successful delivery of solutions. The Technical Lead collaborates with stakeholders and team members to understand project requirements and assists in translating them into technical solutions.

Skills

  • Strong technical expertise in relevant programming languages and technologies.
  • Proficiency in conducting code reviews and ensuring code quality.
  • Excellent problem-solving and analytical skills.
  • Strong communication and leadership abilities.
  • Ability to collaborate effectively with cross-functional teams.
  • Experience in mentoring and coaching team members.

Tools

  • Azure DevOps
  • Communication and collaboration tools (e.g., Microsoft Teams)
  • Document management tools (e.g., Google Drive, Microsoft SharePoint, Box)
  • Risk management tools (e.g., OneTrust)
  • Reporting and analytics tools (e.g., Databricks, Power BI)
  • Development languages, tools, and frameworks used by the team (e.g., Python, Java, JavaScript, Azure Databricks, Git)

Data Architect

The Data Architect is responsible for designing and overseeing the overall data architecture within an organization. They collaborate with stakeholders to understand data requirements and translate them into scalable and efficient data solutions. Data Architects ensure data integrity, security, and performance by designing data models, schemas, and data flows. They are familiar with data warehousing concepts and best practices.

Skills

  • Strong understanding of data warehousing concepts and best practices
  • Ability to collaborate with stakeholders to understand data requirements
  • Proficiency in designing scalable and efficient data solutions
  • Knowledge of data modeling and schema design
  • Familiarity with data integrity, security, and performance considerations

Tools

  • Azure Databricks (primary tool used by the team)
  • Git repositories (for version control)
  • Database management systems (e.g., Microsoft SQL Server, MySQL, PostgreSQL)
  • Data modeling tools (e.g., ER/Studio, PowerDesigner)
  • Data integration tools (e.g., Informatica, Talend)
  • Data visualization tools (e.g., Power BI, Tableau)
  • Data quality tools (e.g., Informatica Data Quality, Talend Data Quality)
  • Data governance tools (e.g., Collibra, Informatica Axon)

Data Engineer

Data Engineers are responsible for designing, developing, and maintaining data pipelines and ETL processes. They work closely with stakeholders to understand data requirements and ensure the accuracy and efficiency of data pipelines. Data Engineers are proficient in programming languages like Python and SQL for data manipulation and transformation, have experience with data integration tools and frameworks, and possess knowledge of data modeling and database design principles.

Data Engineers have access to make changes to schemas and assets in development environments.

Skills

  • Proficiency in programming languages like Python and SQL for data manipulation and transformation
  • Experience with data integration tools and frameworks
  • Knowledge of data modeling and database design principles
  • Understanding of ETL (Extract, Transform, Load) processes
  • Expertise in transforming data from the EPIC Clarity source system into the OMOP data model using ETL processes
  • Familiarity with data quality and data governance concepts
  • Strong problem-solving and analytical skills

Tools

  • Azure Databricks (primary tool used by the team)
  • Git repositories (for version control)
  • Data integration tools (e.g., Apache NiFi)
  • Database management systems (e.g., Microsoft SQL Server, MySQL, PostgreSQL)
  • Data manipulation and transformation libraries (e.g., Spark SQL, Pandas)
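To illustrate the Clarity-to-OMOP transformation this role performs, here is a minimal Python sketch of one mapping step. The column names (PAT_ID, SEX, BIRTH_DATE) and the simplified record shapes are illustrative assumptions, not the team's actual schema; the gender concept IDs are standard OMOP vocabulary values.

```python
# Minimal sketch of an ETL mapping step: EPIC Clarity-style patient rows
# transformed into OMOP-style person rows. Column names are assumptions.

# Standard OMOP gender concept IDs (8507 = male, 8532 = female)
GENDER_CONCEPTS = {"M": 8507, "F": 8532}

def clarity_patient_to_omop_person(row):
    """Transform one Clarity-style patient record into an OMOP person record."""
    return {
        "person_id": row["PAT_ID"],
        "gender_concept_id": GENDER_CONCEPTS.get(row["SEX"], 0),  # 0 = unknown
        "year_of_birth": int(row["BIRTH_DATE"][:4]),  # BIRTH_DATE as 'YYYY-MM-DD'
    }

clarity_rows = [
    {"PAT_ID": 101, "SEX": "F", "BIRTH_DATE": "1980-05-14"},
    {"PAT_ID": 102, "SEX": "M", "BIRTH_DATE": "1975-11-02"},
]
persons = [clarity_patient_to_omop_person(r) for r in clarity_rows]
```

In practice the same per-row logic would run at scale inside an Azure Databricks job (e.g., via Spark SQL) rather than in plain Python.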

Database Administrator

The Database Administrator is responsible for managing on-prem and cloud-based databases. They play a crucial role in ensuring the availability, performance, and security of the databases. The Database Administrator handles tasks such as monitoring, upgrading, backup management, and database performance tuning.

The Database Administrator also handles change management for operational environments, including access requests.

Skills

  • Strong knowledge of database management systems, such as Microsoft SQL Server, MySQL, or PostgreSQL.
  • Proficiency in database administration tasks, including installation, configuration, and maintenance.
  • Experience with database monitoring and performance tuning to optimize query performance and ensure efficient database operations.
  • Familiarity with backup and recovery strategies to ensure data integrity and availability.
  • Understanding of database security principles and implementation, including user access control and data encryption.
  • Knowledge of database replication and high availability configurations.
  • Ability to troubleshoot and resolve database-related issues.
  • Familiarity with scripting languages like SQL, Bash, and PowerShell for automation tasks.
  • Experience with cloud-based database services like Azure SQL Database, Amazon RDS, or Google Cloud SQL.

Tools

  • Database management systems (Microsoft SQL Server, MySQL, PostgreSQL)
  • Database monitoring and performance tuning tools
  • Backup and recovery tools
  • Database security tools and techniques
  • Scripting languages (SQL, Bash, PowerShell)
  • Cloud-based database services (Azure SQL Database, Amazon RDS, Google Cloud SQL)
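As one hedged example of the automation scripting listed above, a small Python helper might assemble a date-stamped SQL Server backup statement for a scheduled job. The database name and backup directory here are hypothetical, not actual team infrastructure.

```python
# Minimal sketch: build a T-SQL full-backup statement for automation scripts.
# Database name and backup directory below are illustrative assumptions.
from datetime import date

def build_backup_statement(database, backup_dir, on=None):
    """Return a BACKUP DATABASE statement with a date-stamped file name."""
    stamp = (on or date.today()).strftime("%Y%m%d")
    path = f"{backup_dir}/{database}_{stamp}.bak"
    return f"BACKUP DATABASE [{database}] TO DISK = N'{path}' WITH CHECKSUM, INIT;"

stmt = build_backup_statement("warehouse_dev", "/var/backups", on=date(2025, 8, 7))
```

A wrapper like this would typically be invoked from a scheduler (or rewritten in PowerShell/Bash, per the skills above) and paired with a restore-verification step.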

Data Steward

A Data Steward is responsible for ensuring the quality, integrity, and governance of an organization's data assets. They collaborate with various stakeholders to define and enforce data standards, policies, and procedures. Data Stewards play a crucial role in data management, including data classification, data lineage, data privacy, and data security. They work closely with data owners, data users, and IT teams to ensure data compliance and adherence to regulatory requirements. Data Stewards also assist in data documentation, data cataloging, and data quality initiatives to improve the overall data governance framework within the organization.

Skills

  • Understanding of data governance principles and practices to ensure data quality, integrity, and compliance.
  • Proficiency in analyzing data to identify patterns, trends, and anomalies.
  • Knowledge of data modeling techniques and experience in designing and implementing data models, specifically the OMOP data model.
  • Ability to define and enforce data quality standards and perform data quality assessments.
  • Familiarity with healthcare data, including EPIC Clarity data and the OMOP data model.
  • Proficiency in SQL for data querying and manipulation. Knowledge of programming languages like Python or R is beneficial for data analysis and transformation tasks.
  • Understanding of data privacy regulations and best practices for ensuring data security and confidentiality.

Tools

  • Azure Databricks
  • EPIC Clarity
  • OMOP Data Model
  • SQL Tools
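A data quality assessment of the kind this role performs can be sketched in a few lines of Python. The field names follow the OMOP person table, but the specific checks and thresholds below are illustrative assumptions, not the team's actual quality rules.

```python
# Minimal sketch of a data-quality assessment over OMOP-style person records.
# Checks (ID uniqueness, plausible birth years) and thresholds are illustrative.

def assess_person_quality(persons, min_year=1900, max_year=2025):
    """Return simple quality metrics for a list of person records."""
    ids = [p["person_id"] for p in persons]
    issues = {
        "duplicate_person_ids": len(ids) - len(set(ids)),
        "implausible_birth_years": sum(
            1 for p in persons if not (min_year <= p["year_of_birth"] <= max_year)
        ),
    }
    issues["passed"] = all(count == 0 for count in issues.values())
    return issues

records = [
    {"person_id": 1, "year_of_birth": 1980},
    {"person_id": 2, "year_of_birth": 1875},  # outside plausible range -> flagged
]
report = assess_person_quality(records)
```

In practice such checks would run as SQL or notebook jobs in Azure Databricks, with results feeding the data quality initiatives described above.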

Updated on August 7, 2025