Data Collaboration Workflow
In the WUSM Data Lake environment, the Data Collaboration Workflow describes how the ICS Data Warehousing (DW) team works directly with project teams to build, review, and publish data assets. This workflow is distinct from the brokerage workflow in that the ICS DW team is the primary builder of data assets, and the project team is the primary reviewer before assets are published to the curated catalog.
Overview
- The ICS DW team collaborates with project teams to develop data assets for specific projects or studies.
- Development work is performed in the data_warehouse_dev catalog, which is managed by the ICS DW team.
- Regulatory and quality review by ICS occurs while assets are in data_warehouse_dev.
- Upon ICS approval, assets are promoted to the project team's schema in the review catalog for project team review.
- The project team reviews the assets in their review schema and can provide feedback or request changes.
- If changes are required, the ICS DW team updates the assets in data_warehouse_dev and the review cycle repeats.
- Once the project team approves the assets in the review schema, ICS DW uses Databasin to publish the assets to the project team's schema in the curated catalog.
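As a rough illustration of where a project's assets live at each stage, the sketch below maps each stage to a catalog and builds the catalog.schema location. The catalog names come from the list above; the helper `stage_location`, the stage labels, and the schema name `example_study` are hypothetical and not part of any ICS tooling.

```python
# Catalog names are taken from the workflow description.
# The stage labels and this helper are illustrative assumptions.
STAGE_CATALOGS = {
    "development": "data_warehouse_dev",
    "review": "review",
    "curated": "curated",
}


def stage_location(stage: str, project_schema: str) -> str:
    """Return the catalog.schema location of a project's assets at a stage."""
    try:
        catalog = STAGE_CATALOGS[stage]
    except KeyError:
        raise ValueError(f"unknown stage: {stage!r}") from None
    return f"{catalog}.{project_schema}"


if __name__ == "__main__":
    # "example_study" is a made-up project schema name.
    for stage in STAGE_CATALOGS:
        print(stage, "->", stage_location(stage, "example_study"))
```

The same project schema name is reused in every catalog, so only the catalog prefix changes as assets move from development to review to curated.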
Schema Architecture and Access
flowchart LR
subgraph Dev Catalog
direction TB
D1["**data_warehouse_dev**.<br/>**[project_schema]**<br/>(Development schema - D1)"]
end
subgraph Review Catalog
direction TB
R1["**review**.<br/>**[project_schema]**<br/>(Review schema - R1)"]
end
subgraph Curated Catalog
direction TB
C1["**curated**.<br/>**[project_schema]**<br/>(Curated schema - C1)"]
end
%% User groups
subgraph User_Groups [User Security Groups]
direction TB
ICS_DW["rdc<br/>(ICS DW Team)"]
ProjectTeam["[project_schema]<br/>(Project Team)"]
Databasin["Databasin<br/>(Used by ICS DW)"]
end
%% Access relationships
ICS_DW -- "R/W access" --> D1
ICS_DW -- "R/W access" --> R1
ProjectTeam -- "Read access" --> R1
ProjectTeam -- "Read access" --> C1
Databasin -- "R/W access" --> C1
Diagram Key:
- D1 is the ICS DW development schema in data_warehouse_dev.
- R1 is the project team's review schema in the review catalog.
- C1 is the project team's curated schema in the curated catalog.
- The ICS DW team has R/W access to both the development and review schemas.
- The project team has read access to their review and curated schemas.
- Databasin is used by ICS DW to publish assets to the curated schema.
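The access relationships in the diagram can be read as a small permission matrix. The following is a minimal sketch of that matrix, not the actual catalog grant configuration: the node names (`ICS_DW`, `ProjectTeam`, `Databasin`, `D1`, `R1`, `C1`) mirror the diagram, while the `Access` flag type and the `can` helper are hypothetical.

```python
from enum import Flag, auto


class Access(Flag):
    NONE = 0
    READ = auto()
    WRITE = auto()


READ_WRITE = Access.READ | Access.WRITE

# One entry per arrow in the diagram; any pair not listed has no access.
ACCESS = {
    ("ICS_DW", "D1"): READ_WRITE,
    ("ICS_DW", "R1"): READ_WRITE,
    ("ProjectTeam", "R1"): Access.READ,
    ("ProjectTeam", "C1"): Access.READ,
    ("Databasin", "C1"): READ_WRITE,
}


def can(principal: str, schema: str, needed: Access) -> bool:
    """Return True if the principal holds every permission in `needed`."""
    granted = ACCESS.get((principal, schema), Access.NONE)
    return (granted & needed) == needed


if __name__ == "__main__":
    print(can("ProjectTeam", "C1", Access.READ))   # project team can read curated
    print(can("ProjectTeam", "C1", Access.WRITE))  # but cannot write to it
```

Treating unlisted pairs as no access keeps the sketch deny-by-default, which matches the diagram: the project team has no arrow to D1, and writes to C1 go only through Databasin.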
Workflow Steps
- Development: The ICS DW team develops and tests data assets in the data_warehouse_dev catalog (D1).
- ICS Review: The ICS DW team performs regulatory and quality review of the assets in data_warehouse_dev.
- Promote to Review: Upon ICS approval, assets are promoted to the project team's schema in the review catalog (R1).
- Project Team Review: The project team reviews the assets in their review schema (R1). The team can provide feedback or request changes from ICS DW. If changes are required, ICS DW updates the assets in data_warehouse_dev and the review cycle repeats.
- Promotion to Curated: When the project team approves the assets in R1, ICS DW uses Databasin to publish the assets to the project team's curated schema (C1). Only ICS DW, working through Databasin, can write to the curated schema; the project team has read-only access.
- Team Accesses Curated Data: The project team can now query and use the approved data in their curated schema. Any further changes require a new development and review cycle.
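The steps above form a cycle that can be sketched as a small state machine. This is an illustrative model only, assuming hypothetical stage and event names (`submit`, `ics_approve`, `request_changes`, `project_approve`, `new_change`); it encodes just the transitions the steps describe and is not how Databasin or the ICS DW team implement promotion.

```python
from enum import Enum


class Stage(Enum):
    DEVELOPMENT = "development"        # assets built in data_warehouse_dev (D1)
    ICS_REVIEW = "ics_review"          # regulatory and quality review, still in D1
    PROJECT_REVIEW = "project_review"  # assets promoted to the review catalog (R1)
    CURATED = "curated"                # assets published via Databasin (C1)


# Event names are hypothetical labels for the transitions in the steps above.
TRANSITIONS = {
    (Stage.DEVELOPMENT, "submit"): Stage.ICS_REVIEW,
    (Stage.ICS_REVIEW, "ics_approve"): Stage.PROJECT_REVIEW,
    (Stage.PROJECT_REVIEW, "request_changes"): Stage.DEVELOPMENT,
    (Stage.PROJECT_REVIEW, "project_approve"): Stage.CURATED,
    (Stage.CURATED, "new_change"): Stage.DEVELOPMENT,
}


def advance(stage: Stage, event: str) -> Stage:
    """Apply one event, rejecting any transition the workflow does not describe."""
    try:
        return TRANSITIONS[(stage, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not allowed from {stage.name}") from None


def run(events, start: Stage = Stage.DEVELOPMENT) -> Stage:
    """Apply a sequence of events and return the final stage."""
    stage = start
    for event in events:
        stage = advance(stage, event)
    return stage


if __name__ == "__main__":
    # One round of requested changes, then approval and publication.
    history = ["submit", "ics_approve", "request_changes",
               "submit", "ics_approve", "project_approve"]
    print(run(history))
```

Because there is no transition from development or ICS review straight to curated, the model makes the rule explicit that nothing reaches the curated schema without project team approval in R1.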
Notes
- This workflow is intended for collaborations where ICS DW is the primary builder of data assets.
- The project team is responsible for reviewing and approving assets before they are published to curated.
- All promotions to curated are performed using Databasin for auditability and consistency.
- For more details on access controls and catalog policies, see the Data Lake Catalog Policies.