A Three-Step Framework for Implementing a Hybrid Data Mesh

A Recap

Large organizations evolve organically and are complex; hence, a hybrid approach works best.

The Three-step Framework

Step 1: Define Domain

  • What are the parameters that define the domain? e.g., departmental, product-based, geo-based
  • Where is the domain placed in the governance-flexibility spectrum?
  • The functional context implies the task that the domain is assigned to perform. The functional context is the raison d’être for the domain.
  • The organizational constraints are business constraints imposed on the domain, such as regulations, people and skills, and operational dependencies.
  • A department like marketing or sales focuses on a specific function within a business.
  • A product group that focuses on creating a specific product or service.
  • A subsidiary of a parent company.
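The questions above can be captured as a small domain record. The following is a minimal sketch, not part of the original framework: the class and field names (`DomainDefinition`, `DomainKind`, `flexibility`) and the 0.0 to 1.0 scale for the governance-flexibility spectrum are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DomainKind(Enum):
    """Hypothetical parameters that can define a domain."""
    DEPARTMENT = "department"
    PRODUCT_GROUP = "product_group"
    SUBSIDIARY = "subsidiary"
    GEOGRAPHY = "geography"


@dataclass
class DomainDefinition:
    name: str
    kind: DomainKind
    # The task the domain is assigned to perform (its reason for existing).
    functional_context: str
    # Assumed scale: 0.0 = tightly governed, 1.0 = fully flexible.
    flexibility: float
    # Regulations, people and skills, operational dependencies, and so on.
    constraints: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not 0.0 <= self.flexibility <= 1.0:
            raise ValueError("flexibility must be between 0.0 and 1.0")


# Example: a departmental domain placed toward the governed end of the spectrum.
marketing = DomainDefinition(
    name="marketing",
    kind=DomainKind.DEPARTMENT,
    functional_context="Plan and measure campaigns",
    flexibility=0.3,
    constraints=["privacy regulation", "limited data engineering skills"],
)
```

Writing the definition down this way forces each domain to state its functional context and constraints explicitly before any technology is chosen.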

Step 2: Determine Domain Node

  • What are the capabilities required for the domain node?
  • Which component fulfills the decision support capabilities for the domain?

A domain node fulfills the technical capabilities of a domain.

  1. Data Security: It ensures that the data stored in the domain node is secure.
  2. Data Catalog: It ensures that the data in the domain node is well cataloged and curated for meaningful discoveries.
  3. Data Sharing: A robust data sharing mechanism is employed in the domain node to share data between the domains securely.
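The three capabilities listed above can be illustrated with a toy in-memory node. This is a sketch under stated assumptions: `DomainNode`, `register`, `grant`, and `share` are hypothetical names, and a real node would delegate each capability to dedicated platform services.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DomainNode:
    name: str
    # Data catalog: data set name -> human-readable description.
    catalog: Dict[str, str] = field(default_factory=dict)
    # Data security: data set name -> principals allowed to read it.
    acl: Dict[str, Set[str]] = field(default_factory=dict)

    def register(self, dataset: str, description: str) -> None:
        """Catalog a data set so that other domains can discover it."""
        self.catalog[dataset] = description
        self.acl.setdefault(dataset, set())

    def grant(self, dataset: str, principal: str) -> None:
        """Allow a principal to read a cataloged data set."""
        if dataset not in self.catalog:
            raise KeyError(f"{dataset} is not cataloged")
        self.acl[dataset].add(principal)

    def share(self, dataset: str, principal: str) -> str:
        """Data sharing: hand out a reference only to authorized principals."""
        if principal not in self.acl.get(dataset, set()):
            raise PermissionError(f"{principal} may not read {dataset}")
        return f"{self.name}/{dataset}"


node = DomainNode("sales")
node.register("orders", "Confirmed customer orders")
node.grant("orders", "marketing-analyst")
reference = node.share("orders", "marketing-analyst")
```

The point of the sketch is the dependency between the capabilities: sharing is only possible for data that is both cataloged and covered by an access rule.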

Step 3: Establish Governance Framework

  • Who are the key stakeholders required for the domain? e.g., domain owner, data steward, etc.
  • What are the skill sets required to manage the domain?
  • Which data will be cataloged?
  • Who will get access to which data?
  • How will the access control be implemented?

The governance framework has three key components:

  1. Roles and Responsibilities
  2. Data Cataloging
  3. Data Sharing

Roles and Responsibilities:

  1. Executive Sponsor: This role has the authority and budget and is accountable for establishing data governance. Typically, this role is a CXO-level role tasked with the overall ownership of data.
  2. Data Governance Lead: This role has the overall accountability and responsibility for implementing the data governance program. Data governance needs a program-level focus if it is to yield the right benefits.
  3. Data Owners: This role comes with authority and budget for overseeing the quality and protection of data within a domain. The role also decides who has the right to access and maintain that data and its usage.
  4. Data Steward: This role oversees the definition and usage of data within a domain. This role is typically an expert in a specific data domain and works with other data stewards across the enterprise. In addition, the role ensures that the data quality is maintained.
  5. Data Publishing Manager: This role is responsible for quality-assuring, checking, and publishing data sets for internal and external data sharing.

Data Cataloging:

Data cataloging is the practice of organizing the inventory of available data so that it can be easily identified and used.
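As an illustration of such an inventory, the sketch below models catalog entries and a simple keyword search. The `CatalogEntry` fields and the `search` function are assumptions chosen for this example, not a reference to any specific catalog product.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CatalogEntry:
    dataset: str
    domain: str
    owner: str
    description: str
    tags: Tuple[str, ...] = ()


def search(entries: List[CatalogEntry], keyword: str) -> List[CatalogEntry]:
    """Return entries whose name, description, or tags mention the keyword."""
    needle = keyword.lower()
    return [
        entry
        for entry in entries
        if needle in entry.dataset.lower()
        or needle in entry.description.lower()
        or any(needle == tag.lower() for tag in entry.tags)
    ]


inventory = [
    CatalogEntry("orders", "sales", "sales-data-owner",
                 "Confirmed customer orders", ("transactions",)),
    CatalogEntry("churn_scores", "marketing", "marketing-data-owner",
                 "Monthly churn probability per customer", ("ml", "customers")),
]

matches = search(inventory, "churn")
```

Recording the owning domain and the data owner alongside each entry ties the catalog back to the roles defined above.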

Data Sharing:

The degree of relative domain independence describes how independently a domain operates compared with other domains. Five factors determine it:

  1. The functional context of the domain within the organizational ecosystem.
  2. The people and skills that determine the smooth execution of the domain.
  3. The external or internal regulations that govern the domain.
  4. The operational independence of the domain relative to other domains.
  5. The technical capabilities possessed by the domain for implementing technology.

Data sharing in a hybrid data mesh follows two patterns:

  1. Data sharing between the hub domain and the spoke domains, and vice versa
  2. Data sharing between Data Mesh domains

Data sharing between the hub domain and the spoke domains:

  1. First, the hub data publishers, who have data ownership, publish the metadata of the hub domain node into its data catalog.
  2. The hub domain node steward reviews the published catalog to ensure it aligns with its governance framework.
  3. The steward then approves or rejects the published catalog contents. If approved, the catalog is updated with the metadata.
  4. When a spoke domain data requestor requires data from the hub node, the data requestor browses the hub data catalog to identify the data of interest.
  5. Once the data of interest is identified, the data requestor requests the data from the hub through the data share service.
  6. The request for data access is routed to the data publisher. The data publisher reviews the request and approves or rejects the request for data access.
  7. If the request is approved, the data publisher shares the data with the data requestor through the data share service that enables data sharing between the hub and the spoke nodes. The terms of data usage are also clarified.
  8. Finally, the data requestor reviews the terms of data usage. Upon accepting the terms, the data requestor can start consuming the data.
  9. The data publisher continuously monitors the data usage patterns through the data share service.

Data sharing between Data Mesh domains:

  1. First, the data publishers, who have data ownership, publish the metadata of the domain node into the data catalog.
  2. The enterprise data mesh steward reviews the published catalog to ensure that it aligns with the organization’s governance framework.
  3. The steward then approves or rejects the published catalog contents. If approved, the catalog is updated with the metadata.
  4. When a data requestor in one node requires data from another node, the data requestor browses the data mesh catalog to identify the data of interest.
  5. Once the data of interest is identified, the data requestor requests the data from the publishing node through the data share service.
  6. The request for data access is routed to the data publisher. The data publisher reviews the request and approves or rejects the request for data access.
  7. If the request is approved, the data publisher shares the data with the data requestor through the data share service that enables data sharing between the nodes. As in the hub-spoke architecture, the terms of data usage are also clarified.
  8. Finally, the data requestor reviews the terms of data usage. Upon accepting the terms, the data requestor can start consuming the data.
  9. The data publisher constantly monitors the data usage pattern through the data share service.
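The nine-step flow described above can be sketched as a small state machine. This is an illustrative model only: `DataShareService`, its method names, and the `RequestState` values are hypothetical, and all state is kept in memory for brevity.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Dict, List, Tuple


class RequestState(Enum):
    REQUESTED = auto()
    APPROVED = auto()
    REJECTED = auto()
    ACTIVE = auto()  # terms accepted, consumption may start


@dataclass
class DataShareService:
    pending: Dict[str, str] = field(default_factory=dict)  # awaiting steward review
    catalog: Dict[str, str] = field(default_factory=dict)  # approved metadata
    requests: Dict[Tuple[str, str], RequestState] = field(default_factory=dict)
    usage_log: List[Tuple[str, str]] = field(default_factory=list)

    def publish(self, dataset: str, metadata: str) -> None:
        """Step 1: the publisher submits metadata for review."""
        self.pending[dataset] = metadata

    def steward_review(self, dataset: str, approve: bool) -> None:
        """Steps 2-3: the steward approves or rejects the submission."""
        metadata = self.pending.pop(dataset)
        if approve:
            self.catalog[dataset] = metadata

    def request_access(self, requestor: str, dataset: str) -> None:
        """Steps 4-5: the requestor asks for a cataloged data set."""
        if dataset not in self.catalog:
            raise KeyError(f"{dataset} is not in the catalog")
        self.requests[(requestor, dataset)] = RequestState.REQUESTED

    def publisher_review(self, requestor: str, dataset: str, approve: bool) -> None:
        """Steps 6-7: the publisher approves or rejects the request."""
        key = (requestor, dataset)
        if self.requests.get(key) is not RequestState.REQUESTED:
            raise ValueError("no open request to review")
        self.requests[key] = RequestState.APPROVED if approve else RequestState.REJECTED

    def accept_terms(self, requestor: str, dataset: str) -> None:
        """Step 8: the requestor accepts the terms of data usage."""
        key = (requestor, dataset)
        if self.requests.get(key) is not RequestState.APPROVED:
            raise PermissionError("request has not been approved")
        self.requests[key] = RequestState.ACTIVE

    def consume(self, requestor: str, dataset: str) -> str:
        """Consumption is logged so the publisher can monitor usage (step 9)."""
        if self.requests.get((requestor, dataset)) is not RequestState.ACTIVE:
            raise PermissionError("terms of data usage not accepted")
        self.usage_log.append((requestor, dataset))
        return self.catalog[dataset]


service = DataShareService()
service.publish("orders", "Confirmed customer orders")
service.steward_review("orders", approve=True)
service.request_access("spoke-a", "orders")
service.publisher_review("spoke-a", "orders", approve=True)
service.accept_terms("spoke-a", "orders")
result = service.consume("spoke-a", "orders")
```

The same sketch applies to both patterns; only the reviewing steward and the catalog differ (the hub node's steward and catalog in one case, the enterprise data mesh steward and catalog in the other).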

Conclusion

Pradeep Menon
