Data Mesh

Moving beyond the Data Lake. Treating data as a first-class product owned by domain teams.

Domain Ownership

Scalability isn't just about disk space; it's about organizational throughput. A centralized data team eventually becomes a bottleneck and loses the domain context of the data it ingests.

Data Mesh shifts ownership to the left. The Checkout Team owns the `Checkout` data product. They are responsible for its quality, its schema, and its uptime.

The Four Pillars

  1. Domain Ownership: Teams own their data end-to-end.
  2. Data as a Product: Data is treated with the same rigor (SLAs, documentation) as a microservice API.
  3. Self-Serve Infrastructure: A central platform team provides the tools (buckets, pipelines, catalogs) so domains can build easily.
  4. Federated Governance: Global rules (security, PII handling) applied locally by each domain.
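To make the fourth pillar concrete, here is a minimal sketch of one global rule that each domain could run against its own schema. The rule itself (PII columns must declare a masking strategy), the column metadata keys `pii` and `masking`, and the sample columns are all assumptions made for illustration; they are not part of any particular governance tool.

```python
# Hypothetical global rule, defined once and applied by every domain:
# any column flagged as PII must declare a masking strategy.
def pii_violations(columns: list[dict]) -> list[str]:
    """Return the names of PII columns that have no masking strategy."""
    return [c["name"] for c in columns if c.get("pii") and not c.get("masking")]


# Illustrative column metadata for the Checkout data product (made up).
checkout_columns = [
    {"name": "order_id"},
    {"name": "email", "pii": True, "masking": "hash"},
    {"name": "shipping_address", "pii": True},
]

if __name__ == "__main__":
    # The domain team runs the shared rule locally, e.g. before publishing.
    print(pii_violations(checkout_columns))
```

The rule is written once, but the check runs inside each domain's own workflow, which is the sense in which governance is federated rather than centralized.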

Data Contracts

Schema Enforcement

Schemas act as the contract between a data product and its consumers. Breaking changes are caught at compile or build time, before deployment, rather than in production pipelines.
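One way such a check might look is sketched below: a function that compares the published schema with a proposed one and lists the changes that would break existing consumers. The `Field` type, the three rules (removed field, changed type, field becoming nullable), and the sample `checkout_v1`/`checkout_v2` schemas are assumptions for illustration; real schema registries define their own compatibility rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    name: str
    type: str
    nullable: bool = False


def breaking_changes(old: list[Field], new: list[Field]) -> list[str]:
    """Return reasons why `new` would break consumers that read `old`."""
    new_by_name = {f.name: f for f in new}
    problems = []
    for f in old:
        candidate = new_by_name.get(f.name)
        if candidate is None:
            problems.append(f"field removed: {f.name}")
        elif candidate.type != f.type:
            problems.append(f"type changed: {f.name} {f.type} -> {candidate.type}")
        elif candidate.nullable and not f.nullable:
            problems.append(f"field became nullable: {f.name}")
    # Fields that exist only in `new` are treated as non-breaking additions.
    return problems


# Illustrative schemas (made up for this sketch).
checkout_v1 = [Field("order_id", "string"), Field("total_cents", "int")]
checkout_v2 = [
    Field("order_id", "string"),
    Field("total_cents", "float"),
    Field("coupon", "string", nullable=True),
]

if __name__ == "__main__":
    for problem in breaking_changes(checkout_v1, checkout_v2):
        print(problem)
```

Run as a build or CI step, a non-empty result would fail the change before it reaches any pipeline.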

Quality SLAs

Producers publish explicit guarantees such as "This dataset is updated every 15 minutes" or "This field is never null," so consumers can trust the data they are using.
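The two example guarantees can be expressed as small checks. This is a sketch under stated assumptions: the function names `is_fresh` and `null_violations` are hypothetical, and rows are assumed to arrive as plain dictionaries; a production setup would more likely run equivalent checks inside a data quality tool or the pipeline itself.

```python
from datetime import datetime, timedelta


def is_fresh(
    last_updated: datetime,
    now: datetime,
    max_age: timedelta = timedelta(minutes=15),
) -> bool:
    """True if the dataset was updated within the promised interval."""
    return now - last_updated <= max_age


def null_violations(rows: list[dict], field: str) -> int:
    """Count rows where `field` is missing or None."""
    return sum(1 for row in rows if row.get(field) is None)
```

A producer could run both checks after every load and alert, or block publication, when `is_fresh` returns `False` or `null_violations` returns a non-zero count.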

Discoverability

A centralized Data Catalog allows anyone in the company to find and request access to data products.
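As a rough illustration of what a catalog provides, the sketch below models an in-memory registry where domains register their data products and anyone can search them. The `DataProduct` and `Catalog` classes, their fields, and the search behavior are hypothetical; a real catalog would also handle access requests, lineage, and persistence.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    name: str
    owner: str
    description: str
    tags: set[str] = field(default_factory=set)


class Catalog:
    """Toy in-memory catalog; not a real catalog product's API."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        if product.name in self._products:
            raise ValueError(f"already registered: {product.name}")
        self._products[product.name] = product

    def search(self, term: str) -> list[str]:
        """Return names of products whose name, description, or tags match."""
        term = term.lower()
        return sorted(
            p.name
            for p in self._products.values()
            if term in p.name.lower()
            or term in p.description.lower()
            or term in {t.lower() for t in p.tags}
        )
```

For example, after the Checkout Team registers `DataProduct("Checkout", "Checkout Team", "Completed orders", {"orders"})`, a call to `search("orders")` would return that product's name.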
