Posts

Showing posts with the label data catalogs

How Data Mesh architecture and Data Catalogs help decentralized data teams.

Not too long ago, Data Administrators had to change their long-standing habit of maintaining a monolithic database. They were forced to accept polyglot persistence: developer teams began choosing the data storage technologies that best supported each application's data model requirements. Now the time has come to break down the Data Lake monolith as well. Refactoring the monolithic Data Lake makes a lot of sense. The central data lake, like the central data team, is often a huge bottleneck. The central data team is usually busy fixing broken data pipelines and coping with constant data changes made by the domain owners and development teams. Data Mesh architecture comes to the rescue here. Instead of a centralized data team, there are multiple decentralized domain data teams, each producing data sets or consuming other teams' data sets. A domain data team usually knows its domain data very well and is aware ...
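To make the idea concrete, here is a minimal sketch (all names are illustrative, not from any particular Data Mesh implementation) of a domain-owned "data product": a data set published by one domain team, with an explicit owner and declared schema, that other teams consume directly instead of going through a central data team.

```python
from dataclasses import dataclass, field

# Hypothetical sketch: in a Data Mesh, each domain team publishes its data
# as a "data product" with an explicit owner and a declared output schema,
# instead of handing raw tables to a central data team.

@dataclass
class DataProduct:
    name: str             # e.g. "orders.daily-summary"
    owner_team: str       # the domain team accountable for this data set
    schema: dict          # declared output schema: column -> type name
    consumers: list = field(default_factory=list)

    def subscribe(self, consumer_team: str) -> None:
        """Another domain team registers as a consumer of this product."""
        if consumer_team not in self.consumers:
            self.consumers.append(consumer_team)

# The orders domain team owns and publishes its own data set...
orders = DataProduct(
    name="orders.daily-summary",
    owner_team="orders-domain",
    schema={"order_date": "date", "total": "decimal"},
)

# ...and the analytics domain team consumes it directly, with no central
# data team acting as a bottleneck in the middle.
orders.subscribe("analytics-domain")
print(orders.consumers)  # ['analytics-domain']
```

The point of the sketch is the ownership boundary: the schema and its changes live with the domain team that knows the data, which is exactly what the central-team model lacks.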

Data Quake

Data Quake. That's what it is. Dave Wells has just given this great definition, which clearly describes what has been happening in the data management world in recent years. I am greatly enjoying Dave's session today at the Enterprise Data World summit and couldn't resist writing down a summary. Everything we did in the last decade is now considered wrong. We used to believe that application logic would run faster and work better if it sat inside the database layer. Now that architecture is considered a wrong choice. The same goes for data normalization and strong schemas. Some people even say that data warehouses are dead. We need to rethink everything. Data schemas used to be defined during the design phase. Now we define schema-on-read, after the data has been persisted. Good news: I have always believed, and Dave has just confirmed, that there is no schema-less data. Despite the fact that we do not get to design the schema up front anymore, for Big Data we need to un...
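The schema-on-read idea mentioned above can be shown in a few lines. This is a minimal sketch (field names and the reader function are illustrative): records are persisted as raw JSON with no schema enforced at write time, and the schema is applied only when the data is read back.

```python
import json

# Records persisted with no schema enforced at write time: note the
# inconsistent "age" type and the extra "city" field in the second record.
raw_records = [
    '{"user": "alice", "age": "34"}',
    '{"user": "bob", "age": 27, "city": "Oslo"}',
]

# The schema is decided here, at read time: which fields we want and how
# to coerce their types. The data itself was never "schema-less"; we are
# simply choosing the schema after persistence rather than before.
read_schema = {"user": str, "age": int}

def read_with_schema(line: str, schema: dict) -> dict:
    """Parse one raw JSON record, keeping and coercing only schema fields."""
    record = json.loads(line)
    # Extra fields like "city" are simply ignored by this particular reader.
    return {key: cast(record[key]) for key, cast in schema.items() if key in record}

rows = [read_with_schema(line, read_schema) for line in raw_records]
print(rows)
# [{'user': 'alice', 'age': 34}, {'user': 'bob', 'age': 27}]
```

A different consumer could read the very same raw records with a different schema, which is the flexibility (and the risk) that schema-on-read trades against up-front design.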