Posts

Showing posts with the label data fabric

Having fun isn't hard when you have a modern data catalog

Data Catalog and Data Fabric are enablers for any data architecture. Whether you use a centralized or a decentralized architecture, a Data Catalog enables effective management and helps users interact with the data. On closer inspection, the Data Catalog turns out to be one of the main technology pillars of Data Fabric, which takes a much wider approach that also includes semantic enrichment of data, data preparation, data recommendation engines, and various data orchestrators. Data Fabric, empowered by a Data Catalog, is an abstraction layer that helps applications connect to data through built-in APIs, regardless of database technology and data server location. However, a traditional, manually managed data catalog does not qualify as a Data Fabric unit. A modern Data Catalog is actively driven by metadata and scans data sources regularly, with no need for manual maintenance. Modern Data Catalogs usually have built-in, fully automated end-to-end data lineage and enforc...

The Greatest Reasons to use or not to use a De-centralized Data Management Architecture

Imagine throwing a dance party for your data: everyone waltzing in harmony, stepping on a partner's toes from time to time. Distributed data management is no less amusing. It's chaotic and occasionally hair-raising, but with the right approach it can even be perfect. A huge central data warehouse is a struggle to scale efficiently and hard to innovate on. There is no clear ownership of the data domains, and it is a single point of failure. During peak usage times, data access and processing can be slow. Even implementing updates or upgrades can be complex and time-consuming. Centralized databases are attractive targets for cyberattacks, and a successful breach can compromise a large amount of sensitive data. As an alternative to a centralized Data Warehouse, data can be owned and managed by the domains that produce it. When considering a decentralized approach, we need to make sure there is a self-serve data infrastructure platform that allows different domains or teams to...

The Greatest Reasons to use or not to use a Centralized Data Access Architecture

When developing a modern Data Platform Layer, one of the main decisions is whether to opt for a centralized or a decentralized data access architecture. There is no “one-size-fits-all” solution; both have advantages and disadvantages. A Centralized Data Access Architecture usually means duplicating data from the operational layer into the analytical layer and applying various transformations to the data to support and speed up data analytics. It involves three parts: the Operational (online transaction processing) Layer, where all microservices and their operational databases are located; the Analytical Data Layer, where data lakes support data scientists' work and a data warehouse supports Business Intelligence; and the transformations, ETL or ELT data pipelines, which move data from the operational layer into the analytical layer. If we opt for a Centralized Data Access architecture, what would be the benefits and the drawbacks? CONSISTENCY: consolidating data into a central location can...