Modern Modular Data Pipelines Example

 I would like to share with you my favourite example of the modern data pipeline.

It's amazing.

The first cool thing that we see is that this great pipeline is utilizing a full range of cloud services built for diverse use cases. Choosing the correct tool for each use case can be one of the key factors for the success of your idea, allowing you to get things running as fast as possible without reinventing the wheel.
Another cool thing is if we want to pull data from any non-trivial data source, like Twitter or Jira or GitHub, Azure Databricks is our first friend.
However, the most noticeable advantage of this pipeline, is that instead of having a monolithic data flow, this pipeline is actually multiple pipelines that are running in parallel. Short, simple, and independent pipelines. Multiple independent pipelines can work in parallel and on different frequencies. One pipeline failure would not impact others. This is an easy way to scale each pipeline separately to speed up only specific tasks and save money on keeping other operations on lower compute.

Comments

Popular posts from this blog

Unlocking Microsoft Fabric: A Simple Guide when you only have a personal account.

ETL to ELT journey: Break free your Transformations and discover Happiness and Data Zen

Coding is a rollercoaster of efficiency and eyebrow-raising discoveries.