Maria's journey in the data fields

Posts

Showing posts from 2019

SQL Server Insert Parent and Child Records with One Statement

- August 15, 2019

A few days ago, one of the developers asked me whether it was possible to generate test data with multiple nested INSERT statements, each of them inserting new rows into several parent tables and, in the same statement, reusing the autogenerated primary keys for the foreign key columns in the child table. The developer was working with PostgreSQL, so I looked for a solution for both PostgreSQL and SQL Server to learn how the two database systems differ in their implementation of ANSI features. Read the solution in my blog post here: https://www.mssqltips.com/sqlservertip/6142/sql-server-insert-parent-and-child-records-with-one-statement/ Yours, Maria P.S. (The picture is taken from Kendra Little's website.)
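As a rough illustration of the underlying idea (reusing an autogenerated parent key for the child rows), here is a minimal sketch. It is not the solution from the linked post. The `parent` and `child` tables and the helper names are hypothetical. The PostgreSQL statement is shown only as a string and relies on a data-modifying CTE with RETURNING. The runnable part uses Python's built-in sqlite3 module, which has no data-modifying CTEs, so it swaps in two statements inside one transaction instead of a single statement.

```python
import sqlite3

# For reference only, not executed here: one way to do this in a single
# PostgreSQL statement is a data-modifying CTE. Table and column names
# are hypothetical.
POSTGRES_ONE_STATEMENT = """
WITH new_parent AS (
    INSERT INTO parent (name) VALUES ('order-1') RETURNING id
)
INSERT INTO child (parent_id, name)
SELECT new_parent.id, item.name
FROM new_parent
CROSS JOIN (VALUES ('item-a'), ('item-b')) AS item(name);
"""


def make_demo_db():
    """Create an in-memory SQLite database with a parent and a child table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE parent ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE child ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "parent_id INTEGER NOT NULL REFERENCES parent(id), "
        "name TEXT NOT NULL)"
    )
    return conn


def insert_parent_with_children(conn, parent_name, child_names):
    """Insert one parent row and its child rows, reusing the generated key."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT INTO parent (name) VALUES (?)", (parent_name,)
        )
        parent_id = cur.lastrowid  # autogenerated primary key of the new parent
        conn.executemany(
            "INSERT INTO child (parent_id, name) VALUES (?, ?)",
            [(parent_id, name) for name in child_names],
        )
    return parent_id
```

Calling `insert_parent_with_children(make_demo_db(), "order-1", ["item-a", "item-b"])` returns the new parent key, and both child rows carry it in `parent_id`.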

RIP SQLblog.com

- August 05, 2019

DBbest did their best but failed to maintain a great knowledge-sharing place where the brightest SQL minds used to exchange new ideas and share their wisdom. I want to thank Adam Machanic for giving me the opportunity to host my blog there and be visible to the huge number of SQLblog readers. I will move my favourite articles over here if I manage to find them. Yours, Maria

Data Exploration with Python and SQL Server using Jupyter Notebooks

- July 31, 2019

When it comes to data-driven research, presenting the results is a complicated task. It is convenient to have the initial dataset handy in case anyone asks to re-run the computations or wants to interact with the data. Reproducibility across a number of fields is a tough task and there aren’t too many tools that can help. It’s easy to show numbers in Excel or in PowerPoint, but in many use cases the context and the pathway to the results are lost. What is Jupyter Notebook? Jupyter Notebook is a great tool that is becoming more and more popular these days. It combines live code execution with textual comments, equations and graphical visualizations. It helps you follow and understand how the researcher reached their conclusions. The audience can play with the data set either during the presentation or later on. Some people say that Project Jupyter is a revolution in the data exploration world just like the discovery of Jupiter's moons was a revolu...

Data Quake

- March 17, 2019

Data Quake. That's what it is. Dave Wells has just given this great definition, which clearly describes what has been happening in the data management world in recent years. I am greatly enjoying Dave’s session today at the Enterprise Data World summit and couldn't resist writing down a summary. Everything that we did in the last decade is becoming wrong now. We used to believe that application logic can run faster and do better if it sits inside the database layer. Now this architecture is considered a wrong choice. The same goes for data normalization and strong schemas. Some people even say that data warehouses are dead. We need to rethink everything. The data schema used to be defined during the design phase. Now we define schema-on-read, after the data has been persisted. Good news - I have always believed that, and Dave has just mentioned it - there is no schema-less data. Despite the fact that we do not get to design the schema anymore, for Big Data we need to un...

Serverless ETL: Read, Enrich and Transform Data with AWS Glue Service

- March 14, 2019

More and more companies are aiming to stop managing their own servers and move to a cloud platform. Going serverless offers a lot of benefits, like lower administrative overhead and lower server costs. In a serverless architecture, developers work with event-driven functions that are managed by cloud services. Such an architecture is highly scalable and boosts developer productivity. AWS Glue is an ETL service that utilizes a fully managed Apache Spark environment. Glue ETL can clean and enrich your data and load it into common database engines inside the AWS cloud (on EC2 instances or the Relational Database Service) or write files to S3 storage in a great variety of formats, including Parquet. I have recently published 3 blog posts on how to use the AWS Glue service when you want to load data into SQL Server hosted on the AWS cloud platform. 1. Serverless ETL using AWS Glue for RDS databases 2. Join and Import JSON files from s3 to SQL Server RDS instance...