The Distributor. Think and Rethink every thing.

The key player in the transactional replication topology is the Distributor. Misconfiguration of the Distributor can lead to increased load on the production server, can interfere with the regular application activities and even cause production databases to become inaccessible. In addition, its configuration greatly impacts the replicated data latency. Changing the Distributor’s configuration after setting up the replication can be problematic and can impact application activity.
The main aspects to be considered when planning a replication design are:
  • Number of publications on each publisher in the future replication topology
  • EPS in the replicated tables
  • Number of planned publishers
  • Distance between Publisher, Distributor and Subscribers
  • Number of subscribers for each publication
  • Size of the published Databases
Most of the answers to the above questions will lead to the the decision whether you want to have a dedicated Distributor server or configure the Publisher or the Subscriber to perform the Distributor role too. Both approaches have their pros and cons.
Keep in mind that a dedicated Distributor server will require one more SQL Server license and, most likely, additional monitoring application licenses. To lower the costs the Distributor can be configured on top of the standard edition of SQL Server which, luckily for us, supports 2 node failover clustering to make sure your Distributor is always up.
Number of publications
For each publication there is a Log Reader Agent that monitors and detects data changes on tables marked for a replication. The Log Reader Agent is actually an independent executable, running on the server that performs a Distributor role. The larger the number of your publications, the more processes you might find in the server memory space.
If you plan to configure your Publisher to act as a Distributor, take into consideration that those processes will eat up the resources on your production server.  
image
EPS in the replication tables All relevant transactions will be read off the database transaction log file by the Log Reader and moved to the distribution database on the Distributor. Each subscriber’s distribution service will read those transactions and apply them on the subscription database. Every time a subscriber pulls data from the distribution database, the IO subsystem and buffer pool are serving those requests. The bigger the number of the subscribers - the higher such load will be.  This can be another point of having a dedicated Distributor with it’s own buffer pool and dedicated storage.
Moreover, it’s a good idea to isolate distribution database files from other production databases. We have seen a noticeable impact on the production server whose database files LUN was located on the same NetApp array together with the Distributor database files’ LUN.
Number of publishers Each Publisher can be served by only one Distributor. If the number of publications on the Publisher is low and the number of data changes is not significant, you can avoid extra costs and make each Publisher perform also the Distributor role. However, when you have several Distributors in your environment it will get less convenient to monitor the distribution databases since monitoring code needs to be maintained at several locations. 
For instance, if we have a procedure that produces a list of subscriptions that have a problematic status, like uninitialized or expired or a subscription which is in a retrying or failed state.  
If one of the Publishers has enough spare resources, it can play the Distributor role for the rest of Publishers. Just make sure you configure the permissions correctly to make sure that only allowed people can access the Distribution database.
Distance between the Publisher, the Distributor and the Subscribers. And number of subscribers The Distributor server must have a reliable network to the Publisher and should be located as close as possible. This way Log Reader agents will read off the transaction logs fast enough and transaction logs will not be growing and will not endanger the production databases. Even in a Simple recovery mode, transaction log VLFs are not marked as free till all the transactions have been processed by the log reader.  
The subscription service named “Distribution agent” is also an executable. This agent can be configured in one of two approaches:
  • pull subscriptions (agent is running on the Subscriber server)
  • push subscriptions (agent is running on the Distributor sever)
If subscribers are located far from the Distributor, it’s better to configure pull subscriptions in order minimize the network utilization when moving initial copies of the tables (snapshots).
If Subscribers are close to the Distributor and it’s important to lower the load on the Subscriber server, push subscriptions are also an option. Just remember that there are already a bunch of exe files running on the Distributor, by configuring the push subscriptions you are adding another executable per publication for every subscriber.
It is common to create publications on the database level. However, if there are several tables with high EPS, separate publications can be created for each table. In such cases, distribution agents for those publications run in parallel and improve data replication speed. However this will increase number of executables on Distributor.
In case the subscriber server has more spare resources and located near the Production server, you can configure the Distributor on the Subscriber which can be another way to cut the licensing costs.
Size of the published databases Of course, the size of a replicated database does not impact the latency of the data, EPS does. The higher the number of the data changes, the longer it will take to replicate the data. Small but busy databases will most probably have higher replication delay than big “lazy” datasets with low number of changes. However, when you initialize your publication you first need to move the whole tables marked for replication to the Subscriber and then start replicating the transactions.
At this point database/table sizes make a whole world of difference. If tables are small you can initialize the replication using the Snapshot Agent. Every replicated table will be BCPed to the csv file, copied to the snapshot folder on the Distributor and then pulled by the Subscriber or pushed by the Distributor to the Subscriber depending on the subscription type that you have chosen. The larger the tables, the more of them – the longer the process will take. Especially when the Subscriber is located far away. In such a case you can initialize your subscription using database backup/restore which can take less time when copied to the Subscriber.
Still, in case of “big data” you may have to wait hours and even days till your hundreds of gigabytes have been copied to the Subscriber. Meanwhile, your distribution database will keep all the transactions since the replication initialization point. Watch it’s size closely in order to avoid the situation where it takes up all the disk space on the drive.

Other random thoughts:

Loopback detection for BiDirectional Transactional Replication 
If you are using BiDirectional Transactional replication you are most probably familiar with the loopback detection mechanism. Distribution server uses it to make sure that the transaction originated on the server A and replicated to server B will not be replicated back to the server A. Loopback detection logic is a part of the sys.sp_MSget_repl_commands procedure and is evaluated when the subscription agent pulls the transactions from the Distributor. Which unfortunately means that all transactions that arrived at the Distributor from server A and were applied to server B will anyway return back to the Distributor and will stay in the distribution database till the Subscriber discards them.
By the way, @loopback_detection property of the sp_addsubscription system procedure is true by default, you don’t need to turn it on.
Tempdb on the Distributor
The workload on the Distributor server is not too much different from other production servers. Transactions data is being stored inside two tables dbo.MSrepl_transactions and dbo.MSrepl_commandsReplication agents activities are logged into tables dbo.MSlogreader_history and dbo.MSdistribution_history.The process of pulling transactions by the sys.sp_MSget_repl_commands procedure uses several temporary variables. It gets executed by every subscriber of every publication.  The more publications * subscribers your Distributor is serving, the larger the number of objects created inside tempdb. In addition, for each publisher there is a Replication monitoring refresher job. It executes the sys.sp_replmonitorrefreshdata procedure in a continuous loop which uses several temporary tables. To minimize the contention on the GAM/SGAM pages you should have multiple tempdb data files, as you most probably do on the rest of your production servers.  
Distribution database
In case you have decided to create a dedicated Distributor server there is an another important decision to take: whether you want one distribution database for all Publishers or separated distribution databases. A Publisher can be linked to only one distribution database but the distribution database can serve multiple Publishers.
Environments with a low level of data changes can share one distribution database. For a busy or potentially busy databases it’s better to create separate distribution database per publisher ( not per publication (!) ).  The less rows each transaction on the published database is inserting/updating/deleting, the more similar EPS you will get on the distribution database. For instance, single UPDATE that changes 10 rows in the published table will be translated into 10 separate UPDATE statements and inserted into the distribution database.
When several publishers are sharing the same distribution database its EPS can get quite high. Separation of the databases makes it possible to separate the data files onto different physical disks in order to avoid impact of one server’s load on another. 
It’s difficult to move a Publisher out of the shared distribution to it’s private distribution database after the replication is already up and running. In order to change the distribution database, all publications of the specific publisher must be dropped, recreated and every database must be again copied to all subscribers.
Distribution databases are hiding inside the system databases folder. When you are creating a new distribution database you can name it whatever you like. Don’t be surprised when you see something like this in the system databases folder.
image
To get a list of distribution databases use is_distributor property of sys.databases
selectname
fromsys.databases
whereis_distributor= 1
Distributor password
In order to Configure a Publisher to use the Distributor you need to enable Publisher on the Distributor
imageimage
and after that you can configure the Distributor on the Publisher : 
execsp_adddistributor@distributor=N'DistributorServerName',@password=N'password'
If you don’t know the Distributor password you have two options. First is to find someone who knows. Second -  change it ( on the same snapshot above)
The problem here is that this password is being used by the Publishers to carry out replicated objects configuration changes. For that purpose a new linked server called “repl_distributor” is created on the Publisher. It’s not configured for the data access and its security settings set to use the abovementioned password. In case you have inherited Distributor server from someone else who had forgotten to write this password down, you can change linked server security settings on all Publishers to use a new password. We usually use a regular SQL server user for this purpose to avoid troubles if anyone decides to change the Distributor password by the mistake.
imageimage
The Snapshot folder
There is a snapshot folder that you are requested to define when setting up the Distributor. As I mentioned earlier, even if you are not setting up the snapshot replication, one of the ways to initialize the transactional replication is by using a table’s snapshot. In addition, every time you add or change articles, they will be transferred to the Subscriber using exactly the same snapshot mechanism. First of all, the snapshot folder should be a network path( and not on the local path as the wizard suggests) so pull subscriptions can access it. You must provide write permissions to the Snapshot Agent’s user and read permissions to the Distribution Agent’s user.
If there is a need to separate table snapshots from different publications, an alternative snapshot folder can be set up for each publication. To improve the snapshot transfer speed between the Distributor and the Subscriber you may want to compress the table snapshots, which can be done on the alternate folder only. Note that the compression process takes time, increases the snapshot creation time and is not suitable for unreliable networks because copy cannot be resumed in case of network failure.
All Replication agents can be customized in many ways. We can run the custom scripts before or after the snapshot creation, output all logging information to the text files, change number of threads for the BCP actions to improve performance but all this would be a whole topic for another blogpost.
Yours,
Maria
“Do what you do so well that they will want to see it again and bring their friends.” / Walt Disney

Comments

Popular posts from this blog

Unlocking Microsoft Fabric: A Simple Guide when you only have a personal account.

ETL to ELT journey: Break free your Transformations and discover Happiness and Data Zen

Coding is a rollercoaster of efficiency and eyebrow-raising discoveries.