Execs and desktop heap

Agent Processing Account

In previous post I explained how Replication agents works. It is pretty mych exe running on Distributor (or in pull replication distrib.exe on Subscriber). As always each exe is running under some kind of Windows account. Like here:execs

As you can see distrib.exe and logread.exe are running on SQL Service account and also you notice each if these is using RAM. That account you choosing when you configure publication (configuring snapshot and logreader processing account) and subscriber (configuring distribution agent). Every time you configuring one of these things you need to provide Security settings. There are two things that need to be provided. First processing account and second authentication account. In some cases you can use same account for both, like here:

SecuritySettings

As you can see we will be using SQL Service account for processing logread.exe and also we will use same account in order to connect to Publisher (in order to read transactions). Side note: If you choosing to do so, you need to make sure that this account has all necessary permissions on Publisher server.

Side note 2: You can modify these settings always by going to Publication settings/properties (to modify snapshot and logreader security settings) and Subscription settings/properties (to modify Distribution Agent settings).

We focusing now only on first part. Processing account this is the one that exe will be running under.

You probably wonder why I’m writing about this? Well it is to do with “noninteractive desktop heap “.

Noninteractive desktop heap

Now I would like to put small disclaimer. You may not get to this point where you will get that issue. What is the exact issue?

Let’s assume you have one remote distributor that is used to run 50,60, 100 or more distribution agents. Each of this agent is configured to run under SQL Service account. Each agent is running exe under that account and each process taking some amount of memory on that instance. It is also taking so called “noninteractive desktop heap” that is allowing running processes concurrently. That value is limited for each processing account and it means once u reach it for SQL Service account you wont be able to run any more replication agents.

Once you get to this point the replication agents just wont run… the problem is it wont give you ANY error or just fail without giving any description… it will just stuck on “Starting agent” and finally fail after sometime.

You can test if you have that issue already and just run exe command manually under your account (if you have all necessary permissions on the server) and if agent is running fine then most probably you reached that limit. How to run replication agents manually I described here:

Understanding Replication Agents Part 2

Solution:

  1. Change processing account to use different Domain account. Remember it need to have proper permissions on Distributor. You will need to restart agent after you change processing account.
  2. Increase Desktop Heap Size. However this need server reboot:Increasing the noninteractive desktop heap size

I normally go with solution no. 1 and create few more domain account and run agent under these accounts.

Understanding Replication Agents Part 1

3 Musketeers …LogReader, Snapshot, Distribution

First two (Logreader and Snapshot) are directly impacting Publisher. First is to read the log, find replicated transactions and put it into distribution database. Snapshot is used to make snapshot of schema, indexes, constraints, data etc. It will generate schema, bcp files on distributor (folder that the snapshot will be generated is configurable).

Last one, the Distribution agent is responsible for delivering the data that Logreader or Snapshot generated. In my experience the most issues we have is with this agent.

These three agents are actually SQL Jobs and you can start/stop and manage it in few ways. Trough replication monitor, by right clicking on Publication (View Sanpshot/Logreader status) or Subscription (View synchronization Status) and Job Activity Monitor (on distribution server).
Job naming convention is exactly the same for Types of Agents. Snapshot will have publisher name,published db name, publication name. Log Reader will have only publisher name and published db name. Distribution Agent will have publisher name, published db name, publication name, subscriber name.

REplication_Agents_jobs

In all three cases if the names are too long some strings will be truncated. Also the number at the end of each job name is the “agent id” and it is matching the id in system tables in distribution database.

Because Im dealing with quite big volume of  replication agents I often like to use job Activity monitor to manage my agents. You will notice that after configuring publisher/distributor and creating your first publication/subscription in Job Activity Monitor you can find new Categories of jobs. All replication related jobs has “REPL-” prefix.

  • Repl-Snapshot
  • Repl-Distribution
  • Repl-Logreader
  • Repl-Distribution Cleanup

new_job categories

You can easily filter these and if you ever need to start/stop multiple agents you can do it that way. Very useful when you want to change agent profiles  on multiple agents and in order to new profile start working you need to restart the agent.

Last thing about this is the job history often gives you better information about errors occurred than Replication monitor. If you need more details about error always check job history of given agent.

As mentioned before you can also use Replication Monitor (right click on Replication tree) to manage your agents. If you using Replication Monitor for the first time you need to register your publishers first in order to see it on the tree list on left pane. You can either register the publisher one by one or add entire Distributor. This is useful thing when you using one distributor for many publishers is to choose Add Distributor (Specify a Distributor and add Its Publishers…)

registering new distributors

By doing it, you will add ALL Publishers that are using that distributor. The Replication Monitor connects only to Distributor and you don’t need the direct connection to Publisher. The agents in Replication Monitor are split between the tabs like Subscriptions Watch list and Agents. In Subscriptions you will find all your subscriptions for give publisher and you can drill down to given publication on left pane if needed. In Agents tab you will find all Snapshot, LogReader agents but by default you will see LogReader agents and if you want to Snapshot you need to use drop down list to see other ones. What is nice with Replication Monitor is that you can go directly to the SQL job settings/history on distributor. You just need to choose agent, choose Action and from there Job Settings. It will open normal job properties window as you getting by using Job Activity Monitor (note: This wont work if you connected to distributor and you are using pull replication).

 

Transactional Replication and Transaction Log Management

As you know by now, Transactional Replication is using Transaction Log of the published database.

But Wait! Do I need to run my DB in Full Recovery Model? How about checkpoint? Entire beauty of this is that you do not need run your db in full recovery model. Once Replication is up and configured no transactions for given articles will be lost. How it works?

Lets assume we published a table, we will be replicating INSERT/UPDATES/DELETES and we created subscription (publication need to have at least one active subscription in order to start picking up data for replication). From now on ALL transactions for that table will be marked in transaction log as “For Replication”. That transaction will be kept in the log until LogReader  read it and put into distribution database. If from some reason, Log Reader is failing or stopped the log will keep growing, growing and growing…. when finally you ran out of space…In this case  No backup logs, no checkpoints, no changing recovery model around….will help. Ok one thing will help… dropping entire replication on given published database:) But this is not a solution for us. You need to keep in mind that Log reader agent is crucial and it need to run always. any errors on that agent can affect your published database and entire instance. Keep monitoring that job/agent.

Sequential reads

Transaction log is sequential, Log Reader is reading data  sequentially and Transactional replication is sequential… That means various things or actions can affect our replication. Pretty much everything that touching Transaction Log heavily will impact it.

Index Maintenance/Checkdb?

It will affect your Replication mostly when you are using Online rebuild. Index Rebuild is doing most of the job in transaction log and it will keep growing. Even if Replication really CANNOT replicate this “transactions”, the LogReader still need to read entire Transaction Log from beginning. Let’s assume after large index rebuild your transaction log is 10GB in size… now your Log Reader need to go trough and check entire log in order to find all transaction that are marked for replication. You either need to be very careful with rebuilds on published database (add waitfordelay between rebuilds and let log reader to read the log etc.) or you and the business accepting replication delay during the index maintenance. You will often get message like this in the Log Reader.

LogREader_Indexrebuild

As you see it scans the log and return message that no transactions marked for replication.

One thing you can tweak is the Log Reader Agent Profile. You can increase – LogScanThreshold and -Readbatchsize. Sweet spot? start by adding 0 to each value and see if it helps in your situation. Also what you should increase are -Logintimeout and -QueryTimeout as in some cases you connection can break due to timeout and Log Reader will need to start from the begging and we DO NOT want that.

Mirroring

I know that at some point mirroring will be replaced but still this need to be mention. Lets assume that database you published is also mirrored…If for any reason mirroring is falling behind (slow connection, errors etc.), your replication will do the same. Transactional Replication (Log Reader to be exact) will wait for mirroring to catch up and will read only part of the transaction log that been already send to Secondary by mirroring.

Everything ELSE:

The problem with Transactional Replication is that even large updates on tables that ARE NOT published can affect it..  Update will still go to transaction log and Log Reader still will need to scan it.

Conclusion:

You do not need Full Recovery model in order to setup Transactional Replication and replication wont loose any data because of it. You need to be aware of anything heavy happening on your database and that touching transaction log. Keep monitoring your transaction log and Log Reader.