Replication is one of the most “mature” technologies in SQL Server, yet it is a bit underrated these days, with Availability Groups (AG), Mirroring and other newer technologies now shipping with SQL Server.
As the name says, it is transactional, which means it is log based.
This is how it works:
Let's say we are doing an insert into a published table:
1. The insert transaction is marked as “for replication” in the transaction log of the published database.
2. The Log Reader, which is responsible for reading the log, picks up that transaction (I will go into more detail in another article) and puts the information about it into the distribution database on the distributor.
3. Once the transaction has been read by the Log Reader, it can be cleared from the log when the next checkpoint (simple recovery) or the next transaction log backup (full recovery) occurs.
4. The Distribution Agent picks up the next transaction from the distribution database and delivers it to the subscriber.
These 4 simple points describe, at a high level, “how transactional replication works”. We will go into more detail about each of these steps.
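To make the four steps more concrete, here is a minimal Python sketch of the flow. It is a toy model, not how SQL Server is implemented: the names Publisher, LogRecord, run_log_reader and run_distribution_agent are made up for illustration, the LSN is just a counter, and the distribution database is simplified to a queue that forgets a command once it has been delivered.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class LogRecord:
    lsn: int                  # simplified log sequence number (a plain counter)
    command: str              # the logged command, e.g. an INSERT
    for_replication: bool     # step 1: is the record marked "for replication"?
    read_by_log_reader: bool = False

    def pending(self):
        # A record blocks log clearing while it still waits for the log reader.
        return self.for_replication and not self.read_by_log_reader


@dataclass
class Publisher:
    log: deque = field(default_factory=deque)
    next_lsn: int = 1

    def insert(self, command, published=True):
        # Step 1: the transaction is written to the log and marked if the table is published.
        self.log.append(LogRecord(self.next_lsn, command, published))
        self.next_lsn += 1

    def truncate_log(self):
        # Step 3: at a checkpoint or log backup, the log is cleared from the front,
        # but never past a record the log reader has not processed yet.
        while self.log and not self.log[0].pending():
            self.log.popleft()


def run_log_reader(publisher, distribution_db):
    # Step 2: copy marked, unread transactions into the distribution database.
    for record in publisher.log:
        if record.pending():
            distribution_db.append(record.command)
            record.read_by_log_reader = True


def run_distribution_agent(distribution_db, subscribers):
    # Step 4: deliver each queued transaction, in order, to every subscriber.
    while distribution_db:
        command = distribution_db.popleft()
        for subscriber in subscribers:
            subscriber.append(command)


def demo():
    publisher = Publisher()
    distribution_db = deque()
    subscriber_a, subscriber_b = [], []

    publisher.insert("INSERT INTO dbo.Orders VALUES (1)")                    # published table
    publisher.insert("INSERT INTO dbo.Scratch VALUES (9)", published=False)  # not published

    publisher.truncate_log()              # nothing is cleared: the log reader has not run yet
    log_size_before = len(publisher.log)

    run_log_reader(publisher, distribution_db)
    publisher.truncate_log()              # now both records can be cleared
    run_distribution_agent(distribution_db, [subscriber_a, subscriber_b])

    return log_size_before, len(publisher.log), subscriber_a, subscriber_b
```

Calling demo() shows the point of step 3: as long as the Log Reader has not processed a marked transaction, a checkpoint or log backup cannot clear it, so a stopped Log Reader makes the publisher's log grow.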
As you noticed, we have a few components that are crucial for replication to work. Here is the list with short descriptions:
Publisher – the source of the published data.
Distributor – connects to the publisher to take the published information, stores it and distributes it to the subscribers.
Subscriber – the destination of the published data.
You can compare Transactional Replication to a printed newspaper subscription. The Publisher (New York Times, Washington Post etc.) creates and publishes information (Articles), and the Distributor (a courier company or the post office) delivers the newspaper to ALL Subscribers that paid for a subscription.
Other important components of transactional replication are the AGENTS, each with different tasks.
The main agents for transactional replication are:
Log Reader – runs on the distributor and connects to the publisher in order to read the marked transactions and deliver them to the distributor
Snapshot Agent – takes a snapshot of the published articles in two main cases: when the initial snapshot of all articles is made, or when our publication is in “full sync” mode and we are adding new articles to replication
Distribution Agent – its main job is delivering replicated transactions to all subscribers
The Log Reader and Distribution Agent run continuously and should not be stopped for a longer period. Each agent is an actual SQL job (usually located on the distributor; with a pull subscription the distribution agent is located on the subscriber) and has a set of parameters that a DBA can tweak (like multi streaming).
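The rule about where each agent job lives is easy to mix up, so here is a small Python sketch that only encodes what is described above. The function agent_host and the SubscriptionType enum are hypothetical helpers for illustration, not part of any SQL Server API.

```python
from enum import Enum


class SubscriptionType(Enum):
    PUSH = "push"
    PULL = "pull"


def agent_host(agent, subscription_type=SubscriptionType.PUSH):
    """Return the server where the agent's job normally runs."""
    if agent in ("log reader", "snapshot"):
        # These agents run on the distributor regardless of subscription type.
        return "distributor"
    if agent == "distribution":
        # Push: the distributor pushes changes out. Pull: the subscriber fetches them.
        if subscription_type is SubscriptionType.PULL:
            return "subscriber"
        return "distributor"
    raise ValueError(f"unknown agent: {agent!r}")
```

For example, agent_host("distribution", SubscriptionType.PULL) returns "subscriber", while the same call for a push subscription returns "distributor".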
These are the basics of transactional replication. Now we can go into more detail. In the next articles I will show you how to set up, maintain, tweak and fix (or break :)) it.