I’m Saulius and I could help you with any database administration needs. I specialize in the administration of Microsoft SQL Servers.
I can analyze, optimize, setup your MS SQL server environments so that they reach best performance, health and fit your needs. With proper maintenance and monitoring you won’t need to care about your servers.
I working with MS SQL server platform since 2000 with various servers configurations as stand alone, cluster, mirroring, replication, alwayson etc. I’m ensuring high availability, disaster recovery on production, testing and development servers with big data flows and huge databases. The biggest experience in working with all Microsoft SQL products, RedGate MS SQL servers administration tools and my own created SSRS based MS SQL server monitoring reports.
Recently I’ve faced a general error on SQL Server transactional replication – The subscription status of the object could not be changed. This was popped up on no matter what changes trying to implement – add a new subscriber, drop a subscriber, add or remove any article, etc. In other words – the replication became something like read-only.
The changes I tried to implement using SSMS as well as system procedures, like “sp_dropsubscription”, but with no luck: The subscription status of the object could not be changed. (Source: MSSQLServer, Error number: 14068). Other errors by trying other procedures I’ve got:
Msg 14068, Level 16, State 1, Procedure sp_MSrepl_changesubstatus, Line 1386 [Batch Start Line 0] The subscription status of the object could not be changed.
Replication Monitor could not reinitialize one or more subscriptions that support automatic initialization. Additional Information: An exception occurred while executing a transact sql statement or batch. (Microsoft.SQLServer.ConnectionInfo)
The subscription status of the object could not be changed. Could not update the distribution database subscription table. The subscription status could not be changed. Changed database context to “TestDatabase”. (Microsoft SQL Server: Error 14068)
Even a snapshot agent was failing without saying anything useful… Then it looked like I need to re-setup a replication from scratch. This wouldn’t be something much if only one publication on a server with few articles, etc. But in my case – there are over 10 publications with more than 100 subscriptions. All the servers are in production and to re-setup replication – would cost a lot…
Fortunately, I’ve found a way to fix the replication without re-setup. Here what I found.
First, I’ve checked is there any orphaned articles in the replication? This might be done by running the script:
select object_name(objid), name, artid
Success! I’ve found one article exists as a replicated object (in my case udf function), but already dropped out of the database:
Please note, you can’t simply re-create a dropped article since it will get different object_id and will not fix the issue (just now though, instead of deleting the entries as described below, you may try to update article id, but at the moment I was fixing, didn’t thought about this option).
The next step – clean this up. It’s as easy as to delete a record out of a table. For sure, you need to know from what tables. Here you go, four tables in total: syssubscriptions and sysschemaarticles in a publisher database, and MSsubscriptions and MSArticles in distribution database. In my case, from the query above I’ve identified that artid = 2 is the candidate to drop. Before deleting, I’ve copied the entries to a backup table in case I’ll screw up something else and need to restore the change. Remember, all the things I was doing are on production environment…
where artid = 2
where artid = 2
where article_id = 2
where article_id = 2
Success again! After the orphaned article has been deleted out of replication, now I can do all the changes again – add a new subscriber, add a new article, etc. All back to normal!
Recently I had a huge task – improve ETL process between MS SQL servers – to transfer near billion of rows every day in a real time. This means, once a master SQL server receives new rows, the rows must be immediately transferred to 8 satellite MS SQL servers.
The back-end. Every MS SQL server running on one Intel Xeon E5-2690 2.9GHz 8 cores CPU, has 384GB of memory and enterprise SSD storage: separate RAID10 for data, index and log files. TempDB running on RAM drive (in memory). The main database table are separated in file-groups per day.
Before the improvement, I found around up to 800.000 rows per minute through-output. Since the amount of data were increasing every single day, the performance were too slow. By the way, the method to move the rows to satellite servers were to use stored procedure and to transfer the rows in chunks up to 1000000 of rows .
To improve the through-output, the decision were made to use SSIS package instead of USP, since SSIS is probably the best place for ETL processes. And of course to multi-thread the process. So after a lot of tests the best performance we reached using the data flow bellow:
The most important part is Balance Data Distributor. Using the component, the SSIS package were able to write in 4 separate threads into one table. This helped a lot and every additional thread gave additional 500000 of rows per minute through-output (using single thread without the component, SSIS were able to reach up to 1000000 rows per minute). So the total through-output are up to 3 million of rows or 1.5GB of data per minute in the real time from the master MS SQL server to 8 satellite MS SQL servers.
Hope this will help someone to save a lot of time and provides ability to have really fast solution to move huge amount of data between servers, etc. prepared in a short time.
If you need any help on details, leave a comment or write me an email!
As all of you probably know AlwaysOn is not only good feature, but also more and complex work for a DBA. In this article I’ll share with you how to automate security (logins) sync between AlwaysOn Availability Group (AG) replicas. Why we need this? Because if you or anyone else will create/alter/edit any user on SQL server primary replica, the user will not be amended on a secondary replica and if you forget to do the same change, sooner or later you may have a big problems due to security. Please note that you don’t need to care about database level security, because this will be synced by AG itself, you need to care about server level security and server roles.
So, to automate security sync I have decided to create a trigger on the SQL server. The trigger is built of few parts: the first is server logins, the second – server roles and the last – prevent any security changes on secondary replica.
SQL server logins
First thing you should know is that all changes performed on every secondary AG replica fired up by the trigger from primary replica will be executed under SQL server service account. To ensure minimal rights to the service account you need to grant view server state and view any definition (many people grant sysadmin role to the service account, but I strongly disagree with it):
GRANT VIEW SERVER STATE TO [Domain\SQL_Service_Account]
GRANT VIEW ANY DEFINITION TO [Domain\SQL_Service_Account]
The next thing trigger will do is to capture all CREATE_LOGIN, ALTER_LOGIN and DROP_LOGIN events, extract other require data such us affected login, command text from the captured event and create T-SQL command which will be executed on every secondary replica.
SQL server roles
Previously I have mentioned that I strongly disagree to grant sysadmin role to SQL server service account. So how we can add login to lets say sysadmin role if the service account does not have such permission? I decided, to keep secure environment is better than to have fully automated changes on server roles. Regarding my decision, in case of ADD_SERVER_ROLE_MEMBER or DROP_SERVER_ROLE_MEMBER event the trigger will send alert email when any server role change, with details like this:
Subject: Server role changed on [ServerName1]
Message: Need to perform the same server role change on [Servername2, ServerName3, etc.] secondary replica as performed on [ServerName1] primary replica: ALTER SERVER ROLE [diskadmin] DROP MEMBER [test]
This will allow you to know if anybody else made any server role change or at least remember you to make the same change on other replicas.
Prevent any security changes on secondary replica
And finally, what if anybody will make a security change on any secondary replica? I prefer to rollback any change like this. So if you will try to amend any user on secondary replica, you will get an error message:
Unable to add/alter/drop users on SECONDARY replica. Please perform the action on PRIMARY replica.
The source code
Here is the source code of the trigger, feel free to use it and amend it to fit your needs. If you have any thoughts, feel three to contact me or write any comment.
IF EXISTS (
WHERE NAME = 'Trg_TrackLoginManagement'
DROP TRIGGER [Trg_TrackLoginManagement] ON ALL SERVER
CREATE TRIGGER [Trg_TrackLoginManagement]
ON ALL SERVER
FOR DDL_LOGIN_EVENTS, ADD_SERVER_ROLE_MEMBER, DROP_SERVER_ROLE_MEMBER
SET NOCOUNT ON
DECLARE @recipients VARCHAR(200)
DECLARE @AGDatabase VARCHAR(200)
DECLARE @AGGroupName VARCHAR(200)
DECLARE @SQLServiceAccount VARCHAR(200)
SELECT @recipients = 'email@example.com' --Change to email you will receive alert
SET @AGDatabase = 'DatabaseName' --Change a database name which belongs to AG
SET @AGGroupName = 'AGGroupName' --Change to your AlwaysON AG name
SET @SQLServiceAccount = 'Domain\SQLServiceAccount' --Change to SQL service account SQL server is running on
FROM sys.dm_hadr_availability_replica_states ars
INNER JOIN sys.availability_databases_cluster ag
ON ars.group_id = ag.group_id
WHERE ag.database_name = @AGDatabase
AND ars.is_local = 1
) = 'PRIMARY'
DECLARE @data XML,
SET @data = EVENTDATA()
SET @EventType = @data.value('(/EVENT_INSTANCE/EventType)', 'varchar(100)')
SET @AffectedLoginName = @data.value('(/EVENT_INSTANCE/ObjectName)', 'varchar(100)')
SET @text = REPLACE(
SELECT @sql = CASE
WHEN @EventType = 'CREATE_LOGIN' THEN (
SELECT 'create login [' + p.name + '] '
WHEN p.type IN ('U', 'G') THEN 'from windows '
+ 'with '
WHEN p.type = 'S' THEN 'password = '
+ ' hashed, '
+ 'sid = '
+ ', check_expiration = '
WHEN l.is_expiration_checked > 0 THEN 'ON, '
ELSE 'OFF, '
+ 'check_policy = '
WHEN l.is_policy_checked > 0 THEN 'ON, '
ELSE 'OFF, '
+ 'default_database = ['
WHEN LEN(p.default_language_name) > 0 THEN '], default_language = ['
FROM sys.server_principals p
LEFT JOIN sys.sql_logins l
ON p.principal_id = l.principal_id
WHERE p.name = @AffectedLoginName
WHEN @EventType = 'ALTER_LOGIN' THEN @text
WHEN @EventType = 'DROP_LOGIN' THEN REPLACE(@text, CHAR(13), '')
WHEN @EventType = 'ADD_SERVER_ROLE_MEMBER' THEN @text
WHEN @EventType = 'DROP_SERVER_ROLE_MEMBER' THEN @text
WHERE group_name = @AGGroupName
AND replica_server_name <> @@SERVERNAME
WHILE EXISTS (
SELECT TOP 1 @ServerName = replica_server_name
IF @EventType IN ('CREATE_LOGIN', 'ALTER_LOGIN', 'DROP_LOGIN')
SET @executesql = 'sqlcmd -S ' + @ServerName + ' -e -Q "' + @sql + '"'
EXEC xp_cmdshell @executesql
WHERE replica_server_name = @ServerName
PRINT @Eventtype + ' activity successfully transferred to secondary server.'
DROP TABLE #secondarys
IF @EventType IN ('ADD_SERVER_ROLE_MEMBER', 'DROP_SERVER_ROLE_MEMBER')
DECLARE @EmailMain VARCHAR(MAX)
DECLARE @Secondaries VARCHAR(500)
SELECT @Secondaries = COALESCE(@Secondaries + '; ', '') + replica_server_name
WHERE group_name = @AGGroupName
AND replica_server_name <> @@SERVERNAME
SET @EmailSubject = 'Server role changed on ' + @@SERVERNAME
SET @EmailMain = 'Need to perform the same server role change on ' + @Secondaries + ' secondary replica '
SET @EmailMain = @EmailMain + 'as performed on ' + @@SERVERNAME + ' primary replica:<BR>'
SET @EmailMain = @EmailMain + @sql
@recipients = @recipients,
@subject = @EmailSubject,
@body = @EmailMain,
@body_format = 'HTML'
) <> @SQLServiceAccount
AND @EventType IN ('CREATE_LOGIN', 'ALTER_LOGIN', 'DROP_LOGIN')
PRINT 'Unable to add/alter/drop users on SECONDARY replica. Please perform the action on PRIMARY replica.'
The idea to write the article were inspired me by our very first KIT club conference. In the conference I’d talked about advanced SQL server monitoring techniques and after a lot of great feedback I decided to write what I talked about in more detailed way and to share a source code. If you will find any issues to use the source, you know where to find me.
In this article I’ll go through the very important places for advanced SQL server monitoring. This will not cover all SQL server monitoring, but will show you some methods I have not found that any monitoring software are already using. I will cover storage, performance counters, backups, indexes, waits and SQL server configuration changes monitoring. So, lets start from storage monitoring first.
The most popular way to monitor SQL server storage is to get an alert email with drives free space thresholds. This might be configured to receive daily emails with current details on drives, or to send an alert when a disk free space getting to low. It’s not perfect and it’s not enough to be one step ahead. Let me give you same examples. One day I’ve received an email with details:
The thresholds on the email looks perfect, I don’t need to take any actions, everything looks fine. But the other day I’ve received another email with different details:
In the alert email I found that the E: drive is almost full. Now, I’m in trouble. My server is in trouble. I need to take an action immediately. But wait, how to avoid situation like this? It would be better to know, when it might happen and to take an action before production database goes unresponsive. For this I have created SSRS report to monitor storage and I’m able to plan capacity:
As you can see from the report, in the first section on the left side we have details for each server drive: last and current months free space in percent, total capacity, how many free space in megabytes have been before two months and last month, and how many free space there are now. In the next two sections we can find disk space growth since baseline disk space check performed last 60 and 30 days. In the next two sections “Grow since baseline disk space check performed – Last 60 days” and last 30 days are information about how many megabytes disk grew up in total (column MB) and per day (column MB/Day) since baseline. But the most important is to know how many days/weeks/months (columns Days/Weeks/Months to zero) we have to zero. Having this information, we are able to plan future steps and avoid disk space issues on a server. The last two columns predict how many GB’s a disk will growth in the next 6 and 12 months with the current growth. Even more, we are able to drill down by clicking on any server drive to check disk growth in last three or more months in more details:
By having those very simple SSRS reports we are able to do huge work and more important – avoid troubles and mistakes.
Unlike storage monitoring, there are number of different methods to monitor SQL server performance counters. But I didn’t saw the tool to clearly compare and to have a very simple view of your server. It’s very popular to draw a curve for each counter but on the other hand, it’s not so easy to read the values.
The most important things we should know about SQL server are – how heavy a SQL server is, how fast a SQL server is running and how hard a SQL server is running. It’s very easy to answer how heavy a SQL server is – everything you need to know is how many megs or gigs your databases are in total. This very important when you need to think about – maintenance (indexes, statistics); backups configuration; etc. How fast a SQL server is running we will find from the batch request per second performance counter and how hard a SQL server is running – from wait time per second. The funny thing is that a SQL server can wait for 3 seconds in 1 second. This is because at the same time SQL server can wait for number of different things – storage, CPU, IO, network etc. I have created a SSRS report to monitor performance counters in very easy way, so we can see when a SQL server running very fast:
Or, when a SQL server running very hard:
When we are using 3rd party tools to backup databases, the tools usually providing a report about successfully and unsuccessfully backups. That’s sounds cool and looks like we are done here. Unfortunately not. I had some situations when backup tool were green and no alerts, but in true the backups was not performed. For this I have found a way to check and to be alerted if I have a database without Full/Diff or Log (for databases in full recovery mode) backups:
If I receive the email like above, I know, that I should go and check if backup process were failed or maybe I have missed to schedule a log backup for database in full recovery mode. This ensures at least that I will not dismiss to configure backup for any database (new or after recovery mode change, etc) and informs if something went wrong.
We should not only care about index maintenance, but to monitor missing and unused indexes too. Every index must be reviewed if it’s not missed or is unused. This does not mean that we should create every single missed index or drop every unused one. This need a lot of hard work and knowledge about your database.
I prefer at least on monthly basis to get a report about index usage:
In the green square we are able to find filtered (more than 10k compiles/seeks/scans and more than 80% impact) missing indexes. In the red square – unused indexes. By having this report it’s very easy to start looking on an index needs.
To have a good picture of a SQL server bottlenecks, config changes, etc., the wait statistics report is what we need. It’s simple, powerful way to identify why SQL server running slower than previously, to know if after the config changes SQL running faster or maybe slower, to forecast changes needs to SQL server. Let me show you how easy to check things:
From the report we can very easy identify, that some CPU related changes (probably max degree of parallelism or cost threshold for parallelism) has been made two times:
SQL server configuration changes
It is extremely important to know if any key SQL server configuration change has been made or a trace flag were turned on by someone especially without meeting change control process:
From the alert we may identify who and when made a change, what the old value been and what the new value are.
That’s all I got for this time. Hope you will find useful points to improve your monitoring solution! As I mentioned at the beginning, if you have any question, please ask me. I’ll do my best to help you.