Random 'Monitoring stopped' on one server

SQL Server performance monitoring and alerting

Postby RichyD » Fri Oct 18, 2013 4:16 pm

I'm using SQL Monitor 3.5 to support about 15 servers, and am used to the occasional connection issue. One of my servers, however, is suffering from intermittent 'Monitoring stopped (SQL Server credentials)' errors throughout the day - on average about six times a day, at random times. Each time, the alert ends within a few seconds as SQL Monitor successfully reconnects.
The same Windows account is used for all of the servers, and none of the others is showing a similar problem...
I've checked the target SQL and Windows event logs, the monitoring server logs, and can't find anything at all to indicate a problem.
Has anyone else experienced this kind of thing, and/or have pointers on where to investigate next?

Cheers,
Rich
RichyD
 
Posts: 7
Joined: Fri Oct 18, 2013 4:00 pm

Postby RichyD » Fri Oct 18, 2013 4:27 pm

Btw, I've also checked the SQL Monitor 'Monitored Servers' machine log, but that only goes back about 3 minutes.
The full SQL Monitor log shows all sorts of random exceptions, but none with times that correspond with the monitoring failures...
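
For anyone wanting to do the same comparison without eyeballing two lists, here is a minimal Python sketch of the idea: pair each alert time with any log-entry times that fall within a small window around it. The function name, the 60-second window and the sample timestamps are all made up for illustration; nothing here reads the actual SQL Monitor log format, so the times would need to be extracted by hand or with your own parser first.

```python
from datetime import datetime, timedelta


def matches_near(alert_times, log_times, window=timedelta(seconds=60)):
    """Map each alert time to the log times within +/- window of it."""
    log_sorted = sorted(log_times)
    return {
        alert: [t for t in log_sorted if abs(t - alert) <= window]
        for alert in alert_times
    }


# Hypothetical timestamps, purely for illustration.
alerts = [datetime(2013, 10, 18, 9, 15, 0), datetime(2013, 10, 18, 14, 2, 30)]
log_entries = [
    datetime(2013, 10, 18, 9, 15, 20),   # 20 seconds after the first alert
    datetime(2013, 10, 18, 11, 40, 0),   # nowhere near either alert
]

for alert, hits in matches_near(alerts, log_entries).items():
    print(alert, "->", hits if hits else "no log entries nearby")
```

An alert with an empty list is one where the log has nothing close by, which is the situation described above.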
RichyD
 
Posts: 7
Joined: Fri Oct 18, 2013 4:00 pm

Postby Brian Donahue » Tue Oct 22, 2013 4:18 pm

If you are having login failures to SQL Server, the failure reason should be in the SQL Server error log. It is possible that you aren't auditing login failures, so you can check the settings and make sure: http://www.mssqltips.com/sqlservertip/1 ... ql-server/
Brian Donahue
 
Posts: 6669
Joined: Mon Aug 23, 2004 10:48 am

Postby RichyD » Tue Oct 22, 2013 4:37 pm

Thanks for the suggestion Brian, but login failure auditing was already on and no SQL Monitor related events are in the SQL Server log. I have seen other users' login failures in there, so it is definitely logging...

This is one of the odd things about the problem - SQL Monitor saying that it has had SQL credential problems, but the SQL Server itself denies all knowledge. Most peculiar.
RichyD
 
Posts: 7
Joined: Fri Oct 18, 2013 4:00 pm

Postby Brian Donahue » Wed Oct 23, 2013 11:09 am

It's probably a failure to connect to the Windows components then - check the server in "monitored servers" and next to the server, click "show log".
Brian Donahue
 
Posts: 6669
Joined: Mon Aug 23, 2004 10:48 am

Postby RichyD » Wed Oct 23, 2013 1:25 pm

I checked the 'Show log', but never managed to get to it in time to see anything helpful - that log appears to only retain a few minutes of info...

As an experiment, I moved the Base Monitor to a different server, and I haven't had a connect error since the move. It's only been two hours, but I'll keep my eye on it and hope that it was a problem with the monitor host.

Things couldn't be totally fixed, of course - I've now got 100% CPU usage on the new host :( I'll raise that in a new thread if it doesn't settle down this afternoon...
RichyD
 
Posts: 7
Joined: Fri Oct 18, 2013 4:00 pm

Postby maddave » Mon Nov 04, 2013 5:48 pm

Hi,

Did you find that moving the base monitor to a new server resolved this issue? I am having exactly the same symptoms, with one server always having SQL 'Monitoring stopped' errors and no login failures on the server. The server is working fine and the error ends within a few seconds.

Thanks.
maddave
 
Posts: 19
Joined: Thu Apr 11, 2013 11:48 am

Postby RichyD » Tue Nov 05, 2013 9:08 am

The best idea I've come up with so far is a pair of memory leaks in Windows - particularly one related to WMI. After a while, the memory allocated to the wmiprvse.exe process will reach 512MB, which is a cap - at this point any remote WMI calls will fail. After a few seconds, some garbage collection will occur to free some memory, and SQL Monitor will connect again.
I've scheduled a hotfix to be applied, but my OS team is slow to roll these things out, so I can't yet say whether this will definitely solve the problem...
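
If you want to watch for this before the hotfix goes on, a rough Python sketch along these lines could be run on the affected server. It parses the CSV output of the Windows `tasklist` command and flags any wmiprvse.exe instance getting close to the 512MB figure above. The function names and the 90% threshold are my own assumptions, and the "Mem Usage" column that `tasklist` reports is only a rough proxy for whatever the cap actually measures, so treat it as an early-warning hint rather than a precise check.

```python
import csv
import io
import subprocess

CAP_KB = 512 * 1024  # the 512MB cap mentioned above, expressed in KB


def parse_tasklist_csv(text):
    """Return (pid, mem_kb) pairs from `tasklist /FO CSV /NH` output."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        # Expected columns: image name, PID, session name, session#, mem usage
        if len(fields) < 5 or not fields[1].isdigit():
            continue  # skips blank lines and "INFO: No tasks..." messages
        # Mem usage looks like "498,123 K"; keep only the digits.
        digits = "".join(ch for ch in fields[4] if ch.isdigit())
        if digits:
            rows.append((int(fields[1]), int(digits)))
    return rows


def near_cap(mem_kb, fraction=0.9):
    """True when memory is at or above the given fraction of the cap."""
    return mem_kb >= CAP_KB * fraction


def check_wmiprvse():
    """Run tasklist (Windows only) and return (pid, mem_kb, flagged) tuples."""
    out = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq wmiprvse.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [(pid, kb, near_cap(kb)) for pid, kb in parse_tasklist_csv(out)]
```

Calling `check_wmiprvse()` from a scheduled task every minute or so and logging the result should show whether memory climbs towards the cap just before each 'Monitoring stopped' alert.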

For ref, the Windows 2008 R2 hotfix is here: http://support.microsoft.com/kb/2832248 and the vanilla 2008 one is here: http://support.microsoft.com/kb/958124

If that sorts out your issues, please let me know :)

Rich
RichyD
 
Posts: 7
Joined: Fri Oct 18, 2013 4:00 pm

Postby maddave » Tue Nov 05, 2013 9:59 am

Perfect. Thanks for the quick reply, that's definitely something to keep an eye on. I'll see if this resolves the issue.

Thanks again.
maddave
 
Posts: 19
Joined: Thu Apr 11, 2013 11:48 am

Postby RichyD » Thu Nov 07, 2013 12:16 pm

I managed to get the hotfix rolled out on one server yesterday morning, and haven't had a connection failure since. That's a success in my book :)
I'll be rolling that hotfix out to all Windows 2008/R2 servers over the next month to eliminate the rest of the connection failures I get.
RichyD
 
Posts: 7
Joined: Fri Oct 18, 2013 4:00 pm

