SQL Cluster Failover Issues
Today’s topic is ‘oh crap, I manually failed over my SQL Server cluster during a scheduled lunchtime maintenance window, and SQL Server and SQL Server Agent did not come back online.’ The key words there being OH CRAP!
Looking at the ‘oh so informative’ cluster events, I saw the following:
The Cluster service failed to bring clustered service or application ‘SQL Server (MSSQLSERVER)’ completely online or offline. One or more resources may be in a failed state. This may impact the availability of the clustered service or application.
That did not help, so let’s move on to the Windows Event Viewer. It gave us: “The specified account’s password has expired.” Wait a minute! These are service accounts created by our Active Directory administrator, and their passwords are supposed to never expire. Hmmm, I had better investigate further. I looked at a big group of my SQL Server service accounts and noticed that about half of them were set to expire. When I questioned the AD administrator, he said that he had been training a new person, who must have created the half that was set up incorrectly. Problem solved. The lesson: always check that your service accounts are set up properly before you use them in SQL Server.
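If you have a big group of service accounts to check, a few lines of script can save some clicking. Here is a minimal sketch in Python, assuming you have already exported each account’s userAccountControl value from Active Directory (the export itself is not shown). The “password never expires” option is the 0x10000 bit of that attribute. The function name and the account names are made up for illustration.

```python
# "Password never expires" bit in Active Directory's userAccountControl attribute.
DONT_EXPIRE_PASSWORD = 0x10000


def accounts_that_can_expire(uac_by_account):
    """Return the sorted names of accounts whose password is allowed to expire."""
    return sorted(
        name
        for name, uac in uac_by_account.items()
        if not uac & DONT_EXPIRE_PASSWORD
    )


# Hypothetical sample data, not real accounts:
# 0x10200 = normal account + password never expires, 0x200 = normal account only.
sample = {
    "svc_sql_engine": 0x10200,
    "svc_sql_agent": 0x200,
}

print(accounts_that_can_expire(sample))  # prints ['svc_sql_agent']
```

Any account this reports is one to raise with your AD administrator before the next failover, not after.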