While working with a SQL Server failover cluster in my lab, I was unable to bring the SQL Server resource online after restarting the VMs. When I began investigating, I found no obvious error message pointing to the cause. In this blog post, I will share the steps I took to fix the clustered instance online error.
Here are the observations I made:
- The SQL ERRORLOG was being created.
- If I started the SQL Server service directly from the Services console, it ran fine.
- However, when I tried to bring the SQL resource online in the cluster, it stayed in the “Online Pending” state and then transitioned to the “Failed” state.
To gather more information about the failure in the cluster, I generated a cluster log using the steps outlined in one of my previous articles.
Upon analyzing the cluster log, I found the following error messages:
INFO [API] s_ApiGetQuorumResource final status 0.
INFO [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:5447358a-a102-4fc9-95f4-c040e8716859:Netbios
ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]SQL Server Network Interfaces: Error Locating Server/Instance Specified [xFFFFFFFF]. (268435455)
ERR [RES] SQL Server : [sqsrvres] ODBC Error: [HYT00] [Microsoft][SQL Server Native Client 11.0]Login timeout expired (0)
ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08001] [Microsoft][SQL Server Native Client 11.0]A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if the instance name is correct and if SQL Server is configured to allow remote connections. For more information, see SQL Server Books Online. (268435455)
INFO [RES] SQL Server : [sqsrvres] Could not connect to SQL Server (rc -1)
INFO [RES] SQL Server : [sqsrvres] SQLDisconnect returns the following information
ERR [RES] SQL Server : [sqsrvres] ODBC Error: [08003] [Microsoft][ODBC Driver Manager] Connection not open (0)
INFO [RES] Network Name: Agent: Sending request Netname/RecheckConfig to NN:52cf277d-234b-4a81-a9a7-0f078fca2a17:Netbios
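A real cluster log is far longer than this excerpt, so it can help to filter it down to the entries that matter. The following is a minimal Python sketch, assuming each entry sits on its own line and follows the same "ERR [RES] SQL Server ... [sqsrvres]" shape as the excerpt above. The function name sqsrvres_errors is my own, and the patterns are inferred only from this excerpt, so they may need adjusting for other log versions.

```python
import re

# Matches ERR-level entries written by the SQL Server resource DLL (sqsrvres).
# Pattern inferred from the excerpt above; real entries also carry a
# process/thread ID and timestamp prefix, which search() simply skips over.
SQSRVRES_ERR = re.compile(r"ERR\s+\[RES\]\s+SQL Server.*?\[sqsrvres\]\s+(?P<msg>.*)")

# Pulls the five-character SQLSTATE out of an "ODBC Error: [xxxxx]" message.
ODBC_STATE = re.compile(r"ODBC Error: \[(?P<state>[0-9A-Z]{5})\]")


def sqsrvres_errors(lines):
    """Yield (sqlstate_or_None, message) for each sqsrvres ERR entry."""
    for line in lines:
        entry = SQSRVRES_ERR.search(line)
        if entry is None:
            continue  # INFO entries and other resources are ignored
        msg = entry.group("msg").strip()
        state = ODBC_STATE.search(msg)
        yield (state.group("state") if state else None, msg)
```

Passing an open file handle for the generated log to sqsrvres_errors yields one tuple per error entry. The file's encoding can differ between environments, so open it accordingly.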
Based on the cluster log, it was evident that the cluster was unable to connect to the SQL Server service, which is why the resource never came online.
Possible Causes of the Error:
Before proceeding with the solution, it is important to understand the common causes of this error:
- An incorrect client alias created in SQL Server Configuration Manager.
- The SQL Server Browser service isn’t running while SQL Server is a named instance or is listening on a non-default port.
- A TCP port connectivity issue.
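The third cause can be ruled in or out quickly by testing whether anything accepts a TCP connection on the expected port. Here is a minimal Python sketch using only the standard library. The function name can_connect is my own, and the check only proves that something is listening on the port, not that it is SQL Server.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the name and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, can_connect("SQLVNN", 1433) would test a hypothetical virtual network name on the default port. Substitute the name and port your alias actually points to.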
In my lab, I discovered that I had a TCP alias created with a fixed port, and that SQL Server, which was using a dynamic port, was listening on a different port after the reboot. The alias was pointing at a port where SQL Server was no longer listening, so the cluster's connection attempt failed. To prevent this from happening again, I changed SQL Server to listen on a static port instead of a dynamic port.
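The mismatch between the alias and the actual listening port can also be checked with a small script. The Python sketch below rests on two assumptions that you should verify in your environment: that the ERRORLOG reports listeners with lines like "Server is listening on [ 'any' <ipv4> 1433].", and that a TCP alias is stored as a "DBMSSOCN,server,port" string. All function names are my own.

```python
import re

# Assumed ERRORLOG wording: Server is listening on [ 'any' <ipv4> 1433].
LISTEN_RE = re.compile(r"Server is listening on \[\s*\S+\s+<ipv[46]>\s+(\d+)\s*\]")


def listening_ports(errorlog_lines):
    """Collect every port the ERRORLOG reports SQL Server listening on."""
    return {
        int(match.group(1))
        for line in errorlog_lines
        for match in LISTEN_RE.finditer(line)
    }


def alias_port(alias_value):
    """Return the port from an assumed 'DBMSSOCN,server,port' alias, else None."""
    parts = [part.strip() for part in alias_value.split(",")]
    if len(parts) >= 3 and parts[0].upper() == "DBMSSOCN" and parts[2].isdigit():
        return int(parts[2])
    return None  # not a TCP alias, or no explicit port


def alias_is_stale(alias_value, errorlog_lines):
    """True when the alias names a port that SQL Server is not listening on."""
    port = alias_port(alias_value)
    return port is not None and port not in listening_ports(errorlog_lines)
```

With a hypothetical alias value of "DBMSSOCN,SQLVNN,1433" and an ERRORLOG showing a listener on 49172, alias_is_stale returns True, which is the situation I hit after the reboot.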
Have you ever encountered a similar situation where the cluster log helped you identify and resolve an issue? Share your experiences in the comments below!