Cluster Logs in Windows Server 2008??

In this blog post, let’s look at where and how to review Failover Cluster logs on Windows Server 2008 and above. As most of us know, on a Windows Server 2003 cluster every participating node had a “cluster.log” file containing debug information, located in the “%systemroot%\cluster” folder. But what about the cluster log files in Windows Server 2008/2008 R2? Well, it’s not something you can review directly by navigating to the systemroot folder. Below is a screenshot of that folder in my cluster.

You can see a folder called “Reports” in the above screenshot, where all the cluster validation reports are stored by default. The screenshot below is just to prove that the cluster.log file can’t be found in the “Reports” folder either 🙂

Starting with Windows Server 2008, cluster logs are managed by Event Tracing for Windows (ETW). Just an FYI: if you are interested, you can list all the currently running traces by opening “perfmon” and navigating to Data Collector Sets (shown below in the screenshot).

So, like any other logs, the cluster logs are stored in the “C:\Windows\System32\winevt\Logs” folder with an “.etl” extension, as you can see below.

Well, so how do we read those .etl files?

For that, we have to use the “cluster.exe” command with the “/gen” switch. Basically, this generates a human-readable text file in your “Reports” folder.

Syntax: Cluster log /gen

Output:

 

As you can see in the above screenshot, the command communicates with all the nodes in your cluster. In my scenario, Node2 is offline (powered down), so the command reports an RPC error for that node. BTW, even though Node2 is down, it still creates the “Cluster.log” file in your Reports folder with the related information.

So, how do we generate logs for a specific node?

You have to use the “/NODE” switch with the cluster log command. Please see the screenshot below.

As you can see, this time, we had no RPC Errors.
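For anyone who wants to script this, here is a minimal Python sketch that only builds the command line for the two variations shown above. The helper name `build_cluster_log_command` is made up for this post. The `/span` option (minutes of history to include) was not demonstrated above, so treat it as an assumption and confirm it with `cluster log /?` on your own node.

```python
def build_cluster_log_command(node=None, span_minutes=None):
    """Return the argument list for 'cluster log /gen'.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    args = ["cluster", "log", "/gen"]
    if node is not None:
        # Generate the log for a single node, like the /NODE example above.
        args.append(f"/node:{node}")
    if span_minutes is not None:
        # Assumption: /span limits the log to the last N minutes.
        args.append(f"/span:{span_minutes}")
    return args


print(build_cluster_log_command())
print(build_cluster_log_command(node="Node1"))
```

On a cluster node, the returned list could be handed to `subprocess.run(...)`. I have kept that call out of the sketch so it stays harmless on a machine without cluster.exe.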

So, there is a lot to explore and learn in 2008 Failover Clustering. If you have been using 2003 for a long time, things have changed drastically. There are a lot of other options and switches available with cluster.exe; you can even limit the log size if you are interested. BTW, everything I’ve shown here can be achieved via PowerShell cmdlets as well!

Hope this is informative….

Quorum Drive in Failover Cluster(Windows Server 2008R2)?

Let me share an interesting conversation I had a couple of days ago with one of my colleagues. It was about the quorum drive in a Failover Cluster (this was Windows Server 2008 R2).

Let me tell you the story in short. She had a question about the quorum drive (typically the Q drive) disappearing from Failover Cluster Manager. She had also checked all 3 nodes in the cluster (remember, this was a 3-node cluster) and was not able to locate the Q drive.

So, is this normal in a Failover Cluster? The answer is absolutely yes. This is perfectly normal for any Failover Cluster (starting with Windows Server 2008) with an odd number of nodes. A quorum drive is no longer mandatory starting with Windows Server 2008. This might be really confusing for folks coming from Windows Server 2003! Yes, you heard it right: there is no need for a dedicated drive to act as the quorum. In this case, our cluster quorum was configured as “Node Majority”. In other words, all the nodes participate in forming the quorum and each one votes. If one node dies, we still have 2 nodes up and running (so we still have a majority of the votes), and hence our cluster keeps running without any issues. If 2 nodes die, the majority of the nodes are down, and at that point our cluster goes down. The basic idea is to avoid a single point of failure!

Modes of Quorum starting Windows Server 2008:

Node Majority: Each node that is available can vote. The cluster functions only with a majority of the votes. Meant for an odd number of nodes.

Node and Disk Majority: Each node plus a dedicated drive (typically Q) gets a vote. Meant for an even number of nodes.

Node and File Share Majority: Each node plus a file share witness gets a vote. (Personally, I wouldn’t prefer this for any number of nodes.)

No Majority (aka Disk Only): This is what we had up to Windows Server 2003, where a dedicated disk acts as the quorum.
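The vote counting behind the majority modes can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how the cluster service is implemented; the function names are my own, and the No Majority (Disk Only) mode is deliberately left out because it does not count votes this way.

```python
def votes_needed(total_votes):
    """A majority means more than half of all configured votes."""
    return total_votes // 2 + 1


def has_quorum(nodes_total, nodes_up, witness_configured=False, witness_up=False):
    """Return True if the online votes form a majority.

    The witness (disk or file share) adds one vote when it is configured.
    """
    total = nodes_total + (1 if witness_configured else 0)
    online = nodes_up + (1 if witness_configured and witness_up else 0)
    return online >= votes_needed(total)


# 3-node Node Majority cluster from the story above
print(has_quorum(3, 2))  # True: one node down, 2 of 3 votes remain
print(has_quorum(3, 1))  # False: two nodes down, only 1 of 3 votes

# 2-node cluster with a witness disk (Node and Disk Majority)
print(has_quorum(2, 1, witness_configured=True, witness_up=True))  # True
```

This also shows why an even number of nodes gets a witness: without it, a 2-node cluster would need both nodes up to reach a majority.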

Note: Leave the quorum configuration as it is, chosen by Windows by default when the cluster is configured. Windows is smart enough to choose the appropriate quorum configuration for your cluster. Change it only if you are sure about what you are doing. Also, please note that the quorum configuration can be changed at any time, even after the cluster has been created.

In my lab, I have a 2-node, single-instance cluster, so I have Node and Disk Majority as my quorum. Please see the screenshots below.

See the quorum configuration shown as Node and Disk Majority. (Just ignore the warning above :D. My Node2 is turned off as of now, hence the warning!)

How to change the Quorum Configuration?

Right-click your Windows cluster, choose More Actions, and then select Configure Cluster Quorum Settings, as shown below.

Now you can click Next, choose your quorum configuration type, and proceed further.

Just to remind you again: don’t change anything unless you completely understand what you are doing!

CNOs/VCOs(Computer Objects) and few ways to protect them…!

If you already have experience working in clustered environments, you probably know about the CNO (Cluster Name Object) and the VCO (Virtual Computer Object). For newbies, let me explain what a CNO and a VCO are in a line or two…

CNO: This is the core piece of your Windows cluster and acts as its identity. It is a computer object created in AD in the Computers container (under your domain, or in an OU if you have one). It has the same name as your cluster.

VCO: Again, these are computer objects created in AD in the Computers container, depending on the services and applications you create inside your cluster. Yes, the CNO is responsible for creating those VCOs. A CNO should not be deleted, or even touched in terms of security, by any means or by any person. Services won’t come online if the CNO permissions are modified or the CNO gets deleted accidentally, which is a potential threat to your cluster.

To recover from a deleted CNO, your domain admin has to get involved and restore your Active Directory objects, which is not a simple task, especially in larger enterprises. The good news is that starting with Windows Server 2008 R2 we have something called the Active Directory Recycle Bin, which is an awesome way to recover AD objects. Hold on, guys… there’s a GOTCHA though!

Gotcha: AD Recycle Bin is not Enabled by Default. It has to be enabled within your Domain by your Domain Admin!

What if we ask our domain/OU/server admins to enable a setting that prevents accidental deletion of computer objects? It would be really nice to prevent the deletion in the first place, instead of recovering after a disaster, right? So what can be done here? Windows Server 2008/2008 R2 offers a really simple way to prevent these accidental operations (mostly human mistakes). There’s a small checkbox we should enable to make this happen. Once it is enabled, it won’t let anyone delete that object.

Demo:

I’m on my domain controller and have opened Active Directory Users and Computers from Administrative Tools. You can see WINCLUST is my CNO.

Very important: you now have to go to View and select Advanced Features to be able to see and perform all the available options and operations. You can see this below.

Now, I’m going to protect my CNO from accidental deletion. All I have to do is right-click the CNO, select Properties, navigate to the “Object” tab, and check that tiny box as shown below :)

That’s it! Is it really hard? Nope. You can check with your domain admins to make sure this is checked on all your CNOs and VCOs.

Note:

All the New OUs are automatically set to be protected.

Any New Users/Groups are not set to be automatically protected.

Any New Computers are not set to be automatically protected.

With the protection being enabled, now let’s see what happens if we try to delete that CNO manually from my AD.

Note: Don’t even think about doing this in your company (in the first place, we won’t have that level of privileges). I bet you’d be fired the very next moment!

I got this warning message asking whether I’m really sure about what I’m trying to do here. See the screenshot below.

Let’s say I’m one stupid guy and went ahead and clicked Yes. Below is the screenshot of what I got.

Remember, I logged on as a Domain Administrator. Even then, Windows is saying, “Uhuhhhhh….No, idiot! I’m not letting you perform this operation unless you uncheck that tiny box we checked earlier” 😀

Isn’t that something awesome, guys? I really encourage you to check with your server admins that this option is enabled if you are responsible or accountable for mission-critical production SQL Server clusters.

Hope this was useful info and you learnt something new! Cheers!

Cluster Group in Failover Cluster Manager- Windows Server 2008/2008R2??…

Where did the Cluster Group go starting with Windows Server 2008? Did Microsoft really remove the concept of the Cluster Group?

If you have worked on Windows Server 2003 clusters for a long time, you definitely know what a cluster group is. In 2003, we had a concept called “Groups”. One of the groups displayed when you open Cluster Administrator (Cluadmin) is the “Cluster Group”. If you are new to clustering, to keep it very simple: the Cluster Group is the core of your cluster. If your Cluster Group is down, basically your entire cluster (all the applications, resources, etc.) is down.

Things changed drastically starting with Windows Server 2008. Basically, we should no longer call it “Microsoft Cluster Service (MSCS)”. They renamed it to simply “Failover Clustering”. IMHO, this makes it very clear that we are not referring to NLB (Network Load Balancing), which is one of the other flavors of clustering offered by Windows Server.

Note: A Failover Cluster and a Network Load Balancing cluster are two completely different entities, in no way related to each other.

Anyway, let’s see where we can find the Cluster Group in Failover Cluster Manager in Windows Server 2008. Assuming you’ve already installed the Failover Clustering feature, once you navigate to Administrative Tools and open Failover Cluster Manager (the GUI for managing your Failover Cluster), you’ll see something very similar to the screenshot below.

Typically, we expand our cluster and navigate to Services and Applications, Nodes, and Networks to manage our SQL Servers! But you are not able to see the actual Cluster Group anywhere, as shown in the screenshot below!

Soooo….where do I get that info?

All you have to do is navigate to the main Windows cluster node and expand Cluster Core Resources (this is collapsed by default), as shown below.

Once it is expanded, you will see something like the screenshot below, where we can see our Cluster Group 🙂

If you are more of a CMD/PS person, one simple way to get the list of all the groups within your cluster is to type “CLUSTER GROUP” at the command prompt! You can see the screenshot below from one of the nodes in my Failover Cluster.

As you can see, we still have the core concept of the “Cluster Group”! Don’t get confused by “Available Storage” being Offline in the above screenshot. It has nothing to do with your currently allocated SAN drives (storage). I don’t have any extra storage available in this cluster as of now (see the screenshot below: all of my drives are allocated to something or other), which is why it is listed as Offline. I believe it will appear Online once I create a new SAN drive and add it to this cluster (please correct me if I’m wrong).

Soo…the Bottom Line is…We do have Cluster Group Concept, but in stealth Mode 😉

How to Patch SQL Server 2008 Failover Cluster??…

This is the last part of this cluster installation series! In the previous part, I showed you how to configure an Active-Passive SQL Server 2008 Failover Cluster. At this moment our SQL Server version is at RTM, so let’s apply Service Pack 2 over RTM.

Let’s begin! FYI, I’ve downloaded SQL Server 2008 Service Pack 2 from Microsoft and copied the exe file onto both nodes of our cluster.

Procedure:

To keep it simple and precise: we have to install SP2 on our passive node first (I’ve chosen Node2 as the current passive node). Once that is completed, we have to manually fail over the SQL services to Node2 (SQL is currently running on Node1) and apply the patch on Node1.

This model was introduced with SQL Server 2008. Earlier, SQL Server patching was cluster-aware and automatically patched all the nodes in your cluster. But that led to more downtime, depending on the number of nodes involved. Let’s say we have 8 nodes: in that case SQL Server would not be available until all the nodes were patched successfully. Starting with 2008, MSFT avoided this, basically making a high availability solution more highly available 😉

The same logic applies to an Active-Active cluster as well. You have to manually fail over your SQL services to the other node (so at this point 2 instances of SQL Server are running on a single node) and apply the patch to the idle node. Once patching has completed successfully on the idle node, you have to manually fail over both instances to it and patch the other node. Hope I’m not confusing you!
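To make the ordering concrete, here is a small Python sketch that just prints the sequence of steps for a cluster with one active node. It is a planning aid only and touches nothing on the cluster. The function name `rolling_patch_plan` is made up for this post, and it bakes in the possible-owner safeguard used in the walkthrough that follows.

```python
def rolling_patch_plan(nodes, active):
    """Return the ordered patching steps for a cluster with one active node.

    Hypothetical helper: passive nodes are patched first, then SQL Server is
    failed over to a patched node and the originally active node is patched.
    """
    if active not in nodes:
        raise ValueError(f"{active} is not one of the cluster nodes")

    def patch_steps(node):
        return [
            f"Remove {node} as a possible owner",
            f"Install the service pack on {node}",
            f"Add {node} back as a possible owner",
        ]

    passive = [n for n in nodes if n != active]
    steps = []
    for node in passive:
        steps.extend(patch_steps(node))
    if passive:
        # Move SQL Server to an already patched node before touching the active one.
        steps.append(f"Fail over SQL Server from {active} to {passive[0]}")
    steps.extend(patch_steps(active))
    return steps


for number, step in enumerate(rolling_patch_plan(["Node1", "Node2"], "Node1"), 1):
    print(number, step)
```

For my two-node lab this prints seven steps: the three for Node2, the failover, and then the three for Node1.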

Let’s begin! I’ve chosen to patch my second node first, so I’ll make sure that SQL Server and all its dependencies are running on Node1. You can see in the screenshot below that Node1 owns every SQL resource.

Very important: you should remove Node2 as a possible owner for the whole time Node2 is being patched, making sure that SQL Server is not allowed to fail over to Node2 for any reason during this process.

Also, just to double-check, open SQL Server Configuration Manager on Node2. You should see SQL Server and SQL Server Agent as Offline, as shown below. You might see the Browser and FTS services running, since they are not clustered (the SQL cluster patch installation will take care of those services as needed).

The last thing to do is open SSMS and note your current build and version by running SELECT @@VERSION. For me, the current output is

“Microsoft SQL Server 2008 (RTM) – 10.0.1600.22 (Intel X86)   Jul  9 2008 14:43:34   Copyright (c) 1988-2008 Microsoft Corporation  Enterprise Evaluation Edition on Windows NT 6.0 <X86> (Build 6001: Service Pack 1) (VM) “

Close SSMS (if it is open) on Node2 and double-click the SQL Server 2008 SP2 executable. You should be welcomed with the screen below.

Click Next -> accept the license terms -> Next, and you’ll get the screen below, where you choose the features you want to patch.

Click Next and it will perform a basic check for any files in use that might prevent a successful installation of SP2, as shown below.

Click Next -> Update  as shown below…That’s it:)

It will take a few minutes to complete, and then you’ll get the screen below.

Click Next -> Close. Once done, go to Failover Cluster Manager, add Node2 back as a possible owner, and fail over the SQL services to Node2. (At this point, if you open SQL Server Configuration Manager on Node1, the SQL Server and SQL Server Agent services should be Offline.) Then repeat on Node1 the same procedure we followed for Node2.

Note: Don’t forget to remove Node1 as a possible owner for the whole time Node1 is being patched, making sure that SQL Server is not allowed to fail over to Node1 for any reason during this process.

Once you have patched both nodes successfully, add Node1 back as a possible owner. Now open SSMS, run SELECT @@VERSION, and let’s compare the result with what we had prior to the installation.

As you see in the above screenshot we’ve upgraded our SQL from “Microsoft SQL Server 2008 (RTM) – 10.0.1600.22 (Intel X86)   Jul  9 2008 14:43:34″ to “Microsoft SQL Server 2008 (SP2) – 10.0.4000.0 (Intel X86)   Sep 16 2010 20:09:22″.
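If you would rather compare the two banners in a script than by eye, a rough Python sketch could look like this. The regular expression is my own assumption about the banner layout, based only on the two strings above; it expects a level token such as RTM or SP2 followed by a four-part build number and does not try to handle other formats.

```python
import re

# Matches e.g. "(RTM) - 10.0.1600.22"; accepts a hyphen or an en dash.
VERSION_PATTERN = re.compile(r"\((RTM|SP\d+)\)\s*[-\u2013]\s*(\d+(?:\.\d+){3})")


def parse_sql_version(banner):
    """Return (level, build) from an @@VERSION banner, build as a tuple of ints."""
    match = VERSION_PATTERN.search(banner)
    if match is None:
        raise ValueError("unrecognized @@VERSION banner")
    level, build = match.groups()
    return level, tuple(int(part) for part in build.split("."))


before = parse_sql_version("Microsoft SQL Server 2008 (RTM) - 10.0.1600.22 (Intel X86)")
after = parse_sql_version("Microsoft SQL Server 2008 (SP2) - 10.0.4000.0 (Intel X86)")

print(before)                 # ('RTM', (10, 0, 1600, 22))
print(after)                  # ('SP2', (10, 0, 4000, 0))
print(after[1] > before[1])   # True: the build number went up
```

Comparing the builds as tuples of integers avoids the trap of comparing them as plain strings.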

Perfect…That’s all for the day!

Basically, we’ve covered everything right from scratch: installing virtual machines, installing the Windows Server 2008 Enterprise OS, configuring DNS and the domain controller, adding new physical computers to the domain, creating your own iSCSI SAN, configuring NICs, clustering Windows, creating a SQL Server cluster, and finally patching SQL Servers that are configured as clusters.

I hope this series is useful to anyone who is new to SQL Failover Cluster setups, and I hope you enjoyed this entire SQL 2008 cluster installation series as much as I did :)