Wednesday, October 21, 2015

Networking Considerations for VMWare VSAN 5.5 and above

The networking component was entirely foreign to me when we started this adventure. I had some experience with basic switch configuration and running Ethernet, but I didn't even know what SFP+ cabling (aka twinax) was, what kind of switches we needed, or whether the NICs we had on the Dell servers would work with the switches we were buying from Cisco. I don't know what the industry standards are like elsewhere, but for the most part the rest of our organization used either Cat5 or fibre. Nobody really had any experience with twinax, but after talking with a number of sales reps and some extensive googling, we found the right network hardware.
VMWare recommends a 10Gb connection for VSAN; it can be run over 1Gb, but 10Gb is recommended for production.
We settled on the Cisco Nexus 3500 series for our switches. For redundancy, we would have two switches for each VSAN deployment.
Here is a link to the switch details: Nexus 3500
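Once the 10Gb links are cabled and a VMkernel adapter exists on each host, VSAN traffic has to be enabled on that adapter. Here is a minimal pyVmomi sketch of that step; the host name, credentials, and the vmk1 adapter are placeholders, and this is just one way to do it (the Web Client or esxcli work fine too).

```python
# Minimal sketch: tag an existing VMkernel adapter for VSAN traffic via pyVmomi.
# Host name, credentials, and 'vmk1' are placeholders -- adjust for your environment.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()          # lab only; use real certs in production
si = SmartConnect(host="esxi01.example.local",  # hypothetical host
                  user="root", pwd="password", sslContext=ctx)

content = si.RetrieveContent()
view = content.viewManager.CreateContainerView(content.rootFolder,
                                               [vim.HostSystem], True)
host = view.view[0]                             # first ESXi host found

# Build a VSAN config that lists the VMkernel port carrying VSAN traffic.
port = vim.vsan.host.ConfigInfo.NetworkInfo.PortConfig(device="vmk1")
config = vim.vsan.host.ConfigInfo(
    networkInfo=vim.vsan.host.ConfigInfo.NetworkInfo(port=[port]))

task = host.configManager.vsanSystem.UpdateVsan_Task(config)
print("VSAN network update submitted:", task.info.key)

Disconnect(si)
```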

VSAN Ready Node, a Myth for our Dell Rep

If you read some of the documentation, VMWare suggests that major hardware vendors have pre-configured "VSAN Ready Nodes" available for purchase. This turned out not to be true for us in our experience with Dell. I work for a large organization and we go through the same guy for everything. Either he wasn't able to find the part number, or Dell didn't offer this for our institution. Either way, we spent a long time going back and forth on the hardware. A very long time. Long story short, here is what we went with for our VSAN nodes.

Dell PowerEdge R730xd Server
Intel X520 DP 10Gb DA/SFP+ Server Adapter (2 per host, 4 SFP+ connections per server)
iDRAC8 Enterprise
PERC H730 RAID Controller
Intel Xeon E5-2667 3.2GHz, 20M Cache
16GB RDIMM, 2133 MT/s, Dual Rank Memory (24 per host, for a total of 384GB RAM on each VMWare host)
1.2TB 10K RPM SAS HDDs (12 per host)
200GB SSD drives (2 per host)
16GB SD cards (2 per host)
VMWare ESXi 5.5 U2 preloaded on the SD cards


We also purchased the licensing for VMWare VSAN with the Hosts.

VSAN requires at least 1 SSD per host, but I remember reading somewhere about the SSD-to-HDD ratio, and I think around 1:7 was recommended. This mix of drives fit our price point. We had a hard time deciding on this mix of disks, and in fact went back and forth with Dell a number of times.
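For what it's worth, here is a rough back-of-the-envelope calculation for the per-host disk layout above. The split into two disk groups (1 SSD plus 6 HDDs each) is my assumption about how the drives would be carved up, not something off the spec sheet, and the numbers ignore VSAN overhead and the failures-to-tolerate policy.

```python
# Rough per-host capacity math for the build above.
# Assumption: two disk groups per host, each with 1 SSD and 6 HDDs.
hdd_count, hdd_tb = 12, 1.2     # 12 x 1.2TB 10K SAS
ssd_count, ssd_tb = 2, 0.2      # 2 x 200GB SSD

raw_hdd_tb = hdd_count * hdd_tb           # 14.4 TB raw HDD capacity per host
raw_ssd_tb = ssd_count * ssd_tb           # 0.4 TB of flash per host

flash_ratio = raw_ssd_tb / raw_hdd_tb     # flash as a fraction of raw capacity
hdd_per_ssd = hdd_count / ssd_count       # HDDs behind each SSD, by count

print(f"Raw HDD capacity per host: {raw_hdd_tb:.1f} TB")
print(f"Flash per host:            {raw_ssd_tb:.1f} TB ({flash_ratio:.1%} of raw)")
print(f"HDDs per SSD:              {hdd_per_ssd:.0f}")
```

By drive count that works out to 1 SSD for every 6 HDDs, which is in the neighborhood of the 1:7 figure I remembered; by capacity, the flash is only around 3% of the raw HDD space.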

What's in a name?

By the way, the name of this blog is a bit of a joke. Over the years I've heard a number of aphorisms, tech-talk jargon, and general butchery of the English language in the tech world, everything from "it's not a show-stopper" for minor issues to my least favorite in the world, "...from soup to nuts". So, there it is.

About This Blog

The motivation for moving to VMWare VSAN 5.5 stemmed from our limited experience with off-the-shelf products, our budget, and our existing infrastructure.

First I have to clear the air. Our first experience with any SAN technology was with a Drobo 1200i. This device cost us around $12k at the time, and we didn't have much storage experience, so we were kind of in the dark as to what we needed, what kind of performance we would get, and where to put our money. Well, we quickly learned that this device was not as great as advertised. The sales folks at Drobo convinced us that we could "run active VMs" on the device, and ultimately I dreamed of using this common datastore to enable VMWare features such as High Availability (HA) and Fault Tolerance (FT). No such luck.

The Drobo 1200i, at the time, came with 6 HDDs and capacity for up to 12 HDDs total; we configured ours with 9 HDDs at 2TB each. Later, they updated their firmware to support automated tiering with SSDs, which undoubtedly would have increased performance, but it was too late for us to sink any more money into these devices.

The 1200i is equipped with 4 Ethernet ports: 3 for iSCSI-over-Ethernet connections and 1 management port. In our configuration, we wired a Cisco SG 300-20 20-port gigabit managed switch to the 3 iSCSI ports on the Drobo, and each VMWare host also had two 1Gb NICs connected to the switch. Everything worked well and all the VMWare hosts were able to mount partitions on the Drobo. After that, I started to play around with moving some of the VMs to this datastore. The performance was abysmal, so we abandoned the idea of running VMs from the Drobo and used it primarily for backups.
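For anyone recreating a similar setup programmatically, here is a rough pyVmomi sketch of pointing a host's software iSCSI adapter at targets like the Drobo's three iSCSI ports. The IP addresses are placeholders, the software iSCSI adapter is assumed to already be enabled, and this is not exactly how we configured ours at the time.

```python
# Sketch: add iSCSI send targets (e.g. a Drobo's three data ports) to a host's
# software iSCSI adapter and rescan for new LUNs/VMFS volumes.
from pyVmomi import vim

def add_iscsi_targets(host, target_ips, port=3260):
    """Point the host's software iSCSI HBA at the given target addresses."""
    storage = host.configManager.storageSystem

    # Find the software iSCSI HBA (assumes software iSCSI is already enabled).
    iscsi_hba = next(hba for hba in host.config.storageDevice.hostBusAdapter
                     if isinstance(hba, vim.host.InternetScsiHba))

    targets = [vim.host.InternetScsiHba.SendTarget(address=ip, port=port)
               for ip in target_ips]
    storage.AddInternetScsiSendTargets(iScsiHbaDevice=iscsi_hba.device,
                                       targets=targets)

    storage.RescanAllHba()   # pick up the new LUNs
    storage.RescanVmfs()     # pick up any VMFS volumes on them

# Usage (host object obtained as in the earlier sketch; addresses are made up):
# add_iscsi_targets(host, ["192.168.10.11", "192.168.10.12", "192.168.10.13"])
```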

We had a lot of issues with the Drobo 1200i. Some misconfigurations would cause the device to reboot continuously, requiring a patch from the vendor to remove bad volumes; this sometimes took weeks. The proprietary software doesn't leave much room for the end user to troubleshoot, so you pretty much have to call support any time anything goes wrong.

Our solution worked well for many years. We treated the Drobo like tape, and it chugged along slowly and kept our backups on hand.

Fast forward to 2014. We were getting ready to upgrade our VMWare infrastructure and I had heard some rumors about the VSAN product. After some serious whitepaper reading, we decided that this would be the best fit. In a lot of ways, it is the next logical step up from our old Drobo setup. Only, instead of 1Gb iSCSI over Ethernet, we would run an all-10Gb network. And instead of an HDD-only device, we would have a mix of SSDs and HDDs on each VMWare host. Now, the next step was purchasing the hardware.

Introduction to This Blog

So, I recently completed the deployment of two VMWare VSAN 5.5 clusters. I had some issues along the way, but mostly everything has been worked out now and I am satisfied with the result. In the subsequent series of blog posts, I will try to document the entire process, as succinctly as possible, from conception and motivation to completion, including some of the issues I encountered along the way and some other random tidbits. My goal is to document the entire process in hopes of either helping other people along the same path, or just to have it all in one place.

If you have any feedback or comments, feel free to chime in at any time.