VMware related material
A client told me today that he thought he was getting poor backup performance over a virtual network. But when he started debugging, he noticed that it wasn’t just backup that was the problem: it was all network I/O through a local vSwitch. He started the conversation by asking what network throughput I would expect to see between 2 VMs located on the same vSwitch. He told me that no matter what he tried, using a vSwitch with or without a pNIC and trying different types of vNIC, he couldn’t push 240 Mbit/s. I asked him the obvious question, which was “Is there anything else happening on the system that could be impacting network performance, like CPU overhead?”, and the answer was “no”.
So I decided to test it for myself.
Like my client, I used the Netperf tool to test throughput between the 2 VMs. To use Netperf you run:
C:\Netserver.exe on one VM
C:\Netclient.exe -H hostname (of the first system)
on the other VM.
After a few seconds you should get a result displayed in Mbit/s.
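If you don’t have Netperf to hand, the same kind of measurement can be roughly sketched in Python. This is not how Netperf itself works internally, just a minimal stand-in under some assumptions: the names run_sink, measure_mbits and loopback_demo and the 64 KB chunk size are mine, and the rate is measured on the sending side, so it counts data handed to the network stack rather than data confirmed as received.

```python
import socket
import threading
import time

CHUNK = 64 * 1024  # bytes per send call (arbitrary choice for this sketch)


def run_sink(listener):
    """Accept one connection and discard everything it sends."""
    conn, _ = listener.accept()
    with conn:
        while conn.recv(CHUNK):
            pass


def measure_mbits(host, port, seconds=1.0):
    """Send data to host:port for `seconds` and return the rate in Mbit/s."""
    payload = b"\0" * CHUNK
    sent = 0
    with socket.create_connection((host, port)) as sock:
        start = time.perf_counter()
        while time.perf_counter() - start < seconds:
            sock.sendall(payload)
            sent += len(payload)
        elapsed = time.perf_counter() - start
    return sent * 8 / elapsed / 1e6


def loopback_demo(seconds=1.0):
    """Run sink and sender on the same machine, purely as a smoke test."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    sink = threading.Thread(target=run_sink, args=(listener,), daemon=True)
    sink.start()
    rate = measure_mbits("127.0.0.1", port, seconds)
    sink.join(timeout=5)
    listener.close()
    return rate


if __name__ == "__main__":
    print(f"{loopback_demo():.0f} Mbit/s")
```

To test between 2 VMs you would run the sink part on one VM and call measure_mbits with that VM’s hostname from the other. The loopback demo only proves the script works; it tells you nothing about the vSwitch.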
And just like my client, I tested a vSwitch with and without a pNIC and with different types of vNIC. I also tested different hosts to get varied results. The problem is that I saw the results I’d expect to see: on average, speeds of 500 Mbit/s.
A few weeks back I was listening to a discussion about the concept of vMotion over long distance as if it was a new thing, which I instantly disagreed with because I personally had been describing the concept to students when I was a VMware instructor 3 years ago. I knew it was possible to do by:
A. Having the right infrastructure in place: put simply, a fast bridged network
B. Using storage virtualisation solutions like DataCore.
DataCore has long had the ability to synchronously mirror an active/active virtualised volume, which means 2 sets of ESX servers can see the same volume in a read/write state, as if the volume were local to each ESX server. This feature was nothing new for DataCore, so in fact long-distance vMotion has been achievable from the day vMotion went GA.
There were some caveats though. First, it was only feasible when latency and bandwidth were not an issue; second, at the time it wasn’t technically supported by VMware. I tested this theory in the lab back then but didn’t record my research, so as an alternative I’ve asked a friend within DataCore to describe a real-world case study that proves the point. This concept wasn’t manufactured by DataCore but is a side effect of combining the 2 technologies. The following is a real-world implementation of a stretched cluster, as DataCore refers to it:
Mike Beevor (DataCore) says:
Let’s start by looking at a real-life application of SANSymphony. IoMart, a highly reputable hosting company offering 100% uptime, approached DataCore with the vision of being able to provide High Availability in a potentially heterogeneous storage environment to its hosted customers, and DataCore were more than happy to oblige. The solution was delivered using industry-standard software, based on readily available server and storage hardware. What followed was a solution that provided not only the HA that they were looking for, but managed to deliver it on scales more akin to DR!
IoMart has 5 datacentres located around the UK, but the two that we are particularly interested in are in The City and in Maidenhead, a distance of approximately 20 miles, which I think you’ll agree is more than suitable to satisfy most companies’ DR strategies. The environment was created using standard x86 hardware, highly specified due to its hosting nature. There is 128GB RAM for caching, 4 Quad Core Processors and 8Gb HBAs for connectivity. The disk behind the server was also considered low-end commodity disk by the major manufacturer that was chosen. Naturally we can’t divulge the full details of the environment, as we wouldn’t want to give away IoMart’s competitive advantage, but we can say that the cost was less than a third of the equivalent software and hardware from a well-known manufacturer. DataCore SANSymphony was used to virtualise and manage the environment; the datacentres have a fibre link between them and a DataCore SANSymphony server in each location.
What we have achieved using this configuration is synchronous replication between the 2 sites, over a distance of 21 miles. This extends not only through the storage layer replication, but also the application server layer, in this instance ESX.
Now, site-to-site replication is nothing new, but where this gets very interesting is that the failover is seamless and automated at the storage layer, and the failback is automated and seamless too. Because we are grid-based storage rather than cluster-based, we are in no danger of a quorum issue, making this an extremely efficient and effective solution. It is also worth noting that the performance metrics within a grid present a linear performance growth model, as each node is able to dedicate its full computing power to the performance of the system rather than having to aggregate performance throughout a cluster and also dedicate some power to the arbiter within the cluster.
Essentially, we have created a Stretch Virtual Storage Grid, or SVSG as you will hear it referred to in future. The benefit of this type of infrastructure is that you can distribute the environment throughout several locations and ensure that, unless a major city is taken out (possibly by Godzilla), you have a fully distributed HA model on DR geographies.
DataCore’s synchronous replication functionality operates on a forced cache coherency model and is based on a grid architecture, replicating the I/O block between the caches on each DataCore server before sending the acknowledgement to the application server and committing the data to disk. By doing it this way, we obviate the problems associated with clustered storage, and allow a greater degree of performance and flexibility.
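The write path described above can be illustrated with a toy model in Python. To be clear, this is not DataCore’s implementation: the classes StorageNode and MirroredVolume, their methods and the site names are hypothetical, and a real product has to deal with link failures, ordering and concurrency that this sketch ignores. It only shows the ordering of events: the block lands in both caches, then the write is acknowledged, and the commit to disk happens afterwards.

```python
class StorageNode:
    """Toy stand-in for one storage server: a write cache in front of a disk."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # block address -> data, not yet committed
        self.disk = {}   # block address -> data, committed

    def destage(self):
        """Commit cached blocks to disk; in this model it runs after the ack."""
        self.disk.update(self.cache)
        self.cache.clear()

    def read(self, lba):
        """Serve a block from cache if present, otherwise from disk."""
        if lba in self.cache:
            return self.cache[lba]
        return self.disk.get(lba)


class MirroredVolume:
    """One volume presented read/write through two nodes at different sites."""

    def __init__(self, node_a, node_b):
        self.nodes = {node_a.name: node_a, node_b.name: node_b}

    def write(self, via, lba, data):
        """Write through node `via`; return True (the ack) once both caches hold the block."""
        local = self.nodes[via]
        local.cache[lba] = data
        for name, peer in self.nodes.items():
            if name != via:
                peer.cache[lba] = data  # synchronous copy to the remote cache
        return all(node.cache.get(lba) == data for node in self.nodes.values())

    def read(self, via, lba):
        """Read the block through whichever node the host is attached to."""
        return self.nodes[via].read(lba)


if __name__ == "__main__":
    city, maidenhead = StorageNode("city"), StorageNode("maidenhead")
    volume = MirroredVolume(city, maidenhead)
    acked = volume.write("city", 7, b"vm-block")
    print(acked, volume.read("maidenhead", 7))
```

Because a host at either site reads the same data straight after the acknowledgement, ESX servers on both sides can treat the volume as local, which is what makes vMotion between the sites possible.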
I didn’t want this to come across as a dig at Cisco, as it’s Cisco who are currently branding this a new infrastructure application. Far from it, but if someone tells me this is a new concept then I have to disagree. What I will say, though, is that Cisco also looks to be doing a good job of providing a complete solution: it provides tools to achieve things like extended VLANs as well as a good I/O virtualisation platform, and has gone for standard hardware and protocols to drive it. Importantly, Cisco has a good hook into VMware in more than one way 😉 . So you can see all the right tools are there to architect this concept. As for Cisco playing a big part in the cloud, that was always going to be a given, and you can see how these kinds of solutions are going to help.
OK, here’s a cheap way to speed up VMware Workstation VMs when you’re limited by the amount of memory your computer can take, or even by the 4GB limit imposed by 32-bit Windows operating systems.
The answer is simple, providing you use Vista or Windows 7: purchase a 4GB ReadyBoost USB stick.
ReadyBoost is a component of Microsoft Windows, first introduced with Windows Vista in 2006 and also included with Windows 7. It works by using flash memory (a USB 2.0 drive, SD card, CompactFlash or any other kind of portable flash mass storage) as a drive for disk cache and virtual memory.
Using ReadyBoost-capable flash memory for caching allows Windows Vista/7 to service random disk reads with performance that is typically 80-100 times faster than random reads from traditional hard drives. This caching applies to all disk content, not just the page file or system DLLs. Flash devices typically are slower than a hard disk for sequential I/O so, to maximize performance, ReadyBoost includes logic that recognizes large, sequential read requests and has the hard disk service these requests.
When a compatible device is plugged in, the Windows AutoPlay dialog offers an additional option to use the flash drive to speed up the system; an additional “ReadyBoost” tab is added to the drive’s properties dialog where the amount of space to be used can be configured. 250 MB to 4 GB of flash memory can be assigned. ReadyBoost encrypts, with AES-128, and compresses all data that is placed on the flash device; Microsoft has stated that a 2:1 compression ratio is typical, so that a 4 GB cache could contain upwards of 8 GB of data.
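The capacity arithmetic above is easy to check with a small Python sketch. The function name effective_cache_mb is hypothetical, the limits are simply the 250 MB and 4 GB figures quoted above, and the 2:1 ratio is Microsoft’s “typical” figure rather than a guarantee, so treat the result as an estimate.

```python
MIN_CACHE_MB = 250       # smallest allocation quoted above
MAX_CACHE_MB = 4 * 1024  # largest allocation quoted above (4 GB)


def effective_cache_mb(allocated_mb, compression_ratio=2.0):
    """Estimate how much uncompressed data fits in a ReadyBoost cache.

    compression_ratio defaults to the "typical" 2:1 figure; real data
    may compress better or worse.
    """
    if not MIN_CACHE_MB <= allocated_mb <= MAX_CACHE_MB:
        raise ValueError(
            f"allocation must be between {MIN_CACHE_MB} and {MAX_CACHE_MB} MB"
        )
    return allocated_mb * compression_ratio


if __name__ == "__main__":
    # A full 4 GB allocation at 2:1 holds roughly 8 GB of data.
    print(effective_cache_mb(4096) / 1024, "GB")
```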
Under the preferences/memory settings of VMware Workstation, make sure you select “Allow some virtual machine memory to be swapped”. Normally, to enhance performance in VMware Workstation you’d select “Fit all virtual machine memory into reserved host RAM”, but when using ReadyBoost swapping occurs a lot faster, and we get the added bonus of additional memory to use. So not only is it quicker, we can now fire up more VMs.