SeriousTeK

dribble from the Tech World


Software iSCSI Targets: Part 2B - StarWind, Multiple VMs

Part 2B: a 3-VM IOMeter load on a StarWind iSCSI datastore, using the same procedure as the previous testing. Complete install time: 32 minutes, 39 seconds - 3 minutes, 14 seconds faster than the Microsoft iSCSI target software. Here's the setup:

Here's the CPU/RAM of the iSCSI server during OS install:

Just as before, RAM is allocated to the cache and the CPU is heavily utilized. Here is the network utilization during OS install:

The utilization graph looks strikingly similar to the MS iSCSI target install; however, utilization goes over 30% at a few points. During the major portion of the OS install, CPU utilization is high, as is the underlying physical disk queue length - which tells me that a faster disk subsystem would improve performance even further.

And a brief view of network utilization during the VMware Tools install on all 3 VMs:

 

Deduplication and Thin Provisioning

StarWind clearly has the advantage here. Not only is the LUN thin provisioned, it is also deduplicated. The deduplication engine works inline, so data is deduped as it is written - and with a 4K block size, it is very effective. All 3 VMs should be taking up nearly 27GB of space - but as you can see, they are not: only about 8.69GB of space is used. This yields a 3.15:1 dedupe ratio, which is what I would expect for 3 mostly identical servers.

 

*Note: Each VM should be taking up about 9.13GB; however, there is likely some redundant block data within the IOMeter test files on each VM as well.
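For reference, the arithmetic behind that ratio: 3 VMs × ~9.13GB each ≈ 27.4GB of logical data written, and 27.4GB ÷ 8.69GB of physical space used ≈ 3.15:1.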

 

Performance - Single VM IOMeter, 2 VMs Idle

Again, the StarWind iSCSI target showed far better performance than the Microsoft software target:

I also tested rebooting one of the idle VMs while this test was running - as expected, IOPS dropped down to the 300 range and response time went up. The physical disk queue length also jumped, again pointing to the single spindle as the limitation (reboot initiated at 11:09 AM).

 

All 3 VMs running IOMeter

With all 3 VMs running an IOMeter worker, performance is still very good. Keep in mind that the IOMeter test files are 500MB each - the 3 combined likely fit in the StarWind cache, helping the test along. Only 2 of the 3 VMs are shown:

The network utilization as the 3 tests are started:

 

Conclusion

The StarWind iSCSI target software has several clear advantages over the Microsoft solution: a high-speed RAM cache, thin provisioning, and deduplication. Paired with a high-performance disk subsystem - controller-based RAM cache plus faster physical spindles - a major performance boost can be had. Add aggregated network links and more RAM for the cache, and you have a powerful, fast, and efficient software-based iSCSI storage solution.

Software iSCSI Targets - Part 2A: MS iSCSI - Multiple VMs

In Part 2 of this series, we will look at the performance of the software iSCSI targets under a heavier load - more specifically, 3 server VMs. While this may not seem like very much load, keep in mind that the backend storage is still just a single 7200RPM spindle, and all networking is over a single 1GbE link. All of the hardware from test 1 is the same, but here are the specs for the three new VMs:

Windows Server 2008R2 SP1 (3x)

  • 1 vCPU, 2GB vRAM
  • 18GB System drive - * Thick provisioned, eager zeroed

 

The Procedure

Create VMs, mount install ISO, begin installing OS, repeat two more times. During the install, here's what the iSCSI server looks like:

It's a bit hard to tell, but the green graph is iSCSI I/O Bytes/second, the red is iSCSI Target disk latency, and the yellow is iSCSI requests/second. In the background, you can see that the CPU and RAM are fairly dormant.

The iSCSI server network adapter does not appear to utilize more than 30% of total available bandwidth. *Keep this network utilization graph in mind...the pattern will show up again...*

The next important bit of information: from the start of the installs to a running Windows desktop took 35 minutes, 53 seconds, with all 3 VMs installing simultaneously.

IOMeter testing

First, one VM running an IOMeter test, while the other two are idle:

Once again, network usage does not appear to exceed ~30% during the test. IOMeter results for 1 VM (start of test):

Average after one minute:

Next, 3 VMs running the same IOMeter workers (same as in Part 1):

After one minute:

As you can see from the IOPS numbers, this may not be the best solution. In Part 2B we'll run the same tests using a StarWind iSCSI target.

StarWind iSCSI vs. Microsoft iSCSI - Part 1

*NEW* StarWind V8 [BETA] is here - and it looks VERY promising!!

For some small to medium-sized businesses, and even some home power users, shared storage is a must-have. Unfortunately, standard high-performance SANs carry a hefty price tag and massive administrative overhead, so an EMC or NetApp filer is often out of the question. What is the answer? Turn an everyday server into an iSCSI storage server using iSCSI target software.

Microsoft's iSCSI target software was originally part of the Windows Storage Server SKU and only available to OEMs. When Windows Storage Server 2008 was released, it was also included in TechNet subscriptions, making it readily available for businesses and home users to test and validate. Windows Storage Server 2008 R2 was different - the target was released as an installable package for any Server 2008 R2 install, but again, it was available through TechNet. Then, in April of 2011, Microsoft released the iSCSI target software installer to the public (in this blog post).

Enter StarWind. They have had a software target solution around for quite some time - the current release version is 5.8. The free-for-home-use version is becoming very feature-rich - features include thin provisioning, a high-speed cache, and block-level deduplication. The paid versions add multiple-server support, HA mirroring, and failover - the full version comparison can be found here.

The Test Setup
I want to compare both solutions side by side - starting with general OS performance, then more specialized workloads, and so on. All network traffic is carried on a Catalyst 2970G switch, on the native VLAN with all other traffic - this is not an optimal configuration, but I want to start with the basics and try to improve performance from there.

iSCSI Target Server

  • Whitebox Intel Xeon 5160 Dual Core 3.0GHz
  • 3GB RAM, single GbE link
  • 60 GB iSCSI LUN presented from standalone 80GB 7200RPM SATA disk

ESXi 5.0 Host Server

  • Whitebox AMD Athlon X2 3.0 GHz
  • 8GB RAM, single GbE link

Windows 7 Test VM

  • 2 vCPU, 2GB RAM
  • Windows 7 SP1 x86
  • 20GB system volume - *Thick provisioned, eager zeroed

 

Comparison 1: Installing OS

StarWind

The StarWind target was installed, and a virtual volume presented over iSCSI - 60GB, with deduplication turned on and a 1.5GB cache enabled. First impressions: OS installation was remarkably quick - I did not time it. During the install, the iSCSI server was clearly using most of its resources for iSCSI operations - the single 1GbE link was saturated at 95%:

The high-speed cache feature is clearly a factor, as it allocates its RAM immediately, and the CPU load is all from the StarWind process:

 

Microsoft iSCSI

The StarWind software was uninstalled and the Microsoft target software installed in its place. A 60GB LUN was presented to ESXi, and OS installation began (the VM was created with the same specs). Immediately, it was obvious that the installation was going much slower than with the StarWind target software. The resources in use on the iSCSI server clearly show this:

Average network use is around 30%:

Same story with CPU use and allocated RAM:

 

Comparison 2: IOMeter Test

Here is the configuration used for the tests:

 

StarWind

This test may be a bit one-sided because this test VM is the only one running on this datastore, and thus the entire IOMeter test file is likely being served from the RAM cache. Either way, here are the results:

It will be interesting to see if this performance scales with more VMs (containing similar blocks - in the kernel, etc) and with more RAM in the iSCSI server.

Microsoft iSCSI

The results while using the Microsoft software target:

 

Part 1 Results

IOMeter clearly shows a 10x improvement in performance - these results will need to be verified under a heavier load, but I expect that more RAM and faster backing disks will only improve them.

In part 2 I will see if these results will scale with multiple server VM workloads - and also how effective the StarWind deduplication engine is.

Windows Server 8: Offline servicing

From the little that I've looked into Windows Server 8, my favorite new built-in feature is offline servicing. This was possible in the past with Windows Images (WIM files) using the dism.exe tool, but this new feature looks much more promising - the VHD is becoming a very powerful format.
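For anyone who hasn't used the older approach, servicing a WIM offline with dism.exe looks roughly like the sketch below (the image and package paths here are just placeholders) - the appeal of the new feature is getting this same offline-injection workflow against a VM's VHD:

rem Mount the image so its filesystem is visible offline (example paths)
dism /Mount-Wim /WimFile:D:\images\install.wim /Index:1 /MountDir:C:\mount
rem Inject an update package into the offline image
dism /Image:C:\mount /Add-Package /PackagePath:C:\updates\SomeUpdate.cab
rem Commit the changes and unmount
dism /Unmount-Wim /MountDir:C:\mount /Commit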

 

 

This will make servicing VMs even easier.

 

BE.net 2.5

After a lengthy period of downtime, the blog is back up and is now running on BlogEngine.net 2.5!! Now that we've moved into our new place, there should not be any other major interruptions.

 

SnapMirror from FAS to StoreVault

First, a few warnings:

 

  • This is NOT supported by NetApp. At all. In any way shape or form.
  • Using anything other than the StoreVault Manager GUI can cause data loss.

 

You have been warned - do this at your own risk!
 
First, some background - setting aside the fact that FAS to StoreVault is not supported at all - let's go back to basics on SnapMirror:
 
Volume SnapMirror operates at the physical block level. It replicates the contents of an entire volume, including all Snapshot copies, plus all volume attributes verbatim from a source (primary) volume to a target (secondary) volume. As a result, the target storage system must be running a major version of Data ONTAP that is the same as or later than that on the source.
 
Here's the problem - the StoreVault will likely be running Data ONTAP (S version) 7.2.x and the FAS will be running 7.3.x, which means that volume SnapMirror will not even work. In fact, if you try, you will probably receive an unspecified error when initializing the mirror. What's the solution? Try to get your filers on the same major version? Good luck - especially since the StoreVault is EOL. Or use qtree SnapMirror.
 
The caveats:
 
 
What does this mean? It means that all the great features of VSM do not apply - particularly (in my case) SMVI integration. HOWEVER, all that said, it is still possible to efficiently replicate all data from one filer to another on a schedule. If you are mapping volumes (not qtrees) to LUNs in a VMware cluster, you are probably wondering how QSM will work - that's where the trick is, and it's fairly simple.
 
First, remember that SnapMirror is always configured from the destination. Next, use the following syntax to set up a QSM relationship that mirrors the entire volume to a qtree:
snapmirror initialize -S SrcFiler:/vol/VolumeName/- DestFiler:/vol/VolumeName/qtreeName
The key is the '/-' to indicate the entire source volume. Also, do NOT create the qtree on the destination filer before initializing the SnapMirror - the initialize will create the qtree for you. This can also be done in the [unsupported] FilerView on the destination StoreVault to enable throttling and a schedule without having to go into the /etc/snapmirror.conf file.
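If you do end up editing /etc/snapmirror.conf on the destination by hand instead, a scheduled, throttled entry for this relationship would look something like the sketch below (the throttle value and schedule are just examples):

# /etc/snapmirror.conf on DestFiler
# format: source destination arguments minute hour day-of-month day-of-week
SrcFiler:/vol/VolumeName/- DestFiler:/vol/VolumeName/qtreeName kbs=5000 0 23 * *

That entry would run the QSM update at 23:00 every day, throttled to 5,000 KB/s.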
 

The Case of the Print Spooler That Stops Running

Recently, I had cleaned up a virus from a user's laptop - it was a fairly straightforward cleanup, and I thought I was done. Not quite. The user said that her husband had been trying to print and was getting a print spooler error...had the spooler randomly stopped? I sent the command to restart the spooler. This did not work, as it seemed the spooler continually stopped running. I then sent the path to the spool folder to see if there were some corrupt spool files that the spooler did not like - it turns out that directory was empty. Finally, I recommended uninstalling and reinstalling the print drivers for the printer...long story short, the laptop came back in.
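(For reference, the restart itself is nothing fancy - on the machine it boils down to something like:

rem Stop and restart the Print Spooler service
net stop spooler
net start spooler

...which is exactly what kept failing here, since the service would die again almost immediately.)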

There was clearly an issue, as the spooler stopped almost immediately whenever anything print-related was done - adding a printer, viewing server properties. I tried removing the entire contents of the driver folder in the spool directory. Still nothing. As always - "When in doubt, run Process Monitor!"

I looked through the log to see what the spoolsv.exe process was doing - nothing seemed out of the ordinary. Then I found it: right before the spooler thread exits, there's a QueryOpen to a file in a temp directory:

Why was the spooler looking here? Was this somehow a spool file? I figured I would just rename the file and see if that helped -

 

 

Sure enough, that worked - I could now use all printing functions on the system...but what the heck was 17EB.tmp? Let's check the stack:

Bingo! The description of this image (Zhgemubqnkekkwthf) matches one of the .exe files associated with the malware I had cleaned previously. Additionally, I have heard of malware associated with 'Heaventools Software'. Now for the root cause analysis: why did any print function hose the spooler? Clearly this file was the cause, but what continually called it? Let's check the registry:

So the malware had injected itself as a print provider. Any time a print function was called, the malware would likely have recopied itself, run one of the executables I had already deleted, or done something else undesirable. It had also been added to all of the ControlSet trees. It was removed from the registry entirely, and printing was fixed.
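If you want to check a machine for the same thing, the print providers live under the Print key of each ControlSet, so something like the following will dump them (on a typical system you'll see entries like LanMan Print Services and the Internet Print Provider; anything else deserves a look):

rem List registered print providers in the active ControlSet
reg query "HKLM\SYSTEM\CurrentControlSet\Control\Print\Providers" /s
rem Repeat for ControlSet001, ControlSet002, etc. to catch copies in the other ControlSet trees
reg query "HKLM\SYSTEM\ControlSet001\Control\Print\Providers" /s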

 

Aironet AP Missing config command?

I was using the web GUI to make a simple SSID and security change on an AP. But then I was not able to successfully enable WPA, so I resorted to the CLI, only to find that I couldn't even get into global config mode!
-----ES-AP>en
Password:
-----ES-AP#config t
              ^
% Invalid input detected at '^' marker.

-----ES-AP#?
Exec commands:
  <1-99>           Session number to resume
  access-enable    Create a temporary Access-List entry
  access-template  Create a temporary Access-List entry
  archive          manage archive files
  cd               Change current directory
  clear            Reset functions
  clock            Manage the system clock
  connect          Open a terminal connection
  copy             Copy from one file to another
  crypto           Encryption related commands.
  debug            Debugging functions (see also 'undebug')
  delete           Delete a file
  dir              List files on a filesystem
  disable          Turn off privileged commands
  disconnect       Disconnect an existing network connection
  dot11            IEEE 802.11 commands
  dot1x            IEEE 802.1X Exec Commands
  enable           Turn on privileged commands
  erase            Erase a filesystem
  exit             Exit from the EXEC
  format           Format a filesystem

Where's the configure command? This one had me confused for a while...but then I did a show version:

cisco AIR-LAP521G-A-K9     (PowerPCElvis) processor (revision A0) with 24566K/8192K bytes of memory.
Processor board ID #############
PowerPCElvis CPU at 262Mhz, revision number 0x0950
Last reset from power-on
1 FastEthernet interface
1 802.11 Radio(s)
And there it is in the model number - AIR-LAP521G: this is a lightweight AP!! And there's not even a controller at this site?!?!
 

RemoteApp Icons missing

I ran into something strange when adding the Failover Cluster Manager tool to RemoteApp - the icon on both the RD Web Services site and in RemoteApp and Desktop Connections did not show properly - here's what I mean:

 

The above was taken from RemoteApp and Desktop connections - here's what it looked like on the RD Web Services site:

 

As you can see, the Failover Cluster Manager app has the standard RDP connection icon. Something is clearly wrong with this - all other icons are showing correctly.

First troubleshooting step: clear the icon cache on the RD Web Services server. On the server, the icon cache is located at C:\Windows\Web\RDWeb\Pages\rdp. All of the image files and RDP files in this directory can be safely deleted - when you re-open (or refresh) the RD Web Services site, it will reload all of the images and files in the cache folder. This did not work - the Failover icon was still the default RDP icon.
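For reference, clearing that cache is just deleting the generated files in that folder - something like this from an elevated prompt on the RD Web server:

rem Remove the cached icon images and generated .rdp files; they are rebuilt on the next page load
del /q "C:\Windows\Web\RDWeb\Pages\rdp\*"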

Second troubleshooting step: remove and then re-add the app in RemoteApp Manager. This led me to the solution. The first re-add did not work; however, I noticed that the path to the icon (as for all other icons in RemoteApp) is a UNC path to the server hosting the app - for example \\rdserver\c$\Windows\... This got me thinking about permissions, so I checked all of the directory permissions - which were all correct. Then I ran RemoteApp Manager explicitly as Administrator. When adding the app back, the path to the icon file became a local path - and upon refreshing the site, all icons showed correctly! As a side note, when running RemoteApp Manager as Administrator, you cannot change the user assignment on the app when first creating it.

I'm not sure whether the fix was running RemoteApp Manager explicitly as Administrator or the path to the icon, which in this case (C:\Windows\Cluster) is different from most others.

 

Data Protection Manager 2010 - Backup to [empty] Disk

When I transferred most of my VMs over to shared storage, I didn't really have the time (or money) to build a respectable storage environment. Needless to say, some VMs are stored on non-redundant storage...but most are on at least a RAID1 volume. It was at this time I decided to install Data Protection Manager 2010 and start getting things backed up. Obviously I will be using backup-to-disk, since 1) I have no tape drives or libraries, and 2) I'm not paying for cloud storage for a home environment.

The bit that confused me was adding storage: don't add a disk to a DPM storage pool if the entire disk is already allocated to a volume - the operation will succeed, but there will be no available space for DPM to allocate and use.
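Put differently, the disk needs unallocated space before you add it to the pool. If you've already created a volume on it, a rough diskpart sketch for clearing it looks like this (the disk and volume numbers below are examples - be very sure you've selected the right ones before deleting anything):

DISKPART> list disk
DISKPART> select disk 2
DISKPART> list volume
DISKPART> select volume 5
DISKPART> delete volume

Once the space shows as unallocated, DPM should be able to carve out its replica volumes on that disk.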

The way DPM B2D works is by allocating each replica its own individual volume, then mapping it into the DPM directory. Here's how it will look:

 

Each volume is mapped into the DPM directory - it's a bit of a long-winded path:

C:\Program Files\Microsoft DPM\DPM\Volumes\Replica\Microsoft Hyper-V VSS Writer\

The directory will also depend on the type of replica - in this case, these volumes all correspond to Hyper-V snapshots...but if you browse further down into the directory, it is a full snapshot of the VM storage - VHD and config files - as you would expect.
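If you want to see that mapping from the command line, mountvol with no arguments lists every volume along with its mount points - the DPM replica volumes show up mounted under paths like the one above instead of under drive letters:

rem List all volumes and where they are mounted (drive letters or mounted folders)
mountvol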