jake's blog

dribble from the Tech World

StarWind SAN V8 - Beta Initial Testing

7. September 2013 21:36 by Jake Rutski | 0 Comments

Last Friday, 9/6/2013, StarWind released a beta of version 8 of their core SAN product - iSCSI SAN. This version has been in the works for several months and includes what appears to be a major rewrite of the core storage code as well as a simplified management console. The new core includes, among other things, the LSFS device type, deduplication, and write-back caching - all of which show up in the test below.

Please note that this is a beta release. That said, I could not wait to give it a shot...even though it's not quite the same setup as in previous tests. My physical SAN\NAS is now based on FreeNAS with NFS storage presented to VMware, so this test will have to be virtual. As a basic first test, I have a 2-vCPU, 8GB RAM Server 2008 R2 virtual machine with StarWind iSCSI SAN V8 installed. It presents a 50GB volume (itself a VMDK on the NFS datastore from FreeNAS) backed by a 5GB write-back cache. The initial install of StarWind shows a vastly different console:

Additionally, I am prompted to choose where to store data - either the C: drive or another volume:

Next, I created the LSFS-based iSCSI target using the advanced device wizard:

 

After all that, I presented the target to VMware and cloned a Server 2008 R2 VM from a template. Note that the datastore shows full VAAI support, and according to StarWind, VAAI is supported in this release. I ended up cloning 2 VMs to the datastore, each with 10GB worth of data. The StarWind server itself showed only 10.1GB used with DeDupe turned on, while VMware showed 20GB used - roughly a 2:1 ratio. The real test: IOMeter:

Again - please note that this is an iSCSI volume based on a VMDK presented by NFS to an ESX host...so there are a few layers of storage virtualization here...plus the StarWind VM has only 8GB of RAM, which is not that much (and it's virtual RAM at that). All that being said, a write-heavy, random IO test produced EXTREMELY favorable results:

3400 IOPS???!?! That's AMAZING! And it's all virtual? Imagine if this were a physical server, with tons of RAM and an SSD L2 cache - I would imagine it would only get better from here. More tests to come!
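As an aside on the VAAI note above, hardware acceleration status can also be checked from the ESX host itself - a minimal sketch using the standard esxcli namespace (this lists every device; the StarWind-backed LUN will appear among them):

# list VAAI (hardware acceleration) primitive status for the host's storage devices
esxcli storage core device vaai status get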

FreeNAS 9.1 - LZ4 Compression

30. August 2013 08:16 by Jake Rutski | 0 Comments

I recently upgraded to FreeNAS 9.1 once it went GA. I have always used compression on the volume that stores all of the virtual machines, instead of using DeDupe, which requires considerably more RAM and doesn't appear to be as forgiving. The volume had been configured with LZJB compression and I was seeing about 40-50% compression ratios. I then converted the volume to LZ4 compression. *Note: Modifying compression only affects new data written to the volume.
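For reference, the same change can be made from the shell - a minimal sketch assuming a hypothetical dataset name (FreeNAS normally exposes this through the GUI):

# switch the dataset to LZ4; only newly written blocks use the new algorithm
zfs set compression=lz4 tank/vmstore
# confirm the setting and the achieved compression ratio
zfs get compression,compressratio tank/vmstore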

After storage vMotioning and copying all the VMs around, I saw an additional 9-10% improvement in compression - here are some numbers:

  • 13.98 GB vmdk with lzjb = 12.61 GB vmdk with LZ4
  • 23.32 GB vmdk with lzjb = 20.77 GB vmdk with LZ4

All in all, I was able to reclaim about 20GB of space from compression.

FreeNAS Build Project: Part 3

4. July 2013 15:33 by Jake Rutski | 0 Comments

This post is slightly out of order - I'll do part 2 with all specs, hardware, and pictures shortly. For now, here are some brief preliminary findings. The configuration (a CLI sketch of the equivalent pool layout follows the list):

  • 4 mirrors of 2 x 146GB SFF 10K RPM drives
  • ZIL housed on 1 mirror of 2 Intel 330 60GB SSD drives
  • DL380G6 with 30GB PC3-10600R RAM
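Roughly, that layout corresponds to a pool built like this - a sketch only, since FreeNAS builds the pool from the GUI and the device names here are hypothetical:

# four mirrored vdevs of 10K SAS drives, plus a mirrored SSD log (SLOG) device
zpool create tank mirror da0 da1 mirror da2 da3 mirror da4 da5 mirror da6 da7 log mirror ada0 ada1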

ESX is connected to this via NFS (I know - sync writes...that's why the ZIL is on the SSDs). So in theory, the raw spindle performance should be something like 500 IOPS for writes - four mirrored vdevs at roughly 125 IOPS per 10K spindle. Here's a 75% write, 100% random IOMeter test:

While ~1100 IOPS isn't great, it's not bad for the spindle count. Plus, looking at the ARC hit\miss counts, if more of the reads were cached, I'm betting the performance would be better. I'll likely add an SSD L2ARC and\or more RAM to the system.
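The ARC hit\miss behavior can be watched live with the arcstat.py script that ships with FreeNAS (the same script used in an earlier post); for example:

# print ARC reads, hits, misses, hit ratio, and ARC size once per second
./arcstat.py -f read,hits,miss,hit%,arcsz 1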

And here's a 100% read, 100% random test (obviously this is all coming from the ARC):

 

More details later.

FreeNAS Build Project: Part 1

20. March 2013 17:08 by Jake Rutski | 0 Comments

FreeNAS Build: Part 1 - Turn on a Power Supply without a Motherboard

After doing much research, I've decided to move forward with my FreeNAS build for my home lab\storage. I will be storing VMware datastores via NFS and tons of files via CIFS. The system will be configured (for now) as follows:

  • HP DL360G5
    • 2x Dual-Core Xeon 3.0GHz
    • 32GB RAM
    • 4 1Gb NICs
  • IBM ServeRAID M1015 flashed to an LSI 9211-8i (see the flashing sketch after this list)
  • HP SAS Expander
  • Habey ESC-2122C for 12 additional 3.5" drives
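For reference, the M1015 crossflash is usually done from a DOS or EFI shell with LSI's sas2flsh utility. This is only a rough sketch of the commonly documented steps - the firmware/BIOS file names come from LSI's SAS2008 download packages, and the SAS address shown is a placeholder (record the real one from the card's label first; some boards also require clearing the SBR with megarec before sas2flsh will run):

# erase the existing IBM firmware
sas2flsh -o -e 6
# flash the LSI IT-mode firmware and, optionally, the boot ROM
sas2flsh -o -f 2118it.bin -b mptsas2.rom
# restore the card's original SAS address (placeholder value shown)
sas2flsh -o -sasadd 500605bxxxxxxxxx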

The 12-bay 'storage shelf' has already arrived and been reviewed here.

This configuration will put the SAS expander in the storage shelf - so the next question: how to power it?

The HP SAS card only gets power from the PCI-e interface; there is no data connection. So all I need is a powered PCI-e x8 slot. One option is to use a cheap motherboard to power a slot but never boot into an OS. Another option is putting the SAS expander in the actual PC - in my case the DL360...this will not work since I am out of slots (they're used by the 2nd NIC and the LSI SAS card). So with this build, I'll be using a 2-slot, 1U backplane (PE-2SD1-R10) available here from OrbitMicro.

The only problem with these backplanes is that they do not have a direct means of turning on the power supply - and keeping it on. There are, again, a few options. The most common is to jumper the ground and power-on pins (the power button and reset button pins don't do anything that I can tell). By jumpering the power-on pin to ground, the power supply is permanently on - making the external hard switch on the PSU the effective power switch. I didn't like this idea, so I built a circuit based on a 2-coil latching relay to allow the front power button to control the power supply.

The 5-volt standby rail actuates the 2 coils in the relay; the 2-pin lead is for the built-in power button on the chassis, and the 3-pin lead connects the ground, +5VSB, and power-on lines from the backplane.

FreeNAS - ESX and NFS; Synchronous Writes and the ZIL

20. February 2013 15:59 by Jake Rutski | 0 Comments

I've been testing FreeNAS lately, connecting ESX hosts via NFS for virtual machine storage. Being POSIX compliant, ZFS must honor any call made with the O_SYNC flag set, which essentially means the data must be written to stable storage before success is returned. This most commonly includes databases, file server operations, and - most importantly here - NFS. The ZIL (ZFS intent log) is the special place on disk where these synchronous writes are logged before being committed to the pool from RAM; in its default configuration the ZIL consists of blocks from your zpool.

There are 2 ways to improve performance for NFS workloads - especially intensive ones such as VMware:

  1. Use an SSD (or a mirrored pair of SSDs - especially if you're not at ZFS v28) as a dedicated log device (see the sketch after this list)
  2. Disable the ZIL (*not recommended - it can cause data corruption)
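A minimal sketch of option 1, assuming an existing pool named tank and two hypothetical SSD device names:

# add a mirrored pair of SSDs as a dedicated log (SLOG) device
zpool add tank log mirror ada0 ada1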

To prove that the ZIL is being used heavily, you can temporarily disable it and see whether performance improves. See the screenshot below from the vCenter console - datastore write latency for my VMs was averaging 9-11ms, which is not great. The arrow indicates the point at which the ZIL was disabled:

While it is not a good idea to run with the ZIL disabled, this clearly shows that it is being strained. Once disabled, the new average latency was 0-1ms and the systems were much more responsive.

The command to disable the ZIL is:

zfs set sync=disabled tank/dataset

 

This was done for testing\demonstration only. To set the ZIL back to its default configuration, run this command:

zfs set sync=standard tank/dataset
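To verify the current setting on the dataset:

# valid values are standard, always, and disabled
zfs get sync tank/dataset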

For the final FreeNAS build, a pair of SSDs will be used for the ZIL.

Habey ESC-2122C Storage Server Chassis Review

20. February 2013 08:15 by Jake Rutski | 0 Comments

I have been testing lots of storage software lately - since a NetApp filer is overkill for a home lab environment, I'm sticking with software-based storage. The goal has always been to serve two things: fast storage for an ESX cluster, and lots of storage shared over the network with CIFS.

The first iteration was a DL380G5 with a P800 card connected via external SAS to an MSA50. The biggest problem with this setup was simply that the MSA50 was WAY TOO LOUD for a home environment, and there is no real way to quiet it down. The next problem is that the DL380G5 is also somewhat loud, and it only has 8 SFF bays. Long story short, I decided to build a white box disk shelf - and that's where the Habey ESC-2122C chassis comes in.

The item can be found here at Newegg. It is a 2U chassis with 12 hot-swap SAS\SATA LFF slots and 4 80mm fans, and the backplane is connected with 3 SFF-8087 ports (not 2, as Habey and Newegg report). The plan is to use an HP SAS Expander in the chassis to provide SAS connectivity to the controller server. Here's an overview:

What you see is what you get - a bag of screws...and that is all. It looks like rails are available (here's one set that I found), though I'm not sure if I'll use them, as the chassis will be sitting on top of the rack...we'll see. Here's the backplane and fan connector board:

Here are the 3 (not 2) SAS backplane connections and the 4-pin Molex connectors for HDD power. Please note that there are SIX (6) power connectors on the backplane, plus ONE (1) for the fans, for a total of seven (7) - needless to say, I will be repurposing some SATA power leads for this.

Update: Note that each pair of Molex power connectors is for redundant power supplies - so with a single PSU, only one of each pair is needed to power each row of four (4) drives.

There are 2 places, one on either end of the fan-cutout plate, for cables to pass to the front of the chassis - it is going to be difficult to get all the cables up front because the space is fairly small. Also, the backs of the SAS backplane connectors are fairly close to the fan cages, and the bottom one is hard to reach without taking the fan out.

A little blurry but here's the front of the unit:

And one of the HDD trays:

Also note: the trays are compatible with 2.5" drives - note the 4 screw holes in the tray.

The fans are meant to be "hot-swap" as well, but since the only thing generating much heat will be the SAS Expander, airflow is not a high priority here. That said, I am removing the stock fans and replacing them with much quieter units.

 

Here's the fan out of the slot - there are two options if you want to replace the fans:

  1. Use the existing 3-pin fan headers on the control board directly to new fans - this bypasses the 4-pin 'hot plug' connector on the fan cage.
  2. Re-pin the 'hot plug' connector and keep the existing cable connections to the control board.

I opted for option 1 - the pins from the new fans did not fit into the 4-pin connector on the fan cage. Also note that only 3 of the 4 pins are used. Here's a close-up of the fan cage:

There's a small blue clip that holds the white connector in place - it snaps off. The 4 fan screws also have small spacers between the fan and the cage.

All in all it seems to be a decent chassis for the money. More details once I start building the controller and get the expander mounted.

FreeNAS Performance: Part 3

10. January 2013 13:19 by Jake Rutski | 3 Comments

I recently storage vMotioned all of the home lab VMs over to FreeNAS-based storage. It is an NFS share backed by 4x 10K 146GB spindles configured as RAIDZ - deduplication is turned off, opting instead for LZJB compression. The physical specs of the FreeNAS host are the same as in this post.

Keep in mind that there are several other VMs banging away at these 4 disks. That said, the 75% write\25% read (75% random) result is considerably lower for this configuration than for the stripe in Part 2. Again - this is not the final configuration; the final build will add SSDs for the ZIL, use mirrors (RAID1) instead of RAIDZ (RAID5), and have more RAM.

200 IOPS isn't terrible, but it's not great. It's plenty for my home lab (for now).

Just to compare and show that the L1ARC (RAM) cache is doing its job, I configured a 100% read specification (75% random).

2355 read IOPS is pretty good for these disks - but you can clearly see that it is pretty much all coming out of the RAM cache:

[root@filer] /usr/local/www/freenasUI/tools# ./arcstat.py -f read,hits,miss,hit%,arcsz 1
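# fields: read = total ARC accesses per second, hits and miss = ARC hits and misses, hit% = hit ratio, arcsz = current ARC size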

More testing to continue...

FreeNAS Performance: Part 2

8. January 2013 08:49 by Jake Rutski | 0 Comments

I should probably call this one "FreeNAS Performance: The REAL Test"...my previous performance tests were completely unfair to FreeNAS and I'd like to show what it can really do. So - what was wrong with the first tests?

  • FreeNAS needs RAM for ZFS to do what it does best
  • A single spindle just isn't a good test for ZFS
  • Didn't do my homework on FreeNAS or ZFS
  • FreeNAS needs RAM - more than 3GB
  • ...

So here's the new test setup. Granted, this is not an apples-to-apples comparison, but I feel it's enough to show that FreeNAS can and will perform.

The VMware environment is running on HP DL360G5s, as is the FreeNAS appliance. While the VMware box in this test is beefier than in the previous tests, keep in mind that it is also loaded with several other running VMs during the tests. FreeNAS specs:

  • Dual Intel Xeon 5160
  • 16GB RAM
  • 3x 10K SAS in RAID0 (stripe)
  • P400 with 256MB BBWC

I did a single Windows 7 VM install in 7-8 minutes. Boot\reboot times are excellent - sub 20 seconds...overall, performance of the VM feels great. IOMeter shows it:

This is not the same ~50 IOPS from the previous test...this is 30 times the IOPS! This is the same IOMeter test setup as before, and it is pulling 1500+ IOPS from the FreeNAS NFS datastore. The FreeNAS appliance had ~3-5.5GB of RAM allocated once this VM was up and running and during the test...clearly the 3GB in the previous test was not nearly enough. Further, I do not have deduplication turned on - instead I am using compression on the ZFS dataset. Windows shows 8.6GB of space used, while the NFS share shows only 4.2GB used:
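The space savings can be read straight off the dataset - a minimal sketch, assuming a hypothetical dataset name:

# show on-disk usage, the compression algorithm in use, and the achieved ratio
zfs get used,compression,compressratio tank/nfs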

Additionally, I was able to copy a file to a CIFS share (all over a 1Gb network) and saw 110-130 MB/s.

The testing continues in Part 3!

StarWind iSCSI SAN V6 Released

29. September 2012 17:27 by Jake Rutski | 0 Comments

Although it has been available for about a month, the next major release of the StarWind iSCSI target is here! Some of the biggest features include (taken from here):

High availability:
- 3-node HA configuration. Synchronous mirroring between 3 nodes of an HA storage cluster. This architecture ensures higher uptime and higher performance compared to a 2-node HA configuration.
- HA device node manager. You can add, remove, or switch nodes of an HA cluster on a running device instead of creating a new HA device from scratch.
- An HA device can now use other types of StarWind devices (deduplicated, thin-provisioned IBV, or DiskBridge devices) for storing data, so you can apply deduplication, thin provisioning, snapshots, etc., to the data stored on an HA device. Experimental feature.
- ALUA. Asymmetric logical unit access is needed when individual nodes use storage with very different performance characteristics, such as SATA on one node and SSD on another. With ALUA enabled, most I/O is served by the faster node, resulting in lower latency for I/O-intensive applications.

Deduplication:
- Asynchronous replication to remote iSCSI Target over WAN as an experimental feature.
- Data deletion support (experimental feature). Unused data blocks are overwritten with new data.
- Memory usage reduced by 30%. When the dedupe block size is set to 4KB, 2MB of memory is required per 1GB of stored data.

iSCSI Boot:
- StarWind can be used to build and configure an environment for iSCSI boot.
- Two modes added for Snapshot and CDP device: redirect on write and redirect on write with discard. These options can be used for booting multiple clients from one image.

Event notifications:
- Free space low watermarks are now reported for thin-provisioned and deduplicated volumes.

Backup Plug-in for Hyper-V virtual machines:
- Incremental backup and delta data saving features added.

Backup Plug-in for ESX virtual machines (experimental feature):
- Full and incremental backup of virtual machines.
- ESX VMs management.
- Backup archives are saved in the native VMware format – VMDK.

 

The current version is 6.0.4768. The biggest reason for me to upgrade to this version was the deduplication deletion support. Previously, a deduplicated device would continue to grow when data was deleted because it did not keep track of unused blocks. This caused the .spdata file to grow rather large, especially after a storage vMotion or with a lot of changing data. Here's a snapshot of the dedupe device creation:

The deletion support in its current form (still an 'experimental feature') will overwrite unused blocks with new data. Keep in mind, though, that the container files on the StarWind server will not shrink in size - that is a feature that will likely show up in a future release.

Currently, I'm "using" around 270GB of storage, which is only occupying ~68GB on disk - roughly a 4:1 ratio. Bear in mind that this ratio could be better, but I am currently running both Server 2008 R2 AND Server 2012 VMs.

For best performance, set deduplication block size to 4K and do NOT use thin-provisioning on top of deduplicated devices.

Software-Based Storage: Thoughts and Local Storage Tests

2. April 2012 20:53 by Jake Rutski | 0 Comments

It has occurred to me that all these comparisons are not exactly equal. While the VM configurations, test procedures, and testing hardware are all identical, there are certainly ways to improve performance - some that could be applied to every comparison (adding a storage controller card with battery-backed write cache and several 15K SAS spindles), and some that are specific to the software presenting the storage (using an SSD to house the ZIL and\or L2ARC only applies to ZFS-based products). In reality, these tests are the 'absolute worst case scenario' - who in their right mind would use a single (7200 RPM, non-enterprise) drive to house anything more than a music library?

All that said, I wanted to take the network and the 3rd-party storage providers out of the equation and repeat some tests using a local datastore on the ESX host. Here's what the datastore looked like during zeroing and OS install:

During install, latency stayed right around 40ms. Complete installation for 3 VMs took just under 1 hour. Here's the datastore latency during idle operation, with the IOMeter test at the far right of the chart:

Latency was just under 20ms during idle. The first test was a single VM running the standard IOMeter worker as in previous comparisons:

This shows the local storage to be around 25 IOPS worse than the MS iSCSI target in the single-VM IOMeter test. The 3-VM test shows where DAS takes a big hit:

The best average I saw during the test:

So in the end, local storage is clearly not the way to go (except in some very specific use cases - but those involve gear that will NEVER be approved for a home lab...nor should it be).