jake's blog

dribble from the Tech World

20. March 2013 17:08
by Jake Rutski
0 Comments

FreeNAS Build Project: Part 1

FreeNAS Build: Part 1 - Turn on a Power Supply without a Motherboard

After doing much research, I've decided to move forward with my FreeNAS build for my home lab\storage. I will be storing VMware datastores via NFS and tons of files via CIFS. The system will be configured (for now) as follows:

  • HP DL360G5
    • 2x Dual-Core Xeon 3.0GHz
    • 32GB RAM
    • 4 1Gb NICs
  • IBM ServeRAID M1015 flashed to an LSI 9211-8i
  • HP SAS Expander
  • Habey ESC-2122C for 12 additional 3.5" drives

The 12 bay 'storage shelf' has already arrived and is reviewed here.

This configuration will put the SAS expander in the storage shelf - so the next question: how to power it?

The HP SAS card only gets power from the PCI-e interface; there is no data connection. So all I need is a powered PCI-e x8 slot - one option is to use a cheap motherboard to power a slot, but never boot into an OS. Another option is having the SAS expander in the actual PC - in my case the DL360...this will not work since I am out of slots (used by the 2nd NIC and LSI SAS card). So with this build, I'll be using a 2 slot, 1U backplane (PE-2SD1-R10) available here from OrbitMicro.

The only problem with these backplanes is that they have no direct means of turning on the power supply and keeping it on. There are, again, a few options. The most common is to jumper the power-on pin to ground (the power button and reset button pins on the backplane don't do anything that I can tell). With the power-on pin jumpered to ground, the power supply is permanently on, which makes the external hard switch on the PSU the power switch. I didn't like this idea, so I built a circuit based on a 2-coil latching relay to allow the front power button to control the power supply.

The circuit uses the 5 volt standby rail to actuate the 2 coils in the relay. The 2-pin lead is for the built-in power button on the chassis, and the 3-pin lead connects ground, +5VSB, and power-on from the backplane.
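
As a rough way to reason about the circuit, the behaviour of a 2-coil latching relay can be modelled in a few lines of Python. This is only a sketch: the class and function names are made up for illustration, and the assumption that each button press pulses whichever coil flips the relay's current state is mine, not a description of the actual wiring. The one ATX detail relied on is that the power-on line is active low, so the supply runs while that line is tied to ground.

```python
class LatchingRelay:
    """Toy model of a 2-coil latching relay (names are hypothetical)."""

    def __init__(self):
        # Contact open: power-on is not grounded, so the PSU stays in standby.
        self.closed = False

    def pulse_set(self):
        # A brief +5VSB pulse on the "set" coil closes the contact.
        self.closed = True

    def pulse_reset(self):
        # A brief +5VSB pulse on the "reset" coil opens the contact.
        self.closed = False

    def psu_is_on(self):
        # Power-on is active low: the PSU runs while the contact ties it to ground.
        return self.closed


def press_power_button(relay):
    """Assumed behaviour: each press energises the coil that flips the state."""
    if relay.closed:
        relay.pulse_reset()
    else:
        relay.pulse_set()


relay = LatchingRelay()
states = []
for _ in range(3):
    press_power_button(relay)
    states.append(relay.psu_is_on())
print(states)
```

The reason for a latching relay is that the contact keeps its position after the pulse ends, so a momentary chassis button is enough to hold the supply on or off.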

20. February 2013 15:59
by Jake Rutski
0 Comments

FreeNAS - ESX and NFS; Synchronous Writes and the ZIL

I've been testing FreeNAS lately, connecting ESX hosts via NFS for virtual machine storage. Being POSIX compliant, ZFS must honor any write made with the O_SYNC flag set, which essentially means the data must be written to stable storage before success is returned. The most common sources of such writes are databases, file server operations and, most importantly, NFS. To satisfy them, ZFS uses the ZIL (ZFS intent log): a special place on disk where data is temporarily written from RAM. In its default configuration the ZIL is made up of blocks from your zpool.
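
To make the synchronous-write requirement concrete, here is a minimal Python sketch of what an application does when it asks for this behaviour. The helper name and the temporary file are only for illustration; nothing here is specific to FreeNAS or ESX, and the flag is looked up defensively because not every platform defines it.

```python
import os
import tempfile


def write_sync(path, data):
    """Write data with O_SYNC so write() returns only after it is on stable storage."""
    # O_SYNC is not defined on every platform; fall back to 0 and rely on fsync().
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_SYNC", 0)
    fd = os.open(path, flags, 0o600)
    try:
        written = os.write(fd, data)
        # Explicit flush as a fallback for platforms without O_SYNC.
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


demo_path = os.path.join(tempfile.mkdtemp(), "sync_demo.bin")
bytes_written = write_sync(demo_path, b"x" * 4096)
print(bytes_written)
```

Every call like this has to wait for the storage to confirm the write, which is why a workload made almost entirely of synchronous writes leans so heavily on the ZIL.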

There are 2 ways to improve performance for NFS workloads - especially intensive ones such as VMware:

  1. Use an SSD (or mirrored pair of SSDs - especially if you're not at ZFS v28) as a dedicated log device
  2. Disable the ZIL (not recommended - can cause data corruption)

To show that the ZIL is the bottleneck, you can temporarily disable it and see if performance improves. See the screenshot below from the vCenter console - datastore write latency for my VMs was averaging 9-11ms, which is not great. The arrow indicates the point at which the ZIL was disabled:

While it is not a good idea to disable the ZIL, it clearly shows that it is being strained. Once disabled, the new average latency was 0-1ms and the systems were much more responsive.

The command to disable the ZIL is:

zfs set sync=disabled tank/dataset

 

This was done for testing\demonstration only. To restore the ZIL to its default configuration, run this command:

zfs set sync=standard tank/dataset
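
If you flip this setting back and forth while testing, a small wrapper helps avoid typos. The sketch below only builds the command line and does not run it; the function name is hypothetical, and "tank/dataset" is the same placeholder used in the commands above. The three values listed are the ones the ZFS sync property accepts.

```python
import shlex

# Values accepted by the ZFS "sync" dataset property.
VALID_SYNC_VALUES = ("standard", "always", "disabled")


def zfs_sync_command(dataset, value):
    """Return the argv list for `zfs set sync=<value> <dataset>` without running it."""
    if value not in VALID_SYNC_VALUES:
        raise ValueError(f"unsupported sync value: {value!r}")
    return ["zfs", "set", f"sync={value}", dataset]


# "tank/dataset" is the placeholder name used in the commands above.
disable_cmd = zfs_sync_command("tank/dataset", "disabled")
restore_cmd = zfs_sync_command("tank/dataset", "standard")
print(shlex.join(disable_cmd))
print(shlex.join(restore_cmd))
```

The returned list could be handed to subprocess.run on the FreeNAS host once you are happy with what it prints.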

For the final FreeNAS build, a pair of SSDs will be used for the ZIL.

20. February 2013 08:15
by Jake Rutski
0 Comments

Habey ESC-2122C Storage Server Chassis Review

I have been testing lots of storage software lately - since a NetApp Filer is overkill for a home lab environment, I'm sticking with other software based storage. The goal has always been to serve storage for two things: fast storage for an ESX cluster and lots of storage to share over the network with CIFS.

The first iteration was a DL380G5 with a P800 card connected via external SAS to an MSA50. The biggest problem with this setup was simply that the MSA50 was WAY TOO LOUD for a home environment, and there is no real way to quiet it down. The next problem is that the DL380G5 is also somewhat loud, and it only has 8 SFF bays. Long story short, I decided to build a white box disk shelf - and that's where the Habey ESC-2122C Chassis comes in.

The item can be found here at Newegg. It is a 2U chassis with 12 hot-swap SAS\SATA LFF slots, 4 80mm fans, and the backplane is connected with 3 SFF-8087 ports (not 2 like Habey and Newegg report). The plan is to use an HP SAS Expander in the chassis to provide SAS connectivity to the controller server. Here's an overview:

What you see is what you get - a bag of screws...and that is all. Looks like rails are available (here's one that I found) though I'm not sure if I'll use them or not as it will be sitting on top in the rack...we'll see. Here's the backplane and fan connector board:

Here are the 3 (not 2) SAS backplane connections and the 4-pin molex connectors for HDD power. Please note that there are SIX (6) power connectors on the backplane, plus ONE (1) for the fans, for a total of SEVEN (7) - needless to say I will be repurposing some SATA power for this.

Update: Note that each pair of molex power adapters is for redundant power supplies - so for a single PSU, only one of each pair will be needed to power each row of four (4) drives.
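
The connector count works out as follows. This is a small sketch of the arithmetic, assuming (as described above) one connector per row of four drives per power supply, and assuming the fan board takes a single connector regardless of how many PSUs are fitted. The function name is just for illustration.

```python
def molex_connectors_needed(drive_bays, drives_per_row, psu_count, fan_connectors=1):
    """Count the 4-pin molex plugs needed: one per drive row per PSU, plus the fan board."""
    # Ceiling division: a partly filled row still needs its own connector.
    rows = -(-drive_bays // drives_per_row)
    # Assumption: the fan board is fed from a single connector regardless of PSU count.
    return rows * psu_count + fan_connectors


single_psu = molex_connectors_needed(drive_bays=12, drives_per_row=4, psu_count=1)
redundant = molex_connectors_needed(drive_bays=12, drives_per_row=4, psu_count=2)
print(single_psu, redundant)
```

So a single PSU needs four plugs rather than the seven present on the board.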

There are 2 places, one at either end of the fan-cutout plate, for cables to pass to the front of the chassis - it is going to be difficult to get all the cables up front because the space is fairly small. Also, the backs of the SAS backplane connectors are fairly close to the fan cages, and the bottom one is fairly difficult to get to without taking the fan out.

A little blurry but here's the front of the unit:

And one of the HDD trays:

Also note: the trays are compatible with 2.5" drives - note the 4 screw holes in the tray.

The fans are meant to be "hot-swap" as well, but since the only thing generating much heat is the SAS Expander, air flow is not a high priority here. That said, I am removing the stock fans and replacing them with much quieter units.

 

Here's the fan out of the slot - there are two options if you want to replace the fans:

  1. Use the existing 3-pin fan headers on the control board directly to new fans - this bypasses the 4-pin 'hot plug' connector on the fan cage.
  2. Re-pin the 'hot plug' connector and keep the existing cable connections to the control board.

I opted for option 1 - the pins from the new fans did not fit into the 4-pin connector on the fan cage - also note that only 3 of the 4 pins are used. Here's a close up of the fan cage:

There's a small blue clip that holds the white connector in place - it snaps off. The 4 fan screws also have small spacers between the fan and the cage.

All in all it seems to be a decent chassis for the money. More details once I start building the controller and get the expander mounted.

10. January 2013 13:19
by Jake Rutski
3 Comments

FreeNAS Performance: Part 3

I recently storage-VMotion'd all of the home lab VMs over to FreeNAS based storage. It is an NFS share based on 4 10K 146GB spindles configured for RAIDZ - deduplication is turned off, instead opting for lzjb compression. The physical specs of the FreeNAS host are the same as in this post.

Keep in mind that there are several other VMs banging away at these 4 disks. That said, IOPS at 75% write\25% read (75% random) are considerably lower for this configuration. Again, this is not the final configuration: there are no SSDs for the ZIL, the pool is RAIDZ (RAID5) instead of a mirror (RAID1), and the final build will have more RAM.

200 IOPS isn't terrible, but it's not great. It's plenty for my home lab (for now).

Just to compare and show that the L1ARC (RAM) cache is doing its job, I configured a 100% read specification (75% random).

2355 read IOPS is pretty good for these disks - but you can clearly see that it is pretty much all coming out of the RAM cache:

[root@filer] /usr/local/www/freenasUI/tools# ./arcstat.py -f read,hits,miss,hit%,arcsz 1
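
The hit% column is simply hits divided by total reads. Here is a minimal sketch of that calculation; the sample numbers are hypothetical and are NOT taken from the test run above.

```python
def arc_hit_percent(hits, misses):
    """Percentage of ARC reads served from RAM."""
    total = hits + misses
    if total == 0:
        return 0.0
    return 100.0 * hits / total


# Hypothetical one-second sample, NOT taken from the test run above.
sample_hits, sample_misses = 950, 50
print(round(arc_hit_percent(sample_hits, sample_misses), 1))
```

When the miss count stays near zero during a read test, almost nothing is reaching the spindles.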

More testing to continue...

8. January 2013 08:49
by Jake Rutski
0 Comments

FreeNAS Performance: Part 2

I should probably call this one "FreeNAS Performance: The REAL Test"...my previous performance tests were completely unfair to FreeNAS and I'd like to show what it can really do. So - what was wrong with the first tests?

  • FreeNAS needs RAM for ZFS to do what it does best
  • A single spindle just isn't a good test for ZFS
  • Didn't do my homework on FreeNAS or ZFS
  • FreeNAS needs RAM - more than 3GB
  • ...

So here's the new test setup - granted this is not an apples-to-apples comparison, I feel it's enough to show that FreeNAS can and will perform.

The VMware environment is running on HP DL360G5s, as is the FreeNAS appliance. While the VMware box in this test is beefier than in the previous tests, keep in mind that it is also loaded with several other running VMs during the tests. FreeNAS Specs:

  • Dual Intel Xeon 5160
  • 16GB RAM
  • 3x 10K SAS RAID0 (stripe)
  • P400 with 256MB BBWC

I did a single Windows 7 VM install in 7-8 minutes. Boot\reboot times are excellent - sub 20 seconds...overall, performance of the VM feels great. IOMeter shows it:

This is not the same ~50 IOPS from the previous test...this is 30 times the IOPS! This is the same IOMeter test setup as before, and it is pulling 1500+ IOPS from the FreeNAS NFS datastore. The FreeNAS appliance has allocated ~3-5.5GB once this VM was up and running and during the test...clearly 3GB in the previous test was not nearly enough. Further, I do not have deduplication turned on - instead I am using compression on the ZFS Dataset. In Windows, I am showing 8.6GB space used and only using 4.2GB on the NFS share:
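
The numbers in that paragraph can be checked with a few lines of Python. This sketch uses only the figures quoted above and treats the whole difference between the Windows and NFS figures as compression savings, as the post does.

```python
previous_iops = 50      # approximate figure from the earlier test
current_iops = 1500     # lower bound of the "1500+" reported here
iops_gain = current_iops / previous_iops

logical_gb = 8.6        # space used as reported inside Windows
on_disk_gb = 4.2        # space consumed on the NFS share
compression_ratio = logical_gb / on_disk_gb
space_saved_pct = 100 * (1 - on_disk_gb / logical_gb)

print(f"{iops_gain:.0f}x IOPS, {compression_ratio:.2f}:1 compression, "
      f"{space_saved_pct:.0f}% saved")
```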

Additionally, I was able to copy a file to a CIFS share (all over a 1Gb network) and saw 110-130MBps.

The testing continues in Part 3!

29. September 2012 17:27
by Jake Rutski
0 Comments

StarWind iSCSI SAN V6 Released

Although it has been available for about a month, the next major release of the StarWind iSCSI target is here! Some of the biggest features include (taken from here):

High availability:
- 3-node HA configuration. Synchronous mirroring between 3 nodes of an HA storage cluster. Such storage architecture ensures higher uptime and higher performance compared to a 2-node HA configuration.
- HA device nodes manager. You can add, remove, or switch nodes of HA cluster on the running device instead of the creation of a new HA device from scratch.
- An HA device can now use other types of StarWind devices (deduplicated, thin-provisioned IBV, or DiskBridge devices) for storing data. Thus, you can apply deduplication, thin provision, snapshot technologies, etc., for the data stored on the HA device. Experimental feature.
- ALUA. Asymmetric logical unit access is required for cases when individual nodes use very different by performance metrics storage types like SATA on one node and SSD on another. With ALUA enabled most of I/Os are served by faster node resulting less latency for I/O intensive applications.

Deduplication:
- Asynchronous replication to remote iSCSI Target over WAN as an experimental feature.
- Data deletion support (experimental feature). Unused data blocks are overwritten by the new actual data.
- Memory usage reduced by 30%. When the dedupe block size is set to 4kb, 2MB of memory are required per 1GB of stored data.

iSCSI Boot:
- StarWind can be used to build and configure environment for iSCSI boot.
- Two modes added for Snapshot and CDP device: redirect on write and redirect on write with discard. These options can be used for booting multiple clients from one image.

Event notifications:
- Free space low watermarks are now reported for thin-provisioned and deduplicated volumes.

Backup Plug-in for Hyper-V virtual machines:
- Incremental backup and delta data saving features added.

Backup Plug-in for ESX virtual machines (experimental feature):
- Full and incremental backup of virtual machines.
- ESX VMs management.
- Backup archives are saved in the native VMware format – VMDK.

 

The current version is 6.0.4768 - the biggest reason for me to upgrade to this version was the deduplication deletion support. Previously, a deduped device would continue to grow in size when data was deleted as it did not keep track of unused blocks. This would cause the .spdata file to grow rather large, especially if you do a storage VMotion or have a lot of changing data. Here's a snapshot of the dedupe device creation:

The deletion support in its current form (still an 'experimental feature') will now overwrite unused blocks with actual data. Keep in mind, though, that the container files on the StarWind server will not shrink in size - that is a feature that will likely show up in a future release.

Currently, I'm "using" around 270GB of storage which is only occupying ~68GB on disk. Bear in mind that this ratio could be better - but I am currently running both Server 2008 R2 AND Server 2012 VMs.

For best performance, set deduplication block size to 4K and do NOT use thin-provisioning on top of deduplicated devices.
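
For a sense of scale, here is a quick sketch of the dedupe ratio and the memory the 4K block size implies, using the figures quoted in this post. The release notes say 2MB of memory per 1GB of "stored data"; whether that means logical data or data on disk is not clear to me, so both bounds are computed as an assumption.

```python
logical_gb = 270        # data the VMs are "using"
on_disk_gb = 68         # approximate space occupied on the StarWind server
dedupe_ratio = logical_gb / on_disk_gb

mb_per_gb = 2           # release-note figure for a 4K dedupe block size
# Assumption: "stored data" could mean either figure, so compute both bounds.
ram_low_mb = on_disk_gb * mb_per_gb
ram_high_mb = logical_gb * mb_per_gb

print(f"{dedupe_ratio:.2f}:1, {ram_low_mb}-{ram_high_mb} MB RAM")
```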

2. April 2012 20:53
by Jake Rutski
0 Comments

Software Based Storage: Thoughts and local storage tests

It has occurred to me that all these comparisons are not exactly equal. While the VM configurations, test procedures, and testing hardware are all identical, there are certainly ways to improve performance. Some methods could be applied to all comparisons (adding a storage controller card with battery-backed write cache, and several 15K SAS spindles), and some are specific to the software presenting the storage (using an SSD to house the ZIL and\or L2ARC, which only applies to ZFS-based products). In reality, these tests are performed using the 'absolute worst case scenario' - who in their right mind would use a single (7200 RPM, non-enterprise) drive to house anything more than a music library?

All that said, I wanted to take the network and the 3rd party storage providers out of the question and repeat some tests using a local datastore on the ESX host. Here's what the datastore looked like during zeroing and OS install:

During install, latency stayed right around 40ms. Complete installation for 3 VMs took just under 1 hour. Here's the datastore latency during idle operation and the IOMeter test at the far right of the chart:

Latency was just under 20ms during idle. The first test was a single VM running the standard IOMeter worker as in previous comparisons:

This shows the local storage to be around 25 IOPS worse than the MS iSCSI target with a single IOMeter test. The 3 VM test shows where DAS takes a big hit:

The best average I saw during the test:

So in the end, local storage is clearly not the way to go (except in some very specific use cases, but that involves some gear that will NEVER be approved for a home lab...nor should it be).

18. March 2012 01:58
by Jake Rutski
4 Comments

FreeNAS Performance Part 1: NFS Storage

EDIT 1/8/2013: This post should be titled FreeNAS: The performance you will get when you don't allocate enough RAM, or enough disk resources.

These results are not a true representation of what FreeNAS can do. Here's a better example: FreeNAS Performance Part 2

------------------------------------------------------------

Following the Microsoft iSCSI VS. StarWind iSCSI, I would like to also compare another option that offers FreeBSD based network storage - FreeNAS. It supports AFP, CIFS, NFS, iSCSI and has a very user friendly web GUI - further information is available here at the FreeNAS website.

Test Specifications

The same whitebox server that was used for the StarWind and Microsoft iSCSI tests was used for the FreeNAS server - 3.00 GHz Xeon, 3GB RAM, single 1GbE interface, single 80GB spindle for both the OS and NFS export.

OS Installation Performance

Let me put it this way - after 1 hour, none of the VMs had reached more than ~48% completion. Just short of 2 hours after the install was initiated, one of the VMs had successfully installed an OS, and the other 2 had failed setup with errors. Here's some of the built in reporting for FreeNAS:

And CPU utilization:

The latency for the NFS datastore is terrible:

Running IOMeter on a single VM while the other two VMs were installing the OS (Same IOMeter worker configuration as in previous tests):

Hoping to improve performance, the other 2 VMs were powered down, and the IOMeter test was run again:

The IOPS only improved by ~100, and the VM disk IO latency is still 1700+ ms. This is confirmed again by terrible host datastore latency, with an overall average write latency of 100ms+:

 

Conclusion

FreeNAS NFS storage, when configured in the same way as all previous experiments, has worse performance than local storage.

13. March 2012 23:08
by Jake Rutski
0 Comments

Software iSCSI Targets: Part 2B - StarWind, Multiple VMs

Part 2B: 3 VM IOMeter load on a StarWind iSCSI datastore. Same procedure as previous testing - Complete install time: 32 minutes, 39 seconds - a little over 3 minutes (3 minutes, 14 seconds) faster than the 35 minutes, 53 seconds recorded for the Microsoft iSCSI target software. Here's the setup:
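
The gap between the two install times is easy to check. This sketch uses the StarWind time above and the 35 minutes, 53 seconds recorded for the Microsoft target in Part 2a; it comes out to roughly three minutes.

```python
from datetime import timedelta

starwind = timedelta(minutes=32, seconds=39)
microsoft = timedelta(minutes=35, seconds=53)   # from the Part 2a results

saved = microsoft - starwind
minutes, seconds = divmod(int(saved.total_seconds()), 60)
percent_faster = 100 * saved / microsoft

print(f"{minutes} min {seconds} s faster ({percent_faster:.0f}%)")
```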

Here's the CPU\RAM of the iSCSI server during OS install:

Just as previously, RAM is allocated to cache and CPU is heavily utilized. Here is the network utilization during OS install:

The utilization graph looks strikingly similar to the MS iSCSI target install, however, utilization goes over 30% at a few points. During the major portion of OS install - CPU utilization is high, as is the underlying physical disk queue length. This shows me that a faster disk subsystem would improve performance even further.

And a brief view of network utilization during VMware tools install on all 3 VMs:

 

Deduplication and Thin Provisioning

StarWind clearly has the advantage here. Not only is the LUN thin provisioned, but it is also deduped once data is written. The deduplication engine works inline, so as data is being written, it is deduped - with a 4K size, it is very effective. All 3 VMs should be taking up nearly 27GB of space - but as you can see, they are not - only about 8.69GB of used space. This yields a 3.15:1 dedupe ratio - which is what I would expect for 3 mostly identical servers.

 

*Note: Each VM should be taking up about 9.13GB; however, there is likely some block-level redundant data within the IOMeter test files on each VM.
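
The ratio quoted above follows directly from these figures. A short sketch of the arithmetic, using only the numbers in this section:

```python
vm_count = 3
per_vm_gb = 9.13        # expected footprint of each VM
used_gb = 8.69          # space actually consumed on the deduplicated device

expected_gb = vm_count * per_vm_gb
dedupe_ratio = expected_gb / used_gb

print(f"{expected_gb:.2f} GB expected, {dedupe_ratio:.2f}:1")
```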

 

Performance - Single VM IOMeter, 2 VMs Idle

Again, the StarWind iSCSI target showed far better performance than the Microsoft software target:

I also tested rebooting one of the idle VMs while this test was running - as expected the IOPS dropped down to the 300 range, and response time went up. The physical disk queue also jumped - again showing this as a limitation (reboot initiated at 11:09 AM).

 

All 3 VMs running IOMeter

With all 3 VMs running an IOMeter worker, the performance is still very good. Keep in mind that the IOMeter test files are 500MB - the 3 combined likely fit in the StarWind cache helping the test along. Only 2 of the 3 VMs shown:

The network utilization as the 3 tests are started:

 

Conclusion

The StarWind iSCSI target software has several clear advantages over the Microsoft solution, including a high-speed RAM cache, thin provisioning, and deduplication. With a high-performance disk subsystem that includes controller-based RAM cache as well as faster physical spindles, a major performance boost can be had. Additionally, aggregating network links and adding more RAM to the cache can produce a powerful, fast and efficient software based iSCSI storage solution.

13. March 2012 09:01
by Jake Rutski
0 Comments

Software iSCSI Targets - Part 2a: MS iSCSI - Multiple VMs

In Part 2 of this series, we will look at the performance of the Software iSCSI targets under a heavier load - more specifically, 3 server VMs. While this may not seem like very much load, keep in mind that the backend storage is still just a single 7200RPM spindle, and all networking is over a single 1GbE link. All of the hardware from test 1 is the same, but here are the specs for the three new VMs:

Windows Server 2008R2 SP1 (3x)

  • 1 vCPU, 2GB vRAM
  • 18GB system drive (thick provisioned, eager zeroed)

 

The Procedure

Create VMs, mount install ISO, begin installing OS, repeat two more times. During the install, here's what the iSCSI server looks like:

It's a bit hard to tell, but the green graph is iSCSI I/O Bytes/second, the red is iSCSI Target disk latency, and yellow is iSCSI requests/second. In the background, you can see that the CPU and RAM are fairly dormant.

The iSCSI server network adapter does not appear to utilize more than 30% of total available bandwidth. *Keep this network utilization graph in mind...the pattern will show up again...*
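
To put that 30% figure in more familiar units, here is a quick conversion. It is a rough sketch that ignores protocol overhead and takes the peak value read off the graph at face value.

```python
link_gbps = 1.0         # single 1GbE link
utilization = 0.30      # peak utilization read off the graph

# 1 Gb/s = 1000 Mb/s; divide by 8 for megabytes per second (ignores protocol overhead).
link_mb_per_s = link_gbps * 1000 / 8
observed_mb_per_s = link_mb_per_s * utilization

print(f"{observed_mb_per_s:.1f} of {link_mb_per_s:.0f} MB/s")
```

At that rate the network is nowhere near saturated, which points at the single spindle rather than the link as the limit.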

The next important bit of information: from the start of the install to a running Windows took 35 minutes, 53 seconds for 3 VMs installing simultaneously.

IOMeter testing

First, one VM running an IOMeter test, while the other two are idle:

Once again, network usage does not appear to exceed ~30% during the test. IOMeter results for 1 VM (start of test):

Average after one minute:

Next, 3 VMs running the same IOMeter workers (same as in Part 1):

After one minute:

As you can see from the IOPS performance, this may not be the best solution. In Part 2B we'll look at the same tests using a StarWind iSCSI target.