LeftHand is a "network RAID" SAN product: Linux with some custom internal software, running on completely commodity / COTS ProLiant hardware. It offers iSCSI storage with redundancy built over the network (Ethernet) across multiple servers. Each box is a separate, complete server containing an arbitrary set of drives in a RAID volume. Multiple boxes are then combined in a RAID-like setup (becoming RAIS - Redundant Array of Inexpensive Servers), offering SAN volumes with a desired RAID level using the servers themselves as lower-level storage. An example setup might consist of three boxes, each with 8 drives in RAID-5, exporting three volumes: one RAID-5 volume spanning the three servers (in effect making this a RAID-55 setup), one RAID-1 volume spanning two servers (RAID-15), and one volume served from a single server.
This has been possible in FreeBSD at least since ZFS was imported, around 3 years ago, but can also be achieved with "lesser" file systems, a volume manager and software RAID. Here is how the example setup could be achieved:
- Configure three individual servers (s1, s2, s3) with some drives; in each server, create a single large ZFS RAID pool from all the drives, or use a hardware RAID controller and create a simple ZFS pool on top of it (boot from an internal USB key if bootability or operating system disk space is an issue - lots of modern servers have internal USB ports for exactly this sort of thing, and for VMware).
- Plan your end layout. Let's say each server holds 10 TB of user-available storage; we want 6 TB from each server for the big RAID-Z volume, and the rest goes into either the RAID-1 volume or the "plain" volume.
- Use "zfs create -V" to create one 6 TB zvol and one 4 TB zvol on each server.
- Export these volumes over the network, either via iSCSI with net/istgt from ports, or with ggated(8) (second sketch below).
- Plan which node will be the "head" for each volume. You could also introduce a separate head node which only imports the iSCSI volumes, but it could become a bottleneck. Let's say that s1 will be head for the RAID-Z volume, s2 for the RAID-1 volume and s3 for the plain 4 TB volume.
- On s1, import the other two 6 TB zvols via iSCSI with iscsi_initiator(4); on s2, import the one remaining 4 TB zvol from s1; on s3, do nothing in this step.
- On s1, create a new RAID-Z volume from the one local 6 TB zvol and the two imported ones; on s2, create a new RAID-1 (mirror) ZFS volume from the one local 4 TB zvol and the one imported zvol; on s3, just use the previously created zvol directly (third sketch below).
- You can now use all of the created storage devices however you want. In LeftHand's case, the end result is again exported over iSCSI, but you can simply create a file system on the end volumes and use them locally. In retrospect, you could probably shave off quite a few heavy layers by using ZFS only for the end volumes and letting hardware RAID provide the lower-level volumes.
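To make this concrete, here is a rough command-level sketch of the whole thing. All device, pool and host names (da0-da7, tank, big, small, s1.example.com and so on) are made up for illustration; adjust them to your setup. First, on each of s1, s2 and s3, create the local pool and carve out the zvols:

    # One local raidz pool over eight drives (or create a plain pool
    # on top of a single hardware RAID device instead).
    zpool create tank raidz da0 da1 da2 da3 da4 da5 da6 da7

    # The 6 TB piece for the big RAID-Z volume, and the 4 TB piece
    # for the mirror (on s1 and s2) or the plain volume (on s3).
    zfs create -V 6T tank/big
    zfs create -V 4T tank/small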
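Next, export the pieces. istgt needs a whole istgt.conf, so for brevity this sketch takes the ggated(8) route; the idea is the same either way. FreeBSD exposes zvols as devices under /dev/zvol/<pool>/<name>:

    # /etc/gg.exports on s2 and s3: let the head node s1 import
    # the local 6 TB zvol read/write.
    s1.example.com RW /dev/zvol/tank/big

    # /etc/gg.exports on s1: let s2 import the 4 TB zvol.
    s2.example.com RW /dev/zvol/tank/small

    # Start the export daemon on every exporting box.
    ggated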
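Finally, on the head nodes, import the remote pieces and assemble the end volumes (with istgt you would import via iscsi_initiator(4) and iscontrol(8) instead; the ggate unit numbers below assume the imports happen in this order):

    # On s1: import the 6 TB zvols from s2 and s3...
    ggatec create -o rw s2.example.com /dev/zvol/tank/big   # creates ggate0
    ggatec create -o rw s3.example.com /dev/zvol/tank/big   # creates ggate1
    # ...and build the RAID-Z end volume from one local and two
    # remote pieces.
    zpool create bigpool raidz /dev/zvol/tank/big ggate0 ggate1

    # On s2: import the 4 TB zvol from s1 and mirror it with the
    # local one.
    ggatec create -o rw s1.example.com /dev/zvol/tank/small  # creates ggate0
    zpool create mirrorpool mirror /dev/zvol/tank/small ggate0

    # On s3: nothing to assemble; tank/small is used as-is.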
Why would someone use such a setup, especially since it is considerably more complex than plain DAS or SAN storage with a single level of RAID? First and foremost, it's a cheap way to introduce multi-server storage redundancy while also increasing space. If you use ZFS on the end-result volumes, you can automagically extend storage space by adding more boxes (see the sketch below). With some fancy scripting, hot failover could also be implemented.
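For example, growing the big RAID-Z pool might look like this, assuming three more boxes were added and a 6 TB zvol imported from each (again with made-up names; note that this adds a second RAID-Z vdev next to the first, rather than widening the existing one):

    # On s1, after importing the three new 6 TB zvols as ggate2-4:
    zpool add bigpool raidz ggate2 ggate3 ggate4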
Of course, Ethernet speed is an issue. A setup like this will only work well with either 10 Gbit NICs or a carefully planned network with multiple 1 Gbit NICs (which is the way the low-end LeftHand models work).
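A minimal rc.conf sketch of the multi-NIC variant, using lagg(4) link aggregation over two hypothetical em(4) ports and an LACP-capable switch (the address is made up):

    ifconfig_em0="up"
    ifconfig_em1="up"
    cloned_interfaces="lagg0"
    ifconfig_lagg0="laggproto lacp laggport em0 laggport em1 192.168.10.11/24"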
Why would you buy LeftHand when a setup like this can be done with FreeBSD (and even the saner Linuxen)? Because the LeftHand product has a GUI (albeit an awkwardly packaged one - written in Java, but with native installers and requiring its own bundled micro-version of Java rather than a generic one) which condenses all these steps into a few mouse clicks.
(On a tangential topic, ZFS v28 is ready for testing! It brings deduplication, RAIDZ3, removal of log devices, and more!)