Two Kinds of Storage

There are two basic forms of storage: server-attached and networked. Server-attached storage is either inside or directly attached to a single physical server. Some of our servers rely on such "local" storage, but the bulk of our data storage is networked.

Networked storage removes the storage from the physical server box. There are two common ways to access it: over an Ethernet network using TCP/IP as the transport, or (in the case of some SAN hardware) over fibre channel, which essentially carries SCSI commands over its own networking protocol. One benefit is that the storage and the server do not have to be in the same facility. Another is that storage can be carved up and allocated to servers as needed. In environments using only local storage, every server carries its own spare capacity, and the total amount of unused space stranded on individual servers can be significant.
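To make the point about excess local storage concrete, here is a minimal sketch of the arithmetic. The server sizes, usage figures, and the 20 percent headroom are hypothetical numbers chosen for illustration, not measurements from our environment, and the function names are invented for this example.

```python
def stranded_capacity(installed_gb, used_gb):
    """Total unused space when every server keeps its own local disks."""
    if len(installed_gb) != len(used_gb):
        raise ValueError("need one usage figure per server")
    return sum(installed - used for installed, used in zip(installed_gb, used_gb))


def pooled_requirement(used_gb, headroom_percent=20):
    """Space a shared pool would need: total usage plus one common headroom."""
    return sum(used_gb) * (100 + headroom_percent) // 100


if __name__ == "__main__":
    # Hypothetical: three servers, each bought with 500 GB of local disk.
    installed = [500, 500, 500]
    used = [100, 450, 200]
    print("Unused on local disks:", stranded_capacity(installed, used), "GB")
    print("Shared pool would need:", pooled_requirement(used), "GB")
```

With these made-up figures, half of the purchased local disk space sits idle, yet the second server is nearly full and cannot borrow from its neighbors. A shared pool sized for total usage plus headroom avoids both problems.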

Networked storage itself falls roughly into two categories. The simplest is Network Attached Storage or NAS. You could call this a "storage appliance," although some are much more sophisticated than that term would imply. The idea is that you attach a box containing a significant amount of disk space to your network, and servers (Windows, Linux, etc.) can attach to it using network file system protocols such as CIFS (Windows) or NFS (Unix/Linux).

The other alternative is a Storage Area Network or SAN, which is what we are using at Drew. A SAN is a large, high-performance central storage system. Instead of presenting file systems to servers using high-level protocols such as CIFS or NFS, a SAN presents a block-level or physical device to the server. It looks like a locally attached disk to the server, even though it could be feet, yards, or miles away. And it may not even be a single disk or disk array.
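The difference between file-level and block-level access can be sketched in a few lines. This is only an illustration: an ordinary temporary file stands in for the block device, the 512-byte block size is an assumption, and `make_fake_device` and `read_block` are hypothetical helper names. On a real server the operating system's disk driver and file system do this work.

```python
import os
import tempfile

BLOCK_SIZE = 512  # assumed block size for this illustration


def make_fake_device(num_blocks, block_size=BLOCK_SIZE):
    """Create a temporary file that stands in for a SAN-presented disk.

    Block n is filled with the byte value n (mod 256) so reads are easy to check.
    """
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as dev:
        for n in range(num_blocks):
            dev.write(bytes([n % 256]) * block_size)
    return path


def read_block(device_path, block_number, block_size=BLOCK_SIZE):
    """Block-level access: ask for block N; there are no file names involved."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * block_size)
        return dev.read(block_size)


if __name__ == "__main__":
    device = make_fake_device(num_blocks=4)
    try:
        data = read_block(device, 2)
        print(len(data), "bytes read from block 2")
    finally:
        os.remove(device)
```

With a NAS, by contrast, the server asks for a file by name (for example, opening a path on an NFS mount) and the storage box decides which blocks hold it. With a SAN, the server's own file system makes that decision and only block requests cross the storage network.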

There are several advantages to this concept, called storage virtualization. One is that a wider variety of server operating systems can take advantage of a SAN, since it just looks like disk space to the server. Reliability is another advantage: SAN hardware is generally engineered to provide uninterrupted service and a significant degree of data protection. Another benefit is performance: without the overhead of high-level file protocols, speed improves. Finally, there is allocation. A SAN disk array supporting *virtual arrays* gives you the ability to carve up your disks into any arbitrary amount of space and allocate them to specific servers. The data is distributed over all, or most, of the disks in the array. Combined with the redundancy such arrays typically build in, this increases reliability, because a single disk failure need not lead to data loss. It also increases performance, because the server is not always waiting for the same drive to physically read the next piece of data: that data is probably on another drive, which likely isn't busy at that particular instant. Other storage devices, like tape libraries (autoloaders), can be accessed over a storage network as well.
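The performance effect of spreading data across drives can be illustrated with a simple round-robin layout. This is a sketch under stated assumptions, not a description of how our array places data: real arrays use vendor-specific layouts and add redundancy, which this example ignores. The function name `locate_block` is hypothetical.

```python
def locate_block(logical_block, num_disks):
    """Map a logical block number to (disk index, block position on that disk).

    Uses plain round-robin striping with no redundancy, purely for illustration.
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    if logical_block < 0:
        raise ValueError("logical_block must not be negative")
    return logical_block % num_disks, logical_block // num_disks


if __name__ == "__main__":
    # Eight consecutive logical blocks spread over a hypothetical 4-disk array.
    for block in range(8):
        disk, position = locate_block(block, num_disks=4)
        print(f"logical block {block} -> disk {disk}, position {position}")
```

Consecutive logical blocks land on different disks, so a sequential read can keep several drives working at once instead of queuing every request behind a single one.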

Currently, we are using both 2- and 4-gigabit fibre channel for our SAN connections. As mentioned above, fibre channel is a specification for accessing storage using SCSI commands over fiber optic cable using its own communications protocol. There is also iSCSI, a fairly new development in the storage arena. iSCSI transmits SCSI commands over TCP/IP, the protocol used for most networking, including the Internet. It may prove to cost less over time. It is unclear whether iSCSI is going to displace 4-gigabit fibre channel (or soon, 8-gigabit FC) in the near term. The new Fibre Channel over Ethernet standard may allow fibre channel to take advantage of 10-gigabit Ethernet without the need for TCP/IP or switching to iSCSI.
