ZFS (originally referred to as Zettabyte File System) is a modern file system specifically designed to add features not available in traditional file systems. It was originally developed at Sun Microsystems. The features of ZFS include protection against data corruption, support for high storage capacities, efficient data compression, snapshots and copy-on-write clones, continuous integrity checking and automatic repair, deduplication, and more. It is safe, simple, efficient, and dynamic. ZFS is still evolving, and new features will appear regularly.
ZFS is a 128-bit file system and volume manager that uses copy-on-write (CoW): when data is changed, it is not overwritten in place. It is always written to a new block and checksummed before the pointers to the data are changed. The old data may be retained, creating snapshots of the file system through time as changes are made. These ZFS snapshots are created very quickly, since all the data composing the snapshot is already stored. They are also space efficient, since any unchanged data is shared among the file system and its snapshots.
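The copy-on-write behavior described above can be illustrated with a small Python sketch. This is a toy model, not how ZFS is implemented: the `CowStore` class, its method names, and the use of SHA-256 are all assumptions made for illustration.

```python
import hashlib


class CowStore:
    """Toy copy-on-write store: blocks are never overwritten in place."""

    def __init__(self):
        self.blocks = {}     # block id -> (data, checksum); append-only
        self.live = {}       # file name -> block id (the "pointers")
        self.snapshots = {}  # snapshot name -> frozen copy of the pointers
        self._next_id = 0

    def write(self, name, data):
        # Write the data to a new block and checksum it first ...
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = (data, hashlib.sha256(data).hexdigest())
        # ... and only then switch the pointer to the new block.
        self.live[name] = block_id

    def snapshot(self, snap_name):
        # A snapshot copies pointers only; no data blocks are duplicated.
        self.snapshots[snap_name] = dict(self.live)

    def read(self, name, snap_name=None):
        pointers = self.live if snap_name is None else self.snapshots[snap_name]
        data, checksum = self.blocks[pointers[name]]
        if hashlib.sha256(data).hexdigest() != checksum:
            raise IOError("checksum mismatch")
        return data


store = CowStore()
store.write("a.txt", b"version 1")
store.write("b.txt", b"unchanged")
store.snapshot("monday")
store.write("a.txt", b"version 2")  # the old block is kept for the snapshot
```

After the last write, the live file system sees the new contents of `a.txt`, the `monday` snapshot still sees the old contents, and the unchanged `b.txt` block is shared by both.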
ZFS also avoids writing duplicate data, through a process called deduplication. This can improve storage efficiency.
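The idea behind deduplication can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the `DedupStore` class and its reference counting are hypothetical, and blocks are identified here by their SHA-256 checksum purely for illustration.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps only one copy of identical blocks."""

    def __init__(self):
        self.blocks = {}    # checksum -> data, stored once
        self.refcount = {}  # checksum -> number of references to the block

    def put(self, data):
        key = hashlib.sha256(data).hexdigest()
        if key not in self.blocks:
            # First time this content is seen: actually store it.
            self.blocks[key] = data
        # Duplicate content only adds a reference, not a second copy.
        self.refcount[key] = self.refcount.get(key, 0) + 1
        return key

    def get(self, key):
        return self.blocks[key]


dedup = DedupStore()
key_a = dedup.put(b"same payload")
key_b = dedup.put(b"same payload")   # duplicate, not stored again
key_c = dedup.put(b"other payload")
```

Three writes were issued, but only two distinct blocks are stored; the two identical writes return the same key.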
ZFS does away with the concept of disk volumes, partitions and disk provisioning by adopting pooled storage. In a traditional Unix file system, you need to define the partition size and mount point at file system creation time. With ZFS, by contrast, you feed disks to a “pool” and create file systems from the pool as needed. Because of this pooling of disks, all available hard drives in a system are essentially joined together. The combined bandwidth of the pooled devices is also available to ZFS, which effectively maximizes storage space, speed and availability. A ZFS pool scales to exabytes of storage; one exabyte is a billion gigabytes.
High-performance SSDs can be added to the storage pool to create a hybrid storage pool. When these are configured as cache devices, ZFS uses them as a second-level read cache called the L2ARC (Level 2 Adaptive Replacement Cache) to hold frequently accessed data and improve read performance. SSDs can also be configured as log devices for the ZFS Intent Log (ZIL), which records synchronous writes that have to be stored immediately. This data is then written to the conventional hard drives in the pool for more permanent storage when time and resources allow.
ZFS can scrub all the data in a storage pool, checking each piece of data against its corresponding checksum to verify its integrity, detect any silent data corruption, and correct any errors it encounters where possible.
Unlike a simple disk block checksum, this scrubbing can detect phantom writes, misdirected reads and writes, DMA (Direct Memory Access) parity errors, driver bugs and accidental overwrites as well as traditional “bit rot.” These scrubs are I/O intensive, so they should be scheduled appropriately.
Typically, scrubbing is given low I/O priority so that it has a minimal effect on system performance and can operate while the storage pool is in use. Reading the scrub results can provide an early indication of possible disk failure.
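The scrub-and-repair cycle described above can be sketched as follows. This is a minimal illustration that assumes a two-way mirror in which every disk holds every block; the `scrub` function, the dictionary layout, and the use of SHA-256 are assumptions made for the example, not the ZFS on-disk format.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def scrub(mirrors, checksums):
    """Verify every block on every mirror and repair bad copies.

    mirrors:   list of dicts, one per disk, mapping block id -> bytes
    checksums: dict mapping block id -> expected checksum
    Returns (repaired, unrecoverable), where repaired is a list of
    (disk index, block id) pairs and unrecoverable is a list of block ids.
    """
    repaired, unrecoverable = [], []
    for block_id, expected in checksums.items():
        # Look for any copy whose checksum still matches.
        good = next(
            (m[block_id] for m in mirrors if checksum(m[block_id]) == expected),
            None,
        )
        if good is None:
            unrecoverable.append(block_id)  # every copy is damaged
            continue
        # Overwrite each damaged copy with the known-good one.
        for index, mirror in enumerate(mirrors):
            if checksum(mirror[block_id]) != expected:
                mirror[block_id] = good
                repaired.append((index, block_id))
    return repaired, unrecoverable


disk_a = {0: b"alpha", 1: b"beta"}
disk_b = {0: b"alpha", 1: b"beta"}
expected = {block_id: checksum(data) for block_id, data in disk_a.items()}

disk_b[1] = b"bet4"  # simulate silent corruption on the second disk
repaired, unrecoverable = scrub([disk_a, disk_b], expected)
```

Here the scrub finds that block 1 on the second disk no longer matches its checksum and rewrites it from the intact copy on the first disk. A growing list of repairs on one disk is the kind of early warning the scrub results can provide.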
Our ZFS product management team has been engaged from the initial development of ZFS all the way through its implementation as the underlying storage of Lustre. We can discuss the many different distributions to help you determine which one is best suited to your specific requirements. Please call us to discuss.