by Cindy Swearingen
Published July 2014 (updated July 2015)
At a time when cloud data storage needs are exploding, Oracle Solaris ZFS provides integrated features to reduce your storage footprint. ZFS also provides a flexible tiered storage model so that you can host important data on your fastest storage devices as well as compress and archive less-important data on standard storage devices.
Oracle Solaris ZFS provides the following ways to optimize your storage requirements in Oracle Solaris 11:
Oracle Solaris 11.3 includes the LZ4 compression algorithm, which has a better compression ratio than LZJB and is generally faster (reduced CPU overhead).
Oracle Solaris customers are reporting ZFS compression ratios in the 2x to 18x range, depending on the compressible workload. If you haven't tried ZFS compression previously because you were concerned about CPU overhead, consider enabling LZ4 compression.
When the deduplication (dedup) property is enabled on a ZFS file system, duplicate data blocks are removed as they are written to disk. Enabling deduplication consumes system and memory resources, so you must determine whether your data is dedupable and whether your system has enough resources to support the deduplication process.
Use the following procedure to enable the default ZFS compression algorithm (lzjb), which generally reduces storage footprints on enterprise-level systems, as shown in Figure 1, without adversely impacting system performance.
Figure 1. Reducing storage footprints using ZFS compression
# zfs create tank/data
# cp file.1 /tank/data/file.1
# zfs list tank/data
# zfs help compression
compression (property)
    Editable, Inheritable
    Accepted values: on | off | lzjb | gzip | gzip-[1-9] | zle | lz4
# zfs create -o compression=on tank/newdata
# cp /tank/data/file.1 /tank/newdata/file.1
# zfs list tank/data
# zfs list tank/newdata
Compare the zfs list output for the two file systems.
The size of the compressed data should be 2x–3x less than the original size of the file. If your sample data is not compressible, enabling ZFS compression is not a good fit for your data.
# zfs create -o compression=on tank/newdata
Keep in mind that if you enable compression on an existing file system, only new data is compressed.
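Before enabling compression on production file systems, you can estimate compressibility offline. The following sketch is an illustration only: it uses Python's standard zlib module at its fastest level as a rough stand-in for ZFS's lightweight LZJB/LZ4 algorithms, so the exact ratios will differ, but a ratio near 1x is a strong hint that your data is not compressible. The sample data is hypothetical.

```python
import zlib

def estimate_compression_ratio(data: bytes) -> float:
    """Rough compressibility estimate: zlib level 1 as a fast
    proxy for ZFS's lightweight LZJB/LZ4 algorithms."""
    if not data:
        return 1.0
    compressed = zlib.compress(data, 1)
    return len(data) / len(compressed)

# Hypothetical sample: read a representative file from your own
# workload instead of this synthetic, highly repetitive record.
sample = b"customer_record,2014-07-01,ACTIVE\n" * 10_000
ratio = estimate_compression_ratio(sample)
print(f"Estimated compression ratio: {ratio:.1f}x")
```

If a representative sample yields roughly 2x or better, enabling compression=on (or LZ4 on Oracle Solaris 11.3) is likely worthwhile; a ratio close to 1x suggests the data is already compressed or encrypted.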
If a ZFS file system has the dedup property enabled, duplicate data blocks are removed as data blocks are written to disk. The result is that only unique data is stored on disk and common components are shared between files, as shown in Figure 2.
Figure 2. Example of how duplicate data blocks are removed when deduplication is enabled
Deduplication can result in savings in disk space usage and cost. However, before enabling deduplication, you must ensure your data is dedupable and your system meets the memory requirements.
If your data is not dedupable, there is no point in enabling deduplication, and doing so wastes CPU resources. ZFS deduplication is in-band, which means deduplication occurs as data is written to disk and consumes both CPU and memory resources.
Deduplication tables (DDTs) consume memory and eventually spill over and consume disk space. At that point, ZFS has to perform extra read and write operations for every block of data on which deduplication is attempted. This causes a reduction in performance.
A system with a large data pool and a small amount of physical memory does not perform deduplication well. Some operations, such as removing a large file system with deduplication enabled, severely decrease system performance if the system doesn't meet the memory requirements.
Also, consider whether enabling compression on your file systems would provide a better way to reduce disk space consumption.
Use the zdb -S command to determine whether the data in your file system is dedupable.
If the estimated deduplication ratio is greater than 2, you might see space savings. In the example shown in Listing 1, the ratio is less than 2, so enabling deduplication is not recommended.
# zdb -S tank
Simulated DDT histogram:

bucket              allocated                       referenced
______   ______________________________   ______________________________
refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
------   ------   -----   -----   -----   ------   -----   -----   -----
     1    1.00M    126G    126G    126G    1.00M    126G    126G    126G
     2    11.8K    573M    573M    573M    23.9K   1.12G   1.12G   1.12G
     4      370    418K    418K    418K    1.79K   1.93M   1.93M   1.93M
     8      127    194K    194K    194K    1.25K   2.39M   2.39M   2.39M
    16       43   22.5K   22.5K   22.5K      879    456K    456K    456K
    32       12      6K      6K      6K      515    258K    258K    258K
    64        4      2K      2K      2K      318    159K    159K    159K
   128        1     512     512     512      200    100K    100K    100K
 Total    1.02M    127G    127G    127G    1.03M    127G    127G    127G

dedup = 1.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.00

Listing 1. Simulated DDT histogram generated by zdb -S
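The decision rule above can be automated by parsing the summary line of the zdb -S output. This is a minimal sketch, assuming the output keeps the "dedup = N.NN, ..." format shown in Listing 1:

```python
import re

def parse_dedup_ratio(zdb_output: str) -> float:
    """Extract the simulated dedup ratio from `zdb -S` output.

    Matches the first 'dedup = N.NN' occurrence in the summary line.
    """
    match = re.search(r"dedup\s*=\s*([\d.]+)", zdb_output)
    if match is None:
        raise ValueError("no dedup ratio found in zdb output")
    return float(match.group(1))

# Summary line taken from Listing 1.
summary = "dedup = 1.00, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.00"
ratio = parse_dedup_ratio(summary)
# A ratio greater than 2 suggests dedup may pay off; here it does not.
print(f"dedup ratio {ratio:.2f}: "
      f"{'worth testing' if ratio > 2 else 'not recommended'}")
```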
Each in-core DDT block is approximately 320 bytes. So multiply the number of allocated blocks by 320. Here's an example using the data from Listing 1:
In-core DDT size: 1.02M blocks x 320 bytes = 326.4 MB of memory required.
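The arithmetic above can be generalized. The sketch below converts a block count with a size suffix, as printed by zdb (for example, 1.02M), into an estimated in-core DDT footprint. The 320-byte figure is the approximation given above, and the suffix handling assumes zdb's binary (powers-of-two) units:

```python
# Approximate bytes of memory per in-core DDT entry (see text above).
DDT_ENTRY_BYTES = 320

_SUFFIXES = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}

def ddt_memory_mb(total_blocks: str) -> float:
    """Estimate in-core DDT memory (in MB) from a zdb-style
    block count such as '1.02M' or '523K'."""
    suffix = total_blocks[-1].upper()
    if suffix in _SUFFIXES:
        blocks = float(total_blocks[:-1]) * _SUFFIXES[suffix]
    else:
        blocks = float(total_blocks)
    return blocks * DDT_ENTRY_BYTES / 2**20  # bytes -> MB

# Total allocated blocks from Listing 1.
print(f"{ddt_memory_mb('1.02M'):.1f} MB")  # → 326.4 MB
```

Keep this figure comfortably below available physical memory (plus L2ARC), because a DDT that spills to disk forces extra read and write operations for every block on which deduplication is attempted.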
Be sure you enable dedup only for file systems that have dedupable data.
# zfs set dedup=on mypool/myfs
ZFS provides options for configuring flexible, tiered storage pools to optimize your enterprise storage, as shown in Figure 3.
Figure 3. Example of tiered storage pools
Use the following procedure to configure tiered storage pools:
Mirrored pools provide the best performance for most workloads. For example, the following syntax creates a mirrored ZFS storage pool, tank, with two mirrored components of two disks each and one spare disk:
# zpool create tank mirror disk1 disk2 mirror disk3 disk4 spare disk5
A RAIDZ pool is a good choice for archive data. For example, the following syntax creates a RAIDZ-2 pool, rzpool, with one RAIDZ-2 component of five disks and one spare disk. When the snapshot stream is sent to the new pool, compression is also enabled on the receiving file system:
# zpool create rzpool raidz2 disk1 disk2 disk3 disk4 disk5 spare disk6
# zfs snapshot -r tank/old_data@date
# zfs send -Rv tank/old_data@date | ssh sys-B zfs receive -o compression=on rzpool/oldtankdata
# zfs destroy -r tank/old_data@date
Note: You must be configured to use ssh on the other system.
Add SSDs as cache devices to an existing pool to accelerate reads of frequently accessed data:
# zpool add tank cache ssd-1 ssd-2
Add mirrored SSDs as log devices to accelerate synchronous writes:
# zpool add nfspool log mirror ssd-1 ssd-2
This article described the flexible ways that you can use ZFS to reduce your storage footprint and optimize your data storage in Oracle Solaris 11. Consider which one works best for your application data workload and your storage and system resources.
Cindy Swearingen is an Oracle Solaris Product Manager who specializes in ZFS and storage features.
Revision 1.0, 07/14/2014
Revision 1.1, 07/31/2015: Updated for Oracle Solaris 11.3