by Margaret Bierman with Lenz Grimmer
Published July 2012
This article describes the basic capabilities that I discovered while becoming familiar with the Btrfs file system in Oracle Linux, plus the instructions I used to create a file system, verify its size, create subdirectories, and perform other basic administrative tasks. A second article describes how I use the advanced capabilities of the Btrfs file system.
The Oracle Linux operating system provides advanced methods for storing and organizing data on disk storage systems, such as the ext3, ext4, and XFS file systems, the Oracle Cluster File System 2 (OCFS2) clustered file system, and the next-generation Btrfs file system. Each has its own characteristics and feature sets, allowing administrators to select the one that best fits data storage needs and requirements.
The Btrfs file system provides the following advanced capabilities:
Our research is based on the version of Btrfs available in Oracle Linux 6 with the Unbreakable Enterprise Kernel Release 2 (Version 2.6.39).
Created by Chris Mason at Oracle, the initial design for Btrfs has its roots in a presentation by Ohad Rodeh about copy-on-write friendly B-tree implementations at the USENIX FAST '07 conference. Mason based the Btrfs design on his experience developing the ReiserFS file system (extent-based storage, packing of small files) and the idea to store data and metadata in B-tree structures. After several months of internal development, Btrfs was presented to the Linux community in June 2007. Since then, Oracle engineers have continued to maintain and advance its development. They work in close collaboration with many contributors from the Linux community, including engineers from Linux distributors, such as Red Hat and SUSE, and other companies, such as Dreamhost, Fujitsu, HP, IBM, and Intel. Today, Btrfs is included in the mainline Linux kernel and is gaining popularity through several Linux distributions, including Oracle Linux.
When researching Btrfs, I discovered it has a wealth of functionality built into it.
-o autodefrag mount option that enables auto-defragmentation. Another is the ability to disable copy-on-write via the nocow option, which can help to minimize fragmentation, particularly for files with sequential access requirements, such as databases and streaming media. In this mode, file blocks are overwritten in place, similar to traditional file systems.
Btrfs uses a number of built-in features to ensure data integrity.
dm_crypt disk encryption subsystem and Linux Unified Key Setup (LUKS) layer, which support a variety of encryption standards. However, this approach disables some of the capabilities and advantages of using Btrfs on raw block devices, such as automatic solid-state disk support and detection.
Btrfs supports compression on a per-mount basis. It can be enabled at any time after the subvolume is created. Once enabled, Btrfs automatically tries to compress files using LZO or zlib compression. (Other compression algorithms, such as Snappy and LZ4, are in development.) If a file does not compress well, it is marked as not compressible and written to disk uncompressed. In this case, Btrfs does not make additional compression attempts. A force-compress option is available that tries to compress new writes in case newly added file content can be compressed.
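One way to confirm which of these options are active on a mounted file system is to read its option list in /proc/mounts. The following is a minimal sketch rather than part of the Btrfs tooling: has_mount_opt is a hypothetical helper name, and the sample line assumes a file system mounted on /mnt with the compress=lzo and autodefrag options. The compress-force name used in the loop is the mount-option spelling of the force-compress behavior.

```shell
#!/bin/sh
# has_mount_opt MOUNTPOINT OPTION [MOUNTS_FILE]   (hypothetical helper)
# Succeeds if OPTION appears in the mount options for MOUNTPOINT.
# Matches both "option" and "option=value" forms.
has_mount_opt() {
  awk -v mp="$1" -v opt="$2" '
    $2 == mp {
      n = split($4, o, ",")          # field 4 holds the comma-separated options
      for (i = 1; i <= n; i++) {
        split(o[i], kv, "=")
        if (kv[1] == opt) found = 1
      }
    }
    END { exit(found ? 0 : 1) }
  ' "${3:-/proc/mounts}"
}

# Sample data in /proc/mounts format (assumed values, for illustration only).
sample=$(mktemp)
echo '/dev/sdb /mnt btrfs rw,relatime,compress=lzo,autodefrag 0 0' > "$sample"

for opt in compress autodefrag compress-force; do
  if has_mount_opt /mnt "$opt" "$sample"; then
    echo "$opt: enabled"
  else
    echo "$opt: not set"
  fi
done
```

On a live system, omit the third argument so that the function reads /proc/mounts directly. The options themselves are supplied at mount time, for example with mount -o compress=lzo /dev/sdb /mnt.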
Btrfs provides functionality and device support designed to improve file system performance characteristics.
-o autodefrag) that enables an auto-defragmentation helper. When a block is copied and written to disk, the auto-defragmentation helper marks that portion of the file for defragmentation and hands it off to another thread, enabling fragmentation to be reduced automatically in the background. This capability can provide significant benefit to small database workloads, browser caches, and similar workloads.
The copy-on-write nature of Btrfs makes it easy for the file system to provide several features that facilitate the replication, migration, backup, and restoration of information.
cp --reflink command, which does for files what snapshots do for volumes.
btrfs subvolume find-new) that identifies the files that have changed on a given subvolume. We find using this feature to be faster than traversing the entire file system with the find -mtime command to locate changed files.
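For comparison, the traversal-based approach looks like the sketch below, which runs on any file system. The directory and file names are made up for the demonstration. The Btrfs alternative appears only as a comment because it needs a mounted subvolume and a generation number, and the value 12 shown there is a placeholder, not a measured result.

```shell
#!/bin/sh
# Build a small throwaway tree with one old file and one fresh file.
demo=$(mktemp -d)
touch -t 201201010000 "$demo/old.txt"   # backdated to January 2012
touch "$demo/new.txt"                   # modified just now

# Traditional approach: walk every entry and test its modification time.
# -mtime -1 selects files modified less than one day ago.
changed=$(find "$demo" -type f -mtime -1)
echo "$changed"

# Btrfs approach (not run here): ask the file system what changed since a
# given generation instead of walking the tree.
#   btrfs subvolume find-new /mnt/subbasefoo 12
```

The find command has to visit every file to answer the question, which is why its cost grows with the size of the tree rather than with the number of changes.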
Btrfs is managed primarily using command-line utilities. The only dedicated GUI tools available focus on operating system installation and basic support capabilities. Access to the advanced features of the file system generally is not provided. Table 1 lists the key Btrfs administrative commands.
Table 1. Btrfs Administrative Commands
|Action|Command|
|Initialize a file system|mkfs.btrfs|
|Administer an existing file system|btrfs|
I found getting started with Btrfs to be very simple. To create file systems, you need root privileges (through the sudo command or by otherwise becoming the root user) and unused disk devices attached to the system. The first step is to create a Btrfs file system using the mkfs.btrfs command. For example, I created a 10 GB file system that spans two physical 5 GB disks (/dev/sdb and /dev/sdc), using default file system configuration parameters.
/* Create a Btrfs file system on two devices using default options */
# mkfs.btrfs /dev/sdb /dev/sdc
adding device /dev/sdc id 2
fs created label (null) on /dev/sdb
nodesize 4096 leafsize 4096 sectorsize 4096 size 10.00GB
Btrfs Btrfs v0.19
Next, I used the btrfs filesystem show command to verify the file system was created on the two devices.
/* Display the file system configuration */
# btrfs filesystem show /dev/sdb
Label: none uuid: b4f5c9a8-d8ec-4a5b-84f0-2b8c8d18b257
Total devices 2 FS bytes used 28.00KB
devid 1 size 5.00GB used 1.53GB path /dev/sdb
devid 2 size 5.00GB used 1.51GB path /dev/sdc
Btrfs Btrfs v0.19
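When this check needs to be scripted, the devid lines can be totaled with awk. The sketch below works on a copy of the output shown above rather than on a live system, and it assumes every device size is reported in GB, as it is in this example. Sizes printed in other units would need extra handling.

```shell
#!/bin/sh
# Output of "btrfs filesystem show /dev/sdb", copied from the session above.
show_output='Label: none uuid: b4f5c9a8-d8ec-4a5b-84f0-2b8c8d18b257
Total devices 2 FS bytes used 28.00KB
devid 1 size 5.00GB used 1.53GB path /dev/sdb
devid 2 size 5.00GB used 1.51GB path /dev/sdc'

# Count the member devices and add up their sizes (field 4 on devid lines).
# Assumption: all sizes carry a GB suffix.
summary=$(printf '%s\n' "$show_output" | awk '
  $1 == "devid" { n++; sub(/GB$/, "", $4); total += $4 }
  END { printf "%d devices, %.2fGB total", n, total }')

echo "$summary"   # prints: 2 devices, 10.00GB total
```

On a live system, replace the sample text with the output of btrfs filesystem show piped into the same awk program.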
The next step is to make the file system visible to the operating system so that it can be used. I used the standard Oracle Linux mount command to mount the file system on /mnt. Note that only the first device that comprises the file system needs to be mounted.
/* Mount the newly created file system */
# mount /dev/sdb /mnt
/* Note: only the first device needs to be mounted. Btrfs takes care of the rest. */
# mount /dev/sdc /mnt
mount: /dev/sdc already mounted or /mnt busy
mount: according to mtab, /dev/sdb is already mounted on /mnt
Btrfs Btrfs v0.19
Next, I used the standard Oracle Linux df command to verify the size of the file system, followed by the btrfs filesystem df command to get more detailed file system information.
/* Display the size of the file system */
# df -h /mnt
Filesystem Size Used Avail Use% Mounted on
/dev/sdb 10G 56K 8.0G 1% /mnt
/* Get more detailed information */
# sudo btrfs filesystem df /mnt
Data, RAID0: total=1.00GB, used=0.00
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=24.00KB
Metadata: total=8.00MB, used=0.00
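The detailed listing shows which allocation profile each block group type uses: here, data is striped (RAID0) and metadata is mirrored (RAID1) across the two devices. The sketch below pulls that mapping out of a copy of the output above. Treating lines that name no profile as "single" is my assumption about this output format, not something the tool prints.

```shell
#!/bin/sh
# Output of "btrfs filesystem df /mnt", copied from the session above.
df_output='Data, RAID0: total=1.00GB, used=0.00
Data: total=8.00MB, used=0.00
System, RAID1: total=8.00MB, used=4.00KB
System: total=4.00MB, used=0.00
Metadata, RAID1: total=1.00GB, used=24.00KB
Metadata: total=8.00MB, used=0.00'

# Split each line at ": " and then split the heading at ", " to separate the
# block group type from its profile. Lines without a profile are labeled
# "single" (an assumption made for this illustration).
profiles=$(printf '%s\n' "$df_output" | awk -F': ' '
  {
    n = split($1, head, ", ")
    profile = (n == 2) ? head[2] : "single"
    print head[1] "=" profile
  }')

echo "$profiles"
```

Each output line pairs a type with a profile, such as Data=RAID0 or Metadata=RAID1, which is easier to test in a script than the raw listing.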
Once it was created and verified, I put the Btrfs file system to work. First, I created a subvolume (a named B-tree that holds directories and files) named subbasefoo.
/* Create a subvolume named subbasefoo */
# btrfs subvolume create subbasefoo
Create subvolume './subbasefoo'
Next, I created three empty files (foobar1, foobar2, and foobar3) in the subbasefoo subvolume using the standard Oracle Linux touch command.
/* Create three empty files named foobar1, foobar2, and foobar3 */
# touch foobar1 foobar2 foobar3
I wanted to determine how best to keep data safe. First, I created a snapshot of the subvolume using the btrfs subvolume snapshot command and verified its existence and contents using the standard Oracle Linux ls command. I named the snapshot subbasefoo-20120501.
/* Create a snapshot of the subbasefoo subvolume */
# btrfs subvolume snapshot subbasefoo/ subbasefoo-20120501
Create a snapshot of 'subbasefoo/' in './subbasefoo-20120501'
/* Verify the existence and contents of the snapshot by doing a recursive listing */
# ls -R
.:
subbasefoo subbasefoo-20120501
./subbasefoo:
foobar1 foobar2 foobar3
./subbasefoo-20120501:
foobar1 foobar2 foobar3
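The snapshot name above follows a subvolume-plus-date pattern. If snapshots are taken regularly, the name can be generated instead of typed. This is a small sketch in which snapshot_name is a hypothetical helper. It only prints the command it would run, so it is safe to try without a Btrfs file system.

```shell
#!/bin/sh
# snapshot_name SUBVOLUME   (hypothetical helper)
# Prints SUBVOLUME followed by today's date as YYYYMMDD.
snapshot_name() {
  printf '%s-%s\n' "$1" "$(date +%Y%m%d)"
}

name=$(snapshot_name subbasefoo)

# Show the command rather than executing it; run it as root on a mounted
# Btrfs file system to create the snapshot.
echo "btrfs subvolume snapshot subbasefoo/ $name"
```

Because the date sorts lexically, a plain ls lists such snapshots in chronological order.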
Since snapshots persist until removed, deleting a file in the subbasefoo subvolume does not release any storage space while a snapshot still references that file. Keep in mind that disk space cannot be freed until all snapshots that reference the files in question are removed.
Snapshots are just subvolumes. As a result, all of the same commands apply. To emulate an "undo" facility, always create a new snapshot for experimentation. If you like the result, simply delete the previous generation snapshot. If you do not like the result, just delete the experimental version. It's a handy feature.
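The undo pattern described above can be sketched as three small shell functions. The function names, the -experiment suffix, and the DRY_RUN switch are all hypothetical choices made for this illustration. By default the script only prints the btrfs commands, because running them requires root privileges and a mounted Btrfs file system.

```shell
#!/bin/sh
# With DRY_RUN=1 (the default) commands are printed, not executed.
DRY_RUN=${DRY_RUN:-1}
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Start an experiment: snapshot the subvolume and make changes in the copy.
try_change() {
  run btrfs subvolume snapshot "$1" "$1-experiment"
}

# Happy with the result: delete the previous generation.
keep_change() {
  run btrfs subvolume delete "$1"
}

# Unhappy with the result: delete the experimental version.
discard_change() {
  run btrfs subvolume delete "$1-experiment"
}

try_change subbasefoo
discard_change subbasefoo
```

Printing the commands first makes it easy to review exactly which subvolume would be deleted before setting DRY_RUN=0.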
When it is necessary to have a zero-space copy of a single file, use the --reflink option of the cp command. For example, the following commands verify the size of the subbasefoo subvolume and clone the file named rantest.tst (creating the file clonetest.tst). Subsequent use of the df command shows that the file clone does not consume additional disk space.
/* Create a clone of a 200 MB single file named rantest.tst */
# df -h .
Filesystem Size Used Avail Use% Mounted on
- 10G 201M 7.8G 3% /mnt/btrfs/subbasefoo
# cp --reflink rantest.tst clonetest.tst
# df -h .
Filesystem Size Used Avail Use% Mounted on
- 10G 201M 7.8G 3% /mnt/btrfs/subbasefoo
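A plain --reflink fails on file systems that cannot share blocks. For scripts that must run on both Btrfs and other file systems, GNU cp also accepts --reflink=auto, which clones when it can and falls back to an ordinary copy otherwise. The sketch below assumes GNU coreutils and uses a temporary directory and a 1 MB test file (much smaller than the 200 MB file above) so that it runs anywhere.

```shell
#!/bin/sh
# Create a scratch directory and a 1 MB file of random data.
workdir=$(mktemp -d)
dd if=/dev/urandom of="$workdir/rantest.tst" bs=1024 count=1024 2>/dev/null

# Clone if the file system supports it; otherwise copy normally.
cp --reflink=auto "$workdir/rantest.tst" "$workdir/clonetest.tst"

# Either way, the contents must be identical.
if cmp -s "$workdir/rantest.tst" "$workdir/clonetest.tst"; then
  echo "clone matches original"
fi
```

Only on a copy-on-write file system such as Btrfs does the second file share blocks with the first. Elsewhere the fallback copy consumes its full size, so the df comparison shown above applies to the Btrfs case only.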
My research into Btrfs revealed that it addresses long-standing deficiencies found in conventional file systems. Better yet, setting up and using a Btrfs file system is quick and easy, particularly if default configuration parameters are used. These defaults provide a reasonable amount of data protection and improved functionality, and they require little or no extra effort compared to the default file system. Many advanced features are in place to help improve data integrity and reliability, unify volume management, increase device utilization, and more. In our view, it is the best file system to use when deploying Oracle Linux platforms. As always, the choice is up to you.
The following resources provide more information on the capabilities of Btrfs:
Lenz Grimmer is a member of the Oracle Linux product management team. He has been involved in Linux and Open Source Software since 1995.
Margaret Bierman is a senior writer and trainer specializing in the research and development of technical marketing collateral for high-tech companies. Prior to writing, she worked as a software engineer on optical storage systems, specializing in the development of cross-platform file systems, hierarchical storage management systems, device drivers, and controller firmware. Margaret was also heavily involved in related standards committees, as well as training ISVs and helping them implement solutions. She received a B.S. in computational mathematics from Rensselaer Polytechnic Institute.