As Published In
Oracle Magazine
July/August 2008

DEVELOPER: Open Source


More Support for the Kernel

By Rich Schwerin

Oracle developers continue their contributions to the Linux kernel.

The kernel is an integral part of Linux that manages system resources and provides services and APIs on top of hardware for all tools and applications to use. Building on a decade-long commitment to Linux, an experienced engineering team at Oracle continues to develop key Linux kernel technology for the open source community.

"Mainline kernel features adhere to a design and implementation process in which code is reviewed and improved by a significant community of engineers before going to [kernel maintainers] Andrew Morton or Linus Torvalds," says Chris Mason, software development director at Oracle, who leads a team of Linux kernel developers. "Major features commonly take months or years for acceptance, depending on the technology and how significant a change they represent."

Mason's team is working on a variety of important new kernel development projects, including Btrfs, T10 Data Integrity Field (DIF), and Internet Protocol version 6 (IPv6) support for Linux NFS.

Building a Btrfs

Linux is hindered by file systems that haven't kept pace with modern, higher-capacity storage in scalability and efficiency. They lack not just the capability to address higher-capacity storage but also the ability to manage that storage, keep track of it all, and get the best possible performance from it. Most important, says Mason, Linux needs a file system that can deal with inevitable faults and errors.

"It's fairly easy to create a 16- or 32-terabyte file system, but when there are errors in the underlying storage, you have to do a very long-running and expensive file system check, resulting in quite a bit of downtime, all of which becomes dramatically worse, the larger the storage," says Mason. "And as storage capacity continues to grow exponentially, it's clear that Linux needs a file system built with the future in mind." This is where Btrfs (pronounced butter-FS ) comes in.

"One of the key features in Btrfs is the ability to detect any errors in the file system before it uses the data; if the disk returns a bad block or incorrect data, the file system should be able to detect that and do something about it preemptively," says Mason. "Btrfs accomplishes this by adding check-summing to all of the data and metadata in the file system and maintains multiple copies of crucial file system structures."

Other features of Btrfs include efficient writable snapshots, an online file system check that detects and deals with errors online, and storage management capable of controlling numerous disks in the system with intelligent mirroring and striping.
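
The checksum-and-multiple-copies idea Mason describes can be illustrated with a small sketch. This is not Btrfs code: the MirroredStore class, its two in-memory mirrors, and the use of zlib.crc32 as the checksum are all assumptions made for illustration. The sketch only shows the general pattern of verifying a checksum before data is used and falling back to a good copy when one copy is bad.

```python
import zlib


class MirroredStore:
    """Toy block store (hypothetical): each block is kept on two mirrors
    together with a checksum computed when the block was written."""

    def __init__(self):
        # Each mirror maps block number -> (checksum, payload).
        self.mirrors = [{}, {}]

    def write(self, block_no: int, payload: bytes) -> None:
        checksum = zlib.crc32(payload)  # stand-in checksum, an assumption
        for mirror in self.mirrors:
            mirror[block_no] = (checksum, payload)

    def read(self, block_no: int) -> bytes:
        """Return data only after its checksum verifies; repair bad copies."""
        good = None
        bad_mirrors = []
        for mirror in self.mirrors:
            checksum, payload = mirror[block_no]
            if zlib.crc32(payload) == checksum:
                if good is None:
                    good = (checksum, payload)
            else:
                bad_mirrors.append(mirror)
        if good is None:
            raise IOError(f"block {block_no}: every copy failed its checksum")
        # Overwrite any corrupted copy with the verified one.
        for mirror in bad_mirrors:
            mirror[block_no] = good
        return good[1]


store = MirroredStore()
store.write(7, b"important data")

# Simulate a disk silently returning bad data from the first mirror.
stored_checksum, _ = store.mirrors[0][7]
store.mirrors[0][7] = (stored_checksum, b"important dat\x00")

recovered = store.read(7)
print(recovered)
```

The caller never sees the corrupted copy: the read path detects the mismatch, returns the verified data from the second mirror, and rewrites the damaged copy.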

Fielding Data Integrity

Oracle is also contributing to the Linux platform in the area of enterprise-class data integrity. It is working on an interface to expose the T10 DIF standard to the Linux kernel and end-user applications. DIF technology in the Linux kernel will enable applications and kernel subsystems to take advantage of crucial data integrity features such as standardized protection metadata, for reduced system downtime and cost savings for end users.

"Data corruption is commonly due not only to bit rot on physical disks but also to transmission errors or bugs in the I/O path between the application and the drive. The DIF project aims to prevent corrupted data buffers from being written to disk in the first place," says Mason.

Two common corruption scenarios are bad buffer writes, where the write ends up in the right place on the disk but the data written is not what the application sent, and misdirected writes, where the write buffer contains good data but that data ends up being written to the wrong location on disk.

"The DIF is an addition to the SCSI specification that standardizes the contents of the protection metadata and enables extra information to be sent and received from the host controller as well as verified along the chain of devices," says Mason.

In cooperation with Emulex and other key storage industry partners, Oracle has developed an infrastructure that takes the DIF specification a step further, enabling the protection metadata to be exposed to the operating system as well as the application. The Linux DIF project enables applications or kernel subsystems to attach checksums to I/O operations, enabling devices that support DIF to verify the integrity before passing them further down the stack and physically committing them to disk.
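
To make the two corruption scenarios concrete, here is a minimal sketch of protection metadata attached to a 512-byte sector. The article does not spell out the metadata layout; the sketch assumes the commonly described DIF tuple of 8 bytes per sector (a 16-bit guard CRC over the data, a 16-bit application tag, and a 32-bit reference tag holding the low bits of the target block address). The function names make_protection and verify are hypothetical, and this is not the Linux kernel interface.

```python
import struct

SECTOR_SIZE = 512


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 using polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_protection(sector: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Build the assumed 8-byte tuple: guard CRC, application tag, reference tag."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("sector must be exactly 512 bytes")
    guard = crc16_t10dif(sector)
    ref_tag = lba & 0xFFFFFFFF  # low 32 bits of the logical block address
    return struct.pack("!HHI", guard, app_tag, ref_tag)


def verify(sector: bytes, protection: bytes, lba: int) -> str:
    """Check a sector against its tuple at the address it is being written to."""
    guard, _app_tag, ref_tag = struct.unpack("!HHI", protection)
    if guard != crc16_t10dif(sector):
        return "bad buffer"          # data is not what the sender checksummed
    if ref_tag != (lba & 0xFFFFFFFF):
        return "misdirected write"   # good data, wrong location
    return "ok"


sector = bytes(range(256)) * 2
protection = make_protection(sector, lba=1000)

corrupted = b"\xff" + sector[1:]
print(verify(sector, protection, lba=1000))
print(verify(corrupted, protection, lba=1000))
print(verify(sector, protection, lba=1001))
```

Because the tuple travels with the data, any component along the I/O path can run the same check and reject the write before it reaches the disk.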

Stepping Up NFS to IPv6

Oracle Linux kernel engineers are also focused on a third area: enabling the Linux Network File System (NFS) to run natively on IPv6 networks. IPv6 is a network layer protocol for packet-switched networks, including the internet; IPv6 compatibility is a key requirement for software purchased by the U.S. federal government after 2008.

Currently, 32-bit IPv4 addresses allow about 4.3 billion unique host addresses, which, even with network address translation (NAT) routers, aren't enough to support the rapidly expanding population of networked computers. IPv6 provides 128-bit addresses, creating an astronomically sized address space that enables every computer to have its own globally unique internet address and connect directly without NAT.
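
The difference in scale between the two address sizes is easy to check with a few lines of arithmetic. The sketch below uses Python's standard ipaddress module; the sample addresses come from the reserved documentation ranges and are illustrative only.

```python
import ipaddress

# Total number of addresses in each protocol's full address space.
ipv4_total = ipaddress.ip_network("0.0.0.0/0").num_addresses
ipv6_total = ipaddress.ip_network("::/0").num_addresses

print(f"IPv4: {ipv4_total:,} addresses")
print(f"IPv6: {ipv6_total:,} addresses")
print(f"IPv6 is 2**{(ipv6_total // ipv4_total).bit_length() - 1} times larger")

# Address width in bits for a sample address of each kind.
sample_v4 = ipaddress.ip_address("192.0.2.1")
sample_v6 = ipaddress.ip_address("2001:db8::1")
print(sample_v4.max_prefixlen, sample_v6.max_prefixlen)
```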

Under IPv4, routers have to read most or all of a header before making a routing decision, but with IPv6, headers have a new layout that makes packet routing much more efficient, resulting in better scalability for handling more traffic. Additional IPv6 features include support for mobility, such as automatic globally unique addresses for every device you own; IP security (IPsec), a suite of protocols for securing IP via authentication and/or encryption; and integrated quality-of-service support that allows resource reservation control and prioritization for different applications, users, and data flows.
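
One way to see what the new header layout means in practice: the basic IPv6 header has a fixed length of 40 bytes, so every field sits at a constant offset, whereas the IPv4 header has a variable length and carries a header checksum. The sketch below packs and parses that fixed header with Python's struct module. The helper names build_header and parse_header are hypothetical, extension headers are ignored, and the addresses are documentation examples.

```python
import ipaddress
import struct

# Fixed IPv6 header: 4 bytes (version, traffic class, flow label), 2-byte
# payload length, 1-byte next header, 1-byte hop limit, two 16-byte addresses.
IPV6_HEADER = struct.Struct("!IHBB16s16s")


def build_header(payload_len, next_header, hop_limit, src, dst,
                 traffic_class=0, flow_label=0):
    """Pack the 40-byte fixed header (illustrative helper)."""
    first_word = (6 << 28) | (traffic_class << 20) | flow_label
    return IPV6_HEADER.pack(
        first_word, payload_len, next_header, hop_limit,
        ipaddress.IPv6Address(src).packed,
        ipaddress.IPv6Address(dst).packed,
    )


def parse_header(packet: bytes) -> dict:
    """Read every field from its constant offset in the first 40 bytes."""
    first_word, payload_len, next_header, hop_limit, src, dst = (
        IPV6_HEADER.unpack_from(packet)
    )
    return {
        "version": first_word >> 28,
        "traffic_class": (first_word >> 20) & 0xFF,
        "flow_label": first_word & 0xFFFFF,
        "payload_len": payload_len,
        "next_header": next_header,
        "hop_limit": hop_limit,
        "src": ipaddress.IPv6Address(src),
        "dst": ipaddress.IPv6Address(dst),
    }


header = build_header(20, 6, 64, "2001:db8::1", "2001:db8::2",
                      flow_label=0x12345)
fields = parse_header(header)
print(IPV6_HEADER.size, fields["version"], fields["dst"])
```

A forwarding decision needs only the destination address and hop limit, both of which are always found at the same position in the packet.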

"NFS is a very large code base encompassing several different subsystems in the Linux kernel, so IPv6-enabling all of those things is a fairly large project and it's very difficult to get all of those improvements in and maintain compliance with all the related standards," says Mason. "NFS is supported by many different operating systems and storage devices, making standards compliance and interoperability testing a crucial part of any improvements. We just have to go component by component through the stack, IPv6-enabling things as we go."

The NFS-for-IPv6 project is in the final stages, and NFS support for IPv6 is expected to be in the Linux 2.6.27 mainline kernel.

Kernel Results

What is the bottom line of Oracle's contributions to the Linux kernel? "Oracle's significant Linux kernel development work with file systems, data integrity, and other contributions will prove beneficial to all Linux users," says Jim Zemlin, executive director, Linux Foundation.

 


Rich Schwerin (rich.schwerin@oracle.com) is the Linux, virtualization, and open source senior product marketing manager with Oracle technology marketing.
