By Ginny Henningsen, May 2011
This is the second article in a series highlighting best practices in Oracle Solaris 11 Express. The first article, Updating Software, introduced the Image Packaging System (IPS) software packaging model and discussed the best practice for performing updates: creating a new Boot Environment (BE) before applying an update. In some cases, such as with updates and full upgrades, IPS creates a new BE automatically.
BEs are a built-in safety net when you make software changes, much like Live Upgrade environments in Oracle Solaris 10. In Oracle Solaris 11 Express, the root file system is implemented using Oracle Solaris ZFS, so BEs are basically Oracle Solaris ZFS snapshots that are readable/writable and activated for booting. Because of this underlying technology, you can periodically generate snapshots of BEs just as you can take a snapshot of any Oracle Solaris ZFS volume.
This article describes how Oracle Solaris 11 Express uses automated snapshots to deliver Time Slider services, a new feature in the GNOME desktop. Using Time Slider, you can take periodic snapshots of the active BE, capturing the software state at regular intervals. This approach might prove useful if you forget to explicitly create a BE when you update software.
Since the history of software updates can be especially helpful in troubleshooting, this article also highlights commands for researching package changes. Note that all examples presume an authorized user (see “User Accounts, Roles, and Rights Profiles” in Getting Started With Oracle Solaris 11 Express).
The Time Slider service is a new component in the GNOME file manager (a.k.a. nautilus), and it relies on services that generate periodic Oracle Solaris ZFS snapshots. Similar in concept to Time Machine in Apple’s Mac OS, Time Slider provides an easy way for desktop users to restore individual files or directories from automatically scheduled, incremental snapshots of home directories. Once activated and set up, Time Slider services can generate “frequent” snapshots (every 15 minutes, by default), as well as hourly, daily, weekly, or monthly.
The services that automate periodic snapshots can do so for any file system, as well as BEs, even on non-desktop systems. However, to automatically snapshot using Time Slider mechanisms, packages for the GNOME desktop environment (for example, the slim_install package group) are required even when the gdm desktop manager is not running. (This is because Time Slider services use notification mechanisms that are part of the GNOME stack.)
By default, the services that support Time Slider are disabled, even on desktop systems:
# svcs time-slider
STATE          STIME    FMRI
disabled       16:33:56 svc:/application/time-slider:default
# svcs auto-snapshot
STATE          STIME    FMRI
disabled       16:33:56 svc:/system/filesystem/zfs/auto-snapshot:monthly
disabled       16:33:56 svc:/system/filesystem/zfs/auto-snapshot:weekly
disabled       16:33:56 svc:/system/filesystem/zfs/auto-snapshot:daily
disabled       16:33:56 svc:/system/filesystem/zfs/auto-snapshot:hourly
disabled       16:33:56 svc:/system/filesystem/zfs/auto-snapshot:frequent
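To check these states from a script rather than by eye, the output of svcs can be filtered. The following is a minimal sketch, not taken from the article: the function name not_online is hypothetical, and the sample text is adapted from the output above, with one service shown as online purely for illustration.

```shell
#!/bin/sh
# Print the FMRI of every service whose state is not "online".
# Input: lines in svcs format (STATE STIME FMRI); the header line is skipped.
not_online() {
    awk '$1 != "STATE" && $1 != "online" { print $3 }'
}

# Sample input adapted from the svcs output above (the "online" line is
# an illustrative assumption, not captured output).
sample='STATE STIME FMRI
disabled 16:33:56 svc:/application/time-slider:default
online 16:33:56 svc:/system/filesystem/zfs/auto-snapshot:daily'

printf '%s\n' "$sample" | not_online
```

On a live system, the same function could read the real output, for example by piping the result of svcs time-slider auto-snapshot into not_online. An empty result would mean every listed service is online.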
The easiest way to enable Time Slider services on a desktop is to use the GUI (enter time-slider-setup on the command line or use the System > Administration > Time Slider menu). The GUI (shown in Figure 1) allows you to select which file systems to snapshot, specify whether to back up snapshots to an external drive, and set the capacity policy for snapshot removal.
Figure 1. Time Slider GUI
First, if you are not using the GUI, designate the file systems and BEs to be captured via automatic snapshots. To do this, set the com.sun:auto-snapshot property to true, as shown in the following commands. The first command designates the home directory of user jdoe, and the second command designates the home directory of user rkai. In the output of the third command, file systems whose value is shown as true will be snapshotted.
# zfs set com.sun:auto-snapshot=true rpool1/export/home/jdoe
# zfs set com.sun:auto-snapshot=true rpool1/export/home/rkai
# zfs get all | grep auto-snapshot
Since descendant file systems inherit Oracle Solaris ZFS properties from their parent, setting up automatic snapshots for /export/home captures all home directories in that hierarchy.
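To see which file systems pick up the setting by inheritance, the source column of zfs get can be inspected. The sketch below is an illustration under stated assumptions: the function name auto_snap_sources is hypothetical, and the sample input is made up to show the shape of the data rather than copied from a real system.

```shell
#!/bin/sh
# List the file systems that have automatic snapshots turned on, together
# with where the setting comes from ("local" or "inherited from ...").
# Input: tab-separated name, value, source lines, as printed by
#   zfs get -H -r -o name,value,source com.sun:auto-snapshot <file system>
# Snapshot entries (names containing "@") are skipped.
auto_snap_sources() {
    awk -F '\t' '$1 !~ /@/ && $2 == "true" { print $1 ": " $3 }'
}

# Hypothetical sample input (illustrative values, not captured output).
printf '%s\t%s\t%s\n' \
    rpool1/export           -    - \
    rpool1/export/home      true local \
    rpool1/export/home/jdoe true 'inherited from rpool1/export/home' |
    auto_snap_sources
```

With this input, the sketch reports rpool1/export/home as a local setting and rpool1/export/home/jdoe as inherited, which is the behavior the paragraph above describes.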
Whether or not you use the GUI to select what to snapshot, you must first enable Time Slider services. In the following example, all auto-snapshot services are enabled, as are the Time Slider plug-in and Time Slider services, which are also needed:
# svcadm enable auto-snapshot:frequent
# svcadm enable auto-snapshot:hourly
# svcadm enable auto-snapshot:daily
# svcadm enable auto-snapshot:weekly
# svcadm enable auto-snapshot:monthly
# svcadm enable time-slider/plugin:zfs-send
# svcadm enable time-slider/plugin:rsync
# svcadm enable time-slider
Because the time-slider/plugin services depend on the time-slider service, be sure to enable the time-slider service last (or restart it). Again, these services require GNOME desktop packages, and they go into maintenance mode if they are enabled when GNOME packages are not installed.
Once services are activated and Oracle Solaris ZFS properties are set (either through the GUI or command line, as described above), Time Slider functionality is available within the GNOME file manager. Start the file manager and click the clock icon in the file manager’s navigation bar. (If the clock icon is grayed out, Time Slider services haven’t been enabled properly. Go back and check the previous steps and then restart the file manager.)
Time Slider functionality allows the desktop user to navigate among captured snapshots. In Figure 2, the left-most file manager shows home directory contents of an earlier snapshot while the right-most file manager shows the current state. A directory called “Project A” appears in the earlier snapshot but no longer exists in the current directory.
Figure 2. Comparing Time Slider Snapshots
Suppose the “Project A” directory shown in Figure 2 was accidentally deleted. To restore it, the user simply drags and drops it from the previous snapshot into the desired target directory in the current state (labeled “Now” in the File Manager).
OK, great. Time Slider has obvious value for a user who’s inadvertently deleted or overwritten a file or directory, as in the “Project A” example above. Automatic snapshots, though, can periodically capture the current BE, which can be helpful if a software update goes awry. By regularly cloning the current software state in a BE, you can revert to a previous state if needed.
If you are an authorized user, you can initiate automatic snapshots of BEs using the GUI or command-line equivalents. The procedure to designate a BE for automatic snapshots is basically the same as for a user’s home directory: simply activate the services (as shown before) and set the BE’s com.sun:auto-snapshot property to true. As an example, these commands set the auto-snapshot property to true for the BEs solaris and BE2:
# zfs set com.sun:auto-snapshot=true rpool1/ROOT/solaris
# zfs set com.sun:auto-snapshot=true rpool1/ROOT/BE2
Of course, as an alternative, you can create a cron job to periodically snapshot a BE (issuing a command such as zfs snapshot rpool1/ROOT/solaris@backup). Additional examples of generating Oracle Solaris ZFS snapshots can be found in the Oracle Solaris ZFS Administration Guide.
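A fixed name such as @backup can be used only once per file system, so a cron job normally needs a different snapshot name on each run. The following sketch shows one way to build a date-stamped name. The function name snap_name, the cron- prefix, and the script path in the comment are all hypothetical choices made for illustration.

```shell
#!/bin/sh
# Build a date-stamped snapshot name for a file system or BE dataset.
# $1 = dataset, $2 = time stamp (optional; defaults to the current time)
# The "cron-" prefix is an arbitrary, hypothetical naming choice.
snap_name() {
    stamp=${2:-$(date '+%Y-%m-%d-%Hh%M')}
    echo "$1@cron-$stamp"
}

# Show the name that would be used for a given time stamp.
snap_name rpool1/ROOT/solaris 2011-03-10-11h10

# To take the snapshot for real, an authorized user could run:
#   zfs snapshot "$(snap_name rpool1/ROOT/solaris)"
# Hypothetical crontab entry that runs such a script nightly at 02:00:
#   0 2 * * * /usr/local/bin/snap-be.sh
```

Unlike the automatic snapshot services, a cron job like this does not remove old snapshots, so cleanup would have to be handled separately.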
The following command lists all snapshots, both those that are BEs and those that are not BEs:
# zfs list -t snapshot
NAME                                                            USED   AVAIL  REFER  MOUNTPOINT
rpool/ROOT/BE2@zfs-auto-snap_hourly-2011-03-10-11h10            0      -      3.49G  -
rpool/ROOT/BE2@zfs-auto-snap_frequent-2011-03-10-11h55          0      -      3.49G  -
rpool/ROOT/solaris@install                                      26.5M  -      3.34G  -
rpool/ROOT/solaris@2011-03-09-22:39:13                          11.6M  -      3.58G  -
rpool/ROOT/solaris@zfs-auto-snap_hourly-2011-03-10-11h10        40.5K  -      3.58G  -
rpool/ROOT/solaris@zfs-auto-snap_frequent-2011-03-10-11h40      40.5K  -      3.58G  -
rpool/ROOT/solaris@zfs-auto-snap_frequent-2011-03-10-11h55      0      -      3.58G  -
rpool/export/home/jdoe@zfs-auto-snap_monthly-2011-03-09-16h10   19.5K  -      39.0M  -
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-09-22h10    19K    -      39.0M  -
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-10-09h10    112K   -      39.0M  -
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-10-10h10    282K   -      942K   -
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-10-11h10    23K    -      31.6M  -
rpool/export/home/jdoe@zfs-auto-snap_frequent-2011-03-10-11h25  23K    -      31.6M  -
rpool/export/home/jdoe@zfs-auto-snap_frequent-2011-03-10-11h40  32K    -      38.1M  -
rpool/export/home/jdoe@zfs-auto-snap_frequent-2011-03-10-11h55  0      -      38.1M  -
To show only snapshots for BEs, use this form of the beadm list command:
# beadm list -s
BE/Snapshot                                         Space   Policy  Created
-----------                                         -----   ------  -------
BE2
   BE2@zfs-auto-snap_frequent-2011-03-10-11h55      0       static  2011-03-10 11:55
   BE2@zfs-auto-snap_hourly-2011-03-10-11h10        0       static  2011-03-10 11:10
solaris
   solaris@2011-03-09-22:39:13                      11.60M  static  2011-03-09 16:39
   solaris@install                                  26.54M  static  2011-03-08 16:17
   solaris@zfs-auto-snap_frequent-2011-03-10-11h40  40.5K   static  2011-03-10 11:40
   solaris@zfs-auto-snap_frequent-2011-03-10-11h55  0       static  2011-03-10 11:55
   solaris@zfs-auto-snap_hourly-2011-03-10-11h10    40.5K   static  2011-03-10 11:10
For automatic snapshots, snapshot names use the Oracle Solaris ZFS data set name followed by @zfs-auto-snap_<type> (where <type> is frequent, hourly, and so on), along with a date and time stamp.
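Because the naming scheme is regular, a script can split a snapshot name into its parts with plain shell parameter expansion. This is a small sketch based on the names shown in the listings above. The function name parse_auto_snap is hypothetical, and the function assumes its argument really is an automatic snapshot name.

```shell
#!/bin/sh
# Split an automatic snapshot name into data set, type, and time stamp.
# Assumes the form <dataset>@zfs-auto-snap_<type>-<date and time stamp>.
parse_auto_snap() {
    name=$1
    dataset=${name%%@*}              # text before the "@"
    rest=${name#*@zfs-auto-snap_}    # "<type>-<date and time stamp>"
    type=${rest%%-*}                 # text before the first "-"
    stamp=${rest#*-}                 # everything after the first "-"
    echo "$dataset $type $stamp"
}

parse_auto_snap rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-10-11h10
# prints: rpool/export/home/jdoe hourly 2011-03-10-11h10
```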
A certain number of automatic Oracle Solaris ZFS snapshots are kept, as long as space permits, according to type. By default, as defined in the service manifest, the auto-snapshot service keeps 3 frequent, 23 hourly, 6 daily, 4 weekly, and 12 monthly snapshots into the past. If space is needed, however, snapshots are deleted, with the oldest snapshots being deleted first.
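The idea of a keep count can be illustrated with a few lines of shell. This sketch only demonstrates the concept of keeping the newest N snapshots. It is not the auto-snapshot service’s actual implementation, the function name excess_snaps is hypothetical, and the keep count of 3 is chosen only to make the example short.

```shell
#!/bin/sh
# Print the snapshots that exceed a keep count, oldest first.
# $1 = number of snapshots to keep
# stdin = automatic snapshot names for ONE file system and ONE type
# The time stamp format (YYYY-MM-DD-HHhMM) sorts correctly as plain text.
excess_snaps() {
    keep=$1
    names=$(sort)
    total=$(printf '%s\n' "$names" | grep -c .)
    extra=$((total - keep))
    if [ "$extra" -gt 0 ]; then
        printf '%s\n' "$names" | head -n "$extra"
    fi
}

# Four hourly snapshot names taken from the earlier listing, keep count 3.
excess_snaps 3 <<'EOF'
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-10-11h10
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-09-22h10
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-10-10h10
rpool/export/home/jdoe@zfs-auto-snap_hourly-2011-03-10-09h10
EOF
```

With four snapshots and a keep count of 3, only the oldest name (the one stamped 2011-03-09-22h10) is printed as a candidate for removal.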
Delete automatic snapshots carefully! The following command deletes all snapshots matching the pattern @zfs-auto-snap:
# for s in `zfs list -H -o name -t snapshot | grep @zfs-auto-snap`; do zfs destroy $s; done
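Because a pattern-based delete cannot be undone, it can help to preview the commands before running them. The sketch below prints zfs destroy commands for one snapshot type instead of executing them. The function name destroy_cmds is hypothetical, and the sample names are taken from the earlier listing.

```shell
#!/bin/sh
# Print (do not run) a zfs destroy command for each automatic snapshot
# of one type.
# $1 = type (frequent, hourly, daily, weekly, or monthly)
# stdin = snapshot names, for example from: zfs list -H -o name -t snapshot
destroy_cmds() {
    grep "@zfs-auto-snap_$1-" | while read -r snap; do
        echo "zfs destroy $snap"
    done
}

# Preview which of these snapshots a "frequent" cleanup would remove.
destroy_cmds frequent <<'EOF'
rpool/ROOT/BE2@zfs-auto-snap_hourly-2011-03-10-11h10
rpool/ROOT/BE2@zfs-auto-snap_frequent-2011-03-10-11h55
rpool/ROOT/solaris@install
EOF
```

After reviewing the printed commands, an authorized user could run them by piping the same output to sh.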
If you are an authorized user, you can also use the Time Slider GUI to delete unwanted snapshots. Deleting a BE deletes related snapshots. Compare the following output with the earlier beadm list command:
# beadm destroy BE2
Are you sure you want to destroy BE2? This action cannot be undone(y/[n]): y
# beadm list -s
BE/Snapshot                                         Space   Policy  Created
-----------                                         -----   ------  -------
solaris
   solaris@install                                  26.58M  static  2011-03-08 16:17
   solaris@zfs-auto-snap_frequent-2011-03-10-12h25  41.0K   static  2011-03-10 12:25
   solaris@zfs-auto-snap_frequent-2011-03-10-12h40  41.0K   static  2011-03-10 12:40
   solaris@zfs-auto-snap_frequent-2011-03-10-13h40  29.0K   static  2011-03-10 13:40
   solaris@zfs-auto-snap_hourly-2011-03-10-11h10    40.5K   static  2011-03-10 11:10
   solaris@zfs-auto-snap_hourly-2011-03-10-12h10    40.5K   static  2011-03-10 12:10
   solaris@zfs-auto-snap_hourly-2011-03-10-13h10    41.0K   static  2011-03-10 13:10
To roll back using a previous snapshot of a BE, create a BE from the snapshot, activate the BE, and reboot. In the following example, a new BE called BEnew is created from a previous BE snapshot, and then BEnew is set to become the active BE upon reboot:
# beadm create -e solaris@zfs-auto-snap_hourly-2011-03-10-11h10 BEnew
# beadm activate BEnew
# reboot
The pkg history command is quite useful for researching what software updates have been made to the current BE. When it is used without arguments, it provides an overview of software changes, as shown in the following example:
# pkg history
TIME                 OPERATION                CLIENT         OUTCOME
2010-11-05T11:14:56  purge-history            pkg            Succeeded
2011-03-14T09:16:00  uninstall                pkg            Succeeded
2011-03-14T09:16:09  uninstall                pkg            Succeeded
2011-03-14T09:16:15  set-property             pkg            Succeeded
2011-03-14T09:16:17  update-publisher         pkg            Succeeded
2011-03-14T09:16:18  set-property             pkg            Succeeded
2011-03-14T10:58:31  refresh-publishers       pkg            Succeeded
2011-03-14T10:58:31  install                  pkg            Succeeded
2011-03-14T10:58:32  rebuild-image-catalogs   pkg            Succeeded
2011-03-14T11:32:45  uninstall                pkg            Succeeded
2011-03-14T12:47:31  refresh-publishers       updatemanager  Succeeded
2011-03-14T12:47:33  rebuild-image-catalogs   updatemanager  Succeeded
2011-03-15T09:22:07  install                  pkg            Succeeded
Perhaps even more useful is the output provided with the -l option, which reports what changes were made, whether the changes succeeded, which user made the changes, when the changes were made, which command was used, and details about the starting and ending states.
For example, the following excerpt shows that user jdoe ran the pkg install gcc-3 command, which successfully installed four packages for the GNU compiler:
Operation: install
Outcome: Succeeded
Client: pkg
Version: 052adf36c3f4
User: jdoe (101)
Start Time: 2011-03-16T09:22:07
End Time: 2011-03-16T09:23:41
Command: /usr/bin/pkg install gcc-3
Start State:
Solver: [ Variables: 887 Clauses: 6874 Iterations: 1 State: Succeeded]
Timings: [phase 1: 0.577, phase 2: 0.116, phase 3: 0.256, phase 4: 0.000, phase 5: 0.000, phase 6: 0.000, phase 7: 0.019, phase 8: 1.258, phase 9: 0.000, phase 10: 0.781, phase 11: 0.028, phase 12: 0.141]
Maintained incorporations:
pkg://email@example.com,5.11-0.151.0.1:20101105T053408Z
. . .
pkg://firstname.lastname@example.org,5.11-0.151.0.1:20101104T230646Z
Package version changes:
End State:
None -> pkg://email@example.com,5.11-0.151.0.1:20101105T053803Z
None -> pkg://firstname.lastname@example.org,5.11-0.151.0.1:20101104T231349Z
None -> pkg://email@example.com,5.11-0.151.0.1:20101105T053751Z
None -> pkg://firstname.lastname@example.org,5.11-0.151.0.1:20101105T002136Z
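When the long listing covers many operations, a one-line summary per operation is easier to scan. The following sketch condenses records like the one above into start time, user, and command. The function name summarize_history is hypothetical, and the sketch assumes the field labels and their order (User, then Start Time, then Command) match the excerpt shown here.

```shell
#!/bin/sh
# Summarize long-format package history records as
# "<start time> <user> <command>", one line per operation.
# Input: text in the format printed by pkg history -l.
summarize_history() {
    awk '
        /^ *User:/       { user = $2 }
        /^ *Start Time:/ { start = $3 }
        /^ *Command:/    { sub(/^ *Command: */, ""); print start, user, $0 }
    '
}

# Sample record, using fields from the excerpt above.
summarize_history <<'EOF'
Operation: install
Outcome: Succeeded
User: jdoe (101)
Start Time: 2011-03-16T09:22:07
End Time: 2011-03-16T09:23:41
Command: /usr/bin/pkg install gcc-3
EOF
# prints: 2011-03-16T09:22:07 jdoe /usr/bin/pkg install gcc-3
```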
XML records for pkg operations are stored in /var/pkg/history, which the pkg history command uses to compose its output. The command pkg purge-history wipes out these records, so use it with caution, because this history can be valuable when troubleshooting.
Suppose you take a day off and, upon returning, users report problems with a server. Examining the system’s package history for installations or updates might be a place to start. Suppose the history indicates that another system administrator performed a package installation while you were gone, which is evident in the third line of the following history excerpt:
2011-03-14T12:47:31  refresh-publishers       updatemanager  Succeeded
2011-03-14T12:47:33  rebuild-image-catalogs   updatemanager  Succeeded
2011-03-16T09:22:07  install                  pkg            Succeeded
If further investigation confirms this install is the root cause, one possible corrective action is to revert to a previous BE until you can resolve the issue. Using the install’s time stamp, you can look for a BE snapshot that predates the installation and revert to it:
# beadm create -e SLIM@zfs-auto-snap_frequent-2011-03-15-06h59 BEpre317
# beadm activate BEpre317
# reboot
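Finding the newest snapshot that predates an installation can also be scripted. The sketch below is one possible approach, not a documented procedure. The function names to_stamp and latest_before are hypothetical, and the sketch assumes that the pkg history time and the snapshot time stamps are expressed in the same time zone.

```shell
#!/bin/sh
# Convert a pkg history time (2011-03-16T09:22:07) to the automatic snapshot
# stamp format (2011-03-16-09h22) so the two can be compared as text.
to_stamp() {
    echo "$1" | sed 's/T\([0-9][0-9]\):\([0-9][0-9]\).*/-\1h\2/'
}

# Print the newest automatic snapshot taken before the given time.
# $1 = time from pkg history; stdin = bare snapshot names for one BE
latest_before() {
    cutoff=$(to_stamp "$1")
    # Prefix each automatic snapshot with its stamp, sort by stamp, and keep
    # the last one older than the cutoff. Other snapshots are ignored.
    sed -n 's/.*@zfs-auto-snap_[a-z]*-\(.*\)/\1 &/p' | sort |
        awk -v cutoff="$cutoff" \
            '($1 "") < (cutoff "") { best = $2 } END { if (best != "") print best }'
}

# Snapshot names from the earlier beadm listing, with an illustrative time.
latest_before 2011-03-10T11:45:00 <<'EOF'
solaris@install
solaris@zfs-auto-snap_frequent-2011-03-10-11h55
solaris@zfs-auto-snap_hourly-2011-03-10-11h10
solaris@zfs-auto-snap_frequent-2011-03-10-11h40
EOF
```

For this input, the sketch selects the frequent snapshot stamped 2011-03-10-11h40, which is the last one taken before 11:45. The selected name could then be passed to beadm create -e as in the commands above.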
In summary, Oracle Solaris 11 Express supplies new ways to help administrators cope with human errors, including their own. On desktops, enable Time Slider services and automatic snapshots of user directories (or the full /export/home file system). On servers, always create a new BE before performing an update, and optionally set up periodic snapshots for the current BE. Oracle Solaris 11 Express provides built-in safety nets, so best practice, of course, is to use them.
Revision 1.0, 05/06/2011