How to Migrate Control of System Services from Scripts to the Service Management Facility

by Suzanne Zorn

How to migrate script-based services from older Oracle Solaris releases and other UNIX environments, such as
HP-UX 11i, to the Service Management Facility in Oracle Solaris 11.


Published May 2012

Managing Services in Oracle Solaris 11
   SMF Key Components
Adding a Service to SMF
   Creating the SMF Manifest File
   Creating the Start Method
Migrating a Service to SMF
   Monitoring Services with SMF
   Demonstrating Automatic Restart of an SMF-Controlled Service
Final Thoughts
See Also
About the Author

This article describes how to migrate a service controlled via legacy /etc/rc* scripts written for older Oracle Solaris releases or other UNIX environments to the Service Management Facility (SMF) in Oracle Solaris 11. It is recommended that administrators use the svcbundle utility to quickly generate a manifest and install it into the system, as is covered in Using svcbundle to Create SMF Manifests and Profiles in Oracle Solaris 11.

Oracle Solaris 11 continues to allow legacy /etc/rc* scripts to start and stop its services. However, since SMF makes it so much easier to monitor and manage system services, including providing failure detection, dependency checking, and automatic restart, it's well worth your while to transition control of those services to SMF.

Managing Services in Oracle Solaris 11

Traditional scripts in the /etc/rc* directories are used to start and stop services in many UNIX environments. The system looks in these directories when transitioning to run levels, finds scripts beginning with the letters S and K, and simply runs these scripts sequentially.

Administrators can add custom scripts to these directories, but there are no management tools. Administrators are responsible for ensuring that sequence numbers don't conflict and that any required services have already been started by previously executed scripts, and they must manually monitor and restart failed services.
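
As a rough illustration of this sequential model, the following sketch runs the K scripts and then the S scripts of one directory in sorted order. It is not the actual Oracle Solaris rc implementation; run_rc_dir is a hypothetical name, and the sketch assumes each script accepts a stop or start argument.

```shell
#!/bin/sh
# Sketch of legacy rc-style sequencing (illustration only, not the real
# Oracle Solaris rc implementation). Shell globs expand in sorted order,
# so K05 runs before K20, and S10 runs before S99.
run_rc_dir() {
    rc_dir=$1
    # Stop scripts (K*) run first, each with the "stop" argument.
    for script in "$rc_dir"/K*; do
        if [ -x "$script" ]; then
            "$script" stop
        fi
    done
    # Start scripts (S*) run next, each with the "start" argument.
    for script in "$rc_dir"/S*; do
        if [ -x "$script" ]; then
            "$script" start
        fi
    done
    return 0
}
```

Nothing in this loop checks dependencies or notices when a started program later dies, which is the gap SMF is designed to fill.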

SMF simplifies management and provides improved methods for controlling, monitoring, and managing services. Relationships between services can be defined, and services can be started only if the services upon which they depend are online. Instead of running scripts sequentially, SMF can start services in parallel, speeding system startup times.

SMF also provides an administrative interface to easily manage and control services, with configuration changes persisting across reboots. In addition, SMF monitors services and can be configured to automatically restart failed services, increasing the availability and reliability of services.

SMF Key Components

An SMF service has the following components associated with it:

  • A manifest file (written in XML) that describes the service, including information such as the name of the service, how to stop and start it, and which other services it depends upon. Manifest files are stored in /lib/svc/manifest (preferred) or /var/svc/manifest subdirectories.
  • One or more methods that define how to start, stop, and restart the service. Methods are typically stored in /lib/svc/method subdirectories.
  • One or more executables (daemons) that are called by the methods to implement the service.
  • A log file that SMF uses to store output and status for the service. Log files are stored in the /var/svc/log directory, and they can be looked up using the svcs -x command.
  • A Fault Management Resource Identifier (FMRI) that identifies each specific service instance. For example, the FMRI svc:/network/rpc/bind:default contains three parts separated by colons:
    • svc identifies it as a service.
    • /network/rpc/bind identifies the service.
    • default identifies the service instance (the first instance of a service is often tagged the default instance).
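
The three-part FMRI structure can be illustrated with a few lines of shell. This is only a sketch; fmri_parts is a hypothetical helper, not an SMF command, and it handles only FMRIs of the svc:/service:instance form shown above.

```shell
#!/bin/sh
# Split an FMRI such as svc:/network/rpc/bind:default into its scheme,
# service name, and instance. fmri_parts is a hypothetical helper.
fmri_parts() {
    fmri=$1
    scheme=${fmri%%:*}      # text before the first colon, e.g. "svc"
    rest=${fmri#*:}         # text after the first colon
    case $rest in
        *:*)
            service=${rest%:*}      # everything up to the last colon
            instance=${rest##*:}    # everything after the last colon
            ;;
        *)
            service=$rest           # no instance part was given
            instance=
            ;;
    esac
    printf '%s %s %s\n' "$scheme" "$service" "$instance"
}

fmri_parts svc:/network/rpc/bind:default
# prints: svc /network/rpc/bind default
```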

Adding a Service to SMF

This document assumes you have a service that is currently started by /etc/rc* scripts.

In this example, we start with an example service named myservice. This service is purely an example used to illustrate SMF functionality. Our service runs the prstat executable and is started using the following /etc/rc3.d/S99myservice script:

#!/bin/sh

if [ -x /usr/bin/prstat ]; then
    /usr/bin/prstat -a > /dev/null &
fi
exit 0

The /etc/rc3.d/S* scripts are run in order whenever the system reboots or transitions into multi-user mode (run level 3). When our S99myservice shell script runs, it starts the prstat executable. We can confirm this by rebooting the system and then checking the running processes:

# ps -A | grep prstat
1595 console     0:00 prstat

To migrate any service to SMF control, you need three things:

  • An executable (a daemon program that runs continually and has no controlling tty)
  • A method (shell script) to start and stop the executable
  • An XML file that describes the service to the SMF

Since our example service is currently being started by /etc/rc* scripts, we already have the first two pieces: the executable (/usr/bin/prstat) and the shell script to start the executable (/etc/rc3.d/S99myservice). The one piece we need to create is the XML file.

Creating the SMF Manifest File

The SMF manifest file is an XML file that describes our service to SMF, listing information such as the executable to run and any dependencies. The manifest file used for our example service is shown in Listing 1. A description of the key elements follows.

<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM
    "/usr/share/lib/xml/dtd/service_bundle.dtd.1">

<service_bundle type="manifest" name="myservice">
  <service name="site/myservice" type="service" version="1">
    <create_default_instance enabled="false" />

    <!-- Dependencies -->
    <dependency 
        name="filesystem-local" grouping="require_all" 
        restart_on="restart" type="service">
        <service_fmri value="svc:/system/filesystem/local:default" />
    </dependency>

    <!-- Execution method for start and stop -->
    <exec_method 
         type="method" name="start" 
         exec="/lib/svc/method/myservice.sh" timeout_seconds="60" >    
      <method_context>
          <method_credential user="root" group="root" />
      </method_context>
    </exec_method>

    <exec_method 
       type="method" name="stop" exec=":kill -9"
       timeout_seconds="60" >
    </exec_method>

    <template>
       <common_name>
         <loctext xml:lang="C">My example service</loctext>
       </common_name>
       <documentation>
          <manpage title="No man" section="99" manpath="/dev/null" />
       </documentation>
    </template>
  </service>
</service_bundle>

Listing 1. Manifest File for Our Example

Our manifest file contains the following elements:

  • XML header. The standard XML header is:

    <?xml version="1.0"?>
    <!DOCTYPE service_bundle SYSTEM
             "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
    
  • Service bundle. Includes a name for the service bundle and can include zero or more service elements. (Our example will include one service element that describes our service.)

     <service_bundle type="manifest" name="myservice">
    
  • Service. Specifies the name of the service and also contains child elements that specify dependencies, start methods, and so on. By convention, services are placed into categories to identify the general use of the service. (Service categories are listed in /lib/svc/manifest and include application, milestone, platform, system, device, network, and site.) The name for our example, site/myservice, indicates it is in the site category and is named myservice:

    <service name="site/myservice" type="service" version="1">
    
  • Default startup flag. This flag element specifies whether the service should be started by default. Our example creates an instance named default that is disabled by default:

    <create_default_instance enabled="false" />
    
  • Service dependencies. Specify the services that our service is dependent upon. In this example, we specify that our service is dependent on the local file system service. The attribute restart_on=restart specifies that our service should be restarted if the local file system service is restarted for any reason.

    Other possible values include none (never restart our service due to dependency state change), error (restart our service if the dependency is restarted due to hardware fault), and refresh (restart our service if the dependency is refreshed or restarted for any reason).

    The grouping=require_all attribute specifies that all services upon which our service depends must be online:

        <dependency 
            name="filesystem-local" grouping="require_all" 
            restart_on="restart" type="service">
            <service_fmri value="svc:/system/filesystem/local:default" />
        </dependency>
    
  • Exec methods. Describes the methods used to start, stop, or restart our service. In our example, we specify the name of our start script, specify a timeout value of 60 seconds, and indicate that our method should run as user root and group root:

    <exec_method 
         type="method" name="start" 
         exec="/lib/svc/method/myservice.sh" timeout_seconds="60" >    
         <method_context>
             <method_credential user="root" group="root" />
         </method_context>
    </exec_method>
    

    For our stop method, we use the special :kill token, which tells SMF to send a signal (here signal 9, SIGKILL) to all processes in the service's contract:

    <exec_method 
        type="method" name="stop" exec=":kill -9" timeout_seconds="60" >
    </exec_method>
    
  • Template. Contains metadata about the service, including a localizable string and links to documentation.
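
The restart_on values described in the dependencies item above can be summarized as a small decision function. This is a sketch of one reading of those descriptions, in which each value also covers the events of the values before it; needs_restart and the event names are hypothetical and are not part of SMF.

```shell
#!/bin/sh
# needs_restart RESTART_ON EVENT
# Prints "yes" if a dependency EVENT (error, restart, or refresh) should
# restart our service under the given restart_on value, and "no"
# otherwise. Hypothetical helper for illustration only.
needs_restart() {
    case "$1:$2" in
        none:*)
            echo no ;;                  # never restart
        error:error)
            echo yes ;;                 # restart only after a fault
        restart:error|restart:restart)
            echo yes ;;                 # restart after any restart
        refresh:error|refresh:restart|refresh:refresh)
            echo yes ;;                 # restart after refresh or restart
        *)
            echo no ;;
    esac
}
```

For the manifest in Listing 1, which uses restart_on="restart", a refresh of the local file system service alone would not restart our service under this reading.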

Note: Because our example created a single instance by default, we did not need to include explicit instance elements. If we wanted to start multiple instances of our service, we would need to include elements such as the following:

<instance name="myservice_instance1" enabled="false" />

Our example includes only a subset of the available SMF manifest features. The Oracle white paper "How to Create an Oracle Solaris Service Management Facility Manifest" contains a more detailed description. The complete XML syntax is documented in the file /usr/share/lib/xml/dtd/service_bundle.dtd.1 on Oracle Solaris 11 systems.

Creating the Start Method

Before adding our service to SMF, we need to have a method to start the service executable. At a minimum, our startup script will need to start the service. It can also include options to stop and restart the service.

In this example, we can reuse our existing /etc/rc3.d/S99myservice script with two additions, sourcing the SMF include file and returning an SMF-specific exit code, and save it as /lib/svc/method/myservice.sh:

#!/bin/sh
. /lib/svc/share/smf_include.sh

if [ -x /usr/bin/prstat ]; then
    /usr/bin/prstat -a > /dev/null &
fi

exit $SMF_EXIT_OK

In general, you can use a single script that handles everything for a service, or you can create separate scripts for starting and stopping a service. And although most services use scripts, they aren't strictly required in all cases: If you can start, stop, or refresh with a single command, you can specify that command in the manifest and eliminate the script.
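
As a sketch of the single-script approach, the following function handles both start and stop based on its first argument. The function name myservice_method, the use of pkill for the stop action, and the choice of SMF_EXIT_ERR_CONFIG for an unknown action are assumptions made for illustration; the fallback exit code values are there only so the sketch also runs where smf_include.sh is absent.

```shell
#!/bin/sh
# Hypothetical combined method script for the example service.
# Fallback exit codes let the sketch run where smf_include.sh is absent;
# on Oracle Solaris the include file supplies the real values.
SMF_EXIT_OK=0
SMF_EXIT_ERR_CONFIG=96
if [ -r /lib/svc/share/smf_include.sh ]; then
    . /lib/svc/share/smf_include.sh
fi

myservice_method() {
    case "$1" in
        start)
            # Same start logic as the example start method.
            if [ -x /usr/bin/prstat ]; then
                /usr/bin/prstat -a > /dev/null &
            fi
            return "$SMF_EXIT_OK"
            ;;
        stop)
            # Assumption: pkill is an acceptable way to stop the example.
            pkill -x prstat
            return "$SMF_EXIT_OK"
            ;;
        *)
            echo "Usage: myservice.sh {start|stop}" >&2
            return "$SMF_EXIT_ERR_CONFIG"
            ;;
    esac
}
```

A real method script would finish by calling myservice_method "$1" and exiting with its return value, and the exec attributes in the manifest would pass the action as an argument, for example exec="/lib/svc/method/myservice.sh start".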

The following sections step through the procedure for migrating an example service from legacy /etc/rc* scripts to SMF management.

Migrating a Service to SMF

This procedure assumes we have a service that currently uses the legacy /etc/rc* scripts to automatically start. Our example uses the /etc/rc3.d/S99myservice script.

  1. We can use the svcs command to see that our service has been started by Oracle Solaris 11 and the prstat daemon is running:

    # ls /etc/rc3.d/S99myservice
    /etc/rc3.d/S99myservice
    # svcs S99myservice
    STATE        STIME     FMRI
    legacy_run   11:41:19  lrc:/etc/rc3_d/S99myservice
    # ps -A | grep prstat
    375 ?        0:00 prstat
    
  2. First, we stop our service (by killing the executable prstat, in this example). Then, we need to remove or rename any related scripts in the /etc/rc* directories so that Oracle Solaris 11 doesn't continue to automatically start our service every time we reboot the system, for example:

    # pkill prstat 
    # ps -A | grep prstat
    # 
    # mv /etc/rc3.d/S99myservice /etc/rc3.d/_S99myservice
    
  3. This procedure assumes that we've already created a startup script and XML manifest file for our service, as described in the previous sections. In this example, our startup method is /lib/svc/method/myservice.sh:

    # ls -l /lib/svc/method/myservice.sh
    -rwxr-xr-x  1 root   root   133 Mar 28 10:30 /lib/svc/method/myservice.sh
    

    Note: Make sure your script has execute permissions and the appropriate user and group ownership.

  4. Copy the manifest (XML) file for your service to the /lib/svc/manifest/site subdirectory:

    # ls /lib/svc/manifest/site
    myservice.xml
    
  5. Run the svccfg validate command to check your manifest file for syntax errors. No output indicates that no errors were found:

    # svccfg validate /lib/svc/manifest/site/myservice.xml
    #
    
  6. Use the svcadm restart command to restart the manifest-import service, which lets SMF know about your new service:

    # svcadm restart manifest-import
    Loading smf(5) service descriptions: 1/1
    

    Note: The procedure for adding new services to SMF has changed from Oracle Solaris 10 to Oracle Solaris 11. In Oracle Solaris 11, it is recommended that manifests be stored in the standard locations of /lib/svc/manifest and /var/svc/manifest, with /lib/svc/manifest being the preferred location. If your manifest is stored in a standard location in Oracle Solaris 11, it is no longer necessary (nor is it recommended) to use the svccfg import command. Instead, the svcadm restart manifest-import command synchronizes the SMF repository and recognizes any changes to manifest files stored in the standard locations.

  7. Use the svcs command to check on the status of the new service we added. SMF knows about our service, but it has not yet been enabled:

    # svcs myservice
    STATE     STIME     FMRI
    disabled  10:54:08  svc:/site/myservice:default
    
  8. Enable the new service using the svcadm enable command:

    # svcadm enable myservice
    # svcs myservice
    STATE     STIME     FMRI
    online    10:55:03  svc:/site/myservice:default
    
  9. Confirm that our executable (in this example, we used prstat) was started:

    # ps -A | grep prstat
    2135 ?        0:00 prstat
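
Steps 4 through 8 of the procedure above can be collected into one function, sketched below. The names run, install_smf_service, and DRY_RUN are hypothetical. With DRY_RUN set to a non-empty value, the commands are printed rather than executed so that the sequence can be reviewed first.

```shell
#!/bin/sh
# Sketch of steps 4 through 8 as one function. With DRY_RUN set to a
# non-empty value, each command is printed instead of executed.
# run, install_smf_service, and DRY_RUN are hypothetical names.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

install_smf_service() {
    manifest=$1     # path to the manifest file to install
    service=$2      # service name, for example myservice
    target=/lib/svc/manifest/site/$(basename "$manifest")

    # Each step runs only if the previous one succeeded.
    run cp "$manifest" "$target" &&
    run svccfg validate "$target" &&
    run svcadm restart manifest-import &&
    run svcadm enable "$service"
}

# Print the command sequence for the example service without running it.
DRY_RUN=1
install_smf_service ./myservice.xml myservice
```

Because the manifest import may not have finished by the time the last command runs, a real script might need to wait until svcs reports the new service before enabling it.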
    

Monitoring Services with SMF

Once our service is under SMF control, we can use the svcs command to monitor its operation.

  • To view all available information about a particular service, use the svcs command with the -l option, specifying the service by name, as shown in Listing 2:

    # svcs -l myservice
    fmri         svc:/site/myservice:default
    name         My example service
    enabled      true
    state        online
    next_state   none
    state_time   March 30, 2012 01:13:54 PM PDT
    logfile      /var/svc/log/site-myservice:default.log
    restarter    svc:/system/svc/restarter:default
    contract_id  131
    manifest     /lib/svc/manifest/site/myservice.xml
    dependency   require_all/restart
                   svc:/system/filesystem/local:default (online)
    

    Listing 2. Viewing All Available Information for a Service

  • Using the svcs command with the -x option displays extra information about the service, which can be helpful when troubleshooting problems with a service:

    # svcs -x myservice
    svc:/site/myservice:default (My example service)
     State: online since March 30, 2012 01:13:54 PM PDT
        See: man -M /dev/null -s 99 No man
        See: /var/svc/log/site-myservice:default.log
    Impact: None.
    
  • To display the process IDs associated with a service, use the svcs command with the -p option:

    # svcs -p myservice
    STATE       STIME    FMRI
    online      13:13:54 svc:/site/myservice:default
                13:13:54     1321 prstat 
    

We can also use the svccfg setnotify command to modify notification parameters for events in the SMF repository, such as when a problem is diagnosed or resolved. For example, the default behavior is to send e-mail to root@localhost when a problem is diagnosed for any SMF service. The following command modifies SMF to send an e-mail to user1@example.com instead:

# svccfg setnotify problem-diagnosed mailto:user1@example.com

You can set notification globally or per specific service, and you can also set notifications for SMF state transitions, such as degraded and offline. See the svccfg(1M) man page for more details.

Demonstrating Automatic Restart of an SMF-Controlled Service

SMF monitors services and automatically restarts any executables that have failed, whether they failed because of administrator or configuration error, software fault, or uncorrectable hardware error. The following example demonstrates this automatic restart.

  1. First, confirm the executable is running. In our example service, we start the prstat executable:

    # ps -A | grep prstat
    2135 ?        0:00 prstat
    
  2. Kill the executable:

    # kill -9 2135
    
  3. Now, check the running processes again:

    # ps -A | grep prstat
    2155 ?        0:00 prstat
    

    Note: The prstat executable was stopped with the kill command, but it was automatically restarted by SMF. (You can tell it's a different process because the process IDs, 2135 and 2155, are different.)

SMF maintains a log file for each service instance. These log files can be useful when troubleshooting a service. Log files are stored in the /var/svc/log directory. The log file for our example service is /var/svc/log/site-myservice:default.log.

Entries detail what start and stop methods are called and provide information about changes in service status. For example, the following log entries were created after executing the previous command to kill our service executable. SMF noticed the failure, called our stop method, and then called our start method to automatically restart our service:

# tail -4 /var/svc/log/site-myservice:default.log
[ Mar 29 11:33:23 Stopping because all processes in service exited. ]
[ Mar 29 11:33:23 Executing stop method (:kill). ]
[ Mar 29 11:33:23 Executing start method ("/lib/svc/method/myservice.sh"). ]
[ Mar 29 11:33:23 Method "start" exited with status 0. ]
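
The log file name follows a visible pattern: the service name with each slash replaced by a hyphen, followed by the instance name and a .log suffix. The sketch below derives the path from an FMRI on that basis. smf_log_path is a hypothetical helper, and the naming rule is inferred from the example in this article rather than taken from a specification.

```shell
#!/bin/sh
# Derive the log file path for a service instance FMRI, following the
# naming seen in this article: drop the "svc:/" prefix, replace each "/"
# with "-", and append ".log". smf_log_path is a hypothetical helper.
smf_log_path() {
    name=${1#svc:/}
    printf '/var/svc/log/%s.log\n' "$(printf '%s' "$name" | tr '/' '-')"
}

smf_log_path svc:/site/myservice:default
# prints: /var/svc/log/site-myservice:default.log
```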

Final Thoughts

This article showed how to migrate a service that was previously started using legacy /etc/rc* scripts to SMF control. Our simple example was complete, but it's impossible to explore all features of SMF and describe all elements of the XML manifest file within the scope of this document. The "See Also" section provides links to more information on creating an SMF manifest file and managing services with SMF. It also provides links to online man pages that document the manifest file format and SMF-related commands.

The author would like to acknowledge Brian Down (Management of Systems and Services Made Simple with the Oracle Solaris Service Management Facility, June 2010) and Rob Romack (Service Management Facility in the Solaris 10 Operating System, February 2006) for their work, which was used as background for this article.

See Also

The following resources are available for the Oracle Solaris Service Management Facility:

The following man page entries contain information on SMF and the service manifest file format:

And here are some Oracle Solaris 11 resources:

About the Author

Suzanne Zorn has over 20 years of experience as a writer and editor of technical documentation for the computer industry. Previously, Suzanne worked as a consultant for Sun Microsystems' Professional Services division specializing in system configuration and planning. Suzanne has a BS in computer science and engineering from Bucknell University and an MS in computer science from Rensselaer Polytechnic Institute.

Revision 1.1, 04/28/2014
