How to Use the Power Management Controls on SPARC Servers

by Bruce Evans, Julia Harper, and Terry Whatley, March 2012


How to manage power policies, power capping, and the power of idle devices on SPARC CMT Servers with the firmware CLI and BUI, as well as SNMP and IPMI tools.




Introduction

SPARC T-Series systems have power-saving features designed into the hardware and software. These features allow you to reduce server power consumption, which leads to a cost reduction for environmental cooling and reduced power usage by other infrastructure components. The SPARC T-Series power management (PM) interfaces make it easy to manage these PM features.



Power Management Features

This section describes power management policies, power capping, and device power management in Oracle Solaris 10.

Power Management Policies

There are two power management policies: performance and elastic. When the performance policy is enabled, all hardware power states are set to full power (unless power capping is enabled, as described in the next section). When the elastic policy is enabled, hardware power states are selected based on system utilization.

System power consumption can be reduced by tens to hundreds of watts depending on the system configuration. For example, on a SPARC T4-4 server with 256 gigabytes of memory, we measured a savings of 200 watts (17% of full power).

Use the performance policy when:

  • There are known periods during which full performance is needed, such as for financial market trading, end-of-month accounting, or system data backups.
  • Time-critical operations come at random intervals and don't always fully load the system.

Use the elastic policy when:

  • The system will be idle for a while, such as overnight or on the weekend.
  • Work comes and goes on the system, and small latencies in completing that work are tolerable for overall power savings.
  • The system runs flat out for long periods of time, with quiet times in between.
  • Electricity cost savings is a factor.

Power Capping

You can set a power consumption limit for the system.

Use a power cap when:

  • Electricity rates go up dramatically after a contracted power consumption is exceeded.
  • Only a limited amount of power is available for all server systems and it needs to be divided among the servers.
  • The power provider requests that consumption be curtailed due to high demand, such as during certain periods in hot summer months.

Power capping works with either the performance policy or the elastic policy.

Oracle Solaris 10: Device Power Management

In Oracle Solaris 10, device power management allows you to configure when to apply low power states to idle devices.

Note: This feature is no longer available in Oracle Solaris 11.

Use device power management when:

  • The system has disks that are not currently used or administered.
  • The system has a power-hungry display, such as a CRT.
  • The system has any other PM-capable devices, such as frame buffers, PCI, and so on, that are often idle.

Power saved by using device power management is in addition to the power savings from using the elastic policy or power capping.

Using Interfaces to Manage the Power Management Features

Below is a summary of the interfaces that allow you to enable the power management features you want. See the appendix for more details on how to access and configure each interface.

Power Management Policy

The PM policy is managed in ILOM under the /SP/powermgmt target. There are several ways to view or change the policy.

ILOM Command Line

Log in to the ILOM SP as root. Show and set the current policy, as follows. (The ILOM prompt is ->.)

-> show /SP/powermgmt policy
-> set /SP/powermgmt policy=elastic
-> set /SP/powermgmt policy=performance

ILOM BUI

This section shows how to set a PM policy using the ILOM BUI.

  1. From a Web browser with network connectivity to the SP, connect to https://<SP-IP-address>, and log in as user root.
  2. Navigate to the Power Management -> Settings tab.
  3. Select the desired power policy, and click Save.

Figure 1. ILOM BUI

SNMP

On a management system with network access to the ILOM SP, use the snmpget and snmpset commands shown below to read and set the PM policy using the SNMP MIB called SUN-HW-CTRL-MIB.

  • To get the policy, use:

    snmpget -v2c -cprivate <SP-IP-address> sunHwCtrlPowerMgmtPolicy.0
    
  • To enable the performance policy, use:

    snmpset -v2c -cprivate <SP-IP-address> sunHwCtrlPowerMgmtPolicy.0 = 3
    
  • To enable the elastic policy, use:

    snmpset -v2c -cprivate <SP-IP-address> sunHwCtrlPowerMgmtPolicy.0 = 4
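
The SNMP commands above can also be assembled from a script. The following Python sketch is illustrative only: the helper names are hypothetical, the community string private and the values 3 and 4 are taken from the commands above, and the "=" type specifier assumes a Net-SNMP client that has SUN-HW-CTRL-MIB loaded. It only builds the argument lists; nothing is sent to an SP.

```python
# Policy values as used in the snmpset examples above.
POLICY_VALUES = {"performance": 3, "elastic": 4}
POLICY_OID = "sunHwCtrlPowerMgmtPolicy.0"


def policy_get_cmd(sp_address, community="private"):
    """Build the argument list that reads the PM policy from one SP."""
    return ["snmpget", "-v2c", "-c", community, sp_address, POLICY_OID]


def policy_set_cmd(sp_address, policy, community="private"):
    """Build the argument list that sets the PM policy on one SP."""
    if policy not in POLICY_VALUES:
        raise ValueError("unknown policy: %s" % policy)
    # "=" asks Net-SNMP to take the value type from the loaded MIB
    # (assumption: the MIB is installed as described in the appendix).
    return ["snmpset", "-v2c", "-c", community, sp_address,
            POLICY_OID, "=", str(POLICY_VALUES[policy])]


if __name__ == "__main__":
    # Print the command instead of running it. To execute it, pass the
    # list to subprocess.run(). The address is a placeholder.
    print(" ".join(policy_set_cmd("192.0.2.10", "elastic")))
```

Building the command as a list rather than a single string avoids shell quoting problems when the same helper is looped over several SP addresses.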
    

IPMI

On a management system with network access to the ILOM SP, use the ipmitool commands shown below to read and set the PM policy. This requires version 1.8.9 or later of ipmitool. Provide the root password for the SP when prompted by the tool.

ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
	"show /SP/powermgmt"
ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
	"set /SP/powermgmt policy=performance"
ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
	"set /SP/powermgmt policy=elastic"

Oracle Enterprise Manager Ops Center

  1. From a Web browser, log in to your Oracle Enterprise Manager Ops Center console.
  2. From the left Navigation frame, expand the Assets section, select All Assets from the list, and then find and click the system of interest.
  3. From the right Actions frame, click the Set Power Policy link, select the desired policy option, and then click the Submit button.

Figure 2. Oracle Enterprise Manager Ops Center

Power Capping

The power cap is managed in ILOM under the /SP/powermgmt/budget target. There are several ways to view or change the budget. See the appendix for details about additional properties for advanced control.

ILOM Command Line

First, log in to the SP ILOM as root. Then use the following commands.

  • To show the power cap settings, use:

    -> show /SP/powermgmt/budget
    
  • To show the current power consumption, use:

    -> show /SP/powermgmt actual_power
    
  • To configure the pending power limit in watts (replacing 400 with a value that is appropriate for your environment), use:

    -> set /SP/powermgmt/budget pendingpowerlimit=400
    
  • To apply the pending values, use:

    -> set /SP/powermgmt/budget commitpending=true
    
  • To enable the configured power limit, use:

    -> set /SP/powermgmt/budget activation_state=enabled
    

ILOM BUI

This section shows how to set a power cap budget using the ILOM BUI.

  1. Connect to https://<SP-IP-address>, and log in.
  2. Navigate to the Power Management -> Consumption tab to view the current power usage.
  3. Navigate to the Power Management -> Limit tab.
  4. Select the Power Limiting checkbox to enable power capping.
  5. Set the power limit in watts in the Target Limit box.
  6. Save the settings.

Figure 3. ILOM BUI

SNMP

On a management system with network access to the ILOM SP, use the snmpget and snmpset commands shown below to read and set the power cap using the SNMP MIB called SUN-HW-CTRL-MIB.

  • To see if power capping is enabled or disabled, use:

    snmpget -v2c -cprivate <SP-IP-address> sunHwCtrlPowerMgmtBudget.0
    
  • To read the current power limit, use:

    snmpget -v2c -cprivate <SP-IP-address> \
        sunHwCtrlPowerMgmtBudgetPowerlimit.0
    
  • To configure a pending power limit in watts (replacing 500 with a value that is appropriate for your environment), use:

    snmpset -v2c -cprivate <SP-IP-address> \
        sunHwCtrlPowerMgmtBudgetPendingPowerlimit.0 = 500
    
  • To apply the pending values, use:

    snmpset -v2c -cprivate <SP-IP-address> \
        sunHwCtrlPowerMgmtBudgetCommitPending.0 = true
    
  • To enable the configured power limit, use:

    snmpset -v2c -cprivate <SP-IP-address> \
        sunHwCtrlPowerMgmtBudget.0 = enabled
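
The order of the three snmpset steps matters: configure the pending limit, commit it, then enable the cap. As a minimal sketch of that sequence (the function name and the placeholder address are hypothetical; the object names and values are the ones used in the commands above), a script could generate the commands like this:

```python
BUDGET_PREFIX = "sunHwCtrlPowerMgmtBudget"


def power_cap_cmds(sp_address, watts, community="private"):
    """Return the three snmpset argument lists in the order they must run."""
    if watts <= 0:
        raise ValueError("power limit must be a positive number of watts")
    base = ["snmpset", "-v2c", "-c", community, sp_address]
    return [
        # 1. Configure the pending power limit.
        base + [BUDGET_PREFIX + "PendingPowerlimit.0", "=", str(watts)],
        # 2. Apply the pending values.
        base + [BUDGET_PREFIX + "CommitPending.0", "=", "true"],
        # 3. Enable the configured power limit.
        base + [BUDGET_PREFIX + ".0", "=", "enabled"],
    ]


if __name__ == "__main__":
    # Print only; run each list with subprocess.run() against a real SP.
    for cmd in power_cap_cmds("192.0.2.10", 500):
        print(" ".join(cmd))
```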
    

IPMI

On a management system with network access to the ILOM SP, use the ipmitool commands shown below to read and set the power cap. ipmitool version 1.8.9 or later supports the sunoem cli command. Provide the root password for the SP when prompted by the tool.

  • To show current settings, use:

    ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
        "show /SP/powermgmt/budget"
    
  • To configure the pending power limit in watts (replacing 400 with a value that is appropriate for your environment), use:

    ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
        "set /SP/powermgmt/budget pendingpowerlimit=400"
    
  • To apply the pending values, use:

    ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
        "set /SP/powermgmt/budget commitpending=true"
    
  • To enable the configured power limit, use:

    ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
        "set /SP/powermgmt/budget activation_state=enabled"
    

Oracle Solaris 10: Device Power Management

Device PM is managed via the host CLI, using the pmconfig(1M) command and the /etc/power.conf(4) file. See the appendix for more details on how to configure device power management.

  1. Edit the power.conf file. See power.conf(4) for detailed information.

    autopm {enable | disable}
    system-threshold {always-on | <idle_time>}
    device-thresholds <physical_path1> {<idle_time> | always-on}
    ...
    device-thresholds <physical_pathx> {<idle_time> | always-on}
    cpu-threshold <idle_time>
    
  2. Enable the new settings in power.conf with this command:

    pmconfig
    

Use Cases and Scripts

Below are some examples of how to take advantage of the SPARC T-Series power management features.

Elastic Policy

Enable the elastic policy every evening at 5 p.m. and disable it at 7 a.m. on weekdays. Leave the elastic policy enabled during the weekend. This is accomplished with a ksh script and crontab for the time scheduling. The script uses the SNMP interface to communicate with the ILOM SP to either get or set the PM policy for one or more systems.

Here is a link to the policy.ksh script.
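
The schedule itself is simple enough to express as a single decision function. The sketch below is not the policy.ksh script; it is a hypothetical Python illustration of the rule described above (performance on weekdays from 7 a.m. to 5 p.m., elastic at all other times), which a cron job could call before setting the policy through any of the interfaces.

```python
from datetime import datetime


def desired_policy(now):
    """Return the PM policy that the schedule above calls for at time `now`."""
    is_weekday = now.weekday() < 5          # Monday is 0, Sunday is 6
    if is_weekday and 7 <= now.hour < 17:   # 7 a.m. up to, not including, 5 p.m.
        return "performance"
    return "elastic"                        # evenings, nights, and weekends


if __name__ == "__main__":
    print(desired_policy(datetime.now()))
```

Deriving the policy from the current time, instead of relying on two separate cron entries, means a system that was down at 5 p.m. still ends up with the right policy the next time the job runs.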

Power Capping

Enable power capping every weekday from 3 p.m. to 7 p.m. during the summer months of June, July, and August because electricity rates go up for high-demand hours. This is accomplished with a Perl script and crontab for the time scheduling. The script uses the SNMP interface to communicate with the ILOM SP to either configure and enable or disable the power cap for one or more systems.

Here is a link to the pwrcap.pl script.
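
As with the policy schedule, the capping window can be captured in one predicate. The following is a hypothetical Python sketch, not the pwrcap.pl script: it only decides whether the cap should be active (weekdays, 3 p.m. to 7 p.m., June through August) and leaves the actual SNMP calls to the caller.

```python
from datetime import datetime

SUMMER_MONTHS = (6, 7, 8)  # June, July, August


def cap_should_be_enabled(now):
    """True during the high-rate window described above."""
    return (now.month in SUMMER_MONTHS
            and now.weekday() < 5        # Monday to Friday
            and 15 <= now.hour < 19)     # 3 p.m. up to, not including, 7 p.m.


if __name__ == "__main__":
    state = "enable" if cap_should_be_enabled(datetime.now()) else "disable"
    print("power cap should be: " + state)
```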

Oracle Solaris 10: Device Power Management

When the power management policy is set to the performance policy, enable device power management, so that at least some type of power management is being done.

  1. First, /etc/power.conf must be edited to specify the devices that are to be managed and their thresholds.
  2. Once power.conf is edited to specify the devices, the only change that needs to be made to the file is to turn on or off device power management.

All these tasks are accomplished with a Python script and crontab for the time scheduling. The script will enable or disable autopm in the /etc/power.conf file.

Here is a link to the devicepm.py script.
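
For readers who want to see the core of such a script, here is a minimal sketch. It is not the devicepm.py script itself; the function name is hypothetical, and it assumes the autopm entry appears on a line of its own, as in the power.conf layout shown earlier. It rewrites only the text; writing the file back and running pmconfig are left to the caller.

```python
def set_autopm(conf_text, enabled):
    """Return power.conf text with the autopm entry switched on or off."""
    wanted = "autopm %s" % ("enable" if enabled else "disable")
    out = []
    found = False
    for line in conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "autopm":
            out.append(wanted)      # replace the existing entry
            found = True
        else:
            out.append(line)        # keep every other line unchanged
    if not found:
        out.append(wanted)          # no entry yet: append one
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "autopm disable\nsystem-threshold always-on\n"
    print(set_autopm(sample, True), end="")
```

After the edited text is written to /etc/power.conf, the new setting takes effect only once pmconfig is run.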

How Power Management Works: Focus on SPARC T4 Systems

This section provides a glossary, describes various hardware states and power management software, and provides an example of power consumption for the performance policy versus the elastic policy.

Glossary

Term: Description
Chip multithreading (CMT): A hardware technology that uses chip multiprocessing and hardware multithreading to allow simultaneous execution of multiple software threads on a processor.
CKE (ClocK Enable): A HW clock signal that HW components use for timing and operation.
Core: A HW unit on a CPU chip that contains strands.
HV (Hypervisor): The hyper-privileged firmware in sun4v.
ILOM (Integrated Lights Out Manager): The firmware on the SP.
Management Information Base (MIB): A collection of objects that describe an SNMP manageable entity.
PPFE (Pre-charge Power-down Fast Exit): A memory idle state mode with short latency to exit the idle state.
PPSE (Pre-charge Power-down Slow Exit): A memory idle state mode with long latency to exit the idle state.
SNMP (Simple Network Management Protocol): An Internet standard protocol for managing devices on IP networks.
SP (Service Processor): A small circuit board-based computer that includes both hardware and software to help control a larger computer system.
Strand: An execution unit within a core on the CPU chip.

Hardware Power States

There are several hardware features that can be applied to put components into lower power states:

  • Clock Cycle Skip. Clock Cycle Skip is a mechanism by which instruction execution is suppressed during some of the chip's clock cycles. This reduction in activity reduces the chip's power consumption. A SPARC T4 can be configured to skip one, two, three, or four of every eight clock cycles.
  • Core Disable. A core is disabled by turning off its clock input, which is a significant source of power savings. SPARC T4 systems have eight cores per CPU socket. Cores can be independently enabled and disabled. Each core has eight physical CPU units called strands. A core can be disabled only after all its strands are parked. Parking a strand stalls its instruction stream and removes it from participation in pipeline execution. Once a strand is parked, any external stimulus (such as an interrupt) is ignored.
  • Memory PPFE and PPSE. PPFE and PPSE are two modes for memory idle states. If either mode is enabled, the CKE signal is turned off to DIMMs that have no memory accesses queued. In both the PPFE and PPSE mode, the DIMM's I/O interface goes to standby, a state in which the clock signal is turned off. There is a small latency on first access to the DIMM as the I/O interface restarts. In PPSE mode, clock synchronization logic is also stopped. This saves more power, but resynchronizing the clock logic causes a longer first-access latency.
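
To make the Clock Cycle Skip settings concrete, the arithmetic can be sketched as follows. The function is a hypothetical illustration that computes only the share of clock cycles on which instructions may execute; it does not model the resulting power draw, which the article does not quantify per setting.

```python
from fractions import Fraction


def executing_fraction(skipped, window=8):
    """Share of clock cycles that allow execution when `skipped` of every
    `window` cycles are skipped (0 means cycle skipping is off)."""
    if skipped not in (0, 1, 2, 3, 4):
        raise ValueError("a SPARC T4 skips at most four of every eight cycles")
    return Fraction(window - skipped, window)


if __name__ == "__main__":
    for k in range(5):
        print(k, "skipped ->", executing_fraction(k), "of cycles execute")
```

At the deepest setting, four of eight cycles are skipped, so instructions can execute on half of the clock cycles.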

Power Management Software

Power management is provided by Oracle VM Server for SPARC, which makes use of the hardware power states to manage power. Below is a description of the behavior of Oracle VM Server for SPARC 2.2.

Elastic Policy Applied to CPUs

When elastic policy is enabled, Oracle VM Server for SPARC uses the Clock Cycle Skip feature to keep the CPU utilization of each guest within a target utilization range. Oracle VM Server for SPARC adjusts the cycle skip ratio to a level sufficient to address the utilization needs of all guests sharing the CPU chip. Oracle VM Server for SPARC also disables cores that have no strands assigned to guests.

Elastic Policy Applied to Memory

When elastic policy is enabled, Oracle VM Server for SPARC determines when to apply PPSE or PPFE to groups of DIMMs based on the utilization of those DIMMs. If utilization falls below the target range, Oracle VM Server for SPARC puts the DIMMs in PPSE mode. If utilization rises above the range, the DIMMs are put into PPFE mode. The same mode (PPFE or PPSE) must be applied to all DIMMs whose memory addresses are interleaved together. On a SPARC T4, there are two memory interleaves:

  • Rank low: All DIMMs attached to a CPU chip are interleaved together.
  • Rank high: A CPU chip's DIMMs are interleaved in two separately manageable groups.

The memory interleave is determined by the power management policy in effect when the system is booted from a powered-off state. The elastic policy causes DIMMs to be interleaved into two groups per CPU, whereas the performance policy interleaves all DIMMs under a CPU into one group. Utilization is less likely to fall below the target in the larger DIMM group, so memory power management will tend to be applied less often.
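
The mode selection described above amounts to a threshold rule with a dead band. The sketch below illustrates that rule only; it is not Oracle VM Server for SPARC code. The bounds of the target range are hypothetical parameters (the article does not give the actual values), and keeping the current mode while utilization is inside the range is an assumption.

```python
def next_memory_mode(utilization, current_mode, low, high):
    """Pick the idle mode for one interleaved DIMM group.

    utilization, low, and high are fractions between 0 and 1; low and high
    bound the (hypothetical) target utilization range.
    """
    if low > high:
        raise ValueError("low must not exceed high")
    if utilization < low:
        return "PPSE"        # lightly used: deeper savings, slower exit
    if utilization > high:
        return "PPFE"        # busy: shallower savings, faster exit
    return current_mode      # inside the range: leave the mode alone (assumption)


if __name__ == "__main__":
    print(next_memory_mode(0.05, "PPFE", low=0.2, high=0.6))
```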

Power Capping

Oracle VM Server for SPARC uses the Clock Cycle Skip feature to achieve a power cap. When power capping is enabled, the service processor monitors the system power consumption and tells Oracle VM Server for SPARC whether it must reduce power or it is allowed to increase power. Based on this notification, Oracle VM Server for SPARC either increases or decreases the cycle skip level of the CPU chips in the system. When a clock cycle skip level is selected by power capping, CPU power management cannot set a higher level for either the performance or the elastic policy.

Example: Performance Policy Versus Elastic Policy

We measured a SPARC T4-4 system with 256 gigabytes of memory, comparing the power consumption and behavior of the performance policy to the elastic policy. Figure 4 shows the results as a data warehousing workload is started and stopped on the system.

  • The red line shows the system power consumption in watts. This was measured using the ILOM sensor /SYS/VPS.
  • The light green line labeled "processor state" shows the cycle skip setting of the processors (how many cycles allow execution).
  • The dark green line labeled "# PPFE memgroups" shows how many of four memory regions are in PPFE mode.

Figure 4. Performance Policy Versus Elastic Policy

Figure 4 shows that when the performance policy is enabled, this system consumes about 1160 watts of power when idle. When the elastic policy is enabled, about 200 watts, or 17% of the system power, can be saved. Figure 4 also shows that when the workload is run while the elastic policy is enabled, the system quickly adjusts the power states of the hardware components to provide full performance. No watts are saved with the elastic policy while the workload runs, because it takes advantage of the full performance provided.
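
The percentage quoted above follows directly from the two measured figures. As a quick check of the arithmetic (the variable names are only for illustration):

```python
idle_performance_watts = 1160   # idle power with the performance policy
elastic_savings_watts = 200     # measured savings with the elastic policy

savings_percent = 100.0 * elastic_savings_watts / idle_performance_watts
idle_elastic_watts = idle_performance_watts - elastic_savings_watts

# Prints: 17% saved, about 960 W at idle
print("%.0f%% saved, about %d W at idle" % (savings_percent, idle_elastic_watts))
```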

Appendix: Accessing and Configuring the Interfaces

This appendix provides details about how to access and configure each interface.

Using the ILOM CLI to Set Power Cap Properties

This section shows the policy and power cap ILOM properties and guidelines for setting the power cap properties.

  1. Log in to the ILOM SP as a user with administrative (a) role. The ILOM prompt is ->.
  2. List the /SP/powermgmt target.

        -> show /SP/powermgmt
          /SP/powermgmt
             Targets:
                 budget
                 powerconf
             Properties:
                 actual_power = 272
                 permitted_power = 1045
                 allocated_power = 1045
                 available_power = 1200
                 threshold1 = 0
                 threshold2 = 0
                 policy = performance
             Commands:
                 cd
                 set
                 show
    

    The policy property shows the current power management policy.

  3. List the /SP/powermgmt/budget target. This shows the power capping configuration.

         -> show /SP/powermgmt/budget
          /SP/powermgmt/budget
             Targets:
             Properties:
                 activation_state = disabled
                 status = ok
                 powerlimit = 500 (watts)
                 timelimit = 100
                 violation_actions = none
                 min_powerlimit = 332
                 pendingpowerlimit = 500 (watts)
                 pendingtimelimit = 10
                 pendingviolation_actions = none
                 commitpending = (Cannot show property)
             Commands:
                 cd
                 set
                 show
    

    The powerlimit, timelimit, and violation_actions properties are the applied (active) values. These properties are applied to the system whenever activation_state is enabled. The properties starting with pending are values you can configure and then apply by setting commitpending to true. Once committed, the pending values become the applied values.

    The powerlimit determines the point at which the system enables power-savings features to cap power. The system caps power whenever the power consumption exceeds the powerlimit, and it removes the cap when power falls below the powerlimit. The powerlimit can be set to any value between the min_powerlimit property and the /SP/powermgmt allocated_power property. The best way to choose a meaningful powerlimit is to observe the power consumed by the system at idle and when running your workload, and then choose a value between idle power and close to or just above your normal workload power consumption.

    The timelimit is expressed in seconds. This property's default value is tuned for best behavior. Unless there is a specific requirement to set it lower, do not change it. Setting the timelimit to zero (0) tells the system to proactively cap power so that it never exceeds the powerlimit (this is called Hard Cap in the BUI). A timelimit of zero is not supported on SPARC systems and causes the system to generate a violation action as soon as the timelimit is activated.

    When the system cannot reduce power to the powerlimit within the specified timelimit, the limit is violated. The system continues to apply the maximum power capping it can when it is in this state. The violation_actions property, which can be set to either none or hardpoweroff, determines the action the system takes when the limit is violated. Rarely will you want to set it to hardpoweroff, because that will power off the system if:

    • The powerlimit is violated for longer than the timelimit.
    • The timelimit is zero (0) and the powerlimit is less than allocated_power.
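
    The guidelines above can be turned into a quick sanity check to run before committing pending values. This is a hypothetical sketch, not an ILOM feature: the function name and messages are made up for illustration, and the sample numbers in the usage lines come from the example output shown earlier (min_powerlimit = 332, allocated_power = 1045).

```python
def check_budget(powerlimit, timelimit, min_powerlimit, allocated_power,
                 violation_actions="none"):
    """Return a list of problems with a proposed power cap (empty if none)."""
    problems = []
    if not (min_powerlimit <= powerlimit <= allocated_power):
        problems.append("powerlimit must be between min_powerlimit "
                        "and allocated_power")
    if timelimit == 0:
        problems.append("timelimit=0 (hard cap) is not supported "
                        "on SPARC systems")
        if violation_actions == "hardpoweroff" and powerlimit < allocated_power:
            problems.append("hardpoweroff with timelimit=0 and a powerlimit "
                            "below allocated_power powers off the system")
    return problems


if __name__ == "__main__":
    for problem in check_budget(500, 0, 332, 1045, "hardpoweroff"):
        print("warning: " + problem)
```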

    If the timelimit is zero, the system will not boot until the powerlimit is set at or above the allocated_power value.

    -> set /SP/powermgmt/budget pendingpowerlimit=<watts>
    -> set /SP/powermgmt/budget pendingtimelimit=<seconds>  # Recommended to leave at default (10)
    -> set /SP/powermgmt/budget pendingviolation_actions=none # Recommended to leave at default (none)
    -> set /SP/powermgmt/budget commitpending=true
    

Configuring ILOM to Accept SNMP Commands

The SNMP interface is based on the open source net-SNMP client. The snmpget and snmpset commands are standard Oracle Solaris commands, which are documented in the snmpget man page.

In order to use SNMP to manage ILOM power management settings, you need to configure ILOM to accept SNMP commands. See "Configuring SNMP Settings in Oracle ILOM" in the ILOM documentation for details.

  1. Install the MIB(s). The MIB that contains the policy and power capping objects is SUN-HW-CTRL-MIB.
  2. Create a mibs directory to put all .mib files in, for example, /.snmp/mibs.
  3. Extract the MIB files from the host SP using the ILOM CLI. The MIBs are contained on the SP flash in a zip file. Download this file to an SNMP host if it was not previously downloaded.

    -> cd /SP/services/snmp/mibs
    -> set dump_uri=scp://<user>:<passwd>@<IP_snmp_host>/<full_path_to_mib_zip_file>
    
  4. Extract MIB files from the zip file on the SNMP host:

    unzip <full_path_to_mib_zip_file>
    
  5. Create or modify the snmp.conf config file. The recommendation is to put this file in the parent directory for the MIBs, for example, /.snmp.
  6. Add the following lines to snmp.conf:

    mibs    ALL
    mibdirs +/.snmp/mibs
    
  7. Broadcast where the MIB files are located.
    • There are several ways to do this. Read the snmpd man page or visit SNMP_CONFIG.
    • Recommendation: Set the SNMPCONFPATH environment variable, for example, SNMPCONFPATH="/.snmp/mibs".
  8. Install an SNMP client:
    • Recommendation: Install net-snmp.
    • The commands snmpget and snmpset need to be available from your SNMP client. Set $PATH to include the path to these SNMP commands, for example:

      PATH=$PATH:/usr/sfw/bin 
      
  9. Read the .mib file, SUN-HW-CTRL-MIB.mib.

    The SNMP object names for policy and power capping and their allowed values are in this file. You will need to supply the object name and values to the snmpget and snmpset commands.

  10. Set SNMP MIB properties for power cap advanced settings:
    1. Set a pending time limit. The value is given in milliseconds, so 20000 corresponds to 20 seconds:

      snmpset -v2c -cprivate <SP-IP-address> \
          sunHwCtrlPowerMgmtBudgetPendingTimelimit.0 = 20000
      
    2. Set a pending violation action:

      snmpset -v2c -cprivate <SP-IP-address> \
          sunHwCtrlPowerMgmtBudgetPendingTimelimitActions.0 = none
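
Because the ILOM CLI takes the time limit in seconds while the SNMP object in the example above takes milliseconds, a script that drives both interfaces needs a conversion step. A minimal sketch (the function name is hypothetical; the millisecond unit is taken from the step above):

```python
def pending_timelimit_cmd(sp_address, seconds, community="private"):
    """Build the snmpset argument list for a time limit given in seconds."""
    if seconds < 0:
        raise ValueError("time limit cannot be negative")
    milliseconds = int(seconds * 1000)   # the SNMP object expects milliseconds
    return ["snmpset", "-v2c", "-c", community, sp_address,
            "sunHwCtrlPowerMgmtBudgetPendingTimelimit.0", "=",
            str(milliseconds)]


if __name__ == "__main__":
    print(" ".join(pending_timelimit_cmd("192.0.2.10", 20)))
```

As with the power limit, a pending time limit takes effect only after the pending values are committed.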
      

Configuring IPMI

The IPMI interface is an open industry standard for management of server systems.

Your server might already have ipmitool installed. Version 1.8.9 is required for setting the power management policy and power capping. Check your version using the following command:

/usr/sbin/ipmitool -V
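
If the version check needs to be scripted across many management hosts, the comparison can be done on the version string the tool prints. The sketch below assumes the output contains a dotted three-part version such as "ipmitool version 1.8.11"; that format and the function name are assumptions, so adjust the pattern if your build prints something different.

```python
import re


def ipmitool_version_ok(version_output, minimum=(1, 8, 9)):
    """True if the version found in `version_output` is at least `minimum`."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        return False                      # no recognizable version string
    found = tuple(int(part) for part in match.groups())
    return found >= minimum               # tuples compare element by element


if __name__ == "__main__":
    print(ipmitool_version_ok("ipmitool version 1.8.11"))
```

Comparing integer tuples rather than strings matters here: as text, "1.8.11" sorts before "1.8.9".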

You can download the most recent version at this download page.

If you had to download a newer version, follow the README to build and install the new ipmitool. You must provide a password, either when prompted by the tool or in a file provided to the tool. See the man page for details.

In order to use IPMI to manage ILOM power management settings, you need to configure ILOM to accept IPMI commands. See "Before You Begin — ILOM and IPMItool Requirements" in the ILOM documentation for details.

Here are some details on how to set the power management policy and a power cap:

  • ipmitool has a raw mode where hex bytes are specified for the command parameters. You can use raw mode to set a power cap. See "Manage ILOM Power Budget Interfaces (IPMItool)" in the ILOM documentation for details.
  • The sunoem cli command option is a simpler way to set a power cap or power management policy. The results of using the sunoem cli command are the same as the results from using the ILOM command line interface.
  • To set a pending timeout that is specified in seconds, use the following command:

    ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
        "set /SP/powermgmt/budget pendingtimelimit=20"
    
  • To set a pending violation action, use the following command:

    ipmitool -I lan -H <SP-IP-address> -U root sunoem cli \
        "set /SP/powermgmt/budget pendingviolation_actions=none"
     

Configuring Oracle Solaris 10 Device Power Management

Device power management is managed via the CLI command pmconfig and the file /etc/power.conf.

  1. Edit /etc/power.conf to set the options for device PM.

    system-threshold {<idle_threshold> | always-on}
    
    • If idle_threshold is specified, its time value is used as the default idle threshold for all devices that do not have an overriding idle threshold set.
    • If always-on is specified, there is no default system idle threshold, and all devices that do not have an overriding idle threshold set will be left at full power.
       
  2. Use disable to turn off all device PM:

    autopm {enable | disable}
    
  3. Use the following entry to set a device-specific threshold:

    device-thresholds <physical_path> {<idle_threshold> | always-on}
    
    • See power.conf(4) for more complex configuration.
    • If an idle_threshold is specified, it overrides the system idle threshold for the specific device named.
    • If always-on is specified, this device is not power managed.

Here is an example for keeping a boot disk always powered on and nonboot disks power managed:

device-thresholds /dev/dsk/c1t0d0s0 always-on  # boot disk
device-thresholds /pci@8,600000/scsi@4/ssd@w210000203700c3ee,0  15s # 15 seconds idle threshold
device-thresholds /dev/dsk/c1t3d0s0 1m # 1 minute idle threshold
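
To audit what a set of device-thresholds entries actually means, the thresholds can be normalized to seconds. The following sketch is hypothetical and handles only the simple three-field form used in the example above; it assumes the unit suffixes h, m, and s, and treats a bare number as seconds. See power.conf(4) for the full syntax.

```python
UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def threshold_seconds(value):
    """Convert a threshold such as '15s' or '1m' to seconds.

    Returns None for always-on (the device is not power managed).
    """
    if value == "always-on":
        return None
    unit = value[-1]
    if unit in UNIT_SECONDS:
        return int(value[:-1]) * UNIT_SECONDS[unit]
    return int(value)   # no suffix: assume seconds


def parse_device_thresholds(conf_text):
    """Map each device path to its idle threshold in seconds (or None)."""
    result = {}
    for line in conf_text.splitlines():
        fields = line.split("#", 1)[0].split()   # drop trailing comments
        if len(fields) == 3 and fields[0] == "device-thresholds":
            result[fields[1]] = threshold_seconds(fields[2])
    return result


if __name__ == "__main__":
    sample = "device-thresholds /dev/dsk/c1t3d0s0 1m # 1 minute idle threshold\n"
    print(parse_device_thresholds(sample))
```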

Configuring the Oracle Enterprise Manager Ops Center Interface

See the Oracle Enterprise Manager Ops Center Quick Start Guide for a discussion about how to configure Oracle Enterprise Manager Ops Center so it will discover and manage your system. Also see the full Oracle Enterprise Manager Ops Center documentation set.


Revision 1.0, 02/27/2012
