Server and Storage Administration
By Subarna Ganguly and Jonathan Mellors, December 2011
This article walks through using the interactive scinstall utility to install and configure the Oracle Solaris Cluster software for two nodes, including the configuration of a quorum device. It does not cover the configuration of highly available services.
Note: For details about how to install and configure other Oracle Solaris Cluster software configurations, see the Oracle Solaris Cluster Software Installation Guide.
The interactive scinstall utility is menu-driven. The menus help reduce the chance of mistakes and promote best practices by using default values, by prompting you for information specific to your cluster, and by identifying invalid entries.
The scinstall utility also eliminates the need to manually configure a quorum device by automating quorum device configuration for the new cluster.
Note: This article refers to the Oracle Solaris Cluster 4.0 release. For more information about the latest Oracle Solaris Cluster release, see the release notes.
This section discusses some of the prerequisites, assumptions, and defaults that apply to the two-node cluster.
This article assumes that the following conditions are met:
Note: It is recommended, but not required, that you have console access to the nodes during cluster installation.

Figure 1. Oracle Solaris Cluster hardware configuration
This article assumes that Oracle Solaris 11 is installed on both systems.
You must have the logical names (host names) and IP addresses of the nodes that will be configured as the cluster. Add these entries to the /etc/inet/hosts file on each node, or to the naming service if a naming service such as DNS, NIS, or NIS+ maps is used.
The examples in this article use the NIS service and the configuration shown in Table 1.
Table 1. Configuration

| Component | Name | Interface | IP Address |
|---|---|---|---|
| Cluster name | phys-schost | — | — |
| Node 1 | phys-schost-1 | nge0 | 1.2.3.4 |
| Node 2 | phys-schost-2 | nge0 | 1.2.3.5 |
The scinstall interactive utility installs the Oracle Solaris Cluster software in Typical mode, which uses the following defaults:
Cluster transport switches named switch1 and switch2. (The example in this article has no cluster transport switches; instead, back-to-back cables provide the private network connections.)
In the example in this article, the private-interconnect interfaces on both cluster nodes are nge1 and e1000g1.
Perform the following steps.
Temporarily enable rsh or ssh access for root. Then check the /etc/inet/hosts file entries: if no other name-resolution service is provided, add the other node's name and IP address to this file. In our example, which uses the NIS service, the /etc/inet/hosts files look as follows.
On node 1:
# Internet host table
#
::1 phys-schost-1 localhost
127.0.0.1 phys-schost-1 localhost loghost
On node 2:
# Internet host table
#
::1 phys-schost-2 localhost
127.0.0.1 phys-schost-2 localhost loghost
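Before moving on, it can help to confirm that each node's hosts file really contains both node names from Table 1. The following is a minimal sketch, not part of the original procedure; the `check_hosts` helper name is hypothetical, and the demo runs against a temporary sample file rather than the live /etc/inet/hosts.

```shell
# Hypothetical helper: verify that both cluster node names appear in a hosts file.
check_hosts() {
  file=$1
  for name in phys-schost-1 phys-schost-2; do
    # -w matches whole words, so "phys-schost-1" does not match "phys-schost-10"
    grep -qw "$name" "$file" && echo "$name: found" || echo "$name: MISSING"
  done
}

# Demo against a sample file laid out like the article's example;
# on a real node you would pass /etc/inet/hosts instead.
sample=$(mktemp)
printf '1.2.3.4 phys-schost-1\n1.2.3.5 phys-schost-2\n' > "$sample"
check_hosts "$sample"
rm -f "$sample"
```

Run the same check on each node so that a typo in either hosts file is caught before scinstall tries to reach the other node by name.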
In our example, two disks are shared between the two nodes:
c0t600A0B800026FD7C000019B149CCCFAEd0
c0t600A0B800026FD7C000019D549D0A500d0
# format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
0. c4t0d0 <FUJITSU-MBB2073RCSUN72G-0505 cyl 8921 alt 2 hd 255 sec 63>
/pci@7b,0/pci1022,7458@11/pci1000,3060@2/sd@0,0
/dev/chassis/SYS/HD0/disk
1. c4t1d0 <SUN72G cyl 14084 alt 2 hd 24 sec 424>
/pci@7b,0/pci1022,7458@11/pci1000,3060@2/sd@1,0
/dev/chassis/SYS/HD1/disk
2. c0t600A0B800026FD7C000019B149CCCFAEd0 <SUN-CSM200_R-0660 cyl 2607 alt 2 hd 255 sec 63>
/scsi_vhci/disk@g600a0b800026fd7c000019b149cccfae
3. c0t600A0B800026FD7C000019D549D0A500d0 <SUN-CSM200_R-0660 cyl 2607 alt 2 hd 255 sec 63>
/scsi_vhci/disk@g600a0b800026fdb600001a0449d0a6d3
# more /etc/release
Oracle Solaris 11 11/11 X86
Copyright (c) 1983, 2011, Oracle and/or its affiliates. All rights reserved.
Assembled 26 September 2011
Check whether the nodes use static network configuration or automatically configured addresses (addresses of type addrconf, as displayed by the command ipadm show-addr -o all). If the nodes are configured with static addresses, skip ahead to the section on configuring the Oracle Solaris Cluster publisher. Otherwise, continue with this procedure and do the following:
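The decision above can be scripted by scanning the `ipadm show-addr -o all` output for non-static address types. This is a hedged sketch: the `find_auto_addrs` name is invented, the column layout (TYPE in column 2) is assumed from typical ipadm output, and the demo parses canned sample text rather than a live system.

```shell
# Hypothetical helper: print address objects whose TYPE column indicates
# automatic configuration (addrconf or dhcp) in ipadm-style output on stdin.
find_auto_addrs() {
  awk 'NR > 1 && ($2 == "addrconf" || $2 == "dhcp") { print $1 }'
}

# Demo on sample output resembling `ipadm show-addr -o all`;
# on a real node: ipadm show-addr -o all | find_auto_addrs
find_auto_addrs <<'EOF'
ADDROBJ TYPE STATE ADDR
lo0/v4 static ok 127.0.0.1/8
net0/v4 addrconf ok 1.2.3.4/24
EOF
```

If the helper prints nothing, the node is statically configured and you can skip to configuring the publisher.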
# netadm enable -p ncp defaultfixed
Enabling ncp 'DefaultFixed'
Sep 27 08:19:19 phys-schost-1 in.ndpd[1038]: Interface net0 has been removed from kernel. in.ndpd will no longer use it
Sep 27 08:19:19 phys-schost-1 in.ndpd[1038]: Interface net1 has been removed from kernel. in.ndpd will no longer use it
Sep 27 08:19:19 phys-schost-1 in.ndpd[1038]: Interface net2 has been removed from kernel. in.ndpd will no longer use it
Sep 27 08:19:20 phys-schost-1 in.ndpd[1038]: Interface net3 has been removed from kernel. in.ndpd will no longer use it
Sep 27 08:19:20 phys-schost-1 in.ndpd[1038]: Interface net4 has been removed from kernel. in.ndpd will no longer use it
Sep 27 08:19:20 phys-schost-1 in.ndpd[1038]: Interface net5 has been removed from kernel. in.ndpd will no longer use it
# svccfg -s svc:/network/nis/domain setprop config/domainname = hostname: nisdomain.example.com
# svccfg -s svc:/network/nis/domain:default refresh
# svcadm enable svc:/network/nis/domain:default
# svcadm enable svc:/network/nis/client:default
# /usr/sbin/svccfg -s svc:/system/name-service/switch setprop config/host = astring: "files nis"
# /usr/sbin/svccfg -s svc:/system/name-service/switch setprop config/netmask = astring: "files nis"
# /usr/sbin/svccfg -s svc:/system/name-service/switch setprop config/automount = astring: "files nis"
# /usr/sbin/svcadm refresh svc:/system/name-service/switch
# ypinit -c
Create a backup boot environment that preserves the current Oracle Solaris 11 environment before the cluster software is installed:

# beadm create Pre-Cluster-s11
# beadm list
BE              Active Mountpoint Space  Policy Created
--              ------ ---------- -----  ------ -------
Pre-Cluster-s11 -      -          179.0K static 2011-09-27 08:51
s11             NR     /          4.06G  static 2011-09-26 08:50
There are two main ways to access the Oracle Solaris Cluster package repository, depending on whether the cluster nodes have direct (or web-proxy) access to the Internet:
To access the Oracle Solaris Cluster release repository or support repository, obtain the SSL public and private keys as follows:
Configure the ha-cluster publisher to point to the selected repository URL on pkg.oracle.com. The following example uses the release repository:

# pkg set-publisher \
-k /var/pkg/ssl/Oracle_Solaris_Cluster_4.0.key.pem \
-c /var/pkg/ssl/Oracle_Solaris_Cluster_4.0.certificate.pem \
-g https://pkg.oracle.com/ha-cluster/release/ ha-cluster
To access a local copy of the Oracle Solaris Cluster release repository or support repository, download the repository image as follows.

# lofiadm -a /tmp/osc4.0-repo-full.iso
/dev/lofi/1
# mount -F hsfs /dev/lofi/1 /mnt
# rsync -aP /mnt/repo /export
# share /export/repo
Then configure the ha-cluster publisher. The following example uses node 1 as the system that shares the local copy of the repository:

# pkg set-publisher -g file:///net/phys-schost-1/export/repo ha-cluster

Verify that both the solaris and ha-cluster publishers are valid; if the ha-cluster packages cannot access the Oracle Solaris publisher, installation of the package can fail.

# pkg publisher
PUBLISHER   TYPE   STATUS  URI
solaris     origin online  <solaris repository>
ha-cluster  origin online  <ha-cluster repository>
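This publisher check can be automated so a missing or offline publisher stops the install early. The sketch below is illustrative only: the `check_publishers` name is invented, the column layout of `pkg publisher` output is assumed, and the demo feeds in canned text instead of running pkg.

```shell
# Hypothetical helper: confirm that the solaris and ha-cluster publishers
# both report STATUS "online" in `pkg publisher`-style output read on stdin.
check_publishers() {
  out=$(cat)
  for pub in solaris ha-cluster; do
    printf '%s\n' "$out" | grep -Eq "^${pub}[[:space:]].*online" || {
      echo "publisher ${pub} not online" >&2
      return 1
    }
  done
  echo "publishers ok"
}

# Demo; on a real node: pkg publisher | check_publishers
check_publishers <<'EOF'
PUBLISHER TYPE STATUS URI
solaris origin online http://pkg.example.com/solaris
ha-cluster origin online http://pkg.example.com/ha-cluster
EOF
```

The non-zero return status makes the helper usable as a gate in a larger install script.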
Install the ha-cluster-full group package, as shown in Listing 4.
# pkg install ha-cluster-full
Packages to install: 68
Create boot environment: No
Create backup boot environment: Yes
Services to change: 1
DOWNLOAD PKGS FILES XFER (MB)
Completed 68/68 6456/6456 48.5/48.5
PHASE ACTIONS
Install Phase 8928/8928
PHASE ITEMS
Package State Update Phase 68/68
Image State Update Phase 2/2
Loading smf(5) service descriptions: 9/9
Loading smf(5) service descriptions: 57/57
On node 1, run the following command.

# dladm show-phys
LINK   MEDIA     STATE    SPEED  DUPLEX   DEVICE
net3   Ethernet  unknown  0      unknown  e1000g1
net0   Ethernet  up       1000   full     nge0
net4   Ethernet  unknown  0      unknown  e1000g2
net2   Ethernet  unknown  0      unknown  e1000g0
net1   Ethernet  unknown  0      unknown  nge1
net5   Ethernet  unknown  0      unknown  e1000g3
On node 2, run the following command.

# dladm show-phys
LINK   MEDIA     STATE    SPEED  DUPLEX   DEVICE
net3   Ethernet  unknown  0      unknown  e1000g1
net0   Ethernet  up       1000   full     nge0
net4   Ethernet  unknown  0      unknown  e1000g2
net2   Ethernet  unknown  0      unknown  e1000g0
net1   Ethernet  unknown  0      unknown  nge1
net5   Ethernet  unknown  0      unknown  e1000g3
In our example, net1 and net3 will be used on each node as the private interconnect.
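The adapter choice above can be cross-checked against the `dladm show-phys` listings: the public-network link is already up, so links that are cabled but not up are the natural interconnect candidates. A sketch of that heuristic follows; the `interconnect_candidates` name is invented and the column positions (LINK in column 1, STATE in column 3) are assumed from the output shown above.

```shell
# Hypothetical helper: from `dladm show-phys`-style output on stdin, list
# links whose STATE is not "up" as private-interconnect candidates.
interconnect_candidates() {
  awk 'NR > 1 && $3 != "up" { print $1 }'
}

# Demo on an abridged version of the node-1 output shown above;
# on a real node: dladm show-phys | interconnect_candidates
interconnect_candidates <<'EOF'
LINK MEDIA STATE SPEED DUPLEX DEVICE
net3 Ethernet unknown 0 unknown e1000g1
net0 Ethernet up 1000 full nge0
net1 Ethernet unknown 0 unknown nge1
EOF
```

This is only a shortlist; physically confirm which candidates are cabled back-to-back before selecting them in scinstall.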
Verify that no services are in the maintenance state on either node:

# svcs -x

Verify that the service network/rpc/bind:default has its local_only configuration set to false:

# svcprop network/rpc/bind:default | grep local_only
config/local_only boolean false

If it is not set to false, set it as follows:

# svccfg
svc:> select network/rpc/bind
svc:/network/rpc/bind> setprop config/local_only=false
svc:/network/rpc/bind> quit
# svcadm refresh network/rpc/bind:default
# svcprop network/rpc/bind:default | grep local_only
config/local_only boolean false
Run the scinstall command to start the Oracle Solaris Cluster configuration utility (this command can alternatively be run on the other node to configure the software), and then type 1 at the Main Menu to select Create a new cluster or add a cluster node. In the example shown in Listing 5, the command is run on the second node, phys-schost-2.
Listing 5. Running the scinstall Command
# /usr/cluster/bin/scinstall
*** Main Menu ***
Please select from one of the following (*) options:
* 1) Create a new cluster or add a cluster node
* 2) Print release information for this cluster node
* ?) Help with menu options
* q) Quit
Option: 1
*** Create a New Cluster ***
This option creates and configures a new cluster.
Press Control-D at any time to return to the Main Menu.
Do you want to continue (yes/no) [yes]?
Checking the value of property "local_only" of service svc:/network/rpc/bind
...
Property "local_only" of service svc:/network/rpc/bind is already correctly
set to "false" on this node.
Press Enter to continue:
>>> Typical or Custom Mode <<<
This tool supports two modes of operation, Typical mode and Custom
mode. For most clusters, you can use Typical mode. However, you might
need to select the Custom mode option if not all of the Typical mode
defaults can be applied to your cluster.
For more information about the differences between Typical and Custom
modes, select the Help option from the menu.
Please select from one of the following options:
1) Typical
2) Custom
?) Help
q) Return to the Main Menu
Option [1]: 1
Next, provide the name of the cluster (in this example, phys-schost).
>>> Cluster Name <<<
Each cluster has a name assigned to it. The name can be made up of any
characters other than whitespace. Each cluster name should be unique
within the namespace of your enterprise.
What is the name of the cluster you want to establish? phys-schost
Enter the name of the other node (phys-schost-1), press Control-D to finish the node list, and answer yes to confirm the list, as shown in Listing 8.
>>> Cluster Nodes <<<
This Oracle Solaris Cluster release supports a total of up to 16
nodes.
List the names of the other nodes planned for the initial cluster
configuration. List one node name per line. When finished, type
Control-D:
Node name (Control-D to finish): phys-schost-1
Node name (Control-D to finish):
^D
This is the complete list of nodes:
phys-schost-2
phys-schost-1
Is it correct (yes/no) [yes]?
Select the transport adapters net1 and net3, which were identified earlier. If the tool detects network traffic on these interfaces, it asks you to confirm using them anyway. Make sure the interfaces are not connected to any other network, and then confirm their use as transport adapters, as shown in Listing 9.
>>> Cluster Transport Adapters and Cables <<<
You must identify the cluster transport adapters which attach this
node to the private cluster interconnect.
Select the first cluster transport adapter:
1) net1
2) net2
3) net3
4) net4
5) net5
6) Other
Option: 1
Searching for any unexpected network traffic on "net1" ... done
Unexpected network traffic was seen on "net1".
"net1" may be cabled to a public network.
Do you want to use "net1" anyway (yes/no) [no]? yes
Select the second cluster transport adapter:
1) net1
2) net2
3) net3
4) net4
5) net5
6) Other
Option: 3
Searching for any unexpected network traffic on "net3" ... done
Unexpected network traffic was seen on "net3".
"net3" may be cabled to a public network.
Do you want to use "net3" anyway (yes/no) [no]? yes
>>> Quorum Configuration <<<
Every two-node cluster requires at least one quorum device. By
default, scinstall selects and configures a shared disk quorum device
for you.
This screen allows you to disable the automatic selection and
configuration of a quorum device.
You have chosen to turn on the global fencing. If your shared storage
devices do not support SCSI, such as Serial Advanced Technology
Attachment (SATA) disks, or if your shared disks do not support
SCSI-2, you must disable this feature.
If you disable automatic quorum device selection now, or if you intend
to use a quorum device that is not a shared disk, you must instead use
clsetup(1M) to manually configure quorum once both nodes have joined
the cluster for the first time.
Do you want to disable automatic quorum device selection (yes/no) [no]?
Is it okay to create the new cluster (yes/no) [yes]?
During the cluster creation process, cluster check is run on each of the new cluster nodes.
If cluster check detects problems, you can either interrupt the process or check the log
files after the cluster has been established.
Interrupt cluster creation for cluster check errors (yes/no) [no]?
Listing 11 shows the final output, which reports the node configuration and the name of the installation log file. The utility then reboots each node in cluster mode.
Listing 11. Node Configuration Details
Cluster Creation
Log file - /var/cluster/logs/install/scinstall.log.3386
Configuring global device using lofi on phys-schost-1: done
Starting discovery of the cluster transport configuration.
The following connections were discovered:
phys-schost-2:net1 switch1 phys-schost-1:net1
phys-schost-2:net3 switch2 phys-schost-1:net3
Completed discovery of the cluster transport configuration.
Started cluster check on "phys-schost-2".
Started cluster check on "phys-schost-1".
.
.
.
Refer to the log file for details.
The name of the log file is /var/cluster/logs/install/scinstall.log.3386.
Configuring "phys-schost-1" ... done
Rebooting "phys-schost-1" ...
Configuring "phys-schost-2" ...
Rebooting "phys-schost-2" ...
Log file - /var/cluster/logs/install/scinstall.log.3386
When the scinstall utility finishes, the installation and configuration of the basic Oracle Solaris Cluster software is complete. The cluster is now ready for you to configure the components that will support highly available applications. These cluster components can include device groups, cluster file systems, highly available local file systems, individual data services, and zone clusters. To configure these components, consult the documentation library.

On each node, verify that no services are impaired and that the multi-user milestone is online:

# svcs -x
# svcs multi-user-server
STATE STIME FMRI
online 9:58:44 svc:/milestone/multi-user-server:default

Then, from one node, check the overall cluster status:
# cluster status
=== Cluster Nodes ===
--- Node Status ---
Node Name Status
--------- ------
phys-schost-1 Online
phys-schost-2 Online
=== Cluster Transport Paths ===
Endpoint1 Endpoint2 Status
--------- -------- ------
phys-schost-1:net3 phys-schost-2:net3 Path online
phys-schost-1:net1 phys-schost-2:net1 Path online
=== Cluster Quorum ===
--- Quorum Votes Summary from (latest node reconfiguration) ---
Needed Present Possible
------ ------- --------
2 3 3
--- Quorum Votes by Node (current status) ---
Node Name Present Possible Status
--------- ------- -------- ------
phys-schost-1 1 1 Online
phys-schost-2 1 1 Online
--- Quorum Votes by Device (current status) ---
Device Name Present Possible Status
----------- ------- -------- ------
d1 1 1 Online
=== Cluster Device Groups ===
--- Device Group Status ---
Device Group Name Primary Secondary Status
----------------- ------- --------- ------
--- Spare, Inactive, and In Transition Nodes ---
Device Group Name Spare Nodes Inactive Nodes In Transition Nodes
----------------- ----------- -------------- --------------------
--- Multi-owner Device Group Status ---
Device Group Name Node Name Status
----------------- --------- ------
=== Cluster Resource Groups ===
Group Name Node Name Suspended State
---------- --------- --------- -----
=== Cluster Resources ===
Resource Name Node Name State Status Message
------------- --------- ----- --------------
=== Cluster DID Devices ===
Device Instance Node Status
--------------- ---- ------
/dev/did/rdsk/d1 phys-schost-1 Ok
phys-schost-2 Ok
/dev/did/rdsk/d2 phys-schost-1 Ok
phys-schost-2 Ok
/dev/did/rdsk/d3 phys-schost-1 Ok
/dev/did/rdsk/d4 phys-schost-1 Ok
/dev/did/rdsk/d5 phys-schost-2 Ok
/dev/did/rdsk/d6 phys-schost-2 Ok
=== Zone Clusters ===
--- Zone Cluster Status ---
Name Node Name Zone HostName Status Zone Status
---- --------- ------------- ------ -----------
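The quorum summary in the status output above (Needed 2, Present 3, Possible 3) follows the usual majority rule: two node votes plus one shared-disk quorum device vote give three possible votes, and the cluster needs a majority of them to stay up. A quick arithmetic sketch, with a hypothetical helper name:

```shell
# Majority rule: votes needed = floor(possible / 2) + 1.
quorum_needed() {
  echo $(( $1 / 2 + 1 ))
}

# Two nodes + one quorum device = 3 possible votes.
quorum_needed 3   # prints 2
```

This is also why a two-node cluster requires a quorum device at all: with only two votes, `quorum_needed 2` is 2, so the surviving node alone could never hold quorum after a node failure.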
Now we will create a failover resource group that contains a LogicalHostname resource for the highly available network resource and an HAStoragePlus resource for the highly available ZFS file system in a zpool.
Add the logical hostname to the /etc/inet/hosts file on both nodes. In the following example, schost-lh is used as the logical hostname for the resource group. This resource is of type SUNW.LogicalHostname, which is a preregistered resource type.
On node 1:
# Internet host table
#
::1 localhost
127.0.0.1 localhost loghost
1.2.3.4 phys-schost-1 # Cluster Node
1.2.3.5 phys-schost-2 # Cluster Node
1.2.3.6 schost-lh
On node 2:
# Internet host table
#
::1 localhost
127.0.0.1 localhost loghost
1.2.3.4 phys-schost-1 # Cluster Node
1.2.3.5 phys-schost-2 # Cluster Node
1.2.3.6 schost-lh
Create a mirrored zpool on the shared disks /dev/did/rdsk/d1s0 and /dev/did/rdsk/d2s0. In our example, the format utility was already used to assign each whole disk to slice 0.

# zpool create -m /zfs1 pool1 mirror /dev/did/dsk/d1s0 /dev/did/dsk/d2s0
# df -k /zfs1
Filesystem 1024-blocks Used Available Capacity Mounted on
pool1 20514816 31 20514722 1% /zfs1

The zpool will now be placed in a highly available resource group as a resource of type SUNW.HAStoragePlus. This resource type must be registered before its first use.
# /usr/cluster/bin/clrg create test-rg
# /usr/cluster/bin/clrslh create -g test-rg -h schost-lh schost-lhres
# /usr/cluster/bin/clrt register SUNW.HAStoragePlus
# /usr/cluster/bin/clrs create -g test-rg -t SUNW.HAStoragePlus -p \
zpools=pool1 hasp-res
# /usr/cluster/bin/clrg online -eM test-rg
# /usr/cluster/bin/clrg status
=== Cluster Resource Groups ===
Group Name Node Name Suspended Status
---------- -------- --------- ------
test-rg phys-schost-1 No Online
phys-schost-2 No Offline
# /usr/cluster/bin/clrs status
=== Cluster Resources ===
Resource Name Node Name State Status Message
------------- --------- ----- --------------
hasp-res phys-schost-1 Online Online
phys-schost-2 Offline Offline
schost-lhres phys-schost-1 Online Online - LogicalHostname online.
phys-schost-2 Offline Offline
The status above shows that the resource group and its resources are online on node 1. Next, switch the resource group to node 2 and verify the status.
# /usr/cluster/bin/clrg switch -n phys-schost-2 test-rg
# /usr/cluster/bin/clrg status
=== Cluster Resource Groups ===
Group Name Node Name Suspended Status
---------- --------- --------- ------
test-rg phys-schost-1 No Offline
phys-schost-2 No Online
# /usr/cluster/bin/clrs status
=== Cluster Resources ===
Resource Name Node Name State Status Message
------------- --------- ----- --------------
hasp-res phys-schost-1 Offline Offline
phys-schost-2 Online Online
schost-lhres phys-schost-1 Offline Offline - LogicalHostname offline.
phys-schost-2 Online Online - LogicalHostname online.
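A quick way to read `clrs status` listings like those above is to filter for the rows whose State column is Online. The sketch below assumes the column layout shown in the listings (a row names the resource in column 1 when the state is in column 3; continuation rows for the other node start with the node name); the `online_resources` helper name is invented.

```shell
# Hypothetical helper: report where each resource is online, from
# `clrs status`-style output on stdin.
online_resources() {
  awk '$3 == "Online" { print $1, "is online on", $2 }'
}

# Demo on an abridged version of the post-switchover listing above;
# on a real node: /usr/cluster/bin/clrs status | online_resources
online_resources <<'EOF'
hasp-res phys-schost-1 Offline Offline
 phys-schost-2 Online Online
schost-lhres phys-schost-1 Offline Offline - LogicalHostname offline.
 phys-schost-2 Online Online - LogicalHostname online.
EOF
```

Continuation rows (those starting with a node name) have the state in column 2, not column 3, so they are skipped and each resource is reported once.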
This article described how to install and configure a two-node cluster with Oracle Solaris Cluster 4.0 on Oracle Solaris 11. It also showed how to verify that the cluster works correctly by creating and bringing online two resources on one node and then switching those resources over to the second node.
For more information about how to configure Oracle Solaris Cluster components, see the resources listed in Table 2.
Table 2. Resources

| Resource | URL |
|---|---|
| Oracle Solaris Cluster 4.0 documentation library | http://www.oracle.com/pls/topic/lookup?ctx=E23623 |
| Oracle Solaris Cluster Software Installation Guide | http://www.oracle.com/pls/topic/lookup?ctx=E23623&id=CLIST |
| Oracle Solaris Cluster Data Services Planning and Administration Guide | http://www.oracle.com/pls/topic/lookup?ctx=E23623&id=CLDAG |
| Oracle Solaris Cluster 4.0 Release Notes | http://www.oracle.com/pls/topic/lookup?ctx=E23623&id=CLREL |
| Oracle Solaris Cluster training | http://www.oracle.com/technetwork/cn/server-storage/solaris-cluster/training/index.html |
| Oracle Solaris Cluster downloads | http://www.oracle.com/technetwork/cn/server-storage/solaris-cluster/downloads/index.html |
Revision 1.0, December 2, 2011