Oracle Application Server Clusters
Infrastructure
Web Cache
OHS
OPMN
OC4J-Servlets
OC4J-EJB
Identity Management
Mod_plsql and Portal
Forms
Reports
Backup & Recovery
Disaster Recovery
Oracle Application Server Clusters
1.1 Does Oracle Application
Server support clusters that live in a file-based farm? In other words, can I cluster
OC4J instances without having an infrastructure database installation?
Yes. An infrastructure install is no longer required
for Oracle Application Server clustering. If you are using an infrastructure
anyway, you can of course leverage it, as in 9.0.2, for Oracle Application
Server clustering. All clustering features are identical regardless of
the type of repository being used. In addition, if your OC4J and OHS are
on different machines and you want to route from OHS to a cluster of
OC4J instances, you can set up the appropriate routing through manual
configuration of mod_oc4j.
1.2 Is the machine containing
the repository file in case of file based Oracle Application Server clusters
a single point of failure?
No. The file based repository is on one machine, but
each Oracle Application Server instance has all of its configuration data
in its cache. If the machine housing the repository is irrecoverably damaged,
you can designate any other machine to take its place as the repository
host. All instances can then be joined to this new repository.
The clusters or other combination of instances will need to be re-defined.
However, since the configuration data in the repository is not used during
steady state operation of a site, there is no runtime impact if the repository
is unavailable for a short while.
1.3 Can OC4J instances
in non-J2EE install types be clustered?
Due to an implementation limitation, only J2EE and Web
Cache installations can have their configuration replicated automatically by the Distributed Configuration Management (OracleAS DCM) tool and OracleAS Control. However,
mod_oc4j can be manually configured to route from an OHS to multiple
OC4J instances, and those instances can replicate state between them regardless of the install type they reside in.
1.4 Can I create an
OracleAS Cluster comprising OracleAS instances installed on separate
platforms (say, Solaris and Linux)? In other words, are heterogeneous OracleAS
clusters supported?
To be clusterable, all application server instances that are to be members of an OracleAS Cluster must be installed on the same operating system type (this means that different variants of UNIX can participate in the same cluster).
2.1 Can the Infrastructure
Metadata Repository be "RAC-enabled"?
Yes. Using Repository Creation Assistant, the Metadata
Repository (MR) can be created in an existing RAC database. But note
that even if the MR schemas are created in an existing database,
an Infrastructure still needs to be installed (to run SSO, DAS and other
non-database components of the Infrastructure). See Oracle Application
Server Administrator's Guide for further details.
2.2 Can an existing database
be leveraged for the Metadata Repository?
Yes, in 9.0.4 an existing database can be leveraged for
the metadata repository. Additionally, this existing database may be a
RAC database. See http://certify.oracle.com
for versions of the database that are certified for hosting the repository.
2.3 Can the Infrastructure
(both Metadata Repository database and the Identity Management
services) be run on a hardware cluster?
Yes, the Infrastructure can be run on Cold Failover Clusters (CFC)
using standard clusterware such as Sun Cluster, HP MC/Service Guard, Red
Hat Cluster Manager, etc. (see the certification matrix for a list of certified clusterware for each platform). This is an Active/Passive cluster solution and
is fully supported. For even higher availability and for scalability,
the Infrastructure can be deployed on Active Failover Clusters (AFC). This
is an Active/Active solution.
2.4 For AFC and CFC, which clusterware
do you support?
For a list of supported clusterware, see the certification
matrix. Please note that
AFC is currently a Limited Release feature. See questions 2.3 and 2.4
above.
2.5 Are there alternate solutions
for AFC on the platforms on which AFC is not supported?
Since AFC is only supported on a limited number of
platforms, Oracle has certified alternate solutions, such as Multi-Box Identity Management, that provide the same Active/Active capabilities as an Active Failover Cluster.
This solution requires:
- using RepCA to install the Metadata Repository and OID data into
an existing RAC database
- installing multiple Active/Active IMMTs (Identity Management Middle
Tiers that contain just SSO, DAS, DIP and LDAP, but no database) on
separate non-clustered commodity boxes, and
- using a load balancer to distribute incoming Identity Management
(IM) requests to the multiple IM boxes
The net result is still an Active/Active configuration for the Infrastructure
(like AFC), but this solution will work on any platform that supports
RAC. This configuration has been called Multi-box Identity Management. Visit the High Availability area on OTN for a detailed description of the steps necessary to configure this topology.
2.7 Is there a limit to the
number of cluster nodes used for AFC?
Yes, AFC is limited to a 2-node cluster at this time.
2.8 Must one use an external
hardware load balancer for AFC deployment?
Yes, this is a requirement. The load balancer is required
to distribute incoming HTTP and LDAP requests from Middle Tier components
to the different active OHS and LDAP servers.
2.9 I have an existing RAC
database on a hardware cluster. Can I install the Infrastructure using
CFC on the same hardware cluster?
Yes, as long as CFC and RAC are both certified with, and use, the same cluster manager (this automatically excludes Red Hat Enterprise Linux, since CFC is only supported on Red Hat Cluster Manager and RAC is not certified with it).
2.10 I have an existing RAC
database on a hardware cluster. Can I install MR and OID data in this
RAC database and then install IM-only (SSO/LDAP/DAS) using CFC on
the same hardware cluster?
Yes (with the same restrictions on Linux as in 2.9).
2.11 I have an existing RAC
database on a hardware cluster. Can I install the Infrastructure using
AFC on the same hardware cluster?
Yes, as long as there is no 64-bit RAC database on
this cluster (this is because AFC installs a 32-bit RAC database and a
32-bit RAC database cannot coexist with a 64-bit RAC database).
2.12 I have an existing RAC
database on a hardware cluster. Can I install MR and OID data in this
RAC database and then install IM-only (SSO/LDAP/DAS) using AFC on
the same hardware cluster?
No, this configuration is not supported.
3.1 Does Web Cache improve
availability?
Yes. Since it can cache the results from the back-end application
servers, Web Cache can provide responses even when the back-end servers
are temporarily down. However, it is generally not a good way to provide
availability for longer downtime durations. It may, however, be adequate
for largely static requests.
3.2 What benefits does
a Web Cache cluster bring?
Clustering Web Cache instances provides a larger pool
of memory in which to store the cached page objects. Browsers, however,
still need an IP address (whether mapped by DNS or directly entered into
the address bar by the end users) to route the HTTP requests to. A Web
Cache cluster is a way of sharing the cached objects (e.g., web pages) across
multiple Web Cache instances; it does not provide a way to share
the IP addresses. There is some improvement in availability (since
each instance now has to go to the back-end application server fewer times).
3.3 Is an External Load
Balancer required when using a Web Cache cluster? Can Windows Network Load Balancing be used instead?
Yes, an external load balancer is recommended for multiple
Web Cache instances. This provides a single IP address for the browser
to connect to. Alternatively, the Windows Network Load Balancing mechanism
may be used to make multiple Windows machines hosting Web Cache instances
appear as one IP address to the browser. However, the scalability of the latter
approach is limited, and hence an external load balancer is the better approach.
3.4 Can Web Cache be set up to determine failure
of different Oracle Application Server components?
Yes. Web Cache supports a mechanism to specify ping
URLs. A ping URL can point to a Portal-based URL, a Forms
URL, or a Report. Web Cache issues regular HTTP requests to this URL to
determine whether the backend server is up. If the server goes down, the ping
request fails and Web Cache stops routing to the server that is down.
Having Web Cache as a front-end is a very effective mechanism to loosely
cluster stateless mid-tiers such as Portal. Other components, such as Forms
and Reports, can also leverage this kind of setup.
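The ping URL idea can be illustrated with a short sketch. This is not Web Cache's implementation; it only shows the general pattern of probing one URL per backend and routing only to those that answer. The function names and the example host names are hypothetical.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, List


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Issue one HTTP GET; connection errors, timeouts and HTTP errors count as 'down'."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False


def live_backends(ping_urls: Dict[str, str],
                  probe: Callable[[str], bool] = http_probe) -> List[str]:
    """Return the names of the backends whose ping URL answered successfully."""
    return [name for name, url in ping_urls.items() if probe(url)]
```

A front-end would call live_backends() periodically and route new requests only to the names it returns. The probe is passed in as a parameter so that the selection logic can be exercised without a network.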
3.5 Can unclustered Web Cache
instances front-end a clustered J2EE back-end? Conversely, can a clustered Web
Cache front-end unclustered middle tiers?
Yes. The two clustering modes (Oracle Application Server
clusters for J2EE, and Web Cache clusters) are independent of each other.
3.6 Is there an availability impact of doing
stateful or stateless routing at Web Cache level?
If the backend servers are all serving stateless requests
(mod_plsql, Portal, etc.), it is recommended to set stateless routing at
Web Cache. In this scenario Web Cache routes across all backend application
servers, independent of which server serviced the previous request from
the same browser (session).
If the backend servers are stateful (e.g., a shopping
cart in a J2EE application), the choice of Web Cache routing setup is less
clear. If the backend J2EE servers are clustered, it is still recommended
to do stateless routing at the Web Cache level. This is because OHS (with
mod_oc4j) contains a smart routing algorithm that understands islands
and state replication, and this information is already contained
in the cookie. Thus, even if Web Cache routes to a different OHS, that
OHS will route the request to the correct OC4J. The advantage of this setup is
that it provides better availability:
· If backend server machine is down, stateless routing enables Web Cache
to route to a different machine.
· This routing enables OHS on that machine to route to an OC4J within the
same island that serviced the previous request.
· Thus the request can be satisfied without any visible failure to the end
user.
Without such a stateless setup, Web Cache will return
failure to the end-user in step #1 above.
However, if the backend servers are not clustered
for J2EE applications, and there are stateful applications, it
is recommended to use stateful routing at Web Cache level. In case of
confusion or unclear requirements, a stateful routing setup should
be preferred at Web Cache level.
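The difference between the two routing modes can be sketched with a toy model. This is an illustration only, not the actual Web Cache or mod_oc4j algorithm. It assumes one OC4J per machine, that a machine failure takes down both its OHS and its OC4J, and that the session cookie names the OC4J instance that served the previous request. All class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Oc4j:
    name: str
    host: str
    island: str
    up: bool = True


def ohs_route(instances: List[Oc4j], cookie_instance: str) -> Optional[str]:
    """Island-aware choice: the same OC4J if it is up, else a live peer in its island."""
    previous = next(i for i in instances if i.name == cookie_instance)
    if previous.up:
        return previous.name
    for peer in instances:
        if peer.up and peer.island == previous.island:
            return peer.name
    return None  # no replica of the session state is reachable


def web_cache_route(instances: List[Oc4j], cookie_instance: str,
                    stateful: bool) -> Optional[str]:
    """Return the OC4J that ends up serving the request, or None on failure."""
    previous = next(i for i in instances if i.name == cookie_instance)
    if stateful and not previous.up:
        return None  # stateful routing is bound to the machine that is down
    if not any(i.up for i in instances):
        return None  # nothing left to route to
    # Stateless routing: any live machine will do; its OHS applies island logic.
    return ohs_route(instances, cookie_instance)
```

With two members of the same island on different machines and the first machine down, stateless routing in this model still reaches the surviving island member, while stateful routing returns a failure.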
3.7 Are there mechanisms to provide better availability
of the Web Cache process itself?
Yes. The Web Cache process and the corresponding admin
process are monitored by the Oracle Application Server process monitoring
agent and are restarted on failure.
4.1 Are OHS processes auto-started in case of
failure?
Yes. OPMN (Oracle Process Manager and Notification
Server) monitors the OHS processes and restarts the OHS parent process.
The OHS parent process monitors its children and restarts them should they
die.
4.2 Is there any state sharing across OHS instances?
No. The OHS instances are all independent of each other
even if certain applications such as Perl or FastCGI applications may
contain states. Thus, state safety and failover has to be architected
outside OHS. Please see the J2EE section for J2EE specific details.
4.3 What is the recommended OHS deployment for
higher availability?
It is recommended to front-end OHS with a load balancing
router or switch, or with Web Cache, which is the default. Also, OHS processes
should be configured to recycle (MaxRequestsPerChild directive) after servicing
a reasonable number of requests (say 10000), so that the impact of any memory
leak does not manifest itself as a failure.
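Why recycling bounds the damage from a leak can be shown with a small simulation. This is a sketch under stated assumptions, not OHS code: each request is assumed to leak one unit of memory, and a worker is replaced by a fresh one as soon as it has served max_requests requests. The Worker class and serve() function are hypothetical.

```python
from typing import Tuple


class Worker:
    """A toy child process that leaks one unit of memory per request (assumption)."""

    def __init__(self, max_requests: int) -> None:
        self.max_requests = max_requests
        self.served = 0
        self.leaked = 0

    def handle(self) -> None:
        self.served += 1
        self.leaked += 1

    @property
    def exhausted(self) -> bool:
        return self.served >= self.max_requests


def serve(total_requests: int, max_requests: int) -> Tuple[int, int]:
    """Return (number of recycles, peak leaked memory of any single worker)."""
    worker = Worker(max_requests)
    recycles = 0
    peak = 0
    for _ in range(total_requests):
        worker.handle()
        peak = max(peak, worker.leaked)
        if worker.exhausted:
            worker = Worker(max_requests)  # fresh process, leak is gone
            recycles += 1
    return recycles, peak
```

In this model, serve(25000, 10000) recycles the worker twice and the leak never exceeds 10000 units, whereas a worker that is never recycled would accumulate all 25000.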
4.4 Is restart the same as shutdown and
start from an availability perspective?
Restarting OHS does not stop the HTTP service
from being available to the end users. It stops the individual child processes
after they are done servicing the request at hand, and then starts
a new process. This may manifest itself as a graceful slowdown, but not
as a downtime to the end users. On the other hand, shutdown stops the
listener from listening on the designated port and hence, can be visible
as a downtime to the end users. Thus, a restart should always be
preferred to a shutdown followed by a start of OHS.
5.1 Is OPMN monitored?
Yes. OPMN has a buddy process that monitors and restarts
OPMN if it goes down.
5.2 Is the OPMN Buddy process monitored? What
happens if OPMN goes down?
This buddy process is not monitored, thus it has to be
externally restarted if it goes down.
5.3 Can OPMN leverage a separate network for
its internal communication?
No. Currently there is no way for OPMN to communicate
on a different network. Thus, its ping messages share the same network
as the actual requests from browsers.
5.4 If Oracle Application Server clustering is
not leveraged, what benefits does OPMN bring?
OPMN provides process monitoring and restart capability,
independent of the clustering auto-configuration benefits that it provides
in conjunction with mod_oc4j. This monitoring and restart capability is
useful in a stand-alone single instance of Oracle Application Server,
too. Moreover, if that instance contains multiple OC4J processes or islands,
OPMN can provide a process level re-routing.
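The monitor-and-restart pattern described here can be sketched generically. This is not how OPMN is implemented or configured; it is a minimal illustration using Python's standard subprocess module, and the supervise_once() helper and the process names passed to it are hypothetical.

```python
import subprocess
import sys
from typing import Dict, List


def supervise_once(commands: Dict[str, List[str]],
                   running: Dict[str, subprocess.Popen]) -> List[str]:
    """Start processes that are missing and restart those that have exited.

    Returns the names that were started or restarted during this pass.
    """
    started = []
    for name, argv in commands.items():
        proc = running.get(name)
        # poll() returns None while the process is alive, its exit code afterwards.
        if proc is None or proc.poll() is not None:
            running[name] = subprocess.Popen(argv)
            started.append(name)
    return started
```

A supervisor would call supervise_once() in a loop with a short sleep between passes. For example, with commands = {"worker": [sys.executable, "-c", "pass"]}, the first pass starts the process and a later pass restarts it once it has exited.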
6.1 Can the fault tolerant island setup
(for state replication) be done without an infrastructure database?
Yes. Release 9.0.4 supports this with a file-based cluster setup.
6.2 Is the replication of session data between
islands guaranteed?
No, it is not. It is best effort only. Data needs to be
saved in the database if guarantees are required.
6.3 How many members should I have in an island
and/or in a cluster?
Typically, the cluster membership is used only for configuration
replication. The cluster size is governed by scalability needs (and not
by availability). The island membership provides fault tolerance via session
replication. However, beyond two to three members of an island, there
is little to be gained from an availability perspective. The memory overhead
of session replication also becomes a big negative as this number climbs.
Multiple islands within the same cluster are recommended in these situations.
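The memory argument can be made concrete with a little arithmetic. The sketch below assumes, for illustration only, that sessions are spread evenly over the islands and that every member of an island holds a copy of every session in that island. The function name is hypothetical.

```python
def sessions_held_per_process(total_sessions: int,
                              total_processes: int,
                              island_size: int) -> float:
    """Sessions each OC4J process keeps in memory under the stated assumptions."""
    if island_size < 1 or total_processes % island_size != 0:
        raise ValueError("island_size must evenly divide total_processes")
    islands = total_processes // island_size
    # Each island owns an equal share of the sessions, and every member
    # of the island stores the whole share.
    return total_sessions / islands
```

Under these assumptions, with 6000 sessions on 6 processes, one island of six members means each process holds all 6000 sessions, while three islands of two members mean each process holds 2000, and each session still survives the loss of one process.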
6.4 Is there an automatic way to persist session
state in a file system or in a database?
Oracle Application Server provides a facility to serialize
session state to disk across server restarts. It also provides Java Object Cache,
which can be leveraged to provide automatic persistence of data to disk. Additionally, JDeveloper ADF provides a framework for HttpSession persistence as an option for the applications being deployed.
6.5 What is the performance overhead of session
replication in islands?
The replication is best effort and is asynchronous. Thus,
there is not much overhead on the OC4J instance that is replicating
the data. However, the receiving OC4J instances do have to take the interrupt
to copy the new data to memory, and thus may be impacted.
Hence, it is recommended
to keep the island size small, and only put coarse objects in the session
state.
7.1 Is there a performance impact of leveraging
EJB clustering?
In steady state, there shouldn't be an impact. On failure
of an EJB server, the proxy stub attempts to connect to the next available
server in the cluster. This process takes time and thus the client may
see degradation in performance as the proxy loops through the choices
to find an available server. However, this is a small price to pay to
mask the failures. The rest of the impact (memory and network) is the
same as described earlier for Servlets and EJB.
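The failover behaviour of the proxy stub can be sketched as a simple retry loop. This is an illustration of the general pattern, not Oracle's stub code. The invoke_with_failover() helper and the call parameter, which stands in for a remote invocation against one server, are hypothetical.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def invoke_with_failover(servers: Sequence[str],
                         call: Callable[[str], T]) -> T:
    """Try each server in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for server in servers:
        try:
            return call(server)
        except ConnectionError as exc:
            # Each failed attempt costs time; this is the degradation
            # the client sees while the stub looks for a live server.
            last_error = exc
    raise ConnectionError("no EJB server available") from last_error
```

The client sees only the extra latency of the failed attempts, and an error is raised only when every server in the list has been tried.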
7.2 What are the failure scenarios for EJB Clustering?
There are two possible situations when a failure may
manifest itself in the environment:
· If using JVM replication mode for the state and the node itself goes down
(e.g., power failure)
· If using a single OPMN for the dynamic discovery mechanism (for greater availability, use a comma-separated list of OPMN addresses)
7.3 Should the servlets and EJB replication be
within the same multicast subnet?
Yes, this is the recommendation. Even though there are
separate methodologies for defining "islands" for servlets and EJBs, it is
recommended that the constituent member processes be the same. This is
especially important if the servlets use the services provided by the
EJBs. This ensures that both the servlet and EJB states
are available in the same set. Otherwise, unavailability of either one
could cause failure.
7.4 What mechanism is used for data replication
for EJB state?
UDP / multicast is used for data replication.
8.1 Can OID be configured to work with a RAC
repository?
Yes it can be. In earlier releases of Oracle Application
Server, this was not possible.
8.2 Can a hardware load balancer be used to front-end
multiple OID instances?
Yes. Such a setup can be leveraged with a RAC setup at
the backend, to provide improved availability.
8.3 Can OID be separated from an infrastructure
installation?
Yes. There are installation time choices available to
select which components you would like to install.
8.4 Can OID work with a hardware cluster?
Yes.
8.5 How can SSO be made highly available?
There are three choices for this: (a) Oracle Application
Server RAC, which is the RAC install of Oracle Application Server infrastructure
components, with a load balancer in front; (b) Cold Failover Cluster,
which is the infrastructure install on a shared disk of a clustered machine,
with only one machine active at any time; or (c) OID replication
to replicate IM data.
9.1 Can multiple Portal mid-tiers be installed
for scalability and availability?
Yes.
9.2 Can Portal schema be installed in a RAC enabled
database for improved availability?
Yes.
9.3 Does Portal availability imply availability
of the pages?
Portal assembles its pages from a variety of sources.
Each page could be assembled from different providers, including external,
internet-based providers. Portal availability only guarantees the availability
of the page assembly engine; it does not guarantee the availability of
the page contents.
9.4 What is the impact of SSO and OID availability
on Portal?
Unavailability of both these components results in Portal
unavailability. See the SSO and OID sections to determine how to make
those components highly available.
10.1 Can Web Cache be used to load balance across
Forms instances?
Web Cache can be used with sticky session routing to
load balance requests against multiple Oracle Application Server instances
running Forms. Web Cache can also be configured to monitor a Forms URL,
so that Web Cache can re-route to a different machine if Forms is not
working on a given instance. This helps new requests to be routed to a
working node, although the in-flight sessions are impacted.
10.2 Does Forms work with RAC?
Yes, a Forms application can work with a RAC back-end
database.
11.1 Can Web Cache be leveraged with a Reports
deployment?
Web Cache can be leveraged to load balance requests across
several reports service instances. Web Cache can be further used to cache
contents but only for public content of the reports.
11.2 Can Reports be used with a RAC enabled database
backend?
Yes.
11.3 Can a Reports Server leverage OPMN and the
auto-rerouting of requests?
Since the in-process reports server just runs within
the OC4J process, OPMN effectively restarts it by restarting OC4J on failure.
OPMN does not manage an out-of-process Reports Server. However, the Reports
Engine is restarted after a certain number of requests, which reduces the
likelihood of failure.
Since there can be only one in-process reports server
per machine, a glitch in the reports server on a machine is end-user visible
until OPMN restarts that OC4J instance. Multiple machines can however
be running the in-process reports server that can then be load balanced
via Web Cache or an external load balancer.
11.4 What is the impact of a reports server failure?
A reports server failure impacts all the in-flight sessions.
The impact of a reports engine failure is limited to the specific reports
request being served from that engine. Other users, and future requests
from the same session, are not affected.
12.1 Does Oracle Application Server 10g
(9.0.4) provide a Backup & Recovery tool?
Yes. See Oracle Application Server Administrator's Guide
(Documentation) for details.
The tool is shipped on the Utilities CD and can be used both for Infrastructure
as well as Middle Tier components.
12.2 Does the Backup & Recovery tool support
incremental backups?
Yes. After the initial complete backup, all subsequent
backups are incremental backups.
12.3 Does Oracle Application Server 10g
(9.0.4) support point-in-time recovery?
Yes. See Oracle Application Server Administrator's Guide
(Documentation) for details.
13.1 Does Oracle Application Server 10g
(9.0.4) support site-to-site (geographically dispersed) Disaster Recovery?
Yes. See Oracle Application Server High Availability
Guide (Documentation) for
details.
13.2 Does Oracle Application Server Disaster
Recovery use Oracle Data Guard?
Yes. But note that Oracle Data Guard is a solution for
Oracle database disaster recovery and as such, it can only be used
for Disaster Recovery of the Infrastructure Database. For other configuration
files in the Infrastructure and for any middle tiers, Disaster Recovery
uses Backup & Recovery to copy files from the primary site to the
standby site.