Based on the information received from ACME Medium Co., the following is a
reasonable initial estimate of the hardware required to service the specified
usage requirements. A proof of concept is strongly recommended to validate the
estimate supplied here.
ACME Medium Co. have stated that their user community
will be approximately 2,500 'concurrent' users. However, for a portal,
concurrency is not really a measure of users: the
important load factors (as described earlier) are the page request rate and
login rate. ACME Medium Co. have indicated that they expect 32,000,000 page
'hits' per month. Assuming these are actual unique page requests and not
additional hits for the content items on a page (e.g. images, JS libs etc.), this
equates to an average page request rate (assuming a 31-day month) of
approximately 11.95 reqs/sec.
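The conversion from a monthly hit count to a per-second rate can be sketched as follows. This is a minimal illustration only; the function name and the default month length are assumptions made for the example, and the result is an average that says nothing about peak load.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def page_request_rate(monthly_page_requests: int, days_in_month: int = 31) -> float:
    """Average page requests per second over a month of the given length."""
    return monthly_page_requests / (days_in_month * SECONDS_PER_DAY)


# 32,000,000 unique page requests spread evenly over a 31-day month.
rate = page_request_rate(32_000_000)
print(f"{rate:.2f} reqs/sec")  # prints "11.95 reqs/sec"
```

Using a 30-day month instead raises the figure only slightly (to roughly 12.3 reqs/sec), so the choice of month length does not materially affect the sizing.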
The statement that 2,500 users are 'concurrent' is therefore unlikely to be
valid, as it would equate to each user clicking only once every
3.5 minutes or so. More likely, some percentage of this user base is logged in
and active, and it is that group which generates the load.
As the logged-in user load is unavailable, one can only
make a general assumption: that the logged-in user base will be 15%
(the accepted constant for a general portal sizing) of the stated user
community. This equates to a logged-in user base of 375, each clicking
approximately once every 30 seconds. The resulting page request rate is
approximately 12.5 reqs/sec, consistent with the rate derived from the monthly
hit count.
If we assume that:
a site has 2,500 registered users,
15% of those 2,500 users are making page requests at any one time, and
each of the 375 requestors clicks twice every 60 seconds,
then this equates to ~12 page requests per second (375 x 2 / 60 = 12.5).
That is the definition of concurrency we use within Portal development.
Strictly speaking the term is a misnomer: none of the users are
truly concurrent, because of the disconnected manner in which HTTP requests are
made and destroyed.
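The working definition of concurrency above reduces to a one-line calculation. The following sketch, with an illustrative function name and parameter names that are assumptions of this example, shows how the three inputs (registered users, active fraction, click frequency) combine into a page request rate.

```python
def active_request_rate(registered_users: int,
                        active_fraction: float,
                        clicks_per_minute: float) -> float:
    """Page requests per second generated by the active share of the user base."""
    active_users = registered_users * active_fraction
    return active_users * clicks_per_minute / 60.0


# 2,500 registered users, 15% active, each clicking twice every 60 seconds.
rate = active_request_rate(2_500, 0.15, 2)
print(f"{rate:.1f} reqs/sec")  # prints "12.5 reqs/sec"
```

Expressing the model this way makes it easy to see how sensitive the sizing is to the 15% assumption: doubling the active fraction doubles the request rate.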
Given the following topology recommendation, some hardware implementation
suggestions for a Linux-based solution are set out below.
Component  Description            CPU               Memory          O/S
A          Load Balancing Router  Not Applicable    Not Applicable  Not Applicable
B          3 x Web Cache          2 x 1.4GHz Intel  4 GB            Redhat Adv Svr 2.1
C          2 x Middle-Tier        2 x 3GHz Intel    6 GB            Redhat Adv Svr 2.1
D          2 x Infrastructure     4 x 2GHz Intel    16 GB           Redhat Adv Svr 2.1
E          Shared Disk            N/A               N/A             N/A
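For budgeting purposes, the server rows of the table above can be totalled with a short script. This is a sketch only: it assumes that the CPU and memory figures are per machine (the table does not say so explicitly), and the `Tier` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    machines: int
    cpus_per_machine: int        # assumption: table figure is per machine
    memory_gb_per_machine: int   # assumption: table figure is per machine


# Components B, C and D from the table; A and E carry no CPU/memory figures.
TIERS = [
    Tier("Web Cache", machines=3, cpus_per_machine=2, memory_gb_per_machine=4),
    Tier("Middle-Tier", machines=2, cpus_per_machine=2, memory_gb_per_machine=6),
    Tier("Infrastructure", machines=2, cpus_per_machine=4, memory_gb_per_machine=16),
]


def totals(tiers):
    """Return (machines, CPUs, memory in GB) summed across all tiers."""
    machines = sum(t.machines for t in tiers)
    cpus = sum(t.machines * t.cpus_per_machine for t in tiers)
    memory_gb = sum(t.machines * t.memory_gb_per_machine for t in tiers)
    return machines, cpus, memory_gb


machines, cpus, memory_gb = totals(TIERS)
print(f"{machines} servers, {cpus} CPUs, {memory_gb} GB RAM")
# prints "7 servers, 18 CPUs, 56 GB RAM"
```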
Diagram Definitions
Component A
This machine is a hardware load balancing router (LBR) that will
handle the initial routed requests from the internet. This is commonly
implemented using products from vendors such as F5 (BIG-IP) and Cisco
(LocalDirector). An LBR is recommended but not mandatory. Using one will ease
any further expansion of the system
design by ensuring that the infrastructure will not need to be rewired should any
further middle-tiers be added to the solution. It is possible to designate one
of the Webcache machines to act as an LBR for the Webcache cluster, although
this will detract from the Webcache performance of that machine. Should SSL be
required, it is possible to configure a hardware LBR to perform SSL encryption on
outgoing traffic, thus offloading this expensive step from the iAS software.
Component B
These machines will run OracleAS Webcache. They will be configured
as part of a Webcache cluster.
Suggested Vendor Example = Dell 1650 RackMount
Component C
These machines will run the Oracle HTTP
Server components, including OC4J_Portal and mod_plsql. Each machine will be
configured to run 2 instances of OC4J_Portal, providing rudimentary failover.
Suggested Vendor Example = Dell 2650 RackMount
Component D
These machines will be the infrastructure component of the
implementation. Components on these machines will include the infrastructure OHS,
LDAP processes and the infrastructure repository installed in an Oracle
9.0.1.3 DB. This infrastructure install will be a standard implementation of
the AS infrastructure and as such will contain the Portal, SSO, Enterprise
Manager and OID schemas.
The machines will be configured using Redhat Cluster Manager and
will act as a cold failover cluster.
Suggested Vendor Example = Dell 6650 RackMount
Component E
Network attached storage, or SCSI shared disk for the cluster.
This architecture provides no headroom for growth beyond the capacity expressed by
ACME Medium Co. If the login rates, usage models or general data volumes grow above
those previously expressed, then the hardware outlined here will need to be
upgraded and/or added to.
If the volumes assumed are reduced then it may be possible to reduce the
implementation architecture to one (1) middle-tier machine and one (1)
infrastructure machine. For the given volume load it is not recommended to go
below this lower limit.
Security
This architecture makes no explicit security provision beyond that
provided out of the box by Oracle9i Application Server. Should further
requirements arise, it is possible to implement SSL through the software or
(as recommended above) through an LBR.
High Availability (HA)
HA is provided through Cold Failover Clustering
(CFC); more information on this can be obtained from OTN and Redhat.
A low level of redundancy will be provided for the mid-tier purely by
the existence of two mid-tier machines. Each of these machines can also be
configured to operate two OC4J_Portal instances, thereby giving
servlet execution redundancy on each machine.
Assumptions
The performance estimates assume suitable caching models, preferably
full-page caching wherever possible and, failing that, PMD and Portlet caching
Low data volumes for the content
No increase in user request rates
Low login rates (i.e. no spikes)
Suitable network infrastructure with good/reasonable latency (<20ms) for
round trips between the user's browser and the mid-tier
Reasonable development/request mix, i.e. 20% of page operations are
for page development and maintenance and 80% of operations are simple page
requests for predominantly cached content
The infrastructure DB will not be used for storing customer application
data other than that generated by the Portal interface.
Options
The following suggestions are provided as indicators for ACME Medium Co. to
consider in the event of a projected expansion
Consider running more than one OC4J_Portal instance in the mid-tier for
redundancy
All the infrastructure schemas are on one machine, and currently the only
supported method for infrastructure failover is cold failover. It is possible to
install the Portal and OID schemas into a RAC node, thereby offering a more
dynamic form of HA than that provided by CFC
An increase in login load would probably necessitate the inclusion of a
separate login machine and the stripping out of the LDAP processes and OID & SSO schemas from the
infrastructure.
Should document storage requirements increase consider moving
the portal repository or documents table to another machine or set of spindles
This recommendation document is designed to serve as an indication of a
suitable implementation architecture. Implementation may be possible with an
architecture that differs from the one recommended, and with less or more
hardware. The most reliable method for sizing a suitable implementation
architecture is to run a proof of concept or pilot prior to full implementation.
This will allow the implementation team to assess the likely success of the
suggested implementation architecture and adjust the specifications accordingly
if the need to do so is demonstrated by the results.