Sizing OracleAS Portal : Ten-Minute Overview

Sizing : Architectural Considerations

Introduction

As with any Web portal, the server and database capacity needed to deploy a portal built using OracleAS Portal largely depends on the number of anticipated user requests for a given page. Displaying a single page to a user may require many separate transactions, from verifying whether the user has permission to view the page, to loading the images that appear on the page, to calling a style sheet that contains formatting information for the page.

The upper and lower limits of what is needed are determined by how users are expected to use the portal. At a minimum, enough server capacity to satisfy the average load during a work day will be required, with response times that are acceptable to the user base. If possible, strive to satisfy the volume of page requests anticipated during peak intervals of high user activity. Hardware resources such as CPU, memory, I/O capacity, and network bandwidth are key to reducing response times. Unless OracleAS Portal is installed on a server or group of servers that can handle a large number of transactions, users are likely to experience slow response times.

The same is true of the database. If many applications compete for the same database resources, Web portal performance may suffer. It is possible to install multiple instances of OracleAS Portal in the same database: for example, a development instance for developing new pages and portlets, and a separate instance for deploying the finished Web portal. Consider whether the database can satisfy requests from both instances in a timely manner.

Adding more servers and database capacity will certainly improve the Web portal's performance, but unless there are unlimited funds available, balancing good performance against the costs associated with each new piece of hardware and software will become key.

The initial sections of this document offer a high-level overview of performance and of the elements of sizing that are important. The later sections offer recommendations and further considerations.

Performance Targets

Whether designing or maintaining a system, set specific performance goals for optimization. Altering parameters without a specific goal in mind can waste tuning time without producing a significant gain.

An example of a specific performance goal is an order entry response time under three seconds. If the application does not meet that goal, identify the cause (for example, I/O contention), and take corrective action. During development, test the application to determine if it meets the designed performance goals.

Tuning usually involves a series of trade-offs. Once the bottlenecks have been determined, performance in some other areas may need to be modified to achieve the desired results. For example, if I/O is a problem, purchasing more memory or more disks may resolve it. If a purchase is not possible, limiting the number of concurrent users of the system may achieve the desired performance. However, if there are clearly defined goals for performance, the decision on what to trade for higher performance is simpler because the most important areas will have been identified.

User Expectations

Application developers, database administrators, and system administrators must be careful to set appropriate performance expectations for users. When the system carries out a particularly complicated operation, response time may be slower than when it is performing a simple operation. Users should be made aware of which operations might take longer.

Performance Evaluation

With clearly defined performance goals, determining when performance tuning has been successful becomes a simple matter of comparison. Success depends on the functional objectives established with the user community, the ability to measure whether or not the criteria are being met, and the ability to take corrective action to overcome any exceptions.

Ongoing performance monitoring enables maintenance of a well-tuned system. Keeping a history of the application’s performance over time enables useful comparisons to be made. With data about actual resource consumption for a range of loads, objective scalability studies can be undertaken, and from these the resource requirements for anticipated load volumes can be predicted.
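
As an illustration of predicting resource requirements from a performance history, the following minimal sketch fits a straight line to observed (load, resource usage) pairs and extrapolates to an anticipated load. It assumes that resource consumption grows roughly linearly with load, which real systems often stop doing as they approach saturation. The function names and the sample figures are hypothetical and are not taken from any OracleAS Portal tool.

```python
def fit_line(loads, usage):
    """Least-squares fit of usage = slope * load + intercept."""
    n = len(loads)
    if n < 2 or n != len(usage):
        raise ValueError("need at least two matching (load, usage) samples")
    mean_x = sum(loads) / n
    mean_y = sum(usage) / n
    sxx = sum((x - mean_x) ** 2 for x in loads)
    if sxx == 0:
        raise ValueError("all load samples are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(loads, usage))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_usage(load, slope, intercept):
    """Extrapolate resource usage for an anticipated load (linear assumption)."""
    return slope * load + intercept


# Hypothetical history: pages/sec observed versus CPU utilisation (%).
observed_loads = [10, 20, 30]
observed_cpu = [25, 45, 65]
slope, intercept = fit_line(observed_loads, observed_cpu)
print(predict_usage(50, slope, intercept))
```

A prediction that exceeds the capacity of the current hardware is a signal to revisit the sizing, not a precise forecast.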

Performance Terms

concurrency The ability to handle multiple requests simultaneously. Threads and processes are examples of concurrency mechanisms.

contention Competition for resources.

cluster A group of machines that handle workload in a distributed manner, providing redundancy and failover.

failover A method of allowing one machine or set of machines to provide an alternative execution arena for a task, should the original machine(s) fail.

hit The subsequent request for a snippet of content from either the Portal Parallel Page Engine or the client browser. This content can take the form of images, JavaScript libraries, cascading style sheets, etc. It is reasonable to expect ~30 hits from a single page request.

latency The time that one system component spends waiting for another component in order to complete the entire task. Latency can be defined as wasted time. In networking contexts, latency is defined as the travel time of a packet from source to destination.

page request The unique request for a page defined inside the Portal repository. A figure specifying page requests per second is the measurement of the load expected for the architected solution given a common element of portal content. One page request is likely to result in one or more (~30) hits for subordinate content.

response time The time between the submission of a request and the receipt of the response.

scalability The ability of a system to provide throughput in proportion to, and limited only by, available hardware resources. A scalable system is one that can handle increasing numbers of requests without adversely affecting response time and throughput.

service time The time between the receipt of a request and the completion of the response to the request.

think time The time the user is not engaged in actual use of the processor.

stream time The time taken to transmit the response to the requestor.

throughput The number of requests processed per unit of time.

wait time The time between the submission of the request and initiation of the request.
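
The timing terms above can be related to one another with a little arithmetic. The sketch below is a simplification and not a formula from the OracleAS Portal documentation: it assumes that response time can be approximated as wait time plus service time plus stream time, and it uses the figure of roughly 30 hits per page request quoted in the definitions. The function names and sample values are hypothetical.

```python
HITS_PER_PAGE_REQUEST = 30  # approximate figure from the definitions above


def approx_response_time(wait_s, service_s, stream_s=0.0):
    """Approximate response time (seconds) as the sum of its component times."""
    return wait_s + service_s + stream_s


def hits_per_second(page_requests_per_s, hits_per_page=HITS_PER_PAGE_REQUEST):
    """Estimate the hit rate generated by a given page request rate."""
    return page_requests_per_s * hits_per_page


# Hypothetical request: 0.25 s queued, 1.5 s being serviced, 0.25 s streaming.
print(approx_response_time(0.25, 1.5, 0.25))
# Hypothetical load of 10 page requests per second.
print(hits_per_second(10))
```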

Sizing the Portal System

Consider the following elements of page generation when planning for a system sizing.

  • Peak page throughput required
  • Page cache hit rate
  • Peak login rate

Consideration should also be given to other performance factors of the final built portal. These could include the portlet cache hit rate, portlet execution speed, page complexity, page security, available network bandwidth and load distribution, other portal activity, available hardware resources, the amount and type of content, and the impact of using SSL.

Peak Page Throughput

The peak page throughput is the peak number of pages/second requested by Portal users. For example, assume a Portal serves a total population of 10,000 users, of which 10% are active at peak times (a user may be logged in but not active), and an active user makes 3 requests per minute. The peak throughput requirement will be:

((10000 × 0.10) × 3) ÷ 60 = 50 pages/sec
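
The same arithmetic can be wrapped in a small helper so that different user populations and activity levels can be tried quickly. This is a minimal sketch of the calculation above. The function and parameter names are hypothetical.

```python
def peak_page_throughput(total_users, active_fraction, requests_per_user_per_min):
    """Return the peak page request rate in pages per second."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    active_users = total_users * active_fraction
    requests_per_min = active_users * requests_per_user_per_min
    return requests_per_min / 60.0


# The worked example above: 10,000 users, 10% active, 3 requests per minute each.
print(peak_page_throughput(10_000, 0.10, 3))
```

With the inputs from the worked example, the helper reproduces the figure of 50 pages/sec, up to floating-point rounding.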

Page Cache Hit Rate

The Page Cache Hit Rate (PageCHR) is the number of page definitions that can be retrieved from the cache compared to the number of pages that must be regenerated during peak load times. To estimate the PageCHR, consider:

  • How often pages are modified by their owners
  • How often pages are customized by end users
  • Whether you will be using validation, invalidation or expiry based caching

The aim with PageCHR is to get it as close to 100% as possible. Judicious use of the correct caching policy within each unique page and portlet, weighed against how dynamic the data is, helps ensure that both the need to deliver content in a timely and performant fashion and the desire to deliver the most up-to-date content possible are met.

Building page content from cached content (both in-memory and on-disk) is much less expensive than retrieving the content from the meta-data repository.
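
To see how the PageCHR interacts with peak page throughput, the following minimal sketch expresses the hit rate as a fraction of all page requests and estimates how many pages per second would still have to be regenerated at peak. Treating the PageCHR as hits divided by total requests is an assumption made for this illustration, as are the function names and the sample counts.

```python
def cache_hit_rate(cache_hits, cache_misses):
    """Hit rate as a fraction (0.0 to 1.0) of all requests observed."""
    total = cache_hits + cache_misses
    if total == 0:
        raise ValueError("no requests observed")
    return cache_hits / total


def regenerations_per_second(peak_pages_per_s, hit_rate):
    """Pages per second that miss the cache and must be regenerated."""
    return peak_pages_per_s * (1.0 - hit_rate)


# Hypothetical sample: 900 pages served from cache, 100 regenerated.
rate = cache_hit_rate(900, 100)
print(rate)
print(regenerations_per_second(50, rate))
```

Under these assumptions, every percentage point gained in hit rate directly reduces the number of pages that have to be rebuilt from the metadata repository at peak.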

Peak Login Rate

The login rate is the rate at which users log in to the Portal, thereby placing a load on the SSO and OID servers. For example, assume a Portal serves a total population of 10,000 users and 20% of those users log in during a 15-minute period at the start of the business day. The peak login rate will be:

10000 × 0.20 ÷ 15 = 133.33 logins/min (2.22 logins/sec)

Explicit logouts also place a load on the servers and may need to be considered.
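
The login rate calculation can be sketched in the same way as the page throughput calculation. The helper below is illustrative only. Its name and parameters are hypothetical, and it assumes that logins are spread evenly across the login window.

```python
def peak_login_rate(total_users, login_fraction, window_minutes):
    """Return (logins per minute, logins per second) for the login window."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    per_minute = total_users * login_fraction / window_minutes
    return per_minute, per_minute / 60.0


# The worked example above: 10,000 users, 20% logging in within 15 minutes.
per_minute, per_second = peak_login_rate(10_000, 0.20, 15)
print(round(per_minute, 2), round(per_second, 2))
```

If logins cluster at the start of the window instead of arriving evenly, the true peak will be higher than this average.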

Other Performance Factors

Portlet Cache Hit Rate (PortletCHR)

The PortletCHR is the number of portlet requests that can be satisfied from the cache compared to the number of portlet requests that must be handled by a provider during peak load times.

Portlet Execution Speed

The Portlet Execution Speed is the average time required to execute all (uncached) portlets on a page. Since portlets execute in parallel, this measure will be equal to the execution speed of the slowest portlet, plus any page assembly overhead. The portlet execution speed may differ from site to site if each site has a differing mix of content, caching policies and hardware for its portal. Estimating this number can only be achieved through a proof of concept that accurately reflects the eventual target data and page design. In general, the speed of page assembly will be limited by the execution speed of its slowest portlet.
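
The effect of parallel portlet execution can be shown with a small model. The sketch below assumes fully parallel execution with a fixed assembly overhead, and compares the result with what the same portlets would cost if executed one after another. The function names and timings are hypothetical, and a real page would need to be measured in a proof of concept.

```python
def parallel_page_time(portlet_times_s, assembly_overhead_s=0.0):
    """Page time when portlets run in parallel: slowest portlet plus overhead."""
    slowest = max(portlet_times_s) if portlet_times_s else 0.0
    return slowest + assembly_overhead_s


def sequential_page_time(portlet_times_s, assembly_overhead_s=0.0):
    """Page time if the same portlets ran one after another, for comparison."""
    return sum(portlet_times_s) + assembly_overhead_s


# Hypothetical uncached portlet timings in seconds for a single page.
timings = [0.25, 1.5, 0.5]
print(parallel_page_time(timings, assembly_overhead_s=0.25))
print(sequential_page_time(timings, assembly_overhead_s=0.25))
```

In this model, speeding up any portlet other than the slowest one does not change the page time, which is why the slowest portlet is the first candidate for caching or tuning.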

Page Complexity

Page security and the number of tabs and portlets on a page will affect the time it takes to generate page metadata. The number of portlets on a page will affect the page assembly time, especially if each portlet must be generated or contacted for a validity check.

Network Bandwidth

The speed of the network that connects interacting portal components will affect response times but does not affect in-machine throughput. Bandwidth issues will be a large concern for portal implementations with a geographically dispersed user base. The further the content must travel, the more latency-sensitive the delivery mechanism will be. Widely distributed systems over higher-latency networks will suffer from poor performance, which will only be exacerbated by dynamic portlet content.

Load Distribution

The distribution of system load across servers will affect overall system performance. The normally accepted method of dealing with load distribution and scalability is to place each AS component on a separate machine or machines depending on how much scalability is required. This distributed load architecture also assists when dealing with the issues of High Availability.

Other Portal Activity

Other users of the portal will affect overall performance. Content managers, developers and monitoring overhead can all consume valuable processing resources that could otherwise be applied to page generation. This is a normal situation, as the nature of the portal provides for multiple concurrent usage models. However, from a pure performance point of view, the execution model for page generation differs from that for application development, so doing both activities simultaneously will reduce the overall system resources available for either task.

Hardware Resources

Both page generation from the PPE and caching through Web Cache are memory sensitive. In-memory operations are orders of magnitude faster than those involving I/O-bound disk caching or swap files. Providing suitable quantities of memory for the Portal servers is a critical step in the machine configuration.

CPU Performance

Page generation is a CPU-intensive process. The speed and number of the available CPUs are therefore another critical factor in the machine configuration for a Portal server.

Type of Content

The amount and type of content that is served could affect system throughput. Multimedia content could place an additional load on the OHS, network bandwidth, file system, memory cache and DB processes.

Sizing & Estimation Methodology

Estimating anything can be a complex and error-prone process, which is why it is an 'estimation' and not a 'calculation'.

Sizing portal software requires a common denominator. For database performance metrics we can refer to TPC-C benchmarks, and for J2EE application server performance we can refer to 'Pet Store' transaction figures. Unfortunately there is no 'Pet Store' for portals. Until portal development and deployment standards are truly unified through the efforts of JSR 168, WSRP and other open portal development standards, it will be impossible to develop a Portal Pet Store because of the variety of implementation methods employed by the portal vendors in the marketplace.

There are three primary approaches to sizing a portal implementation:

  • Algorithm or Calculation based

    An algorithm or process that accepts inputs from the customer (e.g. user count, page count, hits, latency, document size) and attempts to deliver a processing requirement is probably the most commonly accepted tool for delivering sizing estimations.

    Unfortunately this approach is also the most inaccurate.

    For a logical n-tier enterprise-class portal implementation, a calculation that even approaches a realistic sizing response would require in excess of one hundred input values, and would be so complex and sensitive that an input value only 1% above or below the correct value would produce wildly inaccurate results.

    The other approach to calculation-based solutions would be to simplify the calculation to the point where it is simple to understand and simple to use. Unfortunately the sizing results delivered from this approach would also be wildly inaccurate.
     

  • Size-by-Example based

    A size-by-example (SBE) approach requires a set of known samples that may be used as data points along the scale of system size. The more examples available for SBE, the more accurate the sizing of the intended implementation will be. Asking a customer how many users they will have, what their usage patterns are likely to be and what type of content they intend to deploy on the portal are all questions that they should be able to answer. Asking them the likely cache hit ratio for a portlet is probably something they won't be able to answer unless you're asking the right people. Normally in a pre-sales situation the customer will not know the answers to questions like that. Those are the types of questions that would be required for the algorithm approach.

    Oracle has the ability to deliver targeted SBE sizing solutions for our prospective portal customers through reference implementation documents that outline both our internal deployments and customers' external deployments.

    By using these real-world examples, both customers and Oracle can be assured that the configurations being proposed have been implemented before and will provide the performance and functionality required by the proposed implementation.
     

  • Proof of Concept based

    A proof of concept (POC) or pilot based approach offers the most accurate sizing data of all three approaches.

    A POC allows the customer to do the following :

    • Test their portal implementation design
    • Test their chosen hardware platform
    • Test their caching strategy
    • Simulate projected load
    • Validate design assumptions
    • Validate OracleAS Portal
    • Provide iterative feedback for their implementation team
    • Adjust or validate the implementation decisions made prior to the POC

    There are, however, two downsides to a POC-based approach, namely time and money.

    Running a POC requires the customer to have manpower, hardware and the time available to implement the solution, validate the solution, iterate changes and re-test and finally analyze the POC findings.

    A POC is always the best and recommended approach for any sizing exercise. It will deliver results that are accurate for the unique implementation of the specific customer, and that are as close to deploying the real live solution as possible but without the capital outlay on hardware and project resources.

Size by Example Opportunity

To provide the example implementation repository of size-by-example solutions, we need to collate data from real world customers. If you feel that you would like to take part in the Size-By-Example Survey then please read this document.

Last Updated : 22 November 2004