OLAP Sizing - What should I consider?
1. Disk Space
Disk space is the first thing you think of when thinking about sizing for OLAP. How much space do you really need?
Permanent Space The best way to estimate the likely size of an AW is to get a relevant subset of your data, build a cube and see how big it gets. Let's say you have a 3 dimensional cube. There are 17 million records for a year. A representative month of that data size may be 1.7 million records, which builds to a size of 550M (say 50M taken up by OLAP metadata); multiply by 12 to estimate the actual size of full AW for a year, which would be about 6G. In this testcase, 10 months of data was built, and the AW size was 4.9G. If you will be refreshing the AW then you should double the PERM space available. This provides adequate space for old pages which will later be reused.
Temporary Space You'll need to calculate the TEMP space needed at load and aggregate time. Recommended temp space should be at least the same as the resulting AW size since all the data stays in TEMP before being written to PERM. Autoextend should be on for the tablespaces, so that it may get bigger as needed. Resumable timeout should also be set so that a build can be resumed should there be a problem.
Tablespace Management Ideally, the AW ideally should be partitioned to store the different partitions (e.g. months) of data in different AW$LOBS as it improves update and query performance. The tablespaces should be split into multiple data files since it improves update performance. Generally, use disk arrays that favor write performance. For example, RAID 0+1 or RAID 10. RAID 5 does not favor write performance, but read performance.
2.CPUs/ Parallelism
CPU speed and number of CPUs is another consideration when considering the server. One of the great advantages of Oracle OLAP is that it is multi threaded and queries and builds can run in parallel utilizing multiple CPUs. Oracle Real Application Clusters (RAC 10g) can also be used; parallel update can be turned off for AWs only.
How many CPUs? We recommend that 2 CPUs should be the minimum used when building serially, one for all other server processes and one for OLAP. To parallel build, where you have partitioned the data, you can load and aggregate in different processes; thus multiple processors can be used, so as many CPUs as can be licensed should be utilized for an OLAP build to make it as fast as possible (1 processor per partition would be best). Note, there is a slight overheard on both load and aggregate time when parallel building, so that each partition will take slightly longer than if done one after the other in a serial build, but you're doing all at the same time so it will be faster!
CPU Speed: As far as CPUs go, the faster the better. Tests have proven a direct correlation between CPU speed and aggregation time i.e. double your CPU speed and aggregation time will halve. For example, aggregation on a test data set with a 3G Hz machine took 17 minutes, on a 450MHz machine, it took 90 minutes.
3. Memory
Memory: the amount of RAM available for OLAP is critical : You need enough RAM available to OLAP via the OLAP_PAGE_POOL. OLAP_PAGE_POOL manages the OLAP memory pages (blocks) and should be managed to accommodate loads, aggregations and queries for your AW loads and the relational sorts and queries which should be 70% of the physical RAM available to Oracle, to be allocated to both SGA and PGA. For the OLAP build, 3G is normally the least amount that should be allocated to each of these, thus 8G of RAM is the recommended minimum amount of RAM for Oracle to run an AW load. For larger loads and for many concurrent users more memory will be needed. At query time, for example you can allocate a recommended minimum of 64M per simple query user so for 100 simultaneous users this is 6.4G (8G to accommodate query and sort overhead) available to PGA at a minimum. For a more detailed explanation of memory use, refer to MetaLink Note 414063.1.
4. Disk I/O Throughput
Contact your hardware vendor to check that disks are well balanced in relation to CPUs so that there are no I/O bottlenecks. For example you don't want to have 1 disk for 10 CPUs. |