How Traffix Systems Optimized Its LTE Diameter Load Balancing and Routing Solutions Using Oracle Hardware and Software

March 2011

By Orgad Kimchi

This document is intended for telecommunications industry independent software vendors (ISVs) and service providers that want to understand the details of how Oracle's hardware and software can improve their application solution environment.

This paper provides technical information on how Traffix Systems, the leading Diameter protocol solutions vendor, optimized its Long Term Evolution (LTE) Traffix Diameter Load Balancer and Traffix Diameter Router to benefit from Oracle's software and hardware, improving throughput and resiliency, reducing hardware and software costs, and driving maximum return on investment.

This paper also includes brief technical descriptions of how specific Oracle Solaris features and capabilities are implemented in the Traffix solutions to optimize scalable performance, advanced reliability, and visibility.

Contents:

Introduction
Traffix Diameter Router, Diameter Load Balancer, and Diameter Signaling Delivery Controller
Traffix Diameter Router, Traffix Diameter Load Balancer, and Sun SPARC Enterprise T-Series Servers
Performance of Traffix Diameter Router and Traffix Diameter Load Balancer on Sun SPARC Enterprise T-Series Servers
Multithread Awareness
Network Performance
Oracle Solaris Service Management Facility (SMF)
Oracle Solaris Containers
Oracle Solaris Cluster
Proposed Combined Solution
Conclusion
Appendix A: Oracle Solaris Multithread Usage Examples
Appendix B: Installing an Oracle Solaris Zone with an Exclusive IP Stack

Introduction


Telecommunications operators have embarked on a path to migrate their core network technology from legacy 2G and 3G architectures to 4G and LTE-based architectures. This migration creates a new set of challenges for operators in how they introduce advanced services and manage their networks. This paper discusses some of the main challenges operators face in their migration to LTE, specifically regarding the adoption of evolved signaling using the Diameter protocol, a mandatory component of the LTE architecture.

The paper also describes how Traffix Systems and Oracle partnered to offer a joint signaling routing and management solution to assist telecommunications operators in tackling signaling issues, clearing the path to LTE.

Among the critical challenges for telecommunications operators moving to LTE are the large quantity of Diameter signaling flows and the complexity of the network architecture. Addressing these challenges requires the introduction of Diameter signaling load balancing and routing solutions.

The quantity of signaling in LTE core networks is unlike anything telecommunications operators have seen in the past. It is estimated that there will be up to 25 times more signaling per subscriber compared to legacy and intelligent networks. The main reasons for this growth in network signaling are the following:

  • Fragmentation of the network and new functionality. Network functionality is becoming more and more dispersed due to the distributed characteristic of the network architecture and the incorporation of new functionality defined by the Third Generation Partnership Project (3GPP), such as Policy and Charging Rules Function (PCRF), Mobility Management Entity (MME), Online Charging System (OCS), Home Subscriber Server (HSS), and others. In addition, the explosion in mobile data adoption requires ever greater numbers of network nodes, all of which are Diameter nodes.

  • Introduction of new, advanced multimedia services, advanced charging schemes, and policy control. All these are signaling-hungry services, and create large quantities of signaling flows.

  • Growing quantity of cross-network and roaming-related signaling between operators as a result of new capabilities, such as end-to-end policy control and more advanced cross-operator charging.

  • The shift of voice from the traditional circuit-switched network to the packet-switched network under initiatives such as Voice over LTE (VoLTE).

As a result of these trends, network operators moving to LTE are finding it more difficult to maintain, manage, and scale their core network architecture and Diameter signaling.

To address these challenges, operators are turning to LTE network elements, such as Diameter signaling routers and load balancers, to manage the growing quantity of signaling. These advanced signaling routers and load balancers enable smooth and rapid service introduction, network scaling, and hiding of internal network topologies, thereby increasing the reliability and redundancy of the network. Traffix Diameter Router and Traffix Diameter Load Balancer are used by operators migrating to advanced LTE, Evolved High-Speed Packet Access (HSPA+), IP Multimedia Subsystem (IMS), and other networks to tackle routing, scalability, and management needs.

Traffix Diameter Router, Diameter Load Balancer, and Diameter Signaling Delivery Controller


Traffix Systems' Diameter Router and Diameter Load Balancer are delivered on the Traffix Diameter Signaling Delivery Controller (SDC), a unified platform that provides operators with the market's most cost-efficient and operationally efficient solutions to scale, connect, and control a network.

The SDC enhances LTE Diameter signaling with greater intelligence to simplify management and increase resilience and reliability by providing the following functionality:

  • Flexible contextual routing, plus subscriber and services context-awareness based on business intelligence extracted from real-time signaling
  • Standard Interworking Function (IWF) for legacy protocols connectivity and Diameter Edge Agent (DEA) for roaming
  • Gateway and message translation for interconnectivity between the Diameter protocol and other signaling protocols
  • Clustering and load balancing for scaling up Diameter-based server elements, such as PCRF, HSS, OCS, and others

The SDC is designed as a modular platform that combines this functionality in a single streamlined flow, providing a flexible and robust solution for the most challenging control-plane connectivity problems.

Traffix Diameter Router, Traffix Diameter Load Balancer, and Sun SPARC Enterprise T-Series Servers


Using Oracle's Sun SPARC Enterprise T-Series family of servers, Traffix Diameter Router and Traffix Diameter Load Balancer have achieved high levels of sustained throughput, ultra low latency, and linear scalability by taking advantage of the Oracle Solaris 10 operating system. Key Oracle Solaris features used include these unique and innovative technologies: Oracle Solaris ZFS, Oracle Solaris DTrace, Oracle Solaris Service Management Facility (SMF), and built-in virtualization with Oracle Solaris Containers.

The UltraSPARC T2/T2+ processors and the latest SPARC T3 processors implement the industry's first massively threaded "system on a chip." These processors power the Sun SPARC Enterprise T-Series servers, with support for up to 16 cores at 8 threads per core for a total of 128 threads on the SPARC T3 processor. The Sun SPARC Enterprise T-Series servers run up to four SPARC T3 processors.

The Sun SPARC Enterprise T-Series server architecture is highly flexible, and together with Oracle Solaris, it enables different modular combinations of processors, cores, and integrated components, offering the following benefits:

  • Increasing computational capabilities to meet the growing demand for today's telecommunications applications
  • Support for larger and more diverse IP-based telecommunications workloads
  • Power for faster networking to serve new network-intensive content
  • Increased service levels and reduced application downtime
  • Improved data center capacities with a smaller footprint and reduced operating costs

These systems are closely integrated with Oracle Solaris and provide record-setting performance and excellent reliability, availability, and serviceability (RAS) characteristics, which are ideal for maximizing the uptime and ROI of mission-critical telecommunications applications. For the latest published records, refer to this blog.

Sun SPARC Enterprise T-Series server offerings include both state-of-the-art rack servers and density efficient blade servers. For the latest server information, refer to the Sun Servers web page.

These servers are specifically designed to provide increased performance and greater flexibility in a low-cost, high-density solution, and they can help relieve data center capacity constraints for the massive horizontal scaling of today's telecommunications infrastructures.

Performance of Traffix Diameter Router and Traffix Diameter Load Balancer on Sun SPARC Enterprise T-Series Servers


Traffix Systems Diameter Router and Diameter Load Balancer are highly scalable and take full advantage of the chip multithreading (CMT) technology in the UltraSPARC T2 and SPARC T3 processors and the multithreaded performance of Oracle Solaris.

Oracle Solaris enables Traffix Systems software instances to deliver excellent throughput on Sun SPARC Enterprise T-Series servers by optimizing performance across the 128 threads available on the 16-core SPARC T3 processors.

The combined Traffix-Oracle solution was tested in the following business scenario, which represents session-based, online charging in a 3G or LTE network.

In online charging, a subscriber account located in an OCS is queried prior to granting permission to use the requested network resources. Typical examples of network resource usage are a data session of certain duration, the transport of a certain volume of data, or the submission of a multimedia message of a certain size.

When receiving a network resource usage request, the network assembles the relevant charging information and generates a charging event towards the OCS in real time. The OCS then returns an appropriate resource usage authorization. The resource usage authorization may be limited in its scope (for example, volume of data or duration); therefore, the authorization might need to be renewed from time to time as long as the user's network resource usage persists.

To demonstrate the scenario, Traffix Diameter Router and Traffix Diameter Load Balancer were installed in a lab configuration using a Sun Netra T5220 server from Oracle, as shown in Figure 1. In the testing scenario, each client sent the following Diameter Credit-Control Application (DCCA) commands:

  • Initiate
  • Ten updates
  • Terminate

Each client ran multiple threads, each configured to send this command sequence. The typical request size was 1,000 bytes and the response size was 200 bytes. In total, 150 Diameter clients and 30 different Diameter servers were used to simulate traffic in the test setup. In some test cases, multiple clients were run on separate hosts to accommodate such a large scenario. The performance test results, expressed in Diameter transactions per second (TPS), are summarized in Table 1.

Figure 1. Lab Setup

Table 1. Traffix Load Balancer and Diameter Router CPU Utilization

TPS per Client   Total TPS   CPU Avg Utilization (%)   CPU Max Utilization (%)   Virtual CPUs Utilized
6.67             1000        1.09                      1.14                      0.70
13.33            2000        2.07                      2.15                      1.32
20.00            3000        2.83                      2.98                      1.81
26.67            4000        3.73                      3.92                      2.39
33.33            5000        4.62                      4.80                      2.96
40.00            6000        5.41                      5.73                      3.46
46.67            7000        6.29                      6.64                      4.03
53.33            8000        7.40                      7.66                      4.74
60.00            9000        8.24                      8.44                      5.24
66.67            10000       8.69                      9.07                      5.56
73.33            11000       8.50                      8.93                      5.44
80.00            12000       9.43                      10.36                     6.04
86.67            13000       9.71                      10.97                     6.21
93.33            14000       10.46                     11.48                     6.69
100.00           15000       11.05                     12.26                     7.07
106.67           16000       11.64                     12.92                     7.45
113.33           17000       11.80                     13.59                     7.55
120.00           18000       12.70                     14.17                     8.13
126.67           19000       13.44                     14.95                     8.60
133.33           20000       14.42                     14.47                     9.23

Throughout the testing, the Traffix Diameter Load Balancer and Traffix Diameter Router were able to scale linearly across the 64 threads available on the 8-core UltraSPARC T2 processor to process Diameter transactions in parallel and to deliver extremely high overall throughput. This conclusion is supported by Figure 2.

Figure 2. Traffix Diameter Load Balancer and Traffix Diameter Router CPU Utilization

Performance gains are made possible through innovative features in Oracle Solaris, including those described in the following sections.

Multithread Awareness


Oracle Solaris is optimized for the UltraSPARC T2/T2+ and SPARC T3 processors so that the scheduler can effectively balance the load across all the available pipelines. Oracle Solaris exposes every hardware strand as a logical processor (up to 128 per chip), yet it understands the correlation between cores and the threads they support, and it provides a fast and efficient thread implementation.

The CMT technology in the processors enables Traffix software to continue processing while a given thread is waiting for memory operations or network I/O processing. Each thread within the processor can handle its own sequence of traffic requests, taking maximum advantage of the Traffix software's scalability. Fast switching between processor threads then enables high utilization of the processor, providing extremely high levels of sustained throughput.

For an example of how Traffix software uses the Oracle Solaris multithread capability, refer to Appendix A.
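To observe this behavior on a running system, the standard Oracle Solaris observability commands below can be used. This is a minimal sketch: psrinfo -pv reports the physical processors and the virtual CPUs they expose, and prstat -mL prints per-thread (LWP) microstate accounting at five-second intervals, which shows how evenly CPU time is consumed across an application's threads.

# psrinfo -pv
# prstat -mL 5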

Network Performance


Oracle Solaris 10 running on Sun SPARC Enterprise T-Series servers from Oracle provides a new, highly scalable TCP/IP stack that significantly increases network throughput and capacity. This stack speeds packet processing by reducing per-packet overhead, and it provides the critical bandwidth and granular resource allocation required for parallel applications and high-speed networks.

High I/O throughput is also made possible by the fully integrated, dual 10 Gigabit Ethernet (10 GbE) network interfaces built into the UltraSPARC T2/T2+ and SPARC T3 processors. Because these interfaces are integrated on the chip, they eliminate PCI Express latencies, helping to accelerate multithreaded application performance and optimize I/O throughput for applications that use parallel threads.

The following is the output from nicstat while the Traffix Diameter Load Balancer and Diameter Router processes were handling a load of 16K TPS. nicstat is a freeware tool, written in C, that prints network utilization and saturation by interface. For more information about nicstat, refer to this network monitoring information.

# nicstat -i nxge0 1
  Time     Int    rKb/s  wKb/s    rPk/s     wPk/s  rAvs    wAvs   %Util  Sat
12:15:25   nxge0 3585.38 2023.97  2427.64   2294.0 1512.34 903.44 4.60  0.00
12:15:26   nxge0 8207.80 7863.69 22206.00 17712.00 378.49  454.63 13.17 0.00
12:15:27   nxge0 7561.39 6651.48 20414.00 15662.00 379.29  434.88 11.64 0.00
12:15:28   nxge0 8544.01 7901.36 22882.00 17944.00 382.36  450.90 13.47 0.00
12:15:29   nxge0 6916.20 6731.69 18828.00 14965.00 376.15  460.62 11.18 0.00
12:15:30   nxge0 7125.66 6737.55 19380.00 15226.00 376.51  453.12 11.36 0.00
12:15:31   nxge0 6646.07 6576.80 18104.00 14506.00 375.92  464.27 10.83 0.00
12:15:32   nxge0 7217.59 6891.87 19557.00 15468.00 377.91  456.25 11.56 0.00
12:15:33   nxge0 7681.30 7164.13 20831.00 16276.00 377.59  450.73 12.16 0.00
12:15:34   nxge0 7040.57 6460.80 18973.00 14774.00 379.99  447.80 11.06 0.00
12:15:35   nxge0 7797.56 7438.75 21164.00 16731.00 377.28  455.28 12.48 0.00
12:15:36   nxge0 8029.57 7394.94 21761.00 17008.00 377.84  445.23 12.64 0.00
12:15:37   nxge0 6740.25 6419.38 18347.00 14410.00 376.19  456.17 10.78 0.00
12:15:38   nxge0 7749.75 7207.25 20796.00 16357.00 381.60  451.20 12.25 0.00
12:15:39   nxge0 8626.74 7622.95 23149.00 17947.00 381.61  434.94 13.31 0.00
12:15:40   nxge0 7171.69 7032.51 19545.00 15528.00 375.74  463.76 11.64 0.00
12:15:41   nxge0 6676.98 6569.37 18220.00 14553.00 375.26  462.24 10.85 0.00
12:15:42   nxge0 7179.67 6732.18 19472.00 15284.00 377.57  451.04 11.40 0.00
12:15:43   nxge0 7671.01 7153.52 20595.00 16150.00 381.41  453.57 12.14 0.00
12:15:44   nxge0 6607.64 6504.19 17966.00 14291.00 376.61  466.05 10.74 0.00

"The rPk/s and wPk/s fields indicate the number of read packets per second and write packets per second, respectively, on the zone's network card. In the example above, you can see that the average number of read packets (rPk/s) plus write packets (wPk/s) per second is 37K with up to 14% network card utilization (%Util ) and 0% saturation (Sat)."

Note: For more information about how the zone and network card were set up, see the Oracle Solaris Containers section.

These results show high network I/O throughput with very low network card utilization and saturation.
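Similar figures can be cross-checked with the networking tools bundled in Oracle Solaris; the following is a brief sketch assuming the same nxge0 interface shown above. dladm show-dev reports the link state, speed, and duplex, while netstat with an interface name and an interval prints input and output packet counts per second.

# dladm show-dev nxge0
# netstat -I nxge0 1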

In addition to the preceding Oracle solutions, Traffix benefits from the Oracle Solaris technologies described next.

Oracle Solaris Service Management Facility (SMF)


SMF is a feature of Oracle Solaris that creates a supported, unified model for services and service management on each Oracle Solaris system. Traffix achieved much better Traffix Diameter Load Balancer and Diameter Router uptime by using the service auto-restart feature of SMF, plus faster system boot and shutdown by starting the application services in parallel according to their dependencies.

SMF also enables better control of applications. For example, Traffix was able to control which solution modules to start (GUI, cluster, or core service) based on the software license. Services managed by SMF are easy to test, back up, and restore to a particular configuration, because configuration states are preserved in service manifests. In addition to the features mentioned above, SMF also provides a standardized Oracle Solaris interface for system administrators.
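For illustration, the following sketch shows how such a service is typically administered with the standard SMF commands; the manifest path and the FMRI svc:/application/traffix/core are hypothetical placeholders, not the actual Traffix service names.

# svccfg import /var/svc/manifest/application/traffix-core.xml    # hypothetical manifest path
# svcadm enable svc:/application/traffix/core:default             # hypothetical service FMRI
# svcs -l svc:/application/traffix/core:default
# svcs -d svc:/application/traffix/core:default                   # services this one depends on
# svcadm restart svc:/application/traffix/core:default

Once a service is under SMF control, svc.startd restarts it automatically if the process exits unexpectedly, and dependent services are started in the correct order at boot.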

Oracle Solaris Containers


An Oracle Solaris Container running in Oracle Solaris 10 can have a shared IP stack with the global zone, or it can have an exclusive IP stack (which was released in Oracle Solaris 10 8/07). An exclusive IP stack provides a complete, tunable, manageable and independent networking stack to each zone. A zone with an exclusive IP stack can configure Stream Control Transmission Protocol (SCTP), IP routing, IP multipathing, or IPsec. For an example of how to configure an Oracle Solaris zone with an exclusive IP stack, see Appendix B.

Traffix was able to materially reduce development costs by implementing a solution based on Oracle virtualization technologies. Oracle Solaris Containers enabled a complete solution without compromising on required capabilities, such as the SCTP protocol mandated by 3GPP standards, and they provide an infrastructure for secure and reliable communication between administrative domains within the telecommunications network. SCTP offers reliable delivery of messages without forced message-sequencing constraints.
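A quick way to confirm such a configuration is sketched below, assuming the zone name myzone and interface nxge0 used in Appendix B. From the global zone, zonecfg info reports the zone's IP type and network resources; inside the zone, netstat with the sctp protocol filter lists the zone's SCTP endpoints.

# zonecfg -z myzone info ip-type           # myzone is the example zone from Appendix B
# zonecfg -z myzone info net
# zlogin myzone netstat -f inet -P sctp    # run the command inside the zone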

Oracle Solaris Cluster


Built on the solid foundation of Oracle Solaris, Oracle Solaris Cluster provides a load balancing feature in addition to high availability capabilities. Oracle Solaris Cluster enables increased service levels and availability through faster failure detection and recovery at the server, storage, network, and application levels. Using these capabilities, Traffix was able to configure and run the Diameter Load Balancer and Diameter Router on multiple systems simultaneously, optimizing server load and throughput while ensuring that mission-critical applications are restarted first in the event of a server failure. The following are the positive results of using Oracle Solaris Cluster:

  • No planned downtime, since the Traffix Diameter Load Balancer and Diameter Router can be failed over to another server in case of maintenance
  • No unplanned downtime, due to fast failure detection at the server, network, and application levels
  • Scalability, since the cluster software offers a single IP address for managing the increased capacity of the application service

With Oracle Solaris Cluster, there is no need to add extra hardware for achieving IP-based load balancing. By using Oracle Solaris Cluster, Traffix was able to reduce the total cost of its solution by eliminating the need to purchase an additional Layer-3 hardware load balancer and the requisite third-party support contract.
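As an illustration of this approach, the sketch below uses the Oracle Solaris Cluster command-line interface to place a scalable service behind a single shared IP address. The resource-group, resource, hostname, and start-command entries (sa-rg, traffix-rg, traffix-vip, traffix-sa, traffix-rs, /opt/traffix/bin/start) are hypothetical examples rather than the actual Traffix agent configuration; 3868/tcp is the standard Diameter port.

# clresourcetype register SUNW.gds
# clresourcegroup create sa-rg
# clressharedaddress create -g sa-rg -h traffix-vip traffix-sa
# clresourcegroup create -p Maximum_primaries=2 -p Desired_primaries=2 \
      -p RG_dependencies=sa-rg traffix-rg
# clresource create -g traffix-rg -t SUNW.gds \
      -p Scalable=TRUE -p Port_list=3868/tcp \
      -p Network_resources_used=traffix-sa \
      -p Start_command="/opt/traffix/bin/start" traffix-rs
# clresourcegroup online -M sa-rg traffix-rg

With this layout, client traffic addressed to the shared IP address is distributed by the cluster across the nodes running the scalable resource group, which provides the built-in, IP-based load balancing described above.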

Proposed Combined Solution


Traffix's high-capacity Diameter Load Balancer and Diameter Router architecture is depicted in Figure 3. Built on the solid foundation of Oracle Solaris Cluster, Oracle Solaris, and Sun SPARC Enterprise T-Series servers from Oracle, it offers availability and scalability and enables the deployment of a complete, turnkey telecommunications-ready solution.

The solution, including subcomponents such as chassis and blades, is managed as a single network element: the IP address of the chassis is configured at the client, and traps issued by a blade in the chassis are presented as traps issued by the network element.

Figure 3. Proposed Combined Solution

Conclusion


ISVs can reach tactical and strategic goals by choosing the right hardware and software from the right vendor to better manage product quality, time to market, and resource optimization. In this paper, we demonstrated how Traffix Systems was able to achieve better time to market, reduce hardware and software costs, and improve product performance by using Oracle-based solutions.

Appendix A: Oracle Solaris Multithread Usage Examples


The following is the output of the Oracle Solaris mpstat(1M) command. Each line represents one virtual CPU (vCPU).

# mpstat 1
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0    1   0  454   609  102  744    9  124   52   0  3369  10   7  0  84
1    1   0 1152  1713  323 2356   34  470  165   0  3809  21   8  0  71
2    0   0  688  1054  162 1496   21  261   63   0  4241  18   9  0  73
3    0   0  474   674  100  972   14  118   81   0  1991  11   4  0  85
4    0   0  343   882  162 1207   18  157  100   0  2183   7   4  0  89
5    0   0  316   727  106 1075   14  127   32   0   911   8   5  0  87
6    0   0  387   903  130 1438   21  139   99   0  2827  17   6  0  77
7    0   0  370   806  138 1101    9  129   29   0  1076   7   2  0  91
8    0   0  142   446   51  600    2   75   36   0   703   4   3  0  93
9    0   0  538  1032  163 1563   17  185   63   0  1690   9   4  0  87
10   4   0 1323  2006  389 2790   43  524  151   2  4719  22  10  0  69
11   0   0  472  1006  167 1433   26  245  101   0  2278  14   5  0  82
12   0   0  419  1017  162 1439   33  204   63   0  1583  10   4  0  87
13   0   0  250   937  167 1328   22  178  104   0  2064  14   5  0  81
14   0   0  621   664  114  915    8  129   49   0  1362   6   3  0  91
15   0   0  439   841  119 1263   15  124   56   0  1483   9   4  0  87
16   0   0  534   518   84  783   15  103   38   0  1467  10   2  0  88
17   0   0  296   530   98  688   11   76   56   0  1982  11   4  0  86
18   0   0  513  1038  207 1361   24  208   71   0  1491  11   4  0  86
19   0   0 1213  2239  454 3087   43  593  191   0  3566  21  10  0  69
20   0   0  487  1280  247 1722   34  307  112   0  1681  10   5  0  86
21   0   0  461   942  137 1350   24  165   94   0  2483  12   4  0  84
22   0   0  281   802  128 1155   14  129   80   0  1431   8   8  0  85
23   0   0  351   695  123  955   11  111   60   1  1208   9   2  0  89
24   0   0  503   621  112  869   12  112  112   0  2163  12   4  0  85
25   0   0 5681  3744 3422  539    7   67  322   0   555  12  21  0  67
26   0   0  530   728  108 1145   12  107   62   0  1221   7   4  0  89
27   1   0  525   732  127 1015   13  140  112   1  5213  17  11  0  72
28   2   0 1433  2075  412 2935   55  534  189   1  4621  15  10  0  76
29   0   0  361   951  163 1260   25  230  105   0  1294  12   5  0  83
30   0   0  449   702   56 1146   14  106  122   0  1983  14   3  0  83
31   1   0 1068   436   59  529    8   78   37   1   839   5   1  0  94
32   0   0  370   798   77 1396   27  112   78   0  1751  12   4  0  84
33   0   0  437   571   77  776   14   94   25   0  5060  19  11  0  70
34   0   0  472   991  112 1624   16  136   50   0  1628  12   3  0  85
35   1   0  345   642   92  964   15  109   45   0  1088   9   3  0  88
36   0   0  315   637   40 1019   11   79   35   0  1394  12   3  0  86
37   0   0  352   836  138 1287   16  170   68   0  1714  10   4  0  87
38   0   0 1076  2083  406 2944   36  538  191   0  3383  14  10  0  77
39   0   0  538   979  158 1321   27  207   51   0  1560  14   4  0  82
40   0   0  597  1172  182 1761   31  243  152   0  2931  11   6  0  83
41   0   0 1329  1055  163 1461   19  167   57   1  2017  11   4  0  85
42   0   0  272   946  165 1425   25  151   62   0  1436   9   4  0  87
43   0   0  213   635  107  850   15  112   37   0   889  10   3  0  87
44   0   0  288   519   53  812    9   71   42   0  1166   6   3  0  91
45   0   0  224   452   43  663   12   73   91   0  1048  13   2  0  85
46   0   0  352   929  207 1214   13  167   76   0  1459  12   4  0  84
47   0   0 1218  2322  488 3198   51  635  220   0  3667  17   9  0  75
48   0   0 1045  2014  436 2621   44  566   97   0  2801  14   7  0  79
49   1   0  612   836  134 1135   23  181   95   0  5058  30  10  0  61
50   0   0  488   692   79 1076   13  119   32   0  1301  11   3  0  87
51   1   0  539   475   56  614    7   71   45   1  1709   6   3  0  91
52   0   0  455   718  115 1070   18  112   52   0  1335  12   3  0  85
53   1   0  493   809  126 1205   16  141   82   1  1924  11   4  0  86
54   0   0  312   800  151 1055   18  132   39   0  1047   7   2  0  91
55   0   0  919  1074  202 1501   20  207   78   0  1443  12   4  0  84
56   0   0  322   822  121 1255   18  140  102   0  1450  12   3  0  85
57   0   0  159   726  139  948   17  127   45   0   682   6   3  0  91
58   0   0  926   891  150 1242   17  164  132   0  3088   7   6  0  87
59   0   0  492  1020  209 1380   18  181   93   0  3609  12   6  0  83
60   1   0 1368  2134  463 2870   54  577  210   0  3607  15   9  0  76
61   0   0  738  1038  162 1539   23  246   86   0  1860  18   5  0  77
62   1   0  208   660  110  843   13  124   54   0   915   8   2  0  90
63   0   0  323   741   64  881   15   91   77   0  1200   8   4  0  88

Appendix B: Installing an Oracle Solaris Zone with an Exclusive IP Stack


The following section provides setup instructions for installing an Oracle Solaris zone that has an exclusive IP stack.

# zonecfg -z myzone
myzone: No such zone configured
Use 'create' to begin configuring a new zone.

Note: The ip-type parameter indicates the Oracle Solaris zone's IP stack type, which is exclusive. The physical parameter indicates which network card will be used exclusively in this zone.

zonecfg:myzone> create
zonecfg:myzone> set zonepath=/myzone
zonecfg:myzone> set autoboot=true
zonecfg:myzone> set ip-type=exclusive
zonecfg:myzone> add net
zonecfg:myzone:net> set physical=nxge0
zonecfg:myzone:net> end
zonecfg:myzone> verify
zonecfg:myzone> commit
zonecfg:myzone> exit
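After the configuration is committed, the zone still needs to be installed and booted. A typical sequence is sketched below; because the zone uses an exclusive IP stack, its IP address is configured from inside the zone (for example, during the initial system identification on the zone console) rather than through zonecfg.

# zoneadm -z myzone install
# zoneadm -z myzone boot
# zlogin -C myzone    # attach to the zone console for initial system identification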

Revision 1.1, 03/06/2012