Rick Hetherington, Oracle’s vice president of hardware development, manages a team of architects and performance analysts who design Oracle’s M- and T-series processors. Hetherington’s team tracks the performance of these designs in great detail, from the moment they are conceived until they are released as products. In this interview, Hetherington explains the design process and how the team’s day-to-day work is focused on what SPARC customers will have in their data centers three to five years from now.
Q: How do you determine what a processor needs to do five years before it will come to market?
A: There are a number of factors that influence the set of features in our processors. To begin with, we work with our customers to understand their needs and their issues, and we study the competitive environment very closely. We also have close ties with universities doing research. Silicon technology is fundamental, and we need to factor in where it will trend in five years. Our processors integrate memory controllers and I/O ports, so we track the trends in DRAM technology and in I/O, primarily PCI. Finally, we get valuable input from Oracle software architects, our system partners, marketing, and I can't leave out upper management.
Q: How do you actually design the processors?
A: The actual design process would take hours to describe, and we are fortunate to have a team of over 1,000 skilled engineers who do that. Imagine interconnecting one billion transistors under an enormous number of constraints to perfectly execute tens of thousands of applications. The interesting thing about being at Oracle, from the architect's perspective, is that we actually use the Oracle database to craft the architecture of these chips. We use existing SPARC systems running the entire Oracle software stack, including Oracle Solaris, Oracle VM for SPARC, and the Oracle database, to collect instruction traces that we apply to our models from the moment they are conceived.
Q: Does that give Oracle a competitive advantage in the market?
A: Absolutely, because we can spot architecture bottlenecks as well as software issues when we're looking at the minute details of the design as well as the code. When we do see anomalous behavior, we can instantly consult with the Solaris team or our database experts to understand what is happening. The entire hardware and software stack is fully integrated and functioning quite well, and we are just beginning. It's a huge advantage no other company can enjoy.
Q: Can you talk about our current and future SPARC offerings? Our roadmap if you will.
A: When we first launched the SPARC T-series processor in 2004/2005, the goal was to keep a continual, predictable pace of new products. Roughly every 18 months is where we wanted to be in terms of product delivery.
And I think if you look at the roadmap, we're pretty close to that. Our current systems were announced at Oracle OpenWorld in September, and the SPARC T3 is shipping right now. The SPARC T3 is a follow-on to the SPARC T2 processor: We doubled the core count and that doubles the thread count on a per socket basis. We're building both rack systems and bladed systems based on SPARC T3 technology.
Going forward, the SPARC T4 is less than 12 months away. So that's pretty quick, and there's a reason behind that. We wanted to get more single-thread performance into the SPARC T-series systems sooner rather than later. So we developed a new core for the SPARC T4 that combines throughput performance through threading with really high-speed single-thread performance. It's really breakthrough technology for us. Referring back to the timeline for development that we have, we developed that core and that technology back in 2006/2007, and we'll be delivering that to the market in 2011.
And then the SPARC T5 will have 16 cores as opposed to the eight cores that we have in the SPARC T4, in 28 nanometer technology. As we go forward with each of these new introductions we'll make sure that the interfaces that deal with memory are common and are contemporary with the sweet spot of memory technology.
Q: Oracle refers to the T-series processors as a “system on a chip.” Can you explain that?
A: We have always called them systems-on-chip because one piece of silicon contains processor cores, memory controllers, I/O ports, and network interface controllers. To build a complete system, all one needs is to plug in a few DRAM DIMMs, connect an Ethernet cable, attach a power source, blow some air, and off we go. Of course, there is much more to it than that, but the point is that the essential ingredients are on one die, and there are significant advantages to that: lower overall latency to memory, critical interfaces that are now on die rather than between dies, and a lower component count for improved reliability.
Q: It's my impression that Oracle is innovating with Solaris and SPARC in a way that gives it a real competitive advantage in the market. Can you comment on that?
A: Well, I think, yes, we do have a unique situation here where the operating system, processor development, virtualization software, and the applications are all under one roof. Our roadmap out to 2015 and beyond is showing significant growth in the number of cores and the number of threads we can support. The only way our customers can take advantage of these systems is with an operating system capable of effectively scheduling thousands of threads, virtualization software that can offer hundreds of logical domains, and Oracle database software that can effectively scale performance in these highly threaded systems.
Q: Can you provide a concrete example of what hardware and software engineered to work together means for customers?
A: A good example of hardware and software working together is our unique support for crypto acceleration in SPARC. The SPARC T3 has a crypto co-processor that is driven by the Solaris Crypto Framework. Imagine a SPARC T-series system running online banking, with thousands of customers who demand secure communication with their accounts. The crypto accelerator and the software drivers provide the level of performance that allows thousands of customers to connect securely to their bank. Each connection involves a computationally intensive public/private key exchange, agreement on the strongest common cipher, and then a session of encrypted communication. The SPARC T3 and SPARC T4 can initiate about 20,000 secure connections per second and can encrypt/decrypt nearly 15GBs of secure communication. This enormous level of performance is only realized through very close cooperation between the hardware and software teams.
Q: And this all happens right on the chip?
A: Yes, each core has an accelerator in the case of SPARC T3. For the SPARC T4, we moved the accelerator closer to the instruction execution pipeline to reduce the software overhead, and we drive it with a new set of crypto instructions. This still requires well-engineered software.
Q: What's the next big thing for SPARC processors?
A: There are numerous improvements under consideration. Some of these are the result of requests from Oracle software architects and I should hold off talking about those for now. One item I am excited about is what we are calling “critical thread API” or the ability of the operating system to recognize critical threads in our Oracle applications and assign them all by themselves to a single core. That allows the critical threads to run at the very highest performance levels without competing with other, less critical threads. And those threads that aren't as critical get assigned at a lower priority, but they can still take advantage of the threading capabilities we have on that core.
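The placement idea described in this answer can be illustrated with a small sketch. This is not the Solaris scheduler or the actual critical thread API, whose interface the interview does not describe. The `Core` class, the `place_threads` function, the thread names, and the core and strand counts are all hypothetical and chosen only for illustration. The sketch assumes each critical thread gets a core to itself, while the remaining threads share the hardware threads of the other cores.

```python
from dataclasses import dataclass, field


@dataclass
class Core:
    """Hypothetical model of one core with several hardware threads (strands)."""
    core_id: int
    strands: int
    threads: list = field(default_factory=list)
    reserved: bool = False  # True once a critical thread owns this core


def place_threads(threads, num_cores=8, strands_per_core=8):
    """Place (name, is_critical) pairs onto cores.

    Assumed policy: each critical thread gets a whole core to itself;
    non-critical threads are spread round-robin over the remaining cores.
    """
    cores = [Core(i, strands_per_core) for i in range(num_cores)]
    critical = [name for name, is_crit in threads if is_crit]
    normal = [name for name, is_crit in threads if not is_crit]

    if len(critical) > num_cores:
        raise ValueError("more critical threads than cores")

    # Reserve one core per critical thread so it does not compete for
    # pipeline resources with any other thread.
    for core, name in zip(cores, critical):
        core.threads.append(name)
        core.reserved = True

    # Everything else shares the strands of the unreserved cores.
    shared = [c for c in cores if not c.reserved]
    if len(normal) > len(shared) * strands_per_core:
        raise ValueError("not enough hardware threads for non-critical work")
    for i, name in enumerate(normal):
        shared[i % len(shared)].threads.append(name)

    return cores


if __name__ == "__main__":
    workload = [("log_writer", True), ("batch1", False),
                ("batch2", False), ("batch3", False)]
    for core in place_threads(workload, num_cores=3, strands_per_core=2):
        print(core.core_id, core.reserved, core.threads)
```

In this toy run the hypothetical `log_writer` thread is alone on core 0, while the three batch threads share the strands of cores 1 and 2. A real scheduler would also have to handle threads changing criticality at run time, which the sketch ignores.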
Q: Since you are architecting today what customers will want in 2015, what are you hearing from them now?
A: I think power is certainly an issue, as is the ability to be much more flexible in managing their workloads to get the most efficient use of their systems. I see that the whole notion of virtualization is catching on. We introduced LDOMs three to four years ago. But you know, back then customers had some experience with VMware, but on the SPARC side they didn't really know how to take the first step.
And now, virtualization is really what they want to talk about as they go forward in the next three, four, five years. They want a lot of flexibility and really good management tools to migrate workloads without constraints, so they can make much more effective use of their compute resources. We have a great lineup of processors, operating systems, virtualization, and management software to meet this demand.