by Mike Cuppett
Why patience, information sharing, and collaboration can lead to faster and more comprehensive performance improvements
Published March 2012
Many developers and DBAs are determined to baseline application performance and then strive to continuously improve responsiveness. But in too many IT shops, the development team and the DBA team spend little time together discussing application performance challenges. Furthermore, finger pointing often hampers each team’s ability to produce the best outcome for the customer: the application user.
Two-pronged attacks are more effective in battle and have been proven to reduce technical problem resolution times. What if instead of placing blame elsewhere, developers and DBAs decided a joint strike was the better option? What if database and code tuning efforts were not done in isolation? Would the odds of success – improving the customer experience - increase? In my experience, yes they would.
How is this goal achieved? First, the development and DBA teams need to agree that the customer’s experience using the application must be the foremost key performance indicator for measuring how well the application performs. Second, the teams must agree to collaborate, without “blame storming”, to identify and deliver the best performance tuning opportunities to improve the overall customer satisfaction level.
To explain, I’ll first define application performance from the customer’s perspective and explore an example of why IT team members need to understand the broader picture of application delivery rather than just making sure their respective technology silos are responding acceptably. Once we’ve level-set our understanding of the customer’s expectations and perspective, we’ll explore three scenarios where developers and DBAs often don’t understand each other - and see how patience, information sharing, and collaboration can lead to faster and more comprehensive performance improvements.
Application performance is truly measured only by the perception of the people who use the application often. It is unfortunate, as well as somewhat amusing, that the old rule of thumb still applies: if the application users are screaming at you, there are problems, while complete silence - as if you didn’t exist - means either that the performance is acceptable or that people have resigned themselves to the application always being performance challenged.
I define application performance as an aggregate measure that includes database responsiveness, code efficiency, and infrastructure delivery capability. Each customer’s perspective of application performance is a unique, personal satisfaction level with how quickly the application responds once a request is made.
Now, let’s explore the customer perspective holistically to better understand how we can affect their satisfaction level.
For many years now I’ve used the term PACE (Performance Assurance of the Customer Experience) when trying to convince colleagues that just because your technical silo (server, database, application code, network, desktop) is “green” or “acceptable” from an IT perspective – say maybe a 99% successful delivery rate or uptime - it absolutely does not mean that the customer perceives the application as well performing and highly available.
There is an enlightening example available that demonstrates how the end result does not necessarily match the individual results. When I encountered it during training, this concept altered my perception concerning how each technology used in the delivery of services to the customer impacts the overall customer experience:
[Figure: Mapping application performance to customer satisfaction]
Note that while each technology silo involved in the application delivery to the customer would be viewed as “green” within the IT shop, the customer may not necessarily agree. If, for the example shown, there is an SLA between the customer and IT stating that 98% of all transactions must complete with a response time of < 2 seconds, the metrics shown in this example are not acceptable at a success rate just slightly over 93%. This metric will not look good on the CIO’s monthly dashboard report.
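The arithmetic behind this gap can be sketched in a few lines of Python. The sketch assumes that every transaction passes through each silo in series and that the silos fail independently, so the end-to-end success rate is the product of the individual rates. The seven silos and their 99% rates are hypothetical values, chosen only because they land just above 93%; they are not taken from the example itself.

```python
from math import prod


def end_to_end_success(rates):
    """Multiply per-silo success rates to estimate what the customer sees.

    Assumes every silo sits in series and fails independently.
    """
    return prod(rates)


# Hypothetical: seven silos, each "green" at 99% from IT's perspective.
silo_rates = [0.99] * 7
overall = end_to_end_success(silo_rates)
sla_target = 0.98  # 98% of transactions, per the SLA described in the text

print(f"End-to-end success rate: {overall:.1%}")
print(f"Meets SLA: {overall >= sla_target}")
```

Each silo clears its own 99% bar, yet the combined figure falls well short of the 98% SLA, which is the customer's view of the same data.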
This gap – IT’s view of performance versus the customer’s view of performance - is just one of the reasons many business teams believe that IT does not understand the business, and that IT is not willing to be accountable as a true business insider. I’m happy to see that more discussions are occurring across the industry to eliminate the language of “aligning IT with the business”, as that statement encourages people to think of IT as a group of non-business people who have to be handled differently. (When was the last time you heard someone say, “aligning marketing with the business” or “aligning HR with the business”?)
People have been conditioned to view IT as singularly misaligned with the business strategy and detached from the “traditional” business departments. As IT begins to communicate differently - using business language and business metrics that prove IT delivers value to the business in terms the business understands - developers and DBAs have the opportunity to “bubble up” this paradigm shift by exploiting a new understanding of customer satisfaction, to start delivering business-focused performance improvements, and, last but equally important, to communicate the improvements such that the business views IT as a business-oriented and results-driven team, focused on delivering sustainable business improvements.
The business has no interest in server uptime, network-bandwidth utilization and database availability metrics. Therefore, developers and DBAs need to start asking questions during performance improvement planning such as:
“How will this change affect the overall customer experience?”
“Should we engage other teams to validate infrastructure delivery capabilities?”
“What priorities need to change to make sure our efforts are noticeable to our customers?”
Starting to ask different questions will lead developers and DBAs to a more comprehensive perspective of the IT supply chain – the infrastructure and processes used to deliver data to the customer – both expanding and deepening the team camaraderie while increasing the team’s success stories.
Now that we better understand the customer perspective, let’s walk through three examples in which a development team and a DBA team decided to approach the remediation of a poorly performing application and unstable database via a collaborative, mutually respectful approach.
Most of us would agree that managing the database buffer cache is critical for acceptable application and database performance. Basically, there are two methods available for doing that: expanding the cache, which is traditionally a DBA task; or reducing the amount of data being read into the cache, which by default is a developer task. However, addressing buffer cache performance and query result set size independently is unlikely to lead to an optimal performance solution. In contrast, by working together to implement both improvements, DBAs and developers can generate significant performance boosts to application responsiveness. This collaboration becomes even more important if the DBA team is constrained by hardware limitations.
For example, independently, DBAs can easily check database buffer efficiency and implement a change to expand the buffer for better performance. However, an effort should also be made to reduce the amount of data being read into the buffer cache, which requires a more collaborative tuning approach, something many DBAs don’t consider. Conversely, my personal experience is that developers rarely consider the size of a query result set from a database buffer cache performance perspective. Therefore, DBAs need to take responsibility for proactively teaching developers how large result sets can negatively impact overall application and database performance. On the flip side, developers need to help DBAs understand the data requirements and the data usage and volume trends to help with system capacity planning.
Let’s say that upon reviewing the database buffer cache advisor it is determined that the buffer cache needs to grow by 10GB; however, the current server only has 8GB of free memory available and additional memory cannot be purchased until the next capital purchase cycle or next year. The DBA team may decide it’s safe to expand the buffer cache by 4GB - which will improve performance, albeit not optimally.
If the DBA team stopped the tuning effort now, a huge performance improvement opportunity would be missed. To maintain momentum, the next DBA step would be to see if the development team could eliminate 6GB of data from ever being read into the cache. The data read reduction would effectively “expand” the buffer cache by reducing the demand.
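The numbers in this scenario can be worked through with a short Python sketch. It follows the article's simplification that a gigabyte of avoided reads offsets a gigabyte of missing cache; the function and variable names are illustrative only.

```python
def read_reduction_needed_gb(advised_growth_gb, safe_growth_gb):
    """Return how many GB of reads must be avoided to cover the shortfall."""
    return max(advised_growth_gb - safe_growth_gb, 0)


advised_growth_gb = 10  # growth suggested by the buffer cache advisor
free_memory_gb = 8      # free memory on the current server
safe_growth_gb = 4      # growth the DBA team considers safe

# The planned growth must fit within the memory that is actually free.
assert safe_growth_gb <= free_memory_gb

shortfall_gb = read_reduction_needed_gb(advised_growth_gb, safe_growth_gb)
print(f"Ask developers to avoid reading about {shortfall_gb} GB into the cache")
```

Stating the shortfall as a concrete number gives the development team a target to work toward, rather than a general request to "read less data".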
Even with a commitment to respectful collaboration, in some cases, the “language barrier” between DBAs and developers can also cause problems.
For instance, let’s say that the DBAs quickly inform developers that the application is slow because of application locking, not knowing that the developers may not fully understand what the DBA is trying to communicate. Or, developers may tell the DBAs that the application keeps “freezing up” because something is wrong with the database. So, what is the real problem?
One approach would be for the DBA to sit down with key application team staff members to explain what a DBA means when he or she says application locks are to blame for this behavior. Once developers better understand this concept, it’s their turn to educate the DBA team.
Knowing that the locking is occurring is just the tip of the problem identification iceberg. Developers need the DBA team to identify the involved sessions and provide details concerning what each session was trying to do within the application. By pulling up the current SQL statements for the blocking (lock holder) and blocked (lock requester) sessions, the development team can start to research the code segments causing the application locks.
Note that the blocking session often will not have a current SQL statement, because all of the transaction work has been performed but a commit or rollback has not yet occurred. This is apparent in the example below.
Query result showing a blocker and waiter:
589859 9688777 6 0 TX Blocker
589859 9688777 0 6 TX Waiter
Enter value for sid: 614
762549800 SELECT ID , LAST , FIRST , MI , NICKNAME , GENERATION , CMP_ID ,
762549800 SVCBR_ID , ADDR1 , ADDR2 , CITY , STATE , ZIP1 , ZIP2 , COUNTY
...Session has moved on without a commit or rollback
Enter value for sid: 321
1472019480 UPDATE TABLE SET SHIPPER_METHOD = '01' WHERE ID =
1472019480 :B1 AND SHIPPER = 'UPS' AND SHIPPER_METHOD = '10'
...Session is waiting on a commit or rollback from the above session.
Based on the above, the teams might find problems such as: missing foreign key indexes that slow referential integrity (RI) checks and in turn cause full table locks on the referencing (child) table for the duration of the RI check; primary keys being updated, which causes unneeded RI lookups; missed commit opportunities; and inconsistent process sequences (A-B-C, A-C-B, and B-A-C) that cause user work to conflict unnecessarily with other user work.
In summary, when DBAs continuously provide such data, the DBAs as well as the development team will start noticing a pattern that should lead them to the root cause of the problem and then onto code remediation.
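To make the blocker and waiter relationship concrete, here is a minimal Python sketch that pairs lock holders with lock requesters. It assumes the columns in the query result shown earlier are two resource identifiers, the lock mode held, the lock mode requested, and the lock type, which resembles the layout of Oracle's V$LOCK view. Assigning session 614 as the holder and session 321 as the requester is also an assumption, based on the notes that follow each statement.

```python
from collections import defaultdict


def pair_blockers_and_waiters(lock_rows):
    """Group lock rows by resource and pair each holder with its waiters.

    Each row is (sid, id1, id2, lmode, request, lock_type). A row with
    request > 0 is waiting; otherwise a row with lmode > 0 holds the lock.
    """
    by_resource = defaultdict(lambda: {"blockers": [], "waiters": []})
    for sid, id1, id2, lmode, request, lock_type in lock_rows:
        key = (lock_type, id1, id2)
        if request > 0:
            by_resource[key]["waiters"].append(sid)
        elif lmode > 0:
            by_resource[key]["blockers"].append(sid)
    # Keep only resources where a session is actually waiting on a holder.
    return {
        key: sessions
        for key, sessions in by_resource.items()
        if sessions["blockers"] and sessions["waiters"]
    }


# Rows modeled on the query result; the session IDs are assumed.
rows = [
    (614, 589859, 9688777, 6, 0, "TX"),  # holds the lock, requests nothing
    (321, 589859, 9688777, 0, 6, "TX"),  # holds nothing, requests the lock
]

for resource, sessions in pair_blockers_and_waiters(rows).items():
    print(resource, sessions)
```

Collecting these pairs over time, together with the SQL text for each session, is what lets both teams spot the recurring patterns.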
Here’s another ubiquitous example: Developers often tell the DBA team that the “database is slow” because a job is running behind schedule. Defensive responses from the DBAs, which are all too common, often reflect a sincere misunderstanding about what is being vaguely communicated as opposed to an attempt to deflect blame.
In this example, DBAs would need to know how much the job times have changed and when the job durations started to expand. Experienced DBAs can quickly distinguish between a problem where the amount of data being processed has increased (thus requiring a job to run longer to completion) and one that points to a hardware failure or similar event. A job time that increases 5% over a month is a totally different animal from a job whose run time has increased 400% since last night. Therefore, clear and precise communication greatly improves situational understanding among both teams, most likely leading to faster problem resolution for the customer, which is the ultimate goal.
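As a rough illustration of what precise communication could look like, the following Python sketch turns two job durations into a statement a DBA can act on. The threshold values and the wording of the labels are hypothetical; each team would pick its own.

```python
# Hypothetical thresholds: a change this large, this quickly, counts as sudden.
SUDDEN_CHANGE_PCT = 100
SUDDEN_WINDOW_DAYS = 1


def percent_change(baseline_minutes, current_minutes):
    """Return the change in run time relative to the baseline, in percent."""
    if baseline_minutes <= 0:
        raise ValueError("baseline_minutes must be positive")
    return (current_minutes - baseline_minutes) / baseline_minutes * 100


def describe_slowdown(baseline_minutes, current_minutes, days_elapsed):
    """Build a precise statement to replace 'the database is slow'."""
    change = percent_change(baseline_minutes, current_minutes)
    if change >= SUDDEN_CHANGE_PCT and days_elapsed <= SUDDEN_WINDOW_DAYS:
        hint = "sudden jump: check for a hardware failure or similar event"
    else:
        hint = "gradual growth: check data volume trends first"
    return f"Run time changed {change:.0f}% over {days_elapsed} day(s); {hint}"


# The two cases from the text, using made-up job durations in minutes.
print(describe_slowdown(60, 63, 30))   # 5% over a month
print(describe_slowdown(60, 300, 1))   # 400% since last night
```

Even a simple report like this gives the DBA team the two facts they need first: how much the run time changed and over what period.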
As you’ve learned, cooperation between developers and DBAs can positively impact the customer’s experience – which we now understand more holistically. Certainly, the DBA team or the development team could work in isolation to improve the database performance or code efficiency, each improving the application incrementally. However, it’s been my experience that a collaborative approach better defines the priorities and the “best bang for the buck” opportunities that will accelerate the improvements, which will surely have the application users praising the IT team.
Practically speaking, it is not necessary for developers to know every aspect of database tuning nor do DBAs need to fully understand every best practice for tuning code. Rather, when each team understands the benefits and path to improving the customer experience, they can apply their skills to quickly produce and implement the necessary performance-improving changes.
Mike Cuppett currently works as a database team manager for a Fortune 50 healthcare organization. Mike specializes in building and leading teams that increase infrastructure stability and availability, while delivering performance improvements to meet business demand. Mike’s strongest hands-on technical skill is Oracle database management.