Tuesday 30 July 2013

Best practices in performance testing

Here are some of the most common things projects do that make performance testing difficult or less productive…

Stick to the requirements, even when they don’t make sense
You should know by now that projects aren’t very good at defining their non-functional requirements. This means that a certain amount of common sense is needed when applying them to performance test results. As an example, imagine that a response time requirement specifies that the average response time for login should be less than 5 seconds. During performance testing, it is found that 90% of login transactions take 1 second, but 10% take 40 seconds. On average, the response time for the login transaction is 4.9 seconds, so someone who interprets requirements very strictly would consider the response time acceptable. Anyone with good critical thinking skills, however, would see that something is wrong and get the intent behind the requirement clarified.
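To make the arithmetic concrete, here is a minimal Python sketch using the hypothetical numbers from the example above. It shows how an average can satisfy the letter of the requirement while hiding a serious tail problem:

samples = [1.0] * 90 + [40.0] * 10   # 90% of logins take 1 s, 10% take 40 s

mean = sum(samples) / len(samples)
worst = max(samples)
share_over_5s = sum(1 for s in samples if s > 5.0) / len(samples)

print(f"mean response time    : {mean:.1f} s")        # 4.9 s -- technically meets "< 5 s average"
print(f"worst response time   : {worst:.1f} s")       # 40.0 s
print(f"logins slower than 5 s: {share_over_5s:.0%}") # 10%

Reporting a percentile (such as the 90th or 95th) alongside the average makes this kind of tail problem visible instead of hiding it.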

Use the wrong people to do your performance testing
A very common mistake is to assume that someone who does functional test automation is necessarily suited to performance testing because they know how to use another tool made by the same company. Another mistake is to assume that because the project has purchased the very best, easiest-to-use tool money can buy, this will compensate for testers who know nothing about performance testing (“fools with tools”). Performance testing is a highly technical activity, and is not a suitable job for anyone who cannot write code or who does not understand how the system under test fits together.

Don’t provide enough technical support to investigate and fix problems
A good way to ensure that it takes a long time to fix defects is to fail to provide someone who is capable of fixing the problem, or to provide someone who is too busy to work on it. Load- and performance-related defects are difficult problems, and are not suitable to assign to a junior developer. It is best to make code-related performance problems the responsibility of a single senior developer, so that they have a chance to focus and are not distracted by all the other (much easier to fix) problems in the bug list.

Don’t let performance testers do any investigation themselves

Having a rigidly enforced line between testers (who find problems) and a technical team (who determine the root cause of a problem and fix it) doesn’t work so well with performance testing. Performance testers find problems that other teams cannot reproduce themselves (and it’s pretty hard to fix a problem you can’t reproduce). This means that performance testers and technical teams need to work together to determine the root cause of problems. Performance testers can do a lot of this by themselves if they have access to the right information, which means setting up infrastructure monitoring, and providing logons to servers in the test environment and access to any application logs.
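As a rough illustration, here is a minimal Python sketch of the kind of lightweight resource monitoring a performance tester could run on a test server during a load test. It assumes the third-party psutil package is available, and the output file name and sampling interval are arbitrary choices; in practice you would more likely use whatever monitoring tools the project already has (sar, perfmon, Nagios and so on):

import csv
import time

import psutil  # third-party package, assumed to be installed

INTERVAL_SECONDS = 5                 # sampling interval (arbitrary choice)
OUTPUT_FILE = "resource_usage.csv"   # hypothetical output file name

with open(OUTPUT_FILE, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["timestamp", "cpu_percent", "memory_percent"])
    try:
        while True:
            cpu = psutil.cpu_percent(interval=INTERVAL_SECONDS)  # blocks for the sampling interval
            mem = psutil.virtual_memory().percent
            writer.writerow([time.strftime("%Y-%m-%d %H:%M:%S"), cpu, mem])
            f.flush()  # keep the file up to date while the test is running
    except KeyboardInterrupt:
        pass  # stop sampling when the test run is over

Even a simple record like this, lined up against the load test timeline, lets the tester point at which server and which resource was saturated when response times degraded.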

Wishful extrapolation

Imagine that the test system is two 2-CPU servers, and performance testing shows that it can handle 250 orders per hour. The Production system is two 8-CPU servers, so it should be able to handle 1000 orders per hour, right? Well, not necessarily; this assumes that the system scales linearly, and that CPU is the only bottleneck. Both are bad assumptions. It is best to test in a Production-like environment, or to have a very solid (experimentally proven) knowledge of how your system scales.
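One way to see why linear extrapolation is optimistic is to model scalability explicitly. The Python sketch below uses Gunther’s Universal Scalability Law as one possible model; the contention and coherency coefficients are invented purely for illustration, and real values would have to be fitted from measurements on the actual system:

def usl_throughput(n_cpus, per_cpu_rate, contention=0.05, coherency=0.002):
    """Predicted throughput for n_cpus, given a per-CPU rate and USL coefficients."""
    return (per_cpu_rate * n_cpus) / (
        1 + contention * (n_cpus - 1) + coherency * n_cpus * (n_cpus - 1)
    )

# Test environment: 4 CPUs in total (two 2-CPU servers), measured at 250 orders/hour.
measured_rate = 250.0
per_cpu_rate = measured_rate / usl_throughput(4, 1.0)  # back out the per-CPU rate

for cpus in (4, 16):
    predicted = usl_throughput(cpus, per_cpu_rate)
    linear = measured_rate * cpus / 4
    print(f"{cpus} CPUs: ~{predicted:.0f} orders/hour (linear extrapolation says {linear:.0f})")

With even modest contention the 16-CPU prediction falls well short of 1000 orders per hour, and if the real bottleneck is not CPU at all (database locks, network, disk), a CPU-based extrapolation is beside the point.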


Hide problems

One of the main reasons for software testing is so that the Business stakeholders can make an informed decision about whether a new system is ready to “go live”. Often performance testers are put under pressure to downplay the severity or likelihood of any problems in their Test Summary Report. This is usually due to a conflict of interest; perhaps performance testing is the responsibility of the vendor (who is keen to hit a payment milestone), or maybe the project manager is rewarded more for hitting a go-live date than for deploying a stable system. In either case, it is a bad outcome for the Business.

Here are some statistics and examples of the business losses caused by poor performance:
  • In September 2010, Virgin Blue’s check-in and online booking systems went down. A hardware failure on September 26 caused an outage of the airline’s internet booking, reservations, check-in and boarding systems. The outage severely disrupted Virgin Blue’s business for 11 days, affecting around 50,000 passengers and 400 flights, before normal service was restored on October 6. The loss was estimated at $20 million.
  • The average user clicks away after 8 seconds of delay.
  • $45 billion in business revenue is lost due to poorly performing web applications.
  • In November 2009, a computerized system used by US-based airlines to maintain flight plans failed for several hours, causing havoc at all major airports. Flight schedules were badly delayed, inconveniencing thousands of frustrated passengers. Identified as a ‘serious efficiency problem’ by the Federal Aviation Administration, this was one of the biggest system failures in US aviation history.
  • Aberdeen found that inadequate performance can impact revenue by up to 9%.
  • Business performance begins to suffer at 5.1 seconds of delay in web application response times (3.9 seconds for critical applications), and an additional second of waiting on a website significantly impacts customer satisfaction and visitor conversions: page views, conversion rates and customer satisfaction drop by 11%, 7% and 16% respectively.
  • According to Amazon, every 100 ms of delay costs 1% of sales.
