Performance testing is a class of tests implemented and executed to characterize and evaluate the performance-related characteristics of an application (or system), such as its timing profiles, execution flow, response times, and operational reliability and limits. Different types of performance tests, each focused on a different test objective, are implemented throughout the software development life cycle (SDLC). Early, during the architecture iterations, performance tests are implemented and executed to identify and eliminate architecture-related performance bottlenecks. In the construction iterations, additional types of performance tests are implemented and executed to tune the software and environment (optimizing response time and resource usage). In the late construction iterations and into transition, performance tests are executed to verify that the application(s) and system acceptably handle high load and stress conditions, such as large numbers of transactions, clients, and/or volumes of data.

Included in Performance Testing are the following types of tests:

  • Benchmark testing – tests that use a standard, reference workload to measure the performance of a new or unknown system and compare it to a known reference system (or measurement).
  • Performance testing – tests that use a constant workload while varying system variables to tune (or optimize) the performance of the system. Measurements typically include the number of transactions per minute, the number of users, and the size of the database being accessed.
  • Load testing – tests to verify and assess the acceptability of the operational limits of a system under varying workloads while the system-under-test remains constant (see the sketch following this list). Measurements include the characteristics of the workload and response time. When systems incorporate distributed architectures or load balancing, special tests are performed to ensure the distribution and load-balancing methods function appropriately.
  • Stress testing – tests that focus on ensuring the system functions as intended when abnormal conditions are encountered. Stresses on the system may include extreme workloads, insufficient memory, unavailable services or hardware, or diminished shared resources.
  • Volume testing – tests that focus on the ability of the system to handle large amounts of data, either as input and output or resident within the database. Volume testing includes test strategies such as creating queries that would return the entire contents of the database, creating queries with so many restrictions that no data is returned, or entering the maximum amount of data in each field.
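
As a concrete illustration of a load test, the following minimal sketch (in Python, which is not prescribed by the Rational Unified Process) drives a fixed number of concurrent virtual users against a single endpoint and records each response time. The target URL, user count, and request count are illustrative assumptions, not values taken from this process.

    import statistics
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    TARGET_URL = "http://localhost:8080/orders"  # hypothetical endpoint under test
    NUM_USERS = 50           # concurrent virtual users (the workload variable)
    REQUESTS_PER_USER = 20   # fixed series of requests issued by each user

    def virtual_user(user_id: int) -> list[float]:
        # Issue the series of requests and record each response time.
        timings = []
        for _ in range(REQUESTS_PER_USER):
            start = time.perf_counter()
            with urllib.request.urlopen(TARGET_URL, timeout=30) as resp:
                resp.read()
            timings.append(time.perf_counter() - start)
        return timings

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=NUM_USERS) as pool:
            per_user = list(pool.map(virtual_user, range(NUM_USERS)))
        all_times = [t for user_times in per_user for t in user_times]
        print(f"requests issued:    {len(all_times)}")
        print(f"mean response time: {statistics.mean(all_times):.3f}s")
        print(f"max response time:  {max(all_times):.3f}s")

Rerunning the script with different values of NUM_USERS, while the system-under-test remains constant, yields the workload-versus-response-time measurements described above.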

Performance evaluation is normally performed in conjunction with the user representative and follows a multi-level approach. The first level of analysis involves evaluating the results of individual tests and comparing results across multiple tests. An example of this is the distribution of response times for individual users during an execution of the test. Such first-level analysis can help identify trends in the data that may indicate contention among system resources, which could affect the validity of the conclusions drawn from the data.
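
As a sketch of such first-level analysis (again in Python, with hypothetical per-user timing data), the fragment below summarizes each virtual user's response-time distribution; a user whose spread is far wider than the others may point to contention for a shared resource.

    import statistics

    # Hypothetical first-level data: response times (seconds) recorded per
    # virtual user during one execution of a test.
    per_user_times = {
        "user01": [0.41, 0.44, 0.39, 0.45],
        "user02": [0.43, 0.40, 0.46, 0.42],
        "user03": [0.44, 1.62, 0.41, 1.70],  # intermittently slow responses
    }

    # Summarize each user's response-time distribution; one user with a far
    # wider spread than the rest may indicate contention for a shared resource.
    for user, times in sorted(per_user_times.items()):
        print(f"{user}: mean={statistics.mean(times):.2f}s "
              f"spread={max(times) - min(times):.2f}s")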

A second level of analysis examines the summary statistics and actual data values for specific end-user requests and the corresponding system responses. Summary statistics include standard deviations and percentile distributions of the response times, which indicate the variability in system responses as seen by individual end-users.
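
The following sketch computes those summary statistics for one hypothetical set of end-user response times, using the standard-deviation and quantile functions from Python's statistics module; the request type and data values are assumptions for illustration.

    import statistics

    def summarize(times: list[float]) -> dict[str, float]:
        # 99 cut points dividing the sample into 100 groups; cut point k-1
        # approximates the k-th percentile.
        pct = statistics.quantiles(times, n=100)
        return {
            "mean": statistics.mean(times),
            "stdev": statistics.stdev(times),
            "p50": pct[49],
            "p90": pct[89],
            "p95": pct[94],
        }

    # Hypothetical response times (seconds) for one end-user request type.
    login_times = [0.42, 0.40, 0.45, 0.44, 0.51, 0.47, 0.95, 0.43, 0.46, 0.48]
    for name, value in summarize(login_times).items():
        print(f"{name}: {value:.3f}s")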

A third level of analysis can help you understand the causes and significance of performance problems. This detailed analysis takes the low-level data and uses statistical testing to help testers draw correct conclusions from the data. Detailed analysis provides objective and quantitative criteria for making decisions, but it is more time-consuming and requires a basic understanding of statistics.

Detailed analysis uses the concept of statistical significance to help the tester understand when differences in response times are real or due to some random event associated with the test data collection. The idea is that on a fundamental level, there is randomness associated with any event. Statistical testing determines whether there is a systematic difference that can’t be explained by random events.
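
As an illustration of such a significance test, the sketch below applies Welch's two-sample t-test (via the third-party SciPy library) to response times from two hypothetical runs of the same test; the choice of test statistic and the data values are assumptions for illustration, not part of the Rational Unified Process.

    from scipy import stats  # third-party dependency: pip install scipy

    # Hypothetical response times (seconds) from two runs of the same test,
    # for example before and after a tuning change.
    baseline = [0.52, 0.48, 0.55, 0.50, 0.53, 0.49, 0.51, 0.54]
    tuned = [0.45, 0.47, 0.44, 0.46, 0.48, 0.43, 0.45, 0.47]

    # Welch's two-sample t-test: is the difference in mean response time
    # larger than random run-to-run variation would plausibly produce?
    t_stat, p_value = stats.ttest_ind(baseline, tuned, equal_var=False)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
    if p_value < 0.05:
        print("The difference is statistically significant at the 5% level.")
    else:
        print("The difference could plausibly be due to random variation.")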

 
