Purpose
To discover bottlenecks, perform capacity planning, and optimize the system’s performance
Pre-Requisites
- Scope of each PSR test
- Timelines and environment assumptions
- Obtain details from the infrastructure team, such as a network topology diagram, to assess which network segments and routers will be affected by the PSR tests. [The network topology diagram and the KBs of data sent per day should help the test engineer plan and mitigate risks associated with the PSR test.]
- List of components to be monitored (database, application servers, infrastructure, etc.) and the respective contact persons to whom issues should be reported.
- Find out how many KBs of data are currently transmitted on a typical workday across all applications
- Templates to collect Environment details
- Get detailed requirements for creating the expected volumes of transactions and business processes in the environment for a given number of concurrent users
- Creating a folder structure for the artifacts
- Tool Evaluation Criteria
Normal scenarios
- Identify the current response time (backend and client end)
- Identify current Areas of Bottlenecks
- Trial run with minimal load
- Isolate how many KBs of data the application under test sends per end user on a given day.
- Find how many end users can be emulated from a desktop/endpoint
- Functionality of the application should be robust and stable before initiating a PSR test.
- Trigger events (emails, etc.) should be tested separately and disabled for a general PSR run
- Identify component-level, architectural-level, infrastructure-level and end-to-end tests
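To illustrate the "trial run with minimal load" step, here is a minimal Python sketch of a throwaway harness; `do_transaction` is a hypothetical stand-in for a real end-user request, and the user/iteration counts are placeholder assumptions:

```python
import time
import statistics
from concurrent.futures import ThreadPoolExecutor

def do_transaction():
    """Hypothetical stand-in for one end-user transaction."""
    time.sleep(0.01)  # simulated server work

def trial_run(users=5, iterations=3):
    """Run a minimal-load trial and return per-transaction timings (seconds)."""
    timings = []

    def one_user():
        for _ in range(iterations):
            start = time.perf_counter()
            do_transaction()
            timings.append(time.perf_counter() - start)  # list.append is thread-safe in CPython

    with ThreadPoolExecutor(max_workers=users) as pool:
        for _ in range(users):
            pool.submit(one_user)

    return timings

timings = trial_run()
print(f"{len(timings)} transactions, avg {statistics.mean(timings) * 1000:.1f} ms")
```

A trial at this scale mainly confirms that scripts, data, and monitoring work before committing to a full PSR run.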
Data Load
- Obtain unique data values from the subject matter experts to re-execute the tests with multiple iterations of data for processes that have unique data constraints
- Create test scripts with enough unique data records to prevent the data-caching problem from occurring.
Ex: Using the same set of data could cause data to be stored and buffered, which may not fully exercise the database. Otherwise the DBA may need to restore and refresh the database to clear the cache.
- Setting up LAN to handle the Load test
- Setting up the hardware to handle the Load test
- Need to clear the end-point browser cache of JS and CSS files before each run to simulate the actual time taken to render the objects.
- Need to figure out a strategy for clearing the browser cache for the apps (like UMS only) but not the cache entries from the login. This is required to measure the timings of different operations over an extended period of time.
- Alternatively, do not clear the application cache and proceed with multiple operations to capture the performance.
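The unique-data point above can be sketched in Python; the column names and `user00001`-style values are illustrative assumptions, not from the original checklist:

```python
import csv
import io

def unique_login_rows(n, prefix="user"):
    """Generate n unique username/password pairs so repeated test iterations
    never replay the same record and hit only cached data."""
    return [(f"{prefix}{i:05d}", f"Pw!{i:05d}") for i in range(1, n + 1)]

def write_datafile(rows, fileobj):
    """Write the rows in the CSV shape a data-driven test script consumes."""
    writer = csv.writer(fileobj)
    writer.writerow(["username", "password"])
    writer.writerows(rows)

buf = io.StringIO()
write_datafile(unique_login_rows(3), buf)
print(buf.getvalue())
```

In practice the row count should at least match (iterations × virtual users) so no record is reused within a run.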
Execution
- Inform other users before starting the PSR tests
- Contingency Plan if the LAN crashes
- Having all the personnel present
- Documentation
- Run on a known environment to establish a benchmark/baseline, then run on different environments to identify system bottlenecks and measure response times; increase the numbers gradually to find the deviations and the system limits
- Capture screenshots of all error messages that the application under test produces and place them in an organized repository
- Scrum [interpret test results, monitor the test, and coordinate the testing efforts with multiple parties]
- If the test environment is small compared to production, the results have to be extrapolated with respect to the production environment; define a strategy to extrapolate the results.
- Actual results of performance with different versions
- Identify bottlenecks using tools like Windows Task Manager, Windows Performance Monitor, Component Services administrative tool etc
- Initial few rounds of testing should be followed by a development process to tune the application.
- Tune the Database
- Rewrite programs with inefficient SQL statements or excessive network round-trip calls from the client to the server.
- Application Server Configuration, like Java memory allocation or any other parameters
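As a sketch of the round-trip point above, the following uses Python's stdlib `sqlite3` to contrast row-at-a-time inserts with one batched `executemany` call; the `audit` table and row contents are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit (id INTEGER PRIMARY KEY, msg TEXT)")

rows = [(f"event {i}",) for i in range(1000)]

# Inefficient pattern: one statement (and, over a network, one round trip) per row:
#   for msg in rows:
#       conn.execute("INSERT INTO audit (msg) VALUES (?)", msg)

# Batched pattern: the statement is prepared once and the rows are sent together.
with conn:
    conn.executemany("INSERT INTO audit (msg) VALUES (?)", rows)

count = conn.execute("SELECT COUNT(*) FROM audit").fetchone()[0]
print(count)  # 1000
```

With an in-memory database the difference is small, but against a networked database server the batched form removes one round trip per row.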
User Load Plan
- Steady Load: Same/constant number of users
- Increasing Load: Begin with a small number of users, then increase the users gradually in sets.
- Dynamic Load: Change the frequency and the number of users randomly
- Scheduled Load: Schedule the transactions with random intervals
- Identify workflows and categorize/generate data in volume for them as given above
- Login - Number of users logging in/logged in at a given instant
- Reports – Number of users accessing the reports at a given instant
- CreateUser – Number of users created at a given instant
- Search option – Number of users accessing the search option at a given instant
- Billing
Example
SNO | Workflow Name | Parameters | Different sets: Distribution of users | Volume in each set | DataFile |
1 | Login | Username, Password | Eg: Admin, Helpdesk, Finance, read-write users, read-only users, etc. | | Login.csv |
Test Variables
Variable | Description |
TD | Duration of the Test |
NU | Number of Users to Start |
INU | Intervals to Increment number of users |
NUi | Number of users to increment for each interval |
TU | Total Number of users |
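Using the variables in the table above, an increasing-load schedule can be sketched as follows (a minimal illustration; the concrete numbers in the example call are assumptions):

```python
def increasing_load_schedule(TD, NU, INU, NUi, TU):
    """Return (elapsed_seconds, active_users) points for an Increasing Load run:
    start with NU users and add NUi users every INU seconds, capped at TU,
    until the test duration TD is reached."""
    points = []
    users, t = NU, 0
    while t <= TD:
        points.append((t, users))
        t += INU
        users = min(users + NUi, TU)
    return points

# Assumed numbers: 60 s test, 10 starting users, +5 users every 15 s, 25 users total.
print(increasing_load_schedule(TD=60, NU=10, INU=15, NUi=5, TU=25))
# [(0, 10), (15, 15), (30, 20), (45, 25), (60, 25)]
```

The same variables also describe a Steady Load run (NUi = 0, NU = TU).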
Metrics
- Response Times
- Total Response Time: From the time the request is sent from the client until the response is received from the server, including network time, test-environment and database dependencies, and network latency
- Measure response time from server logs and compare it with the total response time to determine how response times vary with network traffic during different intervals
- Average No. of transactions/sec [user transaction, file transfer, batch process]
- Average Number of hits/sec
- Throughput: Number of transactions/sec, maximum concurrent users & bytes/sec
- Average, minimum and maximum limits of the system
- Utilization: Percentage of time that the system resources are in use. If utilization is too high, there will be longer queuing delays, leading to higher response times. If utilization is too low, then there is an excess of capacity
- Analyze why a certain transaction with certain input generates a good or a bad response
- Number of bottlenecks and memory leaks identified
- Strategy to fix the problem in short term and in long term
- Capture all the decisions in one artifact
- Running the test after the fix
- Monitoring webserver
- Processor time
- Number of bytes
- Thread counts
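Several of the metrics above (transactions/sec, average/min/max response time, utilization) can be derived from raw timings with a short sketch; the sample numbers are made up for illustration:

```python
def summarize(response_times, test_duration, busy_time):
    """Derive checklist metrics from raw measurements: response_times and
    test_duration in seconds, busy_time = seconds the resource was in use."""
    n = len(response_times)
    return {
        "transactions_per_sec": n / test_duration,
        "avg_response": sum(response_times) / n,
        "min_response": min(response_times),
        "max_response": max(response_times),
        "utilization_pct": 100.0 * busy_time / test_duration,
    }

stats = summarize([0.2, 0.4, 0.3, 0.5], test_duration=10.0, busy_time=7.5)
print(stats)
```

Keeping the derivation in one place makes it easy to recompute the same metrics consistently across runs and environments.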
Scalability
Purpose
To discover the system’s ability to perform under high-volume use, achieving non-disruptive growth, continuous availability, and consistent response times, even during peak usage times
Architecture level
- Evaluate whether there is enough memory and processors to handle the expected user volumes.
- Gain a good understanding of how easily the configurations of web, application, and/or database servers can be expanded.
- Tune database indexing [Incorrect database indexing of information can impact performance when requests are either unable to find information or get stuck in an infinite loop trying to fill a request]
- For high volume, consider load-balancing software/hardware in order to efficiently route requests to the least busy resource.
- Security and its effect on performance [Adding a security layer to transactions will impact response times because added steps are required to encrypt and decrypt transaction data.]
- Identify the parts of the application that require secure requests, and dedicate specific server(s) to handling secure transactions
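The "route requests to the least busy resource" idea above can be sketched as a least-connections policy; the server names and connection counts are hypothetical:

```python
def least_busy(pool):
    """Pick the server with the fewest active connections (least-connections)."""
    return min(pool, key=pool.get)

def route(pool, n_requests):
    """Send n_requests one at a time, always to the currently least-busy server."""
    assignments = []
    for _ in range(n_requests):
        server = least_busy(pool)
        pool[server] += 1
        assignments.append(server)
    return assignments

# Hypothetical pool: server name -> current active connections.
pool = {"app1": 12, "app2": 4, "app3": 9}
print(route(pool, 3))  # requests go to app2 until it catches up
```

Real load balancers add health checks and weighting, but the core routing decision is this simple comparison.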
Reliability
Search user – Number of users performing various actions (view, delete, edit, change password) on the same record/user