Performance Engineering : 06/05/13

Wednesday, June 5, 2013

HOW GARBAGE COLLECTION WORKS IN JAVA

The important things about the garbage collections are

1) objects are created on heap in Java irrespective of there scope e.g. local or member variable. while its worth noting that class variables or static members are created in method area of Java memory space and both heap and method area is shared between different thread.
2) Garbage collection is a mechanism provided by Java Virtual Machine to reclaim heap space from objects which are eligible for Garbage collection.
3) Garbage collection relieves java programmer from memory management which is essential part of C++ programming and gives more time to focus on business logic.
4) Garbage Collection in Java is carried by a daemon thread called Garbage Collector.
5) Before removing an object from memory Garbage collection thread invokes finalize () method of that object and gives an opportunity to perform any sort of cleanup required.
6) You as Java programmer can not force Garbage collection in Java; it will only trigger if JVM thinks it needs a garbage collection based on Java heap size.
7) There are methods like System.gc () and Runtime.gc () which is used to send request of Garbage collection to JVM but it’s not guaranteed that garbage collection will happen.
8) If there is no memory space for creating new object in Heap Java Virtual Machine throws OutOfMemoryError or java.lang.OutOfMemoryError heap space
9) J2SE 5(Java 2 Standard Edition) adds a new feature called Ergonomics goal of ergonomics is to provide good performance from the JVM with minimum of command line tuning.

When an Object becomes Eligible for Garbage Collection
An Object becomes eligible for Garbage collection or GC if its not reachable from any live threads or any static refrences in other words you can say that an object becomes eligible for garbage collection if its all references are null. Cyclic dependencies are not counted as reference so if Object A has reference of object B and object B has reference of Object A and they don't have any other live reference then both Objects A and B will be eligible for Garbage collection.
Generally an object becomes eligible for garbage collection in Java on following cases:
1) All references of that object explicitly set to null e.g. object = null
2) Object is created inside a block and reference goes out scope once control exit that block.
3) Parent object set to null, if an object holds reference of another object and when you set container object's reference null, child or contained object automatically becomes eligible for garbage collection.
4) If an object has only live references via WeakHashMap it will be eligible for garbage collection. To learn more about HashMap see here How HashMap works in Java.

Heap Generations for Garbage Collection in JavaJava objects are created in Heap and Heap is divided into three parts or generations for sake of garbage collection in Java, these are called as Young generation, Tenured or Old Generation and Perm Area of heap.
New Generation is further divided into three parts known as Eden space, Survivor 1 and Survivor 2 space. When an object first created in heap its gets created in new generation inside Eden space and after subsequent Minor Garbage collection if object survives its gets moved to survivor 1 and then Survivor 2 before Major Garbage collection moved that object to Old or tenured generation.

Permanent generation of Heap or Perm Area of Heap is somewhat special and it is used to store Meta data related to classes and method in JVM, it also hosts String pool provided by JVM as discussed in my string tutorial why String is immutable in Java. There are many opinions around whether garbage collection in Java happens in perm area of java heap or not, as per my knowledge this is something which is JVM dependent and happens at least in Sun's implementation of JVM. You can also try this by just creating millions of String and watching for Garbage collection or OutOfMemoryError.

Types of Garbage Collector in JavaJava Runtime (J2SE 5) provides various types of Garbage collection in Java which you can choose based upon your application's performance requirement. Java 5 adds three additional garbage collectors except serial garbage collector. Each is generational garbage collector which has been implemented to increase throughput of the application or to reduce garbage collection pause times.

1) Throughput Garbage Collector: This garbage collector in Java uses a parallel version of the young generation collector. It is used if the -XX:+UseParallelGC option is passed to the JVM via command line options . The tenured generation collector is same as the serial collector.

2) Concurrent low pause Collector: This Collector is used if the -Xingc or -XX:+UseConcMarkSweepGC is passed on the command line. This is also referred as Concurrent Mark Sweep Garbage collector. The concurrent collector is used to collect the tenured generation and does most of the collection concurrently with the execution of the application. The application is paused for short periods during the collection. A parallel version of the young generation copying collector is sued with the concurrent collector. Concurrent Mark Sweep Garbage collector is most widely used garbage collector in java and it uses algorithm to first mark object which needs to collected when garbage collection triggers.

3) The Incremental (Sometimes called train) low pause collector: This collector is used only if -XX:+UseTrainGC is passed on the command line. This garbage collector has not changed since the java 1.4.2 and is currently not under active development. It will not be supported in future releases so avoid using this and please see 1.4.2 GC Tuning document for information on this collector.
Important point to not is that -XX:+UseParallelGC should not be used with -XX:+UseConcMarkSweepGC. The argument passing in the J2SE platform starting with version 1.4.2 should only allow legal combination of command line options for garbage collector but earlier releases may not find or detect all illegal combination and the results for illegal combination are unpredictable. It’s not recommended to use this garbage collector in java.

JVM Parameters for garbage collection in JavaGarbage collection tuning is a long exercise and requires lot of profiling of application and patience to get it right. While working with High volume low latency Electronic trading system I have worked with some of the project where we need to increase the performance of Java application by profiling and finding what causing full GC and I found that Garbage collection tuning largely depends on application profile, what kind of object application has and what are there average lifetime etc. for example if an application has too many short lived object then making Eden space wide enough or larger will reduces number of minor collections. you can also control size of both young and Tenured generation using JVM parameters for example setting -XX:NewRatio=3 means that the ratio among the young and tenured generation is 1:3 , you got to be careful on sizing these generation. As making young generation larger will reduce size of tenured generation which will force Major collection to occur more frequently which pauses application thread during that duration results in degraded or reduced throughput. The parameters NewSize and MaxNewSize are used to specify the young generation size from below and above. Setting these equal to one another fixes the young generation. In my opinion before doing garbage collection tuning detailed understanding of garbage collection in java is must and I would recommend reading Garbage collection document provided by Sun Microsystems for detail knowledge of garbage collection in Java. Also to get a full list of JVM parameters for a particular Java Virtual machine please refer official documents on garbage collection in Java. I found this link quite helpful though http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html

Full GC and Concurrent Garbage Collection in JavaConcurrent garbage collector in java uses a single garbage collector thread that runs concurrently with the application threads with the goal of completing the collection of the tenured generation before it becomes full. In normal operation, the concurrent garbage collector is able to do most of its work with the application threads still running, so only brief pauses are seen by the application threads. As a fall back, if the concurrent garbage collector is unable to finish before the tenured generation fill up, the application is paused and the collection is completed with all the application threads stopped. Such Collections with the application stopped are referred as full garbage collections or full GC and are a sign that some adjustments need to be made to the concurrent collection parameters. Always try to avoid or minimize full garbage collection or Full GC because it affects performance of Java application. When you work in finance domain for electronic trading platform and with high volume low latency systems performance of java application becomes extremely critical an you definitely like to avoid full GC during trading period.

Summary on Garbage collection in Java1) Java Heap is divided into three generation for sake of garbage collection. These are young generation, tenured or old generation and Perm area.
2) New objects are created into young generation and subsequently moved to old generation.
3) String pool is created in Perm area of Heap, garbage collection can occur in perm space but depends upon JVM to JVM.
4) Minor garbage collection is used to move object from Eden space to Survivor 1 and Survivor 2 space and Major collection is used to move object from young to tenured generation.
5) Whenever Major garbage collection occurs application threads stops during that period which will reduce application’s performance and throughput.
6) There are few performance improvement has been applied in garbage collection in java 6 and we usually use JRE 1.6.20 for running our application.
7) JVM command line options –Xmx and -Xms is used to setup starting and max size for Java Heap. Ideal ratio of this parameter is either 1:1 or 1:1.5 based upon my experience for example you can have either both –Xmx and –Xms as 1GB or –Xms 1.2 GB and 1.8 GB.
8) There is no manual way of doing garbage collection in Java.

Latency Vs Bandwidth

One of the most commonly misunderstood concepts in networking is speed and capacity. Most people believe that capacity and speed are the same thing. For example, it's common to hear "How fast is your connection?" Invariably, the answer will be "640K", "1.5M" or something similar. These answers are actually referring to the bandwidth or capacity of the service, not speed.

Speed and bandwidth are interdependent. The combination of latency and bandwidth gives users the perception of how quickly a webpage loads or a file is transferred. It doesn't help that broadband providers keep saying "get high speed access" when they probably should be saying "get high capacity access". Notice the term "Broadband" - it refers to how wide the pipe is, not how fast.

Latency:

Latency is delay.

For our purposes, it is the amount of time it takes a packet to travel from source to destination. Together, latency and bandwidth define the speed and capacity of a network.

Latency is normally expressed in milliseconds. One of the most common methods to measure latency is the utility ping. A small packet of data, typically 32 bytes, is sent to a host and the RTT (round-trip time, time it takes for the packet to leave the source host, travel to the destination host and return back to the source host) is measured.

The following are typical latencies as reported by others of popular circuits type to the first hop. Please remember however that latency on the Internet is also affected by routing that an ISP may perform (ie, if your data packet has to travel further, latencies increase).

Ethernet                  .3ms
Analog Modem              100-200ms
ISDN                      15-30ms
DSL/Cable                 10-20ms
Stationary Satellite      >500ms, mostly due to high orbital elevation
DS1/T1                    2-5ms

Bandwidth:

Bandwidth is normally expressed in bits per second. It's the amount of data that can be transferred during a second.

Solving bandwidth is easier than solving latency. To solve bandwidth, more pipes are added. For example, in early analog modems it was possible to increase bandwidth by bonding two or more modems. In fact, ISDN achieves 128K of bandwidth by bonding two 64K channels using a datalink protocol called multilink-ppp.

Bandwidth and latency are connected. If the bandwidth is saturated then congestion occurs and latency is increased. However, if the bandwidth of a circuit is not at peak, the latency will not decrease. Bandwidth can always be increased but latency cannot be decreased. Latency is the function of the electrical characteristics of the circuit.

90TH PERCENTILE

90th percentile Response Time is defined by many definitions but it can be easily understood by:

"The 90th percentile tells you the value for which 90% of the data points are smaller and 10% are bigger."

90% RT is the one factor we should always look in once the Analysis report gets generated

To calculate the 90% RT

1. Sort the transaction RT by their values

2. Remove the top 10% instances

3. The higher value left is the 9oth percentile RT.

For e.g. Consider we have a script with transaction name "T01_Performance_Testing" and there are 10 instances of this transaction, i.e. we ran this transaction for 10 times.

Values of transaction's 10 instances are

1. Sort them by their values

2. Remove top 10% values, i.e. here 9Sec

3. 8 Sec is the 90th percentile RT.

Flex Protocol Scripting in LR

Introduction

Adobe Flex is a software development kit released by Adobe Systems for the development and deployment of cross-platform rich Internet applications based on the Adobe Flash platform. Flex applications can be written using Adobe Flex Builderor by using the freely available Flex compiler from Adobe.

Developers use two core languages to create Flex applications. The first core language is MXML, the Macromedia Flex Mark-up Language, which includes a rich set of XML tags that allows developers to layout user interfaces. Some MXML constructs allow you to call remote objects, store data returned in a model, and customize your own look and feel to MXML components.

The second core language for Flex development is Action Script 2.0, which is similar to JavaScript. Action Script elements are coded inside MXML pages has robust event handling capabilities to allow the application to respond to dynamic user interactions. Unlike JavaScript, since Action Script runs inside the Flash plug-in there is no need to rewrite several versions of the same code to support different browsers.

The Flex server is responsible for translating the MXML and Action Script components into Flash byte code in the form of .SWF files. The SWF file is executed on the client in the Flash runtime environment. The Flex server provides other services such as caching, concurrency, and handling remote object requests.

Flex Protocol with LR

VuGen allows you to create Vusers that emulate the protocol suite provided with the Flex 2 SDK.

RIAs are lightweight online programs that provide users with more dynamic control than with a standard web page. Like Web applications built with AJAX, Flex applications are more responsive, because the application does not need to load a new Web page every time the user takes action. However, unlike working with AJAX, Flex is independent of browser implementations such as JavaScript or CSS. The framework runs on Adobe's cross-platform Flash Player.

Pre-requisite

Load Runner 11.0 support the flex with the patch3
JRE 6.0
Adobe flash player 10 and higher

Environment Variables

Verify for the following environment variables in Windows Operating system.

The environment variables can be reached by following the below steps:

1. Right-click “My Computer”. Go to properties.

2. Go to Advanced Tab

3. Click on the Environment Variables button

Click on the “New “button under System variables and enter the below values:

Variable name: HP_FLEX_JAVA_LOG_FILE

Variable value: C:\flex.log

Variable name: VUGEN_PATH

Variable value: C:\Program Files\HP\Virtual User Generator\

Variable name: ANALYSIS_PATH

Variable value: C:\Program Files\HP\LoadRunner\

* The HP_FLEX_JAVA_LOG_FILE is used to generate the log file which will help us to identify the classes involved in a particular transaction.

This log file will be very useful for debugging. Ensure that there are no errors in the flex log file after recording.

Recording Options

The following recording options need to be considered before recording in flex protocol:

· Go to Tools à Recording Options in LoadRunner.

· Under Script Check the check box against “Generate recorded event logs”. This setting helps to generate the log files for debugging

· Under Protocols tab select all three check boxes viz.,

o Action Message Format (AMF)

o Flex

o Web(HTTP/HTML)

· Under Recording, select HTML based script.

· Select HTML Advanced button and select “A script that contains explicit URLs only (e.g.web_url, web_submit_data)” and “Record in separate steps and use concurrent grops” for Non HTML – generated elements

· Under Code generation select Encode AMF3 using external parser.And provide the below jar files location under Value column.

o flex-messaging-common.jar

o flex-messaging-core.jar

o (Any application specific jars.The jars are dependent on the transaction and should be verified before recording every transaction)

· Under Port-Mapping, click on Options. Under Advanced Port Mapping Settings à Change Log level to Advanced Debug. - This setting would enable to generate flex log in C:\ drive.

· The Advanced tab under HTTP properties is the standard one.

· Under Correlation tab, uncheck the “Enable correlation during recording” check box.

Post Recording Verification

After recording verify following:

1. A file by name “flex. log” should be generated in C:\ drive.

2. flex_amf_call should be generated with readable XML’s and not the binary format for all requests.

Correlation in Flex Scripts

Flex applications often contain dynamic data, data that changes each time you run the script. For example, the object name may change from run to run.

When you record a Vuser script, VuGen records a set of data and argument values. When you replay the script, however, the server may reject these arguments and issue an error. This error could be the result of dynamic data that is outdated and no longer accepted by the server.

To overcome this, you apply correlation to your script:

➤ Save the server response in preparation for extracting the required values.

➤ Extract the required values from the server response.

➤ Save the values to a parameter.

➤ Use those parameters as input to your Flex requests.

These errors are not always obvious, and you may only detect them by carefully examining Vuser log files. If you encounter an error when running your Vuser, examine the script at the point where the error occurred. Often, correlation will solve the problem by enabling you to use the results of one statement as input for another.

To perform correlation:

1 Locate the step in your script that failed due to dynamic values that need correlation.

Use the Replay Log to assist you in finding the problematic step.

2 Identify the server response with the correct value in one of the previous steps.

Double-Click the error in the Replay log to go to the step with the error. Examine the preceding steps in Tree View and look for the value in the Server Response tab.

3 Save the entire server response to a parameter.

Before you extract the value, the entire server response should be saved to a parameter as follows:

➤ Right-click the step node (in the left Action pane) corresponding to the server response containing the value and select Properties.

➤ In the Flex Call Properties dialog box, type a Response parameter name.

➤ Click OK to save the new parameter name.

4 Save the original server response value to a parameter.

➤ In the Replay Snapshot: Response Data, right-click the node above the value (for example, string), and select Save value in parameter.

In the XML Parameter Properties dialog, specify a parameter Name. You will use this name in subsequent steps.

➤ Click OK. The script will now contain a new function, lr_xml_get_values.

5 Insert the parameter in the subsequent calls.

In VuGen edit view, beginning with the call that failed, replace the value in all subsequent calls to the object with the parameter that you defined:

➤ Right-click the step node (in the Action pane) corresponding to the failed call and select Properties.

➤ Locate the argument that required correlation.

➤ In the Value box, type the parameter name in curly brackets, for example, {ParamValue_string}.

Click OK

6 Run the script.

Make sure that VuGen properly substitutes the argument value with the parameter value that you saved.

Some Important JAR files needed are

We need application JAR files as well along with these JAR files from developer to generate the decoded AMF calls in the scripts else we can’t parse and correlate the requests.

SAR COMMANDS UNIX

Using sar you can monitor performance of various Linux subsystems (CPU, Memory, I/O..) in real time.

Using sar, you can also collect all performance data on an on-going basis, store them, and do historical analysis to identify bottlenecks.

Sar is part of the sysstat package.

This article explains how to install and configure sysstat package (which contains sar utility) and explains how to monitor the following Linux performance statistics using sar.

Collective CPU usage
Individual CPU statistics
Memory used and available
Swap space used and available
Overall I/O activities of the system
Individual device I/O activities
Context switch statistics
Run queue and load average data
Network statistics
Report sar data from a specific time

This is the only guide you’ll need for sar utility. So, bookmark this for your future reference.

I. Install and Configure Sysstat

Install Sysstat Package

First, make sure the latest version of sar is available on your system. Install it using any one of the following methods depending on your distribution.

sudo apt-get install sysstat
(or)
yum install sysstat
(or)
rpm -ivh sysstat-10.0.0-1.i586.rpm

Install Sysstat from Source

wget http://pagesperso-orange.fr/sebastien.godard/sysstat-10.0.0.tar.bz2

tar xvfj sysstat-10.0.0.tar.bz2

cd sysstat-10.0.0

./configure --enable-install-cron

Note: Make sure to pass the option –enable-install-cron. This does the following automatically for you. If you don’t configure sysstat with this option, you have to do this ugly job yourself manually.

Creates /etc/rc.d/init.d/sysstat
Creates appropriate links from /etc/rc.d/rc*.d/ directories to /etc/rc.d/init.d/sysstat to start the sysstat automatically during Linux boot process.
For example, /etc/rc.d/rc3.d/S01sysstat is linked automatically to /etc/rc.d/init.d/sysstat

After the ./configure, install it as shown below.

make

make install

Note: This will install sar and other systat utilities under /usr/local/bin

Once installed, verify the sar version using “sar -V”. Version 10 is the current stable version of sysstat.

$ sar -V
sysstat version 10.0.0
(C) Sebastien Godard (sysstat  orange.fr)

Finally, make sure sar works. For example, the following gives the system CPU statistics 3 times (with 1 second interval).

$ sar 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50

Utilities part of Sysstat

Following are the other sysstat utilities.

sar collects and displays ALL system activities statistics.
sadc stands for “system activity data collector”. This is the sar backend tool that does the data collection.
sa1 stores system activities in binary data file. sa1 depends on sadc for this purpose. sa1 runs from cron.
sa2 creates daily summary of the collected statistics. sa2 runs from cron.
sadf can generate sar report in CSV, XML, and various other formats. Use this to integrate sar data with other tools.
iostat generates CPU, I/O statistics
mpstat displays CPU statistics.
pidstat reports statistics based on the process id (PID)
nfsiostat displays NFS I/O statistics.
cifsiostat generates CIFS statistics.

This article focuses on sysstat fundamentals and sar utility.

Collect the sar statistics using cron job – sa1 and sa2

Create sysstat file under /etc/cron.d directory that will collect the historical sar data.

# vi /etc/cron.d/sysstat
*/10 * * * * root /usr/local/lib/sa/sa1 1 1
53 23 * * * root /usr/local/lib/sa/sa2 -A

If you’ve installed sysstat from source, the default location of sa1 and sa2 is /usr/local/lib/sa. If you’ve installed using your distribution update method (for example: yum, up2date, or apt-get), this might be /usr/lib/sa/sa1 and /usr/lib/sa/sa2.

/usr/local/lib/sa/sa1

This runs every 10 minutes and collects sar data for historical reference.
If you want to collect sar statistics every 5 minutes, change */10 to */5 in the above /etc/cron.d/sysstat file.
This writes the data to /var/log/sa/saXX file. XX is the day of the month. saXX file is a binary file. You cannot view its content by opening it in a text editor.
For example, If today is 26th day of the month, sa1 writes the sar data to /var/log/sa/sa26
You can pass two parameters to sa1: interval (in seconds) and count.
In the above crontab example: sa1 1 1 means that sa1 collects sar data 1 time with 1 second interval (for every 10 mins).

/usr/local/lib/sa/sa2

This runs close to midnight (at 23:53) to create the daily summary report of the sar data.
sa2 creates /var/log/sa/sarXX file (Note that this is different than saXX file that is created by sa1). This sarXX file created by sa2 is an ascii file that you can view it in a text editor.
This will also remove saXX files that are older than a week. So, write a quick shell script that runs every week to copy the /var/log/sa/* files to some other directory to do historical sar data analysis.

II. 10 Practical Sar Usage Examples

There are two ways to invoke sar.

sar followed by an option (without specifying a saXX data file). This will look for the current day’s saXX data file and report the performance data that was recorded until that point for the current day.
sar followed by an option, and additionally specifying a saXX data file using -f option. This will report the performance data for that particular day. i.e XX is the day of the month.

In all the examples below, we are going to explain how to view certain performance data for the current day. To look for a specific day, add “-f /var/log/sa/saXX” at the end of the sar command.

All the sar command will have the following as the 1st line in its output.

$ sar -u
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

Linux 2.6.18-194.el5PAE – Linux kernel version of the system.
(dev-db) – The hostname where the sar data was collected.
03/26/2011 – The date when the sar data was collected.
_i686_ – The system architecture
(8 CPU) – Number of CPUs available on this system. On multi core systems, this indicates the total number of cores.

1. CPU Usage of ALL CPUs (sar -u)

This gives the cumulative real-time CPU usage of all CPUs. “1 3″ reports for every 1 seconds a total of 3 times. Most likely you’ll focus on the last field “%idle” to see the cpu load.

$ sar -u 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:27:32 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:27:33 PM       all      0.00      0.00      0.00      0.00      0.00    100.00
01:27:34 PM       all      0.25      0.00      0.25      0.00      0.00     99.50
01:27:35 PM       all      0.75      0.00      0.25      0.00      0.00     99.00
Average:          all      0.33      0.00      0.17      0.00      0.00     99.50

Following are few variations:

sar -u Displays CPU usage for the current day that was collected until that point.
sar -u 1 3 Displays real time CPU usage every 1 second for 3 times.
sar -u ALL Same as “sar -u” but displays additional fields.
sar -u ALL 1 3 Same as “sar -u 1 3″ but displays additional fields.
sar -u -f /var/log/sa/sa10 Displays CPU usage for the 10day of the month from the sa10 file.

2. CPU Usage of Individual CPU or Core (sar -P)

If you have 4 Cores on the machine and would like to see what the individual cores are doing, do the following.

“-P ALL” indicates that it should displays statistics for ALL the individual Cores.

In the following example under “CPU” column 0, 1, 2, and 3 indicates the corresponding CPU core numbers.

$ sar -P ALL 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:34:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:34:13 PM       all     11.69      0.00      4.71      0.69      0.00     82.90
01:34:13 PM         0     35.00      0.00      6.00      0.00      0.00     59.00
01:34:13 PM         1     22.00      0.00      5.00      0.00      0.00     73.00
01:34:13 PM         2      3.00      0.00      1.00      0.00      0.00     96.00
01:34:13 PM         3      0.00      0.00      0.00      0.00      0.00    100.00

“-P 1″ indicates that it should displays statistics only for the 2nd Core. (Note that Core number starts from 0).

$ sar -P 1 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:36:25 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
01:36:26 PM         1      8.08      0.00      2.02      1.01      0.00     88.89

Following are few variations:

sar -P ALL Displays CPU usage broken down by all cores for the current day.
sar -P ALL 1 3 Displays real time CPU usage for ALL cores every 1 second for 3 times (broken down by all cores).
sar -P 1 Displays CPU usage for core number 1 for the current day.
sar -P 1 1 3 Displays real time CPU usage for core number 1, every 1 second for 3 times.
sar -P ALL -f /var/log/sa/sa10 Displays CPU usage broken down by all cores for the 10day day of the month from sa10 file.

3. Memory Free and Used (sar -r)

This reports the memory statistics. “1 3″ reports for every 1 seconds a total of 3 times. Most likely you’ll focus on “kbmemfree” and “kbmemused” for free and used memory.

$ sar -r 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:28:06 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit  kbactive   kbinact
07:28:07 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:08 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
07:28:09 AM   6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204
Average:      6209248   2097432     25.25    189024   1796544    141372      0.85   1921060     88204

Following are few variations:

sar -r
sar -r 1 3
sar -r -f /var/log/sa/sa10

4. Swap Space Used (sar -S)

This reports the swap statistics. “1 3″ reports for every 1 seconds a total of 3 times. If the “kbswpused” and “%swpused” are at 0, then your system is not swapping.

$ sar -S 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

07:31:06 AM kbswpfree kbswpused  %swpused  kbswpcad   %swpcad
07:31:07 AM   8385920         0      0.00         0      0.00
07:31:08 AM   8385920         0      0.00         0      0.00
07:31:09 AM   8385920         0      0.00         0      0.00
Average:      8385920         0      0.00         0      0.00

Following are few variations:

sar -S
sar -S 1 3
sar -S -f /var/log/sa/sa10

Notes:

Use “sar -R” to identify number of memory pages freed, used, and cached per second by the system.
Use “sar -H” to identify the hugepages (in KB) that are used and available.
Use “sar -B” to generate paging statistics. i.e Number of KB paged in (and out) from disk per second.
Use “sar -W” to generate page swap statistics. i.e Page swap in (and out) per second.

5. Overall I/O Activities (sar -b)

This reports I/O statistics. “1 3″ reports for every 1 seconds a total of 3 times.

Following fields are displays in the example below.

tps – Transactions per second (this includes both read and write)
rtps – Read transactions per second
wtps – Write transactions per second
bread/s – Bytes read per second
bwrtn/s – Bytes written per second

$ sar -b 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:56:28 PM       tps      rtps      wtps   bread/s   bwrtn/s
01:56:29 PM    346.00    264.00     82.00   2208.00    768.00
01:56:30 PM    100.00     36.00     64.00    304.00    816.00
01:56:31 PM    282.83     32.32    250.51    258.59   2537.37
Average:       242.81    111.04    131.77    925.75   1369.90

Following are few variations:

sar -b
sar -b 1 3
sar -b -f /var/log/sa/sa10

Note: Use “sar -v” to display number of inode handlers, file handlers, and pseudo-terminals used by the system.

6. Individual Block Device I/O Activities (sar -d)

To identify the activities by the individual block devices (i.e a specific mount point, or LUN, or partition), use “sar -d”

$ sar -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM    dev8-0      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM    dev8-1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM dev120-64      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM dev120-65      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM  dev120-0      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM  dev120-1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM dev120-96      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM dev120-97      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91

In the above example “DEV” indicates the specific block device.

For example: “dev53-1″ means a block device with 53 as major number, and 1 as minor number.

The device name (DEV column) can display the actual device name (for example: sda, sda1, sdb1 etc.,), if you use the -p option (pretty print) as shown below.

$ sar -p -d 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:59:45 PM       DEV       tps  rd_sec/s  wr_sec/s  avgrq-sz  avgqu-sz     await     svctm     %util
01:59:46 PM       sda      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sda1      1.01      0.00      0.00      0.00      0.00      4.00      1.00      0.10
01:59:46 PM      sdb1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sdc1      3.03     64.65      0.00     21.33      0.03      9.33      5.33      1.62
01:59:46 PM      sde1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sdf1      8.08      0.00    105.05     13.00      0.00      0.38      0.38      0.30
01:59:46 PM      sda2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91
01:59:46 PM      sdb2      1.01      8.08      0.00      8.00      0.01      9.00      9.00      0.91

Following are few variations:

sar -d
sar -d 1 3
sar -d -f /var/log/sa/sa10
sar -p -d

7. Display context switch per second (sar -w)

This reports the total number of processes created per second, and total number of context switches per second. “1 3″ reports for every 1 seconds a total of 3 times.

$ sar -w 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

08:32:24 AM    proc/s   cswch/s
08:32:25 AM      3.00     53.00
08:32:26 AM      4.00     61.39
08:32:27 AM      2.00     57.00

Following are few variations:

sar -w
sar -w 1 3
sar -w -f /var/log/sa/sa10

8. Reports run queue and load average (sar -q)

This reports the run queue size and load average of last 1 minute, 5 minutes, and 15 minutes. “1 3″ reports for every 1 seconds a total of 3 times.

$ sar -q 1 3
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

06:28:53 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
06:28:54 AM         0       230      2.00      3.00      5.00         0
06:28:55 AM         2       210      2.01      3.15      5.15         0
06:28:56 AM         2       230      2.12      3.12      5.12         0
Average:            3       230      3.12      3.12      5.12         0

Note: The “blocked” column displays the number of tasks that are currently blocked and waiting for I/O operation to complete.

Following are few variations:

sar -q
sar -q 1 3
sar -q -f /var/log/sa/sa10

9. Report network statistics (sar -n)

This reports various network statistics. For example: number of packets received (transmitted) through the network card, statistics of packet failure etc.,. “1 3″ reports for every 1 seconds a total of 3 times.

sar -n KEYWORD

KEYWORD can be one of the following:

DEV – Displays network devices vital statistics for eth0, eth1, etc.,
EDEV – Display network device failure statistics
NFS – Displays NFS client activities
NFSD – Displays NFS server activities
SOCK – Displays sockets in use for IPv4
IP – Displays IPv4 network traffic
EIP – Displays IPv4 network errors
ICMP – Displays ICMPv4 network traffic
EICMP – Displays ICMPv4 network errors
TCP – Displays TCPv4 network traffic
ETCP – Displays TCPv4 network errors
UDP – Displays UDPv4 network traffic
SOCK6, IP6, EIP6, ICMP6, UDP6 are for IPv6
ALL – This displays all of the above information. The output will be very long.

$ sar -n DEV 1 1
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

01:11:13 PM     IFACE   rxpck/s   txpck/s   rxbyt/s   txbyt/s   rxcmp/s   txcmp/s  rxmcst/s
01:11:14 PM        lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00
01:11:14 PM      eth0    342.57    342.57  93923.76 141773.27      0.00      0.00      0.00
01:11:14 PM      eth1      0.00      0.00      0.00      0.00      0.00      0.00      0.00

10. Report Sar Data Using Start Time (sar -s)

When you view historic sar data from the /var/log/sa/saXX file using “sar -f” option, it displays all the sar data for that specific day starting from 12:00 a.m for that day.

Using “-s hh:mi:ss” option, you can specify the start time. For example, if you specify “sar -s 10:00:00″, it will display the sar data starting from 10 a.m (instead of starting from midnight) as shown below.

You can combine -s option with other sar option.

For example, to report the load average on 26th of this month starting from 10 a.m in the morning, combine the -q and -s option as shown below.

$ sar -q -f /var/log/sa/sa23 -s 10:00:01
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
...
11:20:01 AM         0       127      5.00      3.00      3.00         0
12:00:01 PM         0       127      4.00      2.00      1.00         0

There is no option to limit the end-time. You just have to get creative and use head command as shown below.

For example, starting from 10 a.m, if you want to see 7 entries, you have to pipe the above output to “head -n 10″.

$ sar -q -f /var/log/sa/sa23 -s 10:00:01 | head -n 10
Linux 2.6.18-194.el5PAE (dev-db)        03/26/2011      _i686_  (8 CPU)

10:00:01 AM   runq-sz  plist-sz   ldavg-1   ldavg-5  ldavg-15   blocked
10:10:01 AM         0       127      2.00      3.00      5.00         0
10:20:01 AM         0       127      2.00      3.00      5.00         0
10:30:01 AM         0       127      3.00      5.00      2.00         0
10:40:01 AM         0       127      4.00      2.00      1.00         2
10:50:01 AM         0       127      3.00      5.00      5.00         0
11:00:01 AM         0       127      2.00      1.00      6.00         0
11:10:01 AM         0       127      1.00      3.00      7.00         2