Offline Java Memory Analysis : Introduction to tools for Heap and Thread Dumps, GC log Analysis

In this article we will look at the tools that can be used for offline Java memory analysis. I will try to include all kinds of offline analysis tools. OS process memory monitoring and its tools are out of scope here.

What is offline analysis?
When we analyze recorded data rather than live data, it is called offline analysis.

So for Java memory analysis, we will work on recorded data, not live data. To get to the bottom of a memory problem, we mainly need three types of information from the JVM.
1. Java Heap Dump : A snapshot of the JVM heap memory.
2. Java Core / Thread Dump : Shows the state of the running threads and the conditions they are in. The core contains more detailed information. For IBM, the core and the thread dump are the same. Some JVMs may also include trace files here.
3. GC logs : The garbage collection history, written to log files.
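To make the first two items concrete, here is a minimal sketch of recording a heap dump and a thread dump from inside a running JVM so they can be analyzed offline later. It assumes a HotSpot/OpenJDK runtime. DumpCollector and its method names are my own illustration; HotSpotDiagnosticMXBean and ThreadMXBean are the standard management beans. IBM JVMs use their own mechanisms for phd and javacore files, which this sketch does not cover.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, for illustration only.
class DumpCollector {

    // Writes an hprof heap dump of the current JVM (HotSpot/OpenJDK assumed).
    // live = true dumps only objects that are still reachable.
    static Path writeHeapDump(Path dir, String name, boolean live) throws IOException {
        Path file = dir.resolve(name + ".hprof"); // newer JDKs expect the .hprof suffix
        Files.deleteIfExists(file);               // dumpHeap does not overwrite an existing file
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), live);
        return file;
    }

    // Returns a text thread dump of the current JVM.
    // ThreadInfo.toString() prints the name, the state and only the first few stack frames.
    static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append(info);
        }
        return out.toString();
    }
}
```

The resulting .hprof file can then be opened in the heap analysis tools listed below, and the thread dump text can be saved next to it.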

I will not go into detail on each one; let's look at the tools only.

For all IBM JVMs : You need IBM Support Assistant (ISA). It is a web distribution: you run it with Java and your PC hosts all the tools together. Here is the download link. How to set up IBM ISA? Unzip and run start_isa.bat.
In a browser, go to http://localhost:10911/isa5 and you will be redirected to the ISA start page (all of these links & ports are configurable).

You will get three types of tools: JNLP Web Start, Eclipse plugin links, and web based. If you download a JNLP file and open it in Notepad, you can edit the JVM configuration. I used to make a separate .bat file to run the JNLP with an IBM JVM (not the runtime configured in the environment).

Make sure your PC doesn't have a global log file set as an environment variable (if you install an HP testing product you will have one); just delete that environment variable. ISA will create a log path variable for itself and then you can run it.

For all Oracle JVM : 

1. Java Mission Control (JMC) : Comes free with older Oracle JDKs; for newer JDK releases it is a separate download. You need to install these tools as plugins.
a. JOverflow / Heap Analysis
b. DTrace Recorder
c. Flight Recorder Plugins
d. Console Plugins

2. Visual VM : Comes free with the JDK up to JDK 8; since JDK 9 it is a separate download. Download the useful plugins to get the best out of it.

Architecturally, JMC is based on Eclipse and Visual VM is based on NetBeans, so plugin installation follows the process of those IDEs (network configuration + tool installation).

Java Heap Analysis :

1. Eclipse MAT (including IBM IDDE) : Supports IBM (phd) & Oracle JVM heap dumps.

2. Visual VM : Oracle hprof format heap dump reader

3. JOverflow/Dump Analysis as plugins of Java Mission Control : Oracle hprof format heap dump reader; provides more detailed analysis than Visual VM.

4. Heap Analyzer (IBM only) : IBM phd format heap analyzer. Download link.



Thread Analysis :

For IBM JVM: (Java Core file and .trc file)

1. JCA (Java Core Analyzer) : This is from IBM support. Download it and load text format Java core files.

2. GCMV (Garbage Collection and Memory Visualizer) : Standalone or as an Eclipse plugin; comes with ISA.

3. Class Loader Analyzer (IBM only) : Used for class loader analysis from Java core and Snap<>.trc files. It comes with ISA.

4. Trace & Request Analyzer (IBM only) : Used for reading Snap<>.trc files. Download link.

5. Thread & Monitor Dump Analyzer (TMDA, IBM only)

Note : .trc is a trace file; you need to enable and configure tracing to get proper information.

For Oracle JVM

1. .tdump format:  Visual VM

2. hs_err_pid.log format : We need to use the command line tools that come free with the JDK.
a. jstack -> Prints thread stack traces
b. jmap -> Heap memory details
Most of the time we read these manually, as they are in text format, and we can use filtering. There are also small parsers, mainly shell scripts, for different filters. Here is one example.
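As an idea of what such a small filter can look like, here is a minimal sketch that counts threads per state in a jstack-style text dump. ThreadStateCounter is a hypothetical name of my own; the only thing it relies on is the "java.lang.Thread.State: ..." line that jstack prints for each Java thread.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts thread states in the text of a thread dump.
class ThreadStateCounter {

    // jstack prints a line such as "   java.lang.Thread.State: RUNNABLE" per Java thread.
    private static final Pattern STATE =
            Pattern.compile("^\\s*java\\.lang\\.Thread\\.State: (\\w+)", Pattern.MULTILINE);

    static Map<String, Integer> countStates(String dumpText) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher m = STATE.matcher(dumpText);
        while (m.find()) {
            counts.merge(m.group(1), 1, Integer::sum); // group 1 is the state name
        }
        return counts;
    }
}
```

A large number of BLOCKED threads in the result is usually a good reason to open the dump in one of the tools above.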

GC Log Analysis :

As there are different JVM implementations built on OpenJDK, GC logs also come in different formats. The most commonly used are the IBM & Oracle formats. Each JVM has different types of GC logs, but mostly they follow a general text format. Sometimes an old IBM verbose GC log is not compatible with the GC logs generated by the latest OpenJDK or Oracle JVMs. I will cover how to generate the logs in detail in a separate blog post.

For Oracle(default : Solaris JVM format)  & IBM Verbose GC log reading :

1. GCMV (Garbage Collection and Memory Visualizer) : Standalone or as an Eclipse plugin; comes with ISA.

2. PMAT (IBM Pattern Modeling and Analysis Tool) : Reads multiple GC logs together; comes with ISA.

3. One good third-party log viewer : GCViewer. Supported formats :
Sun JDK 1.4/1.5 ( -Xloggc:<file> [-XX:+PrintGCDetails] )
Sun JDK 1.2.2/1.3.1/1.4 (-verbose:gc )
IBM JDK 1.3.1/1.3.0/1.2.2 ( -verbose:gc)
IBM iSeries Classic JVM 1.4.2 (-verbose:gc)
HP-UX JDK 1.2/1.3/1.4.x ( -Xverbosegc)
BEA JRockit 1.4.2/1.5 (-verbose:memory)

You may find some other GC log reader tools in this link.
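Since these logs are plain text, a small reader is easy to sketch. The example below is a minimal sketch that handles only the simple one-line form of the older Sun/Oracle -verbose:gc output, such as "[GC 325407K->83000K(776768K), 0.2300771 secs]". It does not handle -XX:+PrintGCDetails output, IBM verbose GC, or the newer unified logging format. SimpleGcLogParser and GcEvent are hypothetical names of my own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the simple one-line -verbose:gc format only.
class SimpleGcLogParser {

    // One collection: heap occupancy before/after, heap capacity and pause time.
    static final class GcEvent {
        final boolean full;
        final long beforeKb;
        final long afterKb;
        final long capacityKb;
        final double pauseSecs;

        GcEvent(boolean full, long beforeKb, long afterKb, long capacityKb, double pauseSecs) {
            this.full = full;
            this.beforeKb = beforeKb;
            this.afterKb = afterKb;
            this.capacityKb = capacityKb;
            this.pauseSecs = pauseSecs;
        }

        long reclaimedKb() {
            return beforeKb - afterKb;
        }
    }

    // Matches "[GC 1K->2K(3K), 0.1 secs]", with an optional cause such as "(Allocation Failure)".
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC)(?:\\s*\\([^)]*\\))?\\s+(\\d+)K->(\\d+)K\\((\\d+)K\\),\\s*([0-9]+\\.[0-9]+) secs\\]");

    static List<GcEvent> parse(List<String> lines) {
        List<GcEvent> events = new ArrayList<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (m.find()) { // find() tolerates a timestamp prefix before the bracket
                events.add(new GcEvent(
                        "Full GC".equals(m.group(1)),
                        Long.parseLong(m.group(2)),
                        Long.parseLong(m.group(3)),
                        Long.parseLong(m.group(4)),
                        Double.parseDouble(m.group(5))));
            }
        }
        return events;
    }
}
```

Lines that do not match are skipped silently, so for anything beyond a quick look the tools listed above are the better choice.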

Note :
1. For both VMs we can use JProfiler, which is a paid tool. If we use YourKit, different versions of YourKit support different versions of JVMs; see the archive page of YourKit for more details.
2. For IBM tools, it is better to use an IBM JVM. You can either download the JDK from IBM or download their development package with Eclipse, where the JDK is included.
3. For IBM JVMs, an OOM will usually create a phd, a javacore & a trc file. If you are storing GC logs, the verbose GC log will be there as well.

Please comment if you have any question. Thanks.. :)

How to find memory leaks in application? General Ideas

In this article we are going to look at general ideas for detecting memory leaks. We will use a top-down approach so that we can follow the symptoms from outside the application to inside it.

As examples, I will refer to Java (J2EE) and .NET (ASP.NET/Web Forms) applications. I will try to avoid theoretical definitions and instead explain things in my own way.

What is a memory leak?
As the name suggests, it is about high memory consumption, and it can show up in several ways. In general, when the memory an application requires keeps increasing while the application is running, it might indicate a memory leak. A memory increase is not necessarily a leak; the logical relation between the functionality and the memory it requires defines whether it is a leak or not. Based on symptoms, I divide memory leaks into two categories.

Symptom A : Increase of memory usage : While running the application, if we see the system memory & application memory usage (heap + non heap) increase, that is a clear indication that the required memory is getting higher.

We need to relate this to the actions that we perform with the application. Let's say our actions work with a large volume of data; then it is logical to see a memory increase. But we need to find out how much. And we need to observe whether the new objects created for those actions are removed from the runtime environment as soon as the functionality has ended.

(For Java/.NET, forcing a GC from a profiler is useful to judge whether the objects are collected as soon as they are no longer needed.)

Sometimes we can see the memory increase more and more, like stepping up stairs: even after GC, memory consumption is still growing, and we can see objects being promoted between generations.
.NET : Gen 0 moves to Gen 1, and Gen 1 moves to Gen 2.
Java : Nursery -> Survivor (S0), S0 -> S1, and S1 -> Tenured (old gen).

This can lead to a critical stage and cause an Out of Memory exception.
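On the Java side, the per-generation numbers that a monitoring tool shows can also be read from inside the application. Here is a minimal sketch using the standard java.lang.management beans; HeapPoolSampler is a hypothetical name of my own, and the pool names returned (Eden, Survivor, Old Gen or similar) depend on the JVM and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: samples heap usage per memory pool of the current JVM.
class HeapPoolSampler {

    // Used bytes per heap pool; pool names vary by JVM and garbage collector.
    static Map<String, Long> usedBytesPerHeapPool() {
        Map<String, Long> used = new LinkedHashMap<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) { // null means the pool is no longer valid
                    used.put(pool.getName(), usage.getUsed());
                }
            }
        }
        return used;
    }

    // Total used heap in bytes, across all heap pools.
    static long totalHeapUsed() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
    }
}
```

Sampling these values after each action and watching whether the old generation pool keeps growing gives the same picture as the stair pattern described above.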

Symptom B : Memory usage is not dropping : While running the application with a large number of users or a large volume of data (a typical stress test scenario), we see that memory is not released and stays occupied even over a longer observation period. This may cause memory to increase with every new action performed on the application.
To confirm this symptom, we may force a GC: if memory usage goes down, it is okay; if not, the memory is constantly occupied. We can also see the same generational changes as described in Symptom A.

It is logical that when we perform a task in the application, it loads new classes, creates objects and does the task. But as soon as the task completes, we should expect those objects to be destroyed/cleaned up in the next GC cycle.
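The expectation above can be checked in a small experiment. The sketch below uses a WeakReference as a probe: if the probed object is still there after a forced GC, something is holding on to it. ReachabilityProbe and its static LEAKY_CACHE are hypothetical names of my own that simulate a leak. Note that System.gc() is only a request to the JVM, so an object that was cleanly released may still show up for a while; only the leaked case is guaranteed to survive.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: does an object survive a forced GC after the task is done?
class ReachabilityProbe {

    // Simulated leak: a static collection that is never cleared.
    static final List<Object> LEAKY_CACHE = new ArrayList<>();

    // The task result is kept in the static cache, so it stays strongly reachable.
    static WeakReference<Object> doWorkAndLeak() {
        Object result = new byte[1024 * 1024];
        LEAKY_CACHE.add(result);
        return new WeakReference<>(result);
    }

    // The task result is not stored anywhere, so it becomes eligible for collection.
    static WeakReference<Object> doWorkCleanly() {
        Object result = new byte[1024 * 1024];
        return new WeakReference<>(result);
    }

    // Requests a GC and reports whether the probed object is still alive.
    static boolean survivesForcedGc(WeakReference<?> probe) {
        System.gc(); // a request only; the JVM may ignore it
        return probe.get() != null;
    }
}
```

With a profiler the idea is the same: force a GC after the task, then look for instances that should no longer exist.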

Tools :
1. OS memory monitoring tool (perfmon or similar)

2. Application memory monitoring tool
.NET : VMMap, or any profiler (ANTS Memory Profiler, dotMemory)
Java : JConsole, Visual VM or Java Mission Control (JMC)
IBM Java : Health Center, under IBM Support Assistant
Or a commercial profiler like YourKit or JProfiler

You may also choose any APM tool (Dynatrace, New Relic or AppDynamics) to monitor both technologies.

3. Tools to collect & compare memory snapshots
.NET : PerfView, or commercial ANTS Memory Profiler, dotMemory
Java : JConsole/Visual VM, JMC or commercial tools YourKit or JProfiler

4. Analysis tool : Either analyze manually, looking at what you need, or follow the suggestions of a commercial tool.

So, how to find the memory leak ?

I will follow top-down analysis. I will write a separate blog post on what top-down analysis is; as a quick recap, it is an approach to performance analysis that goes from the top-level system down to a deep dive into the code at run time.

Step A : Choose Scenario:
First you need to know which area of your application might have a memory leak. Usually, leak analysis comes after a performance test or profiling activity, or in the worst case after Out of Memory exceptions have happened.
In all cases, the first step must be to know which actions, events or UI interactions are causing the suspicion. This ensures that the effort you put in is going in the right direction.

For example : Let's consider a banking application. When a user logs in, he gets his account balance & all other banking functions. If he does a credit, debit or any other transaction, the application takes a large amount of memory due to different third-party dependencies, and leaks are suspected.

Step B : Monitor System & Process Memory Usage :
Monitor environment/OS memory usage : Once you know the scenario, run the same process repeatedly for a longer period, either by automation (load test tool or functional test tool) or manually, and monitor the memory usage of
1. Host PC, using an OS monitoring tool
2. Application process (each OS has process monitoring)
3. Application runtime (Java/.NET)
(This part depends on the architecture of the runtime; to know more, you can visit my posts below.)

This is a small example of IIS process monitoring with Task Manager and perfmon.



Now we have clear visibility of memory across the environment and can see the memory usage trends (over time). Let's go to the next stage.

You can also monitor the application internally with profilers.
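To put a number on the trend seen in the monitoring data, one simple option is a least-squares slope over the samples. This is a minimal sketch under my own assumptions: MemoryTrend is a hypothetical name, and the samples are whatever you exported from your monitoring tool as (seconds, used bytes) pairs.

```java
// Hypothetical helper: estimates the memory growth rate from monitoring samples.
class MemoryTrend {

    // Least-squares slope of used memory (bytes) against time (seconds).
    // A clearly positive slope over repeated runs of the same scenario is a warning sign.
    static double slopeBytesPerSecond(double[] seconds, double[] usedBytes) {
        if (seconds.length != usedBytes.length || seconds.length < 2) {
            throw new IllegalArgumentException("need at least two matching samples");
        }
        int n = seconds.length;
        double meanT = 0;
        double meanM = 0;
        for (int i = 0; i < n; i++) {
            meanT += seconds[i];
            meanM += usedBytes[i];
        }
        meanT /= n;
        meanM /= n;

        double covariance = 0;
        double variance = 0;
        for (int i = 0; i < n; i++) {
            double dt = seconds[i] - meanT;
            covariance += dt * (usedBytes[i] - meanM);
            variance += dt * dt;
        }
        if (variance == 0) {
            throw new IllegalArgumentException("all samples share the same timestamp");
        }
        return covariance / variance;
    }
}
```

A slope close to zero after the warm-up phase suggests stable usage; it says nothing yet about which objects are responsible, which is what the next steps are for.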


Step C : Break Down Your Scenario:
After getting memory usage over time, break down each of your chosen areas (from Step A) separately and narrow your scope. You will end up with small, well-defined steps.
From the Step A example, we choose one of the actions to analyze for a memory leak, let's say the debit transaction. Let's assume that for a debit transaction we need to
1. Log in & see the home screen
2. Go to transactions
3. Select transfer money to another account
4. Give all inputs and click transfer
5. Wait for the mobile/ID verification steps & complete them
6. See the success message

Step D : Prepare your Memory profiler :
We now have a fixed, small group of steps as a part of the big scenario. Now prepare your memory profiler. I recommend the following :

.NET (free) : PerfView

.NET (paid) :
1. dotMemory (I prefer)
2. ANTS Memory Profiler

Java (free) :
1. Visual VM
2. JConsole
3. JMC
For IBM, Health Center if JMC/Visual VM/JConsole are not supported

Java (paid) :
1. JProfiler (I prefer)
2. YourKit

Note : Paid tools make it easier to find memory leaks thanks to the way they present the data.

So, attach your memory profiler and take a memory snapshot at the beginning of the steps that we chose in Step C (in the example, just after login).

Perform each step in the UI and take the related heap snapshot in the memory profiler.
(Note : if you are testing a JS/Ajax application, verify with the browser tools that your UI requests are actually passed to the server; otherwise you may see no changes on the server from your actions in the UI.)

So, after doing all the steps, you will have a heap snapshot for each step.

Step E : Comparing heap snapshots :
Consider the initial heap snapshot (in the example, just after login) as the baseline or zero snapshot. Let's say we have taken 5 snapshots in total and No. 1 is the base snapshot. Now we can compare in two ways.
1. Comparing each snapshot with the baseline : This shows the differences from the initial state.

In the example, that means comparing snapshots 2, 3, 4 and 5 with snapshot 1.

2. Comparing each snapshot with its previous snapshot : This shows the gradual growth.

In the example, that means comparing snapshot 5 with 4, 4 with 3, 3 with 2, and 2 with 1.
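The two comparison styles can be sketched in a few lines if each snapshot is reduced to a class histogram, meaning a map from class name to instance count. This is a minimal sketch under that assumption; SnapshotDiff is a hypothetical name of my own, and real profilers compare far more than instance counts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: compares class histograms (class name -> instance count).
class SnapshotDiff {

    // Change in instance count per class between two snapshots; unchanged classes are omitted.
    static Map<String, Long> diff(Map<String, Long> older, Map<String, Long> newer) {
        Map<String, Long> delta = new TreeMap<>();
        for (Map.Entry<String, Long> e : newer.entrySet()) {
            long change = e.getValue() - older.getOrDefault(e.getKey(), 0L);
            if (change != 0) {
                delta.put(e.getKey(), change);
            }
        }
        for (Map.Entry<String, Long> e : older.entrySet()) {
            // classes that disappeared completely show up as a negative count
            if (!newer.containsKey(e.getKey()) && e.getValue() != 0) {
                delta.put(e.getKey(), -e.getValue());
            }
        }
        return delta;
    }

    // Way 1: every snapshot against the first (baseline) snapshot.
    static List<Map<String, Long>> againstBaseline(List<Map<String, Long>> snapshots) {
        List<Map<String, Long>> result = new ArrayList<>();
        for (int i = 1; i < snapshots.size(); i++) {
            result.add(diff(snapshots.get(0), snapshots.get(i)));
        }
        return result;
    }

    // Way 2: every snapshot against the one taken just before it.
    static List<Map<String, Long>> againstPrevious(List<Map<String, Long>> snapshots) {
        List<Map<String, Long>> result = new ArrayList<>();
        for (int i = 1; i < snapshots.size(); i++) {
            result.add(diff(snapshots.get(i - 1), snapshots.get(i)));
        }
        return result;
    }
}
```

The baseline comparison answers "what has accumulated since login", while the previous-snapshot comparison points to the step that introduced it.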

Step F : Analyze Snapshots :
When you do the comparison, paid tools give you the differences by default. With free tools, you might need to get them manually by sorting, grouping and folding. So, let's see which items we need to look at.

1. Summary Analysis : A summary view of memory growth across all snapshots. In the summary, we need to see

a. How many times GC happened (to check it against our expectation)

b. Heap memory growth, broken down by generation (.NET) or Nursery/Survivor/Old generation (Java)

c. Based on old generation growth & GC events, decide whether the growth was necessary. In fact, if your application calls GC manually, you might see extra growth in the old generation (Gen 2 in .NET).
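For item a on the Java side, the GC counts can be read from the standard GarbageCollectorMXBean, so you can note them before and after each step. This is a minimal sketch; GcCounter is a hypothetical name of my own, and the collector names returned depend on the JVM and the GC algorithm in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: how many collections ran between two points in time?
class GcCounter {

    // Collector name -> number of collections so far (-1 if the JVM does not report it).
    static Map<String, Long> collectionCounts() {
        Map<String, Long> counts = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            counts.put(gc.getName(), gc.getCollectionCount());
        }
        return counts;
    }

    // Difference between two readings taken with collectionCounts().
    static Map<String, Long> collectionsBetween(Map<String, Long> before, Map<String, Long> after) {
        Map<String, Long> delta = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            delta.put(e.getKey(), e.getValue() - before.getOrDefault(e.getKey(), 0L));
        }
        return delta;
    }
}
```

Taking one reading at each snapshot gives the number of young and old collections per step, which can then be compared with the old generation growth from item b.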

Here is a small example from ANTS Memory Profiler and dotMemory (some names are blurred).


This happened when we solved a small part of the problem (search only).


In dotMemory


2. Heap snapshot comparison : Here we need to look at the following differences
a. New Objects -> To determine which new objects are created for our particular step
b. Survived Objects -> Which objects are still needed from the previous stage (like the session object or previous data)
c. New Classes -> Classes newly present in the runtime, related to the step. We might also look at the class loaders
d. Big Objects (85K+, .NET only) -> In the CLR these are handled separately (Large Object Heap); you can filter based on object size
e. Promoted Generations -> Promoted objects indicate long-living objects

3. Reference Analysis : After the heap snapshot comparison, we need to go target by target.

Find the objects with large differences in size or in instance count.

Note : In Java, size comes in two types: shallow size, which is the size of the object itself, and retained size, which is the size of the object plus everything that is reachable only through it (in other words, the memory that would be freed if the object were cleaned up).

For each candidate, try to analyze
- Whether there is any circular reference (exclude same interface implementation)
- How far an object is from GC (that is, after how many GC events it will be collected)
[if it is 1, it is the nearest, so it will be collected in the next GC and you can safely ignore it]
- Size of a single instance of the object. More often than not, God classes cause serious problems.
- Outgoing references and their sizes
- Incoming references and who initiates them
- Instance list, to know exactly how many instances there are and who created them
- Instances which are associated with external resources (DB/file writer etc.)
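The difference between shallow and retained size from the note above is easier to see on a toy object graph. The sketch below is my own illustration, not how any particular profiler implements it: RetainedSizeSketch is a hypothetical name, objects are plain string ids, and the retained size of a target is computed as the total shallow size of everything that stops being reachable from the GC roots once the target is taken away.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model: object ids, outgoing references and shallow sizes in bytes.
class RetainedSizeSketch {

    // All objects reachable from the roots, pretending that 'excluded' does not exist.
    static Set<String> reachable(Map<String, List<String>> refs,
                                 Collection<String> roots, String excluded) {
        Set<String> seen = new HashSet<>();
        Deque<String> stack = new ArrayDeque<>();
        for (String root : roots) {
            if (!root.equals(excluded)) {
                stack.push(root);
            }
        }
        while (!stack.isEmpty()) {
            String node = stack.pop();
            if (!seen.add(node)) {
                continue; // already visited, this also makes cycles harmless
            }
            for (String next : refs.getOrDefault(node, Collections.<String>emptyList())) {
                if (!next.equals(excluded)) {
                    stack.push(next);
                }
            }
        }
        return seen;
    }

    // Retained size = shallow size of everything that would become unreachable without 'target'.
    static long retainedSize(Map<String, List<String>> refs, Map<String, Long> shallow,
                             Collection<String> roots, String target) {
        Set<String> withTarget = reachable(refs, roots, null);
        Set<String> withoutTarget = reachable(refs, roots, target);
        long total = 0;
        for (String node : withTarget) {
            if (!withoutTarget.contains(node)) {
                total += shallow.getOrDefault(node, 0L);
            }
        }
        return total;
    }
}
```

For example, if a cache and a service both point to the same object, that object counts toward neither one's retained size, because removing just one of them would not free it.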





And in all paid tools you get the benefit of a retention graph or call tree. With these, you can backtrack from an object to its initiator & event. This is the main reason why paid tools make it easier to find memory problems quickly.

In summary, know your application's memory usage : this is the key point for going forward.

So now you have your leaking object, and you know who is calling it and how it is called. Next you have to eliminate the leak. For that, there are some generic and some technology-specific workflows. I will write a separate post on how to approach solving memory leaks in Java & .NET.

Note : Most tools trigger a full GC before taking a snapshot, so for analysis you need to take that event into account alongside real-time monitoring data and heap comparison analysis. For this, you may use the free tools (shipped with the framework) to take heap snapshots manually and compare them.

References :
1. A quick overview on JVM Architecture
2. What are the elements inside DotNet?
3. What is CLR? How it works?

This is a continuing post; I will add more specific information later on request.

Please comment if you have any question.

 Thanks.. :)