Tue 20 Sep 2011
OutOfMemoryError: Fun with Heap Dump Analysis
My recent post on postmortem analysis discussed the investigative trail you can follow when your web app slows down and crashes. In the latter part of that post I focused on thread dump analysis. In this post I’ll continue the theme by switching to heap dump analysis and all the fun you can have with a large fresh heap of… memory.
What is a heap dump?
A heap dump is a snapshot of memory at a given point in time. It contains information on the Java objects and classes in memory at the time the snapshot was taken. Just as a thread dump captures a single instant in time, a single heap dump cannot provide much in the way of temporal information to answer questions like “where did an object come from?” or “what has it been doing?”
Heap dumps come in various formats: the Sun VM generates HPROF whereas the IBM VM provides portable heap dump (PHD) files. At Rally we run Sun’s VM so we usually work with HPROF dumps. The internals vary a little but the basic principles are still the same.
Why would you want to read one?
If your Java application crashes with an OutOfMemoryError it is possible to automatically get a heap dump to analyze. This view into the memory profile of your application at the time it crashed can help you figure out what caused the error. Was it a leak? A poor choice of data structure or algorithm? Or is your application simply large and complex, requiring more memory to be allocated to it at execution time?
Reading a heap dump can also help you understand the memory footprint of your app. You can see which data structures and object graphs are the largest. This can help you decide what to optimize in your code. For example, you may choose a more efficient storage mechanism for sparse data collections (List over Array?), remove empty collections, improve your hashing functions to get fewer collisions in your HashMaps, or start using weak references for certain objects.
On the flip side, heap dumps are not so good for figuring out why you have a lot of GC-able objects being created or which methods are greedy in terms of object allocation. This is because heap dumps do not tell you about object life cycle. Objects in heap dumps are identified by their memory addresses and not by some persistent ID. Since objects are not pinned to a single memory address and may move around you cannot track a given object between heap dumps. For looking at object life cycle over time you should go back to thread dumps.
How to get a heap dump?
Execute the Java VM with the appropriate parameters:
- -XX:+HeapDumpOnOutOfMemoryError writes heap dump on OutOfMemoryError (recommended)
- -XX:+HeapDumpOnCtrlBreak writes heap dump together with thread dump on CTRL+BREAK
Or you can run the jmap tool or one of the other tools below:
- Sun (Linux, Solaris; not on Windows) JMap Java 5: jmap -heap:format=b <pid>
- Sun (Linux, Solaris; Windows see link) JMap Java 6: jmap.exe -dump:format=b,file=HeapDump.hprof <pid>
- Sun (Linux, Solaris) JMap with Core Dump File: jmap -dump:format=b,file=HeapDump.hprof /path/to/bin/java core_dump_file
- Sun JConsole: Launch jconsole.exe and invoke operation dumpHeap() on HotSpotDiagnostic MBean
- SAP JVMMon: Launch jvmmon.exe and call menu for dumping the heap
Some analysis tools will even generate one for you.
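The dumpHeap() operation that JConsole exposes can also be called from code. Here is a minimal sketch, assuming a HotSpot-based VM where the com.sun.management.HotSpotDiagnosticMXBean interface is available. The HeapDumper class name and the default file name are my own choices for illustration, not part of any tool mentioned above.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;

public class HeapDumper {
    // Name under which HotSpot registers its diagnostic MBean.
    private static final String HOTSPOT_BEAN = "com.sun.management:type=HotSpotDiagnostic";

    /**
     * Writes an HPROF heap dump of the current JVM to the given path.
     * liveOnly = true dumps only objects still reachable from the GC roots.
     * The call fails with an IOException if the file already exists.
     */
    public static void dumpHeap(String path, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                HOTSPOT_BEAN,
                HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(path, liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name; newer JDKs require the .hprof suffix.
        String path = args.length > 0 ? args[0] : "HeapDump.hprof";
        dumpHeap(path, true);
        System.out.println("Heap dump written to " + path);
    }
}
```

The resulting file can be opened in an analysis tool in the same way as a dump produced by jmap.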
How to read a heap dump?
Well, it’ll be in binary format so you don’t read the plain file. Instead use a tool like Memory Analyzer Tool.
MAT has built-in functions to help you find suspect objects and will offer to run them for you. It’ll draw a pie chart showing memory usage and even take you to a stack trace (JDK 6 and up) for offending objects.
It’ll also display data on:
- All loaded classes
  - name
  - superclass
  - class-loader
  - defined fields for instances (name and value)
  - static fields for classes (name and value)
- Object information
  - class and values of all fields
  - reference and primitive types
    - looking at internals, say char[] in a StringBuffer, can be very helpful
- List of GC roots
- Callstacks of threads (JDK 6 and up)
As an aside, the JVM usually does a GC before generating a heap dump so everything you see should be live. It is possible to have GC-able objects that are unreachable from the GC roots. Most tools like MAT will attempt to remove these when loading your dump file.
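To make “unreachable from the GC roots” concrete, here is a toy sketch of the reachability idea. It is only an illustration: the object names and the map-based object graph are invented for the example, and this is not how MAT or the JVM actually implement the check.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ReachabilitySketch {
    // Walks outgoing references from the roots and returns every object id visited.
    public static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> seen = new HashSet<String>();
        Deque<String> pending = new ArrayDeque<String>(roots);
        while (!pending.isEmpty()) {
            String id = pending.pop();
            if (seen.add(id)) {
                List<String> outgoing = refs.get(id);
                if (outgoing != null) {
                    pending.addAll(outgoing);
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        // Hypothetical object graph: each key holds references to the listed ids.
        Map<String, List<String>> refs = new HashMap<String, List<String>>();
        refs.put("thread:main", Arrays.asList("buffer"));
        refs.put("buffer", Arrays.asList("char[]"));
        refs.put("orphanA", Arrays.asList("orphanB"));
        refs.put("orphanB", Arrays.asList("orphanA")); // a cycle that no root points to

        Set<String> live = reachable(refs, Arrays.asList("thread:main"));
        Set<String> unreachable = new TreeSet<String>(refs.keySet());
        unreachable.removeAll(live);

        System.out.println("live: " + new TreeSet<String>(live));
        System.out.println("unreachable: " + unreachable);
    }
}
```

In this model the two orphans reference each other but nothing reachable from a root references them, so they are the kind of GC-able objects a tool would filter out.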
Example:
I wrote a tiny Java application that continually appended random text to a StringBuffer until it crashed with a java.lang.OutOfMemoryError. I executed the code with this command: java -XX:+HeapDumpOnOutOfMemoryError MemoryEater.
When the JVM ran out of memory it created an HPROF dump file for me. I loaded the file into MAT and got this:
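For reference, a program along these lines can be very small. The following is a reconstruction for illustration, not the original source: only the MemoryEater class name comes from the command above, and the appendRandom helper and the choice of appending random ints are assumptions.

```java
import java.util.Random;

public class MemoryEater {
    // Appends one random int to the buffer and returns the buffer's new length.
    // Hypothetical helper, split out so the append step can be exercised on its own.
    public static int appendRandom(StringBuffer buffer, Random random) {
        buffer.append(random.nextInt());
        return buffer.length();
    }

    public static void main(String[] args) {
        StringBuffer buffer = new StringBuffer();
        Random random = new Random();
        // Never terminates normally: the buffer keeps growing until the JVM
        // can no longer allocate a larger backing array and throws OutOfMemoryError.
        while (true) {
            appendRandom(buffer, random);
        }
    }
}
```

Running it with a small heap, for example by adding -Xmx16m to the command, makes the crash happen quickly and keeps the dump file small.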

Clicking on the “See stacktrace” link gave me the exact line of code that was building up the huge StringBuffer causing the application to crash. Now I know where to go look in my code to fix this bug:
main
at java.lang.OutOfMemoryError.<init>()V (OutOfMemoryError.java:25)
at java.util.Arrays.copyOf([CI)[C (Arrays.java:2882)
at java.lang.AbstractStringBuilder.expandCapacity(I)V (AbstractStringBuilder.java:100)
at java.lang.AbstractStringBuilder.append(I)Ljava/lang/AbstractStringBuilder; (AbstractStringBuilder.java:597)
at java.lang.StringBuffer.append(I)Ljava/lang/StringBuffer; (StringBuffer.java:329)
at MemoryEater.main([Ljava/lang/String;)V (MemoryEater.java:9)
