Analyzing Java Heap problems Part 2: Using Eclipse MAT

In part one we saw how to obtain the data to analyze, the heap dumps. Now we are looking into a nice plugin for the Eclipse IDE for analyzing the dumps.  Compared to the basic tools described in the previous article Memory Analyzer Tool (MAT) offers better usability, performance and some high level analysis and report tools.Eclipse MAT Overview After you open a hprof heap dump with MAT it will generate index files for faster access to all the data you are interested in and show an overview with nice charts.  From here you have access to other views and features:

  • The histogram is somewhat similar to what jHat offers.mat-pathtogcroot It allows you to browse, sort and filter the object instances in memory and shows you instance count and the shallow heap (memory used only by this object instance) and retained heap (memory used by this object instance including referenced objects). From the context menu you can choose “Merge Shortest Paths to GC roots” to see the reference chain of an object all the way up to the classloader. Here we can see that the JDateChooser registers itself at the MenuSelectionManager as a listener which can cause serious memory leaks as described in another post about Java memory handling.
  • The dominator tree allows you to quickly identify the biggest objects and what they reference. Again, using the context menu on an item in the list offers many options to dive deeper into the analysis.
  • The object inspector gives you detailed information about the selected objects like shallow and retained size, its fields and the class loader by whom it was loaded.
  • The leak suspects report tries to give you some high level hints about possible causes of memory problems of your application.
  • MAT Component ReportThe component report provides some very interesting statistics about Strings and collection usage which might be worth looking at if you are not hunting down leaks but trying to reduce overall memory usage. You can even get performance hints when many overfull HashMaps are detected or there are many empty collections which could be better lazily created.

I personally am using the histogram and the dominator tree the most because I am a technical guy and like to hunt down the problems in the code. Nevertheless the reports may show use other valuable aspects which you did not think of before. The MAT team are expanding the tools nicely on that side so that the benefit of these reports is ever increasing.

It is very likely that when you analyze large heap dumps you may need to increase the Java heap size for Eclipse by using the -vmargs -Xmx<memory size> parameter. That way you are able to analyze big heaps > 500M relatively fast and comfortable. For some live demo take a look at a webinar by some of SAPs Eclipse MAT committers.

Observer/Listener structures in C++ with boost’s smart pointers

Whenever you are developing sufficiently large complex programs in languages like C++ or Java you have to deal with memory issues. This holds true especially when your program is supposed to run 24/7 or close to that. Because these kinds of issues can be hard to get right Java has this nice little helper, the garbage collector. But as Java solves all memory problems, or maybe not? points out, you can still easily shoot yourself in foot or even blow your whole leg away.  One of the problems stated there is that memory leaks can easily occur due to incorrect listener relations. Whenever a listener is not removed properly, which is either a large object itself or has references to such objects,  it’s only a matter of time until your program dies with “OutOfMemoryError” as its last words.  One of the proposed solutions is to use Java weak pointers for listener management.  Let’s see how this translates to C++.

Observer/listener management in C++ is often done using pointers to listener objects. Pointers are pretty weak by default. They can be :

  • null
  • pointing to a valid object
  • pointing to an invalid memory address

In listener relationships especially the latter can be a problem. For example, simple listener management could look like this:

   class SimpleListenerManagement
   {
   public:
      void addListener(MyListener* listener);
      void removeListener(MyListener* listener);
      void notifyListeners();
   private:
      std::list<MyListener*> listeners_;
   };

   void SimpleListenerManagement::notifyListeners()
   {
      // call notify on all listeners
      for (std::list<MyListener*>::iterator iter = listeners_.begin();
          iter != listeners_.end();
          ++iter)
      {
         (*iter)->notify(); // may be a bad idea!
      }
   }

In notifyListeners(), the pointer is used trusting that it still points to a valid object. But if it doesn’t, for instance because the object was deleted but the client forgot to removed it from the listener management, well, too bad.

Obviously, the situation would be much better if we didn’t use raw pointers but some kind of wrapper objects instead.  A first improvement would be to use boost::shared_ptr in the listener management:

   typedef boost::shared_ptr<MyListener> MyListenerPtr;

   class SimpleListenerManagement
   {
   public:
      void addListener(MyListenerPtr listener);
      void removeListener(MyListenerPtr listener);
      void notifyListeners();
   private:
      std::list<MyListenerPtr> listeners_;
   };

Provided that the given MyListenerPtr instance was created correctly by the client we can be sure now that all listeners exist when we call notify() on them.  Seems much better now. But wait! Using boost::shared_ptr, we now hold  strong references in our listeners list and are therefore kind of in the same situation as described in the post mentioned above. If the client forgets to remove its MyListenerPtr instance it never gets deleted and may be in a invalid state next time notify() is called.

A solution that works well in most cases is to use boost::weak_ptr to hold the listeners. If you see boost::shared_ptr on a level with normal Java references, boost::weak_ptrs are roughly the same as Java’ s weak references. Our listener management class would then look like this:

   typedef boost::shared_ptr<MyListener> MyListenerPtr;
   typedef boost::weak_ptr<MyListener> MyListenerWeakPtr;

   class SimpleListenerManagement
   {
   public:
      void addListener(MyListenerPtr listener);
      void removeListener(MyListenerPtr listener);
      void notifyListeners();
   private:
      std::list<MyListenerWeakPtr> listeners_; // using weak_ptr
   };

Note that addListener and removeListener still use MyListenerPtr as parameter. This ensures that the client provides valid listener objects.  The interesting stuff happens in notifyListeners():

   void SimpleListenerManagement::notifyListeners()
   {
      std::list<MyListenerWeakPtr>::iterator iter = listeners_.begin();
      while(iter != listeners_.end())
      {
         if ((*iter).expired())
         {
            iter = listeners_.erase(iter);
         }
         else
         {
            MyListenerPtr listener = (*iter).lock(); // create a shared_ptr from the weak_ptr
            listener->notify();
            ++iter;
         }
      }
   }

Each weak_ptr can now be checked if its object still exists before using it. If the weak_ptr is expired, it can simply be removed from the listeners list. With this implementation the removeListener method becomes optional and can as well be omitted. The client only has to make sure that the shared_ptr holding the listener gets deleted somehow.

Java solves all memory problems, or maybe not?

Many people think that Java’s Garbage Collector (GC) solves all of their memory management problems. It is true that the GC does a great job in many many real world situations. It really eases your life as software developer especially compared to programming in languages like C /C++ where memory management is a major PITA. Even there you can help yourself by using object systems with reference counting, smart pointers etc. but you have to be aware of this issue all the time.

So everything regarding memory is fine in Java?

Actually not really. Many Java developers do not think about code potentially leading to memory leaks. I would like to point out some problems we encountered. The problems can be divided into two categories:

  1. Native resources which have to be managed manually
  2. Listeners attached to central objects which are never removed again

Examples of native resources

Database connections, result sets and so on are a very common native resource that need manual management. JDBC is a real pain regarding resource management and especially Oracle is very susceptical to leaking those. Either you are very careful here or you use some framework to help you. If you do not want to go the whole way to a persistence framework like hibernate, iBatis or toplink a solution like Spring JDBCTemplate may help you a lot.

Another example is the JOGL TextRenderer which has to be manually disposed or you will leak texture memory  and soon run into resource problems.

Files/Streams and Sockets should be handled carefully too. In most cases you are more or less in the same boat with the C/C++ people but using finally can help you there.

Examples of listener leaks

Sometimes something innocent looking like a Swing Component can turn into a memory leak. We used JDateChooser one of our projects and found some of our data displaying dialogs to exist several times in memory and thus taking huge amounts of RAM eventually leading to OutOfMemoryExceptions. In case of dialogs and windows a WindowListener might help.

Sometimes you might write similar objects yourself that register to some central instance (maybe even a singleton *yuck*). Deregistering them always is easily forgotten or overlooked. A common code pattern to look out for listener leaks where you cannot deregister easily at the right moment is the following:

public class MyCoolClass implements IDataListener {

    public MyCoolClass(IDataProvider dataProvider) {
        super();
        dataProvider.addDataListener(this);
    }

    ...
}

Avoid such constructs as they can prove really dangerous. There is more that can be done to lower the risk of hard-hitting memory/listener leaks: Use WeakReferences for listener management at the crucial central objects. The referenced objects are taken care of by the GC and the listener manager has to take care of the WeakReferences. They can be cleaned up periodically or when a notification takes place.

Conclusion

The Java GC helps a lot in everyday programming but there are still things to look out for. Just be aware of the resources you are using and think about their need of management. I will write some follow up articles about getting heap dumps in different situations and searching them for memory leaks using some nice free tools.

Update:

Kris Kemper wrote a nice article about Swing Memory Leaks with JCalendar and a solution to the problem.