Sunday, May 18, 2008

Object Queue/Pooling

I am currently in the design phase of a system that requires high throughput and an equally high level of performance. In one of the design meetings, a very experienced member of the team made an interesting remark: let's build an object pool, it will increase performance. Hmmm. That was really interesting to me and inspired me to write this post on object pooling.

The myth about the object pooling goes like this “Object pooling works. The idea is that we can reuse objects by pooling them on and off free lists instead of using new and letting the garbage collector pick them up. Once you have more than one thread going off the pool, you need a synchronized free list, which has costs. If the list gets hot and contended, you can get scaling bugs. It gets complicated too fast and is not worth it for small to even moderate sized objects. Use it only for large objects.”

Object pooling is about objects being pre-created and “pooled” for later use. The idea behind it is that taking an object from a pool of identical objects is far cheaper than creating a new instance. Creating a new instance involves the following steps:

  1. Loading the class if not already loaded.
  2. Obtaining the required memory from the heap.
  3. Creating an instance of the class within the obtained memory.

With object pooling, all of these steps have already been performed for a pre-configured number of objects. The argument is that one of these pre-created objects can be handed out whenever a new instance is required, and returned to the pool after use. The problems with this paradigm are:

Pooling is not cheap. It requires the following steps as part of its execution:

  1. The pool should be locked when the object is being obtained.
  2. The pool needs to be potentially scanned for unused objects.
  3. The object needs to be marked as _used_.
  4. The pool can then be unlocked.

None of these steps is cheap, since they involve synchronization locks shared across multiple threads.
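The four steps above can be sketched roughly as follows. This is only a minimal illustration, not a production pool: `NaivePool`, `acquire` and `release` are hypothetical names made up for this post, and a real implementation might block or grow instead of returning `null` when it runs out of objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical class for illustration only; not taken from any library.
final class NaivePool<T> {
    private final List<T> objects = new ArrayList<>();
    private final boolean[] inUse;

    NaivePool(int size, Supplier<T> factory) {
        inUse = new boolean[size];
        for (int i = 0; i < size; i++) {
            objects.add(factory.get()); // pre-create every pooled object up front
        }
    }

    // Steps 1 and 4: "synchronized" locks the pool on entry and unlocks it on exit.
    synchronized T acquire() {
        // Step 2: scan the pool for an unused object.
        for (int i = 0; i < inUse.length; i++) {
            if (!inUse[i]) {
                inUse[i] = true; // Step 3: mark the object as used.
                return objects.get(i);
            }
        }
        return null; // pool exhausted (assumption: this sketch simply gives up)
    }

    // Returning an object needs the same lock and another scan.
    synchronized void release(T obj) {
        for (int i = 0; i < inUse.length; i++) {
            if (objects.get(i) == obj) {
                inUse[i] = false;
                return;
            }
        }
    }
}
```

Every thread that calls `acquire()` or `release()` has to take the same lock, so under load the pool itself can become the point of contention, whereas a plain `new` needs no lock shared with application code.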

  1. Pooling does not automatically benefit from garbage collection. The object must be explicitly returned to the pool to avoid memory leaks. Isn’t garbage collection one of the most compelling reasons to use Java or a similar language in the first place?
  2. Objects need to be coded to be “stateless”. Otherwise, each object has to be re-initialized before it can be safely re-used from the pool. The pooling implementation may do this for you, but it adds overhead when the object is returned.
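Both points show up in even the smallest caller. The sketch below assumes a hypothetical `BufferPool` of `StringBuilder` objects (the class and its `borrow`, `giveBack`, `idleCount` and `greet` methods are invented here for illustration): the caller has to hand the object back in a `finally` block, and the pool has to wipe the previous user's state on every return.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical pool of StringBuilder buffers, for illustration only.
final class BufferPool {
    private final Deque<StringBuilder> free = new ArrayDeque<>();

    synchronized StringBuilder borrow() {
        StringBuilder sb = free.poll();
        return (sb != null) ? sb : new StringBuilder();
    }

    synchronized void giveBack(StringBuilder sb) {
        sb.setLength(0); // re-initialize: clear whatever the previous user left behind
        free.push(sb);
    }

    synchronized int idleCount() {
        return free.size();
    }

    // A typical caller: the return to the pool is manual, not the garbage collector's job.
    static String greet(BufferPool pool, String name) {
        StringBuilder sb = pool.borrow();
        try {
            return sb.append("Hello, ").append(name).toString();
        } finally {
            pool.giveBack(sb); // if this is forgotten, the pool never sees the object again
        }
    }
}
```

Without the pool, `greet` would be a one-liner with a local `StringBuilder`, and neither the `finally` block nor the reset step would exist.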

Hence it’s pretty conclusive that pooling ordinary objects makes no sense, whereas pooling resources (such as a connection pool) makes a lot of sense.
