How I Learned to Love CMS and Had My Heart Broken by G1
G1 is Sun’s new garbage collector that first appeared in Java 6 update 14. It’s supposed to be the best of both worlds: concurrent garbage collection with compaction! Prior to G1 the choice was either concurrent collection, with very short “stop the world” GC pauses but no compaction, or the copying and compacting collector, whose “stop the world” pauses grew in duration with the size of your heap. The problem is that heap sizes have been rising steadily, and a 50 GB heap is entirely reasonable on modern hardware, if only the garbage collector were up for it!
At Flipboard we have some servers with 14 GB heaps. Before we could move to heaps of that size we knew we had to switch away from the parallel copying and compacting collector: its stop-the-world pauses were too frequent and too long, often in the 10-to-15-second range. On one particular server I first tried the CMS (Concurrent Mark Sweep) collector. I couldn’t get it to work for very long; it would fall over with failed allocations, and I assumed the problem was the fragmentation that is a side effect of any non-compacting collector. So then I gave the great new hope, G1 (enabled in those days with -XX:+UnlockExperimentalVMOptions -XX:+UseG1GC), a try. It didn’t fail to allocate, but it did seem to run a lot of full GCs, which are “stop the world” events. What the heck? What good is this? No good at all, is the answer, but I set out to figure out what was going wrong.
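Part of figuring that out was simply watching what the collectors were doing. Here’s a minimal sketch of the kind of thing I mean, using the JDK’s standard management beans; this little watcher is illustrative, not code from our servers:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Poll the GC MXBeans to see each collector's cumulative collection count
// and total time. Under G1 the beans are named "G1 Young Generation" and
// "G1 Old Generation"; under CMS they are "ParNew" and "ConcurrentMarkSweep".
public class GCWatcher {
    public static void main(String[] args) throws InterruptedException {
        while (true) {
            for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
                System.out.printf("%s: %d collections, %d ms total%n",
                        gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
            }
            Thread.sleep(10000); // sample every 10 seconds
        }
    }
}
```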
Watch What You Allocate
Turns out G1 kept running full GCs because we were making entirely unreasonable demands on the memory system: we were routinely trying to allocate 64, 128, 256, even 512 MB of contiguous memory! The problem was caused by the way we were using HBase and, more importantly, by the way the asynchronous HBase library we use loads data for multiple key/values. I made some changes to that library which solved the problem, and suddenly the G1 collector was working as advertised!
FYI #1: Attempts to allocate large contiguous arrays of data will result in full GCs in order to compact away the fragmentation and make it possible to satisfy the request. Avoid at all costs.
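To make FYI #1 concrete, here’s a minimal sketch of the idea behind the library change; the class and chunk size are hypothetical, not the actual patch. The point is to accumulate a large result in small, fixed-size pieces rather than one enormous contiguous array:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: buffer a large result in bounded chunks instead of
// one huge byte[]. A single allocation spanning a big stretch of contiguous
// heap is exactly what forces the collector into full, compacting GCs.
public class ChunkedBuffer {
    private static final int CHUNK_SIZE = 1 << 20; // 1 MB per chunk
    private final List<byte[]> chunks = new ArrayList<byte[]>();
    private int offsetInLast = CHUNK_SIZE; // forces a fresh chunk on first write

    public void append(byte[] data, int off, int len) {
        while (len > 0) {
            if (offsetInLast == CHUNK_SIZE) {     // current chunk is full
                chunks.add(new byte[CHUNK_SIZE]); // small, GC-friendly allocation
                offsetInLast = 0;
            }
            byte[] last = chunks.get(chunks.size() - 1);
            int n = Math.min(len, CHUNK_SIZE - offsetInLast);
            System.arraycopy(data, off, last, offsetInLast, n);
            offsetInLast += n;
            off += n;
            len -= n;
        }
    }
}
```

As I understand G1, any single allocation larger than half a region is treated as “humongous” and needs its own contiguous run of regions, which is what made those huge arrays so poisonous.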
With this fix in place we had made a great deal of progress, but we still ran into issues: our servers kept growing beyond the bounds of their heap sizes, eventually causing swapping, which is death to a Java VM.
How I Learned to Re-hate Finalization
Java has this notion of finalizers: if you declare a method called finalize(), the VM promises to call it once that object becomes garbage, before the memory is reclaimed. In the early days of Java, since you no longer had to free memory manually, we thought we’d take advantage of that and add finalization, so that you could be a lazy programmer and not even close your file descriptors! OK, that’s not exactly why we did it, but all that matters is we did do it. It has served as a great way to hide all sorts of actual bugs over the years, and to enable all sorts of bad decisions. And sadly, with modern GC algorithms supporting super large heaps, finalization is a huge problem rearing its ugly head. See, the VM makes no guarantees as to when it will run finalizers, and in the case of the new G1 collector, the answer seems to be: well, almost never.
The actual answer, as far as I can tell, is: we’ll run your finalizers after the next full GC. The problem with that is, if you play your cards right, you will never run a full GC, ever! Because with a 14 GB heap those GCs can take forever. (And, no, we’re not swapping.)
FYI #2: The G1 collector does not run finalizers. And it doesn’t matter if you personally don’t use finalizers in your code, because the JDK is full of them, from top to bottom. It’s almost comical how bad it is.
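For anyone who has managed to avoid them, this is the pattern; a minimal sketch, not code from our servers. The JDK’s own FileInputStream carries essentially this finalizer internally, which is why you’re affected even if your code never declares one:

```java
import java.io.FileInputStream;
import java.io.IOException;

// A class that leans on finalize() to release a resource. The VM only calls
// finalize() after the collector has decided the object is garbage -- and
// under G1 that appeared to mean "after the next full GC," i.e. possibly never.
public class LazyResource {
    private final FileInputStream in;

    public LazyResource(String path) throws IOException {
        in = new FileInputStream(path);
    }

    @Override
    protected void finalize() throws Throwable {
        try {
            in.close(); // the underlying file descriptor stays open until this runs
        } finally {
            super.finalize();
        }
    }
}
```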
The side effect of never running finalizers is unnecessary heap pressure and, worse, virtual memory problems. Some of the objects not getting finalized obviously have native components. I’m not sure whether the 4 million SocketImpls, FileInputStreams, etc. that a jmap histogram (jmap -histo) uncovered hold native resources, but I am sure that the direct memory buffers do: the direct memory itself! The end result is that our machines eventually ran out of virtual memory, as though a slow memory leak were occurring. Very frustrating.
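Direct buffers are the clearest example of the failure mode. A minimal sketch, with made-up sizes:

```java
import java.nio.ByteBuffer;

// Each direct buffer owns memory *outside* the Java heap. The Java-side
// ByteBuffer object is tiny, so it exerts almost no heap pressure, but the
// native allocation behind it is only released when the buffer object is
// collected and its cleanup hook runs. If that never happens, the process's
// virtual memory grows without bound; run unconstrained, this may eventually
// die with "OutOfMemoryError: Direct buffer memory".
public class DirectBufferLeak {
    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            ByteBuffer buf = ByteBuffer.allocateDirect(64 * 1024 * 1024); // 64 MB off-heap
            buf.put(0, (byte) 1); // touch it so the allocation is real
            // buf becomes unreachable here, but the 64 MB of native memory
            // is not returned until the collector reclaims the object.
        }
    }
}
```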
G1 Is CPU Intensive
The other thing we noticed is that CPU usage with the G1 collector was rather high: a freshly restarted machine would run at half the CPU utilization of a server that had been up for a while. How can that be a good thing? And we’re not talking 10% vs. 20%; we’re talking 25% vs. 50% or 60%. A huge difference.
CMS Makes a Comeback
Meanwhile, I kept reading on the web about people who definitely run 40 and 50 GB heaps, so how come I couldn’t? What was I missing? Why didn’t other people run into the issue of having no finalizers called, ever? Why didn’t other people see GC times of 30 seconds or more as their heap sizes went up? CMS (-XX:+UseConcMarkSweepGC) seemed to be a common thread among the success stories, so I decided to give it a try again. It had been a while since I last tried it.
Sure enough, CMS is a dream come true. Before I get to the comparison, I had to ask myself: why is this working now when it failed for me before? Then I remembered that it had failed for the very same reason the G1 collector failed: I was allocating those huge arrays back in the old days. Once I figured out that that was what had brought down the G1 collector, I forgot to go back and try CMS again to see if it worked any better. G1 just sounded so much better on paper that I had to go with it!
In theory the only thing G1 should have over CMS is compaction. I figured the lack of it would be a show-stopper, an obvious issue, and that we’d see signs of it pretty quickly once CMS had been running for a few hours. Far from it, however. In fact, here’s what I have seen:
- much lower CPU usage on average
- zero detectable “stop the world” delays
- lower heap usage in general
- very timely finalization
- zero swapping over time (probably due to the previous point)
The heap usage grows to some point, perhaps 70 or 80 percent, and then some magic occurs and it drops back to 20% over the course of 10 or 20 seconds. Unreal! This would never EVER happen with G1. The only way to see anything like that with G1 is to run a full GC (which we have disabled with a command line option because it stops the VM for 20 seconds or more). If we trigger a full GC with the CMS collector, the VM keeps running and the heap size just melts away down to 20%. Say it with me: the app never appears to stop at all!
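A note on what “trigger a full GC” means here: my assumption in this sketch is an explicit System.gc() call wired to some admin hook, with CMS running under -XX:+ExplicitGCInvokesConcurrent, which turns the explicit collection into a concurrent cycle; that would match the melt-away-while-running behavior above:

```java
// Minimal sketch under the assumptions stated above. With CMS and
// -XX:+ExplicitGCInvokesConcurrent, System.gc() kicks off a concurrent
// cycle instead of a stop-the-world full collection.
public class TriggerGC {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long before = rt.totalMemory() - rt.freeMemory();
        System.gc(); // concurrent under -XX:+ExplicitGCInvokesConcurrent; stop-the-world otherwise
        long after = rt.totalMemory() - rt.freeMemory();
        System.out.printf("used before: %d MB, used after: %d MB%n", before >> 20, after >> 20);
    }
}
```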
It is possible that fragmentation will come back to bite us eventually with the CMS collector, but if it’s every few days or once a week, who cares? We do weekly backend releases in the normal course of events anyway.