Premature optimization, where software thrives unless you kill it first — a tale of Java GC

Frankie
Mar 30, 2024

Will a LinkedList be faster? Should I swap the `for each` with an `iterator`? Should this `ArrayList` be an `Array`? This article came to be in response to an optimization so malevolent it has permanently etched itself into my memory.

This article was originally published at https://wasteofserver.com/premature-optimization-where-software-thrives-unless-you-kill-it-first-a-tale-of-java-gc/. You may want to check it there for new updates and comments.

Before going head-on into Java and the ways to tackle interference, either from the garbage collector or from context switching, let’s first glance over the fundamentals of writing code for your future self.

Premature optimization is the root of all evil.

You’ve heard it before; premature optimization is the root of all evil. Well, sometimes. When writing software, I’m a firm believer in being:

  1. as descriptive as possible; you should try to narrate intentions as if you were writing a story.
  2. as optimal as possible; which means that you should know the fundamentals of the language and apply them accordingly.

As descriptive as possible

Your code should speak intention, and a lot of it pertains to the way you name methods and variables.

int[] array1 = new int[10];        // bad
int[] numItems = new int[10];      // better
int[] backPackItems = new int[10]; // great

Just by the variable name, you can already infer functionality.

While numItems is abstract, backPackItems tells you a lot about expected behaviour.

Or say you have this method:

ArrayList<Countries> visitedCountries() {
    if (noCountryVisitedYet) {
        return new ArrayList<>(0);
    }
    // (...)
    return listOfVisitedCountries;
}

As far as code goes, this looks more or less ok.

Can we do better? While borderline pedantic, I believe so:

static final ArrayList<Countries> EMPTY_COUNTRIES_LIST = new ArrayList<>(0);

ArrayList<Countries> visitedCountries() {
    if (noCountryVisitedYet) {
        return EMPTY_COUNTRIES_LIST;
    }
    // (...)
    return listOfVisitedCountries;
}

Reading EMPTY_COUNTRIES_LIST is much more descriptive than new ArrayList<>(0);

Imagine you’re reading the above code for the first time and stumble on the guard clause that checks if the user has actually visited countries. Also imagine this is buried in a lengthy class. Reading EMPTY_COUNTRIES_LIST is definitely more descriptive and less straining than new ArrayList<>(0).
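One caveat when sharing a single ArrayList constant: it is mutable, so a caller that adds to the returned list changes it for everyone. Below is a minimal sketch that keeps the descriptive name but makes the constant immutable with List.of() (Java 9+). The TravelLog class, the Country enum and the visit method are hypothetical stand-ins, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;

final class TravelLog {
    // Hypothetical stand-in for the article's Countries type.
    enum Country { PORTUGAL, JAPAN }

    // Immutable, so the same instance can safely be handed to every caller.
    static final List<Country> EMPTY_COUNTRIES_LIST = List.of();

    private final List<Country> listOfVisitedCountries = new ArrayList<>();

    void visit(Country country) {
        listOfVisitedCountries.add(country);
    }

    List<Country> visitedCountries() {
        if (listOfVisitedCountries.isEmpty()) {
            return EMPTY_COUNTRIES_LIST;
        }
        // Read-only snapshot, so callers cannot modify internal state.
        return List.copyOf(listOfVisitedCountries);
    }
}
```

The trade-off is that the return type becomes List rather than ArrayList, which is usually what callers want anyway.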

As optimal as possible

Know your language and use it accordingly. If you need a double, there's no need to wrap it in a Double object. The same goes for using a List when all you actually need is an array.
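To make the double versus Double point concrete, here is a small sketch (the class and method names are illustrative). Both methods return the same sum, but the boxed accumulator unboxes and re-boxes on every iteration, which may allocate a new object each time unless the JIT manages to optimize it away.

```java
final class Boxing {
    // Primitive accumulator: no objects are created inside the loop.
    static double sumPrimitive(double[] values) {
        double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        return sum;
    }

    // Boxed accumulator: each += unboxes, adds, and boxes the result again.
    static Double sumBoxed(double[] values) {
        Double sum = 0.0;
        for (double value : values) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] values = {1.5, 2.5, 4.0};
        System.out.println(sumPrimitive(values));
        System.out.println(sumBoxed(values));
    }
}
```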

Know that you should concatenate Strings in a loop using StringBuilder, or StringBuffer if you're sharing state between threads.

// don't do this
String votesByCounty = "";
for (County county : counties) {
    votesByCounty += county.toString();
}

// do this instead
StringBuilder votesByCounty = new StringBuilder();
for (County county : counties) {
    votesByCounty.append(county.toString());
}

Know how to index your database. Anticipate bottlenecks and cache accordingly. All the above are optimizations. They are the kind of optimizations that you should be aware of and implement as first-class citizens.

How do you kill it first?

I’ll never forget a hack I read about a couple of years ago. Truth be told, the author backtracked quickly, but it goes to show how a lot of evil can spring from good intentions.

// do not do this, ever!
int i = 0;
while (i < 10000000) {
    // business logic

    if (i % 3000 == 0) { // prevent long gc
        try {
            Thread.sleep(0);
        } catch (InterruptedException ignored) { }
    }
    i++;
}

A garbage collector hack from hell!

You can read more on why and how the above code works in the original article and, while the exploit is definitely interesting, this is one of those things you should never ever do.

  • Works by side effects; Thread.sleep(0) has no apparent purpose in this block
  • Works by exploiting a deficiency of code downstream
  • For anyone inheriting this code, it’s obscure and magical

Only start forging something a bit more involved if, after writing with all the default optimizations the language provides, you’ve hit a bottleneck. But steer away from concoctions such as the above.

An interpretation of Java’s future Garbage Collector “imagined” by Microsoft Copilot

How to tackle that Garbage Collector?

If, after all that, the Garbage Collector is still the piece that’s offering resistance, these are some of the things you may try:

  • If your service is so latency sensitive that you can’t allow for GC, run with “Epsilon GC” and avoid GC altogether.
    -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC
    Memory will obviously grow until the JVM dies with an OutOfMemoryError, so this only suits short-lived runs or programs optimized not to allocate objects
  • If your service is somewhat latency sensitive, but the allowed tolerance permits some leeway, run G1 (-XX:+UseG1GC) and feed it something like -XX:MaxGCPauseMillis=100 (default is 200ms). This is a target the collector aims for, not a guarantee
  • If the issue stems from external libraries, say one of them calls System.gc() or Runtime.getRuntime().gc(), which typically trigger a full, stop-the-world collection, you can override the offending behaviour by running with -XX:+DisableExplicitGC
  • If you’re running on JDK 11 or later, do try the Z Garbage Collector (ZGC); performance improvements are monumental! On JDK 11 through 14 it is experimental and needs -XX:+UnlockExperimentalVMOptions -XX:+UseZGC.

Note that since Java 15, ZGC is production ready, but you still have to explicitly activate it with -XX:+UseZGC.
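When experimenting with the flags above, it helps to confirm which collector the JVM actually picked and how much time it has spent collecting. A minimal sketch using the standard java.lang.management API follows. The GcReport class name is an assumption, and the collector names printed vary with the JVM version and the flags in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

final class GcReport {
    // One bean per collector (or collector phase) active in this JVM.
    static List<GarbageCollectorMXBean> collectors() {
        return ManagementFactory.getGarbageCollectorMXBeans();
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : collectors()) {
            // Count and time are -1 when the collector does not report them.
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Running the same workload under different GC flags and comparing these numbers gives a rough first impression before reaching for a proper benchmark.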

Other relevant libraries you may want to explore:

Java Microbenchmark Harness (JMH)

If you’re optimizing by feel without any real benchmarking, you’re doing yourself a disservice. JMH is the de facto Java library to test your algorithms’ performance. Use it.

Java-Thread-Affinity

Pinning a thread to a specific core may improve cache hits. It will depend on the underlying hardware and how your routine is dealing with data. Nonetheless, this library makes it so easy to implement that, if a CPU-intensive method is dragging you down, you’ll want to test it.

LMAX Disruptor

This is one of those libraries that, even if you don’t need it, you’ll want to study. The idea is to allow for ultra-low-latency concurrency. But the way it’s implemented, from mechanical sympathy to the ring buffer, brings a lot of new concepts. I still remember when I first discovered it, seven years ago, pulling an all-nighter to digest it.
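To give a feel for the ring buffer idea, here is a deliberately small sketch of a single-producer, single-consumer ring buffer. It is not the LMAX Disruptor API and leaves out most of what makes the real library fast (padding against false sharing, batching, wait strategies). The SpscRingBuffer name and its methods are made up for illustration. The point is that slots are pre-allocated and reused, so the buffer itself creates no garbage per message.

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: the ring-buffer idea, not the Disruptor implementation.
final class SpscRingBuffer<T> {
    private final Object[] slots;
    private final int mask;                            // valid because capacity is a power of two
    private final AtomicLong head = new AtomicLong();  // next sequence to read
    private final AtomicLong tail = new AtomicLong();  // next sequence to write

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a positive power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1;
    }

    // Called only by the single producer thread. Returns false when full.
    boolean offer(T value) {
        long t = tail.get();
        if (t - head.get() == slots.length) {
            return false;
        }
        slots[(int) (t & mask)] = value;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    // Called only by the single consumer thread. Returns null when empty.
    @SuppressWarnings("unchecked")
    T poll() {
        long h = head.get();
        if (h == tail.get()) {
            return null;
        }
        int index = (int) (h & mask);
        T value = (T) slots[index];
        slots[index] = null; // drop the reference so the slot can be reused
        head.set(h + 1);     // volatile write frees the slot for the producer
        return value;
    }
}
```

Sequences only ever grow, and the bit mask maps them onto the fixed array, which is why the capacity has to be a power of two.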

Final thoughts

The code that made me write this post? I’ve written way worse. I still do. The best we can hope for is to continuously mess up less.

I’m expecting to be disgruntled looking at my own code a few years down the road.

And that’s a good sign.


Originally published at https://wasteofserver.com on March 30, 2024.
