Abstract:
Biased locking has made unnecessary mutexes cheap for over a decade. However, it is disabled by default in Java 15 and slated for removal. From Java 15 onwards we should be more diligent in avoiding synchronized in places where we do not need it.
Welcome to the 282nd edition of The Java(tm) Specialists' Newsletter, sent to you from ... the Island of Crete (good guess :-)). This month I did several live Java streams. The first six were accidental ;-) My friend David sent me frantic messages on WhatsApp:
[13:31, 7/13/2020] David Gomez Garcia: Hey Heinz.
[13:31, 7/13/2020] David Gomez Garcia: I'm not sure if you are streaming online in Facebook and periscope on purpose.
[13:32, 7/13/2020] David Gomez Garcia: It seems like you are recording clips for your courses... and not really meant for a live stream.

I was trying to record a "sales pitch" for my new Juppies 2 course. I have no problem speaking about technical things for hours. But marketing stuff - that is hard. My little "Go Live" button sent it to Restream.io, which then diligently broadcast my antics to three Facebook accounts, Periscope/Twitter, YouTube, Twitch and a few others. This was not for public consumption, and one of the preview images had me digging for diamonds. It took me an hour to delete them all.
But then I thought - this is fun, let us do more. I announce them on Twitter and the recordings are here.
Another thing. I have moved my Java consulting offerings onto Teachable as well, to make purchasing easier. You can buy single hours or bundles of consulting over here.
javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.
Last month, I sent a puzzle showing how single-threaded access of Vector had slowed down in Java 15. The first to send the correct explanation was Ulrich Grepel. With JEP 374, biased locking has been disabled and deprecated. Turn it on with -XX:+UseBiasedLocking and Java 15 runs as fast as the previous versions.
My second puzzle showed further evidence that biased locking, or rather its absence, was to blame. The IdentityHashMap calls System.identityHashCode() on the vectors, thus disabling biased locking on those individual objects (see newsletter 222). Well done to Bas de Bakker for being the first to figure out that weird behavior.
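The slowdown is easy to observe even without the puzzle code. Here is a minimal sketch of my own (the class name VectorTiming and its method are hypothetical, not from the puzzle) that times thread-confined Vector.add() calls; running it on Java 14 and on Java 15, with and without -XX:+UseBiasedLocking, shows the difference:

```java
import java.util.Vector;

public class VectorTiming {
    // Times n single-threaded add() calls on a fresh, thread-confined Vector.
    // Every add() acquires the Vector's monitor, which biased locking used
    // to make almost free for the single owning thread.
    static long timeAdds(int n) {
        Vector<Integer> vector = new Vector<>();
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) vector.add(i);
        return System.nanoTime() - start;
    }

    public static void main(String... args) {
        for (int run = 0; run < 3; run++)
            System.out.printf("1m adds: %dms%n",
                timeAdds(1_000_000) / 1_000_000);
    }
}
```

The absolute numbers will vary by machine; what matters is the relative difference between Java versions and flag settings.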
I also mentioned in the puzzle that the results were a bit different for Java 10. No one picked up that subtlety. Here are the biased locking JVM flags for Java 9:
java -XX:+PrintFlagsFinal -version | grep Biased
     intx BiasedLockingBulkRebiasThreshold = 20
     intx BiasedLockingBulkRevokeThreshold = 40
     intx BiasedLockingDecayTime           = 25000
     intx BiasedLockingStartupDelay        = 4000
     bool UseBiasedLocking                 = true

And here are the settings for Java 10, with the BiasedLockingStartupDelay set to 0.
java -XX:+PrintFlagsFinal -version | grep Biased
     intx BiasedLockingBulkRebiasThreshold = 20
     intx BiasedLockingBulkRevokeThreshold = 40
     intx BiasedLockingDecayTime           = 25000
     intx BiasedLockingStartupDelay        = 0
     bool UseBiasedLocking                 = true

Biased locking got a bad rap in the Java performance world. Many years ago, one of the engineers at Azul Systems wrote a benchmark that seemed to indicate that biased locking could cause a long time to safepoint. However, he left and apparently his colleagues struggled to reproduce his results. Perhaps it is true, or maybe not. Or did confirmation bias make programmers blame biased locking? That would be ironic.
When Java 5 was released, programmers moved en masse to ReentrantLock, following the promise of better performance and richer functionality. However, code with ReentrantLock was also harder to write and certainly more challenging to debug. Since Java 8, there has been a shift back to synchronized. For example, ConcurrentHashMap was rewritten and now locks internally with synchronized instead of ReentrantLock. CopyOnWriteArrayList changed to synchronized in Java 9, with this comment capturing the thinking nicely:
/**
 * The lock protecting all mutators.  (We have a mild preference
 * for builtin monitors over ReentrantLock when either will do.)
 */
final transient Object lock = new Object();

Synchronized is in my experience easier to analyze, more performant under low contention and more robust. The coding idioms are also much easier than with ReentrantLock or StampedLock.
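To see why the idioms matter, here is a sketch of my own (the class and method names are illustrative, not from any library) contrasting the two styles of guarding a simple counter:

```java
import java.util.concurrent.locks.ReentrantLock;

public class CounterIdioms {
    // Builtin monitor: the lock is released automatically when the
    // method returns, even if an exception is thrown.
    static class MonitorCounter {
        private int count;
        synchronized void increment() { count++; }
        synchronized int get() { return count; }
    }

    // ReentrantLock: unlock() must be called manually in a finally
    // block, which is verbose and easy to get wrong.
    static class ReentrantCounter {
        private final ReentrantLock lock = new ReentrantLock();
        private int count;
        void increment() {
            lock.lock();
            try {
                count++;
            } finally {
                lock.unlock();
            }
        }
        int get() {
            lock.lock();
            try {
                return count;
            } finally {
                lock.unlock();
            }
        }
    }

    public static void main(String... args) {
        MonitorCounter m = new MonitorCounter();
        ReentrantCounter r = new ReentrantCounter();
        for (int i = 0; i < 1000; i++) { m.increment(); r.increment(); }
        System.out.println(m.get() + " " + r.get());
    }
}
```

ReentrantLock does offer extras such as tryLock(), fairness and interruptible acquisition, but when we need none of those, the monitor version is shorter and harder to break.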
The only disadvantage that I know with synchronized is that virtual threads, as found in Project Loom, do not play nicely with monitor locks. Project Loom promises to be a game changer and should make coding in Java so much easier. It took me 2.5 hours to explain the basics of non-blocking IO. With Project Loom I could create the same functionality in one little class and in about 10 minutes of explanation, including time for questions.
If I had to choose which I want in Java 17, biased locking or virtual threads, I would definitely take virtual threads.
Back to biased locking. In JEP 374 they state: Furthermore, many applications that benefited from biased locking are older, legacy applications that use the early Java collection APIs, which synchronize on every access (e.g., Hashtable and Vector). Newer applications generally use the non-synchronized collections (e.g., HashMap and ArrayList), introduced in Java 1.2 for single-threaded scenarios, or the even more-performant concurrent data structures, introduced in Java 5, for multi-threaded scenarios.
True, it is unlikely that I would use Vector in modern code. Instead, I would use Collections.synchronizedList(new ArrayList<>()) if I needed a thread-safe list. Most of the time, I would write my code so that I would not have to synchronize my list, and thus an ArrayList would do. However, for maps I follow Jack Shirazi's advice to use ConcurrentHashMap as my default map. It is as sensible as wearing a seat belt: most likely you will be just fine never wearing one, but you need just one accident to ruin your life. Similarly, the advice that I have been following and promulgating for the last few decades is to make our Java code correct and then let HotSpot optimize it for us. If it is fast enough, great; otherwise we profile and fix the bottlenecks. Synchronized was easy to fix: if a lock was contended, we could find it quickly with the available tooling.
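A short sketch of those defaults (my own illustrative class, with hypothetical helper names):

```java
import java.util.*;
import java.util.concurrent.*;

public class DefaultCollections {
    // Shared list: wrap an ArrayList rather than reach for the legacy Vector.
    static List<Integer> sharedList() {
        return Collections.synchronizedList(new ArrayList<>());
    }

    // Default map choice: ConcurrentHashMap, the "seat belt".
    static Map<String, Integer> defaultMap() {
        return new ConcurrentHashMap<>();
    }

    public static void main(String... args) {
        // Thread-confined: a plain ArrayList is enough, no locking needed.
        List<Integer> local = new ArrayList<>();
        local.add(1);

        List<Integer> shared = sharedList();
        shared.add(2);

        Map<String, Integer> map = defaultMap();
        map.put("squares", 4);

        System.out.println(local + " " + shared + " " + map);
    }
}
```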
With Java 15, this advice might be dangerous to follow. As we saw, our demo ran twice as slowly as in Java 14. All we did was use a class that happened to be synchronized. Furthermore, since each list is thread confined, the lock is never contended. Thus the threads would not go into the BLOCKED state. Our usual toolset for finding lock contention would not help us.
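A sketch of my own (hypothetical class name) of that situation: each task fills its own Vector, so the monitor is acquired millions of times but never contended, no thread ever reports BLOCKED, and lock-contention tooling stays silent even though every add() pays the locking cost:

```java
import java.util.*;
import java.util.concurrent.*;

public class ThreadConfinedLocks {
    // Fills a fresh, thread-confined Vector: uncontended locking on
    // every add(), but no thread ever waits for another.
    static long fill(int n) {
        Vector<Integer> v = new Vector<>();
        for (int i = 0; i < n; i++) v.add(i);
        return v.size();
    }

    public static void main(String... args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        List<Future<Long>> results = new ArrayList<>();
        for (int t = 0; t < 4; t++)
            results.add(pool.submit(() -> fill(1_000_000)));
        long total = 0;
        for (Future<Long> f : results) total += f.get();
        pool.shutdown();
        System.out.println("total = " + total);
    }
}
```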
The same issue can also happen with ConcurrentHashMap, which sometimes uses synchronized on put().
import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;

public class ConcurrentHashMapBench {
    public static void main(String... args) {
        for (int i = 0; i < 10; i++) {
            test(false);
            test(true);
        }
    }

    private static void test(boolean parallel) {
        IntStream range = IntStream.range(1, 100_000_000);
        if (parallel) range = range.parallel();
        long time = System.nanoTime();
        try {
            ThreadLocal<Map<Integer, Integer>> maps =
                ThreadLocal.withInitial(() -> {
                    Map<Integer, Integer> result = new ConcurrentHashMap<>();
                    for (int i = 0; i < 1024; i++)
                        result.put(i, i * i);
                    return result;
                });
            range.map(i -> maps.get().put(i & 1023, i)).sum();
        } finally {
            time = System.nanoTime() - time;
            System.out.printf("%s %dms%n",
                parallel ? "parallel" : "sequential",
                (time / 1_000_000));
        }
    }
}

Here are the results for different versions of Java running on my 1-6-2 MacBook Pro Late 2018 model.
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment (build 14.0.1+7)
OpenJDK 64-Bit Server VM (build 14.0.1+7, mixed mode, sharing)

sequential 2441ms
parallel 525ms
sequential 2405ms
parallel 479ms
sequential 2381ms
parallel 480ms
sequential 2414ms
parallel 474ms
sequential 2424ms
parallel 485ms
sequential 2420ms
parallel 479ms
sequential 2417ms
parallel 476ms
sequential 2406ms
parallel 469ms
sequential 2377ms
parallel 473ms
sequential 2374ms
parallel 469ms

The degradation in performance when putting into a ConcurrentHashMap is not as bad in Java 15 as it was with Vector, but it is still easily observable:
openjdk version "15-ea" 2020-09-15
OpenJDK Runtime Environment (build 15-ea+30-1476)
OpenJDK 64-Bit Server VM (build 15-ea+30-1476, mixed mode, sharing)

sequential 3057ms
parallel 574ms
sequential 3208ms
parallel 529ms
sequential 3167ms
parallel 535ms
sequential 3219ms
parallel 542ms
sequential 3221ms
parallel 525ms
sequential 3198ms
parallel 548ms
sequential 3234ms
parallel 537ms
sequential 3220ms
parallel 538ms
sequential 3214ms
parallel 537ms
sequential 3158ms
parallel 536ms

When we explicitly turn biased locking on with -XX:+UseBiasedLocking, we get better performance:
OpenJDK 64-Bit Server VM warning: Option UseBiasedLocking was deprecated in version 15.0 and will likely be removed in a future release.
openjdk version "15-ea" 2020-09-15
OpenJDK Runtime Environment (build 15-ea+30-1476)
OpenJDK 64-Bit Server VM (build 15-ea+30-1476, mixed mode, sharing)

sequential 2237ms
parallel 490ms
sequential 2315ms
parallel 468ms
sequential 2285ms
parallel 444ms
sequential 2277ms
parallel 451ms
sequential 2222ms
parallel 461ms
sequential 2183ms
parallel 474ms
sequential 2236ms
parallel 455ms
sequential 2218ms
parallel 459ms
sequential 2192ms
parallel 437ms
sequential 2222ms
parallel 438ms

I have been consulting on Java for more than two decades. This change in Java 15 might add some wonderful new opportunities ;-) Jokes aside, for now there is an easy way to test. If the performance of your system is not good enough in Java 15, turn biased locking on and see if it improves to acceptable levels. Most likely it will not make a difference. If it does, then chances are that you are overusing synchronized. We would then need to use profilers to find the offending unnecessary mutexes. Good luck :-)
Kind regards from Crete
Heinz
Java Specialists Superpack 2020: Our entire Java Specialists Training in One Huge Bundle