
Understanding Java’s Project Loom

Last updated on December 02, 2022 - 11 comments

You can use this guide to understand what Java's Project Loom is all about and how its virtual threads (also called 'fibers') work under the hood.

Project Loom’s Virtual Threads

Trying to get up to speed with Java 19’s Project Loom, I watched Nicolai Parlog’s talk and read several blog posts.

All of them showed how virtual threads (or fibers) can essentially scale to hundreds of thousands or even millions, whereas good, old, OS-backed Java threads could only scale to a couple of thousand (TBD: check OS-thread hypothesis in real-world scenarios).

try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    IntStream.range(0, 100_000).forEach(i -> executor.submit(() -> {  // (1)
        Thread.sleep(Duration.ofSeconds(1));
        System.out.println(i);
        return i;
    }));
}
  1. The example the blog posts used: letting 100,000 virtual threads sleep.

A hundred thousand sleeping virtual threads, fine. But could I now just as easily execute 100,000 HTTP calls in parallel, with the help of virtual threads?

// what's the difference?

for (int i = 0; i < 1_000_000; i++) {
    // good, old Java Threads
    new Thread(() -> getURL("https://www.marcobehler.com"))
        .start();
}


for (int i = 0; i < 1_000_000; i++) {
    // Java 19 virtual threads to the rescue?
    // note: startVirtualThread() already starts the thread for us -
    // calling .start() on it again would throw IllegalThreadStateException
    Thread.startVirtualThread(() -> getURL("https://www.marcobehler.com"));
}

Let’s find out.

Why are some Java calls blocking?

Here is the code from our getURL method above, which opens a URL and returns its contents as a String.

static String getURL(String url) {
    try (InputStream in = new URL(url).openStream()) {
        byte[] bytes = in.readAllBytes(); // ALERT, ALERT!
        return new String(bytes);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

When you open up the JavaDoc of InputStream.readAllBytes() (or are lucky enough to remember your Java 101 class), it gets hammered into you that the call is blocking, i.e. it won't return until all the bytes are read - your current thread is blocked until then.

How come I can now supposedly execute this call a million times in parallel when running inside virtual threads, but not when running inside normal threads?

Parts of the puzzle - topics you never knew you wanted to know more about after CS 101: Sockets & Syscalls.

Sockets

When you want to make an HTTP call or rather send any sort of data to another server, you (or rather the library maintainer in a layer far, far away) will open up a Socket. And accessing sockets, by default, is blocking.

// pseudo-code
Socket s = new Socket();

// blocking call, until data is available
s.read();

However, operating systems also allow you to put sockets into non-blocking mode, where a read returns immediately if no data is available. It is then your responsibility to check back later to find out if there is any new data to be read.

// pseudo-code
Socket s = new Socket();

// pseudo code, consult a random Java NIO tutorial
s.setBlocking(false);   // ;D

// yay, this call will return immediately, even if there is no data
s.read();
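
The real (non-pseudo) API for this lives in java.nio. Here is a minimal sketch of non-blocking reads, using a Pipe instead of a network socket so it runs self-contained - `configureBlocking(false)` on a SelectableChannel is the actual call behind the pseudo-code above:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;

public class NonBlockingReadDemo {
    public static void main(String[] args) throws IOException {
        // a Pipe gives us a readable SelectableChannel without any networking
        Pipe pipe = Pipe.open();
        Pipe.SourceChannel source = pipe.source();

        // the real API behind our pseudo-code "s.setBlocking(false)"
        source.configureBlocking(false);

        ByteBuffer buffer = ByteBuffer.allocate(64);

        // no data has been written yet: a non-blocking read returns 0
        // immediately, instead of blocking the thread
        int bytesRead = source.read(buffer);
        System.out.println("read before write: " + bytesRead);

        // now write some data and poll until all 5 bytes arrived -
        // "checking back later" is exactly our responsibility here
        pipe.sink().write(ByteBuffer.wrap("hello".getBytes()));
        int total = 0;
        while (total < 5) {
            int n = source.read(buffer);
            if (n > 0) total += n; else Thread.onSpinWait();
        }
        System.out.println("read after write: " + total);

        source.close();
        pipe.sink().close();
    }
}
```

In real code you would not spin-wait but register the channel with a Selector, which is exactly the clunky NIO code virtual threads let you avoid.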

Syscalls

When executing the getURL() call above, Java doesn’t do the network call (open up a socket, read from it, etc) itself - it asks the underlying operating system to do the call. And here’s the trick: Whenever you are using good-old Java threads, the JVM will use a blocking system call (TBD: show OS call stack.).

When run inside a virtual thread, however, the JVM will use a different system call to do the network request, which is non-blocking (e.g. use epoll on Unix-based systems.), without you, as Java programmer, having to write non-blocking code yourself, e.g. some clunky Java NIO code.

To cut a long story short (and ignoring a whole lot of details), the real difference between our getURL calls inside good, old threads, and virtual threads is, that one call opens up a million blocking sockets, whereas the other call opens up a million non-blocking sockets.

Now, if you tried out this (nonsensical) example in the real world(™), you'd find that, depending on your operating system and on whether you are sending or receiving data, you'd run into operating system socket limits - a reminder that virtual threads are not an automagically scaling solution that frees you from knowing what you are doing (isn't that always true? :) )
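
To see the scaling behavior without hitting those socket limits, here is a self-contained sketch with the HTTP call swapped for Thread.sleep as a stand-in for any blocking call, and the count scaled down to 10,000 (assumes JDK 21, where virtual threads are final):

```java
import java.time.Duration;
import java.util.concurrent.Executors;

public class ManyBlockingCalls {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();

        // try-with-resources: close() waits for all submitted tasks to finish
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    // stand-in for a blocking call like getURL(...):
                    // each virtual thread "blocks" for one second
                    Thread.sleep(Duration.ofSeconds(1));
                    return null;
                });
            }
        }

        long elapsed = System.currentTimeMillis() - start;
        // 10,000 x 1s of blocking finishes in a couple of seconds of wall
        // time, because blocked virtual threads don't occupy OS threads
        System.out.println("finished all tasks, under 10s: " + (elapsed < 10_000));
    }
}
```

Try running the same loop with `Executors.newFixedThreadPool(200)` instead: 10,000 seconds of sleeping divided over 200 OS threads takes around 50 seconds.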

Filesystem calls

While we are at it. How would virtual threads behave when working with files?

// Let's read in a million files in parallel!

for (int i = 0; i < 1_000_000; i++) {
    // Java 19 virtual threads to the rescue?
    Thread.startVirtualThread(() -> readFile(someFile));
}

With sockets it was easy, because you could just set them to non-blocking mode. But with file access, there is no asynchronous I/O on most operating systems (well, except for io_uring in newer Linux kernels).

To cut a long story short, your file access call inside the virtual thread will actually be delegated to a (…drum roll…) good, old operating system thread, to give you the illusion of non-blocking file access.
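
From your code's point of view nothing changes: you keep writing plain blocking file reads. A small self-contained sketch, reading 100 temp files (stand-ins for "someFile") on virtual threads:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class VirtualThreadFileReads {
    public static void main(String[] args) throws Exception {
        // set up a few temp files to read
        Path dir = Files.createTempDirectory("loom-demo");
        for (int i = 0; i < 100; i++) {
            Files.writeString(dir.resolve("file-" + i + ".txt"), "content-" + i);
        }

        AtomicInteger filesRead = new AtomicInteger();

        // each blocking Files.readString call is transparently handed off
        // to an OS thread under the hood - our code stays blocking-style
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 100; i++) {
                Path file = dir.resolve("file-" + i + ".txt");
                executor.submit(() -> {
                    String content = Files.readString(file); // blocking file I/O
                    if (content.startsWith("content-")) {
                        filesRead.incrementAndGet();
                    }
                    return content;
                });
            }
        }

        System.out.println("files read: " + filesRead.get());
    }
}
```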

How do virtual threads work?

Even though good, old Java threads and virtual threads share the name…Threads, the comparisons and online discussions feel a bit apples-to-oranges to me.

It helped me to think of virtual threads as tasks that will eventually run on a real thread(™) (called a carrier thread) AND that need the underlying native calls to do the heavy non-blocking lifting.
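
You can actually see the carrier thread peeking through in a virtual thread's toString() - a tiny sketch (the exact string format is a JDK implementation detail, not a guarantee, so don't parse it in real code):

```java
public class CarrierThreadDemo {
    public static void main(String[] args) throws Exception {
        Thread vt = Thread.ofVirtual().start(() -> {
            System.out.println("isVirtual: " + Thread.currentThread().isVirtual());
            // while mounted, toString() typically shows the carrier, along
            // the lines of: VirtualThread[#23]/runnable@ForkJoinPool-1-worker-1
            System.out.println(Thread.currentThread());
        });
        vt.join();
    }
}
```

The carriers live in a dedicated ForkJoinPool, sized by default to the number of available processors.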

In the case of IO-work (REST calls, database calls, queue/stream calls etc.) this will absolutely yield benefits, and at the same time it illustrates why virtual threads won't help at all with CPU-intensive work (or will even make matters worse). So, don't get your hopes up thinking about mining Bitcoin on a hundred thousand virtual threads.

Hype & Promises

Almost every blog post on the first page of Google surrounding JDK 19 copied the following text, describing virtual threads, verbatim.

A preview of virtual threads, which are lightweight threads that dramatically
reduce the effort of writing, maintaining, and observing high-throughput,
concurrent applications. Goals include enabling server applications written
in the simple thread-per-request style to scale with near-optimal
hardware utilization (...) enable troubleshooting, debugging, and
profiling of virtual threads with existing JDK tools.

While I do think virtual threads are a great feature, I also feel paragraphs like the above will lead to a fair amount of scale hype-train'ism. Web servers like Jetty have long been using NIO connectors, where a handful of threads can keep hundreds of thousands or even a million connections open.

The problem with real applications is that they do silly things, like calling databases, working with the file system, executing REST calls or talking to some sort of queue/stream.

And yes, it's this type of I/O work where Project Loom will potentially shine. Loom gives you, the programmer (or maybe even more "just" the HTTP/database/queue library & framework maintainers), the benefit of essentially non-blocking code, without having to resort to the somewhat unintuitive async programming model (think RxJava / Project Reactor) and all the consequences that entails (troubleshooting, debugging etc.).

However, forget about automagically scaling up to a million virtual threads in real-life scenarios without knowing what you are doing. There is no free lunch.

What about the Thread.sleep example?

We started this article with making threads sleep. So, how does that work?

  • When calling Thread.sleep() on a good, old, OS-backed Java thread, you will in turn generate a native call that makes the thread sleepey-sleep for a given amount of time - a nonsensical scenario anyway, and quite costly for 100,000 threads.

  • In the case of VirtualThread.sleep(), the JVM marks the virtual thread as sleeping and creates a scheduled task on a good, old Java (OS-thread-based) ScheduledThreadPoolExecutor. That task will unpark/resume your virtual thread after the given sleep time. Exercise for you: apples-to-oranges, again?
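
That park-then-scheduled-unpark mechanism can be sketched with plain java.util.concurrent primitives. This is a toy illustration of the idea, not the JDK's actual implementation:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

public class ToyVirtualSleep {
    // an OS-thread-based scheduler that wakes sleepers up, similar in
    // spirit to the JDK's internal ScheduledThreadPoolExecutor
    private static final ScheduledExecutorService WAKER =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            });

    // toy version of VirtualThread.sleep(): park the current thread and
    // schedule a task that unparks it after the delay
    static void toySleep(long millis) {
        Thread current = Thread.currentThread();
        long deadline = System.nanoTime() + millis * 1_000_000;
        WAKER.schedule(() -> LockSupport.unpark(current), millis, TimeUnit.MILLISECONDS);
        // re-park on spurious wakeups until the deadline has passed
        while (System.nanoTime() < deadline) {
            LockSupport.park();
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        Thread vt = Thread.ofVirtual().start(() -> toySleep(200));
        vt.join();
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("slept roughly 200ms: " + (elapsedMs >= 190));
    }
}
```

The nice part: parking a virtual thread unmounts it from its carrier, so the "sleeping" thread costs you no OS thread while it waits.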

Fin

Want to see more of these short technology deep dives? Leave a comment below.

Meanwhile, check out Load Testing: An Unorthodox Guide to find out, why you should worry about other things than scale.

Acknowledgements

Thanks to Tagir Valeev, Vsevolod Tolstopyatov and Andreas Eisele for comments/corrections/discussions.

Comments (read-only since June '24)

11 comments

Anonymous March 04, 2023
Did you realise that in your 'old Java threads' example , you're doing everything in the main thread?

for (int i = 0; i < 1000000; i++) {
// good, old Java Threads
new Thread( getURL("https://www.marcobehler.com"))
.start();
}

You're calling the getURL() function WHILE constructing the thread object.
You haven't overridden the run() method so when you call start() on the Thread object it will execute the run() implementation of the base class

https://docs.oracle.com/en/java/javase/19/docs/api/java.base/java/lang/Thread.html#run()

The default implementation executes the Runnable task that the Thread was created with. If the thread was created without a task then this method does nothing.

Also, your getURL() implementation is poorly chosen because the URLConnection will use caching by default.

Try to run your example while running Wireshark or tcpdump and you will see that it does NOT try to open a million connections to
https://www.marcobehler.com

Have you tried to run your example (without virtual threads)?
It won't start a lot of concurrent threads, it won't open many HTTPS connections and it will finish fine, if you wait long enough.

Marco Behler March 05, 2023
Hi there,
thanks for spotting that I missed the ()-> lambda when copy & pasting & fumbling code into the blog article. This is now fixed. No need to reference half the JavaDoc next time :)

Also, thanks for bringing up the caching behavior of URLConnection, but that is "somewhat" irrelevant for what this example is being used in the article. The main question opened up by all other blog posts was: "can we ask the OS to spawn 1.000.000" threads and what effect will that have, as compared to virtual threads. Making sure that indeed we open up a million sockets and thus demonstrating that we might reach the OS's socket limit is another topic.
veiko.soomets December 20, 2022
Thanks for the article. I love your combination of explaining topics in a technical and low level way while also keeping things simple and easy to understand. I am a fan now. :)
Anonymous December 13, 2022
thanks for the article (s) i like to read them, so I hope you will write more and more :)
alprab December 07, 2022
The Dragonwell JVM from Ali Baba effectively turns your Java threads into fibers (i.e. multiplexes them on OS threads) and thus probably already gives us many of the benefits of Project Loom in a real current JVM without having to modify our program. See: https://github.com/alibaba/dragonwell8/wiki/Wisp-Documentation
Anonymous December 07, 2022
I find very necessary to highlight here (as in any "generating-hype" improvement) this phrase: "There is no free lunch." to count not only the potential benefits but also the limitations and/or drawbacks. I liked the post.
Anonymous December 07, 2022
Could you also explain/add article about how introduction of Project loom impact on use of CompletableFuture API ?
Anonymous November 29, 2022
Typo: search for availabel :)
Marco Behler December 02, 2022
Thanks, fixed.
Anonymous November 26, 2022
I definitively want to see more of these short technology deep dives!
aydar.kh November 10, 2022
As always, everything is clear and understandable about the complex topic!
Raghavan alias November 08, 2022
A well narrated article, making the perceived-to-be-complex-topic in an easy manner. Thank you Marco! I liked the stuff you mentioned about SysCalls in the right context!

Cheers,
Raghavan alias Saravanan Muthu.
Anonymous November 08, 2022
I enjoy your writing style and how you have presented your journey, as always, mixed with some common sense.

let mut author = ?

I'm @MarcoBehler and I share everything I know about making awesome software through my guides, screencasts, talks and courses.

Follow me on Twitter to find out what I'm currently working on.