In 2018, I had the privilege of attending a talk by Alan Bateman at Devoxx. It was about Fibers in Java, a concept we know from other programming languages under different names, such as coroutines. There are subtle differences between coroutines and fibers, but that is a topic for another time. Time passed, and Project Loom shipped the feature with JDK 21, dropping the name “Fibers” in favor of “Virtual Threads.” They did a superb job with the name, because it describes exactly how the Java flavor of cooperative threads is implemented: just like regular threads, but virtual ones. Anything else could have confused the community and fueled endless debates among terminology and jargon fanatics. Putting all that aside, the main idea behind coroutines and virtual threads is to eliminate the cost of context switches between operating system threads by implementing a lightweight concurrency mechanism at the application layer. This is especially important for applications like web services, which must handle multiple client requests in parallel and make outgoing requests such as database calls and calls to other dependent services, i.e., I/O-bound applications where threads frequently need to be parked. Before we start, let’s look at what is wrong with operating system threads.
Nothing. However, operating system threads are considered expensive compared to their lightweight counterparts because their stacks are loaded and unloaded constantly as the scheduler preempts the currently executing thread in favor of another. The operating system scheduler may force a thread to stop according to its preemptive multitasking and time-slicing heuristics; in other words, it is entirely the scheduler’s decision when a thread gets to occupy hardware resources. Besides that, operating system threads run in kernel space and require full privileges on system resources. This adds further overhead if the program frequently starts and stops threads instead of pooling them.
JVM threads are so-called platform threads: they rest on operating system threads with a one-to-one relationship between them. Whenever we create a new thread in a Java application, it is backed by an operating system thread, so it is just as expensive. The stack size of a platform thread is fixed; it can be configured before the application bootstraps but cannot grow later. It is therefore essential to find a good ballpark figure for the stack size: large enough to satisfy the threads’ memory needs, yet small enough to keep the overhead of context switches low. And since every thread gets a fixed-size memory allocation, the number of threads is ultimately capped by the system’s capacity.
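For example, the fixed per-thread stack size can be set with the -Xss VM option before startup (the default value depends on the platform and JVM build; 512k here is just an illustrative choice):
-Xss512k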
How do Virtual Threads in JDK 21 make our programs more efficient?
First, virtual threads, as their name states, simulate threads at the application layer. They are executed in user space rather than in the kernel, which saves the overhead of the operating system’s resource allocation and privilege checks. Their stack is dynamically resizable because the memory occupied by a virtual thread lives on the heap: a thread may consume almost no memory initially and receive more space only when it needs it. The most significant benefit, however, is the multitasking strategy: virtual threads use cooperative multitasking, in contrast to platform threads, which are preemptively scheduled by the operating system. Like their coroutine siblings, virtual threads can yield their execution to another thread. They are still executed by platform threads in the end, but whenever the execution has to wait for an I/O operation, a virtual thread yields and is unmounted from its carrier platform thread, so the underlying heavyweight thread can be scheduled to run another virtual thread.
Using virtual threads also raises the limit on how many threads you can start on a machine simultaneously, while platform threads, with their high resource consumption, are bound by the available hardware capacity. Virtual threads are a good fit for the thread-per-task use case: they are cheap to create, and pooling them is unnecessary. If you are starting a greenfield project, don’t bother with thread pools for virtual threads; simply create a new virtual thread whenever you need to start a parallel task.
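To make the thread-per-task style concrete, here is a minimal, self-contained sketch (the class name, the task count, and the handleTask placeholder are mine, not from any framework): it spawns ten thousand virtual threads without any pool and simply waits for them to finish.
import java.util.ArrayList;
import java.util.List;

public class OneThreadPerTask {

    public static void main(String[] args) throws InterruptedException {
        List<Thread> workers = new ArrayList<>();

        // No pool: spawn one cheap virtual thread per task.
        for (int i = 0; i < 10_000; i++) {
            int id = i;
            workers.add(Thread.ofVirtual()
                    .name("task-" + id)
                    .start(() -> handleTask(id)));
        }

        // Wait for every task to finish before the JVM exits.
        for (Thread worker : workers) {
            worker.join();
        }
    }

    // Placeholder for real work, e.g. an outgoing I/O call.
    private static void handleTask(int id) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}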
How to use Virtual Threads?
Switching from platform to virtual threads should be manageable if you have already updated your project dependencies for the newest JDK release. You may face some challenges with libraries that instrument your code, but other than that, the upgrade should be straightforward. The Thread API remains the same, so you can keep using thread instances in your application code as before. The need for pooling threads has vanished, since pooling cheap virtual threads makes no sense; instead, you should refactor your code to offload work from thread pools onto virtual threads. One caveat is thread locals: they still work, but because virtual threads scale to such high numbers, thread-local state can quickly exhaust memory. If you want to eliminate that risk, now is a good time to stop relying on thread-local variables; they have never been particularly welcome in application code anyway.
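To illustrate the thread-local concern (the field name and buffer size below are made up for the example): a per-thread cache like this is reused across tasks when a small pool of platform threads runs everything, but with one virtual thread per task, every task allocates its own copy, so memory grows with the number of concurrent tasks.
// Harmless with a handful of pooled platform threads, risky with
// hundreds of thousands of virtual threads: each thread gets its own copy.
static final ThreadLocal<byte[]> SCRATCH_BUFFER =
        ThreadLocal.withInitial(() -> new byte[1024 * 1024]); // 1 MB per thread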
To leverage the full power of virtual threads, you should also refactor synchronized blocks, as synchronization pins a virtual thread to its carrier and blocks the platform thread because of the semantics of “synchronized”. If the carrier platform thread were unmounted inside a synchronized block, another platform thread might pick up the same virtual thread and continue inside that block once it is ready to run, which would break the synchronization contract. Therefore, synchronized blocks that block for a long time in the critical section hurt the scalability of virtual threads, and you should consider replacing them, e.g., with ReentrantLock.
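As a rough sketch of such a replacement (the class and method names are illustrative), a long-blocking synchronized block can be rewritten with a ReentrantLock, which does not pin the virtual thread to its carrier:
import java.util.concurrent.locks.ReentrantLock;

class ConnectionRegistry {

    private final ReentrantLock lock = new ReentrantLock();

    // Before: synchronized (this) { ...blocking call... } would pin the carrier.
    // With ReentrantLock the virtual thread can unmount while it blocks,
    // freeing the carrier platform thread for other virtual threads.
    void refresh() {
        lock.lock();
        try {
            // long-running or blocking work goes here
        } finally {
            lock.unlock();
        }
    }
}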
Virtual threads simplify your application code because they drop the need for thread pools. Think of all the parameters that make up a thread pool: core pool size, maximum size, whether a queue is needed, the queue size, and so on. All of them must be configured and fine-tuned according to the load on the system. With virtual threads, thread pools become obsolete. See the following example:
Runnable longRunning = () -> {
    Thread currentThread = Thread.currentThread();
    logger.info("Long running - start : {}", currentThread);
    // sleep(...) is a small helper that wraps Thread.sleep and handles InterruptedException
    sleep(Duration.ofSeconds(10).toMillis());
    logger.info("Long running - done : {}", currentThread);
};

Thread virtual = Thread.ofVirtual()
        .name("virt-0")
        .factory()
        .newThread(longRunning);

virtual.start();
virtual.join(); // wait for the virtual thread; otherwise the JVM may exit before it finishes
We start a new virtual thread in much the same way we create and start platform threads.
Starting: Thread[#1,main,5,main]
Long running - start : VirtualThread[#21,virt-0]/runnable@ForkJoinPool-1-worker-1
Long running - done : VirtualThread[#21,virt-0]/runnable@ForkJoinPool-1-worker-1
Process finished with exit code 0
In the logs, you can see that the virtual thread with the name “virt-0” and id “#21” is executed by the platform thread “ForkJoinPool-1-worker-1”. You can also use the dedicated ExecutorService factory method newVirtualThreadPerTaskExecutor, which creates a thread-per-task executor that runs every submitted task on a new virtual thread:
try (ExecutorService exec = Executors.newVirtualThreadPerTaskExecutor()) {
    exec.submit(longRunning);
}
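Note that leaving the try-with-resources block calls close() on the executor, which waits for the submitted tasks to complete before the program moves on.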
I mentioned that whenever a virtual thread is blocked, it is detached from its carrier so that the carrier platform thread can execute another virtual thread. Let’s look at another example with two threads in action: the first one sleeps for a long stretch, while the second also sleeps but in short intervals, so it stays more active. The following code demonstrates how the same platform thread is reused for different virtual threads:
Runnable longRunning = () -> {
    Thread currentThread = Thread.currentThread();
    logger.info("Long running - start : {}", currentThread);
    sleep(Duration.ofSeconds(5).toMillis());
    logger.info("Long running - done : {}", currentThread);
};

Runnable active = () -> {
    for (int i = 0; i < 5; i++) {
        logger.info("Active running {}", Thread.currentThread());
        sleep(Duration.ofSeconds(1).toMillis());
    }
};

Thread thread1 = Thread.ofVirtual().name("virt-0").start(longRunning);
Thread thread2 = Thread.ofVirtual().name("virt-1").start(active);

thread1.join();
thread2.join();
The output shows that two virtual threads started, virt-0 and virt-1. virt-0 sleeps for five seconds, while virt-1 loops five times, sleeping one second per iteration. You can see that both threads are backed by the same carrier, ForkJoinPool-1-worker-1.
Starting: Thread[#1,main,5,main]
Long running - start : VirtualThread[#21,virt-0]/runnable@ForkJoinPool-1-worker-1
Active running VirtualThread[#23,virt-1]/runnable@ForkJoinPool-1-worker-1
Active running VirtualThread[#23,virt-1]/runnable@ForkJoinPool-1-worker-1
Active running VirtualThread[#23,virt-1]/runnable@ForkJoinPool-1-worker-1
Active running VirtualThread[#23,virt-1]/runnable@ForkJoinPool-1-worker-1
Active running VirtualThread[#23,virt-1]/runnable@ForkJoinPool-1-worker-1
Long running - done : VirtualThread[#21,virt-0]/runnable@ForkJoinPool-1-worker-1
Terminating: Thread[#1,main,5,main]
Since we’re playing with only two threads, the scheduler would likely spin up a second platform thread to run the task attached to virt-1 if more workers were available in the ForkJoin pool. For the sake of the example, and to force the test environment to behave deterministically, I explicitly limited the scheduler’s parallelism to a single thread before starting the application:
-Djdk.virtualThreadScheduler.parallelism=1
-Djdk.virtualThreadScheduler.maxPoolSize=1
-Djdk.virtualThreadScheduler.minRunnable=1
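Here, parallelism controls how many carrier platform threads the scheduler uses and maxPoolSize caps its pool size; forcing both to one guarantees that every virtual thread in this demo is mounted on the same single carrier.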
As you can see, the virtual threads I created are run by a single platform thread, yet they behave as if multiple platform threads were available. This is a big deal.
Let’s wrap up…
Project Loom has been running for quite a long time. When it set out in late 2017, the main challenge was implementing a new concurrency concept behind the existing APIs, making it transparent to application code. For example, if a virtual thread enters a synchronized block and then has to wait for I/O, can the carrier platform thread be detached even though the lock is still held and no other thread may enter the critical section? Such design decisions can change an application’s behavior and even cause deadlocks, starve threads, or break the thread isolation guarantees of the execution environment. Changes this fundamental to the JVM are challenging; they must be carefully designed and implemented, which inevitably takes time. Nevertheless, virtual threads are finally here, and being cheap to create, they help us build efficient I/O-bound applications. You might think the concept is nothing new compared to many other programming languages, but it is a big deal for the Java community.