1. Minimize garbage collection by using object pooling, stack allocation via escape analysis, primitive collections, and limiting object churn, while leveraging low-pause GCs like ZGC or Shenandoah.
2. Use ring buffers and lock-free data structures such as the Disruptor pattern for high-throughput, low-latency inter-thread communication with pre-allocated events.
3. Optimize JIT compilation by warming up the JVM, tuning compilation thresholds, pinning critical methods, avoiding reflection in hot paths, and using profiling tools to monitor inlining and deoptimization.
4. Reduce contention through thread-per-core design and prevent false sharing by padding variables or using @Contended to isolate frequently modified fields on separate cache lines.
5. Use efficient I/O with NIO, direct buffers, zero-copy techniques like transferTo() or memory-mapped files, and high-performance serialization formats such as Protobuf, FlatBuffers, or SBE.
6. Pin threads to CPU cores, use real-time scheduling, disable CPU frequency scaling, and apply latency-focused JVM flags including -XX:+UseZGC, -XX:+AlwaysPreTouch, and -XX:-UseBiasedLocking to minimize system-level variability.

By combining these techniques, Java applications can achieve microsecond-scale response times with high predictability, making them suitable for real-time, low-latency domains such as financial trading and gaming.
When building high-performance, real-time systems in Java—such as financial trading platforms, online gaming backends, or real-time data processing pipelines—low latency is critical. While Java is often perceived as "slower" due to garbage collection and JIT compilation, with the right techniques, it can achieve sub-millisecond response times consistently. Here are key low-latency Java programming techniques that help minimize delays and improve predictability.

1. Minimize Garbage Collection (GC) Pressure
Garbage collection is the biggest source of unpredictable pauses in Java applications. To reduce GC impact:
- Object pooling: Reuse objects instead of creating new ones. For example, use object pools for messages, buffers, or DTOs.
- Stack allocation via escape analysis: Write code that allows the JIT to allocate short-lived objects on the stack (e.g., avoid storing references in fields or collections).
- Use primitive collections: Libraries like Eclipse Collections or Trove avoid boxing by using int[] instead of Integer[].
- Limit object churn: Avoid unnecessary object creation in hot paths (e.g., don't use new String("hello"), String.format(), or autoboxing in tight loops).
Example:

// Bad: creates temporary objects
String log = String.format("Error at %d", System.nanoTime());

// Better: use a reusable buffer or pre-allocated strings
StringBuilder sb = new StringBuilder();
sb.append("Error at ").append(System.nanoTime());
Also consider using ZGC or Shenandoah—low-pause GCs introduced in recent JDK versions—that can keep pauses under 10ms even with large heaps.
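To make object pooling concrete, here is a minimal single-threaded sketch; Message and MessagePool are illustrative names, not a library API, and a production pool would need thread safety and a bounded growth policy:

```java
import java.util.ArrayDeque;

// A reusable message object; pre-sized so the hot path never allocates.
final class Message {
    long timestamp;
    final byte[] payload = new byte[256];

    void reset() { timestamp = 0L; } // clear state before returning to the pool
}

// Minimal single-threaded object pool: all instances are pre-allocated up
// front, so steady-state operation produces no garbage at all.
final class MessagePool {
    private final ArrayDeque<Message> free = new ArrayDeque<>();

    MessagePool(int size) {
        for (int i = 0; i < size; i++) free.push(new Message()); // pre-allocate
    }

    Message acquire() {
        Message m = free.poll();
        return (m != null) ? m : new Message(); // allocate only if exhausted
    }

    void release(Message m) {
        m.reset();
        free.push(m); // hand the instance back for reuse
    }
}
```

The key property is that acquire() after release() hands back the same instance instead of allocating a new one, which is exactly what keeps GC pressure flat in the hot path.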
2. Use Ring Buffers and Lock-Free Data Structures
For inter-thread communication, traditional queues (like LinkedBlockingQueue) involve locks and memory allocation, which add latency.

- Disruptor pattern: Developed by LMAX, the Disruptor uses a ring buffer with memory barriers instead of locks, enabling million-messages-per-second throughput with microsecond latency.
- Wait-free or bounded queues: Use AtomicReference- or VarHandle-based structures to avoid blocking. (Note that Phaser is a blocking synchronizer, so it is not a fit for lock-free hot paths.)
Example (simplified Disruptor usage):
EventFactory<Event> factory = Event::new;
RingBuffer<Event> ringBuffer = RingBuffer.createSingleProducer(factory, 1024);
SequenceBarrier barrier = ringBuffer.newBarrier();
This design ensures predictable latency by pre-allocating all events and eliminating GC and lock contention.
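The Disruptor itself is an external dependency, but its core idea, a pre-allocated ring coordinated by sequence counters instead of locks, can be sketched in plain Java. This simplified single-producer/single-consumer version is illustrative only, not the LMAX implementation (real implementations also pad the sequences to avoid the false sharing discussed in section 4):

```java
import java.util.concurrent.atomic.AtomicLong;

// Simplified single-producer/single-consumer ring buffer: storage is
// pre-allocated once, and producer and consumer coordinate via two
// sequence counters rather than locks.
final class SpscRingBuffer {
    private final long[] slots;                        // pre-allocated, no per-message GC
    private final int mask;                            // capacity must be a power of two
    private final AtomicLong head = new AtomicLong();  // next slot to consume
    private final AtomicLong tail = new AtomicLong();  // next slot to publish

    SpscRingBuffer(int capacity) {
        slots = new long[capacity];
        mask = capacity - 1;
    }

    boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == slots.length) return false; // buffer full
        slots[(int) (t & mask)] = value;
        tail.set(t + 1); // volatile write publishes the slot to the consumer
        return true;
    }

    long poll() { // returns Long.MIN_VALUE when empty
        long h = head.get();
        if (h == tail.get()) return Long.MIN_VALUE;
        long v = slots[(int) (h & mask)];
        head.set(h + 1); // frees the slot for the producer
        return v;
    }
}
```

Because the volatile write to tail happens after the slot store, the consumer's read of tail establishes happens-before for the data, which is the same memory-barrier trick the Disruptor relies on.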
3. Optimize JIT Compilation Behavior
The HotSpot JVM uses Just-In-Time (JIT) compilation to optimize hot methods. But this can cause "warm-up" delays.
- Warm up the JVM: Run representative workloads before measuring performance.
- Use -XX:CompileThreshold or tiered compilation tuning: Adjust when methods get compiled.
- Pin critical methods: Use -XX:CompileCommand=compileonly to limit JIT effort to the methods that matter.
- Avoid reflection in hot paths: It inhibits inlining. Use method handles or code generation (e.g., via ASM or ByteBuddy) if needed.
Tip: Use JITWatch or Async-Profiler to identify methods that are not being inlined or are deoptimized.
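A minimal warm-up sketch of the first bullet above: drive the hot method past the compilation threshold before the measured (or live) phase begins. The 20,000 iteration count and the checksum method are illustrative assumptions, not fixed rules; the right count depends on your tiered-compilation settings:

```java
// Warm-up sketch: exercise the hot path enough times (HotSpot's default C2
// threshold is on the order of 10,000 invocations) so the JIT compiles it
// before real traffic arrives.
public final class Warmup {
    // The "hot" method we want compiled before measurement.
    static long checksum(byte[] data) {
        long sum = 0;
        for (byte b : data) sum = 31 * sum + b;
        return sum;
    }

    // Returns an accumulated value so the JIT cannot dead-code-eliminate
    // the warm-up loop.
    static long warmUp() {
        byte[] sample = new byte[64];
        long sink = 0;
        for (int i = 0; i < 20_000; i++) {
            sample[i & 63] = (byte) i;   // vary input so branch profiles are trained
            sink += checksum(sample);
        }
        return sink;
    }

    public static void main(String[] args) {
        warmUp(); // trigger JIT compilation of checksum()
        long t0 = System.nanoTime();
        long result = checksum(new byte[64]); // now measured on compiled code
        System.out.println("checksum=" + result + " ns=" + (System.nanoTime() - t0));
    }
}
```

Verify the effect with -XX:+PrintCompilation, which logs when checksum() is promoted to C2, rather than trusting the iteration count alone.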
4. Reduce Contention with Thread Confinement and False Sharing Avoidance
Thread contention adds latency unpredictably.
- Thread-per-core design: Assign dedicated threads to specific tasks to avoid synchronization.
- False sharing prevention: Pad variables to avoid adjacent data on the same cache line.
Example:
public class PaddedCounter {
    public volatile long value;
    private long p1, p2, p3, p4, p5, p6, p7; // cache line padding
}
Or use @Contended (with -XX:-RestrictContended):

@jdk.internal.vm.annotation.Contended
public class IsolatedCounter {
    public volatile long value;
}
This ensures each counter lives in its own CPU cache line, preventing performance-killing false sharing.
5. Use Efficient I/O and Serialization
High-latency I/O can dominate end-to-end response time.
- NIO and direct buffers: Use ByteBuffer.allocateDirect() for off-heap I/O buffers to avoid copying.
- Zero-copy techniques: Leverage FileChannel.transferTo() or memory-mapped files (MappedByteBuffer).
- Efficient serialization: Use Protobuf, FlatBuffers, or SBE (Simple Binary Encoding) instead of JSON or Java serialization.
SBE is especially popular in low-latency finance due to its schema-driven, zero-allocation decoding.
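A small sketch of the zero-copy bullet above, using the standard FileChannel.transferTo() API (the ZeroCopy class name is illustrative). transferTo lets the kernel move bytes between channels directly, skipping the user-space copy a read/write loop would incur:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Zero-copy file transfer: the kernel moves bytes from the source channel
// to the destination without staging them in a Java-heap buffer.
public final class ZeroCopy {
    static long copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.TRUNCATE_EXISTING,
                     StandardOpenOption.WRITE)) {
            long pos = 0;
            long size = in.size();
            while (pos < size) {
                // transferTo may move fewer bytes than requested, so loop
                pos += in.transferTo(pos, size - pos, out);
            }
            return pos; // total bytes transferred
        }
    }
}
```

The loop matters: transferTo() is allowed to transfer fewer bytes than asked, so a single call is not guaranteed to copy the whole file.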
6. Pin Threads and Tune the OS/JVM
Even the best code can be derailed by OS scheduling.
- CPU affinity: Bind critical threads to specific CPU cores using taskset or APIs like JNA with sched_setaffinity.
- Real-time scheduling: On Linux, use SCHED_FIFO with chrt (requires root).
- Disable CPU frequency scaling: Use the "performance" governor:
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
- JVM flags for latency:
-XX:+UnlockExperimentalVMOptions -XX:+UseZGC -XX:+PerfDisableSharedMem -XX:-UseBiasedLocking -XX:+AlwaysPreTouch -Xms4g -Xmx4g
These reduce variability from memory allocation, locking, and background JVM processes.
Low-latency Java isn’t about raw speed—it’s about predictability. The goal is to minimize jitter (variation in response time), not just average latency. By combining GC control, lock-free designs, JIT tuning, and system-level optimizations, Java can reliably achieve microsecond-scale responses.
Basically, it’s not whether Java can be fast—it’s how carefully you manage resources and avoid hidden delays.
The above is the detailed content of Low-Latency Java Programming Techniques. For more information, please follow other related articles on the PHP Chinese website!
