

Introduction to Java CAS principle analysis

Dec 24, 2020 pm 05:37 PM
cas java concurrent




1. Introduction

CAS stands for compare and swap, a mechanism for implementing synchronization in a multi-threaded environment. A CAS operation takes three operands: a memory location, an expected value, and a new value. CAS compares the value at the memory location with the expected value. If they are equal, it replaces the value at the memory location with the new value. If they are not equal, it does nothing.
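The comparison logic described above can be sketched in plain Java. This is only an illustration of the semantics: `SimulatedCas` is a hypothetical class, and `synchronized` stands in for the atomicity that the real implementation gets from the processor.

```java
// A minimal sketch of CAS semantics. SimulatedCas is a hypothetical name;
// "synchronized" is an assumption that stands in for hardware atomicity.
class SimulatedCas {
    private int value; // plays the role of the memory location

    SimulatedCas(int initial) {
        this.value = initial;
    }

    // Compare the stored value with expected; write newValue only on a match.
    synchronized boolean compareAndSwap(int expected, int newValue) {
        if (value != expected) {
            return false; // not equal: no operation is performed
        }
        value = newValue;
        return true;
    }

    synchronized int get() {
        return value;
    }

    public static void main(String[] args) {
        SimulatedCas cell = new SimulatedCas(5);
        System.out.println(cell.compareAndSwap(5, 6)); // true: 5 matched, value is now 6
        System.out.println(cell.compareAndSwap(5, 7)); // false: value is 6, not 5
        System.out.println(cell.get());                // 6
    }
}
```

The boolean result tells the caller whether the swap happened, which is the same contract that `AtomicInteger.compareAndSet` exposes.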

Java does not implement CAS directly in Java code. The underlying implementation is written as inline assembly inside C++ code, and Java reaches it through native (JNI) methods. I analyze the implementation details in Section 3.

As described above, the CAS operation itself is not difficult to follow. That description alone is not enough, though, so next I introduce some background knowledge that makes the rest of the article easier to understand.

2. Background introduction

As we know, the CPU exchanges data with memory over the bus. In the multi-core era, multiple cores communicate with memory and other hardware over the same bus, as shown below:

[Figure: simplified computer structure diagram showing the CPU, the bus, and memory]

Picture source: "Computer Systems: A Programmer's Perspective"

The picture above is a fairly simple computer structure diagram, but it is enough to illustrate the problem. In the diagram, the CPU communicates with memory over the bus marked by the two blue arrows. Consider this question: if multiple CPU cores operate on the same memory at the same time without any control, what kind of errors can occur? Here is a brief example. Suppose core 1 writes 64 bits of data to memory over a 32-bit bus, so it needs two writes to complete the operation. Suppose that after core 1 has written the first 32 bits, core 2 reads 64 bits from the same memory location. Because core 1 has not yet written all 64 bits, the data that core 2 reads is corrupted.

In practice there is no need to worry about this particular case. According to the Intel developer manual, starting with the Pentium processor, Intel processors guarantee that reading or writing a quadword aligned on a 64-bit boundary is atomic.

From this we can conclude that Intel processors execute instructions that make a single, aligned memory access atomically. But what about an instruction that accesses memory twice? There is no such guarantee. For example, the increment instruction inc dword ptr [...] is equivalent to DEST = DEST + 1. This instruction consists of three operations, Read -> Modify -> Write, and involves two memory accesses. Consider a situation where the value 1 is stored at a given memory location and two CPU cores execute the instruction at the same time. The two cores interleave as follows:

  1. Core 1 reads the value 1 from the memory location and loads it into a register
  2. Core 2 reads the value 1 from the memory location and loads it into a register
  3. Core 1 increments the value in its register by 1
  4. Core 2 increments the value in its register by 1
  5. Core 1 writes the modified value back to memory
  6. Core 2 writes the modified value back to memory
After this sequence, the final value in memory is 2, whereas we expect 3. That is the problem. To deal with it, two or more cores must be prevented from operating on the same memory area at the same time. How can that be done? This brings us to the protagonist of this article: the lock prefix. For a detailed description, see the Intel Developer Manual, Volume 2, Instruction Set Reference, Chapter 3, Instruction Set Reference A-L. I quote a section of it here, as follows:

LOCK—Assert LOCK# Signal Prefix
Causes the processor's LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of any shared memory while the signal is asserted.
The key point of the quote is that, in a multiprocessor environment, the LOCK# signal ensures that the processor has exclusive use of shared memory while the signal is asserted. The lock prefix can be placed before the following instructions:

ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, CMPXCHG16B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG.

Adding the lock prefix before the inc instruction makes the instruction atomic. When multiple cores execute the same inc instruction at the same time, they do so serially, which avoids the situation described above. This raises another question: how does the lock prefix ensure that a core has exclusive use of a memory area? The answer is as follows:

Intel processors have two ways to give one core exclusive use of a memory area. The first is to lock the bus so that one core uses it exclusively, but this is expensive: while the bus is locked, other cores cannot access memory and may stall for a short time. The second is to lock the cache, which applies when the memory data is already cached in the processor cache. In that case the processor does not lock the bus in response to LOCK#, but locks the memory area that corresponds to the cache line. Other processors cannot operate on that memory area while it is locked. Locking the cache is clearly cheaper than locking the bus. For a more detailed description of bus locks and cache locks, see the Intel Software Developer's Manual, Volume 3, Chapter 8, Multiple-Processor Management.
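The lost-update scenario from the inc example can also be observed from Java. The sketch below is an illustration only: `LostUpdateDemo`, the thread count, and the iteration count are all hypothetical choices, and `AtomicInteger.incrementAndGet` is used as the atomic counterpart because it relies on an atomic hardware primitive of the kind described above.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class: compares a plain int counter with an AtomicInteger.
class LostUpdateDemo {
    static int plainCounter = 0; // unsynchronized read-modify-write
    static final AtomicInteger atomicCounter = new AtomicInteger();

    static void run(int threads, int perThread) {
        plainCounter = 0;
        atomicCounter.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    plainCounter++;                  // may lose updates under contention
                    atomicCounter.incrementAndGet(); // atomic increment
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        // The plain counter is usually smaller than 400000; the exact value varies per run.
        System.out.println("plain  = " + plainCounter);
        System.out.println("atomic = " + atomicCounter.get()); // 400000
    }
}
```

Only the atomic counter is guaranteed to reach the full total. The plain counter can never exceed it, and on a multi-core machine it typically falls short.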

3. Source code analysis

With this background knowledge, we can now read the CAS source code at a comfortable pace. This section analyzes the compareAndSet method of the atomic class AtomicInteger in the java.util.concurrent.atomic package. The analysis is as follows:

public class AtomicInteger extends Number implements java.io.Serializable {

    // setup to use Unsafe.compareAndSwapInt for updates
    private static final Unsafe unsafe = Unsafe.getUnsafe();
    private static final long valueOffset;

    static {
        try {
            // Compute the offset of the field "value" within the object
            valueOffset = unsafe.objectFieldOffset
                (AtomicInteger.class.getDeclaredField("value"));
        } catch (Exception ex) { throw new Error(ex); }
    }

    private volatile int value;

    public final boolean compareAndSet(int expect, int update) {
        /*
         * compareAndSet is really just a thin wrapper; the main logic lives in
         * the compareAndSwapInt method of Unsafe
         */
        return unsafe.compareAndSwapInt(this, valueOffset, expect, update);
    }

    // ......
}

public final class Unsafe {
    // compareAndSwapInt is a native method; keep reading
    public final native boolean compareAndSwapInt(Object o, long offset,
                                                  int expected,
                                                  int x);
    // ......
}
// unsafe.cpp
/*
 * This does not look much like a function, but don't worry, that is not the point.
 * UNSAFE_ENTRY and UNSAFE_END are macros that are replaced with real code during
 * preprocessing. jboolean, jlong, jint and so on are type definitions (typedef):
 *
 * jni.h
 *     typedef unsigned char   jboolean;
 *     typedef unsigned short  jchar;
 *     typedef short           jshort;
 *     typedef float           jfloat;
 *     typedef double          jdouble;
 *
 * jni_md.h
 *     typedef int jint;
 *     #ifdef _LP64 // 64-bit
 *     typedef long jlong;
 *     #else
 *     typedef long long jlong;
 *     #endif
 *     typedef signed char jbyte;
 */
UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint e, jint x))
  UnsafeWrapper("Unsafe_CompareAndSwapInt");
  oop p = JNIHandles::resolve(obj);
  // Compute the address of value from the offset. offset here is valueOffset in AtomicInteger
  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);
  // Call cmpxchg in Atomic, which is declared in atomic.hpp
  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;
UNSAFE_END

// atomic.cpp
unsigned Atomic::cmpxchg(unsigned int exchange_value,
                         volatile unsigned int* dest, unsigned int compare_value) {
  assert(sizeof(unsigned int) == sizeof(jint), "more work to do");
  /*
   * Call the overloaded function for the target platform. The preprocessor decides
   * at build time which platform's overload is used. The relevant preprocessing
   * logic is as follows:
   *
   * atomic.inline.hpp:
   *    #include "runtime/atomic.hpp"
   *
   *    // Linux
   *    #ifdef TARGET_OS_ARCH_linux_x86
   *    # include "atomic_linux_x86.inline.hpp"
   *    #endif
   *
   *    // part of the code omitted
   *
   *    // Windows
   *    #ifdef TARGET_OS_ARCH_windows_x86
   *    # include "atomic_windows_x86.inline.hpp"
   *    #endif
   *
   *    // BSD
   *    #ifdef TARGET_OS_ARCH_bsd_x86
   *    # include "atomic_bsd_x86.inline.hpp"
   *    #endif
   *
   * Next, we analyze the cmpxchg implementation in atomic_windows_x86.inline.hpp
   */
  return (unsigned int)Atomic::cmpxchg((jint)exchange_value, (volatile jint*)dest,
                                       (jint)compare_value);
}

The analysis above looks long, but the main flow is not complicated. If you don't get hung up on the details of the code, it is fairly easy to follow. Next, I analyze the Atomic::cmpxchg function on the Windows platform. Read on.

// atomic_windows_x86.inline.hpp
#define LOCK_IF_MP(mp) __asm cmp mp, 0  \
                       __asm je L0      \
                       __asm _emit 0xF0 \
                       __asm L0:

inline jint Atomic::cmpxchg (jint exchange_value, volatile jint* dest, jint compare_value) {
  // alternative for InterlockedCompareExchange
  int mp = os::is_MP();
  __asm {
    mov edx, dest
    mov ecx, exchange_value
    mov eax, compare_value
    LOCK_IF_MP(mp)
    cmpxchg dword ptr [edx], ecx
  }
}

The code above consists of the LOCK_IF_MP macro and the cmpxchg function. To make it a little clearer, let's expand LOCK_IF_MP inside the cmpxchg function, as follows:

inline jint Atomic::cmpxchg (jint exchange_value, volatile jint* dest, jint compare_value) {
  // Check whether this is a multiprocessor system
  int mp = os::is_MP();
  __asm {
    // Load the arguments into registers
    mov edx, dest    // Note: dest is a pointer, so this stores the memory address in edx
    mov ecx, exchange_value
    mov eax, compare_value

    // LOCK_IF_MP
    cmp mp, 0
    /*
     * If mp == 0, the thread is running on a single-processor system. je then jumps
     * to label L0, skipping _emit 0xF0 and executing cmpxchg directly. In other
     * words, no lock prefix is added before the cmpxchg instruction below.
     */
    je L0
    /*
     * 0xF0 is the machine code of the lock prefix. The machine code is emitted
     * directly instead of writing lock. For the reason, see this Zhihu answer:
     *     https://www.zhihu.com/question/50878124/answer/123099923
     */
    _emit 0xF0
L0:
    /*
     * Compare and exchange. A brief explanation of the instruction below; readers
     * familiar with assembly can skip it:
     *   cmpxchg: the "compare and exchange" instruction
     *   dword: short for double word. On x86/x64,
     *          word = 2 bytes, dword = 4 bytes = 32 bits
     *   ptr: short for pointer. Together with dword it indicates that the memory
     *        operand is a double word
     *   [edx]: [...] denotes a memory operand. edx is a register holding the dest
     *          pointer, so [edx] is the memory cell at address dest
     *
     * The instruction compares the value in eax (compare_value) with the double word
     * at [edx]. If they are equal, it stores the value in ecx (exchange_value) into
     * [edx]. Otherwise it loads the value at [edx] into eax. Either way, eax ends up
     * holding the original value at [edx], which is what the function returns.
     */
    cmpxchg dword ptr [edx], ecx
  }
}

That completes the walk through the CAS implementation. CAS cannot be implemented without support from the processor. There is a lot of code above, but the core of it is a single cmpxchg instruction with a lock prefix, that is, lock cmpxchg dword ptr [edx], ecx.
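From the Java side, compareAndSet is usually wrapped in a retry loop: read the current value, compute the new one, and repeat if another thread got in first. The following is a minimal sketch of that pattern; `CasRetryLoop` and `doubleValue` are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example of the common "read, compute, CAS, retry" pattern.
class CasRetryLoop {
    // Atomically doubles the stored value using only compareAndSet.
    static int doubleValue(AtomicInteger cell) {
        while (true) {
            int current = cell.get();      // read the current value
            int next = current * 2;        // compute the new value
            if (cell.compareAndSet(current, next)) {
                return next;               // CAS succeeded
            }
            // CAS failed: another thread changed the value, so retry
        }
    }

    public static void main(String[] args) {
        AtomicInteger cell = new AtomicInteger(21);
        System.out.println(doubleValue(cell)); // 42
    }
}
```

The loop never blocks. A thread that loses the race simply re-reads the value and tries again.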

4. ABA problem

Any discussion of CAS has to mention the ABA problem. A CAS-based update consists of three steps: read -> compare -> write back. Consider a situation where thread 1 and thread 2 execute CAS logic at the same time, in the following order:

  1. Time 1: Thread 1 performs the read and obtains the original value A, and is then switched out
  2. Time 2: Thread 2 completes a CAS operation and changes the value from A to B
  3. Time 3: Thread 2 performs another CAS operation and changes the value from B back to A
  4. Time 4: Thread 1 resumes, compares the comparison value (compareValue) with the original value (oldValue), and finds that the two are equal. It then writes the new value (newValue) into memory, completing the CAS operation

In this sequence, thread 1 does not know that the value was modified in the meantime. To thread 1 nothing seems to have changed, so it carries on. The usual solution to the ABA problem is to attach a version number that changes with every CAS operation. The java.util.concurrent.atomic package provides the atomic class AtomicStampedReference, which can handle ABA issues. Its implementation is not analyzed here; interested readers can look it up themselves.
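The difference a version number makes can be sketched with a single-threaded replay of the sequence above. `AbaDemo` and its method names are hypothetical, and the "threads" are simulated by running the steps in order. String literals are used because both classes compare references with ==.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical single-threaded replay of the A -> B -> A sequence.
class AbaDemo {
    // Plain CAS: the A -> B -> A change goes unnoticed.
    static boolean plainCasAfterAba() {
        AtomicReference<String> ref = new AtomicReference<>("A");
        String seen = ref.get();              // "thread 1" reads A
        ref.compareAndSet("A", "B");          // "thread 2": A -> B
        ref.compareAndSet("B", "A");          // "thread 2": B -> A
        return ref.compareAndSet(seen, "C");  // "thread 1" resumes
    }

    // Stamped CAS: the stamp acts as a version number and exposes the change.
    static boolean stampedCasAfterAba() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);   // "thread 1" reads A with stamp 0
        int seenStamp = stampHolder[0];
        ref.compareAndSet("A", "B", 0, 1);    // "thread 2": A -> B, stamp 0 -> 1
        ref.compareAndSet("B", "A", 1, 2);    // "thread 2": B -> A, stamp 1 -> 2
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(plainCasAfterAba());   // true: ABA not detected
        System.out.println(stampedCasAfterAba()); // false: stamp is 2, expected 0
    }
}
```

The reference is A again in both cases, but the stamped variant also requires the stamp to match, so the stale CAS is rejected.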

5. Summary

With that, this article is coming to an end. Although the principle of CAS and its implementation are not difficult, writing about them was not easy, because the topic involves some low-level knowledge. I can follow that material, but explaining it clearly is still somewhat hard. Since my low-level knowledge is limited, some of the analysis above is bound to contain mistakes. If you find one, please feel free to comment, and ideally explain why it is wrong. Thank you.

Okay, that’s it for this article. Thanks for reading and bye.

Appendix

The paths of the files used in the source code analysis section are listed here to help readers locate them:

File name                        Path
Unsafe.java                      openjdk/jdk/src/share/classes/sun/misc/Unsafe.java
unsafe.cpp                       openjdk/hotspot/src/share/vm/prims/unsafe.cpp
atomic.cpp                       openjdk/hotspot/src/share/vm/runtime/atomic.cpp
atomic_windows_x86.inline.hpp    openjdk/hotspot/src/os_cpu/windows_x86/vm/atomic_windows_x86.inline.hpp
