This article is selected from the 極術 column "Arm Server" and introduces synchronization expertise for the Arm architecture.

1. Introduction

As Arm servers have become more widely used in recent years, more and more cloud vendors have begun to offer cloud instances based on the Arm architecture, and more and more developers are writing software for the Arm platform.

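Synchronization is a hot topic in software porting and optimization. Arm-based servers typically have more CPU cores than servers based on other architectures, which makes a deep understanding of synchronization all the more important.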

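One of the most significant differences between Arm and x86 CPUs is their memory models: the Arm architecture has a weak memory model, unlike the TSO (Total Store Order) model of the x86 architecture. Because of this difference, a program may run well on one architecture but hit performance problems or bugs on the other. The more relaxed memory model of Arm servers allows more compiler and hardware optimizations that improve system performance, but it is harder to understand and makes it easier to write incorrect code.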

We wrote this document to share synchronization expertise for the Arm architecture and to help developers coming from other architectures develop on Arm systems.

2. Synchronization Methods on the Armv8-A Architecture

This document first introduces synchronization on the Armv8-A architecture, including atomic operations, Arm memory ordering, and data access barrier instructions.

2.1 Atomic Operations

Implementing a lock requires atomic access. The Arm architecture defines two types of atomic access:

Load exclusive and store exclusive

Atomic operations, introduced in the Armv8.1-A Large System Extensions (LSE)

2.1.1 Exclusive load and store

LDREX/LDXR - The load exclusive instruction performs a load from an addressed memory location, and the PE (for example, the CPU) also marks the physical address being accessed as an exclusive access. The exclusive access mark is checked by store exclusive instructions.

STREX/STXR - The store exclusive instruction tries to store a value from a register to memory if the PE (for example, the CPU) has exclusive access to the memory address. It returns a status value of 0 if the store was successful, or 1 if no store was performed.
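
The load-exclusive/store-exclusive pair is what a compare-and-swap retry loop usually compiles down to on Armv8.0-A targets without LSE. The following is a minimal C++ sketch of such a loop, not code from the white paper; the helper names `atomic_fetch_max` and `run_fetch_max_demo` are hypothetical and exist only for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: atomically raise `target` to at least `candidate`
// and return the value that was stored before the call.
int atomic_fetch_max(std::atomic<int>& target, int candidate) {
    int observed = target.load(std::memory_order_relaxed);
    // compare_exchange_weak may fail spuriously, much as a store-exclusive
    // can fail when exclusive access is lost, so it belongs in a loop.
    // On failure it refreshes `observed` with the current value.
    while (observed < candidate &&
           !target.compare_exchange_weak(observed, candidate,
                                         std::memory_order_relaxed)) {
    }
    return observed;
}

// Hypothetical driver: several threads race to publish the largest value.
int run_fetch_max_demo(int num_threads, int values_per_thread) {
    assert(num_threads >= 0 && values_per_thread >= 0);
    std::atomic<int> maximum{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&maximum, t, values_per_thread] {
            for (int i = 1; i <= values_per_thread; ++i) {
                atomic_fetch_max(maximum, t * values_per_thread + i);
            }
        });
    }
    for (auto& w : workers) w.join();
    return maximum.load();
}
```

The weak form of compare-exchange is chosen here because a spurious failure simply causes one more trip around the loop, which mirrors the retry behaviour of a failed store-exclusive.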

2.1.2 LSE Atomic operation

LDXR/STXR use a try-and-test mechanism. LSE is different: it enforces atomic access directly. Its main instructions are as follows:

Compare and Swap instructions, CAS, and CASP. These instructions perform a read from memory and compare it against the value held in the first register. If the comparison is equal, the value in the second register is written to memory. If the write is performed, the read and write occur atomically such that no other modification of the memory location can take place between the read and write.

Atomic memory operation instructions, LD<OP> and ST<OP>, where <OP> is one of ADD, CLR, EOR, SET, SMAX, SMIN, UMAX, and UMIN. Each instruction atomically loads a value from memory, performs an operation on the values, and stores the result back to memory. The LD<OP> instructions save the originally read value in the destination register of the instruction.

Swap instruction, SWP. This instruction atomically reads a location from memory into a register and writes a different supplied value back to the same memory location.
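
From C++, these instructions are normally reached through `std::atomic` rather than written by hand. The sketch below is illustrative only: the struct and function names are hypothetical, and the instruction named in each comment is what GCC and Clang commonly select when building for an LSE-capable target (for example with `-march=armv8.1-a`), not a guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical record of what each atomic call returned.
struct LseDemoResult {
    std::uint32_t before_add;
    std::uint32_t before_or;
    std::uint32_t before_swap;
    bool cas_succeeded;
    std::uint32_t final_value;
};

LseDemoResult run_lse_style_ops() {
    std::atomic<std::uint32_t> word{10};
    LseDemoResult r{};

    // Read-modify-write that returns the old value: an LDADD candidate.
    r.before_add = word.fetch_add(5);   // 10 -> 15

    // Atomic bit set: an LDSET candidate.
    r.before_or = word.fetch_or(0x10);  // 15 -> 31

    // Unconditional exchange: an SWP candidate.
    r.before_swap = word.exchange(7);   // 31 -> 7

    // Compare and swap: a CAS candidate.
    std::uint32_t expected = 7;
    r.cas_succeeded = word.compare_exchange_strong(expected, 42);
    assert(r.cas_succeeded);  // single-threaded here, so it cannot lose a race

    r.final_value = word.load();
    return r;
}
```

Without LSE enabled, the same source typically falls back to LDXR/STXR loops, so the C++ code itself does not need to change between the two.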

2.2 Arm Memory Ordering

The Arm architecture defines a weak memory model, in which memory accesses may not take place in program order.

2.3 Arm Data Access Barrier Instructions

The Arm architecture defines barrier instructions to guarantee the order of memory accesses.

DMB - Data Memory Barrier
Explicit memory accesses before the DMB are observed before any explicit access after the DMB.

It does not guarantee when the operations happen, only the order in which they are observed.

    LDR X0, [X1]    ; Must be seen by memory system before STR
    DMB SY
    ADD X2, X2, #1  ; May be executed before or after memory system sees LDR
    STR X3, [X4]    ; Must be seen by memory system after LDR
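
In C++, the closest portable counterpart to a DMB is `std::atomic_thread_fence`. The following sketch passes a message between two threads using relaxed atomics plus standalone fences; the function name `fence_message_passing` is hypothetical. As an observation rather than a guarantee, GCC and Clang targeting AArch64 commonly emit `DMB ISH` for a sequentially consistent fence, which covers the inner shareable domain rather than the full system as `DMB SY` does.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical message-passing demo using standalone two-way fences.
int fence_message_passing(int payload_value) {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that publishes the data
    int received = -1;

    std::thread producer([&] {
        payload = payload_value;
        // Keeps the payload write ordered before the flag write.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        ready.store(true, std::memory_order_relaxed);
    });

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) {
            std::this_thread::yield();
        }
        // Keeps the flag read ordered before the payload read.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        received = payload;
    });

    producer.join();
    consumer.join();
    assert(received == payload_value);  // follows from the fence pairing
    return received;
}
```

Removing either fence leaves the relaxed flag accesses free to be reordered against the payload accesses, which is the kind of bug that can stay hidden on x86 and surface on Arm.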

DSB - Data Synchronization Barrier
A DSB is more restrictive than a DMB.

Use a DSB when necessary, but do not overuse it.

No instruction after a DSB will execute until:

All explicit memory accesses before the DSB in program order have completed

Any outstanding cache/TLB/branch predictor operations complete

    DC ISW          ; Operation must have completed before DSB can complete
    STR X0, [X1]    ; Access must have completed before DSB can complete
    DSB SY
    ADD X2, X2, #3  ; Cannot be executed until DSB completes

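DMB and DSB are two-way barriers that restrict reordering in both directions. Armv8-A also provides one-way barriers, the load-acquire and store-release mechanism, which restrict ordering in one direction only.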

Load-Acquire (LDAR)

All accesses after the LDAR are observed by memory system after the LDAR.

Accesses before the LDAR are not affected.

Store-Release (STLR)

All accesses before the STLR are observed by memory system before the STLR.

Accesses after the STLR are not affected.
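
A lock is a natural fit for one-way barriers: taking the lock needs acquire semantics and dropping it needs release semantics. The sketch below is a deliberately minimal spinlock written for illustration; the `SpinLock` class and `count_with_spinlock` function are hypothetical and not taken from the white paper. On AArch64, compilers typically implement the release store with STLR and the acquire exchange with an acquire form of an exclusive loop or of SWP.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical minimal spinlock built on one-way barriers.
class SpinLock {
public:
    void lock() {
        // Acquire: accesses in the critical section cannot move above this.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: accesses in the critical section cannot move below this.
        locked_.store(false, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical driver: threads increment a plain counter under the lock.
long count_with_spinlock(int num_threads, int increments_per_thread) {
    assert(num_threads >= 0 && increments_per_thread >= 0);
    SpinLock lock;
    long counter = 0;  // plain variable, protected only by the lock
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&lock, &counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter;
}
```

Accesses outside the critical section remain free to move into it, which is harmless for correctness and is exactly the freedom that a two-way DMB would take away.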

3. The C++ Memory Model

With a language-level memory model, developers in most cases do not need to write architecture-specific assembly code. They can rely on a well-designed language-level memory model to write high-quality code without worrying about architectural differences.

C++ memory model:
https://en.cppreference.com/w/cpp/header/atomic

We have built a mapping between the C++ memory model and its Armv8-A implementation.
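
As a rough illustration of such a mapping, the sketch below exercises each C++ memory order once. It is not the authors' mapping table. The instruction in each comment is the lowering commonly produced by GCC and Clang for baseline Armv8-A, stated here as an assumption; later architecture extensions and different compiler versions can change it (an acquire load may become LDAPR, for instance). The function name `exercise_memory_orders` is hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical walk through the C++ memory orders on a single atomic.
int exercise_memory_orders() {
    std::atomic<int> value{0};

    value.store(1, std::memory_order_relaxed);        // commonly STR
    int a = value.load(std::memory_order_relaxed);    // commonly LDR

    value.store(2, std::memory_order_release);        // commonly STLR
    int b = value.load(std::memory_order_acquire);    // commonly LDAR

    value.store(3, std::memory_order_seq_cst);        // commonly STLR
    int c = value.load(std::memory_order_seq_cst);    // commonly LDAR

    std::atomic_thread_fence(std::memory_order_acquire);  // commonly DMB ISHLD
    std::atomic_thread_fence(std::memory_order_release);  // commonly DMB ISH
    std::atomic_thread_fence(std::memory_order_seq_cst);  // commonly DMB ISH

    assert(a == 1 && b == 2 && c == 3);  // single-threaded, so values are fixed
    return a * 100 + b * 10 + c;
}
```

Compiling this function to assembly for the intended target is a quick way to check which instructions a given toolchain actually chooses.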

4. Summary

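In the white paper, we selected three typical cases for in-depth analysis to help readers understand the topic better. Because synchronization-related programming is very complex, correctness and performance must be weighed carefully.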

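We recommend first using heavier barrier instructions to guarantee logical correctness, and then improving performance by removing redundant barriers or switching to lighter ones where necessary. A deep understanding of the Arm memory model and the related instructions is essential for correct, high-performance synchronization programming.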

In the appendix, we also introduce a memory model tool (the litmus test suite), which helps in understanding memory models and in verifying programs on various architectures.

For a more complete treatment of the above, refer to the white paper "Synchronization Overview and Case Study on Arm Architecture".

Reviewing editor: 湯梓紅

Original title: Synchronization Overview and Case Study on Arm Architecture White Paper | Download Included

Source: WeChat ID Ithingedu, WeChat official account 安芯教育科技. Please credit the source when reposting.
