(Corrected compiler arch flags) |
|||
(26 intermediate revisions by 5 users not shown) | |||
Line 1: | Line 1: | ||
− | {{armh title| | + | {{armh title|Neoverse N1|arch}} |
{{microarchitecture | {{microarchitecture | ||
|atype=CPU | |atype=CPU | ||
− | |name= | + | |name=Neoverse N1 |
|designer=ARM Holdings | |designer=ARM Holdings | ||
|manufacturer=TSMC | |manufacturer=TSMC | ||
− | |process= | + | |introduction=February 20, 2019 |
− | | | + | |process=7 nm |
+ | |cores=4 | ||
+ | |cores 2=8 | ||
+ | |cores 3=16 | ||
+ | |cores 4=32 | ||
+ | |cores 5=64 | ||
+ | |cores 6=96 | ||
+ | |cores 7=128 | ||
+ | |type=Superscalar | ||
+ | |type 2=Superpipeline | ||
|oooe=Yes | |oooe=Yes | ||
|speculative=Yes | |speculative=Yes | ||
|renaming=Yes | |renaming=Yes | ||
− | |predecessor= | + | |stages=11 |
− | |predecessor link=arm_holdings/ | + | |decode=4-way |
+ | |isa=ARMv8.2 | ||
+ | |l1i=64 KiB | ||
+ | |l1i per=core | ||
+ | |l1i desc=4-way set associative | ||
+ | |l1d=64 KiB | ||
+ | |l1d per=core | ||
+ | |l1d desc=4-way set associative | ||
+ | |l2=512-1 MiB | ||
+ | |l2 per=core | ||
+ | |l2 desc=8-way set associative | ||
+ | |l3=2-4 MiB | ||
+ | |l3 per=core duplex | ||
+ | |l3 desc=16-way set associative | ||
+ | |predecessor=Cosmos | ||
+ | |predecessor link=arm_holdings/cosmos | ||
|successor=Zeus | |successor=Zeus | ||
|successor link=arm_holdings/microarchitectures/zeus | |successor link=arm_holdings/microarchitectures/zeus | ||
}} | }} | ||
− | '''Ares''' is a high-performance [[ARM]] [[microarchitecture]] designed by [[ARM Holdings]] for the server market. This microarchitecture is designed as a synthesizable [[IP core]] and is sold to other semiconductor companies to be implemented in their own chips. | + | '''Neoverse N1''' (codename '''Ares''') is a high-performance [[ARM]] [[microarchitecture]] designed by [[ARM Holdings]] for the server market. This microarchitecture is designed as a synthesizable [[IP core]] and is sold to other semiconductor companies to be implemented in their own chips. |
== History == | == History == | ||
− | Ares was first announced by Drew Henry, Arm’s SVP and GM of Infrastructure Business Unit, at his TechCon 2018 keynote. Ares | + | [[File:arm server roadmap techcon 2018.jpg|thumb|right|Arm's server roadmap.]] |
+ | The Neoverse N1, formerly Ares, is the first [[Arm]] design to specifically target the infrastructure market, serving as the successor to the {{armh|Cosmos|Cosmos platform}} which used the same cores as the client platform. The N1 was first announced by Drew Henry, Arm’s SVP and GM of Infrastructure Business Unit, at his TechCon 2018 keynote. Ares was officially unveiled on February 20, 2019. | ||
== Release Dates == | == Release Dates == | ||
− | Ares | + | Ares was officially disclosed on February 20, 2019. |
== Process Technology == | == Process Technology == | ||
Ares specifically takes advantage of the power and area advantages of the [[7 nm process]]. | Ares specifically takes advantage of the power and area advantages of the [[7 nm process]]. | ||
+ | |||
+ | == Compiler Support == | ||
+ | {| class="wikitable" | ||
+ | |- | ||
+ | ! Compiler !! Arch-Specific || Arch-Favorable || Arch-Target | ||
+ | |- | ||
+ | | [[GCC]] || <code>-march=armv8.2-a</code> || <code>-mtune=neoverse-n1</code> || <code>-mcpu=neoverse-n1</code> | ||
+ | |- | ||
+ | | [[LLVM]] || <code>-march=armv8.2-a</code> || <code>-mtune=neoverse-n1</code> || <code>-mcpu=neoverse-n1</code> | ||
+ | |} | ||
== Architecture == | == Architecture == | ||
+ | The Neoverse N1 core is almost identical to the {{\\|Cortex-A76}} but features a number of enhancements for infrastructure workloads. | ||
+ | |||
+ | * [[ARMv8.2]] | ||
+ | * [[7 nm process]] | ||
+ | * Core | ||
+ | ** 11-stage | ||
+ | ** 4-way decode | ||
+ | ** 8-way issue | ||
+ | * System architecture | ||
+ | ** Designed for the [[Coherent Mesh Network 600]] (CMN-600) mesh interconnect | ||
+ | |||
+ | === Block Diagram === | ||
+ | ==== Typical SoC ==== | ||
+ | :[[File:neoverse n1 soc block diagram.svg|850px]] | ||
+ | |||
+ | |||
+ | The Neoverse N1 is also expected to be integrated along with {{\\|Neoverse E1}} high-efficiency cores and possibly other custom IP blocks. | ||
+ | |||
+ | |||
+ | :[[File:neoverse e1 n1 soc example.svg|750px]] | ||
+ | |||
+ | ==== Individual Core ==== | ||
+ | :[[File:neoverse n1 block diagram.svg|850px]] | ||
+ | |||
+ | === Memory Hierarchy === | ||
+ | The Neoverse N1 has a private L1I, L1D, and L2 cache. | ||
+ | |||
+ | * Cache | ||
+ | ** L1I Cache | ||
+ | *** 64 KiB, 4-way set associative | ||
+ | *** 64-byte cache lines | ||
+ | *** SECDED ECC | ||
+ | *** Write-back | ||
+ | ** L1D Cache | ||
+ | *** 64 KiB, 4-way set associative | ||
+ | *** 64-byte cache lines | ||
+ | *** 4-cycle fastest load-to-use latency | ||
+ | *** SECDED ECC | ||
+ | *** Write-back | ||
+ | ** L2 Cache | ||
+ | *** 512 KiB OR 1 MiB (2 banks) | ||
+ | *** 8-way set associative | ||
+ | *** 9-11 cycle | ||
+ | **** 9-cycle fastest load-to-use latency | ||
+ | *** ECC protection per 64 bits | ||
+ | *** [[Modified Exclusive Shared Invalid]] (MESI) coherency | ||
+ | *** Strictly inclusive of the L1 data cache & non-inclusive of the L1 instruction cache | ||
+ | *** Write-back | ||
+ | ** System-level cache (SLC) | ||
+ | *** 1 Bank per core duplex | ||
+ | *** 2 MiB to 4 MiB, 16-way set associative | ||
+ | |||
+ | The Neoverse N1 TLB consists of a dedicated L1 TLB for instructions (ITLB) and another for data (DTLB). Additionally, there is a unified L2 TLB (STLB). | ||
+ | |||
+ | * TLBs | ||
+ | ** ITLB | ||
+ | *** 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 32 MiB page sizes | ||
+ | *** 48-entry fully associative | ||
+ | ** DTLB | ||
+ | *** 48-entry fully associative | ||
+ | *** 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 512 MiB page sizes | ||
+ | ** STLB | ||
+ | *** 1280-entry 5-way set associative | ||
+ | |||
+ | == Overview == | ||
+ | [[File:neoverse n1 overview.svg|right|500px|thumb|Neoverse N1 Typical SoC]] | ||
+ | Formerly known as Ares, the Neoverse N1 is the first ground-up Arm microarchitecture designed for infrastructure, targeting a wide range of markets from the [[edge computing|edge]] to [[hyperscalers|hyperscaler]] data centers. Departing from Arm's low-power {{armh|cortex|mobile cores}}, the N1 targets high-performance server workloads at higher TDPs and higher compute power. Compared to the prior {{armh|Cosmos|l=arch}} platform, the Neoverse N1 is said to deliver a significant uplift in single-thread performance. | ||
+ | |||
+ | The Neoverse N1 is designed to let Arm partners rapidly develop high-performance server products. The N1 features an 11-stage [[out-of-order]] core with private [[L1]] and [[L2]] caches. The core is intended to leverage Arm's {{armh|Coherent Mesh Network}} 600 (CMN-600) [[interconnect]] to scale from as small as a [[quad-core]] design to as many as [[128 cores]] and from a single [[DDR]] channel all the way up to eight channels, depending on the kind of workload being addressed. Extending the base design is a framework for [[multiprocessing]] support as well as [[chiplets]] support, which can be used by companies looking to improve [[yield]] and manufacturability in large SoC designs. The N1 is also designed to work seamlessly with the {{\\|Neoverse E1}}, which was introduced at the same time as the N1 but is optimized for high-throughput multithreaded workloads. | ||
+ | |||
+ | == Core == | ||
+ | The Neoverse N1 features an 11-stage accordion integer pipeline. | ||
+ | |||
+ | |||
+ | ::[[File:neoverse n1 pipeline.svg|600px]] | ||
+ | == Die == | ||
+ | === N1 core === | ||
* [[7 nm process]] | * [[7 nm process]] | ||
− | {{ | + | * 1 Core + L2 |
+ | * 1.2 mm² die size (1C + 512 KiB L2) | ||
+ | * 1.4 mm² die size (1C + 1 MiB L2) | ||
+ | * 1 W @ 2.6 GHz (0.75 V), 1.8 W @ 3.1 GHz (1.0 V) | ||
+ | |||
+ | |||
+ | :[[File:neoverse n1 core die plot.png|600px]] | ||
+ | |||
+ | == All Neoverse N1 Processors == | ||
+ | <!-- NOTE: | ||
+ | This table is generated automatically from the data in the actual articles. | ||
+ | If a microprocessor is missing from the list, an appropriate article for it needs to be | ||
+ | created and tagged accordingly. | ||
+ | |||
+ | Missing a chip? please dump its name here: https://en.wikichip.org/wiki/WikiChip:wanted_chips | ||
+ | --> | ||
+ | {{comp table start}} | ||
+ | <table class="comptable sortable tc4 tc6 tc9"> | ||
+ | {{comp table header|main|8:List of Neoverse N1-based Processors}} | ||
+ | {{comp table header|cols|Family|Launched|Process|Arch|Cores|%Frequency}} | ||
+ | {{#ask: [[Category:all microprocessor models]] [[microarchitecture::Neoverse N1]] | ||
+ | |?full page name | ||
+ | |?model number | ||
+ | |?family | ||
+ | |?first launched | ||
+ | |?process | ||
+ | |?microarchitecture | ||
+ | |?core count | ||
+ | |?base frequency#GHz | ||
+ | |format=template | ||
+ | |template=proc table 3 | ||
+ | |userparam=8 | ||
+ | |mainlabel=- | ||
+ | |valuesep=, | ||
+ | }} | ||
+ | {{comp table count|ask=[[Category:all microprocessor models]] [[microarchitecture::Neoverse N1]]}} | ||
+ | </table> | ||
+ | {{comp table end}} | ||
== Bibliography == | == Bibliography == | ||
* Drew Henry keynote, TechCon 2018 keynote. | * Drew Henry keynote, TechCon 2018 keynote. | ||
+ | * Drew Henry, direct communication | ||
+ | * Most of the technical details were obtained directly from Arm | ||
+ | |||
+ | == Documents == | ||
+ | * [[:File:arm neoverse n1 sog.pdf|Neoverse N1 Software Optimization Guide]] | ||
+ | * [[:File:arm neoverse n1 trm.pdf|Neoverse N1 Technical Reference Manual]] |
Latest revision as of 13:46, 18 February 2023
Neoverse N1 µarch | |
General Info | |
Arch Type | CPU |
Designer | ARM Holdings |
Manufacturer | TSMC |
Introduction | February 20, 2019 |
Process | 7 nm |
Core Configs | 4, 8, 16, 32, 64, 96, 128 |
Pipeline | |
Type | Superscalar, Superpipeline |
OoOE | Yes |
Speculative | Yes |
Reg Renaming | Yes |
Stages | 11 |
Decode | 4-way |
Instructions | |
ISA | ARMv8.2 |
Cache | |
L1I Cache | 64 KiB/core 4-way set associative |
L1D Cache | 64 KiB/core 4-way set associative |
L2 Cache | 512 KiB-1 MiB/core 8-way set associative |
L3 Cache | 2-4 MiB/core duplex 16-way set associative |
Succession | |
Neoverse N1 (codename Ares) is a high-performance ARM microarchitecture designed by ARM Holdings for the server market. This microarchitecture is designed as a synthesizable IP core and is sold to other semiconductor companies to be implemented in their own chips.
History
The Neoverse N1, formerly Ares, is the first Arm design to specifically target the infrastructure market, serving as the successor to the Cosmos platform, which used the same cores as the client platform. The N1 was first announced by Drew Henry, Arm’s SVP and GM of Infrastructure Business Unit, in his TechCon 2018 keynote. Ares was officially unveiled on February 20, 2019.
Release Dates
Ares was officially disclosed on February 20, 2019.
Process Technology
Ares is specifically designed to exploit the power and area advantages of the 7 nm process.
Compiler Support
Compiler | Arch-Specific | Arch-Favorable | Arch-Target |
---|---|---|---|
GCC | -march=armv8.2-a | -mtune=neoverse-n1 | -mcpu=neoverse-n1 |
LLVM | -march=armv8.2-a | -mtune=neoverse-n1 | -mcpu=neoverse-n1 |
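As a rough illustration of how the three columns are typically used, the sketch below assembles a compile command from the flags in the table. The helper `build_command`, the `portable` switch, and the driver names `gcc` and `clang` are assumptions made for this example and are not part of the original table. On AArch64, GCC documents `-mcpu` as selecting both the instruction set (as if by `-march`) and the tuning target (as if by `-mtune`), which is why the two modes below are kept separate.

```python
# Flags copied from the Compiler Support table; everything else is illustrative.
N1_FLAGS = {
    "arch": "-march=armv8.2-a",    # Arch-Specific: allow ARMv8.2-A instructions
    "tune": "-mtune=neoverse-n1",  # Arch-Favorable: tune scheduling for the N1
    "cpu": "-mcpu=neoverse-n1",    # Arch-Target: target the N1 itself
}

# Assumed driver executable names; adjust for the local toolchain.
DRIVERS = {"GCC": "gcc", "LLVM": "clang"}


def build_command(compiler, source, output, portable=False):
    """Return an argv list for compiling `source` (hypothetical helper).

    portable=True  -> -march + -mtune: generic ARMv8.2-A code, tuned for the N1.
    portable=False -> -mcpu: code generation and tuning both target the N1.
    """
    if portable:
        flags = [N1_FLAGS["arch"], N1_FLAGS["tune"]]
    else:
        flags = [N1_FLAGS["cpu"]]
    return [DRIVERS[compiler], *flags, source, "-o", output]


if __name__ == "__main__":
    print(" ".join(build_command("GCC", "main.c", "main")))
    print(" ".join(build_command("LLVM", "main.c", "main", portable=True)))
```

The returned list can be passed directly to `subprocess.run` on a machine that has the corresponding toolchain installed.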
Architecture
The Neoverse N1 core is almost identical to the Cortex-A76 but features a number of enhancements for infrastructure workloads.
- ARMv8.2
- 7 nm process
- Core
  - 11-stage
  - 4-way decode
  - 8-way issue
- System architecture
  - Designed for the Coherent Mesh Network 600 (CMN-600) mesh interconnect
Block Diagram
Typical SoC
The Neoverse N1 is also expected to be integrated along with Neoverse E1 high-efficiency cores and possibly other custom IP blocks.
Individual Core
Memory Hierarchy
The Neoverse N1 has a private L1I, L1D, and L2 cache.
- Cache
  - L1I Cache
    - 64 KiB, 4-way set associative
    - 64-byte cache lines
    - SECDED ECC
    - Write-back
  - L1D Cache
    - 64 KiB, 4-way set associative
    - 64-byte cache lines
    - 4-cycle fastest load-to-use latency
    - SECDED ECC
    - Write-back
  - L2 Cache
    - 512 KiB or 1 MiB (2 banks)
    - 8-way set associative
    - 9-11 cycle
      - 9-cycle fastest load-to-use latency
    - ECC protection per 64 bits
    - Modified Exclusive Shared Invalid (MESI) coherency
    - Strictly inclusive of the L1 data cache & non-inclusive of the L1 instruction cache
    - Write-back
  - System-level cache (SLC)
    - 1 Bank per core duplex
    - 2 MiB to 4 MiB, 16-way set associative
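The L1 figures above determine the cache geometry. As a minimal sketch, assuming the standard set-associative relation (sets = size / (ways × line size)) and power-of-two sizes, the number of sets and the address bits used for the line offset and set index can be derived as follows. The function name `cache_geometry` is hypothetical.

```python
KIB = 1024


def cache_geometry(size_bytes, ways, line_bytes):
    """Derive set count and address-bit split for a set-associative cache.

    Assumes size, associativity and line size are powers of two.
    """
    sets = size_bytes // (ways * line_bytes)
    return {
        "sets": sets,
        "offset_bits": line_bytes.bit_length() - 1,  # log2(line size)
        "index_bits": sets.bit_length() - 1,         # log2(number of sets)
    }


# L1I and L1D share the same listed parameters: 64 KiB, 4-way, 64-byte lines.
L1 = cache_geometry(64 * KIB, ways=4, line_bytes=64)

if __name__ == "__main__":
    print(L1)  # {'sets': 256, 'offset_bits': 6, 'index_bits': 8}
```

With these parameters each way holds 16 KiB (256 sets of 64-byte lines).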
The Neoverse N1 TLB consists of a dedicated L1 TLB for instructions (ITLB) and another for data (DTLB). Additionally, there is a unified L2 TLB (STLB).
- TLBs
  - ITLB
    - 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 32 MiB page sizes
    - 48-entry fully associative
  - DTLB
    - 48-entry fully associative
    - 4 KiB, 16 KiB, 64 KiB, 2 MiB, and 512 MiB page sizes
  - STLB
    - 1280-entry 5-way set associative
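To put the entry counts in perspective, the sketch below computes TLB reach (entries × page size), that is, how much memory a fully populated TLB can map without a miss. The helper names are hypothetical, and the STLB page size is an assumption for illustration, since the list above does not state which page sizes the STLB holds.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach(entries, page_bytes):
    """Bytes of address space covered when every entry maps one page."""
    return entries * page_bytes


def tlb_sets(entries, ways):
    """Number of sets in a set-associative TLB."""
    return entries // ways


if __name__ == "__main__":
    # 48-entry L1 TLBs with the smallest and a large listed page size.
    print(tlb_reach(48, 4 * KIB) // KIB, "KiB")   # 192 KiB
    print(tlb_reach(48, 2 * MIB) // MIB, "MiB")   # 96 MiB
    # 1280-entry STLB, assuming 4 KiB pages (assumption, not from the list).
    print(tlb_reach(1280, 4 * KIB) // MIB, "MiB")  # 5 MiB
    print(tlb_sets(1280, 5), "sets")               # 256 sets
```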
Overview
Formerly known as Ares, the Neoverse N1 is the first ground-up Arm microarchitecture designed for infrastructure, targeting a wide range of markets from the edge to hyperscaler data centers. Departing from Arm's low-power mobile cores, the N1 targets high-performance server workloads at higher TDPs and higher compute power. Compared to the prior Cosmos platform, the Neoverse N1 is said to deliver a significant uplift in single-thread performance.
The Neoverse N1 is designed to let Arm partners rapidly develop high-performance server products. The N1 features an 11-stage out-of-order core with private L1 and L2 caches. The core is intended to leverage Arm's Coherent Mesh Network 600 (CMN-600) interconnect to scale from as small as a quad-core design to as many as 128 cores and from a single DDR channel all the way up to eight channels, depending on the kind of workload being addressed. Extending the base design is a framework for multiprocessing support as well as chiplets support, which can be used by companies looking to improve yield and manufacturability in large SoC designs. The N1 is also designed to work seamlessly with the Neoverse E1, which was introduced at the same time as the N1 but is optimized for high-throughput multithreaded workloads.
Core
The Neoverse N1 features an 11-stage accordion integer pipeline.
Die
N1 core
- 7 nm process
- 1 Core + L2
- 1.2 mm² die size (1C + 512 KiB L2)
- 1.4 mm² die size (1C + 1 MiB L2)
- 1 W @ 2.6 GHz (0.75 V), 1.8 W @ 3.1 GHz (1.0 V)
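The two operating points and two area figures above allow some simple arithmetic. The sketch below is a back-of-the-envelope calculation using only the listed numbers; the names `OPERATING_POINTS`, `AREA_MM2`, `watts_per_ghz` and `l2_area_delta_mm2` are hypothetical, and attributing the whole area difference to the extra L2 capacity is an assumption.

```python
# Operating points and areas copied from the list above.
OPERATING_POINTS = [
    {"ghz": 2.6, "volts": 0.75, "watts": 1.0},
    {"ghz": 3.1, "volts": 1.0, "watts": 1.8},
]
AREA_MM2 = {512: 1.2, 1024: 1.4}  # keyed by L2 size in KiB


def watts_per_ghz(point):
    """Power cost of each GHz at a given operating point."""
    return point["watts"] / point["ghz"]


def l2_area_delta_mm2():
    """Area difference between the 1 MiB and 512 KiB L2 configurations,
    assuming the whole difference comes from the extra 512 KiB of L2."""
    return AREA_MM2[1024] - AREA_MM2[512]


if __name__ == "__main__":
    for p in OPERATING_POINTS:
        print(f'{p["ghz"]} GHz @ {p["volts"]} V: {watts_per_ghz(p):.3f} W/GHz')
    low, high = OPERATING_POINTS
    print(f'frequency x{high["ghz"] / low["ghz"]:.2f}, '
          f'power x{high["watts"] / low["watts"]:.2f}')
    print(f"extra 512 KiB of L2: {l2_area_delta_mm2():.1f} mm^2")
```

On these figures, moving from the 2.6 GHz point to the 3.1 GHz point raises frequency by roughly 19% for 80% more power.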
All Neoverse N1 Processors
List of Neoverse N1-based Processors
Model | Family | Launched | Process | Arch | Cores | Frequency |
---|---|---|---|---|---|---|
ALC12B00 | Graviton | 3 December 2019 | 7 nm | Neoverse N1 | 64 | 2.5 GHz |
Count: 1
Bibliography
- Drew Henry, TechCon 2018 keynote.
- Drew Henry, direct communication
- Most of the technical details were obtained directly from Arm
Documents
- Neoverse N1 Software Optimization Guide
- Neoverse N1 Technical Reference Manual