Parallel Architecture: Hardware Edition

People often think that parallelism simply means multi-core, but modern computers exploit parallelism at several different levels.

First, bit-level parallelism

Why does a 32-bit computer run faster than an 8-bit computer? Because of parallelism: to add two 32-bit numbers, an 8-bit computer has to work through them in 8-bit pieces, while a 32-bit computer can do it in a single step, i.e. it processes all four bytes of the 32-bit word in parallel.
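To make the step count concrete, here is a small sketch in C (my addition, not from the original post): the hypothetical add32_on_8bit mimics how an 8-bit machine would add two 32-bit numbers one byte at a time with a manual carry, while the plain `a + b` below it is the single step a 32-bit machine takes.

```c
#include <stdint.h>
#include <stdio.h>

/* Add two 32-bit numbers the way an 8-bit machine must:
 * one byte at a time, propagating the carry by hand. */
static uint32_t add32_on_8bit(uint32_t a, uint32_t b) {
    uint32_t result = 0, carry = 0;
    for (int i = 0; i < 4; i++) {                 /* four 8-bit steps */
        uint32_t x = (a >> (8 * i)) & 0xFF;
        uint32_t y = (b >> (8 * i)) & 0xFF;
        uint32_t sum = x + y + carry;
        carry = sum >> 8;                         /* carry into the next byte */
        result |= (sum & 0xFF) << (8 * i);
    }
    return result;
}

int main(void) {
    uint32_t a = 0x12345678, b = 0x0FEDCBA9;
    printf("%08X\n", add32_on_8bit(a, b));        /* four sequential steps */
    printf("%08X\n", a + b);                      /* one 32-bit step */
    return 0;
}
```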

Computers have moved through the 8-bit, 16-bit, and 32-bit eras and are now in the 64-bit era. However, raising performance by widening the word size has hit a bottleneck, which is why we will not be moving to 128-bit any time soon.

Second, instruction-level parallelism

Modern CPUs are highly parallel internally, using techniques that include pipelining, out-of-order execution, and speculative execution.
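Instruction-level parallelism is exploited by the hardware itself, but the shape of the code decides how much of it is exposed. The hedged sketch below (an illustration of mine, not the author's) sums an array once through a single dependency chain and once through four independent accumulators that a pipelined, out-of-order core can overlap.

```c
/* Two ways to sum an array. The second keeps four independent
 * accumulators, so a pipelined, out-of-order core can overlap the adds
 * instead of waiting on one long dependency chain. (With floating point,
 * the two results can differ in the last bits because of reassociation.) */
double sum_serial(const double *a, int n) {
    double s = 0.0;
    for (int i = 0; i < n; i++)
        s += a[i];                    /* each add depends on the previous one */
    return s;
}

double sum_ilp(const double *a, int n) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    int i = 0;
    for (; i + 3 < n; i += 4) {       /* four independent dependency chains */
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)                /* leftover elements */
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```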

Third, data-level parallelism

Graphics processing is a scenario that is well suited to data-level parallelism: the same operation is applied independently to very large numbers of data items.
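A minimal illustrative sketch (my addition, assuming a vectorizing C compiler such as gcc with -O3): the hypothetical scale_pixels below applies one operation independently to every element, which is exactly the pattern that data-level (SIMD) parallelism relies on.

```c
/* The same operation applied independently to every element: this is the
 * shape that SIMD units exploit. A vectorizing compiler can turn this loop
 * into instructions that scale several pixels at once. */
void scale_pixels(float *pixels, int n, float factor) {
    for (int i = 0; i < n; i++)
        pixels[i] *= factor;          /* no dependence between iterations */
}
```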

Fourth, task-level parallelism

We finally arrive at the form of parallelism most people think of: multiple processors. From a programmer's point of view, the most obvious way to classify a multi-processor system is by its memory model (shared memory or distributed memory).
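As a hedged shared-memory example (assuming POSIX threads; not taken from the original article), the sketch below splits a summation across two threads that share one array. On a distributed-memory system the same split would instead require explicit message passing between nodes.

```c
#include <pthread.h>
#include <stdio.h>

/* Shared-memory task parallelism: two threads each sum half of a shared
 * array, writing into their own slot of `partial`, so no locking is needed.
 * Build with: cc -pthread this_file.c */
#define N 1000000
static double data[N];
static double partial[2];

static void *worker(void *arg) {
    long id = (long)arg;
    long lo = id * (N / 2), hi = lo + N / 2;
    double s = 0.0;
    for (long i = lo; i < hi; i++)
        s += data[i];
    partial[id] = s;                  /* each thread owns its slot: no race */
    return NULL;
}

int main(void) {
    for (long i = 0; i < N; i++)
        data[i] = 1.0;
    pthread_t t[2];
    for (long id = 0; id < 2; id++)
        pthread_create(&t[id], NULL, worker, (void *)id);
    for (int id = 0; id < 2; id++)
        pthread_join(t[id], NULL);
    printf("sum = %.0f\n", partial[0] + partial[1]);
    return 0;
}
```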

Single core

In 1965, Intel co-founder Gordon Moore proposed what is now named after him as "Moore's Law": the number of components that fit on an integrated circuit doubles roughly every 18 to 24 months, and performance doubles along with it.

For now, Moore's Law has broken down, for several reasons; the integration density of components is approaching its limit:

  • Conductor width has reached physical limits
  • Heat dissipation: the chassis would effectively become an oven
  • Demand has changed: it is no longer simply about chasing clock frequency

Multi-core / multi-processor

Based on technology and cost considerations, chip manufacturers have moved in the multi-core direction: 8 cores, 32 cores, 64 cores, and so on. However, the cores still access shared memory over a common bus, which limits how much data the CPUs can process.

NUMA

To solve the memory bandwidth problem, NUMA was introduced. Only when a CPU accesses the physical memory directly attached to it does it get a short response time (this is called Local Access). If it needs data in memory attached to another CPU, it must go through an inter-connect channel, and the response time is slower (this is called Remote Access). Hence the name NUMA (Non-Uniform Memory Access).
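As a hedged sketch of what Local Access looks like in code, assuming Linux with libnuma installed (my example, not the author's): the buffer below is placed on node 0, so CPUs on that node reach it locally while CPUs on other nodes would pay the Remote Access cost.

```c
#include <numa.h>                     /* libnuma: link with -lnuma */
#include <stdio.h>

/* Allocate a buffer on NUMA node 0 so that threads running on that node's
 * CPUs get Local Access to it instead of Remote Access. */
int main(void) {
    if (numa_available() < 0) {
        puts("NUMA is not available on this machine");
        return 1;
    }
    size_t size = 64 * 1024 * 1024;   /* 64 MiB, placed on node 0 */
    void *buf = numa_alloc_onnode(size, 0);
    if (buf == NULL)
        return 1;
    /* ... work on buf from CPUs attached to node 0 for local access ... */
    numa_free(buf, size);
    return 0;
}
```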

Almost everyone who has done operations and maintenance has, at one time or another, been bitten by NUMA.

Distributed

For this, see how a Hadoop cluster operates.

That's all.


Source: blog.csdn.net/weixin_34112208/article/details/91013064