What You Must Know - The Memory Hierarchy

We have all used a variety of storage technologies, such as MySQL, MongoDB, Redis, and MQ. The performance of these storage services differs greatly, and one reason is the underlying storage devices they use. As a programmer, you need to understand the memory hierarchy so that you have a clear picture of where performance differences in a program come from. Today we take a look at the memory hierarchy of a computer system.

Storage Technology

First, what is a memory system?

It is essentially a hierarchy of storage devices with different capacities, costs, and access times. From fastest to slowest: CPU registers, cache, main memory, disk.

Here is a set of figures to give a clearer picture:

If the data is stored in CPU registers, it can be accessed in 0 clock cycles; if stored in the cache, it takes 4 to 75 clock cycles. If stored in main memory, it takes hundreds of cycles, and if stored on disk, about tens of millions of cycles! (From CSAPP)

Next, let's look more closely at the storage devices involved in a computer system:

Random access memory

Random access memory (RAM) comes in two varieties: static RAM (SRAM) and dynamic RAM (DRAM). SRAM is faster but also much more expensive, so a system usually has no more than a few megabytes of it, typically used for cache memory. DRAM is what we usually call main memory.

Accessing main memory

Data flows back and forth between the processor and DRAM over shared electrical conduits called buses. Each transfer of data between the CPU and main memory is accomplished by a series of steps called a bus transaction. A read transaction transfers data from main memory to the CPU. A write transaction transfers data from the CPU to main memory.

A bus is a set of parallel wires that carry address, data, and control signals. The following diagram shows how the CPU chip is connected to the DRAM main memory.

Bus structure connecting the CPU and main memory

So when we load and store data, how exactly do the CPU and main memory interact?

First, look at a basic instruction that loads data from memory into a CPU register:

movq A,%rax

The contents of address A are loaded into register %rax. This instruction causes the on-chip circuitry called the bus interface to initiate a read transaction on the bus, which consists of three steps:

  1. The CPU places address A on the system bus, and the I/O bridge passes the signal along to the memory bus. See figure a.
  2. Main memory senses the address signal on the memory bus, reads the address from the memory bus, fetches the data word from the DRAM, and writes it to the memory bus. The I/O bridge translates the memory bus signal into a system bus signal and passes it along the system bus to the CPU's bus interface. See figure b.
  3. The CPU senses the data on the system bus, reads it from the bus, and copies it into register %rax. See figure c.

a. The CPU places address A on the memory bus

b. Main memory reads A from the bus, retrieves word x, and places it on the bus

c. The CPU reads word x from the bus and copies it into %rax
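To make the three steps concrete, here is a minimal sketch of the read transaction as a toy Python model. It is only an illustration: the class names (MainMemory, IOBridge, CPU), the method names, and the address 0x1000 are all hypothetical, and real hardware does this with electrical signals rather than method calls.

```python
# Toy model of the read transaction triggered by `movq A,%rax`.
# All names and the address below are hypothetical, for illustration only.

class MainMemory:
    def __init__(self, contents):
        self.contents = dict(contents)  # address -> data word

    def respond(self, address):
        # Step 2: memory senses the address on the memory bus and
        # places the matching data word on the bus.
        return self.contents[address]


class IOBridge:
    def __init__(self, memory):
        self.memory = memory

    def forward_read(self, address):
        # Passes the address from the system bus to the memory bus,
        # then passes the returned word back toward the CPU.
        return self.memory.respond(address)


class CPU:
    def __init__(self, bridge):
        self.bridge = bridge
        self.registers = {"rax": 0}

    def movq_load(self, address, register):
        # Step 1: the bus interface places the address on the system bus.
        word = self.bridge.forward_read(address)
        # Step 3: the CPU reads the word from the bus into the register.
        self.registers[register] = word


memory = MainMemory({0x1000: 42})
cpu = CPU(IOBridge(memory))
cpu.movq_load(0x1000, "rax")
print(cpu.registers["rax"])  # 42
```

A write transaction would follow the same path in the opposite direction, with the CPU supplying both the address and the data word.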

Random access memory has one drawback: when the power goes off, DRAM and SRAM lose their information, so they are called volatile memory.

Disk Storage

Disks are widely used mass storage devices; home computers today often come with 1 TB. RAM-based memory, by contrast, holds only a few gigabytes to a few hundred gigabytes. The price of that capacity is a large gap in read and write performance: a disk access takes on the order of milliseconds, about 100,000 times slower than a DRAM read and about 1,000,000 times slower than SRAM.

Disk structure

A disk is made of platters. Each platter has two sides, or surfaces, coated with magnetic recording material. A rotating spindle in the center of the platter spins it at a fixed rate, typically 5,400 to 15,000 revolutions per minute. A disk typically contains several platters in a sealed container.

Disc structure

As shown above, each surface is divided into many concentric rings called tracks. Each track is divided into sectors, and each sector holds the same number of data bits (typically 512 bytes). Sectors are separated by gaps, which store formatting bits that identify the sectors.

Several platters packaged together in a sealed container are what we usually call a hard disk, or disk drive.

Disk capacity

Capacity is easy to understand: it is the total number of bits a disk can store. Given the structure of the disk, capacity is determined by the following factors:

  • Recording density (bits/inch): the number of bits that fit into a one-inch segment of a track.
  • Track density (tracks/inch): the number of tracks that fit into a one-inch segment of the radius extending outward from the center of the spindle.
  • Areal density (bits/square inch): the product of the recording density and the track density.

From the above, increasing disk capacity comes down to increasing areal density, which in recent years has doubled every few years. Disk capacity is calculated as follows:

Disk capacity = (bytes / sector) * (average sectors / track) * (tracks / surface) * (surfaces / platter) * (platters / disk)

An example makes this easier to understand:

Suppose we have a disk with 5 platters, 512 bytes per sector, 20,000 tracks per surface, and 300 sectors per track. Its capacity is then:

Disk capacity = 512 * 300 * 20000 * 2 * 5 = 30,720,000,000 bytes = 30.72 GB
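The same calculation can be written as a short Python sketch. The function name disk_capacity_bytes and its parameter names are made up for this example; the numbers are the ones from the example above.

```python
def disk_capacity_bytes(bytes_per_sector, sectors_per_track,
                        tracks_per_surface, surfaces_per_platter,
                        platters_per_disk):
    """Multiply the five factors from the disk capacity formula."""
    return (bytes_per_sector * sectors_per_track * tracks_per_surface
            * surfaces_per_platter * platters_per_disk)


# 5 platters, 2 surfaces each, 20,000 tracks per surface,
# 300 sectors per track, 512 bytes per sector.
capacity = disk_capacity_bytes(512, 300, 20000, 2, 5)
print(capacity)          # 30720000000
print(capacity / 10**9)  # 30.72
```

Note that the 30.72 figure counts a gigabyte as 10^9 bytes, which is the convention normally used for disk capacities.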

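Disk operation

A disk reads and writes the bits stored on the magnetic surface using a read/write head attached to the end of an actuator arm. By moving the arm back and forth along the radius, the head can reach the data on different tracks. This movement is called a seek.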

Disk dynamic characteristics

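As the figure above shows, to read data the arm first moves the read/write head along the radius to the right track on the surface. The surface keeps spinning at a fixed rate, and the head reads the data of the target sector as it passes (disks read and write data in sector-sized units). The time spent on a data access therefore falls mainly into three parts: seek time, rotational latency, and transfer time.

  • Seek time: the time needed to move the arm to the track that contains the target sector;
  • Rotational latency: once the seek completes, the time spent waiting for the first bit of the target sector to rotate under the read/write head;
  • Transfer time: the time from when the first bit of the sector is under the read/write head until the last bit has passed;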

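Here is a conclusion from the book: the time to access the 512 bytes in a disk sector is dominated by the seek time and the rotational latency. In other words, accessing the first byte of the sector takes a long time, and the remaining bytes take almost none.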
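A rough estimate shows why seek and rotation dominate. The sketch below is only an illustration: the function name sector_access_times_ms is hypothetical, and the drive parameters (7,200 RPM, 9 ms average seek, 400 sectors per track) are assumed values, not figures from this article. It assumes that on average the head waits half a rotation, and that transferring one sector takes one rotation divided by the number of sectors per track.

```python
def sector_access_times_ms(rpm, avg_seek_ms, sectors_per_track):
    """Estimate (seek, rotational latency, transfer) for one sector, in ms."""
    ms_per_rotation = 60_000 / rpm
    avg_rotation_ms = ms_per_rotation / 2               # half a turn on average
    transfer_ms = ms_per_rotation / sectors_per_track   # one sector passes the head
    return avg_seek_ms, avg_rotation_ms, transfer_ms


# Assumed drive parameters, chosen only for illustration.
seek, rotation, transfer = sector_access_times_ms(
    rpm=7200, avg_seek_ms=9, sectors_per_track=400)
total = seek + rotation + transfer
share = (seek + rotation) / total

print(f"seek={seek:.2f} ms, rotation={rotation:.2f} ms, transfer={transfer:.3f} ms")
print(f"seek + rotation share of total: {share:.1%}")
```

With these assumed numbers the transfer itself takes about 0.02 ms out of roughly 13 ms, so well over 99% of the access time is spent positioning the head.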

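You may be wondering how the CPU reads data from disk into main memory. To answer that, we need to look at the I/O bus. Devices are attached to the bus through various adapters, and the I/O bus is in turn connected to main memory and the CPU, as shown below: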

Bus configuration example

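In other words, the I/O bus connects the various I/O devices, main memory, and so on.

Solid state disks

A solid state disk, commonly known as an SSD (Solid State Disk), is a storage technology based on flash memory. Most everyday PCs now use one in place of a rotating disk to get faster speeds.

An SSD is built from flash memory. A flash memory consists of a sequence of B blocks, and each block consists of P pages. Pages are typically 512 bytes to 4 KB in size, a block consists of 32 to 128 pages, and blocks range from 16 KB to 512 KB in size.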
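The block size range follows directly from the page counts and page sizes. A quick check in Python (the helper block_size_bytes is just a name made up for this example, and 1 KB is taken as 1,024 bytes):

```python
KB = 1024  # assuming 1 KB = 1,024 bytes


def block_size_bytes(pages_per_block, page_size_bytes):
    """Size of one flash block: number of pages times page size."""
    return pages_per_block * page_size_bytes


smallest = block_size_bytes(32, 512)      # fewest pages, smallest page
largest = block_size_bytes(128, 4 * KB)   # most pages, largest page
print(smallest // KB, largest // KB)  # 16 512
```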

Solid State Drive SSD

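Random reads on an SSD are much faster than writes, because:

  1. A page can be written only after the entire block it belongs to has been erased. Erasing a block takes a relatively long time, on the order of 1 ms, which is an order of magnitude longer than a read.
  2. If the page P being written already contains data, then all pages with data in that block must be copied to a new, already erased block before the write to page P can take place.

After roughly 100,000 repeated writes, a block wears out and can no longer be used. This is why advice found online recommends not formatting an SSD frequently, and using it as the system disk.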
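The second rule can be sketched with a toy model. This is a deliberate simplification, not how a real flash translation layer is implemented: FlashBlock and write_page are hypothetical names, None stands for an erased page, and the sketch only counts how many pages have to be copied to serve one write.

```python
class FlashBlock:
    """One flash block; a page that is None is erased and writable."""

    def __init__(self, pages_per_block):
        self.pages = [None] * pages_per_block


def write_page(block, page_index, data):
    """Write data to a page. Returns (block_holding_the_data, pages_copied)."""
    if block.pages[page_index] is None:
        # Page is erased: write in place, nothing to copy.
        block.pages[page_index] = data
        return block, 0
    # Page already holds data: move the other pages that hold data to a
    # fresh (already erased) block, then write the new data there.
    fresh = FlashBlock(len(block.pages))
    copied = 0
    for i, old in enumerate(block.pages):
        if i != page_index and old is not None:
            fresh.pages[i] = old
            copied += 1
    fresh.pages[page_index] = data
    return fresh, copied


block = FlashBlock(pages_per_block=4)
for i in range(3):
    block, copied = write_page(block, i, f"v1-{i}")  # erased pages: 0 copies each

block, copied = write_page(block, 1, "v2-1")  # page 1 already has data
print(copied)       # 2
print(block.pages)  # ['v1-0', 'v2-1', 'v1-2', None]
```

A single page update ends up copying every other live page in the block, which is why rewriting existing data is so much more expensive than reading it.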

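Locality

Modern computers make heavy use of SRAM-based caches to bridge the gap between processor and memory. This approach works because of a fundamental property called locality.

The principle of locality says that a running program exhibits local patterns: over any period of time, execution is confined to one part of the program, and the storage it accesses is correspondingly confined to some region of memory. Locality takes two forms: temporal locality and spatial locality. Temporal locality means that once an instruction has been executed, it is likely to be executed again soon, and once a piece of data has been accessed, it is likely to be accessed again soon. Spatial locality means that once a program has accessed a storage location, nearby locations are likely to be accessed soon as well.

We covered above how memory and disk are read. When data that has been accessed once is accessed again shortly afterward, mechanisms such as caching make the access more efficient.

Our programs should therefore follow these general guidelines: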

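  1. Programs that repeatedly reference the same variables have good temporal locality;
  2. Always access data sequentially; the smaller the stride, the better the program's spatial locality.
  3. For instruction fetches, loops have good temporal and spatial locality. The smaller the loop body and the greater the number of iterations, the better the locality.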
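To see what stride means in the second guideline, consider a two-dimensional array stored row by row in memory. The sketch below lists which flat element positions a row-wise scan and a column-wise scan touch, and the stride between consecutive accesses. The function names are hypothetical, and the 2 x 3 array is just a small example.

```python
def visit_order(rows, cols, by_rows):
    """Flat positions visited when a rows x cols array, stored row by row,
    is scanned row-wise (by_rows=True) or column-wise (by_rows=False)."""
    if by_rows:
        return [r * cols + c for r in range(rows) for c in range(cols)]
    return [r * cols + c for c in range(cols) for r in range(rows)]


def strides(positions):
    """Distance between each pair of consecutive accesses."""
    return [b - a for a, b in zip(positions, positions[1:])]


row_wise = visit_order(2, 3, by_rows=True)
col_wise = visit_order(2, 3, by_rows=False)
print(row_wise, strides(row_wise))  # [0, 1, 2, 3, 4, 5] [1, 1, 1, 1, 1]
print(col_wise, strides(col_wise))  # [0, 3, 1, 4, 2, 5] [3, -2, 3, -2, 3]
```

The row-wise scan has stride 1 throughout, so it touches neighbouring memory locations one after another. The column-wise scan keeps jumping across rows, and for large arrays those jumps grow with the row length, which hurts spatial locality.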

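Take a for loop, something we use all the time. Suppose it accesses the elements of one array; that array is then the working set for the current phase, and if the cache is large enough, the accesses hit the cache directly.

The memory hierarchy

The sections above introduced some fundamental and enduring properties of storage technology and computer software:

  • Storage technology: access times differ widely across storage technologies. Faster technologies cost more per byte than slower ones and have less capacity. The speed gap between the CPU and main memory is widening;
  • Computer software: a well-written program tends to exhibit good locality.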

In modern computer systems, these fundamental properties of hardware and software complement each other very well. Going from the top of the hierarchy down, storage devices become slower, cheaper, and larger. At the top are the CPU registers, which the CPU can access in one clock cycle, followed by SRAM-based cache memory, main memory, and so on.

Memory Hierarchy

As shown in the figure, the central idea is: for each k, the faster and smaller storage device at level k serves as a cache for the larger and slower storage device at level k + 1.

In summary, memory hierarchies based on caching work because slower storage is cheaper than faster storage, and because programs tend to exhibit locality.

  • Exploiting temporal locality: because of temporal locality, the same data object is likely to be used multiple times. Once it has been copied into the cache on the first miss, later accesses are much faster than the first one.
  • Exploiting spatial locality: lower-level storage devices use the block as the basic unit of transfer, and a block usually contains multiple data objects. Because of spatial locality, later accesses to other objects in the block hit the cache, which makes up for the cost of copying the block on the first access;
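Both effects can be observed in a small simulation. The sketch below is an assumption-laden toy, not a model of any real hardware: BlockCache is a hypothetical fully associative cache with LRU replacement, and the sizes (4 blocks of 8 addresses each) are picked only to keep the numbers easy to follow.

```python
from collections import OrderedDict


class BlockCache:
    """Tiny fully associative LRU cache that transfers whole blocks."""

    def __init__(self, num_blocks, block_size):
        self.num_blocks = num_blocks
        self.block_size = block_size
        self.blocks = OrderedDict()  # block number -> True, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, address):
        block = address // self.block_size
        if block in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block)       # mark as most recently used
        else:
            self.misses += 1
            if len(self.blocks) >= self.num_blocks:
                self.blocks.popitem(last=False)  # evict least recently used
            self.blocks[block] = True            # copy the whole block in


cache = BlockCache(num_blocks=4, block_size=8)
for _ in range(2):               # two passes over the same 32 addresses
    for address in range(32):
        cache.access(address)
print(cache.hits, cache.misses)  # 60 4
```

In the first pass only the first access to each block misses, and the other seven accesses to that block hit (spatial locality). The second pass hits every time because the whole working set is still in the cache (temporal locality).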

Summary

This article covered the basics of computer memory:

  1. Common storage technologies, and how a computer accesses data on these storage devices.
  2. The principle of program locality, both temporal and spatial, which helps us write faster programs.
  3. Finally, the memory hierarchy of the whole computer system. In essence, the storage system is a multi-level cache: upper-level storage devices are small and expensive but fast, and each one serves as a cache for the level below it.

For more articles, please visit my personal site: fall in love with programming

Origin www.cnblogs.com/pekxxoo/p/csapp-6.html