Information Storage and Management (Part 2)

Continued from Information Storage and Management (Part 1).

Each platter has two read/write heads, one for each surface.

Tracks are concentric rings on the platter, centered on the spindle. They are numbered sequentially from the outside in, so the outermost track is track 0.
Metadata: stores the sector number, head (platter) number, track number, and so on, helping the controller locate data on the disk.
There are two disk recording methods:
1. Legacy: non-zoned recording (every track holds the same number of sectors)
2. Modern: zoned bit recording (different tracks hold different numbers of sectors)
The zoned scheme exploits the fact that outer tracks are physically longer: they are divided into more sectors, so every sector covers the same area and holds the same amount of data.
Performance is better on the outer tracks, where more data passes under the head per revolution.
CHS (cylinder-head-sector): addresses data by its physical location on the disk.
LBA (Logical Block Addressing): accesses physical data blocks through a linear address.
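The mapping between the two addressing schemes can be sketched with the standard conversion formula; the 16-head, 63-sectors-per-track geometry below is a hypothetical example, not anything from the text above.

```python
def chs_to_lba(c, h, s, heads_per_cylinder, sectors_per_track):
    """Convert a CHS address to a linear LBA block number.

    CHS numbers sectors from 1, while LBA starts at 0, hence (s - 1)."""
    return (c * heads_per_cylinder + h) * sectors_per_track + (s - 1)

# Hypothetical geometry: 16 heads, 63 sectors per track.
print(chs_to_lba(0, 0, 1, 16, 63))   # first sector of the disk -> LBA 0
print(chs_to_lba(1, 0, 1, 16, 63))   # first sector of cylinder 1 -> LBA 1008
```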
Disk service time: the time taken to complete one disk I/O request. Its components:
• Seek time: the time the actuator arm needs to move the head to the correct track, typically 3–15 ms.
• Rotational latency: once the head is on the right track, the time for the platter to rotate the target sector under the head. The average rotational latency is about 5.5 ms for a 5,400 r/min disk and 2 ms for a 15,000 r/min disk.
• Transfer rate: the average amount of data the disk can move to the HBA per unit time.
The data transfer rate is divided into internal and external rates.
Internal transfer rate: the rate at which data moves from a single track on the platter to the disk's internal buffer; it takes seek time into account.
External transfer rate: the rate at which data moves from the interface to the HBA. The advertised transfer rate is typically the external rate.
U is the utilization of the I/O controller.
Rs is the average service time, the time the controller takes to process one request.
a is the arrival rate, the number of I/O requests reaching the system per unit time.
1/Rs is the service rate.
Ra = 1/a is the average inter-arrival time.
Utilization U = Rs/Ra = a × Rs.
Average response time R = Rs/(1 − U).
The average number of requests in the queue (average queue length) NQ = U²/(1 − U).
The time a request spends waiting in the queue = U × R.
Response time grows nonlinearly with utilization: as load rises the queue lengthens and response time climbs gradually, but once utilization passes the knee of the curve, response time shoots up steeply.
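The formulas above can be evaluated directly; the 100 requests/s and 8 ms figures below are hypothetical numbers chosen only to illustrate the knee-of-the-curve behavior.

```python
def queue_metrics(arrival_rate, service_time):
    """Controller metrics from the formulas above.

    arrival_rate: I/O requests per second (a)
    service_time: average service time per request in seconds (Rs)
    """
    U = arrival_rate * service_time      # utilization, U = Rs / Ra = a * Rs
    assert U < 1, "controller is saturated"
    R = service_time / (1 - U)           # average response time
    NQ = U * U / (1 - U)                 # average queue length
    wait = U * R                         # time a request waits in the queue
    return U, R, NQ, wait

# 100 requests/s against an 8 ms service time (hypothetical workload):
U, R, NQ, wait = queue_metrics(100, 0.008)
print(f"U={U:.2f}  R={R*1000:.1f} ms  NQ={NQ:.2f}  wait={wait*1000:.1f} ms")
```

Doubling the arrival rate here would push U past 1, which is exactly the saturation point where response time diverges.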

Host logical components
The host's logical components consist of application software and protocols. They include:
• Operating system (OS): supports data access, monitors and responds to user actions, and organizes, controls, and allocates the hardware resources.
• Device drivers
• Volume manager: a Logical Volume Manager (LVM) combines several smaller disks into one large virtual disk. It optimizes storage access and simplifies storage resource management, letting administrators change storage allocation without changing the hardware. The basic LVM components are physical volumes (PV), volume groups (VG), and logical volumes (LV).
Physical volume (PV): each physical disk connected to the host.
LVM converts the physical storage space offered by physical volumes into logical storage space. A volume group is created by grouping one or more physical volumes. When a volume group is created, each physical volume is divided into equal-sized data blocks called physical extents. A volume group can then be carved into logical volumes; a logical volume may consist of non-contiguous physical extents and may span multiple physical volumes.
• File system
• Application
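The LVM concepts above (PVs split into physical extents, a VG pooling them, an LV spanning PVs) can be sketched as a toy model; the 4 MB extent size and the PV sizes are hypothetical, and real LVM is of course far more involved.

```python
# Toy model of LVM: PVs are split into fixed-size physical extents (PEs),
# a volume group (VG) pools them, and a logical volume (LV) maps onto PEs
# that may be non-contiguous and may span several PVs.
EXTENT_MB = 4  # hypothetical physical extent size

def make_vg(pv_sizes_mb):
    """Return the VG's free extent pool as (pv_index, pe_index) pairs."""
    return [(pv, pe) for pv, size in enumerate(pv_sizes_mb)
                     for pe in range(size // EXTENT_MB)]

def create_lv(vg_free, size_mb):
    """Carve an LV out of the pool; its extents may span multiple PVs."""
    need = size_mb // EXTENT_MB
    assert len(vg_free) >= need, "volume group has too little free space"
    lv, vg_free[:] = vg_free[:need], vg_free[need:]
    return lv

vg = make_vg([16, 16])    # two 16 MB PVs -> 8 extents in the pool
lv = create_lv(vg, 24)    # a 24 MB LV = 6 extents, spanning both PVs
print(lv)
```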
Data access falls into two modes, block level and file level, depending on whether the application reads the disk using a logical block address or using a file name and file-record identifier.
File-level access specifies a file name and path, hiding the complexity of logical block addressing (LBA).
A journaling file system shortens file-system checks and protects file-system integrity.
Total disk service time (Rs) is the sum of the seek time (E), the rotational latency (L), and the internal transfer time (X).
E is determined by the randomness of the I/O requests.
Rotational latency is half the time of one revolution; disk rotation speed is given in r/min.
Internal transfer time (X) can be calculated from the I/O block size and the internal transfer rate.
The maximum number of I/Os serviced per second, i.e. IOPS = 1/Rs.
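Putting Rs = E + L + X together: the sketch below computes the per-disk IOPS ceiling. The 5 ms seek, 4 KB block, and 40 MB/s internal rate are hypothetical figures; only the 15,000 r/min speed echoes a number used earlier.

```python
def disk_iops(seek_ms, rpm, block_kb, transfer_mb_s):
    """Max IOPS for a single disk from Rs = E + L + X."""
    E = seek_ms / 1000                       # average seek time (s)
    L = 0.5 / (rpm / 60)                     # rotational latency: half a revolution (s)
    X = (block_kb / 1024) / transfer_mb_s    # internal transfer time (s)
    Rs = E + L + X
    return 1 / Rs

# Hypothetical 15,000 r/min disk: 5 ms seek, 4 KB blocks, 40 MB/s internal rate.
print(round(disk_iops(5, 15000, 4, 40)))   # roughly 141 IOPS
```

Note how the seek time dominates Rs for small random I/O, which is why random-I/O IOPS is so much lower than what the transfer rate alone would suggest.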

Chapter 3: RAID data protection

External RAID controllers: hardware-based array RAID. The maximum amount of data that can be read from or written to a single hard disk in one operation equals the strip size.
Every strip in a stripe contains the same number of disk blocks. Reducing the strip size means data is broken into smaller pieces spread across more physical disks.
Disk parity is computed with a bitwise exclusive-OR (XOR) operation.
Parity needs only a small amount of extra disk capacity, but the parity must be recalculated whenever the data changes, and that recalculation time affects the performance of the RAID controller.
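The XOR parity property described above (losing any one block, the remaining blocks XORed with the parity reproduce it) can be shown in a few lines; the two-byte data blocks are made-up values.

```python
def xor_parity(*blocks):
    """Bitwise XOR across equal-length data blocks; the result is the parity block."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

d0, d1, d2 = b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"   # hypothetical data strips
p = xor_parity(d0, d1, d2)
# If d1 is lost, XORing the surviving blocks with the parity recovers it:
assert xor_parity(d0, d2, p) == d1
```

This is exactly why a single-parity scheme tolerates one disk failure: XOR is its own inverse.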
RAID 0: stripes data so multiple disks can be read and written concurrently. Suited to applications with large I/O block sizes. Provides no data protection or high availability against disk failure.
RAID 1: data mirroring.
RAID 10: mirrors the data first, then stripes the two copies across the disks of the RAID set. When a failed disk is replaced, only its mirror needs to be rebuilt. Suited to write-intensive, random, small-I/O workloads. RAID 01 has almost no practical use: when one disk fails, the whole stripe set fails, and rebuilding must copy the entire stripe set.
RAID 3: striping with a dedicated parity disk. It tolerates the loss of one disk. Data is always read and written in whole stripes, with all disks operating in parallel; there are no writes that update only part of a stripe.
It delivers very high bandwidth for large data transfers and is often used for streaming video and other workloads with large amounts of sequential data access.
RAID 4: stores parity on a dedicated disk, which is used to reconstruct data when a disk fails. The difference from RAID 3 is that the data disks support independent access: a single data block can be read from one disk without touching the whole stripe, giving better read and write throughput.
RAID 5: distributes the parity across all disks, removing the write-performance bottleneck of a dedicated parity disk.
RAID 6: introduces a second parity element, allowing two disks to fail simultaneously. Requires at least four disks. Its write penalty is large, so write performance is weaker than RAID 5, and rebuild operations are slower.

Write penalty: each host write generates extra I/O operations on the disks.
RAID 1 must perform every write on both disks simultaneously, so its write penalty is 2.
Whenever a RAID 5 controller performs a write, it must read the old data and the old parity from the disks in order to compute the new parity. Each write therefore costs two reads and two writes, so the RAID 5 write penalty is 4.
RAID 6 maintains two parity values, so one host write actually requires six disk I/O operations; its write penalty is 6.
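The write penalties above translate a host workload into back-end disk load; the 1,000 IOPS at 60 % reads below is a hypothetical workload, and reads are assumed to cost one disk I/O each.

```python
WRITE_PENALTY = {"RAID1": 2, "RAID5": 4, "RAID6": 6}

def disk_load_iops(total_iops, read_fraction, raid_level):
    """Back-end disk IOPS generated by a host workload, using the
    write penalties above; each read costs one disk I/O."""
    reads = total_iops * read_fraction
    writes = total_iops * (1 - read_fraction)
    return reads + writes * WRITE_PENALTY[raid_level]

# 1,000 host IOPS at 60 % reads (hypothetical workload):
print(disk_load_iops(1000, 0.6, "RAID5"))   # 600 reads + 400 * 4 = 2200 disk IOPS
```

Sizing a RAID set by host IOPS alone therefore undercounts the disks needed, especially for write-heavy RAID 5/6 workloads.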

Chapter 4 Intelligent Storage System

An intelligent storage system consists of four components: the front end, cache, the back end, and physical disks.
Front end
The front end provides the interface between the storage system and the hosts: it consists of front-end ports and front-end controllers.
The front-end ports connect the hosts to the intelligent storage system. Each port has processing logic for its transport protocol, such as SCSI, Fibre Channel, or iSCSI. Redundant front-end ports are usually provided.
The front-end controllers route data to and from cache over the internal data bus. Once the cache receives the data, the controller sends an acknowledgment back to the host. Controllers optimize I/O processing with command-queuing algorithms. Common front-end command-queuing algorithms are:
1. FIFO (first in, first out): the default algorithm, with the worst performance
2. Seek-time optimization: reduces read/write head movement
3. Access-time optimization
Cache
Cache is semiconductor memory that holds data temporarily. Physical disks are the slowest component: accessing data on a physical disk usually takes several milliseconds, while accessing it in cache takes under 1 ms. The smallest unit of cache allocation is a page (or slot). Cache is organized as a data store (which holds the data) and tag RAM (which records where the data lives in the data store and on disk). A dirty bit in tag RAM records whether the data has been committed to disk; tag RAM also keeps last-access time information used when evicting cache entries.
Read cache hit: when a host issues a read request, the front-end controller queries tag RAM to check whether the data is in cache. If the requested data is found there, a read cache hit occurs: the data is sent directly to the host with no disk operation involved, so the host gets a fast response (about 1 ms). If the data is not found in cache, a cache miss occurs and the data must be read from disk: the back end accesses the appropriate disk and retrieves the requested data. Cache misses increase I/O response time.
When read requests are sequential, a prefetch (read-ahead) algorithm is generally used: a contiguous set of disk blocks that have not yet been requested is read into cache ahead of time, significantly raising the read hit rate and reducing response time to the host.
Intelligent storage systems offer fixed and variable prefetch sizes; fixed prefetch works best when the host I/O size is uniform. A maximum prefetch limit caps the number of blocks read ahead, preventing prefetch from monopolizing the disks at the expense of other I/O.
Read performance is measured by the hit rate: the number of read hits divided by the total number of read requests.
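The effect of the hit rate on response time is a simple weighted average; the 1 ms cache latency echoes the figure above, while the 8 ms disk latency is a hypothetical miss cost.

```python
def avg_response_ms(hit_ratio, cache_ms=1.0, disk_ms=8.0):
    """Average read response time weighted by the cache hit ratio.

    cache_ms / disk_ms: assumed latencies for a hit and a miss."""
    return hit_ratio * cache_ms + (1 - hit_ratio) * disk_ms

print(avg_response_ms(0.9))   # 0.9 * 1 + 0.1 * 8 = 1.7 ms
```

Even a modest drop in hit rate pushes the average sharply toward disk latency, which is why hit rate is the headline cache metric.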
Write cache is limited, so it is used for small, random I/O; large write requests are sent directly to disk to keep them from consuming a large share of the cache.
Global cache: reads and writes can use any free memory; with a global cache, the split between read and write cache is adjusted dynamically according to the workload.
Cache management algorithms:
Least recently used (LRU): assumes that a page not accessed for a long time will not be accessed in the future; it identifies such pages and frees them.
Most recently used (MRU): frees the most recently used pages, on the assumption that a page just accessed will not be accessed again for some time.
When data is written to cache, the storage system must eventually flush dirty pages (pages whose data has been written to cache but not yet to disk); flushing is the process of committing data from cache to disk.
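The LRU policy described above can be sketched with an ordered map; this is an illustrative toy, not how a storage array actually manages its page tables.

```python
from collections import OrderedDict

class LRUCachePages:
    """Sketch of LRU replacement: the page untouched the longest is freed first."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()           # page_id -> data, oldest first

    def access(self, page_id, data):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # recently used -> back of the queue
        elif len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)   # evict the least recently used page
        self.pages[page_id] = data

cache = LRUCachePages(2)
cache.access("A", 1); cache.access("B", 2); cache.access("A", 1)
cache.access("C", 3)                         # evicts B, the LRU page
print(list(cache.pages))                     # ['A', 'C']
```

MRU would differ only in the eviction line: it would pop the most recently used entry instead of the oldest.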
Cache data protection
Cache mirroring and cache vaulting reduce the risk of losing uncommitted cached data.
Cache mirroring: every write to cache is stored in two independent memory locations. If a cache failure occurs, the copy in the mirrored location is still intact and can be committed to disk. Data that was only read from disk into cache need not be mirrored; after a failure it can simply be read from disk again.
Cache vaulting: on power failure, the cache contents are dumped to dedicated physical disks, called vault drives. When power is restored, the data is read back into cache and then written to the intended disks.
Back end
The back end provides the interface between cache and the physical disks; it consists of back-end ports and back-end controllers.
The back end controls data transfer between cache and the physical disks. The back-end controllers communicate with the disks during reads and writes, provide limited temporary data storage, and supply error detection and correction together with the RAID functionality implemented in the back end.
For high availability, storage systems are equipped with dual controllers and multiple back-end ports.
Physical disks
Solid-state (flash) drives store data in flash memory. Having no mechanical moving parts, they offer far shorter response times and lower power consumption.
Flash drives can be presented through conventional disk-drive storage interfaces, and provide high write performance, high reliability, and data integrity. Flash technology is ideal for applications that need fast access to large amounts of data.
Logical volumes are uniformly addressed by LUN (logical unit number), which improves disk utilization.
The capacity of a LUN can be expanded by combining it with other LUNs; the larger LUN that results is called a metaLUN.
LUN masking is a data-access control implemented on the front-end controllers; it prevents unauthorized and accidental access in distributed environments.
High-end storage arrays are active-active arrays, equipped with large caches and multiple controllers. A host can use any available path to access its LUNs.

Origin: blog.csdn.net/qq_44710568/article/details/104944210