理解cache-line&tuple-at-a-time&clock cycles

Overview

Those who cannot remember the past are condemned to repeat it - George Santayana
Cache

A Cache is a hardware or software component that stores data so that future requests for that data can be served faster.
CPU cache

A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory.

When trying to read from or write to a location in the main memory, the processor checks whether the data from that location is already in the cache. If so, the processor will read from or write to the cache instead of the much slower main memory.

Most modern desktop and server CPUs have at least three independent caches:
- instruction cache to speed up executable instruction fetch
- data cache to speed up data fetch and store
- translation lookaside buffer(TLB) to speed up virual-to-physical address translation for both executable instructions and data
Cache line

Data is transferred between memory and cache in blocks of fixed size, called cache lines or cache blocks.

When a cache line is copied from memory into the cache, a cache entry is created.

The cache entry will include the copied data as well as the requested memory location (called a tag).
Query Processing

The vectorization model aims to increase the efficiency of the materialization model with a better use of the CPU caches.

There are three ways for a DBMS to execute a query plan:
- Tuple-at-a-time: Each operator calls next on their child to get the next tuple to process. Also known as the Volcano interator model;
- Operator-at-a-time: Each operator materializes their entire output for their parent operator, it is ideal for in-memory OLTP engine;
- Vector-at-a-time: Each operator calls next on their child to get the bext batch of data to process;
Volcano iterator model

OceanBase: 数据库查询引擎的进化之路

Volcano–An Extensible and Parallel Query Evaluation System
clock cycles

A clock signal oscillates between a high and a low state and is used like a metronome to coordinate actions of digital circuits.

A clock cycle is a single electronic pulse of a CPU. During each cycle, a CPU can perform a basic operation such as fetching an instruction, accessing memory, or writing data.

In physics, the frequency of a signal is determined by cycles per second, or “hertz”, similarly, the frequency of a processor is measured in clock cycles per second.

The speed of a computer processor, or CPU, is determined by the Clock Cycle, which is the amount of time between two pulses of an oscillator.
References

理解cache-line&tuple-at-a-time&clock cycles

Overview

Cache

CPU cache

Cache line

Query Processing

Volcano iterator model

clock cycles

References

猜你喜欢