Choosing between row-store and column-store tables in GaussDB

Table of contents

I. Introduction

II. Concepts of row-store and column-store tables

1. Definition

2. Advantages and disadvantages

III. How row-store and column-store tables are laid out

1. Row-store tables and how they are stored on disk

2. Column-store tables and how they are stored on disk

IV. Recommendations and usage scenarios for row-store and column-store tables

1. Row-store table usage scenarios and GaussDB SQL example

2. Column-store table usage scenarios and GaussDB SQL example

V. Summary

I. Introduction

Row-store tables and column-store tables are two common ways of storing data in a database. With the rapid development of information technology, storing and processing large volumes of data efficiently has become a major challenge.

Row storage and column storage were developed to address this challenge, and each is widely used in the scenarios that suit its strengths. GaussDB supports both. This article briefly introduces how row and column storage are used in the GaussDB database.

II. Concepts of row-store and column-store tables

1. Definition

  • A row-based table (row-store table) stores data in units of rows: all the column values of one record are kept together.
  • A column-based table (column-store table) stores data in units of columns: the values of one column, taken from many records, are kept together.

2. Advantages and disadvantages

1) The advantage of a row-store table is its simple structure, which is easy to understand and operate. Because all values of a record are stored together, a query that needs a whole row can fetch it from one location, and inserting, deleting, and updating records is relatively efficient. The disadvantage is that it is poorly suited to analytical queries over large data sets: even when a query needs only a few columns, every row is read in full, which wastes I/O and lowers query performance.

2) The advantages of a column-store table are fast analytical queries and high storage efficiency. Because data is stored by column, a query reads only the columns it needs, so aggregating or grouping on a single column is cheap. Values in one column share a data type and are often similar, so they also compress well. Column-store tables can also use techniques such as indexes to improve query performance. The disadvantage is that the structure is more complex and harder to understand and operate. Inserting, deleting, or updating a single record is relatively cumbersome, because the values of that record are spread across several columns and must be kept consistent with each other.
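
The storage-efficiency point can be illustrated with a small Python sketch. It uses run-length encoding purely as a simple stand-in for compression; the source does not say which compression scheme GaussDB applies, and the sample values and the helper name run_length_encode are made up for the illustration.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Hypothetical sample data: eight employees with an ID and a sex code.
ids = ["0001", "0002", "0003", "0004", "0005", "0006", "0007", "0008"]
sexes = ["M", "M", "M", "F", "F", "F", "F", "M"]

# Column layout: the sex values sit next to each other, so runs form.
column_runs = run_length_encode(sexes)

# Row layout: ID and sex alternate, so no two neighbours are equal.
row_stream = [value for pair in zip(ids, sexes) for value in pair]
row_runs = run_length_encode(row_stream)

print(column_runs)    # [('M', 3), ('F', 4), ('M', 1)]
print(len(row_runs))  # 16
```

Eight values in the column layout shrink to three runs, while the interleaved row stream of sixteen values does not shrink at all.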

III. How row-store and column-store tables are laid out

GaussDB supports both row storage and column storage. By default, a newly created table is a row-store table. The difference between the two is described below.

1. Row-store tables and how they are stored on disk

In a row-oriented database, the row is the basic logical storage unit: the values of one row are stored contiguously on the storage medium.

2. Column-store tables and how they are stored on disk

In a column-oriented database, the column is the basic logical storage unit: the values of one column are stored contiguously on the storage medium.

As a result, the two kinds of table are laid out differently on disk. In a row-store table, each record occupies one contiguous block of space. In a column-store table, each attribute (column) has its own block of space, and all values of that attribute are stored contiguously within it.
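
The two layouts can be pictured with a toy Python model. This is only a sketch of the ordering of values; it does not reproduce GaussDB's actual file format, and the sample records and function names are invented for the illustration.

```python
# Toy model of value ordering only; not GaussDB's real on-disk format.
records = [
    ("0001", "Alice", 30),
    ("0002", "Bob", 41),
    ("0003", "Carol", 27),
]


def row_layout(rows):
    """Row store: the values of one record sit next to each other."""
    flat = []
    for row in rows:
        flat.extend(row)
    return flat


def column_layout(rows):
    """Column store: the values of one column sit next to each other."""
    flat = []
    for column in zip(*rows):
        flat.extend(column)
    return flat


print(row_layout(records))
# ['0001', 'Alice', 30, '0002', 'Bob', 41, '0003', 'Carol', 27]
print(column_layout(records))
# ['0001', '0002', '0003', 'Alice', 'Bob', 'Carol', 30, 41, 27]
```

Both lists hold the same nine values; only the order in which they would be written differs.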

IV. Recommendations and usage scenarios for row-store and column-store tables

In general, if a table has many fields (a large, wide table) and queries touch only a few of them, column storage is the better fit. If the table has relatively few fields and queries read most of them, row storage is the better choice.
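
A rough way to see why this rule of thumb holds is to count how many values a full-table query has to read under each layout. The Python sketch below is a deliberately simplified cost model, not a description of GaussDB's executor: it assumes a row store reads every column of every row, a column store reads only the queried columns, and it ignores indexes, compression, and caching. The function name values_read and the table sizes are hypothetical.

```python
def values_read(num_rows, total_columns, columns_queried, orientation):
    """Values a full scan must read under a simplified cost model."""
    if orientation == "row":
        # Whole rows are read even if only a few columns are needed.
        return num_rows * total_columns
    if orientation == "column":
        # Only the columns named in the query are read.
        return num_rows * columns_queried
    raise ValueError(f"unknown orientation: {orientation!r}")


# Wide table, few columns queried: column storage reads far less.
wide_row = values_read(1_000_000, 100, 3, "row")
wide_column = values_read(1_000_000, 100, 3, "column")
print(wide_row, wide_column)      # 100000000 3000000

# Narrow table, most columns queried: the gap almost disappears.
narrow_row = values_read(1_000_000, 5, 4, "row")
narrow_column = values_read(1_000_000, 5, 4, "column")
print(narrow_row, narrow_column)  # 5000000 4000000
```

In the narrow case the column store saves little on reads, so the cheaper single-record writes of row storage become the deciding factor.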

1. Row-store table usage scenarios and GaussDB SQL example

Create a row-store table. Row storage is the default, so no extra keyword is needed:

-- Create a row-store table (row storage is the default)
CREATE TABLE test_1
(
EMPLOYEE__ID CHAR(4),
EMPLOYEE_NAME VARCHAR2(10),
EMPLOYEE_SEX CHAR(2),
EMPLOYEE_AGE INT,
EMPLOYEE_SALARY MONEY
);

-- View the definition of the created table
SELECT * FROM PG_GET_TABLEDEF('test_1');

2. Column-store table usage scenarios and GaussDB SQL example

To create a column-store table, add the clause WITH (ORIENTATION = COLUMN):

-- Create a column-store table with the clause WITH (ORIENTATION = COLUMN)
CREATE TABLE test_2
(
EMPLOYEE__ID CHAR(4),
EMPLOYEE_NAME VARCHAR2(10),
EMPLOYEE_SEX CHAR(2),
EMPLOYEE_AGE INT,
EMPLOYEE_SALARY MONEY
)
WITH (ORIENTATION = COLUMN);

-- View the definition of the created table
SELECT * FROM PG_GET_TABLEDEF('test_2');
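
The two statements above differ only in the trailing WITH clause. A small Python helper makes that explicit; this is a hypothetical convenience sketch, not part of GaussDB or any driver. The name build_create_table is invented here, and the function does no quoting or validation of identifiers, so it should only be fed trusted names.

```python
def build_create_table(table_name, columns, column_store=False):
    """Return a CREATE TABLE statement.

    `columns` is a list of (name, sql_type) pairs. Identifiers are
    inserted as-is (no quoting), so pass trusted names only.
    """
    column_lines = ",\n".join(f"    {name} {sql_type}" for name, sql_type in columns)
    statement = f"CREATE TABLE {table_name}\n(\n{column_lines}\n)"
    if column_store:
        # The clause shown in the article for column-store tables.
        statement += "\nWITH (ORIENTATION = COLUMN)"
    return statement + ";"


employee_columns = [
    ("EMPLOYEE_NAME", "VARCHAR2(10)"),
    ("EMPLOYEE_AGE", "INT"),
]

row_ddl = build_create_table("test_1", employee_columns)
column_ddl = build_create_table("test_2", employee_columns, column_store=True)
print(row_ddl)
print(column_ddl)
```

Keeping the orientation as a single flag means the same column list can be reused to create both variants of a table for a side-by-side comparison.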

V. Summary

Row-store tables and column-store tables each have their own advantages and disadvantages and suit different scenarios. GaussDB supports both. In practice, choose the storage method that matches the workload: row storage when queries read most fields of a record or the data changes frequently, column storage when queries analyze a few columns of a wide table. Both are important tools for efficient data management and analysis and are worth studying in depth.

Origin blog.csdn.net/GaussDB/article/details/131973380