SQL Server table partitioning

When a database table grows to tens of millions of rows, disk I/O often becomes the performance bottleneck. If disk I/O throughput can be improved, database performance improves with it. SQL Server provides table partitioning for this purpose.

Table partitioning splits the data of one table across multiple files, which can be placed in different filegroups and even on different disks. From the outside it is still a single table: developers and other users query it exactly as they would an unpartitioned table. Because the filegroups can sit on different disks, concurrent access can be served by several disks at once.

The advantages of table partitioning:

1. Better I/O efficiency. Inserts and queries can be spread across disks, which improves performance. This is the main advantage.

2. Better fault tolerance. When the data in one partition has a problem, the other partitions are not affected.

3. More convenient backup management. The filegroups that hold the partitions can be backed up separately.

Implementing table partitioning is also relatively simple. There are three main steps:

1. Create a partition function.

2. Create a partition scheme.

3. Create a partition table.

The first step is to create a partition function:

CREATE PARTITION FUNCTION [fnPartition](INT) AS RANGE LEFT FOR VALUES (1, 2)

[fnPartition] is the function name, and the type in parentheses (INT) is the data type of the partitioning column. RANGE can be LEFT or RIGHT and decides which side each boundary value belongs to, and VALUES lists the boundary values. With LEFT, as in the example, rows whose partitioning column is less than or equal to 1 go to the first partition, rows greater than 1 and up to 2 go to the second, and rows greater than 2 go to the third. The boundaries 1 and 2 are only examples: you can choose any values, and other types such as dates also work.
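To make the LEFT/RIGHT distinction concrete, here is a small sketch of the same boundaries declared with RANGE RIGHT, plus a date-based variant. The names fnPartitionRight and fnPartitionByDate and the date boundaries are hypothetical and used only for illustration.

```sql
-- Hypothetical function: same boundaries as fnPartition, but RANGE RIGHT,
-- so each boundary value belongs to the partition on its right.
CREATE PARTITION FUNCTION [fnPartitionRight](INT) AS RANGE RIGHT FOR VALUES (1, 2);

-- Partition 1: values < 1; partition 2: values >= 1 and < 2; partition 3: values >= 2.
SELECT $PARTITION.fnPartitionRight(0) AS PartitionForValue0,
       $PARTITION.fnPartitionRight(1) AS PartitionForValue1,
       $PARTITION.fnPartitionRight(2) AS PartitionForValue2;

-- Hypothetical date-based function: one partition per year around two boundaries.
CREATE PARTITION FUNCTION [fnPartitionByDate](DATE) AS RANGE RIGHT
FOR VALUES ('20230101', '20240101');
```

With RANGE RIGHT the value 1 maps to partition 2, whereas with RANGE LEFT in fnPartition it maps to partition 1. RANGE RIGHT is often the more natural choice for date boundaries, because each boundary is then the first value of its partition.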

The second step is to create a partition scheme:

Before creating the partition scheme, we need to create the filegroups and data files that will hold the partitions. A filegroup is created as follows:

ALTER DATABASE Test ADD FILEGROUP fg1

Then we add a data file to the filegroup:

ALTER DATABASE Test ADD FILE 
(NAME=N'fg1',FILENAME=N'F:\Test\fg1.ndf',SIZE=5MB,FILEGROWTH=5MB)
TO FILEGROUP fg1

Repeat these two statements for fg2 and fg3, so that there are three filegroups with one data file each. With those in place, we can create the partition scheme:

CREATE PARTITION SCHEME [SchemaForParirion] AS PARTITION [fnPartition] TO ([fg1], [fg2], [fg3])

[SchemaForParirion] is the scheme name, [fnPartition] is the partition function we created earlier, and fg1, fg2, and fg3 are the three filegroups that store the partitions, in partition order.
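One way to confirm how the scheme maps partitions to filegroups is to query the catalog views. This is a minimal sketch that assumes the scheme and filegroup names used above; the column aliases are arbitrary.

```sql
-- List each partition number of the scheme and the filegroup it is mapped to.
SELECT  ps.name             AS SchemeName,
        dds.destination_id  AS PartitionNumber,
        fg.name             AS FilegroupName
FROM    sys.partition_schemes AS ps
JOIN    sys.destination_data_spaces AS dds
        ON dds.partition_scheme_id = ps.data_space_id
JOIN    sys.filegroups AS fg
        ON fg.data_space_id = dds.data_space_id
WHERE   ps.name = N'SchemaForParirion'
ORDER BY dds.destination_id;
```

If the scheme was created as shown, this should return three rows, with partitions 1 to 3 mapped to fg1, fg2, and fg3.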

The third step is to create a partition table:

CREATE TABLE [dbo].[TestTable](
	[Id] [INT] IDENTITY(1,1) NOT NULL,
	[ProvinceId] [INT] NOT NULL,
	[Data] [NVARCHAR](50) NULL
)ON SchemaForParirion(ProvinceId)

This is an ordinary CREATE TABLE statement with one addition that makes the table a partitioned table: ON SchemaForParirion(ProvinceId). SchemaForParirion is the partition scheme we just defined, and the argument is the partitioning column.
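As a quick check that rows are routed by the partitioning column, a few sample rows can be inserted and mapped back to partition numbers. The sample values are made up for illustration, and the transaction is rolled back so the table is left empty (the identity values used are not reclaimed).

```sql
BEGIN TRANSACTION;

-- Ordinary inserts: no partition is named, SQL Server routes each row by ProvinceId.
INSERT INTO dbo.TestTable (ProvinceId, Data)
VALUES (1, N'sample a'), (2, N'sample b'), (3, N'sample c');

-- Show which partition each sample row landed in.
SELECT  Id,
        ProvinceId,
        $PARTITION.fnPartition(ProvinceId) AS PartitionNumber
FROM    dbo.TestTable
ORDER BY ProvinceId;

-- Undo the sample rows.
ROLLBACK TRANSACTION;
```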

There is one more thing to note: if you create a primary key or unique index on the table, the partitioning column must be part of that key. So if the partitioning column is not part of the primary key or unique index you have in mind, either add it to the key or leave the key off when creating the table. Indexes do improve query efficiency, so weigh the trade-off for your own situation.
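SQL Server requires the partitioning column to be part of the key of any unique index that is partitioned along with the table, so one option is a composite primary key that includes it. The sketch below assumes the partition scheme above; the table name TestTable_PK and the constraint name are hypothetical.

```sql
-- Hypothetical variant of TestTable with a primary key.
-- The key includes ProvinceId so the index can be partitioned on the same scheme.
CREATE TABLE [dbo].[TestTable_PK](
    [Id] [INT] IDENTITY(1,1) NOT NULL,
    [ProvinceId] [INT] NOT NULL,
    [Data] [NVARCHAR](50) NULL,
    CONSTRAINT [PK_TestTable_PK] PRIMARY KEY CLUSTERED ([Id], [ProvinceId])
) ON SchemaForParirion(ProvinceId);
```

The trade-off is that uniqueness is then enforced on the pair (Id, ProvinceId) rather than on Id alone.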

A few more useful queries are worth noting:

1. Find the partition that a given value of the partitioning column maps to. For example, to see which partition stores rows whose ProvinceId is 2:

SELECT  $partition.fnPartition(2)

The result is 2: rows whose ProvinceId is 2 are stored in the second partition.

2. View the number of rows in each non-empty partition in the partition table:

SELECT  $partition.fnPartition(ProvinceId) AS Num ,
        COUNT(*) AS recordCount
FROM    TestTable
GROUP BY $partition.fnPartition(ProvinceId)

In this test, the result shows three partitions, one for each ProvinceId, each holding 10 million rows.
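The GROUP BY query scans the table and omits empty partitions. As an alternative sketch, the sys.partitions catalog view lists every partition, including empty ones, without reading the data. Its rows column is documented as approximate.

```sql
-- Row count per partition from metadata (index_id 0 = heap, 1 = clustered index).
SELECT  p.partition_number,
        p.rows AS ApproximateRowCount
FROM    sys.partitions AS p
WHERE   p.object_id = OBJECT_ID(N'dbo.TestTable')
        AND p.index_id IN (0, 1)
ORDER BY p.partition_number;
```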

3. View the rows in a specified partition. For example, to query partition 1, which holds the rows whose ProvinceId is 1:

SELECT  *
FROM    TestTable
WHERE   $partition.fnPartition(ProvinceId) = 1

Now let's look at the effect of table partitioning. TestTable is partitioned, and its rows are stored in three files according to whether the province ID (ProvinceId) is 1, 2, or 3. Each file holds 10 million rows, 30 million in total. Rows are loaded with ordinary INSERT statements, and SQL Server decides which file each row goes to.

TestTable_1 has the same structure as TestTable but is not partitioned. It also holds 30 million rows, with a ProvinceId of 123.

Next, let's compare query results. The results are for reference only. The queries against TestTable are as follows:

SELECT TOP 10000 * FROM dbo.TestTable WHERE ProvinceId = 1 AND Id>0
SELECT TOP 10000 * FROM dbo.TestTable WHERE ProvinceId = 2 AND Id>10000000
SELECT TOP 10000 * FROM dbo.TestTable WHERE ProvinceId = 3 AND Id>20000000

The statements that read from the TestTable_1 table are as follows:

SELECT TOP 10000 * FROM dbo.TestTable_1 WHERE ProvinceId = 123 AND Id>0
SELECT TOP 10000 * FROM dbo.TestTable_1 WHERE ProvinceId = 123 AND Id>10000000
SELECT TOP 10000 * FROM dbo.TestTable_1 WHERE ProvinceId = 123 AND Id>20000000

Although these statements are not a rigorous comparison, the partitioned table did read faster than the ordinary table in this test. So for a single table holding a very large amount of data, this method is worth trying.
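To reproduce such a comparison on your own data, one option is to turn on SQL Server's I/O and timing statistics around the two queries. This is only a sketch of how the measurement could be taken, not the procedure used for the test described here.

```sql
-- Report logical/physical reads and CPU/elapsed time for each statement.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Partitioned table.
SELECT TOP 10000 * FROM dbo.TestTable WHERE ProvinceId = 1 AND Id > 0;

-- Unpartitioned table.
SELECT TOP 10000 * FROM dbo.TestTable_1 WHERE ProvinceId = 123 AND Id > 0;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The figures appear in the Messages output. Running each query more than once helps separate cold-cache reads from warm-cache reads.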

