Index Storage

Since SQL Server 2008, two storage optimizations are available for indexes: row compression and page compression. Both shrink the space the data occupies on disk. Less space means fewer pages, and fewer pages means fewer I/O requests during a query.

1. Row Compression

Row compression reduces the size of each row by changing the format in which the row is stored. It can be used on a heap or a B-tree. When row compression is enabled, the following changes are made:

The metadata stored with each row is reduced

Fixed-length character data is stored in variable-length format

Numeric data types are also stored in variable-length format
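
The effect of variable-length storage is easiest to see on a table that wastes space on purpose. The following is a minimal sketch, not part of the original test: dbo.FixedWidthDemo, its columns, and the row count are hypothetical and chosen only for illustration. It fills a CHAR(200) column and a BIGINT column with small values, rebuilds the heap with row compression, and reports the page count before and after.

```sql
-- Hypothetical demo table; the name and columns are illustrative only.
IF OBJECT_ID('dbo.FixedWidthDemo') IS NOT NULL
    DROP TABLE dbo.FixedWidthDemo;

CREATE TABLE dbo.FixedWidthDemo
(
    Id     INT       NOT NULL,
    Code   CHAR(200) NOT NULL,  -- fixed length, mostly padding for short values
    Amount BIGINT    NOT NULL   -- 8 bytes uncompressed, even for small numbers
);

-- Load 10,000 rows holding a short string and a small number.
INSERT INTO dbo.FixedWidthDemo (Id, Code, Amount)
SELECT TOP (10000)
        ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
        'ABC',
        1
FROM    sys.all_objects AS a
        CROSS JOIN sys.all_objects AS b;

-- Pages used before compression.
SELECT  in_row_used_page_count AS pages_before
FROM    sys.dm_db_partition_stats
WHERE   object_id = OBJECT_ID('dbo.FixedWidthDemo');

-- Rebuild the heap with row compression.
ALTER TABLE dbo.FixedWidthDemo REBUILD WITH (DATA_COMPRESSION = ROW);

-- Pages used after compression.
SELECT  in_row_used_page_count AS pages_after
FROM    sys.dm_db_partition_stats
WHERE   object_id = OBJECT_ID('dbo.FixedWidthDemo');
```

The second page count should come out much smaller than the first, because after the rebuild neither the padding in Code nor the unused bytes of Amount are stored.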

--Create two test tables for comparison

IF OBJECT_ID('dbo.NoCompression') IS NOT NULL
    DROP TABLE dbo.NoCompression
IF OBJECT_ID('dbo.RowCompression') IS NOT NULL
    DROP TABLE dbo.RowCompression
SELECT  SalesOrderID ,
        SalesOrderDetailID ,
        CarrierTrackingNumber ,
        OrderQty ,
        ProductID ,
        SpecialOfferID,
        UnitPrice ,
        UnitPriceDiscount ,
        LineTotal ,
        rowguid ,
        ModifiedDate
INTO    dbo.NoCompression
FROM    Sales.SalesOrderDetail
SELECT  SalesOrderID ,
        SalesOrderDetailID ,
        CarrierTrackingNumber ,
        OrderQty ,
        ProductID ,
        SpecialOfferID,
        UnitPrice ,
        UnitPriceDiscount ,
        LineTotal ,
        rowguid ,
        ModifiedDate
INTO    dbo.RowCompression
FROM    Sales.SalesOrderDetail

Compression is set with the DATA_COMPRESSION option of a CREATE INDEX or ALTER INDEX statement, and it can be used on both clustered and nonclustered indexes.

The following statements create one clustered index without compression and one with row compression. In this test, row compression saves about 33% of the space.

-- no compression
CREATE CLUSTERED INDEX CLIX_NoCompression ON dbo.NoCompression
(SalesOrderID, SalesOrderDetailID);
-- row compression
CREATE CLUSTERED INDEX CLIX_RowCompression ON dbo.RowCompression
(SalesOrderID, SalesOrderDetailID)
WITH (DATA_COMPRESSION = ROW);
-- check pages used
SELECT  OBJECT_NAME(object_id) AS table_name ,
        in_row_reserved_page_count
FROM    sys.dm_db_partition_stats
WHERE   object_id IN ( OBJECT_ID('dbo.NoCompression'),
                       OBJECT_ID('dbo.RowCompression') )

 Number of pages with and without row compression:

Compression not only reduces storage space but also improves query performance, because a query has fewer data pages to read.

SET STATISTICS IO ON

SELECT  SalesOrderID ,
        SalesOrderDetailID ,
        CarrierTrackingNumber
FROM    dbo.NoCompression
WHERE   SalesOrderID BETWEEN 51500 AND 52000

SELECT  SalesOrderID ,
        SalesOrderDetailID ,
        CarrierTrackingNumber
FROM    dbo.RowCompression
WHERE   SalesOrderID BETWEEN 51500 AND 52000

SET STATISTICS IO OFF
                       

When using compression, consider the following:

1. Compression pays off mainly on large tables and indexes.

2. If the maximum row size exceeds 8,060 bytes, the table cannot be compressed.

3. Nonclustered indexes do not inherit the compression setting of the heap or clustered index. Compression must be set on each index individually.

4. Compressing and decompressing data costs CPU, so compression is a poor fit for data that is modified very frequently.
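
Point 3 can be checked directly. The sketch below assumes the dbo.RowCompression table and its clustered index from the earlier test still exist; the index name IX_RowCompression_ProductID is hypothetical. It creates a nonclustered index without a DATA_COMPRESSION option, lists the compression setting of every index on the table, and then compresses the new index explicitly.

```sql
-- Hypothetical index name; created without a DATA_COMPRESSION option.
CREATE NONCLUSTERED INDEX IX_RowCompression_ProductID
    ON dbo.RowCompression (ProductID);

-- Show the compression setting of every index on the table.
SELECT  i.name AS index_name,
        p.data_compression_desc
FROM    sys.indexes AS i
        INNER JOIN sys.partitions AS p
            ON  p.object_id = i.object_id
            AND p.index_id  = i.index_id
WHERE   i.object_id = OBJECT_ID('dbo.RowCompression');

-- Compress the nonclustered index explicitly.
ALTER INDEX IX_RowCompression_ProductID ON dbo.RowCompression
    REBUILD WITH (DATA_COMPRESSION = ROW);
```

The first query is expected to report ROW for the clustered index and NONE for the new nonclustered index. Running it again after the rebuild should show ROW for both.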

2. Page Compression

Page compression can also be used on heap and B-tree structures. It usually saves more space than row compression, because it combines row compression, prefix compression, and dictionary compression.

Page compression applies row compression first. It then compresses values that share a common prefix within a column on the page, and finally replaces values that repeat anywhere on the page with references to a page-level dictionary.
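
Before rebuilding a large table with page compression, the expected saving can be estimated with the system procedure sp_estimate_data_compression_savings. This is a minimal sketch, assuming the dbo.NoCompression table from the row compression test still exists; passing NULL for the index and partition asks for all of them.

```sql
-- Estimate the size of dbo.NoCompression if it were rebuilt with page compression.
EXEC sys.sp_estimate_data_compression_savings
     @schema_name      = 'dbo',
     @object_name      = 'NoCompression',
     @index_id         = NULL,   -- all indexes
     @partition_number = NULL,   -- all partitions
     @data_compression = 'PAGE';
```

The procedure works from a sample of the data, so the sizes it returns are estimates, not the exact result of a rebuild.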

--Create test table:

IF OBJECT_ID('dbo.PageCompression') IS NOT NULL
    DROP TABLE dbo.PageCompression
SELECT SalesOrderID
    ,SalesOrderDetailID
    ,CarrierTrackingNumber
    ,OrderQty
    ,ProductID
    , SpecialOfferID
    ,UnitPrice
    ,UnitPriceDiscount
    ,LineTotal
    ,rowguid
    ,ModifiedDate
INTO dbo.PageCompression
FROM Sales.SalesOrderDetail

Create the clustered index with page compression and check the pages used:

CREATE CLUSTERED INDEX CLIX_PageCompression ON dbo.PageCompression
(SalesOrderID, SalesOrderDetailID)
WITH (DATA_COMPRESSION = PAGE);
SELECT  OBJECT_NAME(object_id) AS table_name ,
        in_row_reserved_page_count
FROM    sys.dm_db_partition_stats
WHERE   object_id IN ( OBJECT_ID('dbo.NoCompression'),
                       OBJECT_ID('dbo.PageCompression') )
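
To see all three variants side by side, the page counts can be joined to the compression setting. This is a sketch that assumes the three test tables created so far (dbo.NoCompression, dbo.RowCompression, dbo.PageCompression) still exist; it only reads catalog and DMV data.

```sql
-- Page count and compression setting for every index of the three test tables.
SELECT  OBJECT_NAME(ps.object_id) AS table_name,
        ps.index_id,
        p.data_compression_desc,
        ps.in_row_reserved_page_count
FROM    sys.dm_db_partition_stats AS ps
        INNER JOIN sys.partitions AS p
            ON p.partition_id = ps.partition_id
WHERE   ps.object_id IN ( OBJECT_ID('dbo.NoCompression'),
                          OBJECT_ID('dbo.RowCompression'),
                          OBJECT_ID('dbo.PageCompression') )
ORDER BY ps.in_row_reserved_page_count DESC;
```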

 

Execute the query:

SET STATISTICS IO ON

SELECT  SalesOrderID ,
        SalesOrderDetailID ,
        CarrierTrackingNumber
FROM    dbo.PageCompression
WHERE   SalesOrderID BETWEEN 51500 AND 52000

SET STATISTICS IO OFF

Indexed views:

A view is a common way to limit, for permission reasons, how much data a query can return. For static data that is only queried, turning the view into an indexed view is also a good solution.

Without an indexed view:

SET STATISTICS IO ON
SELECT  psc.Name ,
        SUM(sod.LineTotal) AS SumLineTotal ,
        SUM(sod.OrderQty) AS SumOrderQty ,
        AVG(sod.UnitPrice) AS AvgUnitPrice
FROM    Sales.SalesOrderDetail sod
        INNER JOIN Production.Product p ON sod.ProductID = p.ProductID
        INNER JOIN Production.ProductSubcategory psc ON p.ProductSubcategoryID = psc.ProductSubcategoryID
GROUP BY psc.Name
ORDER BY psc.Name
SET STATISTICS IO OFF

 

 

Create an indexed view:

CREATE VIEW dbo.ProductSubcategorySummry
-- SCHEMABINDING is required for an indexed view
WITH SCHEMABINDING
AS
SELECT  psc.Name ,
        SUM(sod.LineTotal) AS SumLineTotal ,
        SUM(sod.OrderQty) AS SumOrderQty ,
        -- AVG() is not allowed in an indexed view, so store the sum and the row count
        SUM(sod.UnitPrice) AS TotalUnitPrice ,
        COUNT_BIG(*) AS Occurances
FROM    Sales.SalesOrderDetail sod
        INNER JOIN Production.Product p ON sod.ProductID = p.ProductID
        INNER JOIN Production.ProductSubcategory psc ON p.ProductSubcategoryID = psc.ProductSubcategoryID
GROUP BY psc.Name
GO
-- create a clustered index
CREATE UNIQUE CLUSTERED INDEX CLIX_ProductSubcategorySummay ON dbo.ProductSubcategorySummry(Name)
GO

SET STATISTICS IO ON
SELECT  Name ,
        SumLineTotal ,
        SumOrderQty ,
        TotalUnitPrice / Occurances AS AvgUnitPrice
FROM    dbo.ProductSubcategorySummry
ORDER BY Name
SET STATISTICS IO OFF

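Logical reads drop sharply once the query reads from the indexed view.

Indexed views are very effective when several tables are repeatedly joined into one result. The joined and aggregated rows are already stored, so the query no longer has to read the base tables, which reduces I/O requests.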

Restrictions for indexed views:

1. All expressions in the view must be deterministic

2. The view must be created with the SCHEMABINDING option

3. The first index on the view must be a unique clustered index

4. Referenced tables must be written with their schema name (schema.table)

5. Some aggregate functions, such as AVG(), cannot be used in an indexed view; store SUM() and COUNT_BIG(*) instead and compute the average when querying
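
The restrictions above can be sketched on a smaller, single-table view. Everything named here is hypothetical and not part of the original example: the view dbo.ProductOrderSummary, its index, and the RowCnt column. The SET statements list the session options that indexed views require, GO assumes a client such as SSMS or sqlcmd that treats it as a batch separator, and the NOEXPAND hint asks the query to read the view's index directly, which matters on editions where the optimizer does not match indexed views on its own.

```sql
-- Session options required when creating and using indexed views.
SET NUMERIC_ROUNDABORT OFF;
SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT,
    QUOTED_IDENTIFIER, ANSI_NULLS ON;
GO

-- Hypothetical view name; not part of the original example.
IF OBJECT_ID('dbo.ProductOrderSummary', 'V') IS NOT NULL
    DROP VIEW dbo.ProductOrderSummary;
GO

CREATE VIEW dbo.ProductOrderSummary
WITH SCHEMABINDING                           -- restriction 2
AS
SELECT  sod.ProductID,
        SUM(sod.UnitPrice) AS TotalUnitPrice, -- stands in for AVG(), restriction 5
        COUNT_BIG(*)       AS RowCnt          -- required together with GROUP BY
FROM    Sales.SalesOrderDetail AS sod         -- schema-qualified, restriction 4
GROUP BY sod.ProductID;
GO

-- Restriction 3: the first index must be unique and clustered.
CREATE UNIQUE CLUSTERED INDEX CLIX_ProductOrderSummary
    ON dbo.ProductOrderSummary (ProductID);
GO

-- Compute the average at query time from the stored sum and count.
SELECT  ProductID,
        TotalUnitPrice / RowCnt AS AvgUnitPrice
FROM    dbo.ProductOrderSummary WITH (NOEXPAND)
ORDER BY ProductID;
```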

 
