Detailed explanation of SQL indexes

  SQL databases have two kinds of index: clustered and non-clustered. The similarities and differences between the two are described below.

First, clustered and non-clustered indexes:

  1. Clustered index:
  A clustered index means that the rows themselves are stored in index order. For example, in a table with an auto-incrementing primary key, the row with id 1 is stored first, the row with id 2 second, and so on. As a simplified picture, imagine the rows held in an array: to find the 100th row, take the address of the first row and step forward 99 rows, and you arrive at the 100th row in a single lookup. (In practice the rows live in the leaf pages of a B+ tree rather than in a flat array, but the idea is the same: the position of a row is determined by its key.)
  Because the rows of a table can be physically arranged in only one order, each table can have only one clustered index. In MySQL (InnoDB), you cannot declare a clustered index yourself: the primary key is the clustered index. If no primary key is defined, InnoDB uses the first UNIQUE index whose columns are all NOT NULL. If there is no such index either, it automatically generates a hidden row-id column and clusters on that.
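  The three cases above can be sketched with three small tables. This is only an illustration under default InnoDB settings; the table names t_pk, t_uk, and t_none and the index name uk_code are made up for the example.

```sql
-- Case 1: an explicit primary key. InnoDB clusters the rows on id.
CREATE TABLE t_pk (
    id INT NOT NULL AUTO_INCREMENT,
    name VARCHAR(50),
    PRIMARY KEY (id)
) ENGINE=InnoDB;

-- Case 2: no primary key, but a UNIQUE index on a NOT NULL column.
-- InnoDB uses that unique index as the clustered index.
CREATE TABLE t_uk (
    code INT NOT NULL,
    name VARCHAR(50),
    UNIQUE KEY uk_code (code)
) ENGINE=InnoDB;

-- Case 3: neither a primary key nor a suitable unique index.
-- InnoDB clusters on an internally generated hidden row id.
CREATE TABLE t_none (
    name VARCHAR(50)
) ENGINE=InnoDB;
```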
  In MySQL, then, the primary key we create is normally the clustered index, and the rows are arranged in primary key order, so queries by primary key are very fast.

  2. Non-clustered index:
  A non-clustered index can be thought of as an ordered directory; it trades space for time. For example, suppose a user table has a column id_num (an identity number) that is not the primary key id. With respect to id_num the rows are stored in no particular order: the row with id 1 has id_num 100, id 2 has id_num 97, id 3 has id_num 98, id 4 has id_num 99, id 5 has id_num 96, ..., id 67 has id_num 56, ...
  To find the person whose id_num is 56, the database can only check the rows one by one. With n rows that takes up to n comparisons, so the time complexity is O(n), which is expensive.
  The fix is to add a non-clustered index on id_num. The index keeps the id_num values in sorted order (internally it is a B+ tree), and each entry points back to its row. A query then only needs to search this directory (that is, the B+ tree) to learn quickly that id_num 56 belongs to the row with id 67, without traversing all the data in the table.

  Therefore, the fewer duplicate values an indexed column contains, the more efficient a non-clustered index on it is.
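  The id_num example can be tried out with EXPLAIN, which reports how MySQL plans to run a query. The sketch below assumes a hypothetical table user_demo and index name idx_id_num; the exact plan depends on the MySQL version and the table statistics, so treat the comments as what one would typically expect rather than guaranteed output.

```sql
-- Hypothetical table holding the sample rows from the text.
CREATE TABLE user_demo (
    id INT NOT NULL AUTO_INCREMENT,
    id_num INT NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO user_demo (id, id_num) VALUES
    (1, 100), (2, 97), (3, 98), (4, 99), (5, 96), (67, 56);

-- Without an index on id_num there is nothing to search by,
-- so the plan is typically a full scan (type = ALL).
EXPLAIN SELECT id FROM user_demo WHERE id_num = 56;

-- Add the non-clustered (secondary) index.
CREATE INDEX idx_id_num ON user_demo (id_num);

-- The same query can now be answered through idx_id_num
-- (typically type = ref, key = idx_id_num).
EXPLAIN SELECT id FROM user_demo WHERE id_num = 56;
```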
 

Second, index operations:

  The indexes we normally create ourselves are non-clustered indexes. The following describes how to work with them:

1. Create an index:
1.1. Create a common index:
Syntax:
CREATE INDEX index_name ON table_name (column_1, column_2, ...);
or, by modifying the table:
ALTER TABLE table_name ADD INDEX index_name (column_1, column_2, ...);
or, by specifying the index when creating the table:
CREATE TABLE table_name ( [...], INDEX index_name (column_1, column_2, ...) );

eg:
CREATE INDEX name_index ON index_test(name);
This creates an index named name_index on the name column of the index_test table.

The test table is:
CREATE TABLE index_test (
    id INT NOT NULL,
    name VARCHAR(50),
    idNum INT,
    PRIMARY KEY (id)
);
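
For comparison, the other two forms might look like this on the same schema. The table name index_test_copy and the index name name_idnum_index are hypothetical and exist only for this illustration.

```sql
-- ALTER TABLE form: equivalent to the CREATE INDEX example.
-- Run only one of the two, since an index name must be unique within a table.
ALTER TABLE index_test ADD INDEX name_index (name);

-- CREATE TABLE form: declare the index together with the table.
CREATE TABLE index_test_copy (
    id INT NOT NULL,
    name VARCHAR(50),
    idNum INT,
    PRIMARY KEY (id),
    INDEX name_idnum_index (name, idNum)  -- an index over two columns
);
```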

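1.2. Create a unique index:
A unique index does not allow duplicate values. If a column is guaranteed never to repeat, for example an ID card number, the index on it can be declared UNIQUE.
Any of the following three forms creates a unique index:
  1. Create the index: CREATE UNIQUE INDEX index_name ON table_name (column_list);
  2. Add the index to a table: ALTER TABLE table_name ADD UNIQUE index_name (column_list);
  3. Specify the index when creating the table: CREATE TABLE table_name ( [...], UNIQUE index_name (column_list) );
eg:
 CREATE UNIQUE INDEX id_num_index ON index_test(idNum);
which can also be written as:
 ALTER TABLE index_test ADD UNIQUE id_num_index(idNum);
This creates a unique index named id_num_index on the idNum column of the index_test table.

Once the unique index exists, the column can no longer hold duplicate values. For example, inserting a row with a duplicate value now fails with:
Error Code: 1062. Duplicate entry '3' for key 'id_num_index'
That is, a duplicate was detected on the unique index id_num_index.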
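
A minimal way to reproduce this error, assuming index_test is empty and id_num_index has already been created (the names and values in the rows are made up):

```sql
INSERT INTO index_test (id, name, idNum) VALUES (1, 'alice', 3);  -- succeeds
INSERT INTO index_test (id, name, idNum) VALUES (2, 'bob', 3);    -- rejected with error 1062

-- idNum is nullable, and a MySQL unique index permits several NULLs,
-- so both of these rows are accepted.
INSERT INTO index_test (id, name, idNum) VALUES (3, 'carol', NULL);
INSERT INTO index_test (id, name, idNum) VALUES (4, 'dave', NULL);
```

If the column must be both unique and always present, declare it NOT NULL as well.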

2. Delete an index:
Either of the following two forms deletes an index:

DROP INDEX index_name ON table_name;
ALTER TABLE table_name DROP INDEX index_name;

eg:
DROP INDEX name_index ON index_test;
This deletes the name_index index on the index_test table.
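
The ALTER TABLE form is handy when several indexes have to go at once. The sketch below assumes both name_index and id_num_index currently exist on index_test.

```sql
-- Drop two indexes in a single ALTER TABLE statement.
ALTER TABLE index_test
    DROP INDEX name_index,
    DROP INDEX id_num_index;
```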

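3. View indexes:
    SHOW INDEX FROM index_test;
This returns all indexes on the index_test table.

The returned fields are:

Table: the name of the table.
Non_unique: whether the index may contain duplicates; 0 means unique, 1 means not unique.
Key_name: the name of the index.
Seq_in_index: the sequence number of the column within the index, starting from 1.
Column_name: the column name.
Collation: how the column is sorted in the index. In MySQL the value is 'A' (ascending), 'D' (descending), or NULL (not sorted).
Cardinality: an estimate of the number of unique values in the index. It can be updated by running ANALYZE TABLE or myisamchk -a. Cardinality is counted from statistics stored as integers, so the value is not necessarily exact even for small tables. The higher the cardinality, the more likely MySQL is to use the index when performing joins.
Sub_part: the number of indexed characters if the column is only partly indexed; NULL if the entire column is indexed.
Packed: indicates how the key is packed; NULL if it is not packed.
Null: YES if the column may contain NULL values; otherwise NO (recent MySQL versions show an empty string instead).
Index_type: the index method used (BTREE, FULLTEXT, HASH, RTREE).
Comment: further remarks.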
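
When only some of these fields are of interest, the same information can be read from the information_schema.STATISTICS table. The following is a sketch that assumes index_test lives in the currently selected database.

```sql
-- Refresh the index statistics so that CARDINALITY is up to date.
ANALYZE TABLE index_test;

-- Pick out a few of the fields described above.
SELECT INDEX_NAME, NON_UNIQUE, SEQ_IN_INDEX, COLUMN_NAME, CARDINALITY, INDEX_TYPE
FROM information_schema.STATISTICS
WHERE TABLE_SCHEMA = DATABASE()
  AND TABLE_NAME = 'index_test'
ORDER BY INDEX_NAME, SEQ_IN_INDEX;
```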

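Third, principles for choosing indexes:

  Non-clustered indexes have to be updated whenever rows are inserted, deleted, or modified, so indexes should be created according to a few principles:

  1. Columns that are frequently used as query conditions should be indexed.
  2. Columns with too many duplicate values are not suitable for an index of their own, even if they are frequently used as query conditions.
  3. Columns that never appear in a WHERE clause should not be indexed.

 Two further points help decide whether an index is worthwhile:
  1. If the table has few records, say only a few hundred or a thousand, there is no need to build an index; a full table scan is enough.
  2. The larger the share of distinct values in a column, the more valuable an index on it is; conversely, a column with a small share is a poor candidate. The share of distinct values can be checked with the following SQL statement:
    SELECT count(DISTINCT(name))/count(*) AS Selectivity FROM index_test;

  For example, the statement above computes the proportion of distinct values in the name column of the index_test table. The ratio lies in (0,1]; the larger it is, the more the column deserves an index.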
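
  To compare several candidate columns at once, the same ratio can be computed for each of them in a single query. This is a sketch; the aliases total_rows, name_selectivity, and idnum_selectivity are made up for the example.

```sql
SELECT
    COUNT(*) AS total_rows,                                  -- helps judge whether the table is large enough to matter
    COUNT(DISTINCT name)  / COUNT(*) AS name_selectivity,
    COUNT(DISTINCT idNum) / COUNT(*) AS idnum_selectivity
FROM index_test;
```

  Note that COUNT(DISTINCT col) ignores NULL values, so a column that is mostly NULL will show a low ratio.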

