SQL Query Performance Tuning in MySQL

Speed up SQL queries with indexes in MySQL. Install MySQL, analyze queries, and use stored procedures to get better results.

In this article, we will see how indexing table columns can improve the response time of SQL queries. We will go through the steps to install MySQL, create a stored procedure, analyze queries, and understand the impact of indexes.
I am using MySQL version 8 on Ubuntu, with DBeaver as the MySQL client to connect to the MySQL server. So let's learn together.
I use MySQL for the demonstration; however, the concept remains the same in other databases.

1. MySQL can be installed and accessed as the root user as follows. This MySQL instance is only for testing, so I use a simple password.

$ sudo apt install mysql-server
$ sudo systemctl start mysql.service
$ sudo mysql
mysql> SET GLOBAL validate_password.policy = 0;
mysql> ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY 'password';
mysql> exit
$ mysql -uroot -ppassword

2. Create a database and switch to it.

mysql> create database testdb;

mysql> show databases;

mysql> use testdb;

3. Create two tables, employee1 and employee2. Here, employee1 has no primary key and employee2 has a primary key on the id column.

mysql> CREATE TABLE employee1 (id int,LastName varchar(255),FirstName varchar(255),Address varchar(255),profile varchar(255));
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE employee2 (id int primary key,LastName varchar(255),FirstName varchar(255),Address varchar(255),profile varchar(255));
Query OK, 0 rows affected (0.02 sec)

mysql> show tables;
+------------------+
| Tables_in_testdb |
+------------------+
| employee1        |
| employee2        |
+------------------+
2 rows in set (0.00 sec)

4. Now, if we check the indexes on each table, we find that the id column of employee2 already has an index because it is the primary key.

mysql> SHOW INDEXES FROM employee1 \G
Empty set (0.00 sec)

mysql> SHOW INDEXES FROM employee2 \G
*************************** 1. row ***************************
        Table: employee2
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: id
    Collation: A
  Cardinality: 0
     Sub_part: NULL
       Packed: NULL
         Null: 
   Index_type: BTREE
      Comment: 
Index_comment: 
      Visible: YES
   Expression: NULL
1 row in set (0.00 sec)
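The same behaviour can be tried without a MySQL server. The following is only a sketch, assuming Python's built-in sqlite3 module as a stand-in: SQLite is a different engine, PRAGMA index_list is its rough counterpart of SHOW INDEXES, and the helper name index_names is made up for this example. The tables are trimmed to two columns for brevity.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee1 (id int, FirstName varchar(255))")
conn.execute("CREATE TABLE employee2 (id int PRIMARY KEY, FirstName varchar(255))")


def index_names(table):
    """Return the names of all indexes SQLite keeps for `table` (hypothetical helper)."""
    # PRAGMA index_list yields one row per index; column 1 holds the index name.
    return [row[1] for row in conn.execute(f"PRAGMA index_list({table})")]


print("employee1:", index_names("employee1"))  # no primary key, so no index
print("employee2:", index_names("employee2"))  # index created automatically for the primary key
```

As in MySQL, the table without a primary key has no index, while the primary key gets one automatically.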

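5. Now, create a stored procedure to bulk-insert data into both tables. We insert 20,000 records into each table. The procedure can then be run with the command CALL procedure-name. In the mysql command-line client, change the statement delimiter first (DELIMITER //) so that the semicolons inside the procedure body do not end the CREATE PROCEDURE statement early; graphical clients may not need this.

mysql> DELIMITER //
mysql> CREATE PROCEDURE testdb.BulkInsert()
BEGIN
    -- Loop counter, also used as the row id
    DECLARE i INT DEFAULT 1;
    -- Empty both tables so the procedure can be re-run
    TRUNCATE TABLE testdb.employee1;
    TRUNCATE TABLE testdb.employee2;
    WHILE (i <= 20000) DO
        INSERT INTO testdb.employee1 (id, FirstName, Address) VALUES (i, CONCAT('user', '-', i), CONCAT('address', '-', i));
        INSERT INTO testdb.employee2 (id, FirstName, Address) VALUES (i, CONCAT('user', '-', i), CONCAT('address', '-', i));
        SET i = i + 1;
    END WHILE;
END //
mysql> DELIMITER ;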

mysql> CALL testdb.BulkInsert() ;

mysql> SELECT COUNT(*) from employee1 e ;
COUNT(*)|
--------+
    20000|
    

mysql> SELECT COUNT(*) from employee2 e ;
COUNT(*)|
--------+
    20000|
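For readers who prefer to load test data from a script, the loop in the stored procedure can be sketched in Python. This assumes the built-in sqlite3 module as a stand-in database rather than MySQL, and the function name bulk_insert is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server (assumption)
COLUMNS = "LastName varchar(255), FirstName varchar(255), Address varchar(255), profile varchar(255)"
conn.execute(f"CREATE TABLE employee1 (id int, {COLUMNS})")
conn.execute(f"CREATE TABLE employee2 (id int PRIMARY KEY, {COLUMNS})")


def bulk_insert(conn, n=20000):
    """Empty both tables, then insert n generated rows into each (hypothetical helper)."""
    rows = [(i, f"user-{i}", f"address-{i}") for i in range(1, n + 1)]
    with conn:  # one transaction: committed on success, rolled back on error
        for table in ("employee1", "employee2"):
            conn.execute(f"DELETE FROM {table}")
            conn.executemany(
                f"INSERT INTO {table} (id, FirstName, Address) VALUES (?, ?, ?)", rows
            )


bulk_insert(conn)
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("employee1", "employee2")
}
print(counts)
```

Passing all rows to executemany inside a single transaction avoids a commit per row, which is usually what makes row-by-row loading slow.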

6. Now, if we select any record by a random id, we find that employee1 takes longer to respond (0.03 to 0.04 sec, against 0.00 sec for employee2) because it has no index.

mysql> select * from employee2 where id = 15433;
+-------+----------+------------+---------------+---------+
| id    | LastName | FirstName  | Address       | profile |
+-------+----------+------------+---------------+---------+
| 15433 | NULL     | user-15433 | address-15433 | NULL    |
+-------+----------+------------+---------------+---------+
1 row in set (0.00 sec)

mysql> select * from employee1 where id = 15433;
+-------+----------+------------+---------------+---------+
| id    | LastName | FirstName  | Address       | profile |
+-------+----------+------------+---------------+---------+
| 15433 | NULL     | user-15433 | address-15433 | NULL    |
+-------+----------+------------+---------------+---------+
1 row in set (0.03 sec)

mysql> select * from employee1 where id = 19728;
+-------+----------+------------+---------------+---------+
| id    | LastName | FirstName  | Address       | profile |
+-------+----------+------------+---------------+---------+
| 19728 | NULL     | user-19728 | address-19728 | NULL    |
+-------+----------+------------+---------------+---------+
1 row in set (0.03 sec)

mysql> select * from employee2 where id = 19728;
+-------+----------+------------+---------------+---------+
| id    | LastName | FirstName  | Address       | profile |
+-------+----------+------------+---------------+---------+
| 19728 | NULL     | user-19728 | address-19728 | NULL    |
+-------+----------+------------+---------------+---------+
1 row in set (0.00 sec)

mysql> select * from employee1 where id = 3456;
+------+----------+-----------+--------------+---------+
| id   | LastName | FirstName | Address      | profile |
+------+----------+-----------+--------------+---------+
| 3456 | NULL     | user-3456 | address-3456 | NULL    |
+------+----------+-----------+--------------+---------+
1 row in set (0.04 sec)

mysql> select * from employee2 where id = 3456;
+------+----------+-----------+--------------+---------+
| id   | LastName | FirstName | Address      | profile |
+------+----------+-----------+--------------+---------+
| 3456 | NULL     | user-3456 | address-3456 | NULL    |
+------+----------+-----------+--------------+---------+
1 row in set (0.00 sec)
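A rough way to repeat this comparison from a script is sketched below. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, uses trimmed three-column tables, and the helper timed_lookup is invented for the example. Absolute timings will differ from the MySQL figures above and vary between runs; only the gap between the two tables is of interest.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server (assumption)
conn.execute("CREATE TABLE employee1 (id int, FirstName varchar(255), Address varchar(255))")
conn.execute("CREATE TABLE employee2 (id int PRIMARY KEY, FirstName varchar(255), Address varchar(255))")
rows = [(i, f"user-{i}", f"address-{i}") for i in range(1, 20001)]
for table in ("employee1", "employee2"):
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?)", rows)


def timed_lookup(table, emp_id, repeats=50):
    """Return (matching rows, average seconds per query) for a lookup by id."""
    start = time.perf_counter()
    for _ in range(repeats):
        # fetchall() forces the query to run to completion on every repeat.
        result = conn.execute(f"SELECT * FROM {table} WHERE id = ?", (emp_id,)).fetchall()
    return result, (time.perf_counter() - start) / repeats


for emp_id in (15433, 19728, 3456):
    _, t1 = timed_lookup("employee1", emp_id)
    _, t2 = timed_lookup("employee2", emp_id)
    print(f"id={emp_id}: employee1 {t1 * 1e3:.3f} ms, employee2 {t2 * 1e3:.3f} ms")
```

Averaging over several repeats smooths out noise from a single measurement.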

7. Now examine the output of the EXPLAIN ANALYZE command. This command plans the query and actually executes it, instrumenting the execution to count rows and measure the time spent at various points in the execution plan.
Here we find that the query on employee1 performed a table scan, meaning the whole table was read to find the result. This is also called a full table scan.

mysql> explain analyze select * from employee1 where id = 3456;
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                               |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: (employee1.id = 3456)  (cost=1989 rows=1965) (actual time=5.24..29.3 rows=1 loops=1)
    -> Table scan on employee1  (cost=1989 rows=19651) (actual time=0.0504..27.3 rows=20000 loops=1)
 |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.03 sec)

# Here is a detailed explanation of this plan (adapted from ChatGPT).

Filter: (employee1.id = 3456): A filter is applied to the rows read from the "employee1" table, and only rows where the "id" column equals 3456 are returned.

(cost=1989 rows=1965) (actual time=5.24..29.3 rows=1 loops=1): This part provides performance-related information about the filter step:

cost=1989: The optimizer's cost estimate for this step. Cost is a relative measure of the effort required to execute it, not a time.

rows=1965: The estimated number of rows this step will return.

actual time=5.24..29.3: The measured time, in milliseconds, to return the first row (5.24) and to return all rows (29.3).

rows=1 loops=1: The step actually returned 1 row and was executed once.

-> Table scan on employee1 (cost=1989 rows=19651) (actual time=0.0504..27.3 rows=20000 loops=1): This part shows that a table scan is performed on the "employee1" table:

Table scan: The database reads every row of the "employee1" table to find the rows that match the filter condition.

cost=1989: The cost estimate for this table scan operation.

rows=19651: The estimated number of rows in the "employee1" table. It is an estimate, which is why it is not exactly 20000.

actual time=0.0504..27.3: The measured time, in milliseconds, to read the first row (0.0504) and to read all rows (27.3).

rows=20000 loops=1: The scan actually read 20,000 rows and was executed once.

Overall, this query plan shows a query that filters the "employee1" table to return only rows where the "id" column is equal to 3456.
The table scan reads all 20,000 rows to find the single matching row and has an estimated cost of 1989 units.
The first matching row is returned after 5.24 milliseconds and the query finishes after 29.3 milliseconds, almost all of it spent scanning the table.

8. For employee2, we find that only one row was fetched to produce the result. Therefore, on a table with a large number of records, we will see a considerable improvement in SQL query response time.

mysql> explain analyze select * from employee2 where id = 3456;
+---------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                           |
+---------------------------------------------------------------------------------------------------+
| -> Rows fetched before execution  (cost=0..0 rows=1) (actual time=110e-6..190e-6 rows=1 loops=1)
 |
+---------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

# Explanation of this query plan (adapted from ChatGPT):

Rows fetched before execution: Because id is the primary key, the condition id = 3456 can match at most one row. The optimizer reads that row through the primary key index while it is still planning the query, so no work is left for the execution phase.

(cost=0..0 rows=1): The cost estimate for this operation is 0 units, and it expects to return only one row.

(actual time=110e-6..190e-6 rows=1 loops=1): This provides the measured figures for the operation:

actual time=110e-6..190e-6: The measured time in milliseconds, as in the other plans, so roughly 110 to 190 nanoseconds.

rows=1: The number of rows returned.

loops=1: The operation was executed once.

Overall, this plan shows that the single matching row was located through the primary key index before execution started, so no table scan was needed.
The remaining execution time is negligible compared with the 29.3 milliseconds needed for the full table scan on employee1.
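The difference between the two plans can also be reproduced in a few lines of Python. This is only a sketch, assuming the built-in sqlite3 module as a stand-in: SQLite's EXPLAIN QUERY PLAN is a simpler relative of MySQL's EXPLAIN ANALYZE, it does not execute the query or report timings, and its wording differs from MySQL's. The helper name plan is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server (assumption)
conn.execute("CREATE TABLE employee1 (id int, FirstName varchar(255))")
conn.execute("CREATE TABLE employee2 (id int PRIMARY KEY, FirstName varchar(255))")


def plan(sql):
    """Return the textual steps SQLite's planner reports for `sql` (hypothetical helper)."""
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


scan_plan = plan("SELECT * FROM employee1 WHERE id = 3456")
lookup_plan = plan("SELECT * FROM employee2 WHERE id = 3456")
print("employee1:", scan_plan)    # a SCAN step: every row is read
print("employee2:", lookup_plan)  # a SEARCH step using the primary key index
```

A SCAN step corresponds to the full table scan seen for employee1, and a SEARCH step to the index-based lookup seen for employee2.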

9. Now, let's make it more interesting. Let's analyze the query plan when we search by the non-indexed FirstName column in both tables. The output shows that both queries use a table scan to find the record, which takes considerably longer to retrieve the data.

mysql> explain analyze select * from employee2 where FirstName = 'user-13456';
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                            |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: (employee2.FirstName = 'user-13456')  (cost=2036 rows=2012) (actual time=15.7..24 rows=1 loops=1)
    -> Table scan on employee2  (cost=2036 rows=20115) (actual time=0.0733..17.8 rows=20000 loops=1)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)

mysql> explain analyze select * from employee1 where FirstName = 'user-13456';
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                              |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: (employee1.FirstName = 'user-13456')  (cost=1989 rows=1965) (actual time=23.7..35.2 rows=1 loops=1)
    -> Table scan on employee1  (cost=1989 rows=19651) (actual time=0.0439..28.9 rows=20000 loops=1)
 |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.03 sec)

10. Now, let's create an index on the FirstName column of the employee1 table.

mysql> CREATE INDEX index1 ON employee1 (FirstName);
Query OK, 0 rows affected (0.13 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> show indexes from employee1 \G
*************************** 1. row ***************************
        Table: employee1
   Non_unique: 1
     Key_name: index1
 Seq_in_index: 1
  Column_name: FirstName
    Collation: A
  Cardinality: 19651
     Sub_part: NULL
       Packed: NULL
         Null: YES
   Index_type: BTREE
      Comment: 
Index_comment: 
      Visible: YES
   Expression: NULL
1 row in set (0.01 sec)

11. Now, let's review the query plan again for both tables when searching for a single record by FirstName. employee1 responds quickly: with the index on FirstName, an index lookup is performed and only 1 row has to be read (about 0.07 ms). For employee2, the response time is far higher (about 23.5 ms), because all 20,000 rows must be scanned to get the answer.

mysql> explain analyze select * from employee1 where FirstName = 'user-13456';
+-------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                             |
+-------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on employee1 using index1 (FirstName='user-13456')  (cost=0.35 rows=1) (actual time=0.0594..0.0669 rows=1 loops=1)
 |
+-------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)


mysql> explain analyze select * from employee2 where FirstName = 'user-13456';
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                                                                                             |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| -> Filter: (employee2.FirstName = 'user-13456')  (cost=2036 rows=2012) (actual time=15.7..23.5 rows=1 loops=1)
    -> Table scan on employee2  (cost=2036 rows=20115) (actual time=0.075..17.5 rows=20000 loops=1)
 |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.02 sec)
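The before-and-after effect of CREATE INDEX can be checked in a short script as well. This is a sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL; the plan text comes from SQLite's planner, not MySQL's, and the helper name plan is invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server (assumption)
conn.execute("CREATE TABLE employee1 (id int, FirstName varchar(255))")
conn.executemany(
    "INSERT INTO employee1 VALUES (?, ?)",
    [(i, f"user-{i}") for i in range(1, 20001)],
)


def plan(sql):
    """Return the textual steps SQLite's planner reports for `sql` (hypothetical helper)."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


query = "SELECT * FROM employee1 WHERE FirstName = 'user-13456'"

before = plan(query)  # no index on FirstName yet: full scan
conn.execute("CREATE INDEX index1 ON employee1 (FirstName)")
after = plan(query)   # the planner can now search index1

print("before:", before)
print("after: ", after)
print("result:", conn.execute(query).fetchall())
```

The query text is unchanged between the two calls; only the presence of the index changes which plan is chosen.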

That's all, folks. This article should help you understand the impact of indexes on tables and how to analyze queries with the EXPLAIN ANALYZE command. It also covers how to set up MySQL and write a stored procedure for bulk inserts.
