MySQL database deduplication methods

 Demo data for reference

Lookup table structure:

mysql> desc test;
+-------+------------------+------+-----+---------+----------------+
| Field | Type             | Null | Key | Default | Extra          |
+-------+------------------+------+-----+---------+----------------+
| id    | int(11) unsigned | NO   | PRI | NULL    | auto_increment |
| site  | varchar(100)     | NO   | MUL |         |                |
+-------+------------------+------+-----+---------+----------------+
2 rows in set (0.00 sec)

Query data:

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+|  1 | http://www.baidu.com  |
|  2 | http://www.hao123.com        |
|  3 | http://www.huwei.com |
|  4 | http://www.baidu.com  |
|  5 | http://www.huwei.com |
+----+------------------------+
5 rows in set (0.00 sec)

When the table is not created or create index permission, if you want to delete the older duplicate records, you can use the following statement:

mysql> delete from a
    -> using test as a, test as b
    -> where (a.id > b.id)
    -> and (a.site = b.site);
Query OK, 2 rows affected (0.12 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com  |
|  2 | http://www.hao123.com        |
|  3 | http://www.huwei.com |
+----+------------------------+
3 rows in set (0.00 sec)

 

如果你要删除较新的重复记录,可以使用下面的语句:

mysql> delete from a
    -> using test as a, test as b
    -> where (a.id < b.id)
    -> and (a.site = b.site);
Query OK, 2 rows affected (0.12 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  2 | http://www.hao123.com        |
|  4 | http://www.baidu.com  |
|  5 | http://www.huwei.com |
+----+------------------------+
3 rows in set (0.00 sec)

You can use the following statement to confirm the duplicate records will be deleted:

mysql> SELECT a.*
    -> FROM test a, test b
    -> WHERE a.id > b.id
    -> AND (a.site = b.site);
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com  |
|  3 | http://www.huwei.com |
+----+------------------------+
2 rows in set (0.00 sec)

If you have permission to create the index, create a unique key index on the table, you can use the following method:

mysql> alter ignore table test add unique index ukey (site);
Query OK, 5 rows affected (0.46 sec)
Records: 5  Duplicates: 2  Warnings: 0

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com  |
|  2 | http://www.hao123.com        |
|  3 | http://www.huwei.com |
+----+------------------------+
3 rows in set (0.00 sec)
 
 

Duplicate records are deleted, if necessary, remove the index:

mysql> alter table test drop index ukey;
Query OK, 3 rows affected (0.37 sec)
Records: 3  Duplicates: 0  Warnings: 0

If you have permission to create tables, create a new table, and then the original table does not duplicate the data into the new table:

mysql> create table test_new as select * from test group by site;
Query OK, 3 rows affected (0.19 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
| test_new       |
+----------------+
2 rows in set (0.00 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com  |
|  2 | http://www.hao123.com        |
|  3 | http://www.huwei.com |
|  4 | http://www.baidu.com  |
|  5 | http://www.huwei.com |
+----+------------------------+
5 rows in set (0.00 sec)

mysql> select * from test_new order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com  |
|  2 | http://www.hao123.com        |
|  3 | http://www.huwei.com |
+----+------------------------+
3 rows in set (0.00 sec)

Then back up the original table, the new table will rename the current table:

mysql> rename table test to test_old, test_new to test;
Query OK, 0 rows affected (0.04 sec)

mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| test           |
| test_old       |
+----------------+
2 rows in set (0.00 sec)

mysql> select * from test order by id;
+----+------------------------+
| id | site                   |
+----+------------------------+
|  1 | http://www.baidu.com  |
|  2 | http://www.hao123.com        |
|  3 | http://www.huwei.com |
+----+------------------------+
3 rows in set (0.00 sec)

Note: Use Table creation of index information will be lost in this way the original table!

mysql> desc test;
+-------+------------------+------+-----+---------+-------+
| Field | Type             | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| id    | int(11) unsigned | NO   |     | 0       |       |
| site  | varchar(100)     | NO   |     |         |       |
+-------+------------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

 

Guess you like

Origin www.cnblogs.com/tangbohu2008/p/11308826.html