I: Duplicates determined by a single field
1. First, find the redundant rows in the table, querying by the key field (Name).
select * from OA_ADDRESS_BOOK where name in (select name from OA_ADDRESS_BOOK group by name having count(name)>1)
2. Delete the duplicate rows from the table. Duplicates are determined by a single field (Name), and only the record with the smallest rowid in each group is kept.
delete from OA_ADDRESS_BOOK where (Name) in
(select Name from OA_ADDRESS_BOOK group by Name having count(Name) >1)
and rowid not in (select min(rowid) from OA_ADDRESS_BOOK group by Name having count(Name)>1)
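The same "keep the smallest rowid" rule can also be written with an analytic function. The following is only a sketch under the assumption that the table and column names are exactly those used above. The `name is not null` filter is added here to mirror the `in` version, which never matches rows whose name is NULL.

```sql
-- Number the rows inside each Name group by rowid; row 1 is the one to keep.
delete from OA_ADDRESS_BOOK
where rowid in (
  select rid
  from (
    select rowid as rid,
           row_number() over (partition by name order by rowid) as rn
    from OA_ADDRESS_BOOK
    where name is not null
  )
  where rn > 1   -- every row after the first in its group is a duplicate
);
```

Re-running the query from step 1 before committing is a simple check: it should return no rows once the duplicates are gone.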
II: Duplicates determined by multiple fields
1. First, find the duplicate rows in the table, querying by the key fields (Name, UNIT_ID).
select * from OA_ADDRESS_BOOK book1 where (book1.name,book1.unit_id) in
(select book2.name,book2.unit_id from OA_ADDRESS_BOOK book2 group by book2.name,book2.unit_id having count(*)>1)
2. Delete the duplicate rows from the table. Duplicates are determined by multiple fields (Name, UNIT_ID), and only the record with the smallest rowid in each group is kept.
delete from OA_ADDRESS_BOOK where (Name,UNIT_ID) in
(select Name,UNIT_ID from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*) > 1)
and rowid not in (select min(rowid) from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*)>1)
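A common alternative for the multi-field delete is a correlated subquery that compares each row's rowid with the smallest rowid of its own group. This is a sketch assuming the same table and columns as above; the aliases a and b are only illustrative. As with the `in` version, rows with a NULL in Name or UNIT_ID are never matched.

```sql
-- Delete every row whose rowid is larger than the smallest rowid
-- among the rows sharing the same (Name, UNIT_ID).
delete from OA_ADDRESS_BOOK a
where a.rowid > (
  select min(b.rowid)
  from OA_ADDRESS_BOOK b
  where b.name = a.name
    and b.unit_id = a.unit_id
);
```

For a group with only one row, the smallest rowid is the row's own rowid, so the comparison is false and the row is kept.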
3. Find the duplicate rows in the table, determined by multiple fields (Name, UNIT_ID), excluding the record with the smallest rowid in each group.
select * from OA_ADDRESS_BOOK where (Name,UNIT_ID) in
(select Name,UNIT_ID from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*) > 1)
and rowid not in (select min(rowid) from OA_ADDRESS_BOOK group by Name,UNIT_ID having count(*)>1)
1. Problem Description
The BBSCOMMENT table is a child table of BBSDETAIL and records reviews of businesses. Because the data has been moved back and forth, it contains a lot of duplicate data. The table structure is as follows:
COMMENT_ID    NOT NULL  NUMBER         -- primary key
DETAIL_ID     NOT NULL  NUMBER         -- foreign key, references the BBSDETAIL table
COMMENT_BODY  NOT NULL  VARCHAR2(500)  -- review content
-- other fields omitted
The primary key itself is never duplicated. What is duplicated is DETAIL_ID + COMMENT_BODY + ... and the other information, that is, the review information of some businesses appears more than once.
2. Solution steps
2.1 Find the redundant duplicate records in the table
-- Query all duplicated data
SELECT DETAIL_ID, COMMENT_BODY, COUNT(*)
FROM BBSCOMMENT
GROUP BY DETAIL_ID, COMMENT_BODY
HAVING COUNT(*) > 1
ORDER BY DETAIL_ID, COMMENT_BODY; -- 1,955 rows
2.2 Show all non-duplicate data
-- This statement shows all the non-duplicate data
SELECT MIN(COMMENT_ID) AS COMMENT_ID, DETAIL_ID, COMMENT_BODY
FROM BBSCOMMENT
GROUP BY DETAIL_ID, COMMENT_BODY; -- 21,453 rows
-- Why is this value not equal to the total number of records minus 1,955?
-- Because some of the 1,955 duplicated records are repeated more than once.
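The arithmetic behind that remark can be checked with a single query. This is a sketch, assuming the BBSCOMMENT columns described above; the aliases total_rows, kept_rows and rows_to_delete are only illustrative names. Each duplicated group with n rows contributes n - 1 rows to delete, so the number of rows to delete is at least the number of duplicated groups.

```sql
-- total_rows     : every row in the table
-- kept_rows      : one row per distinct (DETAIL_ID, COMMENT_BODY) group
-- rows_to_delete : the difference, i.e. the surplus copies
SELECT total_rows,
       kept_rows,
       total_rows - kept_rows AS rows_to_delete
FROM (
  SELECT (SELECT COUNT(*) FROM BBSCOMMENT) AS total_rows,
         (SELECT COUNT(*)
            FROM (SELECT 1
                    FROM BBSCOMMENT
                   GROUP BY DETAIL_ID, COMMENT_BODY)) AS kept_rows
  FROM dual
);
```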
2.3 With a small number of records (thousands), use the statement above as a subquery and delete
-- If the table does not hold much data (on the order of thousands of rows),
-- the statement above can be used as a subquery and the rest deleted
DELETE FROM BBSCOMMENT
WHERE COMMENT_ID NOT IN (
  SELECT MIN(COMMENT_ID) FROM BBSCOMMENT GROUP BY DETAIL_ID, COMMENT_BODY
);
-- 782 seconds on my machine, with 20,000 records of which more than 2,000 are duplicates (too slow!)
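The NOT IN delete can be rephrased so that a row is removed only when an older copy of it exists. This is a sketch, not something timed in the original; whether it runs faster depends on the available indexes and on the optimizer. The aliases c and k are illustrative.

```sql
-- A row is a surplus copy if another row with the same
-- (DETAIL_ID, COMMENT_BODY) has a smaller COMMENT_ID.
DELETE FROM BBSCOMMENT c
WHERE EXISTS (
  SELECT 1
  FROM BBSCOMMENT k
  WHERE k.DETAIL_ID = c.DETAIL_ID
    AND k.COMMENT_BODY = c.COMMENT_BODY
    AND k.COMMENT_ID < c.COMMENT_ID
);
```

The row with the smallest COMMENT_ID in each group has no older copy, so it is the one that survives, which matches the MIN(COMMENT_ID) rule above.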
2.4 Another delete method
-- This statement achieves the same result, but it is untested because my data had already been deleted
-- Delete condition 1: the record has duplicates; condition 2: the record with the smallest rowid is kept
DELETE FROM BBSCOMMENT a
WHERE (a.DETAIL_ID, a.COMMENT_BODY) IN (
  SELECT DETAIL_ID, COMMENT_BODY FROM BBSCOMMENT GROUP BY DETAIL_ID, COMMENT_BODY HAVING COUNT(*) > 1)
AND ROWID NOT IN (
  SELECT MIN(ROWID) FROM BBSCOMMENT GROUP BY DETAIL_ID, COMMENT_BODY HAVING COUNT(*) > 1);
2.5 For large amounts of data, PL/SQL is more convenient
DECLARE
  -- Record structure (the cursor FOR loop below also declares its own loop record implicitly)
  TYPE bbscomment_type IS RECORD (
    comment_id   BBSCOMMENT.COMMENT_ID%TYPE,
    detail_id    BBSCOMMENT.DETAIL_ID%TYPE,
    comment_body BBSCOMMENT.COMMENT_BODY%TYPE
  );
  bbscomment_record bbscomment_type;
  -- Variables used for comparison
  v_comment_id   BBSCOMMENT.COMMENT_ID%TYPE;
  v_detail_id    BBSCOMMENT.DETAIL_ID%TYPE;
  v_comment_body BBSCOMMENT.COMMENT_BODY%TYPE;
  -- Other variables
  v_batch_size INTEGER := 5000;
  v_counter    INTEGER := 0;
  CURSOR cur_dupl IS
    -- Fetch every record that has duplicates
    SELECT COMMENT_ID, DETAIL_ID, COMMENT_BODY
    FROM BBSCOMMENT
    WHERE (DETAIL_ID, COMMENT_BODY) IN (
      -- These records are duplicated
      SELECT DETAIL_ID, COMMENT_BODY
      FROM BBSCOMMENT
      GROUP BY DETAIL_ID, COMMENT_BODY
      HAVING COUNT(*) > 1)
    ORDER BY DETAIL_ID, COMMENT_BODY;
BEGIN
  FOR bbscomment_record IN cur_dupl LOOP
    -- DETAIL_ID and COMMENT_BODY are NOT NULL, so a plain comparison is safe
    -- (in Oracle, NVL(x, '') would be a no-op because '' is NULL)
    IF v_detail_id IS NULL
       OR bbscomment_record.detail_id != v_detail_id
       OR bbscomment_record.comment_body != v_comment_body THEN
      -- First iteration, or a new group has started: remember its key and keep this record
      v_detail_id    := bbscomment_record.detail_id;
      v_comment_body := bbscomment_record.comment_body;
    ELSE
      -- Any further record of the same group is deleted
      DELETE FROM BBSCOMMENT WHERE COMMENT_ID = bbscomment_record.comment_id;
      v_counter := v_counter + 1;
      -- Commit once every v_batch_size deletions
      IF MOD(v_counter, v_batch_size) = 0 THEN
        COMMIT;
      END IF;
    END IF;
  END LOOP;
  IF v_counter > 0 THEN
    -- Final commit
    COMMIT;
  END IF;
  DBMS_OUTPUT.PUT_LINE(TO_CHAR(v_counter) || ' records are deleted!');
EXCEPTION
  WHEN OTHERS THEN
    DBMS_OUTPUT.PUT_LINE('SQLERRM -> ' || SQLERRM);
    ROLLBACK;
END;
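For larger tables, the row-by-row loop can be replaced with batched deletes. The following is a sketch only and was not part of the original test: it assumes the same BBSCOMMENT table, keeps the smallest COMMENT_ID in each group, and uses the hypothetical names id_tab, v_ids and v_deleted. Like the block above, it commits while the cursor is still open.

```sql
DECLARE
  TYPE id_tab IS TABLE OF BBSCOMMENT.COMMENT_ID%TYPE;
  v_ids     id_tab;
  v_deleted PLS_INTEGER := 0;
  CURSOR cur_dupl IS
    -- Every row except the first (smallest COMMENT_ID) of its group
    SELECT COMMENT_ID
    FROM (SELECT COMMENT_ID,
                 ROW_NUMBER() OVER (PARTITION BY DETAIL_ID, COMMENT_BODY
                                    ORDER BY COMMENT_ID) AS rn
          FROM BBSCOMMENT)
    WHERE rn > 1;
BEGIN
  OPEN cur_dupl;
  LOOP
    -- Fetch up to 5000 ids at a time
    FETCH cur_dupl BULK COLLECT INTO v_ids LIMIT 5000;
    EXIT WHEN v_ids.COUNT = 0;
    -- Delete the whole batch in one round trip to the SQL engine
    FORALL i IN 1 .. v_ids.COUNT
      DELETE FROM BBSCOMMENT WHERE COMMENT_ID = v_ids(i);
    v_deleted := v_deleted + SQL%ROWCOUNT;
    COMMIT;
  END LOOP;
  CLOSE cur_dupl;
  DBMS_OUTPUT.PUT_LINE(v_deleted || ' records are deleted!');
END;
```

The batch size of 5000 simply mirrors v_batch_size in the block above; it is not a tuned value.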