An in-depth look at Oracle paging queries and data deduplication

Paging query

First, efficient forms
1. Without ORDER BY. (Most efficient)
 (In testing, this form has the lowest cost: it needs only one level of nesting and is the fastest. Even when the query covers a large amount of data, its speed is almost unaffected.)
SELECT *
  FROM (SELECT ROWNUM AS rowno, t.*
          FROM emp t
         WHERE hire_date BETWEEN TO_DATE ('20060501', 'yyyymmdd')
                             AND TO_DATE ('20060731', 'yyyymmdd')
           AND ROWNUM <= 20) table_alias
 WHERE table_alias.rowno >= 10;


2. With ORDER BY. (Fairly efficient)
 (In testing, this form becomes slower and slower as the range being queried grows.)
SELECT *
  FROM (SELECT tt.*, ROWNUM AS rowno
          FROM (  SELECT t.*
                    FROM emp t
                   WHERE hire_date BETWEEN TO_DATE ('20060501', 'yyyymmdd')
                                       AND TO_DATE ('20060731', 'yyyymmdd')
                ORDER BY create_time DESC, emp_no) tt
         WHERE ROWNUM <= 20) table_alias
 WHERE table_alias.rowno >= 10;


Second, inefficient but apparently very common paging forms

3. Without ORDER BY. (Form 1 is recommended instead)
 (This form becomes slower and slower as the amount of data queried grows.)
SELECT *
  FROM (SELECT ROWNUM AS rowno, t.*
          FROM k_task t
         WHERE flight_date BETWEEN TO_DATE ('20060501', 'yyyymmdd')
                               AND TO_DATE ('20060731', 'yyyymmdd')) table_alias
 WHERE table_alias.rowno <= 20 AND table_alias.rowno >= 10;

--TABLE_ALIAS.ROWNO BETWEEN 10 AND 100;

4. With ORDER BY. (Form 2 is recommended instead)
 (This form becomes slower and slower as the range being queried grows.)
SELECT *
  FROM (SELECT tt.*, ROWNUM AS rowno
          FROM (  SELECT *
                    FROM k_task t
                   WHERE flight_date BETWEEN TO_DATE ('20060501', 'yyyymmdd')
                                         AND TO_DATE ('20060531', 'yyyymmdd')
                ORDER BY fact_up_time, flight_no) tt) table_alias
 WHERE table_alias.rowno BETWEEN 10 AND 20;

5. Alternative syntax. (With ORDER BY)
 (The WITH syntax differs from the traditional SQL style, is less convenient to read and understand, and makes it harder to keep one uniform standard, so it is not recommended.)
WITH partdata AS
    (
        SELECT ROWNUM AS rowno, tt.*
          FROM (  SELECT *
                    FROM k_task t
                   WHERE flight_date BETWEEN TO_DATE ('20060501', 'yyyymmdd')
                                         AND TO_DATE ('20060531', 'yyyymmdd')
                ORDER BY fact_up_time, flight_no) tt
         WHERE ROWNUM <= 20)
SELECT *
  FROM partdata
 WHERE rowno >= 10;

 

6. Alternative syntax. (Without ORDER BY)
WITH partdata AS
    (
        SELECT ROWNUM AS rowno, t.*
          FROM k_task t
         WHERE flight_date BETWEEN TO_DATE ('20060501', 'yyyymmdd')
                               AND TO_DATE ('20060531', 'yyyymmdd')
           AND ROWNUM <= 20)
SELECT *
  FROM partdata
 WHERE rowno >= 10;

Third, analysis

Oracle paging queries can basically all follow the format given in this article.

Paging query format:
SELECT *
  FROM (SELECT a.*, ROWNUM rn
          FROM (SELECT *
                  FROM table_name) a
         WHERE ROWNUM <= 40)
 WHERE rn >= 21

Here the innermost query, SELECT * FROM TABLE_NAME, stands for the original query without paging. ROWNUM <= 40 and RN >= 21 control the range of rows returned for each page.
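The two bounds follow directly from the page number and the page size. A minimal sketch, assuming two hypothetical bind variables, :page_no (starting at 1) and :page_size, that are not part of the original article:

```sql
-- :page_no and :page_size are illustrative bind variables (assumptions).
-- Upper bound = page_no * page_size
-- Lower bound = (page_no - 1) * page_size + 1
SELECT *
  FROM (SELECT a.*, ROWNUM rn
          FROM (SELECT *
                  FROM table_name) a
         WHERE ROWNUM <= :page_no * :page_size)
 WHERE rn >= (:page_no - 1) * :page_size + 1
```

With :page_no = 2 and :page_size = 20 the bounds become ROWNUM <= 40 and RN >= 21, which is the query shown above.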

The paging query given above is efficient in most cases. The purpose of paging is to limit the size of the result set so that results are returned as soon as possible. In the query above, this is reflected mainly in the clause WHERE ROWNUM <= 40.

There are two ways to select rows 21 through 40. One is the form shown above: the second-level query limits the maximum with ROWNUM <= 40, and the outermost query limits the minimum. The other is to drop the WHERE ROWNUM <= 40 clause from the second level and control both the minimum and the maximum in the outermost query. That query looks like this:
SELECT *
  FROM (SELECT a.*, ROWNUM rn
          FROM (SELECT *
                  FROM table_name) a)
 WHERE rn BETWEEN 21 AND 40

Comparing the two forms, the first query is much more efficient than the second in the vast majority of cases.

This is because, under the CBO, Oracle can push a condition from an outer query into an inner query to make the inner query more efficient. In the first query, the second-level condition WHERE ROWNUM <= 40 can be pushed into the innermost query, so Oracle stops the query and returns the result as soon as the number of rows exceeds the ROWNUM limit.

In the second query, the condition BETWEEN 21 AND 40 sits in the third (outermost) level, and Oracle cannot push a third-level condition into the innermost query (pushing it there would be meaningless anyway, because the innermost query does not know what RN refers to). So for the second query, the innermost level returns every row that satisfies its conditions to the middle level, and the middle level returns all of those rows to the outermost level. The filtering happens only at the outermost level, which is clearly much less efficient than the first query.
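One way to check this on a given system is to compare the execution plans of the two forms. A minimal sketch, assuming a table named table_name as in the examples and using the standard EXPLAIN PLAN statement with DBMS_XPLAN.DISPLAY; the exact plan output depends on the Oracle version, statistics, and indexes:

```sql
-- Plan for the first form (maximum limited in the second level).
EXPLAIN PLAN FOR
SELECT *
  FROM (SELECT a.*, ROWNUM rn
          FROM (SELECT *
                  FROM table_name) a
         WHERE ROWNUM <= 40)
 WHERE rn >= 21;

-- Show the plan that was just generated.
SELECT * FROM TABLE (DBMS_XPLAN.DISPLAY);
```

For the first form the plan would be expected to contain a COUNT STOPKEY step, which is where Oracle stops fetching once the ROWNUM limit is reached. Running the same two statements for the BETWEEN 21 AND 40 form would typically show no such step.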

This analysis applies not only to simple single-table queries; it is just as valid when the innermost query is a complex multi-table join or contains a sort.

Queries that contain a sort are not covered here; the next article describes them in detail with examples.

The following briefly discusses multi-table joins.

For the most common equi-join queries, the CBO will generally choose between two join methods, NESTED LOOP and HASH JOIN (MERGE JOIN is less efficient than HASH JOIN, so the CBO generally does not consider it). Because paging caps the number of rows returned, a NESTED LOOP can stop as soon as that cap is exceeded and return the result to the middle level, whereas a HASH JOIN must process the entire result set (as must a MERGE JOIN). So in most cases NESTED LOOP is the more efficient join method for a paging query (most paging queries fetch the first few pages, and the further back a page is, the less likely it is to be accessed).

Therefore, if you do not mind using a HINT in the system, the paging query can be rewritten as:
SELECT /*+ FIRST_ROWS */ *
  FROM (SELECT a.*, ROWNUM rn
          FROM (SELECT *
                  FROM table_name) a
         WHERE ROWNUM <= 40)
 WHERE rn >= 21

Oracle data deduplication

First, removing completely duplicated rows

  The idea is to create a temporary table and insert the DISTINCT rows of the original table into it, then empty the original table, insert the temporary table's rows back into the original table, and finally drop the temporary table.

  For rows that are exact duplicates, the following SQL statements can be used.

      --Code

    CREATE TABLE "#temp" AS (SELECT DISTINCT * FROM table_name); -- create a temporary table holding the de-duplicated (DISTINCT) rows

    TRUNCATE TABLE table_name; -- empty the original table

    INSERT INTO table_name (SELECT * FROM "#temp"); -- insert the temporary table's rows back into the original table

    DROP TABLE "#temp"; -- drop the temporary table

Second, removing partially duplicated rows

We can create a temporary table, insert into it the fields that determine duplication together with a rowid, and then compare against it when deleting.

CREATE TABLE temp_table AS
SELECT a.field1, a.field2, MAX (a.ROWID) dataid
  FROM formal_table a
 GROUP BY a.field1, a.field2;

DELETE FROM formal_table a
 WHERE a.ROWID !=
       (
        SELECT b.dataid
          FROM temp_table b
         WHERE a.field1 = b.field1
           AND a.field2 = b.field2
       );

COMMIT;

Example:

-- Use MAX (a.ROWID) to filter out the duplicates; the temporary table receives one row per (ip, port)
CREATE TABLE temp_table AS
SELECT a.ip, a.port, MAX (a.ROWID) dataid
  FROM ipresult a
 GROUP BY a.ip, a.port;

-- Delete the duplicates from the formal table, keeping only the row with the largest ROWID in each group
-- (usually, though not necessarily, the most recently inserted one)
DELETE FROM ipresult a
 WHERE a.ROWID !=
       (
        SELECT b.dataid
          FROM temp_table b
         WHERE a.ip = b.ip
           AND a.port = b.port
       );

-- Drop the temporary table and commit
DROP TABLE temp_table;
COMMIT;
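
The result can be checked with a grouping query, and the same cleanup can also be written without the intermediate table. A minimal sketch, assuming the ipresult table from the example with ip and port as the duplicate key; the single-statement DELETE is an alternative formulation, not the method described above:

```sql
-- List the (ip, port) pairs that occur more than once.
-- After a successful deduplication this returns no rows.
SELECT ip, port, COUNT (*) AS copies
  FROM ipresult
 GROUP BY ip, port
HAVING COUNT (*) > 1;

-- Alternative without a temporary table: delete every row whose ROWID
-- is not the largest one within its (ip, port) group.
DELETE FROM ipresult
 WHERE ROWID NOT IN (SELECT MAX (ROWID)
                       FROM ipresult
                      GROUP BY ip, port);

COMMIT;
```

One behavioral difference: GROUP BY places NULL values in a single group, so this variant also collapses duplicates whose ip or port is NULL, whereas the equality comparison in the temporary-table version never matches NULL and leaves such rows untouched.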


Origin www.linuxidc.com/Linux/2019-07/159490.htm