Today's Topic: How to Use SELECT Efficiently

The previous installment covered how to import data into a table in interactive analysis using the standard PostgreSQL INSERT statement (see "Today's Topic: How to Write Data to a Table"). For table creation, see the earlier installment "Today's Topic: How to Create Tables Efficiently".

In general, storing data consumes storage resources, and querying it consumes computing resources. Finding the desired result quickly while avoiding unnecessary computation is one of the challenges of performance optimization. Today we explain how to use SELECT efficiently in interactive analysis, so that queries return quickly and resources are saved.

Syntax

SELECT [ALL | DISTINCT [ON (expression [, ...])]]
  * | expression [[AS] output_name] [, ...]
  [FROM from_item [, ...]]
  [WHERE condition]
  [GROUP BY grouping_element [, ...]]
  [HAVING condition [, ...]]
  [{UNION | INTERSECT | EXCEPT} [ALL] select]
  [ORDER BY expression [ASC | DESC | USING operator] [, ...]]
  [LIMIT {count | ALL}]

Examples of Use

1. SELECT *

Use a SELECT statement to query the entire table directly, as in this example:

select * from test;

With select *, the query scans the entire table. In real workloads, not all of the table's data is needed at the same time, so this wastes a lot of computing resources, and when the table holds a particularly large amount of data the query takes a long time.

2. SELECT FROM

When most queries need only certain fields, it is generally recommended not to query the entire table but to select only the columns you want, for example:

select id,name from test;
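The difference between the two forms can be tried locally. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in engine (an assumption; it is not the interactive analysis service itself). The `payload` column and the sample rows are hypothetical and exist only to represent a wide column that the query does not need.

```python
import sqlite3

# In-memory SQLite database used only as a local stand-in engine.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, name TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(1, "a", "x" * 1000), (2, "b", "y" * 1000)],  # hypothetical rows
)

# SELECT * returns every column, including the wide payload column.
all_cols = conn.execute("SELECT * FROM test ORDER BY id").fetchall()

# Naming the columns returns only what the caller actually needs.
two_cols = conn.execute("SELECT id, name FROM test ORDER BY id").fetchall()

print(len(all_cols[0]), "columns vs", len(two_cols[0]), "columns")
```

The second query hands back much smaller rows; on a real analytical table with many columns the saving is correspondingly larger.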

3. GROUP BY

The GROUP BY clause is used together with aggregate functions to group the result set by one or more columns.

SELECT kind, sum(length) AS total FROM films GROUP BY kind;

Note: the GROUP BY fields must not include floating-point types such as float/double.
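To see the grouping behave as described, here is a small sketch under stated assumptions: SQLite (via Python's sqlite3) stands in for the real engine, and the `films` rows are invented for illustration, with `length` stored as an integer number of minutes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE films (title TEXT, kind TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO films VALUES (?, ?, ?)",
    [("A", "Comedy", 90), ("B", "Comedy", 100), ("C", "Drama", 120)],  # hypothetical data
)

# One output row per distinct kind; sum() aggregates within each group.
totals = conn.execute(
    "SELECT kind, sum(length) AS total FROM films GROUP BY kind ORDER BY kind"
).fetchall()

print(totals)
```

The grouping column here is a text column, which is consistent with the restriction on floating-point GROUP BY fields.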

4. DISTINCT / COUNT DISTINCT

A table may contain duplicate values. Use DISTINCT to return only the distinct values, and COUNT DISTINCT to count the number of non-duplicated values.

-- DISTINCT usage
SELECT DISTINCT ON (location) location, time, report
    FROM weather_reports
    ORDER BY location, time DESC;

-- Exact UV (unique visitor) count
SELECT c1, COUNT(DISTINCT c2) FROM table GROUP BY c1;

-- Approximate UV count
SELECT c1, approx_count_distinct(c2) FROM table GROUP BY c1;
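The difference between counting rows and counting distinct values can be checked with a short sketch. It assumes SQLite (Python's sqlite3) as a stand-in engine, so it covers only plain DISTINCT and COUNT(DISTINCT ...); DISTINCT ON and approx_count_distinct are not available there. The `visits` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (c1 TEXT, c2 TEXT)")  # c1 = page, c2 = user (hypothetical)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [("home", "u1"), ("home", "u1"), ("home", "u2"), ("cart", "u1")],
)

# DISTINCT removes duplicate values from the result.
pages = conn.execute("SELECT DISTINCT c1 FROM visits ORDER BY c1").fetchall()

# COUNT(c2) counts every row (page views); COUNT(DISTINCT c2) counts unique users.
pv = conn.execute(
    "SELECT c1, COUNT(c2) FROM visits GROUP BY c1 ORDER BY c1"
).fetchall()
uv = conn.execute(
    "SELECT c1, COUNT(DISTINCT c2) FROM visits GROUP BY c1 ORDER BY c1"
).fetchall()

print(pages, pv, uv)
```

User u1 visited the home page twice, so the page-view count for home is 3 while the exact UV count is 2.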
    

5. UNION

UNION combines the results of two or more SELECT statements.

Note: every SELECT statement in a UNION must return the same number of columns, with the same data types and in the same order; the column lengths do not have to be the same.

 SELECT  ID, NAME, AMOUNT, DATE
         FROM CUSTOMERS
         LEFT JOIN ORDERS
         ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID
    UNION
         SELECT  ID, NAME, AMOUNT, DATE
         FROM CUSTOMERS
         RIGHT JOIN ORDERS
         ON CUSTOMERS.ID = ORDERS.CUSTOMER_ID;
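The syntax above also allows an optional ALL after UNION. As a minimal sketch of the difference, assuming SQLite (Python's sqlite3) as a stand-in engine and two hypothetical single-column tables `a` and `b`: plain UNION removes duplicate rows from the combined result, while UNION ALL keeps them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (id INTEGER)")
conn.execute("CREATE TABLE b (id INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,)])

# UNION: the row with id = 2 appears in both inputs but only once in the output.
union_rows = conn.execute(
    "SELECT id FROM a UNION SELECT id FROM b ORDER BY id"
).fetchall()

# UNION ALL: rows are concatenated without removing duplicates.
union_all_rows = conn.execute(
    "SELECT id FROM a UNION ALL SELECT id FROM b ORDER BY id"
).fetchall()

print(union_rows, union_all_rows)
```

When duplicates are impossible or acceptable, UNION ALL avoids the extra deduplication work.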

6. ORDER BY

ORDER BY sorts the result set of a query by the specified column or columns.

Note: the default sort order is ascending. To sort in descending order, use the DESC keyword.

SELECT Company, OrderNumber FROM Orders ORDER BY Company;

-- Sort in descending order
SELECT Company, OrderNumber FROM Orders ORDER BY Company DESC;

7. WHERE

WHERE selects only the rows that meet the specified conditions.

SELECT * FROM Persons WHERE FirstName='Bush';

Optimization

1. When only some columns are needed, select the specific fields instead of querying the whole table:

select * from table;
replace with:
select id,name from table;

2. For queries over large amounts of data, add filter conditions and a limit:

select id,value from table where id=123 limit 1000;
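The effect of the filter and the limit can be sketched as follows. This is an illustration only: SQLite (Python's sqlite3) stands in for the real engine, and the `events` table with its 10,000 generated rows is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, value INTEGER)")
# 10,000 hypothetical rows spread evenly over 100 id values.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(i % 100, i) for i in range(10000)],
)

# The WHERE clause keeps only matching rows; LIMIT caps how many are returned.
filtered = conn.execute(
    "SELECT id, value FROM events WHERE id = ? LIMIT 1000", (23,)
).fetchall()

# A tighter LIMIT returns fewer rows even if more rows match.
capped = conn.execute(
    "SELECT id, value FROM events WHERE id = ? LIMIT 10", (23,)
).fetchall()

print(len(filtered), len(capped))
```

Only 100 of the 10,000 rows match the filter, so the first query returns 100 rows even though the limit allows 1000.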

3. For tables with a large amount of data where equality queries are the main query scenario, it is recommended to create the table as a Hash Clustering table, or change the existing table into one, and use the CLUSTERED fields to speed up filtering on those fields.

4. For queries that take noticeably longer than the amount of data would suggest, check whether there is a data skew problem: use EXPLAIN ANALYZE to view the execution plan and obtain diagnostic information.
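EXPLAIN ANALYZE is the plan-level tool for this diagnosis. As a complementary quick check, which is an assumption added here and not part of the original advice, the row count per key can be inspected directly to see whether one value dominates. The sketch below uses SQLite (Python's sqlite3) as a stand-in engine, with a hypothetical `orders` table and `customer_id` key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER)")
# Hypothetical skewed data: key 0 has 9000 rows, keys 1..10 have 100 rows each.
rows = [(0,)] * 9000 + [(k,) for k in range(1, 11) for _ in range(100)]
conn.executemany("INSERT INTO orders VALUES (?)", rows)

total = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]

# The heaviest keys come first; a single dominant key suggests skew.
top = conn.execute(
    "SELECT customer_id, COUNT(*) AS n FROM orders "
    "GROUP BY customer_id ORDER BY n DESC LIMIT 3"
).fetchall()

share = top[0][1] / total  # fraction of all rows held by the heaviest key
print(top[0], share)
```

If one key holds most of the rows, work partitioned by that key will concentrate on a single worker, which matches the symptom of a slow query over a modest amount of data.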

If you have any questions while using interactive analysis, you are welcome to join the group and ask.

Origin yq.aliyun.com/articles/740810