SQL Must Know: Creating Advanced Joins

Reading notes on "SQL Must Know"

This lesson explains some other joins (including their meaning and usage), how to alias a table, and how to use aggregate functions on the joined tables.

1. Use table aliases

In addition to aliasing column names and calculated fields, SQL also allows aliasing of table names. There are two main reasons for this:

  • Shorten the SQL statement;
  • Allow the same table to be used multiple times in a single SELECT statement.

See the SELECT statement below. It's basically the same statement used in the example from the previous lesson, but changed to use an alias:

SELECT cust_name, cust_contact
FROM Customers AS C, Orders AS O, OrderItems AS OI
WHERE C.cust_id = O.cust_id
    AND OI.order_num = O.order_num
    AND prod_id = 'RGAN01';

Note: No AS for table aliases in Oracle

Oracle does not support the AS keyword for table aliases. To use a table alias in Oracle, simply specify the alias without AS (thus Customers C, not Customers AS C).

Note that table aliases are only used during query execution. Unlike column aliases, table aliases are never returned to the client.
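
As a rough illustration of the alias query above, here is a minimal sketch that runs it through Python's built-in sqlite3 module. The tiny tables and their rows are made up for the example, and prod_id is qualified with the OI alias for clarity; only the query shape comes from the lesson.

```python
import sqlite3

# Hypothetical miniature versions of the lesson's tables; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT, cust_contact TEXT);
    CREATE TABLE Orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
    CREATE TABLE OrderItems (order_num INTEGER, prod_id TEXT);
    INSERT INTO Customers VALUES (1, 'Fun4All', 'Jim Jones'), (2, 'Kids Place', 'Michelle Green');
    INSERT INTO Orders VALUES (20007, 1), (20008, 2);
    INSERT INTO OrderItems VALUES (20007, 'RGAN01'), (20008, 'BR03');
""")

# C, O and OI exist only inside the statement; they shorten the column references.
cur = conn.execute("""
    SELECT C.cust_name, C.cust_contact
    FROM Customers AS C, Orders AS O, OrderItems AS OI
    WHERE C.cust_id = O.cust_id
        AND OI.order_num = O.order_num
        AND OI.prod_id = 'RGAN01'
""")
alias_columns = [d[0] for d in cur.description]  # column names as reported to the client
alias_rows = cur.fetchall()

print(alias_columns)
print(alias_rows)
```

Printing the reported column names is a quick way to check, on your own DBMS, what the client actually receives for an aliased table.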

2. Self joins

Types of joins: inner join, self join, natural join, outer join

Query requirement: first find the company that Jim Jones works for, then find all the customer contacts who work for that company.

-- Using a subquery
SELECT cust_id, cust_name, cust_contact
FROM Customers
WHERE cust_name = (
                    SELECT cust_name
                    FROM Customers
                    WHERE cust_contact = 'Jim Jones'
                );

This is the first solution, using subqueries.

Now look at the same query using joins:

SELECT c1.cust_id, c1.cust_name, c1.cust_contact
FROM Customers AS c1, Customers AS c2
WHERE c1.cust_name = c2.cust_name
    AND c2.cust_contact = 'Jim Jones';

The two tables needed in this query are actually the same table, so the Customers table appears twice in the FROM clause. While this is perfectly legal, any unqualified reference to Customers would be ambiguous because the DBMS cannot tell which instance of Customers you mean. Table aliases solve this problem: the first instance is aliased as c1 and the second as c2, and every column is qualified with one of them.

Tip: Use Self Joins Instead of Subqueries

Self joins are often used to replace subqueries that retrieve data from the same table as the outer statement. While the end result is the same, many DBMSs process joins faster than subqueries, so it is worth trying both to see which performs better.
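
To check that the two forms really return the same rows, here is a minimal sketch using Python's built-in sqlite3 module. The three customer rows are made up for the example, and the ORDER BY clauses are added only to make the comparison deterministic.

```python
import sqlite3

# Hypothetical data: two contacts work for the same company as Jim Jones.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT, cust_contact TEXT);
    INSERT INTO Customers VALUES
        (1, 'Fun4All', 'Jim Jones'),
        (2, 'Fun4All', 'Denise L. Stephens'),
        (3, 'Kids Place', 'Michelle Green');
""")

# Solution 1: the subquery finds the company, the outer query finds its contacts.
subquery_rows = conn.execute("""
    SELECT cust_id, cust_name, cust_contact
    FROM Customers
    WHERE cust_name = (SELECT cust_name
                       FROM Customers
                       WHERE cust_contact = 'Jim Jones')
    ORDER BY cust_id
""").fetchall()

# Solution 2: the same table joined to itself under two aliases.
self_join_rows = conn.execute("""
    SELECT c1.cust_id, c1.cust_name, c1.cust_contact
    FROM Customers AS c1, Customers AS c2
    WHERE c1.cust_name = c2.cust_name
        AND c2.cust_contact = 'Jim Jones'
    ORDER BY c1.cust_id
""").fetchall()

print(subquery_rows)
print(self_join_rows)
```

Which form is faster depends on the DBMS and the data, so this sketch only compares results, not timings.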

3. Natural joins

Whenever tables are joined, at least one column appears in more than one table (the columns being joined). Standard joins (the inner joins introduced in the previous lesson) return all data, even multiple occurrences of the same column. A natural join eliminates those multiple occurrences so that each column is returned only once.

How is this done? You do it yourself by selecting only columns that are unique, typically using a wildcard (SELECT *) for one table and an explicit subset of columns for all other tables.

SELECT C.*, O.order_num, O.order_date,
        OI.prod_id, OI.quantity, OI.item_price
FROM Customers AS C, Orders AS O, OrderItems AS OI
WHERE C.cust_id = O.cust_id
    AND OI.order_num = O.order_num
    AND prod_id = 'RGAN01';

In fact, every inner join we have built so far is a natural join, and you will probably never need an inner join that is not one.
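
The difference is easiest to see by counting result columns. The following is a minimal sketch using Python's built-in sqlite3 module with two made-up tables: a plain SELECT * over the join returns the join column from both tables, while the wildcard-plus-explicit-columns form returns it once.

```python
import sqlite3

# Hypothetical two-table setup; cust_id is the column shared by both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE Orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO Customers VALUES (1, 'Fun4All'), (2, 'Kids Place');
    INSERT INTO Orders VALUES (20005, 1), (20006, 2);
""")

# SELECT * returns every column of both tables, so cust_id comes back twice.
star_cur = conn.execute("""
    SELECT *
    FROM Customers AS C, Orders AS O
    WHERE C.cust_id = O.cust_id
""")
star_columns = [d[0] for d in star_cur.description]

# Wildcard for one table, explicit columns for the other: each column once.
natural_cur = conn.execute("""
    SELECT C.*, O.order_num
    FROM Customers AS C, Orders AS O
    WHERE C.cust_id = O.cust_id
""")
natural_columns = [d[0] for d in natural_cur.description]

print(star_columns)
print(natural_columns)
```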

4. Outer joins

Many joins associate rows in one table with rows in another table, but sometimes it is necessary to include those rows that have no associated rows. For example, you might want to use joins to do the following:

  • Count orders placed by each customer, including those who have not placed an order to date;
  • List all products and quantities ordered, including products no one has ordered;
  • Calculate average sale sizes, taking into account customers who have not yet placed an order.

In each of these examples, the join includes table rows that have no associated rows in the related table. This kind of join is called an outer join.

The following SELECT statement gives a simple inner join. It retrieves all customers and their orders:

SELECT Customers.cust_id, Orders.order_num
FROM Customers INNER JOIN Orders
    ON Customers.cust_id = Orders.cust_id;

The outer join syntax is similar. To retrieve all customers including customers without orders, proceed as follows:

SELECT Customers.cust_id, Orders.order_num
FROM Customers LEFT OUTER JOIN Orders
    ON Customers.cust_id = Orders.cust_id;

Similar to the inner join mentioned in the previous lesson, this SELECT statement uses the keyword OUTER JOIN to specify the join type (rather than in the WHERE clause).

However, unlike an inner join that associates rows from two tables, an outer join also includes rows that have no associated rows. When using the OUTER JOIN syntax, the RIGHT or LEFT keyword must be used to specify the table that includes all its rows (RIGHT refers to the table to the right of the OUTER JOIN, and LEFT refers to the table to the left of the OUTER JOIN.)

The above example uses LEFT OUTER JOIN to select all rows from the table (Customers) on the left side of the FROM clause. In order to select all rows from the table on the right, a RIGHT OUTER JOIN needs to be used, like this:

SELECT Customers.cust_id, Orders.order_num
FROM Customers RIGHT OUTER JOIN Orders
    ON Orders.cust_id = Customers.cust_id;

Note: SQLite outer joins

SQLite supports LEFT OUTER JOIN; RIGHT OUTER JOIN is supported only from version 3.39.0 onward.

Tip: Types of Outer Joins

Remember that there are always two basic forms of outer joins: left outer joins and right outer joins. The only difference between them is the order of the associated tables. In other words, by adjusting the order of the tables in the FROM or WHERE clause, a left outer join can be converted to a right outer join. Therefore, the two outer joins can be interchanged, whichever is more convenient to use.
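
As a sketch of that interchangeability, the query below expresses "keep every row of Orders" as a LEFT OUTER JOIN by listing Orders first, which is the same intent as Customers RIGHT OUTER JOIN Orders. It uses Python's built-in sqlite3 module; the rows are made up, and order 20008 deliberately points at a customer that does not exist so that an unmatched row shows up.

```python
import sqlite3

# Hypothetical data: customer 2 has no orders, order 20008 has no matching customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE Orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO Customers VALUES (1, 'Fun4All'), (2, 'Kids Place'), (3, 'The Toy Store');
    INSERT INTO Orders VALUES (20005, 1), (20006, 1), (20007, 3), (20008, 9);
""")

# Orders is on the left, so all of its rows are kept; unmatched customers come back as NULL.
swapped_rows = conn.execute("""
    SELECT Customers.cust_id, Orders.order_num
    FROM Orders LEFT OUTER JOIN Customers
        ON Orders.cust_id = Customers.cust_id
    ORDER BY Orders.order_num
""").fetchall()

print(swapped_rows)
# [(1, 20005), (1, 20006), (3, 20007), (None, 20008)]
```

Customer 2 is absent from the result because only the rows of Orders are preserved here.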

There is another type of outer join, the full outer join, which retrieves all the rows in both tables and correlates those that can be correlated. Unlike a left or right outer join, which contains unrelated rows from one table, a full outer join contains unrelated rows from both tables. The syntax of a full outer join is as follows:

SELECT Customers.cust_id, Orders.order_num
FROM Orders FULL OUTER JOIN Customers
    ON Orders.cust_id = Customers.cust_id;

Note: FULL OUTER JOIN support

Access, MariaDB, MySQL, and Open Office Base do not support the FULL OUTER JOIN syntax. SQLite supports it only from version 3.39.0 onward.
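
Where FULL OUTER JOIN is unavailable, one common workaround (not part of the original lesson) is to combine a left outer join with the unmatched rows of the other table using UNION ALL. The following is a minimal sketch under that assumption, using Python's built-in sqlite3 module and made-up rows.

```python
import sqlite3

# Hypothetical data: customer 2 has no orders, order 20008 has no matching customer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE Orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO Customers VALUES (1, 'Fun4All'), (2, 'Kids Place'), (3, 'The Toy Store');
    INSERT INTO Orders VALUES (20005, 1), (20006, 3), (20008, 9);
""")

# Part 1: every customer, with orders where they exist.
# Part 2: only the orders that matched no customer (the IS NULL filter).
full_rows = conn.execute("""
    SELECT C.cust_id, O.order_num
    FROM Customers AS C LEFT OUTER JOIN Orders AS O
        ON C.cust_id = O.cust_id
    UNION ALL
    SELECT C.cust_id, O.order_num
    FROM Orders AS O LEFT OUTER JOIN Customers AS C
        ON C.cust_id = O.cust_id
    WHERE C.cust_id IS NULL
""").fetchall()

print(full_rows)
```

The result contains matched pairs plus the unmatched rows from both sides, which is what a full outer join is meant to return.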

5. Use joins with aggregate functions

To retrieve all customers and the number of orders placed by each customer, the following code uses the COUNT() function to do the job:

SELECT Customers.cust_id,
       COUNT(Orders.order_num) AS num_ord
FROM Customers INNER JOIN Orders
    ON Customers.cust_id = Orders.cust_id
GROUP BY Customers.cust_id;

cust_id         num_ord
10000001        2
10000003        1
10000004        1
10000005        1

Aggregate functions can also be conveniently used with other joins. See the example below:

SELECT Customers.cust_id,
       COUNT(Orders.order_num) AS num_ord
FROM Customers LEFT OUTER JOIN Orders
    ON Customers.cust_id = Orders.cust_id
GROUP BY Customers.cust_id;

cust_id     num_ord
10000001    2
10000002    0
10000003    1
10000004    1
10000005    1

This example uses a left outer join to include all customers, even those without any orders. The result also includes customer 10000002, who has 0 orders.
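
The 0 appears because COUNT(Orders.order_num) ignores the NULL that the outer join produces for a customer without orders. The following minimal sketch, using Python's built-in sqlite3 module and made-up rows, puts COUNT(*) next to it to show the difference; the num_rows column is an addition for illustration only.

```python
import sqlite3

# Hypothetical data: customer 2 has placed no orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE Orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO Customers VALUES (1, 'Fun4All'), (2, 'Kids Place'), (3, 'The Toy Store');
    INSERT INTO Orders VALUES (20005, 1), (20006, 1), (20007, 3);
""")

# COUNT(column) skips NULLs; COUNT(*) counts the padded row for customer 2 as well.
count_rows = conn.execute("""
    SELECT Customers.cust_id,
           COUNT(Orders.order_num) AS num_ord,
           COUNT(*) AS num_rows
    FROM Customers LEFT OUTER JOIN Orders
        ON Customers.cust_id = Orders.cust_id
    GROUP BY Customers.cust_id
    ORDER BY Customers.cust_id
""").fetchall()

print(count_rows)
# [(1, 2, 2), (2, 0, 1), (3, 1, 1)]
```

Counting a column from the outer-joined table is therefore the safer choice when customers without orders should show 0.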

6. Use joins and join conditions

To summarize the key points of joins and their use:

  • Pay attention to the type of join being used. Inner joins are the usual choice, but outer joins are also valid.
  • For the exact join syntax, check the documentation of your DBMS to see what it supports.
  • Be sure to use the correct join conditions, otherwise incorrect data will be returned.
  • A join condition should always be provided, otherwise a Cartesian product is obtained.
  • You can have multiple tables in a join, and you can even use a different join type for each join. While this is legal and generally useful, you should test each join individually before testing them together. This makes troubleshooting easier.
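
The Cartesian product point can be checked by counting rows. The following is a minimal sketch using Python's built-in sqlite3 module and made-up rows: without a join condition every customer is paired with every order, while the join condition keeps only the matching pairs.

```python
import sqlite3

# Hypothetical data: 3 customers and 3 orders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER PRIMARY KEY, cust_name TEXT);
    CREATE TABLE Orders (order_num INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO Customers VALUES (1, 'Fun4All'), (2, 'Kids Place'), (3, 'The Toy Store');
    INSERT INTO Orders VALUES (20005, 1), (20006, 1), (20007, 3);
""")

# No join condition: 3 customers x 3 orders.
cartesian_count = conn.execute(
    "SELECT COUNT(*) FROM Customers, Orders"
).fetchone()[0]

# With the join condition: one row per order that has a matching customer.
joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM Customers, Orders
    WHERE Customers.cust_id = Orders.cust_id
""").fetchone()[0]

print(cartesian_count, joined_count)
# 9 3
```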
