Database interview data summary

database

Transactions

  1. The concept of a transaction: a group of SQL statements treated as a single unit of work
  2. Four characteristics of transactions (ACID): atomicity, consistency, isolation, durability
    1. Atomicity: Either all operations in a transaction are executed or none of them are
    2. Consistency: Each transaction, when run on its own without other concurrent transactions, must leave the database in a consistent state
    3. Isolation: Transactions are independent of one another and are not affected by other concurrent transactions
    4. Durability: Once a transaction commits, its result persists even if a failure occurs before the changes have been written to disk
  3. There are three commonly used statements in database transactions:
    1. BEGIN (or START TRANSACTION): begin a transaction
    2. COMMIT: commit the transaction
    3. ROLLBACK: roll back the transaction
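
The begin/commit/rollback cycle and the atomicity property can be sketched as follows. This is a minimal illustration, not MySQL code: it uses Python's built-in sqlite3 module as a stand-in database, and the `account` table, the `transfer` function, and the balances are hypothetical names chosen for the example.

```python
import sqlite3

def make_accounts():
    """Create a hypothetical two-row account table in an in-memory database."""
    # isolation_level=None: the module opens no implicit transactions,
    # so BEGIN / COMMIT / ROLLBACK below are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")
    return conn

def balances(conn):
    """Return all balances ordered by account id."""
    return [row[0] for row in conn.execute("SELECT balance FROM account ORDER BY id")]

def transfer(conn, src, dst, amount):
    """Move amount from src to dst; both updates happen or neither does."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
        (left,) = conn.execute("SELECT balance FROM account WHERE id = ?", (src,)).fetchone()
        if left < 0:
            raise ValueError("insufficient funds")
        conn.execute("COMMIT")
        return True
    except ValueError:
        # Undo both updates so the database returns to its previous state.
        conn.execute("ROLLBACK")
        return False

if __name__ == "__main__":
    db = make_accounts()
    print(transfer(db, 1, 2, 30), balances(db))
    print(transfer(db, 1, 2, 500), balances(db))
```

The second transfer fails its balance check after both updates have run, and the rollback discards them together, which is the behaviour atomicity describes.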

Database isolation levels

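  1. Dirty read (possible under read uncommitted): Transaction B reads data that transaction A has modified but not yet committed (WR conflict)

  2. Non-repeatable read (possible under read committed): Transaction A reads the same data twice and gets different results, because transaction B modified and committed it in between (RW conflict)

  3. Phantom read (possible under repeatable read): Transaction A runs the same range query twice and gets a different set of rows, because transaction B inserted or deleted matching rows in between. This differs from a WW conflict, in which transactions A and B both modify the same data and one overwrites the other's change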

  4. Strict two-phase locking (2PL):

    1. If transaction T wants to read an object, it must first acquire a shared lock on it; if T wants to modify the object, it must first acquire an exclusive lock (a transaction holding an exclusive lock can also read the object).
    2. All locks held by a transaction are released when the transaction ends.

Database normal forms

Functional dependency $X \rightarrow Y$: if two tuples agree on attribute X, they also agree on attribute Y.
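
The definition can be checked mechanically on a concrete set of tuples. The sketch below is only an illustration: the `holds` helper and the sample rows are hypothetical, and a dependency that holds on one sample is not thereby proven for the schema in general.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows.

    rows is a list of dicts; lhs and rhs are lists of attribute names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # Two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical sample data, invented for this example.
ROWS = [
    {"Name": "Ann", "Subject": "Math", "Department": "Science"},
    {"Name": "Ann", "Subject": "Physics", "Department": "Science"},
    {"Name": "Bob", "Subject": "Math", "Department": "Arts"},
]

if __name__ == "__main__":
    print(holds(ROWS, ["Name"], ["Department"]))
    print(holds(ROWS, ["Subject"], ["Department"]))
```

In this sample, Name determines Department, while Subject does not, because the two Math rows disagree on Department.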

  1. First Normal Form (1NF): Each column in a database table is an indivisible basic data item, that is, a single column cannot hold multiple values.

    1. Each field can only hold a single value
    2. Each record can be identified by a unique primary key
  2. Second Normal Form (2NF): Builds on first normal form and requires every non-prime attribute to be fully dependent on the key (eliminating partial dependencies). For example, the following is a partial dependency that violates 2NF:

    $(Name, Subject) \rightarrow Department$, but also $Name \rightarrow Department$

  3. Third Normal Form (3NF): a relation schema $R$ is in 3NF if, for every functional dependency $X \rightarrow A$ that holds on $R$, one of the following conditions holds (eliminating transitive dependencies):

    1. $A \in X$, that is, $X \rightarrow A$ is a trivial dependency
    2. $X$ is a superkey
    3. $A$ is part of some key of $R$
  4. Boyce-Codd Normal Form (BCNF): a relation schema $R$ is in BCNF if, for every functional dependency $X \rightarrow A$ that holds on $R$, one of the following conditions holds:

    1. $A \in X$, that is, $X \rightarrow A$ is a trivial dependency
    2. $X$ is a superkey

MySQL

  1. The EXPLAIN statement helps locate SQL with poor execution efficiency by showing its execution plan (such as full table scan or index scan). Key columns in its output:

    type : the join (access) type; avoid full table scans (ALL)

    possible_keys : the indexes that may be used

    key : the index actually used

    rows : the estimated number of rows to be examined

  2. Profiling (SHOW PROFILE) is used to analyze where time is spent while a SQL statement executes

  3. The SHOW STATUS statement reports Table_locks_waited (the higher the value, the more serious the table-level lock contention) and Table_locks_immediate, which together show table lock usage.

  4. MySQL implements serialization by locking at write time.


Pessimistic locking and optimistic locking

  1. Pessimistic locking: Assumes that other threads are likely to modify the data, so the data is locked on every operation.
  2. Optimistic locking: Assumes that other threads will not modify the data, so no lock is taken. On every update, it checks whether another program has modified the data in the meantime, for example by comparing a version number.
  3. Use pessimistic locking when writes are frequent, and optimistic locking when reads dominate.
  4. Exclusive lock: No other program can modify or read the data.
  5. Shared lock: Other programs cannot modify the data, but can read it.
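
One common way to implement the optimistic check is a version column that every successful update increments. The sketch below assumes that approach; the `item` table and the function names are hypothetical, and Python's built-in sqlite3 module stands in for MySQL.

```python
import sqlite3

def open_demo_db():
    """Create an in-memory table with one row that carries a version counter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, stock INTEGER, version INTEGER)")
    conn.execute("INSERT INTO item VALUES (1, 10, 0)")
    conn.commit()
    return conn

def read_item(conn, item_id):
    """Return (stock, version) as seen by the reader; no lock is taken."""
    return conn.execute(
        "SELECT stock, version FROM item WHERE id = ?", (item_id,)
    ).fetchone()

def try_update(conn, item_id, new_stock, seen_version):
    """Write only if the row still has the version the caller read earlier."""
    cur = conn.execute(
        "UPDATE item SET stock = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_stock, item_id, seen_version),
    )
    conn.commit()
    # rowcount is 0 when another writer has already bumped the version.
    return cur.rowcount == 1

if __name__ == "__main__":
    db = open_demo_db()
    stock, version = read_item(db, 1)
    print(try_update(db, 1, stock - 1, version))  # first writer
    print(try_update(db, 1, stock - 2, version))  # second writer, stale version
```

A writer whose update is rejected would normally re-read the row and retry, which is why this scheme suits workloads where conflicts are rare.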

principles of indexing

  1. The leftmost prefix matching principle: MySQL keeps matching index columns from left to right until it encounters a range condition. For example, with the conditions $a = 1$, $b = 1$, $c > 1$, $d = 1$ and a composite index on (a, b, c, d), the condition $d = 1$ does not use the index.
  2. = and IN conditions can appear in any order. For example, with $a = 1$, $b = 1$, $c = 1$ and a composite index on (a, b, c), the conditions can be written in any order, and MySQL's query optimizer will rearrange them into a form the index can serve.
  3. Try to index columns with high selectivity. Selectivity = number of distinct values in the column / total number of rows.
  4. An indexed column should not take part in a calculation in the query condition; otherwise the index cannot be used, every row has to be evaluated during retrieval, and the cost is very high.
  5. Prefer extending an existing index to creating a new one.
  6. Build indexes on columns that often appear in WHERE clauses to speed up filtering.
  7. Composite index: an index that contains multiple columns; a given query does not necessarily use all of them.
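
The leftmost prefix rule can be observed by asking a database for its query plan. The sketch below is an analogy rather than a MySQL session: it uses SQLite's `EXPLAIN QUERY PLAN` through Python's sqlite3 module, and the table `t`, the index `idx_abcd`, and the extra `payload` column are hypothetical. SQLite's planner differs from MySQL's in detail, and the plan wording varies between SQLite versions.

```python
import sqlite3

def make_db():
    """Create an empty table with a composite index on (a, b, c, d)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER, d INTEGER, payload TEXT)"
    )
    conn.execute("CREATE INDEX idx_abcd ON t (a, b, c, d)")
    return conn

def query_plan(conn, sql):
    """Return SQLite's plan description for sql as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return "; ".join(row[3] for row in rows)

# Conditions start at the leftmost index column, so the index is usable.
WITH_PREFIX = "SELECT * FROM t WHERE a = 1 AND b = 1 AND c > 1 AND d = 1"
# Only the last index column is constrained, so there is no usable prefix.
WITHOUT_PREFIX = "SELECT * FROM t WHERE d = 1"

if __name__ == "__main__":
    db = make_db()
    print(query_plan(db, WITH_PREFIX))
    print(query_plan(db, WITHOUT_PREFIX))
```

The first plan should mention `idx_abcd`, while the second falls back to scanning the table. In MySQL the equivalent check is to run EXPLAIN on each query and compare the key column.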

The advantages of indexing

  1. By creating a unique index, the uniqueness of each row of data in the database table can be guaranteed.
  2. The retrieval speed of data can be greatly accelerated.
  3. Can speed up table-to-table joins.

B-trees and B+ trees

  1. The non-leaf nodes of a B+ tree do not store pointers to records; they hold keys only for navigation. Each node therefore fits more keys, the tree has fewer levels, and lookups are more efficient.
  2. The leaf nodes of a B+ tree hold all keys, including those that appear in parent nodes, together with their record pointers. Data is always reached at a leaf node, so the cost of a lookup is stable.
  3. Traversing all entries is faster in a B+ tree, because the leaf nodes are linked and can be scanned in order.
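
These three properties can be seen in a very small model. The sketch below is a hand-built, read-only two-level tree, not a full B+ tree implementation (there is no insertion, splitting, or balancing); the class names, keys, and record labels are all hypothetical.

```python
from bisect import bisect_left, bisect_right

class Leaf:
    """Leaf node: holds keys with their record pointers, linked to the next leaf."""
    def __init__(self, keys, values):
        self.keys = keys
        self.values = values
        self.next = None

class Internal:
    """Non-leaf node: separator keys and child pointers only, no record data."""
    def __init__(self, keys, children):
        self.keys = keys          # keys[i] is the smallest key in children[i + 1]
        self.children = children  # len(children) == len(keys) + 1

def find_leaf(node, key):
    """Descend from the root to the leaf that would contain key."""
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node

def lookup(root, key):
    """Every lookup ends at a leaf, so the path length is always the same."""
    leaf = find_leaf(root, key)
    i = bisect_left(leaf.keys, key)
    if i < len(leaf.keys) and leaf.keys[i] == key:
        return leaf.values[i]
    return None

def range_scan(root, low, high):
    """Collect (key, value) pairs with low <= key <= high by walking linked leaves."""
    result = []
    leaf = find_leaf(root, low)
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > high:
                return result
            if k >= low:
                result.append((k, v))
        leaf = leaf.next
    return result

def build_demo_tree():
    """Build a fixed three-leaf tree by hand for the demonstration."""
    l1 = Leaf([1, 3], ["r1", "r3"])
    l2 = Leaf([5, 7], ["r5", "r7"])
    l3 = Leaf([9, 11], ["r9", "r11"])
    l1.next, l2.next = l2, l3
    return Internal([5, 9], [l1, l2, l3])
```

Note that the separator keys 5 and 9 appear both in the internal node and in the leaves, and that `range_scan` never returns to the root after finding its first leaf.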

MySQL optimization

  1. MySQL statement optimization

    1. Use LIMIT to restrict the number of rows a query returns
    2. Avoid select *, list the required fields
    3. Use joins instead of subqueries
    4. Split large delete or insert statements
  2. Choose the appropriate data type

    1. Use the smallest data type that can hold the data
    2. Use simple data types
    3. Use reasonable field lengths
    4. Define fields with NOT NULL whenever possible
    5. Use TEXT columns as sparingly as possible
  3. Choose the appropriate index column

    1. Frequently queried columns: columns appearing in WHERE, GROUP BY, ORDER BY, and ON clauses
    2. Short columns
    3. Columns with high dispersion (many distinct values)
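  4. Check server status: SHOW [SESSION|GLOBAL] STATUS LIKE '%STATUS_NAME%';

    1. Com_select: number of SELECT statements executed
    2. Connections: number of connection attempts
    3. Uptime: how long the server has been running, in seconds
    4. Handler_read%: index usage (the Handler_read_* counters)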

    Display system variables: SHOW VARIABLES LIKE '%Variable_name%';

    Display the storage engine status: SHOW ENGINE INNODB STATUS;

    EXPLAIN :

    1. type: the join (access) type
    2. possible_keys: indexes that may be used
    3. key: the actual index used

    PROFILE and other diagnostic statements:

    1. SET PROFILING=ON; enable profiling for the current session
    2. SHOW PROFILES; view the running time of recent SQL statements
    3. SHOW PROCESSLIST; view the current connections of all users
    4. PROCEDURE ANALYSE() gives optimization suggestions for existing tables (removed in MySQL 8.0)
    5. OPTIMIZE TABLE table_name; reclaim unused table space
    6. REPAIR TABLE table_name; repair a corrupted table
    7. CHECK TABLE table_name; check the table for errors

MySQL engine

| MyISAM | InnoDB |
| --- | --- |
| Transactions are not supported | Transactions are supported |
| Foreign keys are not supported | Foreign keys are supported |
| Stores the table's row count in a variable, so counting rows needs no full table scan | Does not store the exact row count, so counting rows requires scanning the whole table |
| Supports table locks | Supports table locks and row locks |
| Supports full-text indexing, with high query efficiency | Full-text indexing is not supported before MySQL 5.6 |

The statement that counts all rows in a table:

SELECT COUNT(*) FROM table_name;

Table lock: every operation locks the entire table. Deadlocks do not occur, but concurrency is poor.

Row lock: each operation locks only the rows involved. Deadlocks are possible, but lock conflicts are less likely and concurrency is good.

InnoDB implements row locks on index records. If a statement cannot use an index, it has to lock every row it scans, which in effect locks the whole table.


MySQL exercises

  1. Given a Person table and an Address table, write an SQL query that reports FirstName, LastName, City, and State for every person, whether or not that person has address information:

    SELECT p.FirstName, p.LastName, addr.City, addr.State
    FROM Person p LEFT JOIN Address addr
    ON p.PersonId = addr.PersonId
    

    LEFT JOIN returns all the records from the left table together with the matching records from the right table, so the columns taken from the right table may be NULL

  2. Write an SQL query to get the second highest salary (Salary) in the Employee table.

    SELECT MAX(E.Salary) as SecondHighestSalary
    FROM Employee E
    WHERE E.Salary < (SELECT MAX(EM.Salary) FROM Employee EM)
    
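  3. Given an Employee table, write an SQL query that gets the names of employees who earn more than their managers. In the table above, Joe is the only employee who earns more than his manager.

    -- Assumes columns Id, Name, Salary, ManagerId (table not reproduced here)
    SELECT e.Name AS Employee
    FROM Employee e JOIN Employee m ON e.ManagerId = m.Id
    WHERE e.Salary > m.Salary

    The following query belongs to a different exercise: it finds all duplicate emails in the Person table.

    SELECT p.Email AS Email
    FROM Person p
    GROUP BY p.Email
    HAVING COUNT(*) > 1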
    
  4. A website contains two tables, Customers and Orders. Write an SQL query to find all customers who never order anything.

    SELECT c.Name as Customers
    FROM Customers c
    WHERE c.Id not in (SELECT O.CustomerId FROM Orders O, Customers CC WHERE O.CustomerId = CC.Id)
    
  5. Write an SQL query to delete all duplicate email addresses in the Person table, keeping only the one with the smallest Id.

    DELETE P1 
    FROM Person P1, Person P2
    WHERE P1.Email = P2.Email and P1.Id > P2.Id
    
  6. Given a Weather table, write an SQL query to find the Ids of all dates with a higher temperature than the previous (yesterday's) date.

    SELECT W1.Id
    FROM Weather AS W1, Weather AS W2  
    WHERE DATEDIFF(W1.RecordDate, W2.RecordDate) = 1 AND W1.Temperature > W2.Temperature
    
  7. There is a courses table with two columns: student and class.

    List all classes with 5 or more students.

    SELECT C.class
    FROM Courses C
    GROUP BY C.class
    HAVING COUNT(DISTINCT(C.student)) >= 5
    
  8. Given a salary table in which sex takes the values m = male and f = female, swap all f and m values (that is, change all f values to m and vice versa). Use only a single UPDATE statement and no intermediate temporary table.

    UPDATE Salary S SET S.sex = (
        CASE 
            WHEN S.sex = 'm' THEN 'f' 
            WHEN S.sex = 'f' THEN 'm' 
        END)
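
Several of the answers above use only standard SQL, so they can be tried without a MySQL server. The sketch below runs the queries from exercises 2, 4, and 7 against an in-memory SQLite database through Python's sqlite3 module. All sample rows are invented for the test, and the column layouts are assumptions inferred from the queries themselves; the MySQL-specific answers (multi-table DELETE, DATEDIFF) are left out because SQLite does not accept them.

```python
import sqlite3

SECOND_HIGHEST = """
SELECT MAX(E.Salary) AS SecondHighestSalary
FROM Employee E
WHERE E.Salary < (SELECT MAX(EM.Salary) FROM Employee EM)
"""

NEVER_ORDERED = """
SELECT c.Name AS Customers
FROM Customers c
WHERE c.Id NOT IN (SELECT O.CustomerId FROM Orders O, Customers CC WHERE O.CustomerId = CC.Id)
"""

BIG_CLASSES = """
SELECT C.class
FROM Courses C
GROUP BY C.class
HAVING COUNT(DISTINCT(C.student)) >= 5
"""

def make_db():
    """Create hypothetical sample tables matching the columns the queries use."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Employee (Id INTEGER PRIMARY KEY, Salary INTEGER);
        INSERT INTO Employee VALUES (1, 100), (2, 200), (3, 300), (4, 300);

        CREATE TABLE Customers (Id INTEGER PRIMARY KEY, Name TEXT);
        CREATE TABLE Orders (Id INTEGER PRIMARY KEY, CustomerId INTEGER);
        INSERT INTO Customers VALUES (1, 'Joe'), (2, 'Henry'), (3, 'Sam');
        INSERT INTO Orders VALUES (1, 3), (2, 1);

        CREATE TABLE Courses (student TEXT, class TEXT);
        INSERT INTO Courses VALUES
            ('A', 'Math'), ('B', 'Math'), ('C', 'Math'), ('D', 'Math'),
            ('E', 'Math'), ('A', 'Math'), ('A', 'Art');
    """)
    return conn

def run(conn, sql):
    """Return the first column of every result row."""
    return [row[0] for row in conn.execute(sql)]

if __name__ == "__main__":
    db = make_db()
    for name, sql in [("second highest", SECOND_HIGHEST),
                      ("never ordered", NEVER_ORDERED),
                      ("big classes", BIG_CLASSES)]:
        print(name, run(db, sql))
```

The sample data includes a tied top salary and a repeated (student, class) row on purpose, since those are the cases where `MAX` with a strict `<` filter and `COUNT(DISTINCT ...)` matter.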
    

How to choose an engine

  1. When you need to support transactions, choose InnoDB
  2. Choose MyISAM if most of the workload is read-only queries, and choose InnoDB if the tables are read and written frequently.
  3. MyISAM is harder to recover after a system crash; consider whether that is acceptable.

