MySQL basics: an organized review


SQL execution order

FROM → ON → WHERE → GROUP BY → HAVING → SELECT → DISTINCT → ORDER BY → LIMIT

Bulk update

UPDATE table_a a, table_b b SET a.field = b.field WHERE a.id = b.id
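A runnable sketch of the bulk update above. SQLite (bundled with Python, used here so the example executes) does not support MySQL's multi-table `UPDATE a, b SET ...` syntax, so the portable correlated-subquery equivalent is shown; table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER PRIMARY KEY, price REAL);
CREATE TABLE b (id INTEGER PRIMARY KEY, price REAL);
INSERT INTO a VALUES (1, 0.0), (2, 0.0), (3, 0.0);
INSERT INTO b VALUES (1, 9.5), (2, 7.0);
""")

# MySQL form:  UPDATE a, b SET a.price = b.price WHERE a.id = b.id
# Portable equivalent: correlated subquery, restricted to matching rows
conn.execute("""
UPDATE a
SET price = (SELECT b.price FROM b WHERE b.id = a.id)
WHERE id IN (SELECT id FROM b)
""")

# id 3 has no match in b, so it keeps its old value
rows = conn.execute("SELECT id, price FROM a ORDER BY id").fetchall()
```

Without the `WHERE id IN (...)` guard, unmatched rows in `a` would be overwritten with NULL, which is the classic bug in this pattern.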

MySQL 8 recursive queries

WITH RECURSIVE cte AS (
    SELECT * FROM t WHERE id = ?
    UNION ALL
    SELECT child.* FROM t child JOIN cte ON child.parent_id = cte.id
)
SELECT * FROM cte;

1. Define a CTE; the CTE holds the final result, i.e. the tree we build up recursively. RECURSIVE marks the CTE as recursive.
2. The first SELECT is the seed (anchor) result set.
3. The second SELECT is the recursive part: it joins the previous iteration's result set against the table to produce a new result set, repeating until the recursive part returns no rows, which ends the query.
4. Finally, UNION ALL merges the result sets of all the steps above. **UNION DISTINCT would additionally deduplicate.**

Get all the rows with SELECT * FROM cte.
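The steps above can be exercised end to end. This sketch uses SQLite via Python's `sqlite3`, which shares MySQL 8's `WITH RECURSIVE` syntax; the `dept` table and its `parent_id` column are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dept (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
INSERT INTO dept VALUES
  (1, NULL, 'company'),
  (2, 1, 'engineering'),
  (3, 1, 'sales'),
  (4, 2, 'backend');
""")

# Seed: the root row. Recursive part: children of the previous iteration.
# UNION ALL keeps duplicates; UNION would deduplicate.
tree = conn.execute("""
WITH RECURSIVE cte AS (
    SELECT * FROM dept WHERE id = ?
    UNION ALL
    SELECT d.* FROM dept d JOIN cte ON d.parent_id = cte.id
)
SELECT id, name FROM cte
""", (1,)).fetchall()
```

Starting from `id = 1`, the recursion picks up both children and then the grandchild, stopping once no new children are found.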

SQL classification

1. DQL: data query language
2. DML: data manipulation language
3. DDL: data definition language
4. TCL: transaction control language


Common single-row functions

  • IFNULL(expr, default): replace NULL with a custom value
  • LEFT(field, n) / RIGHT(field, n): leftmost/rightmost n characters
  • SUBSTR(): substring
  • CONCAT(): string concatenation
  • LENGTH(): string length
  • NOW(): current date and time
  • STR_TO_DATE(): convert a string to date format
  • MOD(): modulo
  • TRUNCATE(): truncate a number
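A few of these functions can be tried directly. The sketch below uses SQLite via `sqlite3`, limited to functions whose names and behavior match MySQL (`IFNULL`, `SUBSTR`, `LENGTH`); SQLite writes modulo as `%` where MySQL also offers `MOD(a, b)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = conn.execute("""
SELECT IFNULL(NULL, 'default'),   -- replace NULL with a fallback value
       SUBSTR('database', 1, 4),  -- substring: 4 chars starting at position 1
       LENGTH('mysql'),           -- string length
       23 % 5                     -- modulo (MOD(23, 5) in MySQL)
""").fetchone()
```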

Aggregate functions

1. SUM
2. AVG
3. MAX
4. MIN
5. COUNT — COUNT(condition OR NULL) counts only the rows matching the condition

  • SUM and AVG apply to numeric types
  • MAX, MIN, and COUNT handle any type
  • Non-aggregated fields used with an aggregate function must appear after GROUP BY
  • Aggregate functions are computed after GROUP BY
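The `COUNT(condition OR NULL)` trick relies on COUNT skipping NULLs: the expression is true (counted) when the condition holds and NULL (skipped) otherwise. A runnable sketch using SQLite, with a made-up `orders` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
INSERT INTO orders (amount) VALUES (10), (250), (80), (500);
""")

# COUNT(*) counts all rows; COUNT(amount > 100 OR NULL) counts only
# rows where the condition is true, because false OR NULL is NULL.
total, big = conn.execute(
    "SELECT COUNT(*), COUNT(amount > 100 OR NULL) FROM orders"
).fetchone()
```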

Grouping

GROUP BY ... HAVING ...

WHERE filters rows before grouping and cannot reference aggregate functions; the conditions after HAVING must be aggregate functions or columns that appear after SELECT.



Control flow functions

IF(condition, value1, value2): returns value1 if the condition is true, otherwise value2

Simple CASE is similar to Java's switch:

CASE field WHEN constant THEN result [WHEN ... THEN ...] ELSE default END

Searched CASE is similar to Java's if...else:

CASE WHEN condition THEN result [WHEN ... THEN ...] ELSE default END
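Both CASE forms are standard SQL and run unchanged in SQLite, so they can be demonstrated with `sqlite3`; the constants are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simple CASE (switch-like): compares one expression against constants
grade, = conn.execute(
    "SELECT CASE 2 WHEN 1 THEN 'gold' WHEN 2 THEN 'silver' ELSE 'none' END"
).fetchone()

# Searched CASE (if/else-like): evaluates boolean conditions in order,
# returning the result of the first one that is true
band, = conn.execute(
    "SELECT CASE WHEN 85 >= 90 THEN 'A' WHEN 85 >= 80 THEN 'B' ELSE 'C' END"
).fetchone()
```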


Inner join

The join result contains only rows that satisfy the join condition; both tables participating in the join must satisfy it.

Equi-join (inner join)

SELECT a.col, b.col FROM table1 a INNER JOIN table2 b ON a.id = b.id

Non-equi-join (inner join)

SELECT a.col, b.col FROM table1 a INNER JOIN table2 b ON a.col BETWEEN b.field1 AND b.field2

Outer join

The join result contains not only the rows that satisfy the join condition but also the unmatched rows from the preserved side.

Left/right outer join

SELECT * FROM table_a a LEFT JOIN table_b b ON a.id = b.id
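The difference between the two join kinds is easy to see side by side. A runnable sketch using SQLite, with made-up `emp`/`dept` tables; the employee with no department only survives the LEFT JOIN:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER);
CREATE TABLE dept (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO dept VALUES (1, 'eng'), (2, 'sales');
INSERT INTO emp VALUES (1, 'ann', 1), (2, 'bob', 2), (3, 'eve', NULL);
""")

# Inner join: only rows satisfying the ON condition
inner = conn.execute("""
SELECT e.name, d.name FROM emp e INNER JOIN dept d ON e.dept_id = d.id
ORDER BY e.id
""").fetchall()

# Left join: also keeps unmatched rows from the left table, NULL-padded
left = conn.execute("""
SELECT e.name, d.name FROM emp e LEFT JOIN dept d ON e.dept_id = d.id
ORDER BY e.id
""").fetchall()
```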


Subquery

A SELECT statement that appears inside another statement is called a subquery.

Where subqueries appear

  • after SELECT, as a scalar expression
  • in FROM, as a derived table
  • in WHERE or HAVING, as a scalar subquery
  • with IN / ANY / ALL, as a column subquery
  • with EXISTS
  • in UNION queries

in, not in, exists, not exists


in

Suitable when the outer table is large and the inner table is small. IN determines whether a given value matches a value in the subquery result or the list. With IN, the subquery's table is queried first, then the inner result is joined (a Cartesian product) against the outer table and filtered by the condition.


exists

Suitable when the outer table is small and the inner table is large; EXISTS runs the subquery once per outer row (outer_table.length times). The main table is queried first, then for each of its rows the condition after WHERE is evaluated against the subquery: true keeps the row, false discards it.

not in

With NOT IN, both the inner and outer tables are fully scanned and no index is used.

not exists

The subquery can still use its index, so NOT EXISTS is generally more efficient than NOT IN.

in, exists difference
  • When the subquery produces few records and the table in the main query is large and indexed, use IN
  • When the main query has few records and the table in the subquery is large and indexed, use EXISTS
  • IN and EXISTS mainly change the driving order. With EXISTS the outer table is the driving table and is accessed first; with IN the subquery executes first. Aim for fast returns from the driven table, then weigh the index against the result-set size. In addition, IN does not handle NULL.
  • IN performs a hash join between the outer and inner tables, while EXISTS loops over the outer table and probes the inner table on each iteration
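Whatever the execution strategy, the two forms return the same rows for a plain semi-join. A runnable sketch using SQLite, with made-up `customer`/`orders` tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
INSERT INTO customer VALUES (1, 'ann'), (2, 'bob'), (3, 'eve');
INSERT INTO orders VALUES (1, 1), (2, 1), (3, 3);
""")

# IN: the subquery result is produced first, then matched
with_in = conn.execute("""
SELECT name FROM customer WHERE id IN (SELECT customer_id FROM orders)
""").fetchall()

# EXISTS: the correlated subquery is probed once per outer row
with_exists = conn.execute("""
SELECT name FROM customer c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
""").fetchall()
```

Note that even though customer 1 has two orders, each form returns the row once: a semi-join never duplicates outer rows.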


Constraints

1. NOT NULL: non-null constraint
2. DEFAULT: default value
3. PRIMARY KEY: primary key constraint
4. UNIQUE: unique constraint
5. CHECK: check constraint
6. FOREIGN KEY: foreign key constraint


Transactions (ACID)

1. Atomicity: a transaction is indivisible; either all of it executes or none of it does
2. Consistency: executing a transaction moves the data from one consistent state to another
3. Isolation: a transaction's execution is not interfered with by other transactions
4. Durability: once a transaction commits, its changes to the database are permanent
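Atomicity in particular is easy to demonstrate: if anything fails mid-transaction, a rollback restores the pre-transaction state. A runnable sketch using SQLite transactions, with a made-up `account` table and a simulated failure:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 100.0), (2, 50.0)")
conn.commit()

# A transfer must be atomic: both updates commit, or neither does
try:
    conn.execute("UPDATE account SET balance = balance - 80 WHERE id = 1")
    raise RuntimeError("simulated failure before the second update")
    conn.execute("UPDATE account SET balance = balance + 80 WHERE id = 2")
    conn.commit()
except RuntimeError:
    conn.rollback()  # undo the partial update

balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
```

After the rollback both balances are unchanged; without it, 80 would have vanished from account 1.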


Views

Create a view

CREATE VIEW view_name AS select_statement

Modify a view

ALTER VIEW view_name AS select_statement

Delete views

DROP VIEW view_name [, view_name ...]

View the view structure

DESC view_name
SHOW CREATE VIEW view_name
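Creating, querying, and dropping a view can be sketched with SQLite, whose CREATE VIEW / DROP VIEW match MySQL (SQLite has no ALTER VIEW or SHOW CREATE VIEW; it stores view definitions in `sqlite_master` instead). The `emp` table and `high_paid` view are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT, salary REAL);
INSERT INTO emp VALUES (1, 'ann', 9000), (2, 'bob', 4000);
CREATE VIEW high_paid AS SELECT name FROM emp WHERE salary > 5000;
""")

# A view is queried exactly like a table
rows = conn.execute("SELECT * FROM high_paid").fetchall()

conn.execute("DROP VIEW high_paid")
```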


Indexes

An index is a sorted data structure that helps MySQL retrieve data efficiently; its purpose is to speed up queries.

  • Improves data-retrieval efficiency and reduces the database's IO cost
  • Data sorted by the index column lowers sorting cost and CPU consumption
  • An index is itself a table: it stores the indexed field(s) plus pointers to the rows of the base table, so index columns take up space
  • Indexes speed up queries but slow down table updates
Index types
  • Ordinary index: the most basic index
  • Composite index: an index built on multiple fields; speeds up searches that satisfy the index's columns
  • Unique index: index-column values must be unique; NULL values are allowed
  • Composite unique index: the combination of column values must be unique
  • Primary key index: uniquely identifies a record; NULL values are not allowed
  • Full-text index: for searching large volumes of text


Create an index

CREATE [UNIQUE] INDEX index_name ON table_name (column_name)
ALTER TABLE table_name ADD [UNIQUE] INDEX index_name (column_name)


Drop an index

DROP INDEX index_name ON table_name


Show indexes

SHOW INDEX FROM table_name

Create a composite index

CREATE INDEX index_name ON table_name (col1, col2, ...)
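Whether a composite index is usable can be checked from the query plan. This sketch uses SQLite's `EXPLAIN QUERY PLAN` (MySQL's equivalent is `EXPLAIN`); the `user` table and `idx_city_age` index are made up. Skipping the leading column violates the leftmost-prefix rule and forces a full scan:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, city TEXT, age INTEGER)")
conn.execute("CREATE INDEX idx_city_age ON user (city, age)")

# Both indexed columns present, leading column first: index is usable
plan_ok = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM user WHERE city = 'NY' AND age = 30"
).fetchall()

# Leading column skipped: leftmost-prefix rule violated, full scan
plan_bad = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM user WHERE age = 30"
).fetchall()

uses_index = any("idx_city_age" in row[-1] for row in plan_ok)
full_scan = any("SCAN" in row[-1] for row in plan_bad)
```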
Index data structures
  • Btree
  • Hash

Differences between Btree and Hash
  • Hash suits only equality lookups; range queries are not possible
  • Hash cannot use the index to complete sorting
  • Hash does not support the leftmost-prefix rule of multi-column composite indexes
  • With many duplicate keys, hash collisions make a hash index very inefficient


Hash index

A hash index is backed by a hash table, a key-value data structure. Indexed entries are stored with no ordering relationship between them, so a hash index can only serve equality lookups.

B+ tree

A B+ tree is a multi-way balanced search tree whose nodes are naturally ordered: **the left child is smaller than its parent, and the parent is smaller than the right child.** A B+ tree therefore does not have to scan the whole table when performing range queries.

When to use an index
  • The primary key automatically gets a unique index
  • Fields frequently used as query conditions should be indexed
  • Fields used to join other tables (foreign-key relationships) should be indexed
  • Fields used for statistics or grouping in queries
  • Composite indexes are more cost-effective than multiple single-column indexes
  • Sort fields in queries: sorting through an index greatly improves sort speed
  • Do not create an index on a field that never appears in a WHERE condition
Cases where indexes do not help
  • Tables with too few records
  • Tables with frequent inserts, deletes, and updates
  • Frequently updated fields
Index failure
  • LIKE on the index field with a leading wildcard
  • OR where one side of the condition is not an indexed field
  • Composite index whose first column is not used in the query
  • Implicit data-type conversion
  • IS NULL / IS NOT NULL on the index column (the indexed field should be declared NOT NULL)
  • NOT, <>, or != on the index field
  • Functions or calculations applied to the index field
  • When a full table scan is faster anyway
Avoiding index failure
  • Full match
  • Best left-prefix rule
  • Do not perform any calculation on the index column
  • The storage engine cannot use index columns to the right of a range condition
  • Prefer covering indexes; avoid SELECT *
  • Avoid !=, <>, IS NULL, and IS NOT NULL, which MySQL cannot use an index for
  • Put the LIKE wildcard at the end, not the beginning
  • Quote strings with single quotes to avoid implicit conversion
  • Use OR sparingly
Index optimization
  • Use covering indexes where reasonable
  • Indexing a field with very low selectivity is pointless
  • Strings can use prefix indexes; keep the prefix within 5-8 characters
  • Keep a single table to at most 5 indexes, and a single index to at most 5 fields
  • Pagination matters: if a query would touch more than about 30% of the rows, MySQL will not use the index

Full match

The query conditions cover the same columns as the composite index defines.

Best left-prefix rule

Start the query from the leftmost column of the index and do not skip columns in the index.

join query optimization

MySQL implements join with the Nested Loop Join algorithm: the result set of the driving table serves as the base data, each of its rows is used as a filter condition to look up matching rows in the next table, and the results are merged. With multiple joins, the previous result set becomes the loop data for querying the next table.
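The nested-loop idea described above can be sketched in a few lines of plain Python; tables are modeled as lists of dicts, and the names are illustrative:

```python
# Minimal Nested Loop Join sketch: each row of the driving table probes
# the driven table; matching pairs are merged into the result.
def nested_loop_join(driving, driven, key_left, key_right):
    result = []
    for outer in driving:          # one pass over the driving table
        for inner in driven:       # full probe of the driven table per row
            if outer[key_left] == inner[key_right]:
                result.append({**outer, **inner})
    return result

emps = [{"emp": "ann", "dept_id": 1}, {"emp": "bob", "dept_id": 2}]
depts = [{"dept_id": 1, "dept": "eng"}, {"dept_id": 2, "dept": "sales"}]

joined = nested_loop_join(emps, depts, "dept_id", "dept_id")
# Inner-loop iterations = len(driving) * len(driven): this is why the
# small table should drive the large one, and why an index on the driven
# table's join field (replacing the inner scan with a lookup) matters.
```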

  • For left/right outer joins, the index goes on the opposite side: add it to the driven table
  • Minimize the total number of NestedLoop iterations in the join statement; always let the small table drive the big table
  • Prioritize optimizing the inner loop of the NestedLoop
  • Ensure that the join field on the driven table in the join statement is indexed
  • When the driven table's join field cannot be indexed and memory is sufficient, increase join_buffer_size
  • Add query conditions on both the driving and driven tables where possible; prefer filtering in ON over WHERE, and drive a large result set with a small one
  • Index all join fields of the driven table; if an index cannot be built, set a sufficient join buffer size
  • Avoid joining more than three tables; consider adding verified redundant fields instead


Query optimization
  • Small table drives big table
  • IN: when the outer table's data set is larger than the inner table's
  • EXISTS: when the outer table's data set is smaller than the inner table's
ORDER BY optimization

MySQL supports two sort methods, filesort and index sort; index sort is far more efficient.

Index sort
1. The ORDER BY clause satisfies the index's leftmost-prefix rule
2. The combination of the WHERE clause columns and the ORDER BY clause columns satisfies the index's leftmost-prefix rule

Index sorting is triggered when the ORDER BY fields match the composite index's columns in position, order, and count, and all fields sort in the same direction; otherwise MySQL falls back to filesort.

Avoid SELECT * when using ORDER BY.

GROUP BY optimization
1. Roughly the same rules as ORDER BY
2. WHERE runs before HAVING; put any condition that can go in WHERE there rather than in HAVING
LIMIT optimization

LIMIT takes two parameters: the offset of the first row to return and the number of rows to fetch.

LIMIT pagination gets slower the further back you page.

Reduce the scan range. A deep page like this scans and discards 100000 rows:

SELECT * FROM table ORDER BY id DESC LIMIT 100000, 10

First filter on the id to narrow the scope of the query:

SELECT * FROM table WHERE id <= (SELECT id FROM table ORDER BY id DESC LIMIT 100000, 1) ORDER BY id DESC LIMIT 10

If the query condition is only the primary key id:

SELECT id FROM table WHERE id BETWEEN 100000 AND 100010 ORDER BY id DESC

Cursors: JDBC can use cursors to implement paging queries.
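The deferred-lookup rewrite can be verified to return the same page as the naive form. A runnable sketch using SQLite (which shares MySQL's `LIMIT offset, count` syntax), with a made-up single-column table and a smaller offset:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(1, 1001)])

# Naive deep pagination: scans and discards the first 500 rows
naive = conn.execute(
    "SELECT id FROM t ORDER BY id DESC LIMIT 500, 10"
).fetchall()

# Deferred lookup: find the boundary id first, then range-scan from it
fast = conn.execute("""
SELECT id FROM t
WHERE id <= (SELECT id FROM t ORDER BY id DESC LIMIT 500, 1)
ORDER BY id DESC LIMIT 10
""").fetchall()
```

The inner query only reads the primary key index, so the expensive skip never touches full rows; both queries return ids 500 down to 491.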

LIKE optimization

LIKE optimization is mostly a matter of preventing index failure. If you must match a wildcard on both sides, try a FULLTEXT index.

<> and OR optimization

The operators <> and OR cannot hit an index; UNION can be used instead.

TEXT type optimization

Extract TEXT columns into a separate child table, associated with the main table by the business primary key.

Avoid null values

A NULL field in MySQL still occupies space and makes indexes and index statistics more complicated. Updating a NULL value to a non-NULL one cannot be done in place; if a page split occurs, performance suffers. Use a meaningful default value instead of NULL wherever possible; this also keeps IS NOT NULL out of SQL statements.

Basic principles
  • Use indexes fully but do not abuse them; indexes also consume disk and CPU
  • Formatting data with database functions is not recommended
  • Foreign key constraints are not recommended; ensure data accuracy in application code
  • For write-heavy, read-light scenarios, a unique index is not recommended
  • Add appropriate redundant fields and intermediate tables, computing intermediate results in the program, to trade space for time
  • Do not run extremely time-consuming transactions online; split them into smaller transactions in cooperation with the application
  • Let the database do less work and the program do more