

MySQL paging query method for millions of data and its optimization suggestions

May 07, 2021, 3:09 PM
mysql Paging query


Database SQL optimization is a perennial topic. When faced with paged queries over millions of rows, what are the good optimization options? Several commonly used methods are listed below for your reference.

Method 1: Directly use the SQL statement provided by the database

  • Statement style: In MySQL, the following method is available: SELECT * FROM table_name LIMIT M, N
  • Suitable scenarios: small data volumes. On large tables it is very slow, and in some databases the result set is unstable (one run returns 1, 2, 3; another returns 2, 1, 3). LIMIT takes N rows starting from position M of the result set and discards the rest.

Method 2: Create a primary key or unique index and use the index (assuming 10 entries per page)

Statement style: In MySQL, the following method can be used: SELECT * FROM table_name WHERE id_pk > (pageNum*10) LIMIT M

Suitable scenarios: large data volumes (tens of thousands of tuples or more)
  • Reason: an index scan will be very fast. One reader pointed out that because the query result is not sorted by pk_id, rows can be missed, so the only way is Method 3

Method 3: Reorder based on the index

Statement style: In MySQL, the following method is available: SELECT * FROM table_name WHERE id_pk > (pageNum*10) ORDER BY id_pk ASC LIMIT M

Suitable scenarios: large data volumes (tens of thousands of tuples or more). It is best if the column after ORDER BY is the primary key or a unique index, so that the ORDER BY operation can be eliminated by using the index while the result set remains stable (for the meaning of stability, see Method 1)
  • Reason: an index scan will be very fast. But at the time, MySQL's sort optimization only applied to ASC, not DESC (the DESC available then was fake; real DESC was promised for the future).

Method 4: Use prepare based on an index

Statement style: In MySQL, the following method can be used: PREPARE stmt_name FROM 'SELECT * FROM table_name WHERE id_pk > (?*?) ORDER BY id_pk ASC LIMIT M'

The first placeholder stands for pageNum, the second for the number of tuples per page.

Suitable scenarios: large data volumes
  • Reason: an index scan will be very fast, and a prepared statement is faster than an ordinary query statement.
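A sketch of this prepared-statement idea at the driver level; sqlite3 stands in for MySQL here, and the table and column names follow the statement above. The page boundary and page size are bound as parameters rather than interpolated into the SQL, so the statement is parsed once and reused:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (id_pk INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany("INSERT INTO table_name (payload) VALUES (?)",
                 [(f"row{i}",) for i in range(1, 101)])

def fetch_page(page_num, page_size):
    # pageNum and page size are bound placeholders; (? * ?) computes the
    # boundary id exactly as in the PREPARE statement above.
    return conn.execute(
        "SELECT id_pk, payload FROM table_name "
        "WHERE id_pk > ? * ? ORDER BY id_pk ASC LIMIT ?",
        (page_num, page_size, page_size),
    ).fetchall()

print(fetch_page(2, 10))  # ids 21..30
```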

Method 5: Use MySQL's support for ORDER operations on indexes to locate tuples quickly and avoid a full table scan

For example: read tuples in rows 1000 to 1019 (pk is the primary key/unique key).

SELECT * FROM your_table WHERE pk>=1000 ORDER BY pk ASC LIMIT 0,20
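A minimal runnable sketch of this seek technique, with sqlite3 standing in for MySQL (your_table and the 2000-row fill are illustrative), comparing it against the equivalent OFFSET query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (pk INTEGER PRIMARY KEY, val TEXT)")
conn.executemany("INSERT INTO your_table (val) VALUES (?)",
                 [(f"v{i}",) for i in range(1, 2001)])

# Offset paging: the engine must step over the first 999 rows.
offset_page = conn.execute(
    "SELECT pk FROM your_table ORDER BY pk LIMIT 20 OFFSET 999").fetchall()

# Seek paging: the index jumps straight to pk >= 1000, no rows discarded.
seek_page = conn.execute(
    "SELECT pk FROM your_table WHERE pk >= 1000 ORDER BY pk ASC LIMIT 20"
).fetchall()

assert offset_page == seek_page  # both return rows 1000..1019
```

The seek form only works when paging by a monotonically increasing key with no gaps in the page boundaries; with gaps, the boundary must come from the last row of the previous page instead.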

Method 6: Use a subquery/join on the index to locate the tuples quickly, then read the tuples again.

For example (id is the primary key/unique key; $page and $pagesize are variables). Subquery example:

SELECT * FROM your_table WHERE id <=
(SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1)
ORDER BY id DESC LIMIT $pagesize

Join example:

SELECT * FROM your_table AS t1
JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT ($page-1)*$pagesize, 1) AS t2
WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT $pagesize;
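Both forms can be sketched end to end; sqlite3 stands in for MySQL here, and the table contents plus the $page/$pagesize values are illustrative (the driver's ? placeholders take the place of the interpolated variables):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER PRIMARY KEY, data TEXT)")
conn.executemany("INSERT INTO your_table (data) VALUES (?)",
                 [(f"d{i}",) for i in range(1, 1001)])

page, pagesize = 3, 10
offset = (page - 1) * pagesize

# Subquery form: find the boundary id with an index-only scan, then take
# the page at or below it.
sub = conn.execute(
    "SELECT * FROM your_table WHERE id <= "
    "(SELECT id FROM your_table ORDER BY id DESC LIMIT 1 OFFSET ?) "
    "ORDER BY id DESC LIMIT ?",
    (offset, pagesize)).fetchall()

# Join form: page over the narrow id list first, then join back to fetch
# the remaining columns.
joined = conn.execute(
    "SELECT t1.* FROM your_table AS t1 "
    "JOIN (SELECT id FROM your_table ORDER BY id DESC LIMIT 1 OFFSET ?) AS t2 "
    "WHERE t1.id <= t2.id ORDER BY t1.id DESC LIMIT ?",
    (offset, pagesize)).fetchall()

assert sub == joined  # same page either way
```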

MySQL uses limit for paging over large amounts of data; as the page number increases, query efficiency drops.

Test experiment

1. Directly use the limit start, count paging statement, which is also the method used in my program:

select * from product limit start, count

When the starting offset is small, the query has no performance problem. Let's look at the execution times of paging starting from offsets 10, 100, 1000, and 10000 (20 entries per page):

select * from product limit 10, 20 (0.016 seconds)
select * from product limit 100, 20 (0.016 seconds)
select * from product limit 1000, 20 (0.047 seconds)
select * from product limit 10000, 20 (0.094 seconds)

We can see that as the starting record increases, the time also increases. This shows that the paging statement limit is closely tied to the starting offset. Now let's move the starting record to 400,000 (roughly the middle of the records):

select * from product limit 400000, 20 (3.229 seconds)

Now look at the time taken to fetch the last page of records:

select * from product limit 866613, 20 (37.44 seconds)

This is near the largest page number for this table; clearly, this much time is intolerable.

We can also conclude two things from this:

The query time of a limit statement is proportional to the position of the starting record

MySQL's limit statement is very convenient, but it is not suitable for direct use on tables with many records.
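These two conclusions can be reproduced with a small sketch; sqlite3 stands in for MySQL here, and the product table and row count are made up for illustration:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO product (name) VALUES (?)",
                 [(f"p{i}",) for i in range(200_000)])

def page_at(start, count=20):
    """Return (elapsed_seconds, rows) for one OFFSET-based page."""
    t0 = time.perf_counter()
    rows = conn.execute(
        "SELECT * FROM product LIMIT ? OFFSET ?", (count, start)).fetchall()
    return time.perf_counter() - t0, rows

# The engine still produces and discards the first `start` rows,
# so elapsed time grows with the offset.
for start in (10, 1_000, 100_000):
    elapsed, rows = page_at(start)
    print(f"offset {start:>7}: {elapsed:.4f}s, {len(rows)} rows")
```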

2. Performance optimization methods for the limit paging problem

Use the table's covering index to speed up the paging query. As we all know, if a query statement only touches the index columns (a covering index), the query will be very fast.

Because index search has its own optimization algorithms, and the data lives in the index itself, there is no need to look up the row addresses, which saves a lot of time. In addition, MySQL has an index cache, which works even better under high concurrency.

In our example, we know that the id field is the primary key, and naturally includes the default primary key index. Now let's see how the query using the covering index performs.

This time we query the data of the last page (using a covering index, including only the id column), as follows:

select id from product limit 866613, 20 (0.2 seconds)

Compared with the 37.44 seconds for querying all columns, this is an improvement of more than 100 times.

So if we also want to query all columns, there are two methods: one uses the id >= form, the other uses a join. Let's look at the actual results:

SELECT * FROM product WHERE ID >= (select id from product limit 866613, 1) limit 20

The query time is 0.2 seconds!

Another way to write it:

SELECT * FROM product a JOIN (select id from product limit 866613, 20) b ON a.ID = b.id

The query time is also very short!

3. Composite index optimization method

How high can MySQL's performance really go? MySQL is a database well suited to DBA-level experts. For a small system with 10,000 news articles, you can write it any way you like, and an xx framework gives you rapid development. But when the data volume reaches 100,000, a million, or ten million rows, can its performance still be that high? One small mistake may force a rewrite of the whole system, or even stop the system from running at all! OK, enough talk.

Let the facts speak; look at an example:

The data table collect (id, title, info, vtype) has just these 4 fields: title is fixed-length, info is text, id is auto-increment, and vtype is a tinyint with an index on it. This is a simple model of a basic news system. Now fill it with data: 100,000 news articles. In the end collect holds 100,000 records, and the table occupies 1.6 GB of disk.

OK, look at the following SQL statement:

select id,title from collect limit 1000,10;

Very fast; basically done in 0.01 seconds. Now look at this:

select id,title from collect limit 90000,10;

Paging from record 90,000: what's the result?

It takes 8-9 seconds to finish. My god, what went wrong? Actually, the answer for optimizing this can be found online. Look at the following statement:

select id from collect order by id limit 90000,10;

Very fast, done in 0.04 seconds. Why? Because it uses the id primary key as the index, of course it is fast. The fix found online is:

select id,title from collect where id>=(select id from collect order by id limit 90000,1) limit 10;

That is the result of indexing on id. But make the problem just a little more complex and it falls apart. Look at the following statement:

select id from collect where vtype=1 order by id limit 90000,10;

Very slow: it took 8-9 seconds!

At this point I believe many people will feel, like me, on the verge of collapse! Isn't vtype indexed? How can it be slow? It is true that vtype is indexed; running

select id from collect where vtype=1 limit 1000,10;

directly is fast, basically 0.05 seconds, but scale it up 90-fold to start from 90,000, and that would be 0.05 * 90 = 4.5 seconds; the measured 8-9 seconds is in the same order of magnitude.

From here, some people proposed the idea of splitting the table, the same idea used by the Discuz forum. The idea is as follows:

Build an index table t (id, title, vtype) with fixed-length rows, do the paging on it, then go to collect for info using the paged results. Is it feasible? An experiment will tell.

With the 100,000 records in t (id, title, vtype), the table is about 20 MB in size. Using

select id from collect where vtype=1 limit 1000,10;

很快了?;旧?.1-0.2秒可以跑完。為什么會(huì)這樣呢?我猜想是因?yàn)閏ollect 數(shù)據(jù)太多,所以分頁(yè)要跑很長(zhǎng)的路。limit 完全和數(shù)據(jù)表的大小有關(guān)的。其實(shí)這樣做還是全表掃描,只是因?yàn)閿?shù)據(jù)量小,只有10萬才快。OK, 來個(gè)瘋狂的實(shí)驗(yàn),加到100萬條,測(cè)試性能。加了10倍的數(shù)據(jù),馬上t表就到了200多M,而且是定長(zhǎng)。還是剛才的查詢語句,時(shí)間是0.1-0.2秒完成!分表性能沒問題?

錯(cuò)!因?yàn)槲覀兊膌imit還是9萬,所以快。給個(gè)大的,90萬開始

select id from t where vtype=1 order by id limit 900000,10;

看看結(jié)果,時(shí)間是1-2秒!why ?

分表了時(shí)間還是這么長(zhǎng),非常之郁悶!有人說定長(zhǎng)會(huì)提高limit的性能,開始我也以為,因?yàn)橐粭l記錄的長(zhǎng)度是固定的,mysql 應(yīng)該可以算出90萬的位置才對(duì)?。靠墒俏覀兏吖懒薽ysql 的智能,他不是商務(wù)數(shù)據(jù)庫(kù),事實(shí)證明定長(zhǎng)和非定長(zhǎng)對(duì)limit影響不大?怪不得有人說discuz到了100萬條記錄就會(huì)很慢,我相信這是真的,這個(gè)和數(shù)據(jù)庫(kù)設(shè)計(jì)有關(guān)!

難道MySQL 無法突破100萬的限制嗎???到了100萬的分頁(yè)就真的到了極限?

答案是:NO 為什么突破不了100萬是因?yàn)椴粫?huì)設(shè)計(jì)mysql造成的。下面介紹非分表法,來個(gè)瘋狂的測(cè)試!一張表搞定100萬記錄,并且10G 數(shù)據(jù)庫(kù),如何快速分頁(yè)!

OK, our test goes back to the collect table. The conclusion from the tests so far is:

With 300,000 rows of data the split-table approach is feasible; beyond 300,000 its speed slows to a point you cannot bear! Of course, combining table splitting with my method would be absolutely perfect; but after using my method, the problem can be solved perfectly without splitting the table!

The answer is: a composite index! Once, while designing MySQL indexes, I noticed by accident that the index name can be anything and that several fields can be included in one index. What is that good for?

The initial

select id from collect order by id limit 90000,10;

is so fast because it walks the index, but if you add a WHERE clause it stops using the index. With a try-and-see attitude I added an index like search(vtype, id).

Then test:

select id from collect where vtype=1 limit 90000,10;

Very fast! Done in 0.04 seconds!

Test again:

select id, title from collect where vtype=1 limit 90000,10;

Very regrettably, 8-9 seconds; it did not use the search index!

Test again with search(id, vtype): still the select id statement, and also regrettably, 0.5 seconds.

In summary: when there is a WHERE condition and you also want limit to use an index, you must design an index that puts the WHERE column first and the primary key used by limit second, and you can only SELECT the primary key!

This solves the paging problem perfectly. If ids can be returned quickly, there is hope of optimizing limit; by this logic, a million-level limit should finish in 0.0x seconds. It seems that SQL statement optimization and indexing are very important!
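The summary above can be checked with a quick sketch. SQLite stands in for MySQL here (its EXPLAIN QUERY PLAN replaces MySQL's EXPLAIN), and the table contents and the index name search are illustrative; the point is only that the WHERE column comes first and the paging key second:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE collect (id INTEGER PRIMARY KEY, "
             "title TEXT, info TEXT, vtype INTEGER)")
# Composite index: WHERE column first, paging key (the primary key) second.
conn.execute("CREATE INDEX search ON collect (vtype, id)")
conn.executemany(
    "INSERT INTO collect (title, info, vtype) VALUES (?, ?, ?)",
    [(f"t{i}", f"i{i}", i % 3) for i in range(1, 1001)])

# Selecting only the indexed columns lets the engine answer from the
# index alone, with no sort step and no table lookups.
query = "SELECT id FROM collect WHERE vtype=1 ORDER BY id LIMIT 10 OFFSET 90"
rows = conn.execute(query).fetchall()

plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
print(plan)
```

If the SELECT list is widened to include title, the index no longer covers the query and the engine falls back to row lookups, which is exactly the slowdown observed above.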

