1. Description
In JDBC, the executeBatch method can execute multiple dml statements in batches, which is much more efficient than executing executeUpdate individually. This is What is the principle? How to implement batch execution in mysql and oracle? This article will introduce to you the principle behind this.
2. Experiment introduction
This experiment will be carried out through the following three steps
a. Record the time consuming of jdbc batch execution and single execution in mysql
b. Record the time consuming of batch execution and single execution of jdbc in Oracle
c. Record the time consuming of batch execution and single execution of Oracle plsql
The relevant java and database versions are as follows: Java17, Mysql8, Oracle11G
3. Formal experiment
Create a table in mysql and oracle respectively
create table t ( -- mysql中創(chuàng)建表的語句 id int, name1 varchar(100), name2 varchar(100), name3 varchar(100), name4 varchar(100) );rrree
You need to turn on the audit of the database before the experiment
Mysql turns on auditing:
create table t ( -- oracle中創(chuàng)建表的語句 id number, name1 varchar2(100), name2 varchar2(100), name3 varchar2(100), name4 varchar2(100) );
oracle turns on auditing:
set global general_log = 1;
java code is as follows:
alter system set audit_trail=db, extended; audit insert table by scott; -- 實驗采用scott用戶批量執(zhí)行insert的方式
Several points to note in the code,
-
The url of mysql needs to add useServerPrepStmts=true&rewriteBatchedStatements=true parameters.
batchCnt represents the number of SQL statements executed in each batch, and 0 represents single execution.
First test mysql
import java.sql.*; public class JdbcBatchTest { /** * @param dbType 數(shù)據(jù)庫類型,oracle或mysql * @param totalCnt 插入的總行數(shù) * @param batchCnt 每批次插入的行數(shù),0表示單條插入 */ public static void exec(String dbType, int totalCnt, int batchCnt) throws SQLException, ClassNotFoundException { String user = "scott"; String password = "xxxx"; String driver; String url; if (dbType.equals("mysql")) { driver = "com.mysql.cj.jdbc.Driver"; url = "jdbc:mysql://ip/hello?useServerPrepStmts=true&rewriteBatchedStatements=true"; } else { driver = "oracle.jdbc.OracleDriver"; url = "jdbc:oracle:thin:@ip:orcl"; } long l1 = System.currentTimeMillis(); Class.forName(driver); Connection connection = DriverManager.getConnection(url, user, password); connection.setAutoCommit(false); String sql = "insert into t values (?, ?, ?, ?, ?)"; PreparedStatement preparedStatement = connection.prepareStatement(sql); for (int i = 1; i <= totalCnt; i++) { preparedStatement.setInt(1, i); preparedStatement.setString(2, "red" + i); preparedStatement.setString(3, "yel" + i); preparedStatement.setString(4, "bal" + i); preparedStatement.setString(5, "pin" + i); if (batchCnt > 0) { // 批量執(zhí)行 preparedStatement.addBatch(); if (i % batchCnt == 0) { preparedStatement.executeBatch(); } else if (i == totalCnt) { preparedStatement.executeBatch(); } } else { // 單條執(zhí)行 preparedStatement.executeUpdate(); } } connection.commit(); connection.close(); long l2 = System.currentTimeMillis(); System.out.println("總條數(shù):" + totalCnt + (batchCnt>0? (",每批插入:"+batchCnt) : ",單條插入") + ",一共耗時:"+ (l2-l1) + " 毫秒"); } public static void main(String[] args) throws SQLException, ClassNotFoundException { exec("mysql", 10000, 50); } }
Put in different batchCnt values ??to see the execution time
batchCnt=50 Total number: 10000, each batch Insert: 50, total time spent: 4369 milliseconds
batchCnt=100 Total number of items: 10000, insert per batch: 100, total time spent: 2598 milliseconds
batchCnt=200 Total number of items: 10000, insert per batch: 200, total time spent: 2211 milliseconds
batchCnt=1000 Total number of entries: 10000, each batch of insertions: 1000, total time spent: 2099 milliseconds
batchCnt=10000 Total number of entries: 10000, each batch of insertions: 10000, Total time spent: 2418 milliseconds
batchCnt=0 Total number of entries: 10000, single insertion, total time spent: 59620 milliseconds
View general log
batchCnt=5
batchCnt=0
Several conclusions can be drawn:
The efficiency of batch execution is greatly improved compared to single execution.
The batch execution of mysql actually rewrites sql and merges multiple inserts into insert xx values(),()... for execution.
When changing batchCnt from 50 to 100, the time is basically shortened by half. However, when this value is expanded, the time reduction is not obvious, and the execution time will even increase. high.
Analysis reason:
After the client sends the SQL statement to be executed to the database server, the database executes the SQL statement and returns the result to the client. Total time taken = database execution time network transmission time. Reducing the number of round trips through batch execution reduces network transfer time and therefore overall time. However, when batchCnt becomes larger, even if the network transmission time is no longer the main bottleneck, the reduction in the total time will not be so obvious. Especially when batchCnt=10000, that is, all 10,000 statements are executed at one time, the time becomes longer. This may be because the program and database need to apply for larger memory when preparing these input parameters, so it takes more time. (My guess).
One more thing, can the value of batchCnt be infinite? Suppose I need to insert 100 million items, can I insert 100 million items in batches at one time? Of course not, we don’t consider the space problem of undo. First of all, your computer does not have such a large memory to save all the 100 million SQL input parameters at once. Secondly, mysql also has a parameter max_allowed_packet to limit the length of a single statement. The maximum is 1Gbyte. When the statement is too long, "Packet for query is too large (1,773,901 > 1,599,488). You can change this value on the server by setting the 'max_allowed_packet' variable" will be reported.
Next test oracle
exec("mysql", 10000, batchCnt);
Substitute different batchCnt values ????to see the execution time
batchCnt=50 Total number: 10000, each batch insertion: 50, total Time consumption: 2055 milliseconds
batchCnt=100 Total number of items: 10000, each batch insertion: 100, total time consumption: 1324 milliseconds
batchCnt=200 Total number of items: 10000, each batch insertion: 200, total time consumption : 856 milliseconds
batchCnt=1000 Total number of entries: 10000, each batch of insertions: 1000, total time spent: 785 milliseconds
batchCnt=10000 Total number of entries: 10000, each batch of insertions: 10000, total time spent: 804 Milliseconds
batchCnt=0 Total number of entries: 10000, single insertion, total time consumption: 60830 milliseconds
The effect of execution in Oracle is basically the same as that in MySQL, and the efficiency of batch processing operations is obviously high. Executed in a single line. The problem is that there is no such insert xx values(),()... syntax in Oracle, so how does it achieve batch execution?
View the audit view dba_audit_trail
when executing batchCnt=50從審計的結(jié)果中可以看到,batchCnt=50的時候,審計記錄只有200條(扣除登入和登出),也就是sql只執(zhí)行了200次。sql_text沒有發(fā)生改寫,仍然是"insert into t values (:1 , :2 , :3 , :4 , :5 )",而且sql_bind只記錄了批量執(zhí)行的最后一個參數(shù),即50的倍數(shù)。根據(jù)awr報告可以看出,實際只執(zhí)行了200次(由于篇幅限制,省略了awr截圖)。那么oracle是怎么做到只執(zhí)行200次但插入1萬條記錄的呢?我們來看看oracle中使用存儲過程的批量插入。
四、存儲過程
準備數(shù)據(jù):
首先將t表清空 truncate table t;
用java往t表灌10萬數(shù)據(jù) exec("oracle", 100000, 1000);
創(chuàng)建t1表 create table t1 as select * from t where 1 = 0;
以下兩個過程的意圖一致,均為將t表中的數(shù)據(jù)導(dǎo)入t1表。nobatch是單次執(zhí)行,usebatch是批量執(zhí)行。
create or replace procedure nobatch is begin for x in (select * from t) loop insert into t1 (id, name1, name2, name3, name4) values (x.id, x.name1, x.name2, x.name3, x.name4); end loop; commit; end nobatch; /
create or replace procedure usebatch (p_array_size in pls_integer) is type array is table of t%rowtype; l_data array; cursor c is select * from t; begin open c; loop fetch c bulk collect into l_data limit p_array_size; forall i in 1..l_data.count insert into t1 values l_data(i); exit when c%notfound; end loop; commit; close c; end usebatch; /
執(zhí)行上述存儲過程
SQL> exec nobatch; ?
Elapsed: 00:00:32.92
SQL> exec usebatch(50);
Elapsed: 00:00:00.77
SQL> exec usebatch(100);
Elapsed: 00:00:00.47
SQL> exec usebatch(1000);
Elapsed: 00:00:00.19
SQL> exec usebatch(100000);
Elapsed: 00:00:00.26
存儲過程批量執(zhí)行效率也遠遠高于單條執(zhí)行。查看usebatch(50)執(zhí)行時的審計日志,sql_bind也只記錄了批量執(zhí)行的最后一個參數(shù),即50的倍數(shù)。與使用executeBatch方法在記錄內(nèi)容方面相同。因此可以推斷,JDBC的executeBatch和存儲過程的批量執(zhí)行都采用了相同的方法
存儲過程的這個關(guān)鍵點就是forall。查閱相關(guān)文檔。
The FORALL statement runs one DML statement multiple times, with different values in the VALUES and WHERE clauses.
The different values come from existing, populated collections or host arrays. The FORALL statement is usually much faster than an equivalent FOR LOOP statement.
The FORALL syntax allows us to bind the contents of a collection to a single DML statement, allowing the DML to be run for each row in the collection without requiring a context switch each time.
翻譯過來就是forall很快,原因就是不需要每次執(zhí)行的時候等待參數(shù)。
The above is the detailed content of How to implement JDBC batch insert in Java. For more information, please follow other related articles on the PHP Chinese website!

Hot AI Tools

Undress AI Tool
Undress images for free

Undresser.AI Undress
AI-powered app for creating realistic nude photos

AI Clothes Remover
Online AI tool for removing clothes from photos.

Clothoff.io
AI clothes remover

Video Face Swap
Swap faces in any video effortlessly with our completely free AI face swap tool!

Hot Article

Hot Tools

Notepad++7.3.1
Easy-to-use and free code editor

SublimeText3 Chinese version
Chinese version, very easy to use

Zend Studio 13.0.1
Powerful PHP integrated development environment

Dreamweaver CS6
Visual web development tools

SublimeText3 Mac version
God-level code editing software (SublimeText3)

Hot Topics

Singleton design pattern in Java ensures that a class has only one instance and provides a global access point through private constructors and static methods, which is suitable for controlling access to shared resources. Implementation methods include: 1. Lazy loading, that is, the instance is created only when the first request is requested, which is suitable for situations where resource consumption is high and not necessarily required; 2. Thread-safe processing, ensuring that only one instance is created in a multi-threaded environment through synchronization methods or double check locking, and reducing performance impact; 3. Hungry loading, which directly initializes the instance during class loading, is suitable for lightweight objects or scenarios that can be initialized in advance; 4. Enumeration implementation, using Java enumeration to naturally support serialization, thread safety and prevent reflective attacks, is a recommended concise and reliable method. Different implementation methods can be selected according to specific needs

Analyzing Java heap dumps is a key means to troubleshoot memory problems, especially for identifying memory leaks and performance bottlenecks. 1. Use EclipseMAT or VisualVM to open the .hprof file. MAT provides Histogram and DominatorTree views to display the object distribution from different angles; 2. sort in Histogram by number of instances or space occupied to find classes with abnormally large or large size, such as byte[], char[] or business classes; 3. View the reference chain through "ListObjects>withincoming/outgoingreferences" to determine whether it is accidentally held; 4. Use "Pathto

ThreadLocal is used in Java to create thread-private variables, each thread has an independent copy to avoid concurrency problems. It stores values ??through ThreadLocalMap inside the thread. Pay attention to timely cleaning when using it to prevent memory leakage. Common uses include user session management, database connections, transaction context, and log tracking. Best practices include: 1. Call remove() to clean up after use; 2. Avoid overuse; 3. InheritableThreadLocal is required for child thread inheritance; 4. Do not store large objects. The initial value can be set through initialValue() or withInitial(), and the initialization is delayed until the first get() call.

The core steps in Java to use JDBC to connect to a database include: loading the driver, establishing a connection, executing SQL, processing results, and closing resources. The specific operations are as follows: 1. Add database driver dependencies. For example, MySQL requires mysql-connector-java; 2. Load and register JDBC drivers, such as Class.forName("com.mysql.cj.jdbc.Driver"); 3. Use DriverManager.getConnection() to establish a connection, pass in the URL, username and password; 4. Create a Statement or PreparedStatemen

Optional can clearly express intentions and reduce code noise for null judgments. 1. Optional.ofNullable is a common way to deal with null objects. For example, when taking values ??from maps, orElse can be used to provide default values, so that the logic is clearer and concise; 2. Use chain calls maps to achieve nested values ??to safely avoid NPE, and automatically terminate if any link is null and return the default value; 3. Filter can be used for conditional filtering, and subsequent operations will continue to be performed only if the conditions are met, otherwise it will jump directly to orElse, which is suitable for lightweight business judgment; 4. It is not recommended to overuse Optional, such as basic types or simple logic, which will increase complexity, and some scenarios will directly return to nu.

The core workaround for encountering java.io.NotSerializableException is to ensure that all classes that need to be serialized implement the Serializable interface and check the serialization support of nested objects. 1. Add implementsSerializable to the main class; 2. Ensure that the corresponding classes of custom fields in the class also implement Serializable; 3. Use transient to mark fields that do not need to be serialized; 4. Check the non-serialized types in collections or nested objects; 5. Check which class does not implement the interface; 6. Consider replacement design for classes that cannot be modified, such as saving key data or using serializable intermediate structures; 7. Consider modifying

There are three common methods to traverse Map in Java: 1. Use entrySet to obtain keys and values at the same time, which is suitable for most scenarios; 2. Use keySet or values to traverse keys or values respectively; 3. Use Java8's forEach to simplify the code structure. entrySet returns a Set set containing all key-value pairs, and each loop gets the Map.Entry object, suitable for frequent access to keys and values; if only keys or values are required, you can call keySet() or values() respectively, or you can get the value through map.get(key) when traversing the keys; Java 8 can use forEach((key,value)->

ToimproveperformanceinJavaapplications,choosebetweenEhCacheandCaffeinebasedonyourneeds.1.Forlightweight,modernin-memorycaching,useCaffeine—setitupbyaddingthedependency,configuringacachebeanwithsizeandexpiration,andinjectingitintoservices.2.Foradvance
