In-depth understanding of JDBC timeout settings

Reprinted from: http://www.kgc.cn/bbs/post/33272.shtml

 

Appropriate JDBC timeout settings can effectively shorten service downtime. This article introduces the various database-related timeout settings and how to configure them.

Real Case: Application Server Unresponsive After a DDoS Attack

After a DDoS attack, the whole service went down. Because the Layer 4 switches were overwhelmed, the network became unreachable and the business system could not function properly. The security team quickly blocked the DDoS attack and restored the network, but the business system still did not work. An analysis of the system's thread dump showed that the business system was stuck in a JDBC API call. After 20 minutes, the system was still in the WAITING state and could not respond. After 30 minutes, the system threw an exception and the service returned to normal.

Why did the system stay in the WAITING state for 30 minutes even though the query timeout was set to 3 seconds? And why did it return to normal after 30 minutes? Once you understand the JDBC timeout settings, you will find the answers.

Why do we need to understand JDBC? The business system and the database are usually the two parts we look at first when we run into performance issues or system errors. In our company these two parts belong to two different departments, and each department concentrates on finding problems in its own area, so the layer between the business system and the database becomes a blind spot. For Java applications, this blind spot is the DBCP database connection pool and JDBC. This article focuses on JDBC.

What is JDBC? 

JDBC is the standard API for connecting to relational databases from Java applications. Sun defined four types of JDBC drivers, and we mainly use Type 4. A Type 4 driver is implemented entirely in Java and communicates with the database over a socket.

 

Figure 1 JDBC Type 4.

A Type 4 driver processes byte streams through a socket, so it performs the same basic network operations as a network library such as HttpClient. As with any such library, a problem during a network operation can consume a lot of CPU resources and leave the caller blocked with no response. If you have used HttpClient before, you have probably run into errors caused by not setting a timeout. Likewise, a Type 4 driver whose socket timeout is not set properly suffers from the same error: the connection blocks.
Next, let's look at how to set the socket timeout correctly and which issues need to be considered.

Timeout hierarchy between the application and the database

 

Figure 2 Timeout Class.

The diagram above shows a simplified timeout hierarchy between the application and the database. (Translator's note: WAS/BLOC is the name of a specific application at the author's company; there is no need to delve into it.)

A higher-level timeout depends on the lower-level timeouts. It works as expected only when the lower-level timeouts are configured correctly. For example, when the socket timeout is misconfigured, neither the statement timeout nor the transaction timeout above it will work.
Many of the comments we received said something like this:

"Even with the statement timeout set, the application cannot recover from the error when the network fails."

The statement timeout cannot handle a network failure; all it can do is limit how long a statement runs. Timeouts for network failures must be handled by the JDBC socket timeout.
The JDBC socket timeout is in turn affected by the OS socket timeout setting. This explains why, in the case above, the JDBC connection blocked for 30 minutes after the network error and then miraculously recovered, even though no JDBC socket timeout had been set.

The DBCP connection pool is on the left side of Figure 2, and you can see that the timeout hierarchy is independent of DBCP. DBCP is responsible for creating and managing database connections and takes no part in timeout processing. When DBCP creates a connection or sends a validation query to check a connection, the socket timeout affects those operations, but it does not affect the application directly.
When your application calls DBCP's getConnection() method, you can set a timeout for obtaining a database connection, but this has nothing to do with the JDBC timeouts.
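The separation between the pool's wait limit and the JDBC timeouts can be illustrated with a minimal sketch. This is not DBCP code: `BorrowTimeoutPool`, `borrow`, and `giveBack` are hypothetical names, and a `BlockingQueue` stands in for the set of idle connections. The wait limit only bounds how long the caller waits for an idle entry; it says nothing about how long a later socket read or statement may take.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a connection pool; not the DBCP implementation.
final class BorrowTimeoutPool<T> {
    private final BlockingQueue<T> idle;

    BorrowTimeoutPool(Collection<T> resources) {
        // Capacity equals the number of pooled resources (at least 1).
        this.idle = new ArrayBlockingQueue<>(Math.max(1, resources.size()), false, resources);
    }

    // Waits at most maxWaitMillis for an idle resource, then gives up.
    T borrow(long maxWaitMillis) throws InterruptedException, TimeoutException {
        T resource = idle.poll(maxWaitMillis, TimeUnit.MILLISECONDS);
        if (resource == null) {
            throw new TimeoutException("no idle connection within " + maxWaitMillis + " ms");
        }
        return resource;
    }

    // Returns a resource to the pool so another caller can borrow it.
    void giveBack(T resource) {
        idle.offer(resource);
    }
}
```

With a pool of one entry, a second `borrow(10)` fails after roughly 10 ms regardless of any statement or socket timeout configured on the connection itself.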

 

Figure 3 Timeout for Each Level.

What is Transaction Timeout? 

Transaction timeout generally exists at the framework (Spring, EJB) or application level. It may be a relatively unfamiliar concept. Simply put, the transaction timeout is "statement timeout * N (the number of statements to be executed) + α (other time, such as garbage collection)". The transaction timeout limits the total time spent executing statements.
For example, if one statement takes 0.1 seconds to execute, running a handful of statements is no problem, but running 100,000 statements takes 10,000 seconds (about 2.8 hours). This is where transaction timeout comes in handy. EJB CMT (Container Managed Transaction) is a typical implementation, and it offers developers several ways to configure it. We do not use EJB, however, so Spring's transaction timeout setting is the more common choice. In Spring, you can set it with the XML shown below or with the @Transactional annotation in the source code.

Xml code

<tx:attributes>
        <tx:method name="…" timeout="3"/>
</tx:attributes>

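The transaction timeout configuration provided by Spring is very simple. It records the start time and elapsed time of each transaction, checks the elapsed time when specific events occur, and throws an exception when the timeout value is exceeded.

In Spring, the database connection is stored in a ThreadLocal, which is known as Transaction Synchronization. The start time and elapsed time of the transaction are stored along with it. When a statement is created through this kind of proxy connection, the elapsed time of the transaction is checked. EJB CMT is implemented in a similar way, and its structure is also very simple.

If the container or framework you have chosen does not support transaction timeout, you can consider implementing it yourself. There is no standard API for transaction timeout. Versions 1.5 and 1.6 of the Lucy framework do not support transaction timeout, but you can achieve the same effect by using Spring's Transaction Manager.

Suppose a transaction contains 5 statements, each statement takes 200 ms to execute, and the other business logic takes 100 ms. The transaction timeout should then be set to at least 1,100 ms (200 * 5 + 100).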
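For readers who need to implement a transaction timeout themselves, the idea described above (remember when the transaction started, check the elapsed time whenever a statement is about to be created) can be sketched in plain Java. `TransactionDeadline` and its methods are hypothetical names chosen for illustration; Spring's actual implementation differs in detail. The clock values are passed in explicitly so that the behavior is easy to verify.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not Spring's implementation.
final class TransactionDeadline {
    private final long startNanos;
    private final long timeoutNanos;

    // startNanos is passed in so the check is deterministic; a real caller
    // would pass System.nanoTime() at the moment the transaction begins.
    TransactionDeadline(long timeoutMillis, long startNanos) {
        this.timeoutNanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        this.startNanos = startNanos;
    }

    // Lower bound from the text: N statements plus the other business logic.
    static long minimumTimeoutMillis(int statements, long perStatementMillis, long otherLogicMillis) {
        return statements * perStatementMillis + otherLogicMillis;
    }

    // Meant to be called before each statement is created;
    // throws once the time budget of the transaction is spent.
    void checkNotExpired(long nowNanos) {
        long elapsedNanos = nowNanos - startNanos;
        if (elapsedNanos >= timeoutNanos) {
            throw new IllegalStateException("transaction timed out after "
                    + TimeUnit.NANOSECONDS.toMillis(elapsedNanos) + " ms");
        }
    }
}
```

For the example in the text, `TransactionDeadline.minimumTimeoutMillis(5, 200, 100)` yields 1100, matching the 1,100 ms lower bound.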

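What is Statement Timeout?

A statement timeout limits how long a statement may run. Its value is set by calling the JDBC API java.sql.Statement.setQueryTimeout(int timeout). Nowadays developers rarely set it directly in code; it is usually set through a framework.

Taking iBatis as an example, the default statement timeout can be set with the defaultStatementTimeout attribute in sql-map-config.xml. You can also set the timeout attribute of the select, insert, and update tags in the sqlmap to configure the timeout of individual SQL statements independently.

If you use Lucy 1.5 or 1.6, you can set the statement timeout at the datasource level through the queryTimeout attribute.

The appropriate statement timeout value depends on the characteristics of the application itself; there is no recommended configuration.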
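When the timeout is set directly in code rather than through a framework, the call looks roughly like the sketch below. `QueryTimeoutExample`, `applyTimeout`, and `countRows` are hypothetical names; only `Statement.setQueryTimeout(int)`, which takes a value in seconds, is part of the JDBC API. The connection is assumed to be supplied by the caller.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper class; the Connection comes from the caller.
final class QueryTimeoutExample {

    // Applies a statement timeout in seconds; 0 means "no limit" in JDBC.
    static void applyTimeout(Statement statement, int seconds) throws SQLException {
        if (seconds < 0) {
            throw new IllegalArgumentException("timeout must not be negative");
        }
        statement.setQueryTimeout(seconds);
    }

    // Runs a query with the given statement timeout and counts the rows.
    static int countRows(Connection connection, String sql, int timeoutSeconds) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            applyTimeout(statement, timeoutSeconds);
            try (ResultSet rows = statement.executeQuery(sql)) {
                int count = 0;
                while (rows.next()) {
                    count++;
                }
                return count;
            }
        }
    }
}
```

How the driver reacts when the limit is reached is driver-specific, which is the subject of the following sections.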

Statement timeout processing in JDBC

The statement timeout process differs between relational databases and between JDBC drivers. Oracle and MS SQL Server behave similarly, as do MySQL and CUBRID.

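QueryTimeout process for an Oracle JDBC Statement
1. A statement is created by calling the Connection's createStatement() method
2. The Statement's executeQuery() method is called
3. The statement sends the query to the Oracle database through its own connection
4. The statement registers itself with the OracleTimeoutPollingThread (one per classloader)
5. The timeout is reached
6. The OracleTimeoutPollingThread calls the OracleStatement's cancel() method
7. A cancel message is sent through the connection to the query being executed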

 

Figure 4 Query Timeout Execution Process for Oracle JDBC Statement.

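QueryTimeout process for a jTDS (MS SQL Server) Statement
1. A statement is created by calling the Connection's createStatement() method
2. The Statement's executeQuery() method is called
3. The statement sends the query to the MS SQL Server database through its own connection
4. The statement registers itself with the TimerThread
5. The timeout is reached
6. The TimerThread calls the TdsCore.cancel() method inside the JtdsStatement instance
7. A cancel message is sent through ConnectionJDBC to the query being executed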

 

Figure 5 QueryTimeout Execution Process for JTDS (MS SQLServer) Statement.
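The Oracle and jTDS flows above share one pattern: the statement is registered with a timer thread, and when the deadline passes that thread cancels the statement over the same connection. The sketch below imitates that pattern on top of the public JDBC API. It is an illustration only, not the code of either driver; `StatementWatchdog`, `register`, and `runWithTimeout` are hypothetical names.

```java
import java.sql.SQLException;
import java.sql.Statement;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical watchdog that mimics a driver's timer thread.
final class StatementWatchdog implements AutoCloseable {
    private final ScheduledExecutorService timer =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "statement-watchdog");
                thread.setDaemon(true);
                return thread;
            });

    // Steps 4-6: register the statement; on timeout the timer thread calls cancel().
    ScheduledFuture<?> register(Statement statement, long timeoutMillis) {
        return timer.schedule(() -> {
            try {
                statement.cancel();
            } catch (SQLException e) {
                // Cancelling is best effort in this sketch.
            }
        }, timeoutMillis, TimeUnit.MILLISECONDS);
    }

    // Runs the query and withdraws the pending cancel if it finishes in time.
    <T> T runWithTimeout(Statement statement, long timeoutMillis, Callable<T> query) throws Exception {
        ScheduledFuture<?> pendingCancel = register(statement, timeoutMillis);
        try {
            return query.call();
        } finally {
            pendingCancel.cancel(false);
        }
    }

    @Override
    public void close() {
        timer.shutdownNow();
    }
}
```

Note that the cancel request itself travels over the connection, so it cannot get through when the network is down. This is why the statement timeout depends on a working socket timeout.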

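QueryTimeout process for a MySQL JDBC Statement
1. A statement is created by calling the Connection's createStatement() method
2. The Statement's executeQuery() method is called
3. The statement sends the query to the MySQL database through its own connection
4. The statement creates a new timeout-execution thread to handle the timeout
5. From version 5.1 on, one timeout-execution thread is assigned per connection instead
6. The timeout is registered with the timeout-execution thread
7. The timeout is reached
8. The timeout-execution thread creates a connection with the same configuration as the statement
9. The newly created connection is used to send a cancel query (KILL QUERY "connectionId") for the timed-out query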

 

Figure 6 QueryTimeout Execution Process for MySQL JDBC Statement (5.0.8).
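The last two steps of the MySQL flow, cancelling through a second connection, can be sketched as follows. `MySqlStyleCancel` and its methods are hypothetical names, and how the second connection is opened is left to the caller; only the `KILL QUERY` statement comes from the description above. This is a sketch of the idea, not the driver's code.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical illustration of cancelling a query from a second connection.
final class MySqlStyleCancel {

    // Builds the cancel query for the connection that runs the slow statement.
    static String killQuerySql(long connectionId) {
        return "KILL QUERY " + connectionId;
    }

    // Sends the cancel query over a freshly opened connection, because the
    // original connection is still busy waiting for the timed-out query.
    static void cancel(Connection freshConnection, long connectionId) throws SQLException {
        try (Statement kill = freshConnection.createStatement()) {
            kill.execute(killQuerySql(connectionId));
        }
    }
}
```

Because the cancel needs a new connection to the server, it too fails when the network is unreachable.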

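QueryTimeout process for a CUBRID JDBC Statement
1. A statement is created by calling the Connection's createStatement() method
2. The Statement's executeQuery() method is called
3. The statement sends the query to the CUBRID database through its own connection
4. The statement creates a new timeout-execution thread to handle the timeout
5. From version 5.1 on, one timeout-execution thread is assigned per connection instead
6. The timeout is registered with the timeout-execution thread
7. The timeout is reached
8. The timeout-execution thread creates a connection with the same configuration as the statement
9. The newly created connection is used to send a cancel message for the timed-out query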

 

Figure 7 QueryTimeout Execution Process for CUBRID JDBC Statement.

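What is the JDBC socket timeout?

A Type 4 JDBC driver connects to the database through a socket, and the database does not handle connection timeouts between the application and the database.

The JDBC socket timeout is essential when the database is stopped abruptly or a network error occurs (because of equipment failure, for example). Because of the way TCP/IP works, a socket has no means of detecting a network error, so the application cannot detect on its own that the database connection has been lost. If no socket timeout is set, the application waits indefinitely for the database to return a result. Such a connection is called a dead connection.

To avoid dead connections, the socket must have a timeout configured. The socket timeout can be set through JDBC; it keeps the application from waiting forever when a network error occurs and shortens service downtime.

Using the socket timeout to limit statement execution time is not recommended. The socket timeout must therefore be higher than the statement timeout; otherwise the socket timeout takes effect first, and the statement timeout becomes meaningless and never fires.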
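The ordering rule above is easy to get wrong because the two values usually use different units: the statement timeout is given in seconds, while socket timeouts are typically given in milliseconds. A small check like the following can catch a misconfiguration early. `TimeoutOrdering` is a hypothetical helper, and treating a socket timeout of 0 (disabled) as invalid is an assumption of this sketch, made because a disabled socket timeout leaves dead connections unbounded.

```java
// Hypothetical configuration check; names and policy are assumptions.
final class TimeoutOrdering {

    // statementTimeoutSeconds: as passed to Statement.setQueryTimeout (seconds).
    // socketTimeoutMillis: as passed to Socket.setSoTimeout (milliseconds).
    static boolean socketOutlastsStatement(int statementTimeoutSeconds, int socketTimeoutMillis) {
        if (socketTimeoutMillis <= 0) {
            // Assumption: 0 disables the socket timeout, which this sketch rejects.
            return false;
        }
        // Compare in milliseconds; the long arithmetic avoids int overflow.
        return socketTimeoutMillis > statementTimeoutSeconds * 1000L;
    }
}
```

For example, a 3-second statement timeout with a 60,000 ms socket timeout passes the check, while the same statement timeout with a 2,000 ms socket timeout does not.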

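The socket timeout has the two settings shown below; the way to configure them differs between JDBC drivers.

  • Timeout for socket connect: set through Socket.connect(SocketAddress endpoint, int timeout)

  • Timeout for socket read/write: set through Socket.setSoTimeout(int timeout)

Having looked at the source code of the CUBRID, MySQL, MS SQL Server (jTDS), and Oracle JDBC drivers, we found that all of them use these two APIs internally to set the socket timeout.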
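The effect of these two calls can be observed without any database. The sketch below, with the hypothetical name `SocketTimeoutDemo`, connects to a local server socket that never sends anything and shows that a blocking read gives up after the configured timeout instead of waiting forever. It is a demonstration of the two `java.net.Socket` calls, not of any particular driver.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo: a silent local server plays the role of a dead database.
final class SocketTimeoutDemo {

    // Returns true if the read was interrupted by the socket timeout.
    static boolean readTimesOut(int connectTimeoutMillis, int readTimeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket()) {
            // Connect timeout: bounds how long establishing the connection may take.
            client.connect(new InetSocketAddress(loopback, silentServer.getLocalPort()),
                    connectTimeoutMillis);
            // Socket timeout: bounds how long a blocking read may wait for data.
            client.setSoTimeout(readTimeoutMillis);
            try {
                client.getInputStream().read(); // the server never writes anything
                return false;
            } catch (SocketTimeoutException expected) {
                return true;
            }
        }
    }
}
```

With the socket timeout left at 0, the same read would block until the peer or the operating system ends the connection.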

The table below shows how the socket timeout is configured for each driver.

| JDBC Driver | connectTimeout option | socketTimeout option | URL format | Example |
|---|---|---|---|---|
| MySQL Driver | connectTimeout (default: 0, unit: ms) | socketTimeout (default: 0, unit: ms) | jdbc:mysql://[host:port],[host:port]…/[database][?propertyName1][=propertyValue1][&propertyName2][=propertyValue2]… | jdbc:mysql://xxx.xx.xxx.xxx:3306/database?connectTimeout=60000&socketTimeout=60000 |
| MS-SQL Driver (jTDS Driver) | loginTimeout (default: 0, unit: s) | socketTimeout (default: 0, unit: s) | jdbc:jtds:<server_type>://<server>[:<port>][/<database>][;<property>=<value>[;...]] | jdbc:jtds:sqlserver://server:port/database;loginTimeout=60;socketTimeout=60 |
| Oracle Thin Driver | oracle.net.CONNECT_TIMEOUT (default: 0, unit: ms) | oracle.jdbc.ReadTimeout (default: 0, unit: ms) | Not supported through the URL. It can only be set through the OracleDataSource.setConnectionProperties() API; when using DBCP, call BasicDataSource.setConnectionProperties() or BasicDataSource.addConnectionProperty() | |
| CUBRID Thin Driver | No separate option (default: 5,000, unit: ms) | No separate option (default: 5,000, unit: ms) | | |

  • A value of 0, the default for connectTimeout and socketTimeout, means that no timeout is applied.

  • Besides calling the DBCP API, these options can also be configured through properties.

When configuring through properties, pass a key-value pair whose key is "connectionProperties" and whose value has the format "[propertyName=property;]*". Below is the properties configuration in iBatis.

Xml code

<transactionManager type="JDBC">
  <dataSource type="com.nhncorp.lucy.db.DbcpDSFactory">
     ….
     <property name="connectionProperties" value="oracle.net.CONNECT_TIMEOUT=6000;oracle.jdbc.ReadTimeout=6000"/>
  </dataSource>
</transactionManager>
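When such settings are assembled in code rather than written by hand, the two styles used above (URL parameters for MySQL, a "connectionProperties" string for Oracle behind DBCP) can be produced as follows. `TimeoutSettings` and its methods are hypothetical helpers; the property names and formats are the ones listed in this article, and the host name in the usage note is a placeholder.

```java
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helpers that only build configuration strings.
final class TimeoutSettings {

    // MySQL style: timeouts in milliseconds, appended as URL parameters.
    static String mySqlUrl(String host, int port, String database,
                           int connectTimeoutMillis, int socketTimeoutMillis) {
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?connectTimeout=" + connectTimeoutMillis
                + "&socketTimeout=" + socketTimeoutMillis;
    }

    // connectionProperties style: name=value pairs separated by semicolons,
    // in the iteration order of the given map.
    static String connectionProperties(Map<String, String> properties) {
        return properties.entrySet().stream()
                .map(entry -> entry.getKey() + "=" + entry.getValue())
                .collect(Collectors.joining(";"));
    }
}
```

Passing a `LinkedHashMap` that maps `oracle.net.CONNECT_TIMEOUT` and `oracle.jdbc.ReadTimeout` to `6000` reproduces the value used in the XML above, and `mySqlUrl("db.example.com", 3306, "database", 60000, 60000)` produces a URL of the same shape as the MySQL example.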

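Operating system socket timeout configuration

If neither the socket timeout nor the connect timeout is set, the application in most cases cannot detect a network error. After a network error occurs, the application therefore waits indefinitely until the connection is re-established or data is received successfully. In the real case at the beginning of this article, however, the application's connection problem resolved itself after 30 minutes. This is because the operating system can also configure a socket timeout. The company's Linux servers had the socket timeout set to 30 minutes, so network connections were checked at the operating system level. As a result, even with the JDBC socket timeout set to 0, a database connection problem caused by a network error could not last longer than 30 minutes.

Usually the application blocks on a call to Socket.read() because of a network problem and rarely enters the waiting state on a call to Socket.write(); this depends on the network configuration and the type of error. When Socket.write() is called, the data is written to the operating system's kernel buffer and control returns to the application immediately. So as long as the data can be written to the kernel buffer, the Socket.write() call succeeds. If the kernel buffer is full because of a network error, however, Socket.write() also enters the waiting state. In that case the operating system tries to resend the packets, and a system error is raised when the retry time limit is reached. In our company, the retransmission timeout is set to 15 minutes.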

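This concludes the explanation of JDBC's internal operations. I hope it helps you configure timeouts correctly and reduce the number of failures.

Finally, here are some frequently asked questions.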

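FAQ

Q1. I have set a query timeout with Statement.setQueryTimeout(), but it has no effect when a network error occurs.

➔ The query timeout works only when the socket timeout is working. It cannot be used to handle external network errors; to handle those, you must set the JDBC socket timeout.

Q2. How do transaction timeout, statement timeout, and socket timeout relate to the DBCP configuration?

➔ When a database connection is obtained through DBCP, none of the DBCP settings affect JDBC, apart from the waitTimeout setting that applies when DBCP obtains the connection.

Q3. If a JDBC socket timeout is set, will connections in the IDLE state in the DBCP pool also be closed once the timeout is reached?

➔ No. The socket setting takes effect only while data is being read or written, so it does not affect IDLE connections in DBCP. It has some effect when DBCP creates a new connection, removes an old IDLE connection, or validates a connection, but the effect is small unless there is a network problem.

Q4. What value should the socket timeout be set to?

➔ As mentioned in the main text, the socket timeout must be higher than the statement timeout, but there is no recommended value. The socket timeout takes effect when a network error occurs. No configuration, however careful, can prevent network errors from happening; it can only shorten the service downtime after one occurs (provided the network recovers).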
