MySQL JDBC StreamResult 和 net_write_timeout

MySQL JDBC StreamResult 和 net_write_timeout

MySQL JDBC 拉取数据的三种方式

MySQL JDBC 在从 MySQL 拉取数据的时候有三种方式

  1. 简单模式,也就是默认模式,数据都先要从MySQL Server发到client的OS TCP buffer,然后JDBC把 OS buffer读取到JVM内存中,读取到JVM内存的过程中憋着不让client读取,全部读完再通知inputStream.read(). 数据大的话容易导致JVM OOM
  2. useCursorFetch=true,配合FetchSize,也就是MySQL Server把查到的数据先缓存到本地磁盘,然后按照FetchSize挨个发给client。这需要占用MySQL很高的IOPS(先写磁盘缓存),其次每次Fetch需要一个RTT,效率不高。
  3. Stream读取,Stream读取是在执行SQL前设置FetchSize:statement.setFetchSize(Integer.MIN_VALUE),同时确保游标是只读、向前滚动的(为游标的默认值),MySQL JDBC内置的操作方法是将Statement强制转换为:com.mysql.jdbc.StatementImpl,调用其方法:enableStreamingResults(),这2者达到的效果是一致的,都是启动Stream流方式读取数据。这个时候MySQL不停地发数据,inputStream.read()不停地读取。一般来说发数据更快些,很快client的OS TCP recv buffer就满了,这时MySQL停下来等buffer有空闲就继续发数据。等待过程中如果超过 net_write_timeout MySQL就会报错,中断这次查询。

从这里的描述来看,数据小的时候第一种方式还能接受,但是数据大了容易OOM,方式三看起来不错,但是要特别注意 net_write_timeout。

1和3对MySQL Server来说处理上没有啥区别,也感知不到这两种方式的不同。只是对1来说从OS Buffer中的数据复制到JVM内存中速度快,JVM攒多了数据内存就容易爆掉;对3来说JDBC一条条将OS Buffer中的数据复制到JVM(内存复制速度快)同时返回给execute挨个处理(慢),一般来说挨个处理要慢一些,这就导致了从OS Buffer中复制数据较慢,容易导致 TCP Receive Buffer满了,那么MySQL Server感知到的就是TCP 传输窗口为0了,导致暂停传输数据。

在数据量很小的时候方式三没什么优势,因为总是多一次set net_write_tiemout,也就是多了一次RTT。

img

MySQL timeout

  1. Creates a statement by calling Connection.createStatement().
  2. Calls Statement.executeQuery().
  3. The statement transmits the Query to MySqlServer by using the internal connection.
  4. The statement creates a new timeout-execution thread for timeout process.
  5. For version 5.1.x, it changes to assign 1 thread for each connection.
  6. Registers the timeout execution to the thread.
  7. Timeout occurs.
  8. The timeout-execution thread creates a connection that has the same configurations as the statement.
  9. Transmits the cancel Query (KILL QUERY “connectionId“) by using the connection.

Figure 6: QueryTimeout Execution Process for MySQL JDBC Statement (5.0.8).

参考《揭秘JDBC超时机制》

net_read_timeout

Command-Line Format --net-read-timeout=#
System Variable net_read_timeout
Scope Global, Session
Dynamic Yes
Type Integer
Default Value 30
Minimum Value 1
Maximum Value 31536000
Unit seconds

The number of seconds to wait for more data from a connection before aborting the read. When the server is reading from the client, net_read_timeout is the timeout value controlling when to abort.

如下图,MySQL Server监听3017端口,195228号包 客户端发一个SQL 给 MySQL Server,但是似乎这个时候正好网络异常,30秒钟后(从 SQL 请求的前一个 ack 开始算,Server应该一直都没有收到),Server 端触发 net_read_timeout 超时异常(疑问:这里没有 net_read_timeout 描述的读取了一半的现象)

image-20230209155545142

解决方案:建议调大 net_read_timeout 以应对可能出现的网络丢包

net_write_timeout

show processlist 看到的State的值一直处于“Sending to client”,说明SQL这个语句已经执行完毕,而此时由于请求的数据太多,MySQL不停写入net buffer,而net buffer又不停的将数据写入服务端的网络棧,服务器端的网络栈(socket send buffer)被写满了,又没有被客户端读取并消化,这时读数据的流程就被MySQL暂停了。直到客户端完全读取了服务端网络棧的数据,这个状态才会消失。

先看下 net_write_timeout的解释:The number of seconds to wait for a block to be written to a connection before aborting the write. 只针对执行查询中的等待超时,网络不好,tcp buffer满了(应用迟迟不读走数据)等容易导致mysql server端报net_write_timeout错误,指的是mysql server hang在那里长时间无法发送查询结果。

Property Value
Command-Line Format --net-write-timeout=#
System Variable net_write_timeout
Scope Global, Session
Dynamic Yes
Type Integer
Default Value 60
Minimum Value 1

报这个错就是RDS等了net_write_timeout这么久没写数据,可能是客户端卡死没有读走数据,也可能是从多个分片挨个拉取,还没开始拉第7片前面6片拉取耗时就超过了net_write_timeout。

案例:DRDS 到 MySQL 多个分片拉取数据生成了许多 cursor 并发执行,但拉数据的时候是串行拉取的,如果用户端拉取数据过慢会导致最后一个 cursor 执行完成之后要等待很久.会超过 MySQL 的 net_write_timeout 配置从而引发报错. 也就是最后一个cursor打开后一直没有去读取数据,直到MySQL Server 触发 net_write_timeout 异常

首先可以尝试在 DRDS jdbcurl 配置 netTimeoutForStreamingResults 参数,设置为 0 可以使其一直等待,或设置一个合理的值(秒).

从JDBC驱动中可以看到,当调用PreparedStatement的executeQuery() 方法的时候,如果我们是去获取流式resultset的话,就会默认执行SET net_write_timeout= ? 这个命令去重新设置timeout时间。源代码如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
if (doStreaming && this.connection.getNetTimeoutForStreamingResults() > 0) {  
java.sql.Statement stmt = null;
try {
stmt = this.connection.createStatement(); ((com.mysql.jdbc.StatementImpl)stmt).executeSimpleNonQuery(this.connection, "SET net_write_timeout="
+ this.connection.getNetTimeoutForStreamingResults());
} finally {
if (stmt != null) {
stmt.close();
}
}
}

//另外DRDS代码 AppLoader.java 中写死了net_write_timeout 8小时
ds.putConnectionProperties(ConnectionProperties.NET_WRITE_TIMEOUT, 28800);

而 this.connection.getNetTimeoutForStreamingResults() 默认是600秒,或者在JDBC连接串种通过属性 netTimeoutForStreamingResults 来指定。

netTimeoutForStreamingResults 默认值:

What value should the driver automatically set the server setting ‘net_write_timeout’ to when the streaming result sets feature is in use? Value has unit of seconds, the value “0” means the driver will not try and adjust this value.

Default Value 600
Since Version 5.1.0

一般在数据导出场景中容易出现 net_write_timeout 这个错误,比如这个错误堆栈:

或者:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
ErrorMessage:
Communications link failure
The last packet successfully received from the server was 7 milliseconds ago. The last packet sent successfully to the server was 709,806 milliseconds ago. - com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet successfully received from the server was 7 milliseconds ago. The last packet sent successfully to the server was 709,806 milliseconds ago.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1036)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3427)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3327)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3814)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:870)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1928)
at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:378)
at com.mysql.jdbc.RowDataDynamic.next(RowDataDynamic.java:358)
at com.mysql.jdbc.ResultSetImpl.next(ResultSetImpl.java:6337)
at com.alibaba.datax.plugin.rdbms.reader.CommonRdbmsReader$Task.startRead(CommonRdbmsReader.java:275)
at com.alibaba.datax.plugin.reader.drdsreader.DrdsReader$Task.startRead(DrdsReader.java:148)
at com.alibaba.datax.core.taskgroup.runner.ReaderRunner.run(ReaderRunner.java:62)
at java.lang.Thread.run(Thread.java:834)
Caused by: java.io.EOFException: Can not read response from server. Expected to read 258 bytes, read 54 bytes before connection was unexpectedly lost.
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:2914)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3387)
... 11 more

特别注意

JDBC驱动报如下错误

Application was streaming results when the connection failed. Consider raising value of ‘net_write_timeout’ on the server. - com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Application was streaming results when the connection failed. Consider raising value of ‘net_write_timeout’ on the server.

不一定是 net_write_timeout 设置过小导致的,JDBC 在 streaming 流模式下只要连接异常就会报如上错误,比如:

  • 连接被 TCP reset
  • 连接因为某种原因(比如 QueryTimeOut、比如用户监控kill 慢查询) 触发 kill Query导致连接中断

比如出现内核bug,内核卡死不发包的话,客户端同样报 net_write_timeout 错误

max_allowed_packet: 单个SQL或者单条记录的最大大小

一些其他的 Timeout

connectTimeout:表示等待和MySQL数据库建立socket链接的超时时间,默认值0,表示不设置超时,单位毫秒,建议30000。 JDBC驱动连接属性

queryTimeout:超时后jdbc驱动触发新建一个连接来发送一个 kill 给DB

socketTimeout:JDBC参数,表示客户端发送请求给MySQL数据库后block在read的等待数据的超时时间,linux系统默认的socketTimeout为30分钟,可以不设置。socketTimeout 超时不会触发发kill,只会断开tcp连接

要特别注意socketTimeout仅仅是指等待socket数据时间,如果在传输数据那么这个值就没有用了。socketTimeout通过mysql-connector中的NativeProtocol最终设置在socketOptions上

image-20211024171459127

static final int SO_TIMEOUT。 Set a timeout on blocking Socket operations:

ServerSocket.accept();
SocketInputStream.read();
DatagramSocket.receive();

The option must be set prior to entering a blocking operation to take effect. If the timeout expires and the operation would continue to block, java.io.InterruptedIOException is raised. The Socket is not closed in this case.

Statement Timeout:用来限制statement的执行时长,timeout的值通过调用JDBC的java.sql.Statement.setQueryTimeout(int timeout) API进行设置。不过现在开发者已经很少直接在代码中设置,而多是通过框架来进行设置。

max_execution_time:The execution timeout for SELECT statements, in milliseconds. If the value is 0, timeouts are not enabled. MySQL 属性,可以set修改,一般用来设置一个查询最长不超过多少秒,避免一个慢查询一直在跑,跟statement timeout对应。

Property Value
Command-Line Format --max-execution-time=#
System Variable max_execution_time
Scope Global, Session
Dynamic Yes
Type Integer
Default Value 0

wait_timeout The number of seconds the server waits for activity on a noninteractive connection before closing it. MySQL 属性,一般设置tcp keepalive后这个值基本不会超时(这句话存疑 202110)。

On thread startup, the session wait_timeout value is initialized from the global wait_timeout value or from the global interactive_timeout value, depending on the type of client (as defined by the CLIENT_INTERACTIVE connect option to mysql_real_connect()). See also interactive_timeout.

Property Value
Command-Line Format --wait-timeout=#
System Variable wait_timeout
Scope Global, Session
Dynamic Yes
Type Integer
Default Value 28800
Minimum Value 1
Maximum Value (Other) 31536000
Maximum Value (Windows) 2147483

一般来说应该设置: max_execution_time/statement timeout < Tranction Timeout < socketTimeout

SocketTimeout

这个 httpclient 的bug 就是在 TCP 连接握手成功(只受ConnectTimeout影响,SocketTimeout还不起作用)后,还需要进行 SSL的数据交换(HandShake),但因为httpclient是在连接建立后(含 SSL HandShake)才设置的 SocketTimeout,导致在SSL HandShake的时候卡在了读数据,此时恰好还没设置SocketTimeout,导致连接永久卡死在SSL HandShake的读数据

所以代码的fix方案就是在建连接前就设置好 SocketTimeout。

一次 PreparedStatement 执行

useServerPrepStmts=true&cachePrepStmts=true

5.0.5版本后的驱动 useServerPrepStmts 默认值是false;另外跨Statement是没法重用PreparedStatement预编译的,还需要设置 cachePrepStmts 才可以

对于打开预编译的URL(String url = “jdbc:mysql://localhost:3306/studb?useServerPrepStmts=true&cachePrepStmts=true”)获取数据库连接之后,本质是获取预编译语句 **pstmt = conn.prepareStatement(sql)**时会向MySQL服务端发送一个RPC,发送一个预编译的SQL模板(驱动会拼接MySQL预编译语句prepare s1 from ‘select * from user where id = ?’),然后MySQL服务端会编译好收到的SQL模板,再会为此预编译模板语句分配一个 serverStatementId发送给JDBC驱动,这样以后PreparedStatement就会持有当前预编译语句的服务端的serverStatementId,并且会把此 PreparedStatement缓存在当前数据库连接中,以后对于相同SQL模板的操作 **pstmt.executeUpdate()**,都用相同的PreparedStatement,执行SQL时只需要发送 serverStatementId 和参数,节省一次SQL编译, 直接执行。并且对于每一个连接(驱动端及MySQL服务端)都有自己的prepare cache,具体的源码实现是在com.mysql.jdbc.ServerPreparedStatement中实现。

根据SQL模板和设置的参数,解析成一条完整的SQL语句,最后根据MySQL协议,序列化成字节流,RPC发送给MySQL服务端

1
2
// 解析封装需要发送的SQL语句,序列化成MySQL协议对应的字节流
Buffer sendPacket = fillSendPacket();

准备好需要发送的MySQL协议的字节流(sendPacket)后,就可以一路通过ConnectionImpl.execSQL –> MysqlIO.sqlQueryDirect –> MysqlIO.send – > OutPutStram.write把字节流数据通过Socket发送给MySQL服务器,然后线程阻塞等待服务端返回结果数据,拿到数据后再根据MySQL协议反序列化成我们熟悉的ResultSet对象。

image-20230802101859567

SocketTimeoutException: Read timed out

如果SQL 超过 SocketTimeout 报错如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
#java -cp .:./mysql-connector-java-5.1.45.jar Test "jdbc:mysql://127.0.0.1:3306/test?useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&connectTimeout=15000&socketTimeout=3700" root 123 "select sleep(10), id from sbtest1 where id= ?" 100
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 3,705 milliseconds ago. The last packet sent successfully to the server was 3,705 milliseconds ago.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3559)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3459)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3900)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527)
at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1283)
at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:783)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1966)
at Test.main(Test.java:31)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3008)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3469)
... 7 more

连接超时

在防火墙里设置了 3306 的包都 drop

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
#java -cp .:./mysql-connector-java-5.1.45.jar Test "jdbc:mysql://127.0.0.1:3306/test?useSSL=false&useServerPrepStmts=true&cachePrepStmts=true&connectTimeout=15000&socketTimeout=3700" root 123 "select sleep(10), id from sbtest1 where id= ?" 100
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:990)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:341)
at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2186)
at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2219)
at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2014)
at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:776)
at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:386)
at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:330)
at java.sql.DriverManager.getConnection(DriverManager.java:664)
at java.sql.DriverManager.getConnection(DriverManager.java:247)
at Test.main(Test.java:26)
Caused by: java.net.ConnectException: 连接超时 (Connection timed out)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:607)
at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:211)
at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:300)
... 15 more

SocketException connection timed out

image-20240522165928366

image-20240522165849942

案例

设置JDBC参数不合理(不设置的话默认值是:queryTimeout=10s,socketTimeout=10s),会导致在异常情况下,第二条get获得了第一条的结果,拿到了错误的数据,数据库则表现正常

socketTimeout触发后,连接抛CommunicationsException(严重异常,触发后连接应该断开), 但JDBC会检查请求是否被cancle了,如果cancle就会抛出MySQLTimeoutException异常,这是一个普通异常,连接会被重新放回连接池重用(导致下一个获取这个连接的线程可能会得到前一个请求的response)。

queryTimeout(queryTimeoutKillsConnection=True–来强制关闭连接)会触发启动一个新的连接向server发送 kill id的命令,MySQL5.7增加了max_statement_time/max_execution_time来做到在server上直接检测到这种查询,然后结束掉

jdbc 和 RDS之间 socket_timeout

jdbc驱动设置socketTimeout=1459,如果是socketTimeout触发客户端断开后,server端的SQL会继续执行,如果是client被kill则server端的SQL会被终止

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
# java -cp /home/admin/drds-server/lib/*:. Test "jdbc:mysql://172.16.40.215:3008/bank_000000?socketTimeout=1459" "user" "pass" "select sleep(2)" "1"
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 1,461 milliseconds ago. The last packet sent successfully to the server was 1,461 milliseconds ago.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:80)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3749)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2658)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2811)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2806)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2764)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1399)
at Test.main(Test.java:29)
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 8 more

或者开协程后的错误堆栈
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 1,460 milliseconds ago. The last packet sent successfully to the server was 1,459 milliseconds ago.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:80)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3749)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2658)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2811)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2806)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2764)
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1399)
at Test.main(Test.java:29)
Caused by: java.net.SocketTimeoutException: time out
at sun.nio.ch.WispSocketImpl$1$1.read0(WispSocketImpl.java:244)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:208)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:201)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 8 more

对应抓包,没有 kill动作

image-20220601141709318

CN 和 DN 间socket_timeout案例

设置CN到DN的socket_timeout为2秒,然后执行一个sleep

CN上抓包分析(stream 5是客户端到CN、stream6是CN到DN)如下,首先CN会计时2秒钟后发送quit给DN,然后断开和DN的连接,并返回一个错误给client,client发送quit断开连接:

image-20220601122556415

CN完整报错堆栈:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
2022-06-01 12:10:00.178 [ServerExecutor-bucket-2-19-thread-181] ERROR com.alibaba.druid.pool.DruidPooledStatement - [user=polardbx_root,host=10.101.32.6,port=43947,schema=bank] CommunicationsException, druid version 1.1.24, jdbcUrl : jdbc:mysql://172.16.40.215:3008/bank_000000?maintainTimeStats=false&rewriteBatchedStatements=false&failOverReadOnly=false&cacheResultSetMetadata=true&allowMultiQueries=true&clobberStreamingResults=true&autoReconnect=false&usePsMemOptimize=true&useServerPrepStmts=true&netTimeoutForStreamingResults=0&useSSL=false&metadataCacheSize=256&readOnlyPropagatesToServer=false&prepStmtCacheSqlLimit=4096&connectTimeout=5000&socketTimeout=9000000&cachePrepStmts=true&characterEncoding=utf8&prepStmtCacheSize=256, testWhileIdle true, idle millis 11861, minIdle 5, poolingCount 4, timeBetweenEvictionRunsMillis 60000, lastValidIdleMillis 11861, driver com.mysql.jdbc.Driver, exceptionSorter com.alibaba.polardbx.common.jdbc.sorter.MySQLExceptionSorter
2022-06-01 12:10:00.179 [ServerExecutor-bucket-2-19-thread-181] ERROR com.alibaba.druid.pool.DruidDataSource - [user=polardbx_root,host=10.101.32.6,port=43947,schema=bank] discard connection
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at sun.reflect.GeneratedConstructorAccessor72.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3749)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2658)
at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1281)
at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:782)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1367)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.execute(DruidPooledPreparedStatement.java:497)
at com.alibaba.polardbx.group.jdbc.TGroupDirectPreparedStatement.execute(TGroupDirectPreparedStatement.java:84)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQueryInner(MyJdbcHandler.java:1133)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQuery(MyJdbcHandler.java:990)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.doInit(MyPhyQueryCursor.java:83)
at com.alibaba.polardbx.executor.cursor.AbstractCursor.init(AbstractCursor.java:53)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.<init>(MyPhyQueryCursor.java:67)
at com.alibaba.polardbx.repo.mysql.spi.CursorFactoryMyImpl.repoCursor(CursorFactoryMyImpl.java:42)
at com.alibaba.polardbx.repo.mysql.handler.MyPhyQueryHandler.handle(MyPhyQueryHandler.java:24)
at com.alibaba.polardbx.executor.handler.HandlerCommon.handlePlan(HandlerCommon.java:102)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.executeInner(AbstractGroupExecutor.java:58)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.execByExecPlanNode(AbstractGroupExecutor.java:36)
at com.alibaba.polardbx.executor.TopologyExecutor.execByExecPlanNode(TopologyExecutor.java:34)
at com.alibaba.polardbx.transaction.TransactionExecutor.execByExecPlanNode(TransactionExecutor.java:120)
at com.alibaba.polardbx.executor.ExecutorHelper.executeByCursor(ExecutorHelper.java:155)
at com.alibaba.polardbx.executor.ExecutorHelper.execute(ExecutorHelper.java:70)
at com.alibaba.polardbx.executor.PlanExecutor.execByExecPlanNodeByOne(PlanExecutor.java:130)
at com.alibaba.polardbx.executor.PlanExecutor.execute(PlanExecutor.java:75)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeQuery(TConnection.java:682)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeSQL(TConnection.java:457)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.executeSQL(TPreparedStatement.java:65)
at com.alibaba.polardbx.matrix.jdbc.TStatement.executeInternal(TStatement.java:133)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.execute(TPreparedStatement.java:50)
at com.alibaba.polardbx.server.ServerConnection.innerExecute(ServerConnection.java:1131)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:883)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:850)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:844)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:82)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:31)
at com.alibaba.polardbx.server.ServerQueryHandler.executeSql(ServerQueryHandler.java:155)
at com.alibaba.polardbx.server.ServerQueryHandler.executeStatement(ServerQueryHandler.java:133)
at com.alibaba.polardbx.server.ServerQueryHandler.queryRaw(ServerQueryHandler.java:118)
at com.alibaba.polardbx.net.FrontendConnection.query(FrontendConnection.java:460)
at com.alibaba.polardbx.net.handler.FrontendCommandHandler.handle(FrontendCommandHandler.java:49)
at com.alibaba.polardbx.net.FrontendConnection.lambda$handleData$0(FrontendConnection.java:753)
at com.alibaba.polardbx.common.utils.thread.RunnableWithCpuCollector.run(RunnableWithCpuCollector.java:36)
at com.alibaba.polardbx.common.utils.thread.ServerThreadPool$RunnableAdapter.run(ServerThreadPool.java:793)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:874)
at com.alibaba.wisp.engine.WispTask.runOutsideWisp(WispTask.java:277)
at com.alibaba.wisp.engine.WispTask.runCommand(WispTask.java:252)
at com.alibaba.wisp.engine.WispTask.access$100(WispTask.java:33)
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
Caused by: java.net.SocketTimeoutException: time out
at sun.nio.ch.WispSocketImpl$1$1.read0(WispSocketImpl.java:244)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:208)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:201)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 53 common frames omitted
2022-06-01 12:10:00.179 [ServerExecutor-bucket-2-19-thread-181] WARN com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler - [user=polardbx_root,host=10.101.32.6,port=43947,schema=bank] [TDDL] [1461cdf8b2809000]Execute ERROR on GROUP: BANK_000000_GROUP, ATOM: dskey_bank_000000_group#pxc-xdb-s-pxcunrcbmk4g9lcpk0f24#172.16.40.215-3008#bank_000000, MERGE_UNION_SIZE:1, SQL: /*DRDS /10.101.32.6/1461cdf8b2809000/0// */SELECT SLEEP(?) AS `sleep(236)`, PARAM: [236], ERROR: Communications link failure, tddl version: 5.4.13-16522656
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at sun.reflect.GeneratedConstructorAccessor72.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3749)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2658)
at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1281)
at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:782)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1367)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.execute(DruidPooledPreparedStatement.java:497)
at com.alibaba.polardbx.group.jdbc.TGroupDirectPreparedStatement.execute(TGroupDirectPreparedStatement.java:84)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQueryInner(MyJdbcHandler.java:1133)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQuery(MyJdbcHandler.java:990)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.doInit(MyPhyQueryCursor.java:83)
at com.alibaba.polardbx.executor.cursor.AbstractCursor.init(AbstractCursor.java:53)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.<init>(MyPhyQueryCursor.java:67)
at com.alibaba.polardbx.repo.mysql.spi.CursorFactoryMyImpl.repoCursor(CursorFactoryMyImpl.java:42)
at com.alibaba.polardbx.repo.mysql.handler.MyPhyQueryHandler.handle(MyPhyQueryHandler.java:24)
at com.alibaba.polardbx.executor.handler.HandlerCommon.handlePlan(HandlerCommon.java:102)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.executeInner(AbstractGroupExecutor.java:58)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.execByExecPlanNode(AbstractGroupExecutor.java:36)
at com.alibaba.polardbx.executor.TopologyExecutor.execByExecPlanNode(TopologyExecutor.java:34)
at com.alibaba.polardbx.transaction.TransactionExecutor.execByExecPlanNode(TransactionExecutor.java:120)
at com.alibaba.polardbx.executor.ExecutorHelper.executeByCursor(ExecutorHelper.java:155)
at com.alibaba.polardbx.executor.ExecutorHelper.execute(ExecutorHelper.java:70)
at com.alibaba.polardbx.executor.PlanExecutor.execByExecPlanNodeByOne(PlanExecutor.java:130)
at com.alibaba.polardbx.executor.PlanExecutor.execute(PlanExecutor.java:75)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeQuery(TConnection.java:682)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeSQL(TConnection.java:457)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.executeSQL(TPreparedStatement.java:65)
at com.alibaba.polardbx.matrix.jdbc.TStatement.executeInternal(TStatement.java:133)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.execute(TPreparedStatement.java:50)
at com.alibaba.polardbx.server.ServerConnection.innerExecute(ServerConnection.java:1131)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:883)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:850)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:844)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:82)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:31)
at com.alibaba.polardbx.server.ServerQueryHandler.executeSql(ServerQueryHandler.java:155)
at com.alibaba.polardbx.server.ServerQueryHandler.executeStatement(ServerQueryHandler.java:133)
at com.alibaba.polardbx.server.ServerQueryHandler.queryRaw(ServerQueryHandler.java:118)
at com.alibaba.polardbx.net.FrontendConnection.query(FrontendConnection.java:460)
at com.alibaba.polardbx.net.handler.FrontendCommandHandler.handle(FrontendCommandHandler.java:49)
at com.alibaba.polardbx.net.FrontendConnection.lambda$handleData$0(FrontendConnection.java:753)
at com.alibaba.polardbx.common.utils.thread.RunnableWithCpuCollector.run(RunnableWithCpuCollector.java:36)
at com.alibaba.polardbx.common.utils.thread.ServerThreadPool$RunnableAdapter.run(ServerThreadPool.java:793)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:874)
at com.alibaba.wisp.engine.WispTask.runOutsideWisp(WispTask.java:277)
at com.alibaba.wisp.engine.WispTask.runCommand(WispTask.java:252)
at com.alibaba.wisp.engine.WispTask.access$100(WispTask.java:33)
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
Caused by: java.net.SocketTimeoutException: time out
at sun.nio.ch.WispSocketImpl$1$1.read0(WispSocketImpl.java:244)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:208)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:201)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 53 common frames omitted
2022-06-01 12:10:00.179 [ServerExecutor-bucket-2-19-thread-181] WARN com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler - [user=polardbx_root,host=10.101.32.6,port=43947,schema=bank] [TDDL] Reset conn socketTimeout failed, lastSocketTimeout is 9000000, tddl version: 5.4.13-16522656
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:80)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.Util.getInstance(Util.java:408)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:918)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
at com.mysql.jdbc.ConnectionImpl.throwConnectionClosedException(ConnectionImpl.java:1326)
at com.mysql.jdbc.ConnectionImpl.checkClosed(ConnectionImpl.java:1321)
at com.mysql.jdbc.ConnectionImpl.setNetworkTimeout(ConnectionImpl.java:5888)
at com.alibaba.polardbx.atom.utils.NetworkUtils.setNetworkTimeout(NetworkUtils.java:18)
at com.alibaba.polardbx.group.jdbc.TGroupDirectConnection.setNetworkTimeout(TGroupDirectConnection.java:433)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.resetPhyConnSocketTimeout(MyJdbcHandler.java:721)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQueryInner(MyJdbcHandler.java:1173)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQuery(MyJdbcHandler.java:990)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.doInit(MyPhyQueryCursor.java:83)
at com.alibaba.polardbx.executor.cursor.AbstractCursor.init(AbstractCursor.java:53)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.<init>(MyPhyQueryCursor.java:67)
at com.alibaba.polardbx.repo.mysql.spi.CursorFactoryMyImpl.repoCursor(CursorFactoryMyImpl.java:42)
at com.alibaba.polardbx.repo.mysql.handler.MyPhyQueryHandler.handle(MyPhyQueryHandler.java:24)
at com.alibaba.polardbx.executor.handler.HandlerCommon.handlePlan(HandlerCommon.java:102)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.executeInner(AbstractGroupExecutor.java:58)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.execByExecPlanNode(AbstractGroupExecutor.java:36)
at com.alibaba.polardbx.executor.TopologyExecutor.execByExecPlanNode(TopologyExecutor.java:34)
at com.alibaba.polardbx.transaction.TransactionExecutor.execByExecPlanNode(TransactionExecutor.java:120)
at com.alibaba.polardbx.executor.ExecutorHelper.executeByCursor(ExecutorHelper.java:155)
at com.alibaba.polardbx.executor.ExecutorHelper.execute(ExecutorHelper.java:70)
at com.alibaba.polardbx.executor.PlanExecutor.execByExecPlanNodeByOne(PlanExecutor.java:130)
at com.alibaba.polardbx.executor.PlanExecutor.execute(PlanExecutor.java:75)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeQuery(TConnection.java:682)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeSQL(TConnection.java:457)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.executeSQL(TPreparedStatement.java:65)
at com.alibaba.polardbx.matrix.jdbc.TStatement.executeInternal(TStatement.java:133)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.execute(TPreparedStatement.java:50)
at com.alibaba.polardbx.server.ServerConnection.innerExecute(ServerConnection.java:1131)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:883)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:850)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:844)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:82)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:31)
at com.alibaba.polardbx.server.ServerQueryHandler.executeSql(ServerQueryHandler.java:155)
at com.alibaba.polardbx.server.ServerQueryHandler.executeStatement(ServerQueryHandler.java:133)
at com.alibaba.polardbx.server.ServerQueryHandler.queryRaw(ServerQueryHandler.java:118)
at com.alibaba.polardbx.net.FrontendConnection.query(FrontendConnection.java:460)
at com.alibaba.polardbx.net.handler.FrontendCommandHandler.handle(FrontendCommandHandler.java:49)
at com.alibaba.polardbx.net.FrontendConnection.lambda$handleData$0(FrontendConnection.java:753)
at com.alibaba.polardbx.common.utils.thread.RunnableWithCpuCollector.run(RunnableWithCpuCollector.java:36)
at com.alibaba.polardbx.common.utils.thread.ServerThreadPool$RunnableAdapter.run(ServerThreadPool.java:793)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:874)
at com.alibaba.wisp.engine.WispTask.runOutsideWisp(WispTask.java:277)
at com.alibaba.wisp.engine.WispTask.runCommand(WispTask.java:252)
at com.alibaba.wisp.engine.WispTask.access$100(WispTask.java:33)
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
2022-06-01 12:10:00.179 [ServerExecutor-bucket-2-19-thread-181] WARN com.alibaba.polardbx.executor.ExecutorHelper - [user=polardbx_root,host=10.101.32.6,port=43947,schema=bank] [TDDL] PhyQuery(node="BANK_000000_GROUP", sql="SELECT SLEEP(?) AS `sleep(236)`")
, tddl version: 5.4.13-16522656
2022-06-01 12:10:00.180 [ServerExecutor-bucket-2-19-thread-181] WARN com.alibaba.polardbx.server.ServerConnection - [user=polardbx_root,host=10.101.32.6,port=43947,schema=bank] [TDDL] [ERROR-CODE: 3009][1461cdf8b2809000] SQL: /*+TDDL:node(0) and SOCKET_TIMEOUT=2000 */ select sleep(236), tddl version: 5.4.13-16522656
com.alibaba.polardbx.common.exception.TddlRuntimeException: ERR-CODE: [TDDL-4614][ERR_EXECUTE_ON_MYSQL] Error occurs when execute on GROUP 'BANK_000000_GROUP' ATOM 'dskey_bank_000000_group#pxc-xdb-s-pxcunrcbmk4g9lcpk0f24#172.16.40.215-3008#bank_000000': Communications link failure
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.handleException(MyJdbcHandler.java:1935)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.generalHandlerException(MyJdbcHandler.java:1911)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQueryInner(MyJdbcHandler.java:1168)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQuery(MyJdbcHandler.java:990)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.doInit(MyPhyQueryCursor.java:83)
at com.alibaba.polardbx.executor.cursor.AbstractCursor.init(AbstractCursor.java:53)
at com.alibaba.polardbx.repo.mysql.spi.MyPhyQueryCursor.<init>(MyPhyQueryCursor.java:67)
at com.alibaba.polardbx.repo.mysql.spi.CursorFactoryMyImpl.repoCursor(CursorFactoryMyImpl.java:42)
at com.alibaba.polardbx.repo.mysql.handler.MyPhyQueryHandler.handle(MyPhyQueryHandler.java:24)
at com.alibaba.polardbx.executor.handler.HandlerCommon.handlePlan(HandlerCommon.java:102)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.executeInner(AbstractGroupExecutor.java:58)
at com.alibaba.polardbx.executor.AbstractGroupExecutor.execByExecPlanNode(AbstractGroupExecutor.java:36)
at com.alibaba.polardbx.executor.TopologyExecutor.execByExecPlanNode(TopologyExecutor.java:34)
at com.alibaba.polardbx.transaction.TransactionExecutor.execByExecPlanNode(TransactionExecutor.java:120)
at com.alibaba.polardbx.executor.ExecutorHelper.executeByCursor(ExecutorHelper.java:155)
at com.alibaba.polardbx.executor.ExecutorHelper.execute(ExecutorHelper.java:70)
at com.alibaba.polardbx.executor.PlanExecutor.execByExecPlanNodeByOne(PlanExecutor.java:130)
at com.alibaba.polardbx.executor.PlanExecutor.execute(PlanExecutor.java:75)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeQuery(TConnection.java:682)
at com.alibaba.polardbx.matrix.jdbc.TConnection.executeSQL(TConnection.java:457)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.executeSQL(TPreparedStatement.java:65)
at com.alibaba.polardbx.matrix.jdbc.TStatement.executeInternal(TStatement.java:133)
at com.alibaba.polardbx.matrix.jdbc.TPreparedStatement.execute(TPreparedStatement.java:50)
at com.alibaba.polardbx.server.ServerConnection.innerExecute(ServerConnection.java:1131)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:883)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:850)
at com.alibaba.polardbx.server.ServerConnection.execute(ServerConnection.java:844)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:82)
at com.alibaba.polardbx.server.handler.SelectHandler.handle(SelectHandler.java:31)
at com.alibaba.polardbx.server.ServerQueryHandler.executeSql(ServerQueryHandler.java:155)
at com.alibaba.polardbx.server.ServerQueryHandler.executeStatement(ServerQueryHandler.java:133)
at com.alibaba.polardbx.server.ServerQueryHandler.queryRaw(ServerQueryHandler.java:118)
at com.alibaba.polardbx.net.FrontendConnection.query(FrontendConnection.java:460)
at com.alibaba.polardbx.net.handler.FrontendCommandHandler.handle(FrontendCommandHandler.java:49)
at com.alibaba.polardbx.net.FrontendConnection.lambda$handleData$0(FrontendConnection.java:753)
at com.alibaba.polardbx.common.utils.thread.RunnableWithCpuCollector.run(RunnableWithCpuCollector.java:36)
at com.alibaba.polardbx.common.utils.thread.ServerThreadPool$RunnableAdapter.run(ServerThreadPool.java:793)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:874)
at com.alibaba.wisp.engine.WispTask.runOutsideWisp(WispTask.java:277)
at com.alibaba.wisp.engine.WispTask.runCommand(WispTask.java:252)
at com.alibaba.wisp.engine.WispTask.access$100(WispTask.java:33)
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at sun.reflect.GeneratedConstructorAccessor72.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3749)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2658)
at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1281)
at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:782)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1367)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.execute(DruidPooledPreparedStatement.java:497)
at com.alibaba.polardbx.group.jdbc.TGroupDirectPreparedStatement.execute(TGroupDirectPreparedStatement.java:84)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQueryInner(MyJdbcHandler.java:1133)
... 44 common frames omitted
Caused by: java.net.SocketTimeoutException: time out
at sun.nio.ch.WispSocketImpl$1$1.read0(WispSocketImpl.java:244)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:208)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:201)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 53 common frames omitted

应用和 DB 间丢包导致 keepalive 心跳失败

应用使用了 Druid 连接池来维护到 DB 间的所有长连接

应用和 DB 间丢包导致 keepalive 心跳失败,进而 OS会断开这个连接

image-20230322171621838

一个连接归还给Druid连接池都要做清理动作,就是第一个红框的rollback/autocommit=1

归还后OS 层面会探活TCP 连接,DB(4381端口)多次后多次不响应keepalive 后,OS 触发reset tcp断开连接,此时上层应用(比如Druid连接池、比如Tomcat)还不知道此连接在OS 层面已经断开

1
2
3
4
#sysctl -a |grep -i keepalive
net.ipv4.tcp_keepalive_intvl = 3
net.ipv4.tcp_keepalive_probes = 60
net.ipv4.tcp_keepalive_time = 20

继续过来一个新连接,业务取到这个连接执行查询就会报如下错误:

1
2
3
4
5
6
7
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 162,776 milliseconds ago. The last packet sent successfully to the server was 162,776 milliseconds ago.

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

The last packet successfully received from the server was 162,776 milliseconds ago. The last packet sent successfully to the server was 162,776 milliseconds ago.

这个错误就是因为OS层面连接断开了,并且断开了162秒(和截图时间戳能对应上)

对应的错误堆栈:

1
2
3
4
5
6
7
8
Caused by: java.net.SocketException: Connection timed out (Write failed)
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:111)
at java.net.SocketOutputStream.write(SocketOutputStream.java:155)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3725)
... 46 common frames omitted

kill 案例

kill mysql client

mysql client连cn执行一个很慢的SQL,然后kill掉mysql client

cn报错:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
2022-06-01 11:45:59.063 [ServerExecutor-bucket-0-17-thread-158] ERROR com.alibaba.druid.pool.DruidPooledStatement - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] CommunicationsException, druid version 1.1.24, jdbcUrl : jdbc:mysql://172.16.40.215:3008/bank_000000?maintainTimeStats=false&rewriteBatchedStatements=false&failOverReadOnly=false&cacheResultSetMetadata=true&allowMultiQueries=true&clobberStreamingResults=true&autoReconnect=false&usePsMemOptimize=true&useServerPrepStmts=true&netTimeoutForStreamingResults=0&useSSL=false&metadataCacheSize=256&readOnlyPropagatesToServer=false&prepStmtCacheSqlLimit=4096&connectTimeout=5000&socketTimeout=9000000&cachePrepStmts=true&characterEncoding=utf8&prepStmtCacheSize=256, testWhileIdle true, idle millis 72028, minIdle 5, poolingCount 4, timeBetweenEvictionRunsMillis 60000, lastValidIdleMillis 345734, driver com.mysql.jdbc.Driver, exceptionSorter com.alibaba.polardbx.common.jdbc.sorter.MySQLExceptionSorter
2022-06-01 11:45:59.064 [ServerExecutor-bucket-0-17-thread-158] ERROR com.alibaba.druid.pool.DruidDataSource - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] discard connection
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at sun.reflect.GeneratedConstructorAccessor72.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
…………
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
Caused by: java.net.SocketException: Socket is closed
at java.net.Socket.getSoTimeout(Socket.java:1291)
at sun.nio.ch.WispSocketImpl$1$1.read0(WispSocketImpl.java:249)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:208)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:201)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 53 common frames omitted
2022-06-01 11:45:59.065 [ServerExecutor-bucket-0-17-thread-158] WARN com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] [TDDL] [1461c86bbe809001]Execute ERROR on GROUP: BANK_000000_GROUP, ATOM: dskey_bank_000000_group#pxc-xdb-s-pxcunrcbmk4g9lcpk0f24#172.16.40.215-3008#bank_000000, MERGE_UNION_SIZE:1, SQL: /*DRDS /10.101.32.6/1461c86bbe809001/0// */SELECT SLEEP(?) AS `sleep(236)`, PARAM: [236], ERROR: Communications link failure, tddl version: 5.4.13-16522656
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at sun.reflect.GeneratedConstructorAccessor72.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
…………
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
Caused by: java.net.SocketException: Socket is closed
at java.net.Socket.getSoTimeout(Socket.java:1291)
at sun.nio.ch.WispSocketImpl$1$1.read0(WispSocketImpl.java:249)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:208)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:201)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 53 common frames omitted
2022-06-01 11:45:59.065 [ServerExecutor-bucket-0-17-thread-158] WARN com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] [TDDL] Reset conn socketTimeout failed, lastSocketTimeout is 9000000, tddl version: 5.4.13-16522656
com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:80)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
…………
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
2022-06-01 11:45:59.065 [ServerExecutor-bucket-0-17-thread-158] WARN com.alibaba.polardbx.executor.ExecutorHelper - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] [TDDL] PhyQuery(node="BANK_000000_GROUP", sql="SELECT SLEEP(?) AS `sleep(236)`")
, tddl version: 5.4.13-16522656
2022-06-01 11:45:59.066 [ServerExecutor-bucket-0-17-thread-158] ERROR com.alibaba.polardbx.server.ServerConnection - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] [TDDL] Interrupted unexpectedly for 1461c86bbe809001, tddl version: 5.4.13-16522656
java.lang.InterruptedException: null
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1310)
at com.alibaba.polardbx.common.utils.BooleanMutex$Sync.innerGet(BooleanMutex.java:136)
at com.alibaba.polardbx.common.utils.BooleanMutex.get(BooleanMutex.java:53)
at com.alibaba.polardbx.common.utils.thread.ServerThreadPool.waitByTraceId(ServerThreadPool.java:445)
at com.alibaba.polardbx.server.ServerConnection.innerExecute(ServerConnection.java:1291)
……
at com.alibaba.wisp.engine.WispTask.access$100(WispTask.java:33)
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
2022-06-01 11:45:59.066 [ServerExecutor-bucket-0-17-thread-158] WARN com.alibaba.polardbx.server.ServerConnection - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] [TDDL] [ERROR-CODE: 3009][1461c86bbe809001] SQL: /*+TDDL:node(0) and SOCKET_TIMEOUT=40000 */ select sleep(236), tddl version: 5.4.13-16522656
com.alibaba.polardbx.common.exception.TddlRuntimeException: ERR-CODE: [TDDL-4614][ERR_EXECUTE_ON_MYSQL] Error occurs when execute on GROUP 'BANK_000000_GROUP' ATOM 'dskey_bank_000000_group#pxc-xdb-s-pxcunrcbmk4g9lcpk0f24#172.16.40.215-3008#bank_000000': Communications link failure
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.handleException(MyJdbcHandler.java:1935)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.generalHandlerException(MyJdbcHandler.java:1911)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQueryInner(MyJdbcHandler.java:1168)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQuery(MyJdbcHandler.java:990)
…………
at com.alibaba.wisp.engine.WispTask$CacheableCoroutine.run(WispTask.java:223)
at java.dyn.CoroutineBase.startInternal(CoroutineBase.java:60)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
at sun.reflect.GeneratedConstructorAccessor72.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:989)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3749)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3649)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4090)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2658)
at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1281)
at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:782)
at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1367)
at com.alibaba.druid.pool.DruidPooledPreparedStatement.execute(DruidPooledPreparedStatement.java:497)
at com.alibaba.polardbx.group.jdbc.TGroupDirectPreparedStatement.execute(TGroupDirectPreparedStatement.java:84)
at com.alibaba.polardbx.repo.mysql.spi.MyJdbcHandler.executeQueryInner(MyJdbcHandler.java:1133)
... 44 common frames omitted
Caused by: java.net.SocketException: Socket is closed
at java.net.Socket.getSoTimeout(Socket.java:1291)
at sun.nio.ch.WispSocketImpl$1$1.read0(WispSocketImpl.java:249)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:208)
at sun.nio.ch.WispSocketImpl$1$1.read(WispSocketImpl.java:201)
at com.mysql.jdbc.util.ReadAheadInputStream.fill(ReadAheadInputStream.java:101)
at com.mysql.jdbc.util.ReadAheadInputStream.readFromUnderlyingStreamIfNecessary(ReadAheadInputStream.java:144)
at com.mysql.jdbc.util.ReadAheadInputStream.read(ReadAheadInputStream.java:174)
at com.mysql.jdbc.MysqlIO.readFully(MysqlIO.java:3183)
at com.mysql.jdbc.MysqlIO.reuseAndReadPacket(MysqlIO.java:3659)
... 53 common frames omitted
2022-06-01 11:45:59.071 [KillExecutor-15-thread-49] WARN com.alibaba.polardbx.server.ServerConnection - [user=polardbx_root,host=10.101.32.6,port=50684,schema=bank] [TDDL] Connection Killed, tddl version: 5.4.13-16522656

mysqld报错:

1
2022-06-01T11:45:58.915371+08:00 8218735 [Note] Aborted connection 8218735 to db: 'bank_000000' user: 'rds_polardb_x' host: '172.16.40.214' (Got an error reading communication packets)

172.16.40.214是客户端IP

抓包看到CN收到mysql client发过来的fin,CN回复fin断开连接

CN会给DN在新的连接上发Kill Query(stream 1596),同时会在原来的连接(stream 583)上发fin,然后原来的连接收到DN的response(被kill),然后CN发reset给DN

image-20220601120626629

下图是sleep 连接的收发包

image-20220601120417026

Kill jdbc client

Java jdbc client被kill后没有错误堆栈,kill后触发socket.close(对应client发送fin断开连接),kill后server端SQL也被立即中断

抓包:

image-20220601143200253

server端报错信息:

1
2022-06-01T14:33:52.204848+08:00 8288839 [Note] Aborted connection 8288839 to db: 'bank_000000' user: 'user' host: '172.16.40.214' (Got an error reading communication packets)

Statement timeout

1
2
3
4
# java -cp /home/admin/drds-server/lib/*:. Test "jdbc:mysql://172.16.40.215:3008/bank_000000?socketTimeout=5459" "user" "pass" "select sleep(180)" "1" 3
com.mysql.jdbc.exceptions.MySQLTimeoutException: Statement cancelled due to timeout or client request
at com.mysql.jdbc.StatementImpl.executeQuery(StatementImpl.java:1419)
at Test.main(Test.java:31)

statement会设置一个timer,到时间还没有返回结果就创建一个新连接发送kill query

server 端收到kill后终止SQL执行,抓包看到Server端主动提前返回了错误

image-20220601152401387

参考资料

MySQL JDBC StreamResult通信原理浅析