If this RST had really been sent by MySQL, its IP identification should have been 23404, because the previous packet MySQL sent carried identification 23403. Instead it is 13502 — and, suspiciously, that is the same identification as the Use Database packet Navicat sent. That is clearly wrong.
So we can reasonably infer that a middle device received the Use Database packet, matched some blocking rule, and forged a RST to the MySQL Server while impersonating Navicat: it reused Navicat's source IP, source port, sequence number, and even the IP identification. As a result, the RST the MySQL Server received looked perfectly normal (every field checked out, leaving no visible trace of the forgery).
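This IP-identification continuity check can be sketched in a few lines of Python (a hypothetical helper, operating on (sender, ip_id) pairs extracted from a capture; not part of the original analysis tooling):

```python
def find_ipid_anomalies(packets):
    """Given (src, ip_id) pairs in capture order, flag packets whose
    IP identification does not follow the sender's previous value.
    Many stacks increment the IP ID per packet, so a jump suggests
    the packet was forged by a different device."""
    last_id = {}
    anomalies = []
    for i, (src, ip_id) in enumerate(packets):
        if src in last_id:
            expected = (last_id[src] + 1) & 0xFFFF  # IP ID wraps at 16 bits
            if ip_id != expected:
                anomalies.append((i, src, expected, ip_id))
        last_id[src] = ip_id
    return anomalies

# The case above: MySQL's previous packet carried ID 23403, so a genuine
# RST should carry 23404 -- instead the capture shows 13502.
pkts = [("mysql", 23403), ("mysql", 13502)]
print(find_ipid_anomalies(pkts))  # [(1, 'mysql', 23404, 13502)]
```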
[ren@vb 18:14 /home/ren]
$sudo mkdir /media/ren/hd
[ren@vb 18:15 /home/ren]
$sudo mount /dev/sd
sda sda1 sda2 sda5 sdb sdb1 sdb2 sdb5 sdc sdc1 sdc2 sdc5
[ren@vb 18:15 /home/ren]
$sudo mount /dev/sdc1 /media/ren/hd
mount: /dev/sdc1 is write-protected, mounting read-only
mount: wrong fs type, bad option, bad superblock on /dev/sdc1,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so.
The new socket option allows multiple sockets on the same host to bind to the same port, and is intended to improve the performance of multithreaded network server applications running on top of multicore systems.
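A minimal, Linux-specific sketch of SO_REUSEPORT with two UDP sockets (the port is chosen by the kernel; a real multithreaded server would use TCP listeners, one per worker):

```python
import socket

# Both sockets set SO_REUSEPORT before bind, so the second bind to the
# same port succeeds instead of failing with EADDRINUSE; the kernel then
# load-balances incoming traffic across the sockets.
s1 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s1.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
s1.bind(("127.0.0.1", 0))          # let the kernel pick a free port
port = s1.getsockname()[1]

s2 = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s2.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
s2.bind(("127.0.0.1", port))       # same port, no EADDRINUSE

same_port = (s1.getsockname()[1] == s2.getsockname()[1])
print(same_port)  # True
s1.close(); s2.close()
```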
The product of a data link's capacity and its end-to-end delay: the result is the maximum amount of unacknowledged data that can be in flight at any point in time.
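As a worked example (assumed figures: a 1 Gbit/s link with 100 ms round-trip delay):

```python
# Bandwidth-delay product: capacity x delay, converted from bits to bytes.
bandwidth_bits_per_s = 1_000_000_000   # 1 Gbit/s
rtt_s = 0.100                          # 100 ms end-to-end delay
bdp_bytes = int(bandwidth_bits_per_s * rtt_s / 8)
print(bdp_bytes)  # 12500000, i.e. 12.5 MB
# To keep such a link full, the sender must be allowed ~12.5 MB of
# unacknowledged ("in flight") data, so socket buffers must be at
# least that large.
```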
ss -itmpn dst "10.81.212.8"
State Recv-Q Send-Q Local Address:Port Peer Address:Port
ESTAB 0 0 10.xx.xx.xxx:22 10.yy.yy.yyy:12345 users:(("sshd",pid=1442,fd=3))
skmem:(r0,rb369280,t0,tb87040,f4096,w0,o0,bl0,d92)
Here we can see this socket has a receive buffer of 369280 bytes and a send buffer of 87040 bytes. Keep in mind that the kernel doubles any requested socket buffer allocation to cover bookkeeping overhead: if a process asks for a 256 KiB buffer via setsockopt(SO_RCVBUF), it actually gets 512 KiB of buffer space. This is described in man 7 tcp.
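This doubling is easy to observe from user space (a sketch; on Linux the granted size is also capped by net.core.rmem_max, so the exact factor can vary):

```python
import socket

# The value passed to SO_RCVBUF is doubled by the kernel to account for
# bookkeeping overhead, so getsockopt() returns about 2x what was
# requested (see man 7 tcp and man 7 socket), subject to the
# net.core.rmem_max cap.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
requested = 64 * 1024
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
granted = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(requested, granted)   # on Linux, granted is typically 2 * requested
s.close()
```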
The kernel code that computes the initial window — the key part is the line that sets tp->rcvq_space.space:
/* TCP initial congestion window as per rfc6928 */
#define TCP_INIT_CWND		10

/* 3. Try to fixup all. It is made immediately after connection enters
 *    established state.
 */
void tcp_init_buffer_space(struct sock *sk)
{
	int tcp_app_win = sock_net(sk)->ipv4.sysctl_tcp_app_win;
	struct tcp_sock *tp = tcp_sk(sk);
	int maxwin;

	if (!(sk->sk_userlocks & SOCK_SNDBUF_LOCK))
		tcp_sndbuf_expand(sk);

	/* initial estimate of the maximum receive window */
	tp->rcvq_space.space = min_t(u32, tp->rcv_wnd, TCP_INIT_CWND * tp->advmss);
	tcp_mstamp_refresh(tp);
	tp->rcvq_space.time = tp->tcp_mstamp;
	tp->rcvq_space.seq = tp->copied_seq;

	maxwin = tcp_full_space(sk);

	if (tp->window_clamp >= maxwin) {
		tp->window_clamp = maxwin;

		if (tcp_app_win && maxwin > 4 * tp->advmss)
			tp->window_clamp = max(maxwin -
					       (maxwin >> tcp_app_win),
					       4 * tp->advmss);
	}

	/* Force reservation of one segment. */
	if (tcp_app_win &&
	    tp->window_clamp > 2 * tp->advmss &&
	    tp->window_clamp + tp->advmss > maxwin)
		tp->window_clamp = max(2 * tp->advmss, maxwin - tp->advmss);

	tp->rcv_ssthresh = min(tp->rcv_ssthresh, tp->window_clamp);
	tp->snd_cwnd_stamp = tcp_jiffies32;
}
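The key line above can be mirrored in a few lines of Python (a sketch, not kernel code; TCP_INIT_CWND and advmss follow the listing):

```python
TCP_INIT_CWND = 10  # RFC 6928 initial congestion window, in segments

def initial_rcvq_space(rcv_wnd, advmss):
    """Mirror of the tp->rcvq_space.space assignment above: the receiver's
    first estimate of how much data the sender can have in flight is the
    smaller of the receive window and 10 full-size segments."""
    return min(rcv_wnd, TCP_INIT_CWND * advmss)

# With a 64 KiB receive window and MSS 1460, the estimate is 14600 bytes
# (10 full-size segments), not the whole window.
print(initial_rcvq_space(65535, 1460))  # 14600
```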
protected int socketRecvBuffer = 32 * 1024; // receive buffer: 32 KiB
protected int socketSendBuffer = 64 * 1024; // send buffer: 64 KiB (the kernel will actually allocate 128 KiB)

// If the buffer sizes are set to 0, the system defaults from /etc/sysctl.conf apply
// (refer: net.ipv4.tcp_wmem / net.ipv4.tcp_rmem)
if (socketRecvBuffer > 0) {
    socket.setReceiveBufferSize(socketRecvBuffer);
}
if (socketSendBuffer > 0) {
    socket.setSendBufferSize(socketSendBuffer);
}
#!/usr/bin/stap
# Simple probe to detect when a process is waiting for more socket send
# buffer memory. Usually means the process is doing writes larger than the
# socket send buffer size or there is a slow receiver at the other side.
# Increasing the socket's send buffer size might help decrease application
# latencies, but it might also make it worse, so buyer beware.
probe kernel.function("sk_stream_wait_memory")
{
printf("%u: %s(%d) blocked on full send buffer\n",
       gettimeofday_us(), execname(), pid())
}
probe kernel.function("sk_stream_wait_memory").return
{
printf("%u: %s(%d) recovered from full send buffer\n",
       gettimeofday_us(), execname(), pid())
}
# Typical output: timestamp in microseconds: procname(pid) event
#
# 1218230114875167: python(17631) blocked on full send buffer
# 1218230114876196: python(17631) recovered from full send buffer
# 1218230114876271: python(17631) blocked on full send buffer
# 1218230114876479: python(17631) recovered from full send buffer
2024 Netflix: Investigation of a Cross-regional Network Performance Issue. A kernel upgrade removed the sysctl_tcp_adv_win_scale parameter and replaced it with a new calculation, so requests that used to finish transferring within 30 seconds could no longer do so under the new kernel's mechanism, causing request timeouts on the application side. "This commit obsoleted sysctl_tcp_adv_win_scale and introduced a scaling_ratio that can more accurately calculate the overhead or window size, which is the right thing to do. With the change, the window size is now rcvbuf * scaling_ratio." In short, after the kernel upgrade the effective receive buffer size was halved; throughput was therefore halved as well, and data transfer time doubled.
The receive window is not fully opened immediately. Linux keeps the receive window small, as it tries to predict the metadata cost and avoid overshooting the memory budget and thereby triggering TCP collapse. By default, with net.ipv4.tcp_adv_win_scale=1, the upper limit for the advertised window is 50% of "free" memory. rcv_ssthresh starts at 64 KiB and grows linearly up to that limit.
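The 50% figure comes from how older kernels derived the advertised window from buffer space; a Python sketch of that calculation (modeled on the kernel's tcp_win_from_space() before the scaling_ratio change):

```python
def tcp_win_from_space(space, tcp_adv_win_scale):
    """Sketch of the pre-scaling_ratio kernel logic: how much of the
    receive buffer may be advertised as window, the remainder being
    reserved for sk_buff metadata overhead.  A positive scale reserves
    space/2^scale; a non-positive scale advertises space/2^(-scale)."""
    if tcp_adv_win_scale <= 0:
        return space >> -tcp_adv_win_scale
    return space - (space >> tcp_adv_win_scale)

# With the default tcp_adv_win_scale=1, only half of a 1 MiB buffer
# becomes advertised window:
print(tcp_win_from_space(1 << 20, 1))  # 524288
```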
Using a large chunk of receive buffer space for the metadata is not really what the programmer wants. To counter that, when the socket is under memory pressure complex logic is run with the intention of freeing some space. One of the operations is tcp_collapse and it will merge adjacent TCP packets into one larger sk_buff. This behavior is pretty much a garbage collection (GC)—and as everyone knows, when the garbage collection kicks in, the latency must spike.
#################################################################
# tcp_retransmit.stp
# Author: Yang Bingwu (detailyang) <detailyang@gmail.com>
# This systemtap script prints TCP retransmission packets
#################################################################
global record%
global cwnd_record%
probe begin { log("Printing tcp retransmission") }
cat tcpsynbl_example.txt
Demonstrations of tcpsynbl, the Linux bpftrace/eBPF version.
This tool shows the TCP SYN backlog size during SYN arrival as a histogram.
This lets you see how close your applications are to hitting the backlog limit
and dropping SYNs (causing performance issues with SYN retransmits). For
example:
# ./tcpsynbl.bt
Attaching 4 probes...
Tracing SYN backlog size. Ctrl-C to end.
^C
@backlog[backlog limit]: histogram of backlog size
@backlog[500]:
[0] 2266 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
[1] 3 | |
[2, 4) 1 | |
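The bucket boundaries in the histogram above are powers of two; a minimal Python sketch of that log2 bucketing (not bpftrace's actual implementation):

```python
def log2_hist(values):
    """Power-of-2 bucketing like bpftrace's hist(): 0 gets its own
    bucket, and value v falls in bucket [2^(k-1), 2^k) where
    k = v.bit_length()."""
    buckets = {}
    for v in values:
        k = 0 if v == 0 else v.bit_length()  # 1 -> bucket 1, 2..3 -> 2, ...
        buckets[k] = buckets.get(k, 0) + 1
    return buckets

# Backlog samples like the capture above: mostly empty, occasionally 1-3.
print(log2_hist([0, 0, 0, 1, 2, 3]))  # {0: 3, 1: 1, 2: 2}
```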
cat tcpaccept_example.txt
Demonstrations of tcpaccept, the Linux bpftrace/eBPF version.
This tool traces the kernel function accepting TCP socket connections (eg, a passive connection via accept(); not connect()). Some example output (IP addresses changed to protect the innocent):
# ./tcpaccept
Tracing tcp accepts. Hit Ctrl-C to end.
TIME     PID     COMM   RADDR          RPORT LADDR          LPORT BL
00:34:19 3949061 nginx  10.228.22.228  44226 10.229.20.169  8080  0/128
00:34:19 3951399 ruby   127.0.0.1      52422 127.0.0.1      8000  0/128
00:34:19 3949062 nginx  10.228.23.128  35408 10.229.20.169  8080  0/128
This output shows three connections: two IPv4 connections accepted by "nginx" processes listening on port 8080, and one local connection accepted by a "ruby" process listening on port 8000. The remote address and port are printed, along with the accept queue's current size and maximum size (the BL column, here 0/128). The overhead of this tool should be negligible, since it only traces the kernel function performing accept; it does not trace every packet and then filter. This tool only traces successful TCP accept()s. Connection attempts to closed ports will not be shown (those can be traced via other functions). There is another version of this tool in bcc: https://github.com/iovisor/bcc
--- 1687260767618 ns (100.00%), 91083 samples
  [ 0] ch.qos.logback.classic.sift.SiftingAppender
  [ 1] ch.qos.logback.core.AppenderBase.doAppend
  [ 2] ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders
  [ 3] ch.qos.logback.classic.Logger.appendLoopOnAppenders
  [ 4] ch.qos.logback.classic.Logger.callAppenders
  [ 5] ch.qos.logback.classic.Logger.buildLoggingEventAndAppend
  [ 6] ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus
  [ 7] ch.qos.logback.classic.Logger.info
  [ 8] com.taobao.tddl.common.utils.logger.slf4j.Slf4jLogger.info
  [ 9] com.taobao.tddl.common.utils.logger.support.FailsafeLogger.info
  [10] com.alibaba.cobar.server.util.LogUtils.recordSql
  [11] com.alibaba.cobar.server.ServerConnection.innerExecute
  [12] com.alibaba.cobar.server.ServerConnection.innerExecute
  [13] com.alibaba.cobar.server.ServerConnection$1.run
  [14] com.taobao.tddl.common.utils.thread.FlowControlThreadPool$RunnableAdapter.run
  [15] java.util.concurrent.Executors$RunnableAdapter.call
  [16] java.util.concurrent.FutureTask.run
  [17] java.util.concurrent.ThreadPoolExecutor.runWorker
  [18] java.util.concurrent.ThreadPoolExecutor$Worker.run
  [19] java.lang.Thread.run

"ServerExecutor-3-thread-480" #753 daemon prio=5 os_prio=0 tid=0x00007f8265842000 nid=0x26f1 waiting for monitor entry [0x00007f82270bf000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at ch.qos.logback.core.AppenderBase.doAppend(AppenderBase.java:64)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	at ch.qos.logback.core.spi.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:48)
	at ch.qos.logback.classic.Logger.appendLoopOnAppenders(Logger.java:282)
	at ch.qos.logback.classic.Logger.callAppenders(Logger.java:269)
	at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:470)
	at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:424)
	at ch.qos.logback.classic.Logger.info(Logger.java:628)
	at com.taobao.tddl.common.utils.logger.slf4j.Slf4jLogger.info(Slf4jLogger.java:42)
	at com.taobao.tddl.common.utils.logger.support.FailsafeLogger.info(FailsafeLogger.java:102)
	at com.alibaba.cobar.server.util.LogUtils.recordSql(LogUtils.java:115)
	at com.alibaba.cobar.server.ServerConnection.innerExecute(ServerConnection.java:874)
	- locked <0x00007f87382cb108> (a com.alibaba.cobar.server.ServerConnection)
	at com.alibaba.cobar.server.ServerConnection.innerExecute(ServerConnection.java:569)
	- locked <0x00007f87382cb108> (a com.alibaba.cobar.server.ServerConnection)
	at com.alibaba.cobar.server.ServerConnection$1.run(ServerConnection.java:402)
	at com.taobao.tddl.common.utils.thread.FlowControlThreadPool$RunnableAdapter.run(FlowControlThreadPool.java:480)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1152)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:627)
	at java.lang.Thread.run(Thread.java:861)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- locked <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
	- waiting to lock <0x00007f866dcec208> (a ch.qos.logback.classic.sift.SiftingAppender)
/**
* Creates a new buffered output stream to write data to the
* specified underlying output stream.
*
* @param out the underlying output stream.
*/
public BufferedOutputStream(OutputStream out) {
this(out, 8192);
}
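The 8192-byte default above is the classic buffered-writer pattern: accumulate small writes and hand the underlying stream a few large chunks. A Python sketch of the same idea (the CountingRaw class is hypothetical, defined only to count underlying writes):

```python
import io

class CountingRaw(io.RawIOBase):
    """Raw sink that records how many times it is actually written to."""
    def __init__(self):
        self.calls = 0
        self.data = bytearray()
    def writable(self):
        return True
    def write(self, b):
        self.calls += 1
        self.data += bytes(b)
        return len(b)

raw = CountingRaw()
buf = io.BufferedWriter(raw, buffer_size=8192)  # same default as BufferedOutputStream
for _ in range(1000):
    buf.write(b"x" * 16)   # 1000 small writes, 16000 bytes total
buf.flush()

# The buffer coalesces the 1000 application writes into a handful of
# large writes to the underlying stream.
print(raw.calls, len(raw.data))
```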