获取一直FullGC下的java进程HeapDump的小技巧

获取一直FullGC下的java进程HeapDump的小技巧

就是小技巧,操作步骤需要查询,随手记录

  • 找到java进程,gdb attach上去, 例如 gdb -p 12345
  • 找到这个HeapDumpBeforeFullGC的地址(这个flag如果为true,会在FullGC之前做HeapDump,默认是false)
1
2
(gdb) p &HeapDumpBeforeFullGC
$2 = (<data variable, no debug info> *) 0x7f7d50fc660f <HeapDumpBeforeFullGC>
  • Copy 地址:0x7f7d50fc660f
  • 然后把他设置为true,这样下次FGC之前就会生成一份dump文件
1
2
(gdb) set *0x7f7d50fc660f = 1
(gdb) quit
  • 最后,等一会,等下次FullGC触发,你就有HeapDump了!
    (如果没有指定heapdump的名字,默认是 java_pidxxx.hprof)

(PS. jstat -gcutil pid 可以查看gc的概况)

(操作完成后记得gdb上去再设置回去,不然可能一直fullgc,导致把磁盘打满).

其它

在jvm还有响应的时候可以: jinfo -flag +HeapDumpBeforeFullGC pid 设置HeapDumpBeforeFullGC 为true(- 为false,+-都不要为只打印值)

kill -3 产生coredump 存放在 kernel.core_pattern=/root/core (/etc/sysctl.conf , 先 ulimit -c unlimited;或者 gcore id 获取coredump)

得到core文件后,采用 gdb -c 执行文件 core文件 进入调试模式,对于java,有以下2个技巧:

进入gdb调试模式后,输入如下命令: info threads,观察异常的线程,定位到异常的线程后,则可以输入如下命令:thread 线程编号,则会打印出当前java代码的工作流程。

而对于这个core,亦可以用jstack jmap打印出堆信息,线程信息,具体命令:

jmap -heap 执行文件 core文件 jstack -F -l 执行文件 core文件

容器中的进程的话需要到宿主机操作,并且将容器中的 jdk文件夹复制到宿主机对应的位置。

ps auxff |grep 容器id -A10 找到JVM在宿主机上的进程id

coredump

kill -3 产生coredump 存放在 kernel.core_pattern=/root/core (/etc/sysctl.conf , 先 ulimit -c unlimited;)

或者 gcore id 获取coredump

coredump 所在位置

1
2
$cat /proc/sys/kernel/core_pattern
/home/admin/

coredump 分析

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
//打开 coredump
$gdb /opt/taobao/java/bin/java core.24086
[New LWP 27184]
[New LWP 27186]
[New LWP 24086]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/opt/tt/java_coroutine/bin/java'.
#0 0x00007f2fa4fada35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
Missing separate debuginfos, use: debuginfo-install jdk-8.9.14-20200203164153.alios7.x86_64
(gdb) info threads //查看所有thread
Id Target Id Frame
583 Thread 0x7f2fa56177c0 (LWP 24086) 0x00007f2fa4fab017 in pthread_join () from /lib64/libpthread.so.0
582 Thread 0x7f2f695f3700 (LWP 27186) 0x00007f2fa4fada35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
581 Thread 0x7f2f6cbfb700 (LWP 27184) 0x00007f2fa4fada35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
580 Thread 0x7f2f691ef700 (LWP 27176) 0x00007f2fa4fada35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
579 Thread 0x7f2f698f6700 (LWP 27174) 0x00007f2fa4fada35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
(gdb) thread apply all bt //查看所有线程堆栈
Thread 583 (Thread 0x7f2fa56177c0 (LWP 24086)):
#0 0x00007f2fa4fab017 in pthread_join () from /lib64/libpthread.so.0
#1 0x00007f2fa4b85085 in ContinueInNewThread0 (continuation=continuation@entry=0x7f2fa4b7fd70 <JavaMain>, stack_size=1048576, args=args@entry=0x7ffe529432d0)
at /ssd1/jenkins_home/workspace/ajdk.8.build.master/jdk/src/solaris/bin/java_md_solinux.c:1044
#2 0x00007f2fa4b81877 in ContinueInNewThread (ifn=ifn@entry=0x7ffe529433d0, threadStackSize=<optimized out>, argc=<optimized out>, argv=0x7f2fa3c163a8, mode=mode@entry=1,
what=what@entry=0x7ffe5294be17 "com.taobao.tddl.server.TddlLauncher", ret=0) at /ssd1/jenkins_home/workspace/ajdk.8.build.master/jdk/src/share/bin/java.c:2033
#3 0x00007f2fa4b8513b in JVMInit (ifn=ifn@entry=0x7ffe529433d0, threadStackSize=<optimized out>, argc=<optimized out>, argv=<optimized out>, mode=mode@entry=1,
what=what@entry=0x7ffe5294be17 "com.taobao.tddl.server.TddlLauncher", ret=ret@entry=0) at /ssd1/jenkins_home/workspace/ajdk.8.build.master/jdk/src/solaris/bin/java_md_solinux.c:1091
#4 0x00007f2fa4b8254d in JLI_Launch (argc=0, argv=0x7f2fa3c163a8, jargc=<optimized out>, jargv=<optimized out>, appclassc=1, appclassv=0x0, fullversion=0x400885 "1.8.0_232-b604",
dotversion=0x400881 "1.8", pname=0x40087c "java", lname=0x40087c "java", javaargs=0 '\000', cpwildcard=1 '\001', javaw=0 '\000', ergo=0)
at /ssd1/jenkins_home/workspace/ajdk.8.build.master/jdk/src/share/bin/java.c:304
#5 0x0000000000400635 in main ()
Thread 582 (Thread 0x7f2f695f3700 (LWP 27186)):
#0 0x00007f2fa4fada35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f2fa342d863 in Parker::park(bool, long) () from /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/server/libjvm.so
#2 0x00007f2fa35ba3c3 in Unsafe_Park () from /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/server/libjvm.so
#3 0x00007f2f9343b44a in ?? ()
#4 0x000000008082e778 in ?? ()
#5 0x0000000000000003 in ?? ()
#6 0x00007f2f88e32758 in ?? ()
#7 0x00007f2f6f532800 in ?? ()
(gdb) thread apply 582 bt //查看582这个线程堆栈,LWP 27186(0x6a32)对应jstack 线程10进程id
Thread 582 (Thread 0x7f2f695f3700 (LWP 27186)):
#0 0x00007f2fa4fada35 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1 0x00007f2fa342d863 in Parker::park(bool, long) () from /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/server/libjvm.so
#2 0x00007f2fa35ba3c3 in Unsafe_Park () from /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/server/libjvm.so
#3 0x00007f2f9343b44a in ?? ()
#35 0x00000000f26fc738 in ?? ()
#36 0x00007f2fa51cec5b in arena_run_split_remove (arena=0x7f2f6ab09c34, chunk=0x80, run_ind=0, flag_dirty=0, flag_decommitted=<optimized out>, need_pages=0) at src/arena.c:398
#37 0x00007f2f695f2980 in ?? ()
#38 0x0000000000000001 in ?? ()
#39 0x00007f2f88e32758 in ?? ()
#40 0x00007f2f695f2920 in ?? ()
#41 0x00007f2fa32f46b8 in CallInfo::set_common(KlassHandle, KlassHandle, methodHandle, methodHandle, CallInfo::CallKind, int, Thread*) ()
from /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/server/libjvm.so
#42 0x00007f2f7d800000 in ?? ()

coredump 转 jmap hprof

1
jmap -dump:format=b,file=24086.hprof /opt/taobao/java/bin/java core.24086

以上命令输入是 core.24086 这个 coredump,输出是一个 jmap 的dump 24086.hprof

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
$jmap -J-d64 /opt/taobao/java/bin/java core.24086
Attaching to core core.24086 from executable /opt/taobao/java/bin/java, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.232-b604
0x0000000000400000 8K /opt/taobao/java/bin/java
0x00007f2fa51be000 6679K /opt/taobao/install/ajdk-8_9_14-b604/bin/../lib/amd64/libjemalloc.so.2
0x00007f2fa4fa2000 138K /lib64/libpthread.so.0
0x00007f2fa4d8c000 88K /lib64/libz.so.1
0x00007f2fa4b7d000 280K /opt/taobao/install/ajdk-8_9_14-b604/bin/../lib/amd64/jli/libjli.so
0x00007f2fa4979000 18K /lib64/libdl.so.2
0x00007f2fa45ab000 2105K /lib64/libc.so.6
0x00007f2fa43a3000 42K /lib64/librt.so.1
0x00007f2fa40a1000 1110K /lib64/libm.so.6
0x00007f2fa5406000 159K /lib64/ld-linux-x86-64.so.2
0x00007f2fa2af1000 17898K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/server/libjvm.so
0x00007f2fa25f1000 64K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/libverify.so
0x00007f2fa23c2000 228K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/libjava.so
0x00007f2fa21af000 60K /lib64/libnss_files.so.2
0x00007f2fa1fa5000 47K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/libzip.so
0x00007f2f80ded000 96K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/libnio.so
0x00007f2f80bd4000 119K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/libnet.so
0x00007f2f7e1f6000 50K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/libmanagement.so
0x00007f2f75dc8000 209K /home/admin/drds-server/lib/native/libsigar-amd64-linux.so
0x00007f2f6d8ad000 293K /opt/taobao/install/ajdk-8_9_14-b604/jre/lib/amd64/libsunec.so
0x00007f2f6d697000 86K /lib64/libgcc_s.so.1
0x00007f2f6bdf9000 30K /lib64/libnss_dns.so.2
0x00007f2f6bbdf000 107K /lib64/libresolv.so.2

coredump 生成 java stack

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
jstack -J-d64 /opt/taobao/java/bin/java core.24086
Attaching to core core.24086 from executable /opt/taobao/java/bin/java, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.232-b604
Deadlock Detection:
No deadlocks found.
Thread 27186: (state = BLOCKED)
- sun.misc.Unsafe.park0(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
- sun.misc.Unsafe.park(boolean, long) @bci=63, line=1038 (Compiled frame)
- java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=176 (Compiled frame)
- java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2047 (Compiled frame)
- java.util.concurrent.LinkedBlockingQueue.take() @bci=29, line=446 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.getTask() @bci=149, line=1074 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1134 (Compiled frame)
- java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=624 (Compiled frame)
- java.lang.Thread.run() @bci=11, line=858 (Compiled frame)

gdb coredump with java symbol