• Glen

Examining deadlock with gdb

Updated: Feb 2


Yesterday I needed to debug a deadlock in an application I was developing. This takes a few steps in gdb that I would never remember off the top of my head, so here are the steps I take:


1) First attach to the running process with gdb:


gdb -p {pid} {binary}


for example: gdb -p 123312 /home/xya/myapp


2) Next, examine all threads to see which are deadlocked:


(gdb) thread apply all bt


look for threads that show the following:

===

Thread 3 (Thread 0x7f0390e03700 (LWP 47556)):

#0 0x00007f0392bf88ed in __lll_lock_wait () from /lib64/libpthread.so.0

#1 0x00007f0392bf1b09 in pthread_mutex_lock () from /lib64/libpthread.so.0

===


Take a note of all affected threads
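With many threads, picking out the blocked ones by eye gets tedious. The following is a rough sketch, assuming the "thread apply all bt" output has been saved as text (one way is to run gdb with -batch -ex "thread apply all bt" and redirect the output). The function name blocked_threads and the SAMPLE text are made up for illustration, and the Thread 2 frame in SAMPLE is invented, not taken from the real session.

```python
import re

# Matches gdb thread headers such as:
#   Thread 3 (Thread 0x7f0390e03700 (LWP 47556)):
THREAD_HEADER = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\)")


def blocked_threads(backtrace_text):
    """Return [(gdb_thread_number, lwp), ...] for every thread whose
    backtrace contains a pthread_mutex_lock frame."""
    blocked = []
    current = None
    for line in backtrace_text.splitlines():
        line = line.strip()
        header = THREAD_HEADER.match(line)
        if header:
            # A new thread section starts here.
            current = (int(header.group(1)), int(header.group(2)))
        elif current and "pthread_mutex_lock" in line and current not in blocked:
            blocked.append(current)
    return blocked


# Thread 3 is copied from the output above; the Thread 2 frame is
# hypothetical and only there to show a thread that is not blocked on a mutex.
SAMPLE = """\
Thread 3 (Thread 0x7f0390e03700 (LWP 47556)):
#0 0x00007f0392bf88ed in __lll_lock_wait () from /lib64/libpthread.so.0
#1 0x00007f0392bf1b09 in pthread_mutex_lock () from /lib64/libpthread.so.0

Thread 2 (Thread 0x7f0391604700 (LWP 47554)):
#0 0x0000000000000000 in hypothetical_wait_for_work ()
"""

if __name__ == "__main__":
    for number, lwp in blocked_threads(SAMPLE):
        print(f"thread {number} (LWP {lwp}) is waiting on a mutex")
```

This only flags candidates; each one still needs the per-thread steps below to find out who holds the lock.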


3) for each affected thread do the following:


a) select the thread, for example to select thread 3 do:

(gdb) thread 3

b) if need be, show the thread's backtrace:

(gdb) bt

look for:

#1 0x00007f0392bf1b09 in pthread_mutex_lock () from /lib64/libpthread.so.0


c) select the pthread_mutex_lock frame, in the above example the frame is #1


(gdb) frame 1


output should show just that frame, i.e.

#1 0x00007f0392bf1b09 in pthread_mutex_lock () from /lib64/libpthread.so.0


d) look at the frame's registers


(gdb) info reg


output should be similar to:


rax 0xfffffffffffffe00 -512

rbx 0x40000 262144

rcx 0x7f0392bf88ed 139653323655405

rdx 0x0 0

rsi 0x80 128

rdi 0x138eae8 20507368

rbp 0x7f0390dfe630 0x7f0390dfe630

rsp 0x7f0390dfe580 0x7f0390dfe580

r8 0x138eae8 20507368

r9 0x7 7

r10 0x4000 16384

r11 0x246 582

r12 0x20000 131072

r13 0x7f0390e02670 139653292238448

r14 0x2009 8201

r15 0x0 0

rip 0x7f0392bf1b09 0x7f0392bf1b09 <pthread_mutex_lock+89>

eflags 0x246 [ PF ZF IF ]

cs 0x33 51

ss 0x2b 43

ds 0x0 0

es 0x0 0

fs 0x0 0

gs 0x0 0


Take the hex number next to r8 (in the above example it is 0x138eae8; rdi holds the same value here). This is the address of the mutex, and you need it for the next step.


e) find the thread that currently holds the lock this thread is waiting on:


(gdb) p *(pthread_mutex_t*)0x138eae8


the output will be similar to:

$1 = {__data = {__lock = 2, __count = 0, __owner = 47554, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}},

__size = "\002\000\000\000\000\000\000\000¹\000\000\001", '\000' <repeats 26 times>, __align = 2}


The __owner field shows that the thread with LWP 47554 holds the lock this thread is waiting on. Repeat these steps for all affected threads.
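When repeating this for several threads, the owner can be pulled out of the printed struct mechanically. A minimal sketch, assuming the output of the print command has been copied into a string; mutex_owner is a hypothetical helper name, and the sample is the first line of the output shown above.

```python
import re


def mutex_owner(print_output):
    """Return the __owner value (an LWP id) from gdb's printout of a
    pthread_mutex_t, or None if no __owner field is found."""
    match = re.search(r"__owner = (\d+)", print_output)
    return int(match.group(1)) if match else None


# First line of the gdb output from the example above.
SAMPLE = ("$1 = {__data = {__lock = 2, __count = 0, __owner = 47554, "
          "__nusers = 1, __kind = 0, __spins = 0, __elision = 0, "
          "__list = {__prev = 0x0, __next = 0x0}},")

if __name__ == "__main__":
    print(mutex_owner(SAMPLE))
```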


===


In this case, "thread apply all bt" shows that LWP 47554 is thread 2:


(from the output of "thread apply all bt") Thread 2 (Thread 0x7f0391604700 (LWP 47554)):
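The same lookup from LWP to gdb thread number can be done over saved backtrace text. This is only a sketch under that assumption; thread_number_for_lwp is a hypothetical name, and HEADERS contains just the two thread header lines quoted in this post.

```python
import re

# Captures the gdb thread number and the LWP id from a thread header line.
THREAD_HEADER = re.compile(r"Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\)")


def thread_number_for_lwp(backtrace_text, lwp):
    """Return the gdb thread number whose header carries the given LWP id,
    or None if no header matches."""
    for match in THREAD_HEADER.finditer(backtrace_text):
        if int(match.group(2)) == lwp:
            return int(match.group(1))
    return None


# The two header lines that appear in this post.
HEADERS = """\
Thread 3 (Thread 0x7f0390e03700 (LWP 47556)):
Thread 2 (Thread 0x7f0391604700 (LWP 47554)):
"""

if __name__ == "__main__":
    print(thread_number_for_lwp(HEADERS, 47554))
```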


When I looked at that thread with

(gdb) thread 2

(gdb) bt


I found that the thread was waiting on new work, meaning that it had completed its previous task but not released its lock. These steps saved me what could otherwise have been hours of debugging.
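Once each blocked thread has been matched to the thread that owns its mutex, the results can be followed as a chain. The sketch below is an illustration rather than something gdb provides: follow_wait_chain and the waits_on mapping (blocked thread number to owning thread number) are hypothetical, filled in by hand from the steps above. It distinguishes a chain that ends at a thread which is not waiting on a mutex (as in my case) from a chain that loops back on itself.

```python
def follow_wait_chain(waits_on, start):
    """Follow 'thread -> owner of the mutex it wants' links from start.

    Returns (chain, is_cycle). If is_cycle is False, the last thread in
    chain holds a lock but is not itself blocked on a mutex. If is_cycle
    is True, chain lists the threads that are waiting on each other.
    """
    seen = [start]
    current = start
    while current in waits_on:
        current = waits_on[current]
        if current in seen:
            # We came back to a thread already on the chain: circular wait.
            return seen[seen.index(current):], True
        seen.append(current)
    return seen, False


if __name__ == "__main__":
    # The case from this post: thread 3 waits on a mutex owned by thread 2,
    # and thread 2 is not waiting on any mutex.
    print(follow_wait_chain({3: 2}, 3))
    # A made-up circular wait between two threads.
    print(follow_wait_chain({1: 2, 2: 1}, 1))
```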


I hope these steps help others; moreover, they are now documented here for the next time I need to examine locks with gdb.








​© 2020 by Glen Olsen
