Bug 3259 - SSHD: Log deadlock occurs during sshd running.
Summary: SSHD: Log deadlock occurs during sshd running.
Status: CLOSED FIXED
Alias: None
Product: Portable OpenSSH
Classification: Unclassified
Component: sshd
Version: 8.4p1
Hardware: ARM64 Linux
Importance: P5 critical
Assignee: Assigned to nobody
URL:
Keywords:
Depends on:
Blocks: V_8_5
Reported: 2021-02-03 01:10 AEDT by kircher
Modified: 2021-03-04 09:51 AEDT
Description kircher 2021-02-03 01:10:59 AEDT
On the ARM64 platform with glibc 2.28, sshd can deadlock while writing log messages.

The call stack is as follows:

(gdb) bt
#0  0x0000ffff9b671434 in __lll_lock_wait_private (
    futex=0xffff9b702a1c <syslog_lock>) at ./lowlevellock.c:33
#1  0x0000ffff9b660494 in openlog (ident=0xaaaaef493260 "sshd", logstat=1,
    logfac=32) at ../misc/syslog.c:390
#2  0x0000aaaab23dab5c in do_log (args=..., fmt=0xfffff8eaa578 "", suffix=0x0,
    force=<optimized out>, level=SYSLOG_LEVEL_DEBUG1, line=343,
    func=0xaaaab240ad80 <__func__.21282> "main_sigchld_handler", file=0x0)
    at log.c:415
#3  sshlogv (file=0x0, func=0xaaaab240ad80 <__func__.21282> "main_sigchld_handler",
    line=343, showfunc=<optimized out>, level=SYSLOG_LEVEL_DEBUG1, suffix=0x0,
    fmt=<optimized out>, args=...) at log.c:485
#4  0x0000aaaab23dac5c in sshlog (file=<optimized out>, func=<optimized out>,
    line=<optimized out>, showfunc=<optimized out>, level=<optimized out>,
    suffix=<optimized out>, fmt=<optimized out>) at log.c:430
#5  0x0000aaaab2390e9c in main_sigchld_handler (sig=<optimized out>) at sshd.c:343
#6  <signal handler called>
#7  __libc_send (fd=6, buf=0xaaaaef4ad650, len=65, flags=flags@entry=16384)
    at ../sysdeps/unix/sysv/linux/send.c:24
#8  0x0000ffff9b660114 in __GI___vsyslog_chk (pri=<optimized out>, pri@entry=65535,
    flag=flag@entry=1,
    fmt=0x7900000001 <error: Cannot access memory at address 0x7900000001>,
    fmt@entry=0xaaaab241bfb0 "%.500s", ap=...) at ../misc/syslog.c:284
#9  0x0000ffff9b6603cc in __syslog_chk (pri=pri@entry=65535, flag=flag@entry=1,
    fmt=fmt@entry=0xaaaab241bfb0 "%.500s") at ../misc/syslog.c:135
#10 0x0000aaaab23dab74 in syslog (__fmt=0xaaaab241bfb0 "%.500s", __pri=65535)
    at /usr/include/bits/syslog.h:31
#11 do_log (args=..., fmt=0xfffff8eac9a8 " %_\233\377\377",
    suffix=0xfffff8eacf10 "\270\320\352\370\377\377", force=<optimized out>,
    level=SYSLOG_LEVEL_DEBUG3, line=124,
    func=0xaaaab241d938 <__func__.15454> "unset_nonblock",
    file=0xfffff8eacf10 "\270\320\352\370\377\377") at log.c:416
#12 sshlogv (file=0xfffff8eacf10 "\270\320\352\370\377\377",
    file@entry=0xaaaab241dab0 "misc.c",
    func=func@entry=0xaaaab241d938 <__func__.15454> "unset_nonblock",
    line=line@entry=124, showfunc=showfunc@entry=0,
    level=level@entry=SYSLOG_LEVEL_DEBUG3,
    suffix=0xfffff8eacf10 "\270\320\352\370\377\377", suffix@entry=0x0,
    fmt=fmt@entry=0xaaaab241dc70 "fd %d is not O_NONBLOCK", args=...) at log.c:485
#13 0x0000aaaab23dac5c in sshlog (file=file@entry=0xaaaab241dab0 "misc.c",
    func=func@entry=0xaaaab241d938 <__func__.15454> "unset_nonblock",
    line=line@entry=124, showfunc=showfunc@entry=0,
    level=level@entry=SYSLOG_LEVEL_DEBUG3, suffix=suffix@entry=0x0,
    fmt=fmt@entry=0xaaaab241dc70 "fd %d is not O_NONBLOCK") at log.c:430
#14 0x0000aaaab23e75cc in unset_nonblock (fd=5) at misc.c:124
#15 0x0000aaaab238f044 in server_accept_loop (config_s=0xfffff8ead0d8,
    newsock=<synthetic pointer>, sock_out=<synthetic pointer>,
    sock_in=<synthetic pointer>) at sshd.c:1247
#16 main (ac=<optimized out>, av=<optimized out>) at sshd.c:2052

We noticed the following commit in the OpenSSH repository:
https://anongit.mindrot.org/openssh.git/commit?id=3bf2a6ac791d64046a537335a0f1d5e43579c5ad

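It adds a debug logging call to the main_sigchld_handler signal handler, and that call is not reentrant on ARM64 platforms. In the backtrace above, SIGCHLD interrupts the main loop while it is inside syslog() (frames #7 to #11), and the handler then waits in openlog() for the same syslog_lock (frames #0 to #5), so neither call can make progress.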

Non-reentrant functions should not be called from signal handlers, because they can cause deadlocks.
Comment 1 Darren Tucker 2021-02-05 13:48:41 AEDT
Removed.  Thanks for the report, it'll be in the 8.5 release.
Comment 2 Damien Miller 2021-03-04 09:51:39 AEDT
close bugs that were resolved in OpenSSH 8.5 release cycle