Last active
September 7, 2021 20:55
ssh-ControlMaster-test.sh
#!/bin/bash
# Mark Ziesemer, 2016-02-11, 2016-12-14.
# As described at https://rhn.redhat.com/errata/RHSA-2015-2088.html ,
#   I'm afraid that the race condition with OpenSSH ControlMaster multiplexing is still not resolved
#   in recent CentOS / Fedora releases.
# This refers to BZ#1240613 (which is apparently restricted), and is also described at
#   https://access.redhat.com/solutions/1521923 (which is non-public, restricted to subscription access).
# Reported to https://bugzilla.redhat.com/show_bug.cgi?id=1308295 on 2016-02-13.
# Use this script to stress-test OpenSSH ControlMaster multiplexing.
# Open 2 shell sessions.
# In the first, run "ssh-ControlMaster-test.sh setup" (needed one time only), followed by "ssh-ControlMaster-test.sh master".
# In the second, run "ssh-ControlMaster-test.sh threads".
# If no error is observed, rerun the "threads" command in the 2nd window until the "master" in the 1st window
#   terminates with an error, or until you are confident that the issue no longer exists.
# This doesn't require any networking outside of a local VM, and has been observed under both VMware ESX and Oracle VirtualBox
#   - but is also being consistently observed in actual network environments.
# As of 2016-02-13, my current testing shows:
# - CentOS 7.2.1511, openssh-6.6.1p1-23.el7_2.x86_64 - broken.
# - CentOS 7.2.1511, downgraded to openssh-6.6.1p1-22.el7.x86_64 - still broken
#   (even though this is supposedly the fix release per RHSA-2015-2088).
# - Fedora 23, openssh-7.1p2-3.fc23.x86_64 - broken.
# - Ubuntu 15.10, OpenSSH_6.9p1 Ubuntu-2ubuntu0.1 - works without issue.
# - CentOS 6.6, openssh-5.3p1-104.el6_6.1.i686 - works without issue.
# - Fedora 20, openssh-6.4p1-8.fc20.x86_64 - works without issue.
# Additional research and notes:
# - https://ahwhattheheck.wordpress.com/2015/07/02/debugging-sporadically-encountered-ssh-encountered-an-unknown-error-in-ansible-runs/
#   - Bypassed the issue by effectively ensuring that no ControlMaster would be concurrently accessed by multiple client sessions,
#     at the expense of increasing the number of ControlMasters used.
# - http://www.zenoss.org/forum/10136
#   - Posts indicated that the new "UsePrivilegeSeparation sandbox" could be a problem here - but I am able to consistently
#     reproduce with or without this enabled.
set -euo pipefail
trap '_exit' SIGINT
# Kept as a plain string and expanded unquoted below, so that it splits into "-o" and its argument.
_controlPath="-o ControlPath=~/.ssh/sockets/%r@%h-%p"
_host='localhost'
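_thread(){
  local i
  for i in {1..100}; do
    # Note: ssh's -C only enables compression; the quoted string that follows is the remote command.
    ssh ${_controlPath} "${_host}" \
      -C "echo here: thread=$1 iter=$i \$(date -Is); sleep 0.1" \
      || {
        echo "ssh client (thread=$1 iter=$i) failed with result: $?"
        _exit
        # Stop this thread explicitly rather than relying on errexit.
        exit 1
      }
  done
}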
_exit(){
  echo 'Forcing exit...'
  kill $(jobs -p) 2>/dev/null
}
_setup(){
  set -vx
  ssh-keygen -f ~/.ssh/id_rsa -N ''
  cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
  chmod 600 ~/.ssh/authorized_keys
  ssh-keyscan "${_host}" >> ~/.ssh/known_hosts
  mkdir -p -- ~/.ssh/sockets
  chmod 700 -- ~/.ssh ~/.ssh/sockets
}
_runMaster(){
  date -Is
  # Default to 0 so that the message below is still meaningful if ssh exits cleanly.
  local sshResult=0
  ssh -vvvv -o 'ControlMaster=yes' ${_controlPath} \
    -N "${_host}" || sshResult=$?
  echo "ssh ControlMaster exited with result: $sshResult"
  date -Is
}
_runThreads(){
  local i
  for i in {1..10}; do
    _thread "$i" &
  done
  wait
}
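case "${1:-}" in
  'setup')
    _setup
    ;;
  'master')
    _runMaster
    ;;
  'threads')
    _runThreads
    ;;
  *)
    # With "set -u", a bare "$1" aborts with an unbound-variable error when no argument is given.
    echo "Usage: $0 {setup|master|threads}" >&2
    exit 1
    ;;
esac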
Further testing required - but this now appears fixed as of openssh-6.6.1p1-31.el7.x86_64
under CentOS 7.3.1611! I even cranked the script up to 1,000 iterations x 50 threads, and was unable to cause a ControlMaster failure. 😄
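
For anyone wanting to scale the test up the same way, the counts could be read from the environment instead of editing the loops. This is only a minimal sketch: the `ITERATIONS` and `THREADS` variables are hypothetical (the script above hard-codes 100 and 10), and `_work` is a stand-in for the real ssh client call.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical knobs; the defaults match the script's 100 iterations x 10 threads.
_iterations="${ITERATIONS:-100}"
_threads="${THREADS:-10}"

# Stand-in for the ssh call in _thread(); replace with the real client command.
_work(){
  echo "thread=$1 iter=$2"
}

_thread(){
  local i
  # Brace ranges such as {1..100} do not expand variables, so use a C-style loop.
  for ((i = 1; i <= _iterations; i++)); do
    _work "$1" "$i"
  done
}

_runThreads(){
  local t
  for ((t = 1; t <= _threads; t++)); do
    _thread "$t" &
  done
  # Block until every background thread has finished.
  wait
}

_runThreads
```

With something like this in place, a larger run would be `ITERATIONS=1000 THREADS=50` in front of the usual "threads" invocation.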
Looks like the fix was actually in -26 (which was never released for 7.2):
* Fri Apr 01 2016 Jakub Jelen <[email protected]> 6.6.1p1-26 + 0.9.3-9
...
- Fix race condition between audit messages from different processes (#1310684)
...
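
To check whether a given package build is at or past that release, the release number can be pulled out of the package string and compared against 26. A rough sketch, assuming the `openssh-<version>-<release>.<dist>.<arch>` naming seen in this thread; `_hasFix` is a hypothetical helper, and the comparison only makes sense for the 6.6.1p1 builds on EL7:

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: succeeds if an EL7 openssh-6.6.1p1 package string carries
# a release number of 26 or higher (the release where the changelog lists the fix).
# Returns 2 for any package string outside that series.
_hasFix(){
  local pkg="$1" release
  [[ "$pkg" =~ ^openssh-6\.6\.1p1-([0-9]+)\.el7 ]] || return 2
  release="${BASH_REMATCH[1]}"
  # Force base 10 so that a release with a leading zero is not read as octal.
  (( 10#$release >= 26 ))
}

for pkg in \
  openssh-6.6.1p1-25.el7_2.x86_64 \
  openssh-6.6.1p1-31.el7.x86_64; do
  if _hasFix "$pkg"; then
    echo "$pkg: at or past -26"
  else
    echo "$pkg: before -26 (or not a 6.6.1p1 EL7 build)"
  fi
done
```

On a live system the argument could come from `rpm -q openssh`, which prints the installed package in this same form.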
Still broken as of openssh-6.6.1p1-25.el7_2.x86_64 under CentOS 7.2.1511.