SSH, the boss-fight level: Jupyter notebooks from compute nodes

Secure shell (SSH) is an essential tool for remote operations. However, not everything with it is smooth sailing, especially when you want to do things like reverse port-forwarding a Jupyter notebook to your local machine from a compute node in a no-home container, via a proxy-jump or two. Even if it sounds less plausible than the exploits on Mr Robot, it actually can work, and it requires zero social engineering or sneaking into server rooms to install Raspberry Pis while using a baseball cap as a disguise.

Disclaimer: Advanced only

This post addresses particular advanced cases and is not intended to be a beginner's tutorial. For that, please search:

  • Windows machine? Search: activate Windows Subsystem for Linux (WSL). Don’t bother with PuTTY, as older posts will recommend.
  • SSH not working? Add the -v argument to ssh and copy-paste the last message into a search engine.
  • Connect without a password? Search “SSH key” and also check your OS’s SSH agent.
  • SSH keys not working? Check you can get in by typing your password, then search: chmod .ssh folder, as your permissions might be wrong.
  • SSH seems to hang? See below.
  • Unix root folders? The meanings of some Unix root folders seem trivial, but a lot of confusion stems from them, so see the Root Folders footnote if at all unsure.
  • What’s a config file? It’s $HOME/.ssh/config. Search: SSH config file.
  • ssh runs on port 22. Different ports are used either for routing or for obfuscation.
  • Environment variables below not working? Remember double quotes, not single, to interpolate environment variables; and env and export have different scopes. source (don’t bash) scripts that set variables to make them available in your shell (e.g. source /etc/os-release; echo $PRETTY_NAME; —see the sketch after this list).
  • Most programs are okay with path//path, but not ssh. So don’t have environment variables ending in a slash!
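
To make the quoting and scoping tips concrete, here is a minimal bash sketch (the variable name GREETING is illustrative):

#!/bin/bash
# Double quotes interpolate variables, single quotes do not:
export GREETING="hello"
echo "$GREETING world"   # prints: hello world
echo '$GREETING world'   # prints: $GREETING world
# A script run with `bash script.sh` gets its own shell, so its exports
# vanish when it exits; `source script.sh` runs in the current shell:
source /etc/os-release
echo "$PRETTY_NAME"      # e.g. Ubuntu 22.04.3 LTS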

Privacy

A basic concept worth reiterating is privacy. Always think about security:

  • If you have SSH keys on a remote machine, who can see them?
  • If only the root users can, is there anything they cannot already get to?
  • If the keys are on a mounted drive, can it be mounted elsewhere, and what type of drive is it?
  • If the sys-admin’s family were held hostage by <insert villain of current news cycle>, what of yours could they coerce the sys-admin to give up?
  • What steps can you take to encrypt or obfuscate that data? (See the sketch after this list.)
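
One concrete step is a passphrase-protected key: even if the private key file leaks, it is useless without the passphrase. A minimal sketch, reusing the fictional emerald key name from later in this post:

# generates ~/.ssh/emerald and ~/.ssh/emerald.pub;
# enter a passphrase when prompted (your OS SSH agent can cache it)
ssh-keygen -t ed25519 -f ~/.ssh/emerald -C "abc12345@emerald"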

Fail2ban and local ssh settings

SSH will appear to hang if you are port forwarding or doing an scp/rsync: this is normal, it is doing its job. But there is a second case: you have been jailed.

A common safeguard against attackers is Fail2ban. This Python daemon monitors your logs and jails any IP that violates a given rule. When logged into a remote machine as a non-root user you are unable to see the settings, but maxretry is generally 3 with a timeout of 600 seconds. If you have triggered a jailing for your IP, ssh -v will hang on “connecting”.

Parenthetically, Fail2ban is useful beyond SSH: on my webservers, e.g. Michelanglo, I not only have an SSH jail, but also Apache-log jails for Nuclei project user-agents —yup, it happens— and for excessive 404 brute-force crawlers. A wee caveat is that, per a misreading of EU rules, one supposedly should not keep IPs in Apache logs, but that is utter lunacy on a par with leaving your car key on the dashboard, so don’t play it legally safe, play it safe against attackers.
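
For the curious, such jails are declared in /etc/fail2ban/jail.local. A minimal sketch —not my production settings, and the apache-404 jail is hypothetical, needing a matching filter in filter.d:

[sshd]
enabled  = true
maxretry = 3
bantime  = 600

[apache-404]
enabled  = true
port     = http,https
logpath  = /var/log/apache2/access.log
maxretry = 50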

Proxy Jumping

Now the fun stuff.

To ssh twice, you proxy-jump. The inline command to reach an inner host via a gateway host is ssh -J $GATEWAY_ADDRESS $INNERNODE_ADDRESS. In your config file, it’s the ProxyJump option, as shown in the following example, where the emerald-cpu host is reachable solely via the emerald gateway host. (The Emerald City is at the centre of the Land of Oz; it is a fictional place, so don’t bother hacking it —unless you are related to Judy Garland.) For the sake of sanity the option CanonicalizeHostname is used on all hosts (first block), which makes the config order-invariant; otherwise the gateway host must come before the internal host.

Host *
     ControlMaster auto
     ControlPersist 300
     ControlPath ~/.ssh/control-%h-%p-%r
     GSSAPIAuthentication yes
     ServerAliveInterval 30
     # Mac stuff:
     AddKeysToAgent yes
     UseKeychain yes
     IgnoreUnknown AddKeysToAgent,UseKeychain
     # Repeat settings:
     CanonicalizeHostname yes

Host emerald
     Hostname ssh.emerald.ac.uk
     User abc12345
     IdentityFile ~/.ssh/emerald

Host emerald-cluster
     Hostname submitter.emerald.ac.uk

Host *.emerald.ac.uk !ssh.emerald.ac.uk
     ProxyJump emerald
     User abc12345
     IdentityFile ~/.ssh/emerald_intra
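
With this config in place the jump is implicit; for example, reaching the fictional internal node is just:

ssh emerald-cpu.emerald.ac.uk
# matches the *.emerald.ac.uk block, so SSH hops via emerald using the emerald_intra key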

A variant of this is to use the ProxyCommand option: a command that gets executed to reach the host, unlike RemoteCommand, which gets run once on the host.

ssh \
-o ProxyCommand="ssh -W %h:%p $GATEWAY_ADDRESS" \
$INNERNODE_ADDRESS

Specifically, the -W option in ssh enables a built-in way to forward a TCP connection over the established SSH connection. SSH does not use bash syntax and has a few tokens that get expanded when the command gets executed: %h is the target host (%L is the local one) and %p is the port. In the above example the environment variables will be expanded by the shell when the command is run, while the tokens will be expanded by SSH.

Whereas ProxyJump makes a clean entry in an SSH config file, the ProxyCommand allows a single argument-laden command, which is needed in cases where there is no home folder or using the home folder is not viable.
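
In a config file, the equivalent of the inline command above would be a sketch like this for the fictional hosts:

Host emerald-cpu
     Hostname emerald-cpu.emerald.ac.uk
     User abc12345
     ProxyCommand ssh -W %h:%p emerald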

Port Forwarding

Port forwarding, or tunnelling, lets you route a port via SSH: say, make port 6789 on your local machine mirror port 6789 on your remote host. This can go in two directions: local port forwarding or remote/reverse port forwarding.

# Local to remote
ssh -N -L localhost:$LOCAL_PORT:localhost:$REMOTE_PORT $HOST_ADDRESS
# Remote to local (note: the first port is the one bound on the remote side)
ssh -N -R localhost:$REMOTE_PORT:localhost:$LOCAL_PORT $HOST_ADDRESS
# With fluff (options must come before the host address)
ssh -N -L localhost:$LOCAL_PORT:localhost:$REMOTE_PORT -p $SSH_PORT -o ServerAliveInterval=180 -o ExitOnForwardFailure=yes $HOST_ADDRESS
  • The -N argument tells SSH that you don’t want a shell prompt or remote command. This has a different effect than the -t/-T arguments, which control terminal allocation when a command is run.
  • The -L argument is for local forwarding (i.e. to the local host), -R for remote (i.e. to the remote host).
  • The local port (in the variable $LOCAL_PORT above) that gets bound cannot be a privileged port (i.e. below 1024) without sudo —no port 666, sorry.
  • The localhost (or 127.0.0.1) part can be changed to 0.0.0.0 if one requires the port to be visible on the local network without going through Apache. This applies to HTTP servers such as the .run() method of a Flask app (don’t), waitress, runserver in Django, or the Tornado server used by Jupyter.
  • The L/R argument can be repeated for different ports.
  • A variant on the above is that you can forward a port that is actually on a different host on the network.
  • SSH by default is via port 22, but a sys-admin may configure a different port, which is provided via the -p argument as above (ignore this if it is 22). It is therefore unrelated to the forwarded port.
  • ServerAliveInterval=180 is a must, as it makes the local machine nudge the remote periodically so the connection does not fail due to inactivity. The -o argument specifies an option as you’d use it in the ~/.ssh/config file (or /etc/ssh/ssh_config, or one specified via Include in either). This is different from ConnectTimeout, which applies to the connection-establishment step.
  • ExitOnForwardFailure=yes quits instead of hanging pointlessly. This means one can use a loop to make sure the tunnel is always up:
#!/bin/bash

: << USAGE
This snippet requires
* $SSH_ADDRESS the address
* $SSH_FORWARD_PORT the port you want to forward
* $SSH_PORT the port sshd listens on (22 unless configured otherwise)
* $SSH_USER your username there
* $SSH_FOLDER/$SSH_KEY the key to use
USAGE

while true;
do
ssh -N -R 0.0.0.0:$SSH_FORWARD_PORT:localhost:$SSH_FORWARD_PORT \
    -l $SSH_USER \
    -p $SSH_PORT \
    -o ServerAliveInterval=180 \
    -o ExitOnForwardFailure=yes \
    -i $SSH_FOLDER/$SSH_KEY \
    -o UserKnownHostsFile=$SSH_FOLDER/known_hosts \
$SSH_ADDRESS;

echo "Connection to $SSH_ADDRESS lost" 1>&2;
sleep 600;
done;
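
Saved as, say, forward.sh (a hypothetical filename), the snippet would be launched along these lines, reusing the fictional Emerald details:

export SSH_ADDRESS=ssh.emerald.ac.uk
export SSH_PORT=22
export SSH_FORWARD_PORT=6789
export SSH_USER=abc12345
export SSH_FOLDER=$HOME/.ssh
export SSH_KEY=emerald
bash forward.sh &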

Combining this with the proxy jumping we start to get this behemoth:

#!/bin/bash

: << USAGE
This snippet requires
* $SSH_ADDRESS is now $SSH_GATEWAY_HOST and $SSH_INNER_HOST
* $SSH_FORWARD_PORT the port you want to forward
* $SSH_USER your username there
* $SSH_FOLDER/$SSH_KEY the key to use
USAGE

while true;
do
ssh -N -R 0.0.0.0:$SSH_FORWARD_PORT:0.0.0.0:$SSH_FORWARD_PORT \
-o ProxyCommand="ssh -v -W %h:%p \
                     -l $SSH_USER \
                     -i $SSH_FOLDER/$SSH_KEY \
                     $SSH_GATEWAY_HOST" \
-i $SSH_FOLDER/$SSH_KEY \
-o ServerAliveInterval=180 \
-o UserKnownHostsFile=$SSH_FOLDER/known_hosts \
-l $SSH_USER \
-o ExitOnForwardFailure=yes \
$SSH_INNER_HOST;

echo "Connection to $SSH_INNER_HOST lost" 1>&2;
sleep 600;
done;

Home network

If one wants to connect to a machine behind a router, for example on a home network, the router needs to be configured to allow it. Most users are more likely to do this for their home network than to do sys-admin work at their workplace, so I will focus on that case in this section.

First, have a machine permanently connected to act as a homeserver, such as a Raspberry Pi or a desktop —don’t use a laptop, as that may be a fire risk.
Install Fail2ban as mentioned and enable the ssh jail.
Log into your modem/router —192.168.1.1 or somesuch, as likely directed by a sticker on its underside. In the DHCP settings, reserve the IP of the homeserver to its MAC address. In the security or advanced settings there will be a port forwarding option, where you can map the local port 22 of your homeserver to whichever external port you want (security by obfuscation, as mentioned). Some fancier/newer modems have port triggering/knocking options, which are worth exploring, especially if you are exposing sensitive content.

Do note that home network IPs are not fixed and can change periodically. With fibre, the IP generally changes only when the switchboard resets (the dreaded email stating “we are performing essential maintenance in your area between 8 am and 8 pm, we apologise for the inconvenience”).
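
To see what your network’s current external IP is, ask an echo service from inside the network —ifconfig.me is one such service:

curl ifconfig.me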

Known host fingerprints

When one first connects to a remote host, one gets asked whether it is trusted; a hash of its key is then kept for future connections, and if the host changes, the hash will not match and an error will occur —a safeguard in case something nefarious has happened. Problems arise if a script is the one being prompted whether the host is trustworthy: it hangs, as there is no TTY.

This behaviour is controlled by the StrictHostKeyChecking option. Setting this to no will therefore skip the check. This is considered unsafe, so in OpenSSH 7.6 and above the value accept-new can be used to accept only new keys. The catch is that CentOS 7, the ancient OS that refuses to die because it underpins so much infrastructure, ships an OpenSSH older than that.

In which case ssh-keygen, curiously, comes to the rescue, as among other secret things it manages stored fingerprints: its -R flag removes a host’s stale entry from a known_hosts file.

ssh-keygen -R gate.stats.ox.ac.uk -f "$SSH_FOLDER/known_hosts"
ssh 👾👾👾 -o "UserKnownHostsFile=$SSH_FOLDER/known_hosts"
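
A related tool is ssh-keyscan, which fetches a host’s public keys so a script can append them to its known_hosts file before connecting —functionally similar to accept-new, so hedge accordingly on untrusted networks:

ssh-keyscan -H gate.stats.ox.ac.uk >> "$SSH_FOLDER/known_hosts"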

I should mention that if you are going from one host to another within an intranet, StrictHostKeyChecking=no is most likely fine, unless it is a very lawless intranet (e.g. college dorms).

No home and not interactive

In a container (e.g. Singularity), you might be a no-home user, i.e. you ($USER) lack a folder in /home but have write permission somewhere else (i.e. $HOME is some gibberish /dev path). This has three consequences for SSH: there is no config file to make life easier, all fingerprints are new, and, if no identity key file is specified (the -i argument), it finds no identity key files in $HOME/.ssh.

Exporting HOME as a different path will do nothing (bar changing ~ in that bash session); to actually change a user’s home path, a root user has to either run usermod -d /not-home/👾👾👾 👾👾👾 or manually edit the /etc/passwd file.

A few hacks can be done, namely specifying keys via -i (while remembering about permissions) and accepting fingerprints into a file given via UserKnownHostsFile; an alternative config file can likewise be passed with the -F flag.
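
Put together, the homeless invocation looks roughly like this (variables as in the script below):

ssh -i $SSH_FOLDER/$SSH_KEY \
    -o UserKnownHostsFile=$SSH_FOLDER/known_hosts \
    -o StrictHostKeyChecking=no \
    -l $SSH_USER $SSH_GATEWAY_HOST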

Using Singularity is an option, but that is overkill for that alone.

#!/bin/bash

# ========================
: << USAGE
As above...
USAGE
# ========================

# your standard boiler plate....
# $SSH_USER
if [ -z "$SSH_USER" ]; then
     # raise error is not bash, but will raise an error...
    raise error "Your remote username SSH_USER ($SSH_USER) is not specified"
fi

if [ -z "$SSH_GATEWAY_HOST" ]; then
     raise error "No SSH_GATEWAY_HOST"
fi

if [ -z "$SSH_INNER_HOST" ]; then
     raise error "No SSH_INNER_HOST"
fi

if [ -n "$SSH_FORWARD_PORT" ]; then
    echo '$SSH_FORWARD_PORT provided directly.'
elif [ -n "$JOB_PORT" ]; then
    export SSH_FORWARD_PORT=$JOB_PORT
elif [ -n "$JUPYTER_PORT" ]; then
    export $SSH_FORWARD_PORT=$JUPYTER_PORT
elif [ -n "$APPTAINERENV_JUPYTER_PORT" ]; then
    export $SSH_FORWARD_PORT=$APPTAINERENV_SSH_FORWARD_PORT
else
    raise error 'No $SSH_FORWARD_PORT provided'
fi

# ========================

export DATA=/👾👾👾;
export SSH_KEY=${SSH_KEY:-*}
export SSH_FOLDER=${SSH_FOLDER:-$HOME/.ssh}
#export SSH_PORT=${SSH_PORT:-22}

# most applications are okay with path//path but not ssh
export SSH_FOLDER=$(echo "$SSH_FOLDER" | sed "s/\/\//\//g" | sed "s/\/$//")

touch $SSH_FOLDER/test.txt
if [ ! -f $SSH_FOLDER/test.txt ]
then
    echo "The folder $SSH_FOLDER is inaccessible"
    mkdir -p /tmp/ssh
    export SSH_FOLDER=/tmp/ssh
fi

echo 'prepping connection files in $SSH_FOLDER'
mkdir -p $SSH_FOLDER
touch $SSH_FOLDER/known_hosts
chmod 700 $SSH_FOLDER
chmod 600 $SSH_FOLDER/*
# ========================

echo 'accepting fingerprints'
ssh-keygen -R $SSH_GATEWAY_HOST -f "$SSH_FOLDER/known_hosts"
while true;
do
ssh -N -v -R 0.0.0.0:$SSH_FORWARD_PORT:0.0.0.0:$SSH_FORWARD_PORT \
-o ProxyCommand="ssh -v -W %h:%p -l $SSH_USER -i $SSH_FOLDER/$SSH_KEY \
-o StrictHostKeyChecking=no \
$SSH_GATEWAY_HOST" \
-i $SSH_FOLDER/$SSH_KEY \
-o ServerAliveInterval=180 \
-o UserKnownHostsFile=$SSH_FOLDER/known_hosts \
-l $SSH_USER \
-o ExitOnForwardFailure=yes \
-o StrictHostKeyChecking=no \
$SSH_INNER_HOST;

echo "Connection to $SSH_INNER_HOST lost" 1>&2;
sleep 600;
done;

Boss fight

Now, having established that this is possible, let’s see what it would mean in practice.

Say that in the fictional Emerald City in the Land of Oz the cluster uses HTCondor; then a job could be launched that runs a Jupyter notebook in a Singularity container while reverse port forwarding it. It just requires a lot of variables to be passed around:

#!/bin/bash

# JOB_ is a custom convention used here to mark the job variables
export COMMON_DIR=👾👾👾
export JOB_SCRIPT=$COMMON_DIR/singularity.sh
export JOB_INIT_SCRIPT=$COMMON_DIR/connection.sh
export JOB_INNER_SCRIPT=$COMMON_DIR/notebook.sh
export JOB_PORT=🤖🤖

# SSH_ is a somewhat custom convention for the ssh script 
export SSH_FORWARD_PORT=$JOB_PORT
export SSH_KEY=👽👽👽
export SSH_USER=👻👻👻
export SSH_FOLDER=$COMMON_DIR/tmp

# CONDA and APPTAINER (Singularity) variables
# APPTAINERENV are environment variables in Singularity container
export CONDA_PREFIX=$COMMON_DIR/waconda
export CONDA_ENVS_PATH=$CONDA_PREFIX/envs:🤠🤠🤠/envs:🤒🤒🤒/envs
export JUPYTER_CONFIG_DIR=$COMMON_DIR/jupyter 
export APPTAINER_CONTAINER=$COMMON_DIR/rockycuda.sif
export APPTAINERENV_CONDA_PREFIX=$CONDA_PREFIX
export APPTAINERENV_JUPYTER_notebook_dir=$COMMON_DIR
export APPTAINERENV_JUPYTER_CONFIG_DIR=$JUPYTER_CONFIG_DIR
export APPTAINERENV_CONDA_ENVS_PATH=$CONDA_ENVS_PATH
export APPTAINERENV_BASHRC_PATH=$COMMON_DIR/bashrc.sh
export APPTAINER_HOSTNAME='💩💩💩'

# Submit the job
condor_submit $COMMON_DIR/target_script.condor -a 'Requirements=(machine == "💀💀💀.🎃🎃🎃.novalocal")'

# remind me what to do:
echo "To debug condor_ssh_to_job and run:";
echo "curl localhost:$JOB_PORT/api";

Where target_script.condor is something like

# ============================================================
# This script runs $JOB_SCRIPT within initial dir $HOME2
# to specify a particular machine 
# add `-a 'Requirements=(machine == "👾👾👾🤖🤖.novalocal")'` as a cmd arg
# Envs used:
# * $HOME2 the fake home, e.g. in a mounted CephFS volume.
# * $JOB_SCRIPT the job to run
# ============================================================

Executable      = /bin/bash
arguments       = $ENV(JOB_SCRIPT)
Universe        = vanilla
getenv          = JOB_*,SINGULARITY_*,JUPYTER_*,CONDA_*,APPTAINER_*,APPTAINERENV_*,PYTHON*,HOME2,SSH_*
initialdir      = $ENV(HOME2)
Output          = $ENV(HOME2)/logs/condor-log.$(Cluster).$(Process).out
Error           = $ENV(HOME2)/logs/condor-log.$(Cluster).$(Process).err
Log             = $ENV(HOME2)/logs/condor-log.$(Cluster).$(Process).log
request_cpus = Target.TotalSlotCpus
request_gpus = Target.TotalSlotGPUs
request_memory = Target.TotalSlotMemory
+RequiresWholeMachine = True
Queue

Which runs the file $JOB_SCRIPT, i.e. singularity.sh, which goes as follows:

#!/bin/bash

# ========================
# Run $JOB_INNER_SCRIPT on a singularity container
# if JOB_INIT_SCRIPT is specified it will run that first.
# for example a ssh connection script!
# ========================

export HOST=${HOST:-$(hostname)}
export USER=${USER:-$(users)}
export HOME=${HOME:-$_CONDOR_SCRATCH_DIR}
source /etc/os-release;
echo "Running script ${0} as $USER in $HOST which runs $PRETTY_NAME."
# ---------------------------------------------------------------

export DATA=🤖🤖🤖🤖
export APPTAINER_CONTAINER=${APPTAINER_CONTAINER:-$DATA/cuda113ubuntu20cudnn8_latest.sif}
export APPTAINER_BIND="$DATA:$DATA,$HOME:/data/outerhome"
export APPTAINER_HOSTNAME=${APPTAINER_HOSTNAME:-singularity}
echo "Singularity $APPTAINER_CONTAINER with target script: $JOB_INNER_SCRIPT"
export APPTAINERENV_HOME2=$HOME2
export APPTAINERENV_DATA=$DATA
export APPTAINERENV_CONDA_ENVS_PATH=${APPTAINERENV_CONDA_ENVS_PATH:-$CONDA_ENVS_PATH}
export APPTAINERENV_JUPYTER_CONFIG_DIR=${APPTAINERENV_JUPYTER_CONFIG_DIR:-$JUPYTER_CONFIG_DIR}
export APPTAINER_WORKDIR=${APPTAINER_WORKDIR:-/tmp}
#export APPTAINER_WRITABLE_TMPFS=${APPTAINER_WRITABLE_TMPFS:-true}
export SSH_FORWARD_PORT=$JOB_PORT;
export SINGULARITY_CACHEDIR=/tmp/singularity;
export SINGULARITY_LOCALCACHEDIR=/tmp/singularity;
export APPTAINER_TMPDIR=/tmp/singularity;
export APPTAINERENV_JUPYTER_PORT=${APPTAINERENV_JUPYTER_PORT:-$JOB_PORT}
# overlay does not work.
# ---------------------------------------------------------------

if [ -n "$JOB_INIT_SCRIPT" ]; then
    bash $JOB_INIT_SCRIPT &
fi

# ---------------------------------------------------------------

echo 'Running singularity ...'
if ls /dev | grep -q '^nvidia'; then
    echo "NVIDIA GPU is present."
    /usr/bin/singularity exec --nv --writable-tmpfs $APPTAINER_CONTAINER /bin/bash $JOB_INNER_SCRIPT;
else
    echo "No NVIDIA GPU found."
    /usr/bin/singularity exec --writable-tmpfs $APPTAINER_CONTAINER /bin/bash $JOB_INNER_SCRIPT;
fi


echo 'DIED!' 1>&2;

This script does nothing more than run $JOB_INNER_SCRIPT in a Singularity container, after first launching $JOB_INIT_SCRIPT in the background if present. The latter would be the ssh connection script, for example as discussed above, while the former would be a notebook:

#!/bin/bash

export HOST=${HOST:-$(hostname)}
export USER=${USER:-$(users)}
export HOME=${HOME:-$_CONDOR_SCRATCH_DIR}
source /etc/os-release;

if [ -n "$JUPYTER_PORT" ]; then
    echo "$JUPYTER_PORT set"
elif [ -n "$JOB_PORT" ]; then
    export JUPYTER_PORT=$JOB_PORT
elif [ -n "$SSH_FORWARD_PORT" ]; then
    export JUPYTER_PORT=$SSH_FORWARD_PORT
else
    raise error "Your JUPYTER_PORT is not specified"
fi

if [ -z "$JUPYTER_CONFIG_DIR" ]; then
    raise error "Your JUPYTER_CONFIG_DIR is not specified either"
fi

export LD_LIBRARY_PATH=/usr/local/cuda/compat:$CONDA_PREFIX/lib:$LD_LIBRARY_PATH;

echo "************************"
echo "HELLO JUPYTER!"
echo "************************"
echo "Greet from Jupyter lab script ${0} as $USER in $HOST which runs $PRETTY_NAME on $JUPYTER_PORT with settings from $JUPYTER_CONFIG_DIR"

if [ -n "$BASHRC_PATH"]; then
    source $BASHRC_PATH;
else
    source $JUPYTER_CONFIG_DIR/.bashrc;
fi

conda activate

#export JUPYTER_CONFIG_PATH=$HEADQUARTERS/.jupyter

# First time? Remember to set:
# jupyter notebook --generate-config
# yes foo | jupyter server password

# port is JUPYTER_PORT
while true
do
jupyter lab --ip="0.0.0.0" --no-browser --NotebookNotary.db_file=':memory:'
done

See, I said it was easy! But wait, there is still one more level to conquer. However, the boss fight in the Land of Pipes and Sockets is for another time!

Miscellaneous Footnotes

System daemon. If you need to ssh via a system daemon, make sure to put a sleep before doing anything, and set the service to wait until the network is “online” (not network.target, which just means your network card is working).

After=network-online.target
Wants=network-online.target
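
In full, such a unit might look like this —a minimal sketch with a hypothetical script path:

[Unit]
Description=Persistent reverse SSH tunnel
After=network-online.target
Wants=network-online.target

[Service]
# give DHCP/DNS a moment even after network-online fires
ExecStartPre=/bin/sleep 30
ExecStart=/bin/bash /opt/tunnel/forward.sh
Restart=always
RestartSec=60

[Install]
WantedBy=multi-user.target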

Missing binary. The more barebones the OS, the less stuff is present. ssh is not installed with apt-utils; as soon as you install it via apt-get or dnf it will be there (with the addition of a user called ssh).

Give grief to cluster abusers with wall. On a cluster, some pieces of etiquette get abused, like running heavy jobs on login nodes. Namely: don’t run notebooks on login nodes, run them on compute nodes. If you are a victim of someone doing that and you don’t know who they are, the command wall "👾👾👾" (with a polite message) will display your message to all logged-in users.

Mac TextEdit. If you are editing .ssh/config locally, why not use your GUI editor? On a Mac, adding to your .zshrc the line alias open_text='open -a /System/Applications/TextEdit.app' will mean that open_text .ssh/config will do it.

Root Folders

  • /usr Unix system resources like the PATH folder /usr/bin —nothing to do with user.
    Often environment module files get put in /usr/share/Modules (module avail will tell the path on the dashed header lines)
  • /home/$USER is the default location of your $HOME. On a Mac it’s /Users, but that’s a rabbit hole of differences for another time.
  • /tmp A scratch folder. If you cannot write anywhere else, can you write there?
  • /dev The devices folder. You will have seen it ad nauseam with the mount command, as the null bucket (/dev/null), or as one form of input or another (/dev/tty). But in Singularity & co., df -h will reveal that it does more —if that is technobabble, please search.
  • /etc holds configuration files.
  • /var holds variable data, such as logs.
  • /opt is where stuff gets put when the root user is not sure where else to put it.
