Linux 4G LTE Failover
Aug 22, 2020
In my last post I went over how to set up a WiFi/Ethernet bridge on a Raspberry Pi for use when your main ISP goes down. In this post I’ll be going over 4G failover with a USB dongle on a Linux server. I won’t be using a Raspberry Pi for this since I want it to be 100% automatic and limited to a single server. You could do this on a Raspberry Pi, but I don’t want the entire house on a 4G connection eating up all the data.
Requirements
- ZTE MF833V or equivalent
- Debian 10
My Setup
I’m using the USB device listed above along with a Supermicro server that has two bonded/bridged Ethernet ports. If I were to use a Raspberry Pi, I would connect one of the Ethernet ports to the Pi and the other to my main network, and let the Pi handle most of the failover (xfinitywifi failover / iPhone tethering).
Setup
- The ZTE device first appears as a CD-ROM drive that contains the Windows drivers. Because of this, it needs to be switched into USB modem mode (or rather, switching into CD-ROM mode needs to be disabled).
- https://wiki.archlinux.org/index.php/USB_3G_Modem
apt install usb-modeswitch
- In /etc/usb_modeswitch.conf, set DisableSwitching to 1
- Plug in the device and find the interface name with
ifconfig -a
- Configure the interfaces file
allow-hotplug enp0s20u8
iface enp0s20u8 inet static
    address 192.168.0.100
    netmask 255.255.255.0
    dns-nameservers 1.1.1.1 1.0.0.1 8.8.8.8
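If the interface list is long, the dongle can be picked out by name. Here is a minimal Python sketch, assuming systemd "predictable" interface names, where a USB network device carries a u<port> part after the PCI bus and slot (as in enp0s20u8 above). The helper name usb_interfaces and the regular expression are my own illustration, not part of any tool.

```python
import re
import socket

# Assumption: predictable names such as enp0s20u8, where the "u8" part
# marks a USB port. Interfaces renamed by other rules will not match.
USB_NAME = re.compile(r'^(?:en|wl|ww)(?:P\d+)?p\d+s\d+(?:f\d+)?u\d+')


def usb_interfaces(names):
    """Return the interface names that look like USB network devices."""
    return [n for n in names if USB_NAME.match(n)]


if __name__ == '__main__':
    # socket.if_nameindex() returns (index, name) pairs for every interface.
    names = [name for _, name in socket.if_nameindex()]
    print(usb_interfaces(names))
```

On a machine with only onboard NICs this should print an empty list; after plugging in the dongle its name should show up.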
Failover
You’ll notice that the USB device was not given a default gateway. This is to prevent issues with the interface not starting up. You can use post-up commands to set it with a different metric, but that will not fail over automatically when the main network is up but cannot communicate with the internet. A script that tests the connection and switches the gateway is required. There might be a better way to do this, but it works fine for my purposes.
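Because the interface has no gateway line, the only route the kernel gets for the dongle is the connected 192.168.0.0/24 network. As far as I know, a gateway added later by hand or by a script has to sit inside that network, or ip route add will reject it. A small sketch to sanity-check this up front; the helper name gateway_is_on_link is hypothetical, and 192.168.0.1 as the dongle's address is an assumption taken from the BCK_GATEWAY value in the script below.

```python
import ipaddress


def gateway_is_on_link(address, netmask, gateway):
    """True if `gateway` lies inside the subnet configured on the interface."""
    iface = ipaddress.ip_interface(f'{address}/{netmask}')
    return ipaddress.ip_address(gateway) in iface.network


# Values from the interfaces file above; the gateways are assumptions.
print(gateway_is_on_link('192.168.0.100', '255.255.255.0', '192.168.0.1'))
print(gateway_is_on_link('192.168.0.100', '255.255.255.0', '10.13.37.1'))
```

The first call prints True and the second False, since 10.13.37.1 belongs to the main network rather than the dongle's.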
This script was taken from here and modified to better suit my needs. It runs two ping tests through the default gateway; if both fail, it switches to the backup gateway unless that is already set. Otherwise, it switches back to the default gateway if it is not already set. I tweaked it a little and added a Pushover notification so I’ll know when something fails.
/opt/failover.sh
#!/bin/bash
#*********************************************************************
# Configuration
#*********************************************************************
DEF_GATEWAY="10.13.37.1" # Default Gateway
BCK_GATEWAY="192.168.0.1" # Backup Gateway
RMT_IP_1="1.1.1.1" # First remote ip
RMT_IP_2="8.8.8.8" # Second remote ip
PING_TIMEOUT="3" # Ping timeout in seconds
CURL_TIMEOUT="5" # Pushover timeout
#*********************************************************************
if [ "$(whoami)" != "root" ]
then
    echo "Failover script must be run as root!"
    exit 1
fi
# Current default gateway (third field of the "default via ..." line)
CURRENT_GW=$(ip route show | grep default | awk '{ print $3 }')
if [ "$CURRENT_GW" == "$DEF_GATEWAY" ]
then
    ping -c 2 -W "$PING_TIMEOUT" "$RMT_IP_1" > /dev/null
    PING_1=$?
    ping -c 2 -W "$PING_TIMEOUT" "$RMT_IP_2" > /dev/null
    PING_2=$?
else
    # Running on the backup: temporarily route the test IPs via the default gateway
    ip route add "$RMT_IP_1" via "$DEF_GATEWAY"
    ip route add "$RMT_IP_2" via "$DEF_GATEWAY"
    ping -c 2 -W "$PING_TIMEOUT" "$RMT_IP_1" > /dev/null
    PING_1=$?
    ping -c 2 -W "$PING_TIMEOUT" "$RMT_IP_2" > /dev/null
    PING_2=$?
    ip route del "$RMT_IP_1"
    ip route del "$RMT_IP_2"
fi
LOG_TIME=$(date +%b' '%d' '%T)
# Treat any non-zero exit status as a failure: ping exits with 2 (not 1)
# on errors such as "Network is unreachable"
if [ "$PING_1" -ne 0 ] && [ "$PING_2" -ne 0 ]
then
    if [ "$CURRENT_GW" == "$DEF_GATEWAY" ]
    then
        ip route del default
        ip route add default via "$BCK_GATEWAY"
        ip route flush cache
        echo "$LOG_TIME: $0 - switched Gateway to Backup with IP $BCK_GATEWAY"
        curl -m "$CURL_TIMEOUT" -s \
            --form-string "token=" \
            --form-string "user=" \
            --form-string "message=Failing to 4G LTE" \
            https://api.pushover.net/1/messages.json > /dev/null
    fi
elif [ "$CURRENT_GW" != "$DEF_GATEWAY" ]
then
    ip route del default
    ip route add default via "$DEF_GATEWAY"
    ip route flush cache
    echo "$LOG_TIME: $0 - Gateway switched to Default with IP $DEF_GATEWAY"
    curl -m "$CURL_TIMEOUT" -s \
        --form-string "token=" \
        --form-string "user=" \
        --form-string "message=Network online" \
        https://api.pushover.net/1/messages.json > /dev/null
fi
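The decision the shell script makes can be summarised as a small pure function, which is easier to reason about (and test) than the nested if blocks. This is only a sketch of the logic as I read it; the name plan_switch and its parameters are hypothetical.

```python
def plan_switch(current_gw, def_gw, bck_gw, ping1_ok, ping2_ok):
    """Return the gateway to switch to, or None to leave routing alone."""
    main_link_down = not ping1_ok and not ping2_ok
    if main_link_down:
        # Fail over only if we are still on the default gateway.
        return bck_gw if current_gw == def_gw else None
    # At least one test IP answered via the main link: fail back if needed.
    return def_gw if current_gw != def_gw else None


DEF, BCK = '10.13.37.1', '192.168.0.1'
print(plan_switch(DEF, DEF, BCK, False, False))  # main link down, on default
print(plan_switch(BCK, DEF, BCK, True, False))   # main link back, on backup
print(plan_switch(DEF, DEF, BCK, True, True))    # all good, nothing to do
```

The three calls print 192.168.0.1, 10.13.37.1 and None respectively.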
I decided to rewrite it in Python and support any number of gateways. The gateways are tried in the order that they are listed.
/opt/failover.py
#!/usr/bin/env python3
import re
import subprocess
import sys

import requests

PUSHOVER_USER = ''
PUSHOVER_APP = ''
PING = ['1.1.1.1', '8.8.8.8']
GATEWAY = ['10.13.37.1', '192.168.0.1', '172.22.0.1']
PING_COUNT = 2
PING_TIMEOUT = 2
PUSHOVER_TIMEOUT = 10.0

def run(*args):
    """Run a command with stdout silenced and return its exit status."""
    return subprocess.call(args, stdout=subprocess.DEVNULL)

def pushover(message):
    """Send a Pushover notification; return True if the request went out."""
    if PUSHOVER_USER and PUSHOVER_APP:
        fields = {
            'token': PUSHOVER_APP,
            'user': PUSHOVER_USER,
            'message': message
        }
        try:
            # Send the fields in the POST body rather than the query string
            requests.post('https://api.pushover.net/1/messages.json', data=fields, timeout=PUSHOVER_TIMEOUT)
            return True
        except requests.RequestException:
            pass
    return False

def get_default_gateway():
    """Return the current default gateway IP, or None if there is none."""
    try:
        data = subprocess.check_output(['ip', 'route', 'show']).decode('utf-8')
    except (OSError, subprocess.CalledProcessError):
        return None
    # search + MULTILINE: the default route is not always the first line
    m = re.search(r'^default via (\d+\.\d+\.\d+\.\d+) dev', data, re.MULTILINE)
    return m.group(1) if m else None

def set_default_gateway(ip):
    run('ip', 'route', 'del', 'default')
    run('ip', 'route', 'add', 'default', 'via', ip)
    run('ip', 'route', 'flush', 'cache')

def add_route(ip, gateway):
    run('ip', 'route', 'add', ip, 'via', gateway)

def remove_route(ip):
    run('ip', 'route', 'del', ip)

def ping(ip):
    """Return True if the host answered at least one ping."""
    return run('ping', '-c%d' % PING_COUNT, '-W%d' % PING_TIMEOUT, ip) == 0

def test_gateway(gateway):
    """Ping the test IPs through `gateway`; stop at the first success."""
    ok = False
    for ip in PING:
        add_route(ip, gateway)
        ok = ping(ip)
        remove_route(ip)
        if ok:
            break
    return ok

def main():
    current = get_default_gateway()
    # Use the first gateway in the list that can reach the internet
    for g in GATEWAY:
        if test_gateway(g):
            if g != current:
                set_default_gateway(g)
                print('Changing gateway to %s' % g)
                pushover('Changing gateway to %s' % g)
            break
    return 0

if __name__ == "__main__":
    # sys.exit (unlike os._exit) flushes stdout, so the print reaches the journal
    sys.exit(main())
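The part of the Python script most worth testing in isolation is how it reads the default gateway from the output of ip route show, since that can be checked without touching the routing table. A minimal sketch follows; this variant uses re.search with re.MULTILINE so the default line does not have to come first. The sample output, the br0 interface name and the 10.13.37.10 address are made up for illustration.

```python
import re

DEFAULT_ROUTE = re.compile(r'^default via (\d+\.\d+\.\d+\.\d+) dev (\S+)',
                           re.MULTILINE)


def parse_default_gateway(route_output):
    """Return the default gateway IP from `ip route show` output, or None."""
    m = DEFAULT_ROUTE.search(route_output)
    return m.group(1) if m else None


# Hypothetical output in which the default route is not the first line.
SAMPLE = """\
192.168.0.0/24 dev enp0s20u8 proto kernel scope link src 192.168.0.100
default via 10.13.37.1 dev br0
10.13.37.0/24 dev br0 proto kernel scope link src 10.13.37.10
"""

print(parse_default_gateway(SAMPLE))
print(parse_default_gateway('10.13.37.0/24 dev br0 scope link\n'))
```

The first call prints 10.13.37.1 and the second prints None, which is the case the script hits when the default route has been removed entirely.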
If you plan to use the Python script above, change ExecStart in this service to run it instead (for example, /usr/bin/python3 /opt/failover.py).
/etc/systemd/system/failover.service
[Unit]
Description=failover
[Service]
Type=oneshot
ExecStart=/bin/bash /opt/failover.sh
/etc/systemd/system/failover.timer
[Unit]
Description=failover timer
[Timer]
OnUnitActiveSec=15s
OnBootSec=15s
[Install]
WantedBy=timers.target
Finally, enable the timer.
systemctl daemon-reload
systemctl start failover.timer && systemctl enable failover.timer
systemctl list-timers --all
Virtual Machines
I’m running a Windows virtual machine on this server for TradeStation algos. It must be configured to use NAT routing; otherwise you will need two interfaces attached to the VM and have to let Windows figure out which one to use. A NAT configuration is much easier.
If you also have a Windows VM, be sure to set its network as a “metered connection”; otherwise it could download several gigabytes of updates while connected to 4G.
Testing
There are several ways I want to test this to make sure it fails over correctly.
- Unplug both Ethernet cables from the server.
- Unplug the internet connection from the router.
- Unplug the router itself.
- Unplug various switches the server is connected to.
- Use ifdown to disable the bond / bridge interfaces.
If all of these pass it should work when the network really goes out.
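While running through these tests it helps to note which default gateway was active at each check, so you can see when the failover and failback actually happened. A small sketch that collapses such a log into just the moments the gateway changed; the helper name gateway_transitions and the sample timings are made up for illustration.

```python
def gateway_transitions(samples):
    """Collapse (timestamp, gateway) samples into the points where it changed."""
    changes = []
    previous = object()  # sentinel that never equals a real gateway value
    for timestamp, gateway in samples:
        if gateway != previous:
            changes.append((timestamp, gateway))
            previous = gateway
    return changes


# Hypothetical samples taken every 15 seconds during an unplug test.
samples = [
    (0, '10.13.37.1'),
    (15, '10.13.37.1'),
    (30, '192.168.0.1'),
    (45, '192.168.0.1'),
    (60, '10.13.37.1'),
]
for timestamp, gateway in gateway_transitions(samples):
    print(timestamp, gateway)
```

With the sample data this prints three lines: the starting gateway at 0, the switch to 192.168.0.1 at 30, and the switch back at 60.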