By healthy, in this case, I mean running with no errors. Ideally, I want an alert whenever the Python script stops. I also want the Python script to auto-start when my server boots.
One of these problems is much simpler than the other. First, I’m going to try setting up a service for my Inspirobot Python script.
Creating a systemd unit for a Python script
This is the unit file I’ve created. It lets systemd start my Python script on boot, restart it if it fails unexpectedly, and log my program’s STDOUT and STDERR to my server’s journal.
[Unit]
Description=Inspirobot Telegram bot
After=sshd.service
[Service]
ExecStart=/usr/local/lib/inspirobot-telegram-bot/.venv/bin/python /usr/local/lib/inspirobot-telegram-bot/main.py
ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=on-failure
User=inspirobot
[Install]
WantedBy=multi-user.target
Alias=inspirobot.service
This will go a long way toward improving the maintainability of my app!
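One thing worth keeping in mind with Restart=on-failure: systemd only restarts the service when the process ends uncleanly, for example with a non-zero exit code. So the script has to fail loudly instead of swallowing errors. Here is a minimal sketch of an entry point that does that. The names run_bot and main are hypothetical stand-ins, not the actual contents of my main.py.

```python
import logging
import sys

# Log to stderr; when the script runs under systemd this ends up in the journal.
logging.basicConfig(stream=sys.stderr, level=logging.INFO,
                    format="%(levelname)s %(name)s: %(message)s")
log = logging.getLogger("inspirobot")


def run_bot():
    """Hypothetical placeholder for whatever starts the real bot loop."""
    log.info("bot started")


def main(run=run_bot):
    """Run the bot and return an exit code: 0 for clean, 1 for a crash."""
    try:
        run()
    except KeyboardInterrupt:
        # Treat Ctrl-C as a clean shutdown.
        return 0
    except Exception:
        # Log the full traceback, then report failure via the exit code.
        log.exception("bot crashed")
        return 1
    return 0

# At the bottom of the real script this would be called as:
#     if __name__ == "__main__":
#         sys.exit(main())
```

Python already exits with status 1 on an unhandled exception, so this wrapper mostly makes the behaviour explicit and guarantees a traceback in the journal. The thing to avoid is a broad try/except that logs an error and then carries on, because systemd would never see a failure.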
Application health monitoring
Now that I’ve got the application up and running, I want to get alerts when there’s a problem. These could be emails, push notifications on my phone, or a dashboard on a site I log into regularly, like Grafana. (Funnily enough, a Telegram bot programmed to send me error messages would be excellent for this.)
Just spitballing some options here:
- Email if inspirobot.service fails to start or stops unexpectedly (https://superuser.com/questions/1360346/how-to-send-an-email-alert-when-a-linux-service-has-stopped)
- Monitoring using Grafana, Kibana (maybe not, if this requires setting up a whole ELK stack), or Prometheus
- Set up failure actions in inspirobot.service (systemd’s OnFailure= setting, for example) to run another script to alert me somehow
- Daily “All Clear” messages from the bot. Actually, no. I think that could get really annoying and cause alert fatigue.
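To make the "run another script" option a bit more concrete, here is a rough sketch of an alert helper that posts a message through the Telegram Bot API's sendMessage method. The function names, the message wording, and the idea of triggering it from a failure action are all my assumptions. The token and chat ID would come from a bot I'd have to set up separately.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.telegram.org"


def build_alert_request(token, chat_id, unit, detail=""):
    """Build (but do not send) a sendMessage request for the Telegram Bot API."""
    text = f"{unit} has failed"
    if detail:
        text += f": {detail}"
    data = urllib.parse.urlencode({"chat_id": chat_id, "text": text}).encode()
    # Supplying a request body makes urllib issue a POST.
    return urllib.request.Request(f"{API_BASE}/bot{token}/sendMessage", data=data)


def send_alert(token, chat_id, unit, detail="", opener=urllib.request.urlopen):
    """Send the alert and return True if Telegram reports success."""
    request = build_alert_request(token, chat_id, unit, detail)
    with opener(request, timeout=10) as response:
        # Telegram answers with a JSON object containing an "ok" flag.
        return bool(json.load(response).get("ok"))
```

A small wrapper could read the token and chat ID from environment variables and call send_alert(token, chat_id, "inspirobot.service"). One caveat if this is wired up through OnFailure=: as far as I can tell, a service that uses Restart= only enters the failed state once its start limit is reached, so the alert would fire after systemd gives up restarting, not on every crash.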
Since I’m hosting my own server, there are other things I’ll need to worry about too: security patches, server resources, pip package freezing, and so on. Pretty soon I’ll need to add these to my workflow as well.