Hugo, GitHub, VPS (2): Write a Hook Listener and Protect It Using Supervisor

· Tech

This article covers the first part described in Hugo, GitHub, VPS (1): A Work Flow for static sites.

Create GitHub webhooks

GitHub has an official tutorial for webhooks; you can follow it and create one easily. The tutorial sets the payload URL to http://localhost:4567/payload, but since my hook is deployed on the VPS, I just fill in http://my.domain:18001. The port :4567 and the path /payload are not special. The payload URL is simply where the POST request is sent, and you just need to match it with the port and location of your hook. I reserve the whole :18001 port for the hook, so I let the request go to the root path directly.

The content type is application/json, and we only trigger the hook for push events.
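Before relying on real pushes, it can help to build a hand-made request with the same shape as the one configured above. The following is only a sketch using Python's standard library: build_test_request is a hypothetical helper name, and the payload is a heavily trimmed stand-in for what GitHub actually sends.

```python
import json
import urllib.request

def build_test_request(url, full_name):
    """Build a POST that mimics the shape of a GitHub push webhook.

    Only the repository name is included; a real payload is much larger.
    """
    payload = {"repository": {"full_name": full_name}}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",  # matches the webhook setting
            "X-GitHub-Event": "push",            # event name header sent by GitHub
        },
    )

# Sending it requires a listener running at that address:
#     req = build_test_request("http://my.domain:18001/", "yourusername/yourreponame")
#     urllib.request.urlopen(req)
```

A request built this way does not come from GitHub's servers, so it is only useful against a listener that does not filter by source IP.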

Write a hook listener

The tutorial offers a sample listener app written in Sinatra, which is a Ruby micro-framework. Unfortunately, I know little about Ruby or Sinatra, so I chose Tornado, a Python framework.

There is a “Hello, world” example in Tornado’s documentation. It has a get method in MainHandler, but what we want is a post method. So delete get and write a post:

import sys, subprocess
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def post(self):
        # Report the request, then run generate.sh from this script's folder
        print('New POST Received.')
        subprocess.call([sys.path[0] + "/generate.sh"])

Now you can run the app and push something to the blog repository for which the hook has been set, and the terminal will show New POST Received. The handler then executes a shell script called generate.sh, which is located in the same folder as the Tornado Python file. Since we want our VPS to pull the new source immediately and regenerate the static site, we put these operations in the script file:

#! /bin/sh

cd path/to/blog-repo || exit 1 # stop here so rm -rf never runs in the wrong folder
git pull
rm -rf public # remove previous output files
hugo # generate the site in public/

Then make it executable:

chmod +x generate.sh
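The handler above calls the script and ignores its exit status, so a failed git pull or hugo run goes unnoticed. A minimal sketch of one way to surface failures is shown here; run_and_report is a hypothetical helper name, not part of Tornado or of the original listener.

```python
import logging
import subprocess
import sys

def run_and_report(command):
    """Run a command, log a warning if it fails, and return its exit code."""
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    if result.returncode != 0:
        output = result.stdout.decode(errors="replace").strip()
        logging.warning("%s exited with %d: %s", command[0], result.returncode, output)
    return result.returncode

# In the handler this could replace the bare subprocess.call:
#     run_and_report([sys.path[0] + "/generate.sh"])
```

Keep in mind that a plain sh script returns the status of its last command (hugo here) unless it exits earlier, so a failure in an earlier step is only reported if the script stops on it.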

Next, we check whether the POST request comes from our GitHub repository. The hook may be listening for updates from several users, branches, and repos, so let’s parse the JSON payload and read its information.

import json
import sys, subprocess
import tornado.web

class MainHandler(tornado.web.RequestHandler):
    def post(self):
        data = json.loads(self.request.body)
        try:
            repo = data['repository']['full_name']
        except KeyError:
            print('Not a GitHub webhook post')
        else:
            if repo == 'yourusername/yourreponame':
                print('POST from my blog')
                subprocess.call([sys.path[0] + "/generate.sh"])
            elif repo == 'user2/repo2':
                pass  # do something...
            elif repo == 'user3/repo3':
                pass  # do something...
            elif repo == 'user4/repo4':
                pass  # do something
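Since the hook may serve several repos and branches, the parsing step can also be pulled out of the handler and tested on its own. The sketch below assumes the push payload's ref field has the form refs/heads/<branch>; parse_push, script_for, and the ACTIONS table are hypothetical names introduced for illustration.

```python
import json

def parse_push(body):
    """Return (full_name, branch) from a push payload, or None if it does not fit.

    `body` is the raw request body (bytes or str).
    """
    try:
        data = json.loads(body)
        repo = data["repository"]["full_name"]
        ref = data["ref"]
    except (ValueError, KeyError, TypeError):
        return None  # invalid JSON or missing fields
    prefix = "refs/heads/"
    if not isinstance(repo, str) or not isinstance(ref, str):
        return None
    if not ref.startswith(prefix):
        return None  # for example a tag push, which uses refs/tags/...
    return repo, ref[len(prefix):]

# Hypothetical mapping from (repo, branch) to the script that should run.
ACTIONS = {
    ("yourusername/yourreponame", "master"): "generate.sh",
}

def script_for(body):
    """Look up the script for a payload; None means the request is ignored."""
    key = parse_push(body)
    return ACTIONS.get(key) if key else None
```

A lookup table like this keeps the handler short as more repos are added, at the cost of restricting each entry to running one script.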

Finally, we add the logging module and filter out IP addresses outside GitHub to reject potentially harmful forged POST requests. Luckily, GitHub publishes the IP range its hooks are sent from, so we only allow addresses from 192.30.252.0/22. Now we’ve got the whole program:

import tornado.ioloop, tornado.web
import json
import sys, logging, subprocess

class MainHandler(tornado.web.RequestHandler):
    def post(self):
        data = json.loads(self.request.body)
        remote_ip = self.request.remote_ip
        ip = remote_ip.split('.')
        # GitHub IP range 192.30.252.0/22: third octet between 252 and 255
        if ip[:2] == ['192', '30'] and int(ip[2]) >= 252:
            try:
                repo = data['repository']['full_name']
            except KeyError:
                logging.warning('Not a GitHub webhook post')
            else:
                if repo == 'yourusername/yourreponame':
                    print('POST from my blog')
                    subprocess.call([sys.path[0] + "/generate.sh"])
                else:
                    logging.info('Unknown repo: %s', repo)
        else:
            logging.warning('Not from GitHub. IP [%s]: %s', remote_ip, data)

def make_app():
    return tornado.web.Application([
        (r"/", MainHandler), # match your payload URL
    ])

if __name__ == "__main__":
    logging.basicConfig(format='%(asctime)s %(message)s', datefmt='%m/%d/%Y, %a, %H:%M:%S', filename=sys.path[0] + '/post.log', level=logging.INFO)
    app = make_app()
    app.listen(18001) # match your payload URL
    tornado.ioloop.IOLoop.current().start()
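The octet comparison in the handler works for this one range, but the same test can be written with Python's standard ipaddress module, which also copes with malformed or IPv6 addresses. This is a sketch rather than a drop-in replacement: is_from_github and GITHUB_HOOK_NETWORK are hypothetical names, and the range is the one quoted in this article, which may change over time.

```python
import ipaddress

# The range quoted in the article; treat it as an assumption to re-check,
# not as a permanent constant.
GITHUB_HOOK_NETWORK = ipaddress.ip_network("192.30.252.0/22")

def is_from_github(remote_ip):
    """Return True if remote_ip (a string) lies inside GITHUB_HOOK_NETWORK."""
    try:
        address = ipaddress.ip_address(remote_ip)
    except ValueError:
        return False  # not a valid IP address at all
    return address in GITHUB_HOOK_NETWORK
```

In the handler, the condition would then read if is_from_github(self.request.remote_ip): and adding a second range later becomes a matter of checking against a list of networks.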

Protect the listener process

To keep our listener process running continuously in the background and restart it after a crash, we use Supervisor as a process manager. It can also manage multiple processes of a Tornado app and balance the load with the help of nginx, but that is unnecessary for a simple hook.

Follow these directions to put your supervisord.conf file in the proper path, then add a section like the following:

[program:mylistener]
command=python /path/to/mylistener.py
redirect_stderr=true
stdout_logfile=/path/to/log.log

Then run supervisorctl reload to restart Supervisor with the new configuration, and supervisorctl status to check that mylistener has been started.
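Supervisor reports whether the process is running, not whether it is accepting connections. As a rough additional check, a short script can try to open a TCP connection to the listener's port. This is a sketch using only the standard library; is_listening is a hypothetical helper name, and 18001 is the port used earlier in this article.

```python
import socket

def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or unreachable

# Example, run on the VPS once mylistener has been started:
#     is_listening("127.0.0.1", 18001)
```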

Host the blog

Almost all static site generators ship with a built-in server, but you can also use a dedicated web server such as Apache or nginx for better performance and stability; just point its document root at your output folder in the configuration file.

We are done! Now try pushing something to GitHub, and you should see the changes on your VPS.