Mike Gouline

Django in Dev Containers

2024-09-21T00:00:00+00:00

Applications that depend on databases and other services make for fiddly local setup and Docker Compose is a common solution, but what about IDE integration? This article shows how you can develop Django applications entirely within a container using VS Code and Dev Containers.

Just want to see the code? Help yourself to GitHub.

Docker

Before configuring VS Code, we need a working Docker Compose stack. While Docker alone is sufficient, a multi-container setup is a more compelling proposition to demonstrate. Skip ahead if you already have your own Dockerfile and docker-compose.yaml.

Django specifics are out of scope, so I will assume you have an existing application; otherwise, you can use my example on GitHub (inspired by the polls tutorial).

Dockerfile

Let’s start with a simple Debian-based Python image and inline pip.

FROM python:3.12
WORKDIR /app
RUN pip install django psycopg
COPY . ./
ENTRYPOINT ["python3", "manage.py"]

Feel free to make this fancier while keeping in mind that non-Debian images may require some tweaking.

docker-compose.yaml

We can add PostgreSQL with a health check, a Django migration service and, most importantly, the app service that runs our Django server and that VS Code will connect to. The repository root is mounted to /app in the container.

name: django-devcontainer
services:
  postgres:
    image: postgres:16
    ports:
      - 5432:5432
    env_file:
      - docker.env
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER"]
      interval: 5s
      timeout: 5s
      retries: 5
      start_period: 10s
    restart: always
  migration:
    build: .
    command: migrate
    env_file:
      - docker.env
    depends_on:
      postgres:
        condition: service_healthy
  app:
    build: .
    command: runserver 0.0.0.0:8080
    ports:
      - 8080:8080
    env_file:
      - docker.env
    volumes:
      - .:/app
    depends_on:
      postgres:
        condition: service_healthy
      migration:
        condition: service_completed_successfully

You can validate everything works by running docker compose up.
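
The stack above assumes that Django picks up its database connection from the variables in docker.env, which is not shown in this article. Here is a minimal sketch of how settings.py could read them. POSTGRES_DB, POSTGRES_USER and POSTGRES_PASSWORD are the variables that the official postgres image reads; POSTGRES_HOST and POSTGRES_PORT are hypothetical names chosen for illustration, and the default host relies on the Compose service being called postgres.

```python
import os


def database_settings(env=os.environ):
    """Build Django's DATABASES setting from container environment variables."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            # Variables also read by the official postgres image
            "NAME": env.get("POSTGRES_DB", "postgres"),
            "USER": env.get("POSTGRES_USER", "postgres"),
            "PASSWORD": env.get("POSTGRES_PASSWORD", ""),
            # Hypothetical variable names; the default host is the Compose service name
            "HOST": env.get("POSTGRES_HOST", "postgres"),
            "PORT": env.get("POSTGRES_PORT", "5432"),
        }
    }


# In settings.py, Django only needs the resulting dictionary
DATABASES = database_settings()
```

Keeping the lookup in a function makes it easy to check the result with a plain dictionary, without starting the containers.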

Dev Containers

We can start configuring VS Code to connect to our Docker Compose stack:

Ensure the Dev Containers extension is installed and enabled;
Create an empty .devcontainer/devcontainer.json file. Subsequent sections will fill it with functionality incrementally, producing a working setup at each step (in true Agile™ fashion).

Basics

Here’s a minimum working devcontainer.json setup:

{
  // Path to Docker Compose file(s)
  "dockerComposeFile": "../docker-compose.yaml",
  // Which service inside dockerComposeFile to attach to
  "service": "app",
  // Attach directory within the service container
  "workspaceFolder": "/app"
}

VS Code should prompt you to reopen your project in a container, otherwise you can search commands for Dev Containers: Reopen in Container. Once it builds and connects, your title bar should look like this:

Congratulations, you are now developing inside the container!

Extensions

After the excitement wears off, you will realise there’s more work to do before this containerised environment is ready for Python development. Let’s install and configure your VS Code extensions:

{
  ...
  // VS Code customizations
  "customizations": {
    "vscode": {
      // Extension identifiers to install
      "extensions": [
        "ms-python.python",
        "ms-python.vscode-pylance",
        "ms-python.debugpy",
        "charliermarsh.ruff",
        "batisteo.vscode-django"
      ],
      // Settings from your workspace/project settings.json
      "settings": {
        // Default Python interpreter inside the container
        "python.defaultInterpreterPath": "/usr/local/bin/python3",
        // Django manage.py unit testing arguments
        "python.testing.unittestArgs": ["--no-input"],
        // Enable unittest-based Python tests
        "python.testing.unittestEnabled": true
      }
    }
  }
}

These are my favourites; you can add others from the Marketplace by their identifiers (shown under “More Info”).

This minimal setup configures Python and unit testing, but you need to set the MANAGE_PY_PATH environment variable to run Django unit tests:

{
  ...
  // Environment variables inside the container
  "containerEnv": {
    "MANAGE_PY_PATH": "./manage.py"
  }
}

While you can alternatively set environment variables in Dockerfile, docker-compose.yaml or elsewhere, given this one is only used by VS Code, that’s where I prefer to keep it.

After making these changes to devcontainer.json, you will be prompted to rebuild to apply them, otherwise you can search commands for Dev Containers: Rebuild Container. Once completed, your tests should now be runnable under Testing in the side bar:

Git

We can write and test code, now we need to commit it to Git. Add the following to your devcontainer.json to install Git and Vim (for editing commit messages):

{
  ...
  "features": {
    // Install git for your dev environment
    "ghcr.io/devcontainers/features/git:1": {},
    // Install vim for git commit messages
    "ghcr.io/jungaretti/features/vim:1": {}
  }
}

Once again, you can install packages directly in the container; this just gives you a nice abstraction for tools only used in VS Code.

Other features are available here, and you can even contribute your own.

Your .gitconfig should already be passed through to the container, but SSH credentials need to be exposed manually:

{
  ...
  "mounts": [
    // Expose ~/.ssh to the container (read only)
    "source=${localEnv:HOME}/.ssh,target=/root/.ssh,type=bind,ro,consistency=cached"
  ]
}

See Docker documentation for more information about mounts, if you want to share anything else on the host machine with your container.

Bash

What if we want shell creature comforts, such as completions and custom prompts? There’s a feature for that too!

Let’s create .bashrc with your preferences in ~/.config/devcontainer:

# Bash completion for Git
if [ -f /usr/share/bash-completion/completions/git ]; then
    source /usr/share/bash-completion/completions/git
fi
# Custom prompt with current Git branch
if [ -f /usr/lib/git-core/git-sh-prompt ]; then
    source /usr/lib/git-core/git-sh-prompt
    PS1='\[\033[01;32m\]➜\[\033[0m\] \[\033[36m\]\W\[\033[0m\]\[\033[01;31m\]$(__git_ps1 " (%s)")\[\033[0m\] \$ '
fi

See Git documentation for more information on what’s happening here.

Now we need to add bash-profile feature and a corresponding mount for that ~/.config/devcontainer directory:

{
  ...
  "features": {
    ...
    // Source .bashrc under ~/.config/devcontainer mount (if exists)
    "ghcr.io/eliises/devcontainer-features/bash-profile:1": {
      "command": "test -f /devcontainer/.bashrc && . /devcontainer/.bashrc"
    }
  },
  "mounts": [
    ...
    // Other optional container configurations
    "source=${localEnv:HOME}/.config/devcontainer,target=/devcontainer,type=bind,ro,consistency=cached"
  ]
}

Your container’s terminal feels like home without forcing your aesthetic choices on your colleagues, since .bashrc lives outside the repository!

Conclusion

This guide walked you through a working setup for Django development in Dev Containers. See the documentation for more devcontainer.json options or simply use autocompletion and tooltips in VS Code. While some extensions were specific to Python, you can recycle everything else for containerised projects in other languages as well.

Hopefully, this saves you some yak shaving and improves your workflow!

From Machine Users to GitHub Apps

2023-11-13T00:00:00+00:00

Whenever a shared machine, such as a build server, needs access to your GitHub organisation, we traditionally opted for personal access tokens (PATs) or SSH keys created against a machine user. This works until you consider the security implications of not being able to attribute actions made by that user to any of the humans with access to it.

GitHub Apps is a more secure alternative, where apps can be installed in your organisation — not only a user — and granted granular permissions to repositories, packages, issues, etc. Unfortunately, the setup is slightly more involved than for PATs and SSH keys, so I decided to write up my experience with it.


Alternatives

Before diving into GitHub Apps, I should mention that if your shared machine only requires access to a single repository, consider using deploy keys instead: they are SSH keys that you configure for a repository, instead of a user or organisation. Note that while you can have many keys for one repository, you can only associate each key with one repository at a time. As a result, this is not a good solution for, say, a build server that needs to clone all repositories in your organisation.

If you are using a cloud service, also check that it does not already provide a GitHub App that you can install. With GitHub’s popularity, this covers many use cases.

App Setup

Initial configuration involves registering a new generic GitHub App and then installing it in your specific organisation. This will happen entirely in the web interface.

Register Your App

Go to your user or organisation settings, expand the “Developer settings” (at the bottom of the sidebar) and click “GitHub Apps”.

Now click “New GitHub App” and fill in the following sections (more information can be found here):

GitHub App name — globally-unique name that describes your organisation and the purpose of this integration, e.g. “Acme CI”
Homepage URL — this app will remain private, so the homepage can be any valid URL, e.g. your company website
Webhook — uncheck “Active”, we will not use it
Permissions — configure all permissions that your shared machine needs, e.g. to clone repositories, you need at least “Read-only” access on “Contents” under “Repository permissions” (read more here)
Where can this GitHub App be installed? — if you are registering this app from a different user or organisation than where you will be installing it, select “Any account”

Save your new app by clicking “Create GitHub App”.

Install Your App

Inside your newly-registered app, note down the “App ID” for later. Scroll down to “Private keys” and click “Generate a private key”. A new PEM file will start downloading. You must keep it safe, because it will be used for authentication. If it does get lost or compromised, you can always delete it and generate a new one. Finally, open “Install App” in the sidebar and click “Install” next to your organisation. This grants your new app the permissions on your specific organisation that you configured in the previous section.

Authentication

Now that your app is installed, let’s look at authenticating against it from your shared machine. I will describe a simple custom implementation, to help you understand all the steps involved, but there is an easier way at the end if the Git client is all you need.

Custom Implementation

This sample is implemented in Python using the jwt and requests packages, but any alternatives would work just as well.

Note that I am omitting the retrieval of your organisation name, app ID and private key file, created in the previous section, because that is dependent on your environment. For example, you may want to use the secrets manager in your operating system or cloud provider.

We start by preparing your JSON web token (JWT) for authenticating against the GitHub API (you can read more here).

import jwt
import time

def get_encoded_jwt(app_id, private_key_path):
    with open(private_key_path, "rb") as pem_file:
        signing_key = jwt.jwk_from_pem(pem_file.read())

    return jwt.JWT().encode(
        payload={
            "iat": int(time.time()),
            "exp": int(time.time()) + 10 * 60,  # 10 mins (maximum)
            "iss": app_id,
        },
        key=signing_key,
        alg="RS256",
    )
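
When the API rejects a token, it helps to look at the claims that were actually encoded. The following is a minimal debugging sketch using only the standard library, not part of the original sample: decode_jwt_payload is a hypothetical helper name, and it deliberately does not verify the signature, so it should never be used to trust a token.

```python
import base64
import json


def decode_jwt_payload(encoded_jwt):
    """Return the claims of a JWT without verifying its signature (debugging only)."""
    # A JWT is three base64url segments: header.payload.signature
    payload_segment = encoded_jwt.split(".")[1]
    # base64url segments omit padding, so restore it before decoding
    padded = payload_segment + "=" * (-len(payload_segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Calling it on the result of get_encoded_jwt should show the same iat, exp and iss values that went into the payload, which makes clock or app ID mistakes easy to spot.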

The first request retrieves the API URL to request a new access token for the installation in your organisation.

import requests

def get_access_token_url(encoded_jwt, org):
    resp = requests.get(
        url="https://api.github.com/app/installations",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {encoded_jwt}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
    for installation in resp.json():
        if installation["account"]["login"] == org:
            return installation.get("access_tokens_url")
    return None

The URL will be of the form https://api.github.com/app/installations/INSTALLATION_ID/access_tokens and you can alternatively find that INSTALLATION_ID by clicking the ⚙ (cog) icon next to your organisation in the “Install App” section of your app, and checking the end of that URL, e.g. .../installations/INSTALLATION_ID.

Now we can use that URL to request a new installation access token (see endpoint for details).

def get_access_token(encoded_jwt, access_tokens_url):
    resp = requests.post(
        url=access_tokens_url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {encoded_jwt}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )
    return resp.json().get("token")

That’s it! You can now authenticate with GitHub using this token. Note that it expires after an hour, so your implementation will need to periodically refresh it. When authenticating using anything that takes a username and a password (e.g. Git client), the access token is the password and x-access-token is the username.
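
Since the token expires after an hour, one way to handle refreshing is a small cache that fetches a new token shortly before the old one runs out. This is a sketch under assumptions rather than a definitive implementation: InstallationTokenCache and its parameters are hypothetical names, and it relies on a fixed one-hour lifetime instead of reading the expiry time from the API response.

```python
import time


class InstallationTokenCache:
    """Caches an installation access token and refreshes it shortly before expiry."""

    def __init__(self, fetch_token, lifetime=3600, margin=300, clock=time.time):
        self._fetch_token = fetch_token  # callable returning a fresh token string
        self._lifetime = lifetime        # assumed validity in seconds (one hour)
        self._margin = margin            # refresh this many seconds early
        self._clock = clock              # injectable so the logic can be tested
        self._token = None
        self._refresh_at = 0.0

    def get(self):
        """Return a cached token, fetching a new one when it is about to expire."""
        now = self._clock()
        if self._token is None or now >= self._refresh_at:
            self._token = self._fetch_token()
            self._refresh_at = now + self._lifetime - self._margin
        return self._token
```

In practice, fetch_token would be a function that calls get_encoded_jwt and get_access_token from the snippets above, and callers would ask the cache for a token every time they need one.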

Git Client

For the native Git client, used either directly or through another tool that calls out to it, you can configure a credential helper instead.

In a nutshell, Git allows you to configure any executable called git-credential-somename in your PATH to dynamically fetch credentials according to any custom logic you like.
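
To make the mechanism concrete, here is a minimal sketch of the logic inside such a helper. Git passes the action (get, store or erase) as an argument, writes key=value attributes to the helper's standard input, and reads username and password lines from its standard output. The function names and the get_token callable are hypothetical; a real helper would wrap this in a script that reads sys.argv and sys.stdin.

```python
def parse_request(text):
    """Parse the key=value lines Git writes to the helper's standard input."""
    attrs = {}
    for line in text.splitlines():
        if not line:
            break  # a blank line ends the request
        key, _, value = line.partition("=")
        attrs[key] = value
    return attrs


def respond(action, request_text, get_token):
    """Return the text a helper would print for one invocation."""
    attrs = parse_request(request_text)
    # Only answer lookups for github.com; ignore store/erase and other hosts
    if action != "get" or attrs.get("host") != "github.com":
        return ""
    return f"username=x-access-token\npassword={get_token()}\n"
```

Returning an empty response for anything the helper does not recognise lets Git fall back to its other credential sources.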

There are several available for GitHub Apps; I prefer this one written in Go: https://github.com/Avinode/git-credential-github-apps. All you have to do is extract your platform-appropriate git-credential-github-apps binary from the releases, place it somewhere under your PATH (e.g. /usr/local/bin) and execute the following:

git config --global credential.helper 'github-apps -privatekey PRIVATE_KEY_PATH -appid APP_ID -login ORGANISATION'

Ensure that your user has a ~/.cache directory — that’s where the helper stores cached credentials until they expire.

If your existing setup clones repositories via SSH and you want backwards compatibility, you can force HTTPS URL replacement:

git config --global url."https://github.com/".insteadOf "git@github.com:"

Your Git client can now clone repositories as normal. This should also work for CLI tools that execute Git commands implicitly, but your results may vary and I recommend some testing.

Scraping Internet Outages with Selenium and Python

2021-06-01T00:00:00+00:00

Imagine you have an internet provider. You probably do. This provider sometimes performs scheduled maintenance, resulting in your internet temporarily disconnecting. Now imagine there’s no way of getting notified about future maintenance windows, unless you check their outages page every day. Annoying, but we can do something about that with a bit of code.

This was not a hypothetical situation for me and I figured I could automate the process with a script, like any developer would. Unfortunately, internet providers are seldom developer friendly, so there was no API that I could tap into. There was only a customer dashboard with a list of current and future outages, which I would have to scrape.

While I realise not everyone is facing this exact problem, I’m sure people find themselves needing to automate something on archaic websites from time to time, so hopefully this article is useful to them. To avoid making a target out of any real website, I built a simple HTML mock that we will be scraping instead.

Research

Step one is researching the problem space. There are three major parts:

Obtaining the outages
Serving them
Hosting the code away from your computer

I hadn’t scraped anything before, but 5 minutes of research showed Selenium to be a common solution. You have a lot of language choice here, including Java (and anything interoperable, like Kotlin), C#, JavaScript, Python and Ruby. Since I spend most of my day job writing Python (for data/ML), I picked that, but you can follow along with any other language you prefer.


My first idea for serving the outages was sending notifications via email or SMS, but how do you avoid forgetting right after receiving them? You create an event in your calendar! So why not skip the intermediate step and create a calendar that you can just subscribe to, like public holidays or sporting calendars. All you have to do is generate a static iCalendar file and host it somewhere.

This brings me to the final piece — hosting. There’s no need for a request-response type application here, because the outages are unlikely to change too frequently, so you really just need a scheduled fetch operation that updates the iCal file and hosts it for your calendar app or service to synchronise. The simplest approach I came up with is to run it as a scheduled GitHub Actions workflow and host the static file as a Gist. If you have access to AWS, Google Cloud or Azure, you can just as easily use a scheduled serverless function and blob storage, e.g. Lambda that writes to a public S3 bucket, triggered via CloudWatch Events.

Code

Let’s look at the code. This section will only cover some snippets to give you a feel for how things work; you can find the complete working project at https://github.com/gouline/outages.

Website

You will obviously skip this part when scraping a real website, but to avoid pissing off some webmasters, I used the power of GitHub Pages to create a mock login page that redirects to a static list of outages (regardless of what username and password you entered):

Create two static HTML pages: login.html and outages.html
Put them under /docs in the GitHub repository
Go to the repository ‘Settings’, choose the ‘Pages’ tab, and enable GitHub Pages for the primary branch and the /docs directory
The two pages are now hosted under https://[USERNAME].github.io/[REPOSITORY]/ (if you have a custom domain associated with the main GitHub Pages repository, this will redirect and still work)

Now we have something to scrape.

Scraping

For those of you who have never scraped websites before, the process looks roughly like writing HTML files in reverse. You inspect the page in your browser, find the information that you want to grab, and think about how you would traverse down the DOM, by tags, IDs and classes, to get it. First, download the ChromeDriver and put it somewhere in your executable path. Then install the Selenium binding for Python:

pip install selenium

Here’s how you instantiate a simple headless driver:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

chrome_options = Options()
chrome_options.add_argument("--headless")
driver = webdriver.Chrome(options=chrome_options)

When troubleshooting unexpected behaviour, you can temporarily comment out the --headless line to see what the code sees in a separate Chrome window (you can even inspect the page). The driver throws errors whenever something cannot be found, so it’s a good idea to surround whatever you are doing with a try-finally statement to make sure you close it, even if something goes wrong.

try:
    outages = get_outages(driver)
finally:
    driver.quit()  # ends the whole browser session, not just the current window

To load a page, you just get it:

driver.get("https://gouline.github.io/internet-outages/provider/login.html")

Now let’s fill out the credentials and submit the form:

import os

username = os.getenv("PROVIDER_USERNAME")
password = os.getenv("PROVIDER_PASSWORD")

driver.find_element_by_id("username").send_keys(username)
driver.find_element_by_id("password").send_keys(password)
driver.find_element_by_tag_name("form").submit()

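Most of the time you will be using find_element_by_id, find_element_by_tag_name or find_element_by_class_name for simple parsing. All these return the first matching element and throw an error if they cannot find it. You can also use their plural variants (e.g. find_elements_by_tag_name) to return a list of matching elements that you can loop through. Note that these helpers were deprecated in Selenium 4 and removed in version 4.3; on newer releases, the equivalent calls are driver.find_element(By.ID, "username") and driver.find_elements(By.TAG_NAME, "p"), with By imported from selenium.webdriver.common.by.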

After a page redirect, you can use this to wait until the new page title contains the text “Outages”, to avoid errors just because the page hasn’t loaded yet:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

WebDriverWait(driver, 10).until(EC.title_contains("Outages"))

Now we loop through the elements with the class list-group-item that we know contain each outage on the mock page and then find the container class inside with each attribute of that outage:

outages = []

for i in driver.find_elements_by_class_name("list-group-item"):
    outage = Outage()

    for c in i.find_elements_by_class_name("container"):
        title = c.find_element_by_tag_name("strong").text
        value = c.find_element_by_tag_name("p").text

        outage.put_attribute(title, value)

    outages.append(outage)

Corresponding HTML for reference:

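<li class="list-group-item">
  <div class="container">
    <strong>Start</strong>
    <p>Thu 10 Jun 2021 12:00AM PST</p>
  </div>
  <div class="container">
    <strong>End</strong>
    <p>Thu 10 Jun 2021 07:00AM PST</p>
  </div>
  <div class="container">
    <strong>Severity</strong>
    <p>High</p>
  </div>
</li>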
Notice how you can call find_* functions on the driver for the whole page or recursively on any returned elements. Be careful with find_element_* functions because they throw NoSuchElementException when they cannot find what you are looking for, so if this is expected, make sure to surround them with a try-except statement.

That about covers the overall parsing principle. Unfortunately, in most real-world use cases, HTML gets complex quickly and you will have to resort to find_element_by_xpath, but XPath syntax is way too broad to cover in this article, so I will leave it up to the reader to explore the documentation.
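
The Outage class used in the scraping loop is not shown in this article (the full version lives in the repository). As a rough idea of what it has to do, here is a hypothetical sketch that stores the scraped attributes, parses the date strings from the mock page and derives a stable uid. The time zone table, the naming scheme and the hashing choice are all assumptions for illustration, and the date parsing relies on English day and month names.

```python
import hashlib
from datetime import datetime, timedelta, timezone

# Assumed abbreviations; strptime cannot reliably parse names like "PST" itself
TZ_OFFSETS = {"PST": timedelta(hours=-8), "PDT": timedelta(hours=-7), "UTC": timedelta(0)}


def parse_outage_time(text):
    """Parse strings such as 'Thu 10 Jun 2021 12:00AM PST' into aware datetimes."""
    stamp, tz_name = text.rsplit(" ", 1)
    naive = datetime.strptime(stamp, "%a %d %b %Y %I:%M%p")
    return naive.replace(tzinfo=timezone(TZ_OFFSETS[tz_name]))


class Outage:
    """Collects the title/value pairs scraped for one outage."""

    def __init__(self):
        self.attributes = {}

    def put_attribute(self, title, value):
        self.attributes[title] = value

    @property
    def start(self):
        return parse_outage_time(self.attributes["Start"])

    @property
    def end(self):
        return parse_outage_time(self.attributes["End"])

    @property
    def name(self):
        return f"Internet outage ({self.attributes.get('Severity', 'Unknown')})"

    @property
    def uid(self):
        # Same start and end always give the same identifier across runs
        key = f"{self.attributes['Start']}|{self.attributes['End']}"
        return hashlib.sha1(key.encode()).hexdigest()

    def description(self):
        return "\n".join(f"{k}: {v}" for k, v in self.attributes.items())
```

Deriving the uid from the outage window, rather than generating it randomly, means a calendar app sees the same event on every refresh.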

Generating Calendar

Now that we have some outages, we need to generate the iCal file. If you’ve never seen one, here is what an example one looks like:

BEGIN:VCALENDAR
VERSION:2.0
BEGIN:VEVENT
DESCRIPTION:Scheduled maintenance
DTSTART:20210101T000000Z
SUMMARY:Event Name
UID:TEST-UID-1
END:VEVENT
END:VCALENDAR

As you can see, it wouldn’t take much to generate it manually, but fortunately, ics.py exists so we don’t have to. You can install it like so:

pip install ics

Given a list of outages that we scraped before, we add them as events:

import ics

cal = ics.Calendar()

for outage in outages:
    event = ics.Event(
         name=outage.name,
         begin=outage.start.timestamp(),
         end=outage.end.timestamp(),
         uid=outage.uid,
         description=outage.description(),
    )
    cal.events.add(event)

While most arguments are self-explanatory, an important note about uid — the unique identifier for each event — is that it can be omitted, but it’s then generated randomly every time the file is updated. Depending on how your calendar app or service handles synchronisation, this may cause undesirable behaviour, since technically all events will be brand new at each refresh.

Finally, we write the calendar to a plaintext file:

with open(filename, "w") as f:
    f.writelines(cal)
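
For comparison, this is roughly what generating the file by hand would involve. It is a standard-library sketch rather than what ics.py does internally: build_calendar and its tuple layout are hypothetical, the PRODID value is a placeholder, and folding of long lines is left out for brevity.

```python
from datetime import datetime, timezone


def escape_text(value):
    """Escape characters that are special in iCalendar text values."""
    return (value.replace("\\", "\\\\").replace(";", "\\;")
                 .replace(",", "\\,").replace("\n", "\\n"))


def format_utc(moment):
    """Format an aware datetime as an iCalendar UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def build_calendar(events, now=None):
    """Serialise (uid, name, begin, end, description) tuples into iCalendar text."""
    stamp = format_utc(now or datetime.now(timezone.utc))
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//example//outages//EN"]
    for uid, name, begin, end, description in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{uid}",
            f"DTSTAMP:{stamp}",
            f"DTSTART:{format_utc(begin)}",
            f"DTEND:{format_utc(end)}",
            f"SUMMARY:{escape_text(name)}",
            f"DESCRIPTION:{escape_text(description)}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar lines are terminated with CRLF
    return "\r\n".join(lines) + "\r\n"
```

When writing the result to disk, opening the file with newline="" keeps Python from translating the line endings a second time.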

GitHub Actions

Now that we have some Python code that we can run locally, we want to run it somewhere on a schedule, say daily. Here’s how you can configure GitHub Actions to do that.

By default, Actions are enabled for all GitHub repositories, so all you have to do is create a workflow configuration. Let’s create .github/workflows/deploy.yml and start filling it in.

First, we specify what triggers this workflow. We want it to run on a schedule:

on:
  schedule:
    - cron: "0 5,17 * * *"

This example makes it trigger at 5:00 and 17:00 (UTC) every day. For more options, refer to documentation.

Next, we define what steps need to be executed and on what platform:

jobs:
  run:
    runs-on: ubuntu-latest
    name: run
    steps:
      - uses: actions/checkout@v2

Remember how you had to install the ChromeDriver before? This needs to be done on the CI runner as well. Thankfully, there’s the setup-chromedriver action on the Marketplace for that:

      - uses: nanasess/setup-chromedriver@master

We need to install Python dependencies:

      - name: Requirements
        run: pip3 install selenium ics

And run the Python scraper we wrote earlier:

      - name: Run
        run: python3 outages.py
        env:
          PROVIDER_USERNAME: ${{ secrets.PROVIDER_USERNAME }}
          PROVIDER_PASSWORD: ${{ secrets.PROVIDER_PASSWORD }}

Presumably, you will need to save those credentials we used to log into the website we were scraping. Never store credentials in plaintext in your repository! You can store them as encrypted secrets instead. Just go to ‘Settings’ in your repository, then the ‘Secrets’ tab, and create two secrets called PROVIDER_USERNAME and PROVIDER_PASSWORD, referenced above.

Finally, we need a way to upload the resulting calendar file to GitHub Gist. We could do it manually via the API, but we’re all busy people, so there’s another Marketplace action called deploy-to-gist to do it for us:

      - name: Deploy
        uses: exuanbo/actions-deploy-gist@v1
        with:
          token: ${{ secrets.GIST_TOKEN }}
          gist_id: YOUR_GIST_ID
          gist_file_name: outages.ics
          file_path: ./dist/outages.ics

The field gist_file_name controls the name of the file in the target gist and file_path is where to find the file to upload after executing the Python script. Two more configuration steps before we’re done though:

Go to https://gist.github.com/ and create an empty gist; the gist_id will be in the URL, i.e. https://gist.github.com/[USERNAME]/[GIST_ID]
Create a personal access token with scope “Create gists” and save it as a repository secret GIST_TOKEN (same as the website credentials before) — this gives the workflow permission to edit your gist

Done! If you followed the instructions correctly, your workflow will now run on schedule and upload your calendar file to GitHub Gist. To synchronise your calendar app or service with your generated calendar, point it to https://gist.githubusercontent.com/[USERNAME]/[GIST_ID]/raw (with your values), which exposes your gist as a raw file that can be downloaded.

Optimisations

If you looked at the GitHub repository I linked to in the beginning, you would have noticed a few additional things not discussed in this article that you may want to consider doing in your own repository, especially if you are new to Python:

Create a build script, such as a Makefile and/or setup.py, and call its targets from the CI workflow, instead of explicit commands
Extract your dependencies into requirements.txt and install them all at once with pip install -r requirements.txt
Separate your Python code into multiple files as needed and put them all in a directory called outages (with an __init__.py file), so that you can call it as a module with python -m outages (more on that here)
Write tests! It may be a small throwaway project, but it’s still a good idea to write at least some basic unit and integration tests

Having said that, I purposely only focused on things relevant to scraping websites and generating the calendar file. How to structure Python projects is out of scope for what I wanted to address and your implementation will be different, depending on what website you are scraping and how you want to approach it, anyway.

Closing Thoughts

Hopefully, this was helpful for somebody facing a similar task. Web scraping is a massive topic that’s impossible to cover in one article and that was not my intention. This was more of a walk-through of a weekend project where you approach a real-world problem with a can-do attitude and some basic programming skills.

As I mentioned in the research section, GitHub Actions was picked purely because it’s simple and free, but it’s by no means the only, or even the nicest, option. Hosting the Python script on a cloud platform, such as AWS, Azure or Google Cloud, is left as an exercise for the reader. Feel free to contribute your setup in the comments and I will happily include it in the footnotes.

Thank you for reading!

ADS-B Feeders on Raspberry Pi

2020-10-26T00:00:00+00:00

I happen to be an aviation nerd and flight tracking services enable aviation nerds to learn a lot about any aircraft flying over. Where it’s going, where it’s coming from, why it’s flying over my house at 6AM after I went to bed only 4 hours ago. Important stuff.

In this article, I explain how you can become a part of this process by using a Raspberry Pi to capture and feed ADS-B data to the top four most popular tracking services.

Introduction

You’ve likely used or heard of Flightradar24, FlightAware, RadarBox, and Plane Finder before. Most of their coverage comes from regular people capturing live ADS-B data, broadcast by nearby aircraft on 1090 MHz and 978 MHz (US-only) frequencies, with some form of software-defined radio (SDR), and feeding it to them with a computer or a mobile device.

Why? Beyond the cool factor, most of them offer a free premium account for your trouble, which is handy if you track planes frequently and want some advanced features. Next question is “how” then.

If you live in a remote area with poor existing coverage, you can apply for a free professional receiver at Flightradar24, FlightAware, RadarBox, or Plane Finder. The only caveat is that you have to install it on your roof or a tall mast with unobstructed 360-degree view of the sky and a 24/7 internet connection. These receivers are also black boxes with an Ethernet port, so you won’t be able to install any third-party software on them, in case you also want to tinker with ADS-B data yourself.

For everyone else who lives in a large city with plenty of coverage or has no ability to install suspicious-looking equipment on the roof of their apartment building, your best bet is building a DIY receiver of your own.

Hardware

You can install ADS-B feeder software on almost any computer. But remember that you’ll be running it 24/7, so it’s probably unwise to install it on your main laptop or desktop. Unless you already have a rack of servers, the cheapest and most energy-efficient option is a Raspberry Pi.

The exact model makes no difference; I’ve installed feeders on a first-generation Model B and a Zero W without issues. So long as it has one free USB port and an internet connection (i.e. you have a wired router nearby or your Pi has Wi-Fi), it’s good enough.

You will also need an SDR adapter with an external antenna. This can be anything from a cheap DVB-T USB dongle with a small magnetic antenna for $10 off eBay, all the way up to a purpose-built ADS-B receiver with an outdoor fibreglass antenna for $100+ (most tracking services can sell you one on their website). Either option will work, but cheaper equipment and lazier antenna positioning will get you significantly shorter range (5–10 NM) compared to a more serious setup (100–200 NM). It all depends on your budget and level of enthusiasm.

Software

You have some options for the software part as well. If your Raspberry Pi will only be used for feeding data to FlightAware or Flightradar24, the easiest option is installing their pre-made OS images, PiAware or Pi24, respectively.

But if you want to feed data to multiple services or you want to simultaneously use your Raspberry Pi for other tasks, keep reading.

0. Raspberry Pi OS

Let’s start by downloading Raspberry Pi OS (formerly, Raspbian), the official Debian Linux-based operating system made for all Raspberry Pi models. If you intend to run it headless (no monitor, no keyboard), I would recommend Raspberry Pi OS Lite, but full desktop version would work too. Follow the instructions on the website to write it to your SD card.

Now insert the SD card into your Pi and turn the power supply on. Unless you have a console cable, you will have to connect a monitor and keyboard initially, to enable SSH and/or VNC to control it remotely.

Once the system boots, log in with the default username “pi” and password “raspberry” and run sudo raspi-config to do some basic configuration:

Change the default password!
Under locale settings, set time zone to UTC (required by some feeders)
Enable SSH and/or VNC, depending on how you plan to control your Pi in the future
Optionally, change the hostname to something unique, like radarpi, so that you can address it as radarpi.local on your network instead of configuring a static IP
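
If you would rather script these settings than click through the raspi-config menus, here is a minimal sketch using the standard systemd tools (timedatectl, systemctl, hostnamectl). The run and configure_pi helpers and the DRY_RUN variable are hypothetical names for illustration only; with DRY_RUN=1 the commands are printed instead of executed. It assumes the SSH service is called ssh, as on Debian-based systems, and it leaves the password change to the interactive passwd command.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Apply the same settings as the raspi-config steps above.
# $1 is the hostname to set (for example, radarpi).
configure_pi() {
  run sudo timedatectl set-timezone UTC      # time zone to UTC
  run sudo systemctl enable --now ssh        # enable and start the SSH server
  run sudo hostnamectl set-hostname "$1"     # new hostname for the Pi
}

# Usage: preview first, then run for real.
#   DRY_RUN=1 configure_pi radarpi
#   configure_pi radarpi
```

Previewing with DRY_RUN=1 first is a cheap way to check what will be changed. After changing the hostname this way, you may also need to update the 127.0.1.1 entry in /etc/hosts by hand.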

Before we go any further, it’s not a bad idea to install any available OS updates by running sudo apt-get update and then sudo apt-get upgrade.

1. FlightAware

Always install FlightAware first, because its PiAware packages include a working version of dump1090 and dump978 that other feeders rely on. This avoids having to install them separately, which sometimes causes issues.

Sign up for a free FlightAware account before going any further. Then follow their guide, which boils down to this (replace 4.0 with the latest version):

wget https://flightaware.com/adsb/piaware/files/packages/pool/piaware/p/piaware-support/piaware-repository_4.0_all.deb
sudo dpkg -i piaware-repository_4.0_all.deb
sudo apt-get update
sudo apt-get install piaware
sudo piaware-config allow-auto-updates yes
sudo piaware-config allow-manual-updates yes
sudo apt-get install dump1090-fa dump978-fa

All done! Now claim your feeder and it should appear in your stats. Here you can edit the location and elevation of your antenna, your closest airport, etc.

You can also now navigate to port 8080 of your Pi in your browser, e.g. http://radarpi.local:8080/, for a dashboard that shows you which flights the dump1090 daemon can see.

2. Flightradar24

Flightradar24 provides an easy installation script, but for me it always fails with a GPG error. So I pieced together a way to install everything manually from various forum posts.

Sign up for a free Flightradar24 account before going any further.

Go to fr24feed_versions.json to check the latest version available for the “linux_arm_deb” platform. At the time of writing, it was 1.0.26-9, so substitute the latest one and execute the following:

wget http://repo.feed.flightradar24.com/rpi_binaries/fr24feed_1.0.26-9_armhf.deb
sudo dpkg --install fr24feed_1.0.26-9_armhf.deb
sudo fr24feed --signup

You will be asked to type in the email address associated with your account and other details, such as the sharing key (which you can leave blank if this is your first feeder) and your antenna location.

Finally, just run this to restart the client and start feeding data:

sudo systemctl restart fr24feed

Your feeder should now appear under data sharing in your account.
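
The version string appears in both the download and the install command above, so it is easy to update one and forget the other. As a small sketch, the two hypothetical helpers below (fr24_deb_name and fr24_deb_url are illustrative names, not part of the Flightradar24 tooling) derive the file name and URL from a single version value, assuming the URL pattern stays the same as in the commands above.

```shell
#!/bin/sh
# Hypothetical helpers: derive the package file name and download URL
# from a single version string, using the pattern in the commands above.
fr24_deb_name() {
  echo "fr24feed_${1}_armhf.deb"
}

fr24_deb_url() {
  echo "http://repo.feed.flightradar24.com/rpi_binaries/$(fr24_deb_name "$1")"
}

# Usage (substitute the latest version from fr24feed_versions.json):
#   version="1.0.26-9"
#   wget "$(fr24_deb_url "$version")"
#   sudo dpkg --install "$(fr24_deb_name "$version")"
```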

3. AirNav RadarBox

RadarBox also provides an installation script and this one worked for me.

Sign up for a free RadarBox account before going any further.

Here’s what you need to execute:

sudo bash -c "$(wget -O - http://apt.rb24.com/inst_rbfeeder.sh)"
sudo apt-get install mlat-client
sudo systemctl restart rbfeeder
sudo rbfeeder --showkey --no-start

Once it shows you the key, all you have to do is claim it.

You can check your sharing status, including flights your feeder can see and your coverage, by going to Account > Stations and selecting the name of your station.

4. Plane Finder

Last one is Plane Finder. This feeder is just a single Debian package with instructions in a PDF file.

Sign up for a free Plane Finder account before going any further.

Instructions boil down to this (replacing with latest version):

wget http://client.planefinder.net/pfclient_4.1.1_armhf.deb
sudo dpkg -i pfclient_4.1.1_armhf.deb

Once installed, navigate to port 30053 of your Pi in your browser, e.g. http://radarpi.local:30053/, fill out your account email address, and the latitude and longitude of your antenna.

Then select “Beast” and type in “127.0.0.1” for the IP address and “30005” for the port (output for dump1090’s Mode-S Beast binary format) and take note of the generated share code. Plane Finder will also automatically email it to you, in case you lose it.

Finally, navigate to receivers in your account, enter the share code and click “Add receiver”. Eventually, “inactive” should switch to “active” — this took around 15–20 minutes for me (much longer than with other services).

Conclusion

Congratulations, you’re now feeding to four flight tracking services simultaneously. After a while (within an hour), all your accounts should switch to “premium”, “business” or “enterprise”, depending on what each service is offering in exchange for sharing data.
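
While you wait, a quick way to confirm that all the clients are actually running is to ask systemd. The sketch below defines a hypothetical check_feeders helper around systemctl is-active. The service names in the usage line are assumptions based on the packages installed above (pfclient in particular may be registered as an init script rather than a native unit), so adjust them to whatever systemctl list-units shows on your Pi.

```shell
#!/bin/sh
# Report whether each named systemd service is currently active.
# Prints one line per service; returns non-zero if any is not active.
check_feeders() {
  failed=0
  for svc in "$@"; do
    if systemctl is-active --quiet "$svc"; then
      echo "$svc: active"
    else
      echo "$svc: NOT active"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (service names assumed from the packages installed above):
#   check_feeders piaware dump1090-fa fr24feed rbfeeder pfclient
```

Because the function returns non-zero when anything is down, it can also be called from a cron job or another script that reacts to failures.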

Keep an eye on your sharing stats to see how many flights and positions your feeder captures, and what your range is. If you’re unhappy with these results, try moving your antenna closer to the window and raising it higher. Alternatively, you can go out and purchase a better USB dongle and/or a bigger antenna. Either way, your existing Raspberry Pi setup won’t change.

My last point is a “life hack” unrelated to aviation. Flightradar24 has a useful setting to notify you when your feeder is offline and display a graph of your uptime. This is surprisingly handy for checking your internet connection, in case you need to yell at your ISP for an all-night outage or decide whether you should stay at work a little longer because your home internet is dead. Enjoy!

Data Team as an Optimisation Problem

2020-07-09T00:00:00+00:00

Many (if not most) companies reach a point when data becomes a priority. This implies building out an internal practice to integrate into existing systems and processes to deliver the sought insights. In a field so wide, relatively recent and infamous for its buzzwords-per-second count, formalising problems and making explainable decisions is the only route that won’t see you run out of resources and people’s patience.

This talk explores how we approached this challenge at mx51 armed with lessons from engineering and statistics. We start by defining the optimisation problem formally(-ish) and then applying it to actual decisions faced along the way, including technology selection, warehousing and data lake (both Snowflake), ETL, visualisation and ML-driven insights.

Exporting dbt Schema to Metabase

2020-03-12T00:00:00+00:00

Metabase, a brilliant open-source BI tool, allows you to define your data model on top of the database schema, which includes descriptions, special types and table relationships. However, updating everything manually is not repeatable and prone to errors, so I will demonstrate how to export your existing dbt schema into Metabase automatically with a tool I created.

Return from the Dark Side

2019-05-26T00:00:00+00:00

There comes a time in a software engineer’s career when they start asking the dreaded question. Do I continue writing code as an ‘individual contributor’ or should I start the gradual descent into management?

Here is my tale of exploring the ‘dark side’ and eventually coming back. Take it for what it is, pseudo-philosophical ramblings on an unoriginal topic, as compiled from several anonymised places of employment.

Motivations

What motivates these forays into management? There are good reasons, such as realising that you’re great at mentoring people and enabling them to do their best work. This article won’t cover those as plenty of LinkedIn listicles already do. Here are some bad reasons.

Some people in the industry seem to think that reaching a certain age implies that it’s time to close your IDE, open your email client and become a manager. The exact age varies but it’s a popular enough theory to warrant a mention.

Money is another one. In larger companies your pay is determined by which ‘band’ you belong to and these bands usually separate the managers from the non-managers. As a result, you get people who are in management only because their annual pay reviews hit the wall.

Then there’s the old ‘nobody else for the role’ scenario. Your manager got promoted up a level and somebody needs to fill his/her shoes. You’re the tech lead, you practically already manage the team, right? So who better for the job than you?

That last one is what happened to me. At the time I considered it a specific situation that only happened in our team, but after telling others about it, I quickly realised that it’s hardly rare. Regardless, that’s how I embarked on my journey from ‘tech lead’ to ‘team lead’ (if these terms sound the same to you, please look them up before reading any further).

Learnings for Make Benefit Glorious Reader

I have no idea why I called it that. I don’t even like that movie. The point is, experiencing the ‘greenness’ of the grass on the management side leaves you with some lessons that you can carry further into whatever you end up doing next. Even a failed experiment yields data.

Meetings are Mostly Pointless

Communication is important, nobody is disputing that. If everyone is just writing code 100% of the time without aligning on what is being built and why, you will be extremely productive at achieving absolutely nothing. But there’s useful communication and there are meetings for the sole purpose of having a meeting and appearing more busy than you really are.

Unfortunately, managers are constantly subjected to meetings that produce fewer results than going out for coffee. ‘Power play’ meetings to show dominance, ‘get everybody in the room’ meetings to dilute blame if anything goes wrong, ‘lip service’ meetings to assure somebody important that a project is being worked on even though no resources are allocated to it, and many other types of meetings specific to each company.

How do you navigate these situations? No idea. Some people might get away with declining meetings that they don’t have to be in, others just use the time to get some work done on their laptops. The argument that sitting in meetings is not your job does not fare so well anymore now that writing code is no longer as high up on your job description.

If only companies with prevalent ‘meeting cultures’ installed conference room counters to measure the monetary cost of each meeting. Surely the dollar figures would help curb some of that behaviour.


Exposure

What you often don’t realise as a software engineer is how much you’re being shielded from. You may have some idea, perhaps you even think of your manager as a great ‘sh*t umbrella’, but you never truly appreciate what falls on the fibres of that umbrella until you are it.

It’s not just pointless meetings but any number of things: pressures from senior management to deliver faster, inter-departmental politics, attempts from people with no expertise in what your team does to dictate how they should be doing their job, take your pick. All of this is now your business. These things might annoy you, even piss you off, but when you return to the area of the office that you and your team inhabit, you have to be cool and collected as if nothing happened. Anything you do relay to them has to be filtered and carefully phrased, otherwise you might start a chain reaction of rumours, overreactions and panic that won’t end well.

This balance between being completely transparent and as secretive as a press secretary for a totalitarian regime is incredibly hard to master. There’s no one stable solution for it. It completely depends on the culture within your team and how impressionable specific individuals in it are.

If you filter too much, you risk being perceived as a Frank Underwood-esque ‘puppet master’ who uses inside information to their own advantage. Conversely, if you’re completely transparent and relay everything that’s happened in this morning’s planning meeting to everyone, your most junior team member will start losing sleep fearing that everyone will be fired over an impulsive outburst from your CEO that engineers aren’t delivering quickly enough. Go figure.

As much of an inexcusable cliché as it is, with great power, and in this case knowledge, comes great responsibility. No matter how minor the increase in that knowledge, you can either help or hurt your team with it.

Satisfaction Not Guaranteed

A software engineer’s job is engineering software. That’s what your performance is measured by and where your job satisfaction comes from at the end of a hard day.

A manager’s job is managing these people by removing any barriers standing in their way, guiding them through an oft-complex maze of company politics, and taking on unimportant (often unpleasant) tasks that would distract them from achieving their goals. It requires you to be a selfless jack of all trades who only succeeds when his/her whole team succeeds and who takes all the blame if everything goes south.

That’s easy to say, harder to do and harder still to get satisfaction from, when only a short while ago your main objective for the day was making all of your integration tests pass. Suddenly, it’s five o’clock and all you’ve done is attend some meetings, write some emails and maybe review some code, if you’re lucky. What a frustratingly unproductive day, says your unadjusted brain. It’s draining!

Like with the last two points, I don’t have a universal solution. What helped me was making mental notes of things that you do throughout the day that provide value to somebody: you answered a question that unblocked somebody from working on their task, you found a potential bug while reviewing somebody’s code, you contributed to some design discussion, anything like that. The gratification is not as instant as running code and seeing it work, but it’s something.

No Life Sentence

Regardless of whether you’re enjoying management or not, it should not be a life sentence. There’s absolutely nothing wrong with your LinkedIn status changing from ‘Manager’ or ‘Team Lead’ to ‘Senior Software Engineer’, unless you’re the sort of person who’s a little too invested in job titles.

While I can’t speak for every hiring manager out there, I personally see alternating between ‘doing’ and ‘managing’ in people’s resumes as a positive. It means they’re never too far away from management to understand their perspective and they still know what they’re talking about from the technical standpoint. There’s nothing worse than a person in power who makes technical decisions having been off the tools for decades.

As long as there’s opportunity, focus on doing what you’re good at and what you enjoy doing, not what you think your career path has cornered you into doing. It’s not a sign of defeat or a downgrade, it just means that you missed doing things that you were helping your team do.

That reminded me of a quote from Kris Howard’s article where she announced her departure from organising technical conferences:

After being around such inspirational folks tackling big problems… I suddenly realised that I missed being those people.

That’s precisely what I did, I went back to being one of those people. It’s not that I’m completely ruling out the possibility of going into leadership again, but I’m enjoying the return to being an individual contributor for now.

Your Mileage May Vary

Here comes the disclaimer. Everything in this article is based solely on my own experiences in specific companies. While I’m sure many readers will identify with the problems that I encountered, that doesn’t mean that they are unavoidable.

Many software companies, as opposed to regular companies with software departments, encourage managers to remain hands-on. I’ve personally met a CEO of a software company with around 1000 employees who still regularly writes code.

Similarly, the ‘meeting culture’ isn’t as bad everywhere. Startups and smaller companies are often good places to escape it, because smaller businesses rely on everyone pulling their hardest to survive, unlike, say, a major bank that has enough redundancy to afford such inefficiencies.

So don’t be discouraged. Just ensure that your move into management is for all the right reasons and do your research before committing to it.

A piece of trivia for the afterword is that I started writing this article shortly after I resigned from my management job and only finished it over a year later. While it admittedly smoothed out and added perspective to some of the harsher points, none of the overall opinions have changed. Hence why I decided to publish it after all this time and not relegate it to the eternal drafts folder.

Data Science and Other Buzzwords

2018-08-12T00:00:00+00:00

Kotlin continues to conquer new areas of software development, but some are still firmly held by one or two languages. Data science, statistical analysis and machine learning are largely Python domains, but with many performance-focused implementations based on the JVM, Kotlin has a good chance to break into this scene too. The intention of this talk is to give a shallow overview of what you can do today.

KotlinConf 2017 Recap

2017-11-29T00:00:00+00:00

Summary of the better talks, the more interesting themes, some conversations with other Kotliners and overall impressions about the first ever Kotlin conference in San Francisco. Think of it as your guide to what recorded talks to watch first.

You Can, but Should You?

2017-11-02T00:00:00+00:00

Kotlin provides a mountain of features that Java developers previously never had access to. This creates endless opportunities. This also creates confusion akin to that of a kid in a candy store, which is exacerbated by the transition from simple beginner demos to production code, expected to be readable, performant and maintainable. Which to choose? Should I be doing this?