Niklas' Blog

Some words about things and the like might appear here semi-regularly. Follow @niklas@blog.niklas-meinzer.de to subscribe

For work I sometimes need to test or debug deployments on virtual machines. To do this I like to spin up a local VM in VirtualBox. Since I need to be able to SSH into it, it requires a host-only network adapter.

Whenever I provision a fresh machine of this kind, I have forgotten how to get the host-only adapter to work, and it's hard to find the right instructions online since the process has changed so much across the various Ubuntu versions.

So here it goes: How to set up a host-only adapter with Ubuntu 22.04 running in VirtualBox:

  • Make sure the host-only adapter is connected to the machine. This is done in the settings of the VM in Virtualbox. Adapters can only be added or removed when the machine is down.

  • Logged into the VM, run ip link show to list the available network interfaces. Depending on what you configured there should be a number of devices listed: most likely the loopback device, the NAT device and the host-only device, which should show as down. Note the name of the host-only device.

  • Open the netplan configuration at /etc/netplan/00-installer-config.yaml and add the host-only device as in the following example. In this case I'm assigning a static IP, but DHCP should work too.

network:
  ethernets:
    enp0s3:
      dhcp4: true
    enp0s8: # This is the network device I identified earlier
      dhcp4: no
      addresses:
        - 192.168.56.104/24
  version: 2

  • Finally, run netplan apply to apply the changes.

  • Now you should see the interface as up when running ip a.

Ok, I've been sitting on my old blog at www.niklas-meinzer.de for a long time now. I made it with Hugo, but the theme I used has since evolved so far that I can no longer update the blog without migrating it through thousands of versions. So I decided that it's not worth the time and I could give a Fediverse blog a go.

As far as I know there are three alternatives for that:

  • WordPress with the ActivityPub plugin, which... you know...
  • Plume, which right on its landing page tells you it's semi-abandoned
  • WriteFreely, which is what you're looking at now.

It's a super minimalist setup with almost no options to customize the appearance. It's all about writing. I'm not much of a writer, but I'm all about not having many options in software, so I'll give it a go!

(This post was migrated from my old blog. I have since talked about this on an episode of the podcast Test and Code.)

At work we have a large code base for one of our main products. This code base has been growing for about 5 years now and there is no end in sight. Naturally, the test suite required to test all that code has been growing with it. It started out using the unittest module from Python's standard library and was migrated to pytest as the testing framework and test runner a couple of years ago. Pytest is the de-facto standard for automated testing in Python today. Not only is it very powerful and extensible, but it's also a great example of a thriving open source project, adding new features and spitting out releases faster than many commercial products.

The test suite now contains about 1500 tests, many of which go beyond the traditional concept of a unit test. We use the web application library Werkzeug for our project, which provides a nice test client to simulate HTTP requests during tests. Such tests are probably better described as integration tests, though the lines between the different categories of tests are rather blurry.

Naturally, with the increasing number of tests the execution time also went up. In this article I want to discuss whether this is even a problem and what we can do about it.

The scenario

As mentioned above, the test suite in question is a bit of a mixed bag of “true” unit tests and integration tests which simulate a lot of the real application. This includes HTTP requests and database access. We use an SQLite in-memory database for the tests, while the production db is usually managed by PostgreSQL. Using different SQL dialects in tests and production is arguably problematic, but that's a topic for another time.

We make heavy use of pytest's fixtures to provide tests with streamlined access to the application under test. For example, the client fixture creates a db engine, initializes a session and creates all tables, before it is connected to a Werkzeug test client.
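As a rough sketch of that pattern (with sqlite3 from the standard library standing in for the real SQLAlchemy engine and Werkzeug test client, and with the table made up for illustration), such a fixture could look like this:

```python
import sqlite3

import pytest


def make_test_db():
    # Fresh in-memory database per test; the real fixture creates a
    # db engine, initializes a session and creates all tables here.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


@pytest.fixture
def client():
    conn = make_test_db()
    # The real fixture would now connect the database to a Werkzeug
    # test client and yield that instead of the raw connection.
    yield conn
    conn.close()
```

A test then simply takes client as an argument and gets a fully initialized application to talk to, with setup and teardown handled in one place.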

When I started diving into this topic the test suite had a runtime of about 5 minutes on a reasonably powerful PC without the use of pytest-xdist (i.e. on one machine and a single core only).

Why even bother?

First of all, I'd like to discuss the motivation behind this project. Why even care if the test suite is slow? Just let your continuous integration server handle it. Can a runtime of 5 minutes even be considered slow? When discussing these questions with coworkers and on Twitter, it became clear that it's very much a matter of perspective. For some people it's fine if the test suite takes long, others are annoyed by it. Some say 5 minutes is a great runtime, others find it way too slow. So here are my reasons for trying to improve the test suite's runtime:

1) A slow test suite slows me down during development and bugfixing on my local machine. While it is true that I rarely run the whole test suite while actively working on a piece of code, I do like to practise test-driven development and therefore run the tests very frequently. Even small improvements in performance can therefore be beneficial.

2) Waiting for CI sucks. When everything goes smoothly, it's no problem waiting for your continuous integration server to complete the tests and then see the green ticks appear, but when you're out hunting bugs or trying to work out why-oh-why that one test sometimes fails on CI but never on your machine, it can be so annoying to wait every time.

3) A slow test suite could be a sign that the application itself is slow.

Problem analysis

Working on any sort of performance issue, like I was here, is a very specific kind of problem which can become frustrating very quickly, especially if you spring into action too quickly. There are two very important rules:

1) Measure before you do anything. Be absolutely sure that the part of the code you'll be working on is the one that's slowing things down.

2) Optimization anywhere else but at the bottleneck(s) is irrelevant. It may be tempting to shave another couple of milliseconds off that sorting algorithm, but it's not gonna win you anything if that's not the main problem of your application.

Transferring this to the slow test suite means we must first find out why the tests are slow and what we can do about it. Of course any improvement will help the overall runtime, but it would surely be most beneficial to find issues affecting all or at least many tests.

So here are some things I found helpful when analyzing the problem:

  • Pytest comes with the pretty handy option --durations to list the n slowest tests after a run. This is a pretty low-barrier step to get a first idea of which groups of tests are causing problems.

  • The pytest-profiling plugin can be used to run the test suite with a profiler. This can give you very good insights, but reading profile data can be a bit tricky and may take some getting used to.

  • To find out how long the collection phase takes, run pytest with --collect-only and time it.
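The first and last of those checks can be sketched like this; in practice you'd run them from the shell, and the throwaway test directory below only exists to make the example self-contained:

```python
import pathlib
import tempfile

import pytest

# A throwaway test directory so the example is self-contained; in a real
# project you would point pytest at your existing tests/ directory instead.
tests_dir = pathlib.Path(tempfile.mkdtemp())
(tests_dir / "test_example.py").write_text("def test_ok():\n    assert True\n")

# Equivalent to `pytest --durations=10 <dir>` on the command line:
# run the tests and report the 10 slowest ones afterwards.
rc_run = pytest.main(["--durations=10", str(tests_dir)])

# Equivalent to `pytest --collect-only -q <dir>`: only collect tests,
# don't run them. Timing this (e.g. with `time` in the shell) measures
# the collection phase in isolation.
rc_collect = pytest.main(["--collect-only", "-q", str(tests_dir)])
```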

Fixing stuff

After profiling and analyzing the problem a bit, I found out that I had three core problems: the collection phase takes very long; tests which use a database are somewhat slow (that's about 80 % of all tests); and some tests unnecessarily produce a PDF on the side by making an external call to pdflatex. So let's see how I went about fixing these issues.

Collection time

The collection phase of the test suite took about 25 seconds. Now that's not too long for a complete run, but keep in mind that pytest needs to collect all tests even if you only want to run a small subset, so this was very annoying during development.

To do anything about that it's important to understand how pytest discovers tests: for each test run you give pytest at least one file or directory to find tests in. This can be done via the command line or a pytest.ini config file. Pytest will then import all Python modules from these targets and look for functions and classes defining tests (usually functions with a name starting with test_, but this behavior can be customized). In our case, although we have all tests placed in a tests directory, we had pytest pointed at the root directory of our code base. That way it did discover and run a couple of legacy doctests we had lying around from before the pytest days. But it also means that before each run, pytest had to import (and possibly compile) the entire code base.

Pointing pytest directly at the tests directory greatly improved the collection time, but of course the doctests were no longer found. Left with the decision of either rewriting the doctests as proper tests in the tests directory or giving the 4 or 5 files containing the doctests to pytest as additional test targets, we opted for the latter and decided not to add any more doctests in the future.
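A pytest.ini along these lines captures that setup; the module names are made up for illustration, and --doctest-modules is what tells pytest to collect the doctests at all:

```ini
[pytest]
# Collect from the tests directory plus the few legacy modules that still
# contain doctests, instead of scanning the whole code base.
testpaths =
    tests
    ourpackage/legacy_one.py
    ourpackage/legacy_two.py
addopts = --doctest-modules
```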

Avoiding external calls

Finding out that a lot of PDFs were generated on the side for each test run was... interesting. I wanted to fix this globally without having to go into each single test and mock it out.


import os
import shutil
from subprocess import Popen
from unittest import mock

import pytest

# TESTPAGE_PATH (the path to a known-good PDF that stands in for real
# pdflatex output) is defined elsewhere in the test suite.


@pytest.fixture(scope="function", autouse=True)
def pdftools_popen_mock(request):
    """
        Mocks away the Popen calls in tools.pdf_export. There are two external processes which
        are called from that module:

        * pandoc: The mock just returns an empty string
        * pdflatex: A copy of the testpage pdf is placed in the requested output dir, so a valid
                    Pdf is available.

        The mock of pdflatex can be disabled by marking the test with
        @pytest.mark.dont_mock_pdflatex
    """

    pdflatex = mock.Mock(returncode=0)
    pandoc = mock.Mock(communicate=mock.Mock(return_value=[b""]))

    def filtered_Popen(*args, **kwargs):
        if args[0][0] == "pdflatex" and "dont_mock_pdflatex" not in request.keywords:
            output_dir = args[0][2]
            shutil.copy(TESTPAGE_PATH, os.path.join(output_dir, "export.pdf"))
            return pdflatex
        elif args[0][0] == "pandoc":
            return pandoc
        return Popen(*args, **kwargs)

    with mock.patch("chemocompile.tools.pdf_export.Popen", side_effect=filtered_Popen) as m:
        yield m
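Opting back into the real pdflatex call in an individual test then only takes the marker; the test below is a hypothetical example:

```python
import pytest


@pytest.mark.dont_mock_pdflatex
def test_real_pdf_export():
    """Runs the real pdflatex, because the marker above disables the mock."""
```

Newer pytest versions warn about unknown markers unless they are registered under markers = in the config file, so the marker name should be declared there as well.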

(This post was migrated from my old blog)

While open source software in general can be considered a huge success – most of the internet runs on it – games in particular were never really its strong suit. Sure, most classic card or board games like Solitaire, Chess, Go or Mahjongg have been implemented as open source apps and can, for example, be found in many Linux distributions.

But what about modern, innovative games? Well, there's SuperTuxKart, a 3D racing game, Secret Maryo Chronicles – similarities with a certain Italian plumber are completely coincidental – and the real-time strategy game Warzone 2100. I'll let you decide if you consider them on par with contemporary commercial games.

But there's one game that, at least in my opinion, can take on the for-profit competition and that is Battle for Wesnoth.

This turn-based strategy game takes you to the magical world of Wesnoth, which is inhabited by all the factions we know from other fantasy universes: Humans, Orcs, Dwarves, Elves, Dragons, Undead and many more.

The basic principle is simple: you recruit units in your keep and move them turn by turn across the map to fight one or more opponents. Recruiting units costs money and so does their upkeep; the more units you have, the more money you need each turn. To increase your income you need to control as many of the villages scattered throughout the map as possible. Units gain experience and can “evolve” into stronger units as the game progresses. In the campaigns you can carry your experienced warriors over to the next scenario. It is this RPG aspect that makes the game highly addictive.

The game comes with a number of single player campaigns and once you're done with them you can look through the never ending catalogue of fan created stories to play. Or you can take on your friends or other players in multiplayer.

So what is it that this game has, that other open source games lack? Art.

Instead of being just a project run by a bunch of programmers, the Wesnoth community also consists of a large number of musicians, writers, and sound and graphic designers. The 2D graphics are a joy to look at, and most of the stories are well written and exciting. And just listen to the main theme!

There's a whole soundtrack that will accompany you throughout the game which could just as well be part of a commercial product.

So where can I play it?

Battle for Wesnoth is now available for Linux, Mac, Windows, iOS and Android. You can find instructions on how to get it on the website.

But the really exciting news is: Wesnoth will be coming to Steam. The team successfully completed their campaign on (the now discontinued) Steam Greenlight. After a long process of almost two years they have now announced that the game will become available on Steam on April 13th 2018. (Ok, they say it's a “tentative” release date.) I'm excited to see how this game will perform when measured against the heavyweights of the gaming industry.

Now what are you waiting for? Go play!