January 27, 2021

Call for Project ideas for Google Summer of Code 2021

It is that time of the year again when we start gathering ideas for Google Summer of Code.

This time around we will be posting and discussing proposals in GNOME’s GitLab instance. Therefore, if you have a project idea that fits Google Summer of Code, please file an issue at https://gitlab.gnome.org/Teams/Engagement/gsoc-2021/-/issues/new using the “Proposal” template.

Everybody is welcome to add ideas, but please verify that the ideas are realistic and that mentorship for them will be available. We encourage you to discuss your ideas with designers in #gnome-design to get their input and plan collaboration, especially if your ideas are related to one of the core GNOME modules.

Keep in mind that there are a few changes in GSoC this year:

  1. Smaller project size – all students participating in the 2021 program will be working on a 175 hour project (instead of a 350 hr project). This change will also result in a few other changes including the student stipend being cut in half.
  2. Shortened coding period – the coding period will be 10 weeks with a lot more flexibility for the mentor and student to decide together how they want to spread the work out over the summer. Some folks may choose to stick to a 17-18 hour a week schedule with their students, others may factor in a couple of breaks during the program (for student and mentor) and some may have students focus 30 hours a week on their project so they wrap up in 6 weeks. This also makes it a lot easier for students with finals or other commitments (weddings, etc.) to adjust their schedules.
  3. 2 evaluations (instead of 3) – There will be an evaluation after 5 weeks and the final evaluation will take place after the 10th week. We are also no longer requiring students complete their first evaluation (though we encourage them to do so), so if a student doesn’t complete the first evaluation they will not automatically be removed from the program. They are still required to complete the final evaluation.
  4. Eligibility requirements – In 2020 there are many ways students are learning and we want to acknowledge that so we will be allowing students who are 18 years old AND currently enrolled (or accepted into) a post-secondary academic program as of May 17, 2021 or have graduated from a post-secondary academic program between December 1, 2020 and May 17, 2021 to apply to the GSoC program.

If you have any doubts, please don’t hesitate to contact the GNOME GSoC Admins on Discourse or https://chat.gnome.org/channel/outreach

** This is a repost from https://discourse.gnome.org/t/call-for-project-ideas-for-google-summer-of-code-2021/5454 to reach a broader audience. Please share! **

January 26, 2021

Hitting a milestone – Beat Saber!

I hit an important OpenHMD milestone tonight – I completed a Beat Saber level using my Oculus Rift CV1!

I’ve been continuing to work on integrating Kalman filtering into OpenHMD, and on improving the computer vision that matches and tracks device LEDs. While I suspect no one will be completing Expert levels just yet, it’s working well enough that I was able to play through a complete level of Beat Saber. For a long time this has been my mental benchmark for tracking performance, and I’m really happy 🙂

Check it out:

I should admit at this point that completing this level took me multiple attempts. The tracking still has quite a tendency to lose track of controllers, or to get them confused and swap hands suddenly.

I have a list of more things to work on. See you at the next update!

January 24, 2021

Outreachy Progress Report

I’m more than halfway through my Outreachy internship at the GNOME Foundation. Time flies so fast, right? I’m a little emotional cuz I don’t want this fun adventure to end so soon. Just roughly five weeks to go!!
Oh well, let’s find out what I’ve been able to achieve over the past eight weeks and what my next steps are…

My internship project is to complete the integration between the GNOME Translation Editor (previously known as Gtranslator) and Damned Lies (DL). This integration involves enabling users to reserve a file for translation directly from the Translation Editor and permitting them to upload po files to DL.

In case you don’t understand any of the terminology or haven’t heard about these projects before, kindly read this blog post and it’ll clear up all your doubts.

Let’s move forward!

So far, here are the things I’ve been able to accomplish:

  1. Set up a button and the required functions for the “reserve for translation” feature.
  2. Set up a dialog and associated functions for the “upload file” feature.
  3. I added the module state to the user interface, which comes in very handy to let users know what state a po file is in within the Vertimus workflow. This lets them know what actions can be performed.
  4. In addition, users can now see custom DL headers on po files downloaded from DL or opened locally and even edit the headers as required from the edit header dialog.
  5. Added some endpoints to the DL REST API to authenticate (Token Authentication) a user who wants to perform the “reserve for translation” or “upload file” operation.

What’s left?

  1. The endpoints I added to the DL REST API need to be approved and merged, so that I can write real queries and complete the functions I’ve created for the two features mentioned above. I got some feedback on my merge request, which I’m currently working on.
  2. Ensure that memory has been managed properly and handle security vulnerabilities.
  3. Extensive testing and documentation.
  4. Final evaluation and wrapping up.

The big question is: am I on track?? Oh well, let’s find out.

I’ve completed 8 out of 13 weeks of my internship. According to my timeline, I’m supposed to have a version one of both features working properly. I’m about 90% of the way to achieving this goal.

Since my project is dependent on another project, it’s a little challenging to go as fast as possible. This is because communication needs to be done with the community members managing the DL project and getting feedback or convincing them on why my changes are necessary takes quite some time.

Next Steps:

  • I need to push hard for the DL team to merge my changes in time.
  • Stay focused and complete the tasks I have left.

I’m glad all is well, and with some determination and push, I should deliver my project in time.

2021-01-24 Saturday

  • Up lateish, back to slides, worked through the day. Put family to bed, catch up with Ash, reviewed & merged some patches. Listened to LCA talks, interesting.

January 23, 2021

2021-01-23 Friday

  • Admin, sync with Eloy, and Kendy, mail catch-up. Poked at some calc profiles & fixed a couple of rather silly drop-offs. Plugged away at the next invalidation multiplexing issue. Worked late on slides for LCA.

January 22, 2021

Launching Endless OS Foundation

Passion Led Us Here

How our for-profit company became a nonprofit, to better tackle the digital divide.

Originally posted on the Endless OS Foundation blog.

An 8-year journey to a nonprofit

On the 1st of April 2020, our for-profit Endless Mobile officially became a nonprofit as the Endless OS Foundation. Our launch as a nonprofit just as the global pandemic took hold was, predictably, hardly noticed, but for us the timing was incredible: as the world collectively asked “What can we do to help others in need?”, we framed our mission statement and launched our .org with the same very important question in mind. Endless always had a social impact mission at its heart, and the challenges related to students, families, and communities falling further into the digital divide during COVID-19 brought new urgency and purpose to our team’s decision to officially step into the social welfare space.

On April 1st 2020, our for-profit Endless Mobile officially became a nonprofit as the Endless OS Foundation, focused on the #DigitalDivide.

Our updated status was a long time coming: we began our transformation to a nonprofit organization in late 2019 with the realization that the true charter and passions of our team would be greatly accelerated without the constraints of for-profit goals, investors and sales strategies standing in the way of our mission of digital access and equity for all. 

But for 8 years we made a go of it commercially, headquartered in Silicon Valley and framing ourselves as a tech startup with access to the venture capital and partnerships on our doorstep. We believed that a successful commercial channel would be the most efficient way to scale the impact of bringing computer devices and access to communities in need. We still believe this – we’ve just learned through our experience that we don’t have the funding to enter the computer and OS marketplace head-on. With the social impact goal first, and the hope of any revenue a secondary goal, we have had many successes in those 8 years bridging the digital divide throughout the world, from Brazil, to Kenya, and the USA. We’ve learned a huge amount which will go on to inform our strategy as a nonprofit.

Endless always had a social impact mission at its heart. COVID-19 brought new urgency and purpose to our team’s decision to officially step into the social welfare space.

Our unique perspective

One thing we learned as a for-profit is that the OS and technology we’ve built has some unique properties which are hugely impactful as a working solution to digital equity barriers. And our experience deploying in the field around the world for 8 years has left us uniquely informed via many iterations and incremental improvements.

Endless OS designer in discussion with prospective user

With this knowledge in hand, we’ve been refining our strategy throughout 2020 and are now starting to focus on what it really means to become an effective nonprofit and make that impact. In many ways it is liberating to abandon the goals and constraints of being a for-profit entity, and in other ways it’s been a challenging journey for me and the team to adjust our way of thinking and let these for-profit notions and models go. Previously we exclusively built and sold a product that defined our success; any impact we achieved was a secondary consequence of that success and seen through that lens. Now our success is defined purely in terms of social impact, and through our actions, those positive impacts can be made with or without our “product”. That means that we may develop and introduce technology to solve a problem, but it is equally valid to find another organization’s existing offering and design a way to increase that positive impact and scale.

We develop technology to solve access equity issues, but it’s equally as valid to find another organization’s offering and partner in a way that increases their positive impact.

The analogy to Free and Open Source Software is very strong – while Endless has always used and contributed to a wide variety of FOSS projects, we’ve also had a tension where we’ve been trying to hold some pieces back and capture value – such as our own application or content ecosystem, our own hardware platform – necessarily making us competitors to other organisations even though they were hoping to achieve the same things as us. As a nonprofit we can let these ideas go and just pick the best partners and technologies to help the people we’re trying to reach.

School kids writing on paper

Digital equity … 4 barriers we need to overcome

In future, our decisions around which projects to build or engage with will revolve around 4 barriers to digital equity, and how our Endless OS, Endless projects, or our partners’ offerings can help to solve them. We define these 4 equity barriers as: barriers to devices, barriers to connectivity, barriers to literacy in terms of your ability to use the technology, and barriers to engagement in terms of whether using the system is rewarding and worthwhile.

We define the 4 digital equity barriers we exist to impact as:
1. barriers to devices
2. barriers to connectivity
3. barriers to literacy
4. barriers to engagement

It doesn’t matter who makes the solutions that break these barriers; what matters is how we assist in enabling people to use technology to gain access to the education and opportunities these barriers block. Our goal therefore is to simply ensure that solutions exist – building them ourselves and with partners such as the FOSS community and other nonprofits – proving them with real-world deployments, and sharing our results as widely as possible to allow for better adoption globally.

If we define our goal purely in terms of whether people are using Endless OS, we are effectively restricting the reach and scale of our solutions to the audience we can reach directly with Endless OS downloads, installs and propagation. Conversely, partnerships that scale impact are a win-win-win for us, our partners, and the communities we all serve. 

Engineering impact

Our Endless engineering roots and capabilities feed our unique ability to build and deploy all of our solutions, and the practical experience of deploying them gives us evidence and credibility as we advocate for their use. Either activity would be weaker without the other.

Our engineering roots and capabilities feed our unique ability to build and deploy digital divide solutions.

Our partners in various engineering communities will have already seen our change in approach. Particularly, with GNOME we are working hard to invest in upstream and reconcile the long-standing differences between our experience and GNOME. If successful, many more people can benefit from our work than just users of Endless OS. We’re working with Learning Equality on Kolibri to build a better app experience for Linux desktop users and bring content publishers into our ecosystem for the first time, and we’ve also taken our very own Hack, the immersive and fun destination for kids learning to code, released it for non-Endless systems on Flathub, and made it fully open-source.

Planning tasks with sticky notes on a whiteboard

What’s next for our OS?

What then is in store for the future of Endless OS, the place where we have invested so much time and planning through years of iterations? For the immediate future, we need the capacity to deploy everything we’ve built – all at once, to our partners. We built an OS that we feel is very unique and valuable, containing a number of world-firsts: first production OS shipped with OSTree, first Flatpak-only desktop, built-in support for updating OS and apps from USBs, while still providing a great deal of reliability and convenience for deployments in offline and educational-safe environments with great apps and content loaded on every system.

However, we need to find a way to deliver this Linux-based experience in a more efficient way, and we’d love to talk if you have ideas about how we can do this, perhaps as partners. Can the idea of “Endless OS” evolve to become a spec that is provided by different platforms in the future, maybe remixes of Debian, Fedora, openSUSE or Ubuntu? 

Build, Validate, Advocate

Beyond the OS, the Endless OS Foundation has identified multiple programs to help underserved communities, and in each case we are adopting our “build, validate, advocate” strategy. This approach underpins all of our projects: can we build the technology (or assist in the making), will a community in-need validate it by adoption, and can we inspire others by telling the story and advocating for its wider use?

We are adopting a “build, validate, advocate” strategy.
1. build the technology (or assist in the making)
2. validate by community adoption
3. advocate for its wider use

As examples, we have just launched the Endless Key as an offline solution for students during the COVID-19 at-home distance learning challenges. This project is also establishing a first-ever partnership of well-known online educational brands to reach an underserved offline audience with valuable learning resources. We are developing a pay-as-you-go platform and new partnerships that will allow families to own laptops via micro-payments that are built directly into the operating system, even if they cannot qualify for standard retail financing. And during the pandemic, we’ve partnered with Teach For America to focus on very practical digital equity needs in the USA’s urban and rural communities.

One part of the world-wide digital divide solution

We are one solution provider for the complex matrix of issues known collectively as the #DigitalDivide, and these issues will not disappear after the pandemic. Digital equity was an issue long before COVID-19, and we are not so naive as to think it can be solved by any single institution, or by the time the pandemic recedes. It will take time and a coalition of partnerships to win. We are in for the long haul and we are always looking for partners, especially now as we are finding our feet in the nonprofit world. We’d love to hear from you, so please feel free to reach out to me – I’m ramcq on IRC, RocketChat, Twitter, LinkedIn or rob@endlessos.org.

Auto-updating XKB for new kernel keycodes

Your XKB keymap contains two important parts. One is the mapping from the hardware scancode to some internal representation, for example:

  <AB10> = 61;  

Which basically means Alphanumeric key in row B (from bottom), 10th key from the left. In other words: the /? key on a US keyboard.

The second part is mapping that internal representation to a keysym, for example:

  key <AB10> {        [     slash,    question        ]       }; 

This is the actual layout mapping - once in place this key really produces a slash or question mark (on level2, i.e. when Shift is down).

This two-part approach exists so either part can be swapped without affecting the other. Swap the second part to an exclamation mark and paragraph symbol and you have the French version of this key, swap it to dash/underscore and you have the German version of the key - all without having to change the keycode.
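To make the two-part split concrete, here is a minimal sketch in Python rather than XKB syntax. The table and function names are made up for illustration, and the French and German entries simply follow the description above; only the second table changes between layouts.

```python
# Part 1: keycode name -> numeric keycode, shared by every layout.
KEYCODES = {"AB10": 61}

# Part 2: keycode name -> keysyms per shift level, one table per layout.
# Keysym names are the ones from X11/keysymdef.h without the XK_ prefix.
SYMBOLS = {
    "us": {"AB10": ("slash", "question")},
    "fr": {"AB10": ("exclam", "section")},
    "de": {"AB10": ("minus", "underscore")},
}


def lookup(layout, keycode, shift=False):
    """Resolve a numeric keycode to a keysym name for the given layout."""
    names = {code: name for name, code in KEYCODES.items()}
    levels = SYMBOLS[layout][names[keycode]]
    return levels[1 if shift else 0]


print(lookup("us", 61))              # slash
print(lookup("fr", 61, shift=True))  # section
```

Swapping layouts never touches KEYCODES, which is the whole point of keeping the two parts separate.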

Back in the golden days of everyone-does-what-they-feel-like, keyboard manufacturers (presumably happily so) changed the key codes and we needed model-specific keycodes in XKB. The XkbModel configuration is a leftover from these trying times.

The Linux kernel's evdev API has largely done away with this. It provides a standardised set of keycodes, defined in linux/input-event-codes.h, and ensures, with the help of udev [0], that all keyboards actually conform to that. An evdev XKB keycode is a simple "kernel keycode + 8" [1] and that applies to all keyboards. On top of that, the kernel uses semantic definitions for the keys as they'd be in the US layout. KEY_Q is the key that would, behold!, produce a Q. Or an A in the French layout because they just have to be different, don't they? Either way, with evdev the XkbModel configuration largely points to nothing and only wastes a few cycles with string parsing.
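As a quick sanity check of the "+ 8" rule, here is a small Python sketch. The two kernel values are copied from linux/input-event-codes.h; the helper names are invented for this example, and the 255 ceiling reflects X's 8-bit keycodes.

```python
KEY_Q = 16      # from linux/input-event-codes.h
KEY_SLASH = 53  # from linux/input-event-codes.h

EVDEV_OFFSET = 8       # XKB keycode = kernel keycode + 8
X11_MAX_KEYCODE = 255  # X keycodes are 8 bits wide


def evdev_to_xkb(kernel_code):
    """Convert a kernel keycode to the evdev XKB keycode."""
    return kernel_code + EVDEV_OFFSET


def usable_in_x11(kernel_code):
    """True if the resulting XKB keycode still fits into 8 bits."""
    return evdev_to_xkb(kernel_code) <= X11_MAX_KEYCODE


print(evdev_to_xkb(KEY_SLASH))  # 61, the <AB10> value shown earlier
print(evdev_to_xkb(KEY_Q))      # 24
```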

The second part, the keysym mapping, uses two approaches. One is to use a named #define like the "slash", "question" outlined above (see X11/keysymdef.h for the defines). The other is to use Unicode directly like this example from the Devanagari layout:

  key <AB10> { [ U092f, U095f, slash, question ] };

As you can see, mix and match is available too. Using Unicode code points of course makes the layouts less immediately readable but on the other hand we don't need to #define the whole of Unicode. So from a maintenance perspective it's a win.

However, there's a third type of key that we care about: functional keys. Those are the multimedia (historically: "internet") keys that most devices have these days. Volume up, touchpad on/off, cycle display connectors, etc. Those keys are special in that they don't have a Unicode representation and they are always mapped to the same fixed functionality. Even Dvorak users want their volume keys to do what it says on the key.

Because they have no Unicode code points, those keys are defined, historically, in XF86keysyms.h:

  #define XF86XK_MonBrightnessUp    0x1008FF02  /* Monitor/panel brightness */

And mapping a key like this looks like this [2]:

  key <I21>   {       [ XF86Calculator        ] };

The only drawback: every key needs to be added manually. This has been done for some, but not for others. And some keys were added with different names than what the kernel uses [3].

So we're in this weird situation where we have a flexible keymap system but the kernel already tells us what a key does anyway and we don't want to change that. Virtually all keys added in the last decade or so fall into that group of keys, but to actually make use of them requires a #define in xorgproto and an update to the keycodes and symbols in xkeyboard-config. That again introduces discrepancies and we end up in the situation where we're at right now: some keys don't work until someone files a bug, and then the users still need to wait for several components to be released and those releases to trickle into the distributions.

10 years ago would've been a good time to make this more efficient. The situation wasn't that urgent then, most of the kernel keycodes added are >255 which means they cannot be used in X anyway. [4] The second best time to do it is now. What we need is basically a pass-through from kernel code to symbol and that's currently sitting in various MRs:

- xkeyboard-config can generate the keycodes/evdev file based on the list of kernel keycodes, so all kernel keycodes are mapped to internal representations by default

- xorgproto has reserved a range within the XF86 keysym reserved range for pass-through mappings, i.e. any KEY_FOO define from the kernel is mapped to XF86XK_Foo with a specific value [5]. The #define format is fixed so it can be parsed.

- xkeyboard-config parses these XF86 keysyms and sets up a keysym mapping in the default keymap.

This is semi-automatic, i.e. there are helper scripts that detect changes and notify us, hooked into the CI, but the actual work must be done manually. These keysyms immediately become set-in-stone API so we don't want some unsupervised script to go wild on them.
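To illustrate why a fixed #define format matters, here is a rough Python sketch of the kind of parsing a helper script could do. It is not the actual xkeyboard-config tooling; the regular expression and function name are assumptions, and the sample line is the MonBrightnessUp define quoted earlier.

```python
import re

# Hypothetical pattern: "#define XF86XK_<Name> <hex value> /* optional comment */"
DEFINE_RE = re.compile(
    r"^#define\s+(XF86XK_\w+)\s+(0x[0-9A-Fa-f]+)(?:\s*/\*\s*(.*?)\s*\*/)?\s*$"
)


def parse_keysym_define(line):
    """Return (name, value, comment) for a keysym #define, or None."""
    match = DEFINE_RE.match(line.strip())
    if match is None:
        return None
    name, value, comment = match.groups()
    return name, int(value, 16), comment


line = "#define XF86XK_MonBrightnessUp    0x1008FF02  /* Monitor/panel brightness */"
print(parse_keysym_define(line))
```

Any line that deviates from the expected shape yields None, which is a cheap way for a CI job to flag it for a human instead of guessing.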

There's a huge backlog of keys to be added (dating to kernels pre-v3.18) and I'll go through them one-by-one over the next weeks to make sure they're correct. But eventually they'll be done and we'll have a full keymap, with all kernel keys immediately available in the XKB layout.

The last part of all of this is a calendar reminder for me to do this after every new kernel release. Let's hope this crucial part isn't the first to fail.

[0] 60-keyboard.hwdb has a mere ~1800 lines!
[1] Historical reasons, you don't want to know. *jedi wave*
[2] the XK_ part of the key name is dropped, implementation detail.
[3] This can also happen when a kernel define is renamed/aliased but we cannot easily do so for this header.
[4] X has an 8 bit keycode limit and that won't change until someone develops XKB2 with support for 32-bit keycodes, i.e. never.

[5] The actual value is an implementation detail and no client must care

January 21, 2021

Threaded input adventures

Come around and gather, in this article we will talk about how Mutter got an input thread in the native backend.

Weaver, public domain (Source)

A trip down memory lane

Mutter wasn’t always a self-contained compositor toolkit; in the past it relied on the Clutter and Cogl libraries for all the benefits usually brought by toolkits: being able to draw things on screen, and being able to receive input.

In the rise of Wayland, that reliance on an external toolkit drove many of the design decisions around input management, usually involving adding support in the toolkit, and the necessary hooks so Mutter could use or modify the behavior. It was unavoidable that both sides were involved.

Later on, Mutter merged its own copies of Clutter and Cogl, but the API barrier stayed essentially the same at first. Slowly over time, and still ongoing, we’ve been refactoring Mutter so all the code that talks to the underlying layers of your OS lives together in src/backends, taking this code away from Clutter and Cogl.

A quick jump to the near past

However, in terms of input, the Clutter API barrier did still exist for the most part, it was still heavily influenced by X11 design, and was pretty much used as it was initially designed. Some examples, no special notoriety or order:

  • We still forwarded input axes in a compact manner, that requires querying the input device to decode event->motion.axes[3] positions into CLUTTER_INPUT_AXIS_PRESSURE. This space-saving peculiarity comes straight from XIEvent and XIQueryDevice.
  • Pointer constraints were done by hooking a function that could impose the final pointer position.
  • Emission of wl_touch.cancel had strange hooks into libinput event handling, as the semantics of CLUTTER_TOUCH_CANCEL varied slightly.

Polishing these interactions so the backend code stays more self-contained has been a very significant part of the work involved.

Enter the input thread

The main thread context is already a busy place, in the worst case (and grossly simplified) we:

  • Dispatch several libinput events, convert them to ClutterEvents
  • Process several ClutterEvents across the stage actors, let them queue state changes
  • Process the frame clock
    • Relayout
    • Repaint
  • Push the Framebuffer changes

All in the course of a frame. The input thread takes the first step out of that process. For this to work seamlessly the thread needs a certain degree of independence: it needs to produce ClutterEvents and know where the pointer will end up without any external agents. For example:

  • Device configuration
  • Pointer barriers
  • Pointer locks/constraints

The input thread takes over all these. There is of course some involvement from the main thread (e.g. specifying what barriers or constraints are in effect, or virtual input), but these synchronization points are either scarce, or implicitly async already.

The main goal of the input thread is to provide the main thread with ClutterEvents, with the fact that they are produced in a distinct thread being irrelevant. In order to do so, all the information derived from them must be independent of the input thread state. ClutterInputDevice and ClutterInputDeviceTool (representing input devices and drawing tablet tools) are consequently morphing into immutable objects, all changes underneath (e.g. configuration) are handled internally in the input thread, and abstracted away in the emitted events.
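The general shape of this design can be sketched in a few lines of Python. This is only an illustration of the pattern, not Mutter's code: the Event class, the 100x100 clamping area standing in for pointer constraints, and the function names are all hypothetical.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)  # immutable: consumers never see later changes
class Event:
    device: str
    x: float
    y: float


def input_thread(raw_events, out):
    """Turn relative motion into absolute, already-constrained events."""
    x = y = 0.0
    for device, dx, dy in raw_events:
        # The constraint is applied here, without asking the main thread.
        x = min(max(x + dx, 0.0), 100.0)
        y = min(max(y + dy, 0.0), 100.0)
        out.put(Event(device, x, y))
    out.put(None)  # sentinel: no more input


def run(raw_events):
    """Main-thread side: consume events without caring who produced them."""
    out = queue.Queue()
    worker = threading.Thread(target=input_thread, args=(raw_events, out))
    worker.start()
    events = []
    while (event := out.get()) is not None:
        events.append(event)
    worker.join()
    return events


print(run([("mouse", 10, 5), ("mouse", 200, -50)]))
```

Because each event carries its final position and cannot be mutated, the consumer needs no access to the producer's state.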

“The Dark Side of the Loom” by aldoaldoz is licensed under CC BY-NC-SA 2.0

What it brings today

Having a thread always ready to dispatch libinput may sound like a small part in the complexity involved to give you a new frame, but it does already bring some benefits:

  • Libinput events are always dispatched ASAP, so this will mean less “client bug: event processing lagging behind by XXms” messages in the journal.
  • Input handling not being possibly stalled by the rest of the operations in the main thread means fewer awkward situations where we don’t process events in time (e.g. a key release stopping key repeat, at least in the compositor side).
  • With the cursor logical position being figured alone by the input thread, updating the cursor plane position to reflect the most up-to-date position for the next frame does simply require asking the input thread for it.
  • Generally, a tidier organization of input code where fewer details leak outside the backend domain.

What it does not bring (yet)

Code is always halfway to a better place, and the merged work does not yet achieve everything that could be achieved. Here are some things you shouldn’t expect to see fixed yet:

  • The main thread is still in charge of KMS, and of updating the cursor plane buffer and position. This means the pointer cursor will still freeze if the main thread stalls, despite the input thread handling events underneath. In the future, there would be another separate thread handling atomic KMS operations, so it’d be possible for the input and KMS threads to talk to each other and bypass any main thread stalls.
  • The main thread still has some involvement in handling of Ctrl+Alt+Fn, should you need to switch to another TTY while hard-locked. Making it fully handled in the input thread would be a small nicety for developers, perhaps a future piece of work.
  • Having an input handling that is unblocked by almost anything else is a prerequisite for handling 1000Hz mice and other high-frequency input devices. But the throttling behavior towards those is unchanged, a better behavior should be expected in the short term.


We’ve so far been working really hard in making Mutter as fast and lock free as possible. This is the first step towards a next level in design that is internally protective against stall situations.

January 20, 2021

Flexbox Cats (a.k.a fixing images in flexbox)

In my previous post I discussed my most recent contributions to flexbox code in WebKit mainly targeted at reducing the number of interoperability issues among the most popular browsers. The ultimate goal was of course to make the life of web developers easier. It got quite some attention (I loved Alan Stearns’ description of the post) so I decided to write another one, this time focused on the changes I recently landed in WebKit (Safari’s engine) to improve the handling of elements with aspect ratio inside flexbox, a.k.a. make images work inside flexbox. Some of them have already been released in Safari Technology Preview 118 so it’s now possible to help test them and provide early feedback.

(BTW if you wonder about the blog post title I couldn’t resist the temptation of writing “Flexbox Cats” which sounded really great after the previous “Flexbox Gaps”. After all, image support was added to the Web just to post pictures of 🐱, wasn’t it?)

Same as I did before, I think it’d be useful to review some of the more relevant changes with examples so you could have any of those so inspiring a-ha moments when you realize that the issue you just couldn’t figure out was actually a problem in the implementation.

What was done

Images as flex items in column flows

Web engines are in charge of taking an element tree and accompanying CSS, and creating a box tree from them. All of this relies on Formatting Contexts. Each formatting context has specific ideas about how layout behaves. Both flex and grid, for example, created new, interesting formatting contexts which allow them to size their children by shrinking and/or stretching them. But how all this works can vary. While there is “general” box code that is consulted by each formatting context, there are also special cases which require specialized overrides. Replaced elements (images, for example) should work a little differently in flex and grid containers. Consider this:
.flexbox {
    display: flex;
    flex-direction: column;
    height: 500px;
    justify-content: flex-start;
    align-items: flex-start;
}

.flexbox > * {
    flex: 1;
    min-width: 0;
    min-height: 0;
}

<div class="flexbox">
      <img src="cat1.jpg">
</div>
Ideally, the aspect ratio of the replaced element (the image, in the example) would be preserved as the flex context calculated its size in the relevant direction (column is the block direction/vertical in western writing modes, for example)… But in WebKit, it wasn’t. It is now.

Black and white cat by pixabay

Images as flex items in row flows

This second issue is kind of the specular twin of the previous one. The same issue that existed for block sizes was also there for inline sizes. Overriding inline sizes were not used to compute block sizes of items with aspect ratio (again the intrinsic inline size was used) and thus the aspect ratio of the image (replaced elements in general) was not preserved at all. Some examples of this issue:
.flexbox {
  display: flex;
  flex-direction: row;
  width: 500px;
  justify-content: flex-start;
  align-items: flex-start;
}

.flexbox > * {
  flex: 1;
  min-width: 0;
  min-height: 0;
}

<div class="flexbox">
    <img src="cat2.jpg">
</div>

Gray Cat by Gabriel Criçan

Images as flex items in auto-height flex containers

The two fixes above allowed us to “easily” fix this one, because we can now rely on the computations done by the replaced elements code to compute sizes for items with an aspect ratio even if they’re inside special formatting contexts such as grid or flex. This fix was precisely about delegating that computation to the replaced elements code instead of duplicating all the aspect-ratio machinery in the flexbox code. This fix apparently has the potential to be a game changer:
This is a key bug to fix so that Authors can use Flexbox as intended. At the moment, no one can use Flexbox in a DOM structure where images are flex children.

Jen Simmons in bug 209983
Also don’t miss the opportunity to check this visually appealing demo by Jen which should work as expected now. For those of you not having a WebKit based browser I’ve recorded a screencast for you to compare (all circles should be round).
Left: old WebKit. Right: new WebKit (tested using WebKitGtk)
Apart from the screencast, I’m also showcasing the issue with some actual code.
.flexbox {
    width: 500px;
    display: flex;
}

.flexbox > * {
    min-width: 0;
}

<div class="flexbox">
  <img style="flex: auto;" src="cat3.jpg">
</div>

Tabby Cat by Bekka Mongeau

Flexbox additional cases for definite sizes

This was likely the trickiest one. I remember having nightmares with all the definite/indefinite stuff back when I was implementing grid layout with other Igalia colleagues. The whole thing about definite/indefinite sizes, although sensible and relatively easy to understand, is actually a huge challenge for web engines, which were not really designed with them in mind. Laying out web content traditionally means taking a width as input to produce a height as output. However, formatting contexts like grid or flex make the whole picture much more complicated.
This particular issue was not a malfunction but something that was not implemented. Essentially the flex specs define some cases where indefinite sizes should be considered as definite although the general rule considers them indefinite. For example, if a single-line flex container has a definite cross size we could assume that flex items have a definite size in the cross axis which is indeed equal to the flex container inner cross size.
In the following example the flex item, the image, has height:auto (by default) which is an indefinite size. However the flex container has a definite height (a fixed 300px). This means that when laying out the image, we could assume that its height is definite and equal to the height of the container. Having a definite height then allows you to properly compute the width using an aspect ratio.
.flexbox {
    display: flex;
    width: 0;
    height: 300px;
}

<div class="flexbox">
  <img src="cat4.png">
</div>

White and Black Cat With Blue Eyes by Thomas Svensson
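
To make the arithmetic concrete, here is a minimal Python sketch of resolving a width from a definite height through the aspect ratio. The 640×480 intrinsic size and the function name are made up for illustration; this is not WebKit code, and it ignores flexing, min/max constraints, borders and padding.

```python
from fractions import Fraction

def width_from_definite_height(intrinsic_w, intrinsic_h, definite_h):
    """Resolve the width of a replaced element from a definite height,
    preserving the intrinsic aspect ratio (width / height)."""
    ratio = Fraction(intrinsic_w, intrinsic_h)
    return ratio * definite_h

# Hypothetical 640x480 image laid out with the container's 300px height.
print(width_from_definite_height(640, 480, 300))  # 400
```

With an indefinite height there would be nothing to multiply the ratio by, which is why treating the cross size as definite matters here.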

Aspect ratio computations and box-sizing

This is a very common oversight in layout code. When dealing with layout bugs we (browser engineers) usually forget about box-sizing because the standard box model is the truth and the whole truth and the sole truth in our minds. Jokes aside, in this case the aspect ratio was applied to the border box (content + border + padding) instead of to the content box as it should be. The result was distorted images, because border and padding were altering the aspect ratio computations.
.flexbox {
  display: flex;
}

.flexbox > * {
  border-top: 150px solid blue;
  border-left: 30px solid orange;
  height: 300px;
  box-sizing: border-box;
}

<div class=flexbox>
  <img src="cat5.png"/>
</div>

Grayscale Photo of Long Fur Cat by Skyler Ewin
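
The difference between the two behaviours can be sketched with the numbers from the CSS above. This is only an illustration in Python, not the engine's code: the 4:3 intrinsic ratio and the function name are assumptions, and padding is left out because the example has none.

```python
from fractions import Fraction

def border_box_width(ratio, border_box_h, border_top=0, border_bottom=0,
                     border_left=0, border_right=0):
    """Apply the aspect ratio to the content box, then add the borders back."""
    content_h = border_box_h - border_top - border_bottom
    content_w = ratio * content_h
    return content_w + border_left + border_right

ratio = Fraction(4, 3)  # hypothetical intrinsic ratio of the image
correct = border_box_width(ratio, 300, border_top=150, border_left=30)
wrong = ratio * 300     # ratio applied to the whole border box (the old bug)
print(correct, wrong)   # 230 400
```

The content box is only 150px tall once the top border is subtracted, so the image itself should be 200px wide, not 400px.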


I mentioned this in the previous post but I’ll do it again here: having the web platform test suite has been an absolute game changer for web browser engineers. The tests have helped us in many ways, from easily allowing us to verify our implementations to acting as a safety net against potential regressions we might add while fixing issues in the engines. We no longer have to manually test stuff in different browsers to check how other developers have interpreted the specs. We now have the test, period.
In this case, I’ve been using them in a different way. They have served me both as a guide, directing my efforts to reduce the flexbox interoperability issues and also as a nice metric to measure the progress of the task. Talking about metrics, this work made WebKit based browsers pass an additional 64 test cases from the WPT test suite, a very nice step forward for interoperability.
I’m attaching a screenshot with the current status of images as flex items from the WPT point of view. Each HTML file in the left column is a test, and each test performs multiple checks. For example, the image-as-flexitem-* ones run 19 different checks (use cases) each. Each column shows how many checks each browser passes. A quarter ago Safari’s (WebKit’s) figures for most of them were 11/19 or 13/19, but the latest Tech Preview now passes all of them. Not bad, huh?
image-as-flexitem-* flexbox tests in WPT as of 2021/01/20


Again many thanks to the different awesome folks at Apple, Google and my beloved Igalia that helped me with very insightful reviews and strong support at all levels.
Also I am thankful to all the photographers from whom I borrowed their nice cat pictures (including the Brown and Black Cat on top by pixabay).

January 18, 2021

My Journey to GJS’ Backtrace “full” Option

My Outreachy internship has definitely taught me a lot of things, including writing blog posts, reporting tasks, expressing myself and of course improving as a developer. When we developed a project timeline before submitting the final application weeks back, my mentor and I underestimated some of the issues because there were some hidden difficulties we only found out about later.

Initially, my timeline was to use the first week to understand the inner workings of the debugger, weeks 2-4 for the backtrace full command, and weeks 5-7 to display the current line of the source code when displaying the current frame in the debugger. The tasks for weeks 8-13 were still to be decided by my mentor and me in the course of the internship.

After completing the 7th week of my internship, here are the things I have been able to accomplish:

  1. The first week of the internship was spent acquainting myself with how debugger tests are written, using merge request 539 (https://gitlab.gnome.org/GNOME/gjs/-/merge_requests/539), which was aimed at adding ‘$$’ as a shorthand to refer to the most recently printed value. I did a code review on this merge request in order to understand how it works. Also, I wrote a blog post introducing myself as an Outreachy intern at GNOME. Setting up my WordPress blog took me some time too, since it was new to me. My mentor and I also drew up a plan for our weekly check-ins.
  2. Week two started with me going through issue 208 (https://gitlab.gnome.org/GNOME/gjs/-/issues/208) which is adding the “full” option to the backtrace command. This week was spent studying the debugger API documentation (https://firefox-source-docs.mozilla.org/js/Debugger/) as well as asking questions where necessary.
  3. The third week was used to start writing code to implement this functionality and writing my second blog post on ‘Everybody struggles’.
  4. By the end of week 4, I had completed the implementation of the issue. My mentor reviewed the code and asked me to go on with writing tests.
  5. After week one, I thought I understood how to write debugger tests, but week 5 proved me wrong. I had to ask my mentor to help me out by explaining everything again, which he did. The holidays also had an impact on this week because some of my time was spent celebrating with family and friends. Also this week, I had to think about my audience and write a blog post to help newcomers understand my project and community.
  6. Week 6, I can say, was my victory week because it is when I finally completed the debugger tests for the backtrace “full” command. My mentor reviewed the code and we had a little challenge with version control. This helped me familiarize myself with some git commands (git checkout -- <file> for discarding changes which you no longer want) and also see in practice how important it is to keep track of your project’s version.
  7. Week 7 started with me asking my mentor for advice on the next steps I could take after this internship, and with us discussing issue 207 (https://gitlab.gnome.org/GNOME/gjs/-/issues/207), which is displaying the current line of source code in the debugger. By the end of this week I had a good understanding of the issue. Subsequent weeks will be spent on completing it, along with other issues that my mentor and I are still to decide on.

All of my activities during the past weeks have led me to accomplish one major thing which is adding the “full” option to the debugger’s backtrace command. It is my wish to use this as an opportunity to create awareness about how it works. I will be answering the questions what and why.

What is the “full” option of the backtrace command?

First of all, the backtrace command gives you a summary of how your program got to where it is. It shows one line per frame, for many frames, starting with the currently executing frame (frame 0), followed by its caller (frame 1), and on up the stack. Now, the full option prints out the values of all local variables for each stack frame, so adding this option improves the user experience of the debugger. After running the debugger on a file, you can see the full option in action by entering “backtrace full” or “bt full”.

Why the full option?

This is actually the first question I asked my mentor when this issue was assigned to me. To answer this question, he wrote a program which occasionally crashes when run in the debugger. The backtrace full command is particularly good for getting stack traces from the user. If you are the developer, it’s not so hard to print the variables you need, because you will often already know which ones you need. But if a user has the crash, then maybe you can get them to run the program in the debugger, and it’s very time consuming to go back and forth telling them “ok, now type print x”, “now type print y” when you can just get all the information at once by telling them to type “bt full”.

After 8 weeks of this internship, it is clear that we are about 2 weeks behind the initial timeline. Some of the reasons are that I underestimated the difficulty of the issues that had to be worked on, and did not take into account the few days used up by the Christmas holidays, the time spent on writing blog posts, or the time spent in meetings with my mentor. The new timeline my mentor and I agreed on is to use weeks 7, 8 and 9 to work on issue 207 and weeks 10, 11, 12 and 13 for some issues we will discuss during our next meeting.

Offline Toast notification in Nuxt/Vue app

We have often seen apps telling us that “You are offline. Check your network status.”. It is not only convenient to do so but also makes for a great UX. In this blog, we will look at how we can display a toast notification in a Nuxt/Vue app whenever the user goes offline or online. This will also help us to understand how to use computed and watch properties together.



Before getting started, we need to make sure that we have correctly set up Nuxt and BootstrapVue.

1. Using $nuxt helper

Nuxt provides a great way to access its helper class, $nuxt. In order to get the current network connection status, we can do two things:


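export default {
  // both flags are booleans that Nuxt keeps up to date
  created() { console.log(this.$nuxt.isOffline, this.$nuxt.isOnline) },
}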

Yes, it is as simple as that.

Now in BootstrapVue, we can create toasts on demand using this.$bvToast.toast(). So we can implement the notification behaviour using computed and watch properties provided by Vue.

2. Writing Code

The best place to add the following piece of code is in our layouts/default.vue. Doing so can help us to implement a universal kind of notification behaviour.

<template>
  <Nuxt />
</template>

<script>
export default {
  computed: {
    connectionStatus() {
      return this.$nuxt.isOffline
    },
  },
  watch: {
    connectionStatus(offline) {
      if (offline) {
        // hide the online toast if it exists
        this.$bvToast.hide('online')

        // create a new toast for offline notification
        // that doesn't hide on its own
        this.$bvToast.toast('You are now offline', {
          id: 'offline',
          toaster: 'b-toaster-bottom-right',
          noCloseButton: true,
          solid: true,
          noAutoHide: true,
          variant: 'danger',
        })
      } else {
        // hide the offline toast if it exists
        this.$bvToast.hide('offline')

        // create a new toast for online notification
        // that auto hides after a given time
        this.$bvToast.toast('You are now online', {
          id: 'online',
          toaster: 'b-toaster-bottom-right',
          noCloseButton: true,
          solid: true,
          autoHideDelay: 5000,
          variant: 'success',
        })
      }
    },
  },
}
</script>
Let us go through the above code. First of all, we create a computed property, connectionStatus. In connectionStatus, we return the value of this.$nuxt.isOffline. Now in Vue, whenever a property that a computed property depends on changes, the computed property is re-evaluated as well. So whenever this.$nuxt.isOffline changes, connectionStatus gets a new value.

We can watch the value of connectionStatus and do things based on its new value. In our case, we check whether the changed value of connectionStatus is true (offline). Depending upon this, we display the matching toast notification using BootstrapVue.


Let us go back to our browser and check whether the above code works or not. In the Network tab in Developer Tools, let us toggle the network connection status.

Tutorial to display notification when user is offline in Nuxt/Vue

Hurray! Our toast notifications are working perfectly fine. So using the combined magic of computed and watch properties, we can create outstanding workflows and take our Nuxt/Vue app to the next level. If you have any doubts, or appreciation for our team, let us know in the comments below. We would be happy to assist you.

January 17, 2021

Chafa 1.6.0: Wider

Here’s another one from the terminal graphics extravaganza dept: Chafa 1.6.0 brings fullwidth character support, so in addition to the usual block elements and ASCII art, you now get some mean CJK art too. Or grab as many fonts as you can and combine all of the Unicode into one big glorious mess. Chafa can efficiently distinguish between thousands of symbols, so it also runs fast enough for animations — up to a point.

Since some users want this in environments where it’s not practical to build from source or even to have nice things like GLib, I’ve started adding statically linked builds. These are pretty bare-bones (fewer image loaders, no man page), so look to your steadfast distribution first.

Speaking of distributions, a big thank you to the packagers. Special thanks go to Florian Viehweger for getting in touch re. adding it to OpenBSD ports, and Mo Zhou (Debian), Michael Vetter (openSUSE), Herby Gillot (MacPorts), @chenrui and Carlo Cabrera (Homebrew) for getting 1.6 out there before I could even finish this post.

So what’s it look like?

Obviously if you just want as faithful a reproduction as possible, stick with the default block elements or sixels. That said, fullwidth characters open up some new artistic possibilities.

Chafa rendering of Dog's Head

Above, a rendering of Dog’s Head (1920) by Julie de Graag, digitally enhanced by Rawpixel. It was generated with the following command line:

chafa --glyph-file /usr/share/fonts/truetype/SourceHanSansCN-Normal.otf \
  --glyph-file /usr/share/fonts/truetype/SourceHanSansJP-Normal.otf \
  --glyph-file /usr/share/fonts/truetype/DroidSansThai.ttf \
  --glyph-file /usr/share/fonts/truetype/SourceCodePro-Regular.ttf \
  --symbols 0..fffff-block-border-stipple-dot-geometric \
  -c none -w 9 dog.png

Although I’d like to include a moderately large built-in selection of fullwidth symbols in a future release, for now you must load fonts with --glyph-file in order to achieve this effect. You also need to enable the Unicode ranges you want and curtail the use of block and border elements with --symbols. The latter is necessary because block elements produce more accurate results and will otherwise pretty much always come out on top during error minimization.

Chafa rendering of Shinjuku Skyscrapers

This is a rendering of Shinjuku Skyscrapers, CC-BY-SA Wilhelm Joys Andersen. I used the same set of options to produce it, but left out -c none, resulting in 24-bit color — the default under VTE.

A side effect of allowing lots of color variation is fewer wide characters. This makes sense considering that they force a pair of cells to have the same color, which is often less accurate than two narrow characters with different colors.

彡 (._.) ( l: ) (.-.) ( :l )

Like many subjects that look simple at first, terminal graphics makes for a surprisingly deep rabbit hole to be tumbling into. Chafa now spans the gamut from the most basic monochrome ASCII art to fullwidth Unicode, 24-bit color and sixels, and there’s still a lot that can be done to improve it. I will be doing so… slowly.

If you want to help, feel free to send pull requests or file any issues you find. I think it’s also at the point where you can achieve various surprising effects, so if you manage to get something particularly cool/sick/downright disgusting out of it, just lob it in my general direction and maybe I’ll include it in a future gallery.

The origins of the Flow Game 🎥

Let’s kickstart the new year with a short & simple blog post, as a way to get me back on the blogging treadmill, and as a way to ensure my blog still works fine (I have just finished a very heavy-handed migration and database encoding surgery for my blog, which took months to solve… that’ll be a story for another blog post, if anyone is interested? 🤔 and yes, I’m totally using emojis and exotic languages in this post just to see if it still breaks Planet GNOME. わたしは にほんごがすこししかはなせません!)…

Sometime during the pre-collapse year MMXX, my friend Hélène asked me a favor: to edit and publish some interview footage she had previously recorded—and that had been sitting on her hard drive since. The goal was to salvage this material and do a simple edit—not aim for perfection—and publish it so that the public can benefit from the shared knowledge and life experiences.

And so I did. The resulting video (17 mins) is here for your enjoyment:

What is this about?

The video is an informal speech given by Monica Nissén and Toke Paludan Møller, where they share their experience as business founders in the nineties, and what led them to create the Flow Game, a serious game that serves as a tool for creating “an interactive reflection, dialogue and action space for groups, teams and individuals.”

Indeed, while they were working on their entrepreneurial project (with at least one other collaborator, I believe), they felt the need to create a system to facilitate their brainstorming sessions and guide the evolution of their projects.

“What a strange edit! Were you drunk?”

A friend of mine who worked in Hollywood keeps telling me, “Editing is the pits!”, probably because you have to work with what you’re given, and have to cut and rearrange thousands of little pieces to build a coherent story.

In this case, the video was recorded with only one camera angle, so I did not have a lot of material to work with. I added some footage from other sessions to hide various handheld camera angle relocation moves; this is why you sometimes see visuals of other people (such as Toke) “silently talking” while Monica’s voice narration continues. There are scenes in Toke’s speech where I had only audio (it seems the video recording started late) so I had no choice but to fill the gaps with footage of other people talking (in that case I did find it amusing to have synched lips on some occasions), and I tried to not reuse the same footage too much to provide some variety.

Note: this video is a pro-bono production for educational and historical purposes, and is not a business endorsement of the Flow Game by idéemarque/atypica. I’m not selling anything, there is no Danish Conspiracy (unlike the GNOME Swedish Conspiracy), etc. I just hope someone finds this content insightful. Feel free to share it around (or retweet it) or to leave a comment if you liked it.

January 14, 2021

Toolbox — After a gap of 15 months


We just released version 0.0.99, and I realized that it’s been a while since I blogged about Toolbox. So it’s time to address that.

Rewritten in Go

About a year ago, Ondřej Míchal single-handedly rewrote Toolbox in Go, making it massively easier to work on the code compared to the previous POSIX shell implementation. Go comes with much nicer facilities for command line parsing, error handling, logging, parsing JSON, and in general is a lot more pleasant to program in. Plus all the container tools in the OCI ecosystem are written in Go anyway, so it was a natural fit.

Other than the obvious benefits of Go, the rewrite immediately fixed a few bugs that were inherently very cumbersome to fix in the POSIX shell implementation. Something as simple as offering a --version option, or avoiding duplicate entries when listing containers or images, was surprisingly difficult to achieve in the past.

What’s more, we managed to pull this off by retaining full compatibility with the previous code. So users and distributors should have no hesitation to update.

Towards version 0.1.0

We have been very conservative about our versioning scheme so far due to the inherently prototype nature of Toolbox. All our release numbers have followed the 0.0.x format. We thought that the move to Go deserves at least a minor version bump, but we also wanted to give it some time to shake out any bugs that might have crept in; and implement the features and fix the bugs that have been on our short-term wish list before putting a 0.1.0 stamp on it.

Therefore, we started a series of 0.0.9x releases to work our way towards version 0.1.0. The first one was 0.0.90 which shipped the Go code in March 2020, and we are currently at 0.0.99. Suffice to say that we are very close to the objective.

Rootful Toolboxes

Sometimes a rootless OCI container just isn’t enough because it can’t do things that require privilege escalation beyond the user’s current user ID on the host. This means that various debugging tools, such as Nmap, don’t work.

Therefore, we added support for running toolbox as root in version This should hopefully unlock various new use-cases that were so far not possible when running rootless.

When running as root, Toolbox cannot rely on things like the user’s session D-Bus instance or the XDG_RUNTIME_DIR environment variable, because sudo doesn’t create a full-fledged user session that offers them. This means that graphical applications can only work by connecting to an X11 server, but then again running graphical applications as root is never a good idea to begin with.

Red Hat Universal Base Image (or UBI)

We recently took the first step towards supporting operating system distributions other than Fedora as first class citizens. From version 0.0.99 onwards, Toolbox supports Red Hat Enterprise Linux hosts where it will create containers based on the Red Hat Universal Base Image by default.

On hosts that aren’t running RHEL, one can still create UBI containers as:
$ toolbox create --distro rhel --release 8.3

Read more

Those were some of the big things that have happened in Toolbox land since my last update. If you are interested in more details, then you can read Ondřej’s posts where he writes at length about the port to Go and the changes in each of the releases since then.

January 13, 2021

Add extended information to GErrors in GLib 2.67.2

Thanks to Krzesimir Nowak, a 17-year-old feature request in GLib has been implemented: it’s now possible to define GError domains which have extended information attached to their GErrors.

You could now, for example, define a GError domain for text parser errors which includes context information about a parsing failure, such as the current line and character position. Or attach the filename of a file which was being read, to the GError informing of a read failure. Define an extended error domain using G_DEFINE_EXTENDED_ERROR(). The extended information is stored in a ‘private’ struct provided by you, similarly to how it’s implemented for GObjects with G_DEFINE_TYPE_WITH_PRIVATE().

There are code examples on how to use the new APIs in the GLib documentation, so I won’t reproduce them here.

An important limitation to note is that existing GError domains which have ever been part of a stable public API cannot be extended retroactively unless you are breaking ABI. That’s because extending a GError domain increases the size of the allocated GError instances for that domain, and it’s possible that users of your API have stack-allocated GErrors in the past.

Please don’t stack-allocate GErrors, as it makes future extensions of the API impossible, and doesn’t buy you notable extra performance, as GErrors should not be used on fast paths. By their very nature, they’re for failure reporting.

The new APIs are currently unstable, so please try them out and provide feedback now. They will be frozen with the release of GLib 2.67.3, scheduled for 11th February 2021.

Parsing HID Unit Items

This post explains how to parse the HID Unit Global Item as explained by the HID Specification, page 37. The table there is quite confusing and it took me a while to fully understand it (Benjamin Tissoires was really the one who cracked it). I couldn't find any better explanation online which means either I'm incredibly dense and everyone's figured it out or no-one has posted a better explanation. On the off-chance it's the latter [1], here are the instructions on how to parse this item.

We know a HID Report Descriptor consists of a number of items that describe the content of each HID Report (read: an event from a device). These Items include things like Logical Minimum/Maximum for axis ranges, etc. A HID Unit item specifies the physical unit to apply. For example, a Report Descriptor may specify that X and Y axes are in mm which can be quite useful for all the obvious reasons.

Like most HID items, a HID Unit Item consists of a one-byte item tag and 1, 2 or 4 byte payload. The Unit item in the Report Descriptor itself has the binary value 0110 01nn where the nn is either 1, 2, or 3 indicating 1, 2 or 4 bytes of payload, respectively. That's standard HID.
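
As a rough sketch of that prefix byte, the split can be written in a few lines of Python. The function name is made up for illustration, and the bit layout (tag in the top four bits, item type in the next two, size code in the lowest two) is my reading of the standard HID short-item format rather than something spelled out above.

```python
def parse_item_prefix(prefix):
    """Split a HID short-item prefix byte into (tag, type, payload size)."""
    size_code = prefix & 0x3          # nn: the lowest two bits
    item_type = (prefix >> 2) & 0x3   # 01 = Global item
    tag = (prefix >> 4) & 0xF         # 0110 = Unit
    payload_bytes = (0, 1, 2, 4)[size_code]  # size code 3 means 4 bytes
    return tag, item_type, payload_bytes

# 0x67 is 0110 0111 in binary: a Unit item with a 4-byte payload.
print(parse_item_prefix(0x67))  # (6, 1, 4)
```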

The payload is divided into nibbles (4-bit units) and goes from LSB to MSB. The lowest-order 4 bits (first byte & 0xf) define the unit System to apply: one of SI Linear, SI Rotation, English Linear or English Rotation (well, or None/Reserved). The rest of the nibbles are in this order: "length", "mass", "time", "temperature", "current", "luminous intensity". In something resembling code this means:

system = value & 0xf
length_exponent = (value & 0xf0) >> 4
mass_exponent = (value & 0xf00) >> 8
time_exponent = (value & 0xf000) >> 12
The System defines which unit is used for length (e.g. SILinear means length is in cm). The actual value of each nibble is the exponent for the unit in use [2]. In something resembling code:

switch (system) {
case SILinear:
    print("length is in cm^{length_exponent}");
    break;
case SIRotation:
    print("length is in rad^{length_exponent}");
    break;
case EnglishLinear:
    print("length is in in^{length_exponent}");
    break;
case EnglishRotation:
    print("length is in deg^{length_exponent}");
    break;
case None:
case Reserved:
    break;
}

For example, the value 0x321 means "SI Linear" (0x1) so the remaining nibbles represent, in ascending nibble order: Centimeters, Grams, Seconds, Kelvin, Ampere, Candela. The length nibble has a value of 0x2 so it's square cm, the mass nibble has a value of 0x3 so it is cubic grams (well, it's just an example, so...). This means that any report containing this item comes in cm²g³. As a more realistic example: 0xF011 would be cm/s.

If we changed the lowest nibble to English Rotation (0x4), i.e. our value is now 0x324, the units represent: Degrees, Slug, Seconds, F, Ampere, Candela [3]. The length nibble 0x2 means square degrees, the mass nibble is cubic slugs. As a more realistic example, 0xF014 would be degrees/s.

Any nibble with value 0 means the unit isn't in use, so the example from the spec with value 0x00F0D121 is SI linear, units cm² g s⁻³ A⁻¹, which is... Voltage! Of course you knew that and totally didn't have to double-check with wikipedia.
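
Putting the pieces so far together, here is a small Python sketch of the decoding. The names are made up for illustration, the system values 0x2 and 0x3 are inferred from the order the systems are listed in, and anything above 0x4 is simply lumped together as "Reserved".

```python
FIELDS = ("length", "mass", "time", "temperature", "current",
          "luminous intensity")
SYSTEMS = {0: "None", 1: "SI Linear", 2: "SI Rotation",
           3: "English Linear", 4: "English Rotation"}

def signed_nibble(n):
    """Interpret a 4-bit value as a two's complement integer (-8..7)."""
    return n - 16 if n >= 8 else n

def decode_unit(value):
    """Return (system name, {quantity: exponent}) for a HID Unit payload."""
    system = SYSTEMS.get(value & 0xF, "Reserved")
    exponents = {}
    for i, name in enumerate(FIELDS, start=1):
        exp = signed_nibble((value >> (4 * i)) & 0xF)
        if exp:  # a nibble of 0 means the unit isn't in use
            exponents[name] = exp
    return system, exponents

print(decode_unit(0x00F0D121))
# ('SI Linear', {'length': 2, 'mass': 1, 'time': -3, 'current': -1})
```

Mapping the quantity names to actual units (cm vs. inches, grams vs. slugs) then depends on the system, as in the switch statement earlier.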

Because bits are expensive and the base units are of course either too big or too small or otherwise not quite right, HID also provides a Unit Exponent item. The Unit Exponent item (a separate item to Unit in the Report Descriptor) then describes the exponent to be applied to the actual value in the report. For example, a Unit Exponent of -3 means 10⁻³ to be applied to the value. If the report descriptor specifies an item of Unit 0x00F0D121 (i.e. V) and Unit Exponent -3, the value of this item is mV (milliVolt), Unit Exponent of 3 would be kV (kiloVolt).
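
In something resembling Python, applying the Unit Exponent is just a power-of-ten scaling. The function name and the sample value of 1500 are made up for illustration; Fraction is used only to keep the arithmetic exact.

```python
from fractions import Fraction

def apply_unit_exponent(raw_value, unit_exponent):
    """Scale a raw report value by 10^unit_exponent, exactly."""
    return raw_value * Fraction(10) ** unit_exponent

# A raw value of 1500 with Unit = V and Unit Exponent = -3 is 1500 mV.
print(apply_unit_exponent(1500, -3))  # 3/2, i.e. 1.5 V
```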

Now, in hindsight all this is pretty obvious and maybe even sensible. It'd have been nice if the spec would've explained it a bit clearer but then I would have nothing to write about, so I guess overall I call it a draw.

[1] This whole adventure was started because there's a touchpad out there that measures touch pressure in radians, so at least one other person out there struggled with the docs...
[2] The nibble value is two's complement (i.e. it's a signed 4-bit integer). Values 0x1-0x7 are exponents 1 to 7, values 0x8-0xf are exponents -8 to -1.
[3] English Linear should've trolled everyone and used Centimetres instead of Centimeters in SI Linear.

January 12, 2021

Unlocking LUKS2 volumes with TPM2, FIDO2, PKCS#11 Security Hardware on systemd 248

TL;DR: It's now easy to unlock your LUKS2 volume with a FIDO2 security token (e.g. YubiKey or Nitrokey FIDO2). And TPM2 unlocking is easy now too.

Blogging is a lot of work, and a lot less fun than hacking. I mostly focus on the latter because of that, but from time to time I guess stuff is just too interesting to not be blogged about. Hence here, finally, another blog story about exciting new features in systemd.

With the upcoming systemd v248 the systemd-cryptsetup component of systemd (which is responsible for assembling encrypted volumes during boot) gained direct support for unlocking encrypted storage with three types of security hardware:

  1. Unlocking with FIDO2 security tokens (well, at least with those which implement the hmac-secret extension, most do). i.e. your YubiKeys (series 5 and above), or Nitrokey FIDO2 and such.

  2. Unlocking with TPM2 security chips (pretty ubiquitous on non-budget PCs/laptops/…)

  3. Unlocking with PKCS#11 security tokens, i.e. your smartcards and older YubiKeys (the ones that implement PIV). (Strictly speaking this was supported on older systemd already, but was a lot more "manual".)

For completeness' sake, let's keep in mind that the component also allows unlocking with these more traditional mechanisms:

  1. Unlocking interactively with a user-entered passphrase (i.e. the way most people probably already deploy it, supported since about forever)

  2. Unlocking via key file on disk (optionally on removable media plugged in at boot), supported since forever.

  3. Unlocking via a key acquired through trivial AF_UNIX/SOCK_STREAM socket IPC. (Also new in v248)

  4. Unlocking via recovery keys. These are pretty much the same thing as a regular passphrase (and in fact can be entered wherever a passphrase is requested) — the main difference being that they are always generated by the computer, and thus have guaranteed high entropy, typically higher than user-chosen passphrases. They are generated in a way they are easy to type, in many cases even if the local key map is misconfigured. (Also new in v248)

In this blog story, let's focus on the first three items, i.e. those that talk to specific types of hardware for implementing unlocking.

To make working with security tokens and TPM2 easy, a new, small tool was added to the systemd tool set: systemd-cryptenroll. Its only purpose is to make it easy to enroll your security token/chip of choice into an encrypted volume. It works with any LUKS2 volume, and embeds a tiny bit of meta-information into the LUKS2 header with parameters necessary for the unlock operation.

Unlocking with FIDO2

So, let's see how this fits together in the FIDO2 case. Most likely this is what you want to use if you have one of these fancy FIDO2 tokens (which need to implement the hmac-secret extension, as mentioned). Let's say you already have your LUKS2 volume set up, and previously unlocked it with a simple passphrase. Plug in your token, and run:

# systemd-cryptenroll --fido2-device=auto /dev/sda5

(Replace /dev/sda5 with the underlying block device of your volume).

This will enroll the key as an additional way to unlock the volume, and embed all necessary information for it in the LUKS2 volume header. Before we can unlock the volume with this at boot, we need to allow FIDO2 unlocking via /etc/crypttab. For that, find the right entry for your volume in that file, and edit it like so:

myvolume /dev/sda5 - fido2-device=auto

Replace myvolume and /dev/sda5 with the right volume name, and underlying device of course. Key here is the fido2-device=auto option you need to add to the fourth column in the file. It tells systemd-cryptsetup to use the FIDO2 metadata now embedded in the LUKS2 header, wait for the FIDO2 token to be plugged in at boot (utilizing systemd-udevd, …) and unlock the volume with it.

And that's it already. Easy-peasy, no?

Note that all of this doesn't modify the FIDO2 token itself in any way. Moreover you can enroll the same token in as many volumes as you like. Since all enrollment information is stored in the LUKS2 header (and not on the token) there are no bounds on any of this. (OK, well, admittedly, there's a cap on LUKS2 key slots per volume, i.e. you can't enroll more than a bunch of keys per volume.)
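To see why one token can serve any number of volumes without being modified, here is a rough conceptual model in Python. It assumes, purely for illustration, that the token holds a fixed internal secret and that each volume keeps its own random salt in its header, so that the hmac-secret operation boils down to an HMAC over that salt. The names token_secret and derive_volume_key are hypothetical, and the sketch does not reflect what systemd-cryptenroll actually stores in the LUKS2 header.

```python
import hashlib
import hmac
import os


def derive_volume_key(token_secret: bytes, salt: bytes) -> bytes:
    """Toy stand-in for the token's hmac-secret operation: HMAC-SHA256 over a salt."""
    return hmac.new(token_secret, salt, hashlib.sha256).digest()


# Hypothetical secret that never leaves the token.
token_secret = os.urandom(32)

# Each volume keeps its own salt in its header; nothing is written to the token.
salt_volume_a = os.urandom(32)
salt_volume_b = os.urandom(32)

key_a = derive_volume_key(token_secret, salt_volume_a)
key_b = derive_volume_key(token_secret, salt_volume_b)
```

In this model the same token yields a different, reproducible key for every salt, which is why the only per-volume state sits in the volume header.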

Unlocking with PKCS#11

Let's now have a closer look at how the same works with a PKCS#11 compatible security token or smartcard. For this to work, you need a device that can store an RSA key pair. I figure most security tokens/smartcards that implement PIV qualify. How you actually get the keys onto the device might differ though. Here's how you do this for any YubiKey that implements the PIV feature:

# ykman piv reset
# ykman piv generate-key -a RSA2048 9d pubkey.pem
# ykman piv generate-certificate --subject "Knobelei" 9d pubkey.pem
# rm pubkey.pem

(This chain of commands erases what was stored in the PIV feature of your token before, be careful!)

For tokens/smartcards from other vendors a different series of commands might work. Once you have a key pair on it, you can enroll it with a LUKS2 volume like so:

# systemd-cryptenroll --pkcs11-token-uri=auto /dev/sda5

Just like the same command's invocation in the FIDO2 case, this enrolls the security token as an additional way to unlock the volume; any passphrases you already have enrolled remain enrolled.

For the PKCS#11 case you need to edit your /etc/crypttab entry like this:

myvolume /dev/sda5 - pkcs11-uri=auto

If you have a security token that implements both PKCS#11 PIV and FIDO2 I'd probably enroll it as FIDO2 device, given it's the more contemporary, future-proof standard. Moreover, it requires no special preparation in order to get an RSA key onto the device: FIDO2 keys typically just work.

Unlocking with TPM2

Most modern (non-budget) PC hardware (and other kinds of hardware too) nowadays comes with a TPM2 security chip. In many ways a TPM2 chip is a smartcard that is soldered onto the mainboard of your system. Unlike your usual USB-connected security tokens you thus cannot remove them from your PC, which means they address quite a different security scenario: they aren't immediately comparable to a physical key you can take with you that unlocks some door. Instead, they are a key you leave at the door, but one that refuses to be turned by anyone but you.

Even though this sounds a lot weaker than the FIDO2/PKCS#11 model, TPM2 still brings benefits for securing your systems: because the cryptographic key material stored in TPM2 devices cannot be extracted (at least that's the theory), if you bind your hard disk encryption to it, it means attackers cannot just copy your disk and analyze it offline: they always need access to the TPM2 chip too to have a chance to acquire the necessary cryptographic keys. Thus, they can still steal your whole PC and analyze it, but they cannot just copy the disk without you noticing and analyze the copy.

Moreover, you can bind the ability to unlock the harddisk to specific software versions: for example you could say that only your trusted Fedora Linux can unlock the device, but not any arbitrary OS some hacker might boot from a USB stick they plugged in. Thus, if you trust your OS vendor, you can entrust storage unlocking to the vendor's OS together with your TPM2 device, and thus can be reasonably sure intruders cannot decrypt your data unless they both hack your OS vendor and steal/break your TPM2 chip.

Here's how you enroll your LUKS2 volume with your TPM2 chip:

# systemd-cryptenroll --tpm2-device=auto --tpm2-pcrs=7 /dev/sda5

This looks almost as straightforward as the two earlier systemd-cryptenroll command lines, if it wasn't for the --tpm2-pcrs= part. With that option you can specify to which TPM2 PCRs you want to bind the enrollment. TPM2 PCRs are a set of (typically 24) hash values that every TPM2 equipped system at boot calculates from all the software that is invoked during the boot sequence, in a secure, unfakable way (this is called "measurement"). If you bind unlocking to a specific value of a specific PCR you thus require that the system follows the same sequence of software at boot to re-acquire the disk encryption key. Sounds complex? Well, that's because it is.
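The "measurement" idea can be illustrated with a small Python sketch of a hash chain. It assumes a SHA-256 PCR that starts out as all zeroes and is "extended" with the digest of each boot component in turn; the component names are made up, and real firmware measures far more than three items.

```python
import hashlib
from typing import Iterable


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Fold one measurement into a PCR: new = SHA256(old || SHA256(measurement))."""
    return hashlib.sha256(pcr + hashlib.sha256(measurement).digest()).digest()


def measure_boot(components: Iterable[bytes]) -> bytes:
    """Replay a boot sequence starting from an all-zero PCR."""
    pcr = bytes(32)
    for component in components:
        pcr = extend(pcr, component)
    return pcr


# Hypothetical boot components, for illustration only.
trusted = measure_boot([b"firmware", b"bootloader", b"kernel-5.10"])
same_again = measure_boot([b"firmware", b"bootloader", b"kernel-5.10"])
updated = measure_boot([b"firmware", b"bootloader", b"kernel-5.11"])
```

Booting the same sequence reproduces the same PCR value, while changing any single component produces a different one. A key bound to the old value can then no longer be released.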

For now, let's see how we have to modify your /etc/crypttab to unlock via TPM2:

myvolume /dev/sda5 - tpm2-device=auto

This part is easy again: the tpm2-device= option is what tells systemd-cryptsetup to use the TPM2 metadata from the LUKS2 header and to wait for the TPM2 device to show up.
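The three /etc/crypttab entries shown in this post all follow the same four-column layout (volume name, underlying device, key file, options). As a minimal sketch, the following Python helper splits such a line and reports which of the three hardware options it carries. The function and class names are made up for this example, and the parser ignores the finer points of the real file format.

```python
from typing import NamedTuple, Optional


class CrypttabEntry(NamedTuple):
    name: str
    device: str
    key_file: str
    options: dict


def parse_crypttab_line(line: str) -> Optional[CrypttabEntry]:
    """Split one /etc/crypttab line into its four whitespace-separated fields."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    fields = line.split()
    name, device = fields[0], fields[1]
    key_file = fields[2] if len(fields) > 2 else "-"
    raw_options = fields[3] if len(fields) > 3 else ""
    options = {}
    for item in filter(None, raw_options.split(",")):
        key, _, value = item.partition("=")
        options[key] = value
    return CrypttabEntry(name, device, key_file, options)


def hardware_unlock_method(entry: CrypttabEntry) -> Optional[str]:
    """Return which of the three hardware options from this post is set, if any."""
    for option in ("fido2-device", "pkcs11-uri", "tpm2-device"):
        if option in entry.options:
            return option
    return None


entry = parse_crypttab_line("myvolume /dev/sda5 - tpm2-device=auto")
print(entry.name, hardware_unlock_method(entry))
```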

Bonus: Recovery Key Enrollment

FIDO2, PKCS#11 and TPM2 security tokens and chips pair well with recovery keys: since you don't need to type in your password every day anymore it makes sense to get rid of it, and instead enroll a high-entropy recovery key you then print out or scan off screen and store in a safe, physical location. I.e. forget about good ol' passphrase-based unlocking, go for FIDO2 plus recovery key instead! Here's how you do it:

# systemd-cryptenroll --recovery-key /dev/sda5

This will generate a key, enroll it in the LUKS2 volume, show it to you on screen and generate a QR code you may scan off screen if you like. The key has highest entropy, and can be entered wherever you can enter a passphrase. Because of that you don't have to modify /etc/crypttab to make the recovery key work.
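To get a feeling for what "computer-generated, high entropy, easy to type" can mean, here is a minimal Python sketch. The 16-letter alphabet and the grouping into eight blocks of eight characters are assumptions chosen for illustration and are not necessarily the exact format systemd-cryptenroll prints.

```python
import math
import secrets

# Assumed 16-letter alphabet for illustration; not necessarily what systemd uses.
ALPHABET = "cbdefghijklnrtuv"


def generate_recovery_key(groups: int = 8, group_len: int = 8) -> str:
    """Return a random key made of dash-separated groups of easy-to-type letters."""
    chars = [secrets.choice(ALPHABET) for _ in range(groups * group_len)]
    chunks = ["".join(chars[i:i + group_len]) for i in range(0, len(chars), group_len)]
    return "-".join(chunks)


def entropy_bits(groups: int = 8, group_len: int = 8) -> float:
    """Entropy of a uniformly random key over ALPHABET."""
    return groups * group_len * math.log2(len(ALPHABET))


key = generate_recovery_key()
print(key, entropy_bits())
```

With 16 possible letters each character carries 4 bits, so 64 characters give 256 bits, far more than a typical user-chosen passphrase.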


There's still plenty of room for further improvement in all of this. In particular for the TPM2 case: what the text above doesn't really mention is that binding your encrypted volume unlocking to specific software versions (i.e. kernel + initrd + OS versions) actually sucks hard: if you naively update your system to newer versions you might lose access to your TPM2 enrolled keys (which isn't terrible, after all you did enroll a recovery key (right?) which you can then use to regain access). To solve this some more integration with distributions would be necessary: whenever they upgrade the system they'd have to make sure to enroll the TPM2 again, with the PCR hashes matching the new version. And whenever they remove an old version of the system they need to remove the old TPM2 enrollment. Alternatively TPM2 also knows a concept of signed PCR hash values. In this mode the distro could just ship a set of PCR signatures which would unlock the TPM2 keys. (But quite frankly I don't really see the point: whether you drop in a signature file on each system update, or enroll a new set of PCR hashes in the LUKS2 header doesn't make much of a difference). Either way, to make TPM2 enrollment smooth some more integration work with your distribution's system update mechanisms needs to happen. And yes, because of this OS updating complexity the example above (where I referenced your trusty Fedora Linux) doesn't actually work IRL (yet? hopefully…). Nothing updates the enrollment automatically after you initially enrolled it, hence after the first kernel/initrd update you have to manually re-enroll things again, and again, and again … after every update.

The TPM2 could also be used for other kinds of key policies, which we might look into adding later too. For example, Windows uses TPM2 stuff to allow short (4 digits or so) "PINs" for unlocking the harddisk, i.e. kind of a low-entropy password you type in. The reason this is reasonably safe is that in this case the PIN is passed to the TPM2, which enforces that no more than some limited number of unlock attempts may be made within some time frame, and that after too many attempts the PIN is invalidated altogether. This makes dictionary attacks harder (which would normally be easier given the short length of the PINs).
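Why a hardware-enforced attempt limit rescues a short PIN can be shown with a toy model in Python. This is not how a TPM2 implements its lockout logic: the class name, the limit of five failures and the missing time-window handling are all simplifications made up for this sketch.

```python
class ToyPinGuard:
    """Toy model of hardware-enforced PIN retry limiting (not real TPM2 logic)."""

    def __init__(self, pin: str, max_failures: int = 5):
        self._pin = pin
        self._max_failures = max_failures
        self._failures = 0

    @property
    def locked_out(self) -> bool:
        return self._failures >= self._max_failures

    def try_unlock(self, guess: str) -> bool:
        """Return True on a correct guess; refuse everything once locked out."""
        if self.locked_out:
            return False
        if guess == self._pin:
            self._failures = 0
            return True
        self._failures += 1
        return False


guard = ToyPinGuard(pin="4711", max_failures=5)
# A brute-force attacker walking through 0000, 0001, ... gets only five tries.
results = [guard.try_unlock(f"{n:04d}") for n in range(10000)]
```

Without the guard, trying all 10000 four-digit PINs would be trivial. With it, the attacker is locked out after five wrong guesses and never reaches the right one.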


(BTW: Yubico sent me two YubiKeys for testing and Nitrokey a Nitrokey FIDO2, thank you! — That's why you see all those references to YubiKey/Nitrokey devices in the text above: it's the hardware I had to test this with. That said, I also tested the FIDO2 stuff with a SoloKey I bought, where it also worked fine. And yes, you!, other vendors!, who might be reading this, please send me your security tokens for free, too, and I might test things with them as well. No promises though. And I am not going to give them back, if you do, sorry. ;-))

GTK 4.0.1

We all took a bit of a break after 4.0 and did some other things, but now it is time for GTK 4.0.1.

This is the first release after 4.0, and it naturally contains a lot of small bug fixes,  theme and documentation improvements, and the like. But there are a few highlights that are worth pointing out.

Better media support

Among the bigger advances in this release: we managed to make the GStreamer media backend use GL textures, which avoids bouncing frame data between GPU and CPU when using hardware acceleration for decoding, such as vaapi. This requires careful orchestration to bridge the differences in how GStreamer and GTK treat GL, but we managed to make it work in many cases.

Does this mean GtkVideo is now ready to support fully-featured media player applications? Far from it. It still just lets you play media from a file or URL, and does not support multi-channel audio, video overlays, device selection, input, and other things that you probably want in a media player.

It would be really nice if somebody took the code in the GTK media backend and turned it inside out to make a GStreamer plugin with a sink that exposes its video frames as GdkPaintable. That would let you use the GStreamer API to get all of the aforementioned features, while still integrating smoothly in GTK.

Better CI

In order to keep our new macOS backend in working shape, we’ve started to set up CI builds for this platform, both for GTK itself, and for its dependencies (pango, gdk-pixbuf).

Cleaning Up Unused Flatpak Runtimes

Despite having been a contributor to the GNOME project for almost 5 years now (first at Red Hat and now at Endless), I’ve never found the time to blog about my work. Fortunately in many cases collaborators have made posts or the work was otherwise announced. Now that Endless is a non-profit foundation and we are working hard at advocating for our solutions to technology access barriers in upstream projects, I think it’s an especially good time to make my first blog post announcing a recent feature in Flatpak, which I worked on with a lot of help from Alex Larsson.

On many low-end computers, persistent storage space is quite limited. Some Endless hardware for example has only 32 GB. And we want to fill much of it with useful content in the form of Flatpak apps so that the computers are useful even offline. So often in the past we have shipped computers that are already quite full before the user stores any files. Ideally we want that limited space to be used as efficiently as possible, and Flatpak and OSTree already have some neat mechanisms to that end, such as de-duplicating any identical files across all apps and their runtimes (and, in the case of Endless OS, including the OS files as well).
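The de-duplication idea can be sketched in a few lines of Python as a content-addressed store: every file is filed under the checksum of its content, so identical files shipped by different apps or runtimes occupy space only once. The class, paths and sizes below are made up for illustration and say nothing about how OSTree lays out its repository on disk.

```python
import hashlib


class ToyObjectStore:
    """Stores each distinct file content once, keyed by its SHA-256 checksum."""

    def __init__(self):
        self.objects = {}   # checksum -> content
        self.files = {}     # path -> checksum

    def add_file(self, path: str, content: bytes) -> None:
        checksum = hashlib.sha256(content).hexdigest()
        self.objects.setdefault(checksum, content)
        self.files[path] = checksum

    def stored_bytes(self) -> int:
        return sum(len(content) for content in self.objects.values())


store = ToyObjectStore()
# Hypothetical paths: the same library shipped by two runtimes.
store.add_file("runtime/org.gnome.Platform/lib/libfoo.so", b"x" * 1000)
store.add_file("runtime/org.kde.Platform/lib/libfoo.so", b"x" * 1000)
store.add_file("app/org.example.App/bin/app", b"y" * 500)
print(store.stored_bytes())  # 1500
```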

(For the uninitiated a runtime is basically a set of libraries that can be shared between Flatpak apps, and which the apps use at run-time.)

However, there’s room for improvement. In Flatpak versions prior to 1.9.1 (1.9.x is currently the unstable series), runtimes are, broadly speaking, not uninstalled when the last app using them is uninstalled or updated to use a newer runtime. In some special cases such as locale extensions runtimes are uninstalled, but the main runtimes such as the GNOME or KDE ones that take up the most space are left behind unless manually uninstalled. And those runtimes can take up a significant amount of disk space:

$ du -sh ~/.local/share/flatpak/runtime/org.gnome.Platform/x86_64/3.38
890M /home/mwleeds/.local/share/flatpak/runtime/org.gnome.Platform/x86_64/3.38

$ du -sh ~/.local/share/flatpak/runtime/org.kde.Platform/x86_64/5.14
969M /home/mwleeds/.local/share/flatpak/runtime/org.kde.Platform/x86_64/5.14

This does have a significant advantage: in case the runtime is needed again in the future it will not have to be re-downloaded. But ultimately it is not a good situation to have the user’s disk space increasingly taken up by unneeded Flatpak runtimes as their apps migrate to newer runtimes, with no way for non-technical users to remedy the situation.

For a while now Flatpak has had the ability to remove unused runtimes with the command flatpak uninstall --unused. But users should never need to use the command line to keep their computer running well. And users who choose to use the command line already run flatpak update regularly, so in the new implementation removing unused runtimes is integrated into the update command (in addition to happening behind-the-scenes in GNOME Software for GUI-only users).

A compromise was chosen between removing all unused runtimes and always leaving them installed, which is to remove unused runtimes which have been marked End Of Life on the server side, on the basis that such runtimes are unlikely to be needed again in the future. Of course for this to work properly, runtime publishers must properly set the EOL metadata when appropriate, as was recently fixed on Flathub. So please do so if you maintain any runtimes!

I’ve glossed over it so far but actually defining when a runtime is unused is not trivial: a runtime in the system installation may be used by an app in the current user’s per-user installation (which Flatpak can detect), a runtime in the system installation may be used by an app in another user’s per-user installation (which Flatpak cannot detect), and a runtime may be used for development purposes. For this latter case the current implementation offers two solutions: one can prevent a runtime from being automatically uninstalled by pinning it with the flatpak pin command. Additionally, runtimes that are manually installed (as opposed to being pulled in to satisfy a dependency requirement) are automatically pinned.

You can check if you have any pinned runtime patterns (the command accepts globs in addition to precise runtimes) by just executing flatpak pin without any arguments.
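The decision described above can be summarised as a small Python sketch: a runtime is a candidate for automatic removal only if it is marked End Of Life, no app known to Flatpak uses it, and it does not match a pinned pattern. The data structures, refs and versions are hypothetical, and this is not the actual Flatpak implementation or API.

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass
class Runtime:
    ref: str          # e.g. "runtime/org.gnome.Platform/x86_64/3.34"
    end_of_life: bool


def removable_runtimes(runtimes, refs_in_use, pinned_patterns):
    """Return refs of runtimes that are EOL, unused by any known app, and not pinned."""
    removable = []
    for runtime in runtimes:
        in_use = runtime.ref in refs_in_use
        pinned = any(fnmatch(runtime.ref, pattern) for pattern in pinned_patterns)
        if runtime.end_of_life and not in_use and not pinned:
            removable.append(runtime.ref)
    return removable


installed = [
    Runtime("runtime/org.gnome.Platform/x86_64/3.34", end_of_life=True),
    Runtime("runtime/org.gnome.Platform/x86_64/3.38", end_of_life=False),
    Runtime("runtime/org.kde.Platform/x86_64/5.12", end_of_life=True),
    Runtime("runtime/org.gnome.Sdk/x86_64/3.34", end_of_life=True),
]
in_use = {"runtime/org.kde.Platform/x86_64/5.12"}
pinned = ["runtime/org.gnome.Sdk/*"]

print(removable_runtimes(installed, in_use, pinned))
```

Here only the EOL GNOME 3.34 platform is removed: 3.38 is still supported, the KDE runtime is still used by an app, and the SDK is protected by a pinned glob pattern.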

Long story short, with the upcoming releases of Flatpak 1.10 and GNOME Software 40, both will remove unused EOL runtimes during update operations and uninstall operations, freeing up disk space for users. If you maintain a software manager that supports Flatpak, you may consider using the new API to ensure unused runtimes are regularly cleaned up.

There is one improvement I’d like to make for this feature: we could take filesystem access time information into account when determining if a runtime is unused (perhaps removing a runtime that hasn’t been executed in a year?). But that is for another day…

January 11, 2021

Files 40.alpha: Creation timestamp & Wallpaper portal

Hi there, GNOME Planet.

In my last post I promised that the next one would have screenshots of new developments in the Files app, and it’s finally here!

It took me longer than I expected back then. After the 3.38 release, I had to focus my time elsewhere: assisting and training local primary health care teams in managing and following up on the rising number of COVID-19 cases assigned to them. With this mission accomplished, in December I picked up my GNOME contributions again and have something to show you now.

Files 40.alpha

Last week we reached the alpha milestone for the upcoming version 40 of GNOME Files. The highlights of this pre-release milestone are a long-requested feature to show file creation timestamps and an enhancement to the Set as Wallpaper action.

Creation date

Finally the screenshots!

List of files sorted by creation date
“Created” column can be added by right clicking on the list headers.
List of files sorted by creation date
The full date and time is shown in the file Properties

This was made possible thanks to Thunar developer Andre Miranda’s laudable initiative to implement the low-level glue for all GIO-based apps to benefit from. It was then easy for me to add the column to list view, and for Apoorv Sachan to add it to the Properties dialog (a nice follow-up to his GSOC project cleaning up the Properties code and UI).

This is a new feature, so it would be great to have people testing it before the final release. It’s easy to test, see instructions at the end of this post.

There are some open questions:

  • What to do for files and folders in file systems for which we don’t have access to the creation date (e.g. FAT, NTFS)?
  • Should we do something in case the Modified date is older than Created date, which is counter-intuitive even if technically correct?

Wallpaper Portal

There is a “Set as Wallpaper” action in the context menu for image files, which had a few odd behaviors which were not in sync with the user experience provided by the Settings app.

Thanks to Felipe Borges, not only have these problems been fixed, but the feature has been enhanced! Now you get a preview of the wallpaper, so you can confirm this was the correct picture and whether it’s going to look good, before confirming the desktop wallpaper change.

This is provided by the wallpaper portal created for sandboxed apps, but it works even outside Flatpak.

More coming soon

There are some more enhancements which didn’t make it into this milestone, but which I hope to be able to deliver before the beta milestone. I’ll talk about them in a future post.

I’m also very happy to see many new contributors fixing both major and minor bugs and implementing exciting features in the Files app. Now, back to reviewing the MRs, so that I can highlight their contributions in a future post!


For testing the latest developments in GNOME Files, without modifying the Files app in your system, there is a Nightly flatpak. To install it, copy and run the following command in a Terminal:

flatpak install --from https://nightly.gnome.org/repo/appstream/org.gnome.NautilusDevel.flatpakref

The Nightly can now be launched from Activities, or with this command:

flatpak run org.gnome.NautilusDevel

(If your operating system doesn’t support flatpak out of the box, see the Quick Setup guide.)

fwupd 1.5.5

I’ve just released fwupd 1.5.5 with the following new features:

  • Add a plugin to update PixArt RF devices; the hardware this enables we’ll announce in a few weeks hopefully
  • Add new hardware to use the elantp (for TouchPads) and rts54hid (for USB Hubs) plugins
  • Allow specifying more than one VendorID for a device, which allows ATA devices to use the OUI-assigned vendor if set
  • Detect the AMD TSME encryption state for HSI-4 — use fwupdmgr security --force to help test
  • Detect the AMI PK test key is not installed for HSI-1 — a failure here is very serious

As usual, this release fixes quite a few bugs too:

  • Fix flashing a fingerprint reader that is in use; in theory the window to hit this is vanishingly small, but on some hardware we ask the user to authorise the request using the very device that we’re trying to update…
  • Fix several critical warnings when parsing invalid firmware, found using honggfuzz, warming my office on these cold winter days
  • Fix updating DFU devices that use DNLOAD_BUSY, which fixes fwupd for some other hardware we plan to support in the future
  • Ignore the legacy UEFI OVMF dummy GUID so that we can test the dbx updates using qemu on older releases like RHEL
  • Make libfwupd more thread safe to fix a crash in gnome-software — many thanks to Philip Withnall for explaining a lot of the GMainContext threading complexities to me
  • We now never show unprintable chars from invalid firmware in the logs — as a result of fuzzing insane things the logs would often be full of gobbledygook, but no longer

I’m now building 1.5.5 into Fedora 33 and Fedora 32; packages should appear soon.

    January 10, 2021

    My Learning Curve at the GNOME Foundation

    It’s been six weeks since I started this exciting journey as an Outreachy intern at the GNOME Foundation. Every week, I have a set of tasks to work on and a project review session with my mentor at the start of the week.
    During these sessions I present the work I’ve done and the challenges I faced, and then get feedback. I’ve had to learn most things on the go and every task comes with its own unique flavour of difficulty and discovery. Let’s take a quick look at the project I’m working on…

    My project is based on completing the integration between Gtranslator and Damned Lies (DL), so as to permit translators to upload files and reserve modules for translation directly from Gtranslator. (This is already possible from the DL website.) I also need to extend the API (Application Programming Interface) endpoints DL provides so as to suit my use case.


    Gtranslator is an enhanced gettext po file editor for the GNOME desktop environment. It handles all forms of gettext po files and includes very useful features like find/replace, translation memory, different translator profiles, messages table, easy navigation and editing of translation messages and comments of the translation where accurate.

    A .po file is simply a list of strings from the original program. They contain the actual translations. Each language has its own .po file; for example, for French there would be a fr.po file, for German there would be a de.po, for American English there might be en-US.po.
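To make the idea of a .po file more concrete, here is a minimal Python sketch that reads a tiny, made-up fr.po excerpt and lists the strings that still lack a translation. It only understands single-line msgid/msgstr pairs, so it is an illustration rather than a real gettext parser; the sample content and function names are invented for this example.

```python
import re

SAMPLE_FR_PO = '''\
# Hypothetical excerpt of a fr.po file.
msgid "Open"
msgstr "Ouvrir"

msgid "Save"
msgstr ""
'''

ENTRY_RE = re.compile(r'^(msgid|msgstr)\s+"(.*)"\s*$')


def parse_po(text: str) -> dict:
    """Map each msgid to its msgstr (single-line entries only)."""
    translations = {}
    current_id = None
    for line in text.splitlines():
        match = ENTRY_RE.match(line)
        if not match:
            continue  # comments, blank lines, unsupported syntax
        keyword, value = match.groups()
        if keyword == "msgid":
            current_id = value
        elif current_id is not None:
            translations[current_id] = value
            current_id = None
    return translations


def untranslated(translations: dict) -> list:
    """Return the source strings that still have an empty translation."""
    return [msgid for msgid, msgstr in translations.items() if msgid and not msgstr]


catalog = parse_po(SAMPLE_FR_PO)
print(catalog, untranslated(catalog))
```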

    Gtranslator is now officially called the GNOME Translation Editor.

    Damned Lies

    Damned Lies is a web application that was built using Django to manage GNOME’s translation work-flow and produce statistics to monitor the translation progress. You might be wondering why the name, right? So here you go: “Lies, damned lies, and statistics” is a phrase describing the persuasive power of numbers, particularly the use of statistics to bolster weak arguments.

    Let’s now find out how I have progressed in my learning so far…

    Difficulty Levels

    First off keep an eye on this “letter map”, I’ll use the letters defined below to depict the difficulty level of a particular week.

    • E: Easy
    • N: Normal
    • H: Hard
    • V: Very Hard

    Week One

    It was an E!

    • I set up my blog (lkmandy.github.io), added it to GNOME Planet and wrote a blogpost on “Tips on Getting Selected for the Outreachy Program” https://lkmandy.github.io/lkmandy.github.io/blog/2020/tips-on-getting-selected-for-the-outreachy-program/
    • Set up the Damned Lies project
    • Covered some concepts in Django, REST APIs and the HTTP verbs

    Week Two

    I’ll give it an N!

    • Did some intensive research on Django API authentication (auth) methods and eventually settled on the Django REST framework. Django has so many auth packages to choose from, so this was quite a task.
    • Added some endpoints for user authentication and yes this means I built a REST API!! Yay!! My very first!!

    Week Three

    Returned to an E!

    • Submitted a Work In Progress Merge Request for the REST API I built
    • Did some research on the JSON-Glib and Libsoup libraries.
    • Added a “Reserve for Translation” button to the Gtranslator user interface (just to set things up for the main work)
    • Wrote a blogpost on “How Translation works in GNOME” https://lkmandy.github.io/lkmandy.github.io/blog/2020/how-translation-works-in-gnome/

    Week Four

    Somewhat H!
    Mainly because I had to research and troubleshoot a lot before seeing the green light

    • Added the module state to the Load from DL interface and submitted a Merge Request
    • Enabled the “Reserve for translation” button to work for the appropriate states of the vertimus workflow (a diagram that describes the various states and actions available during the translation process of a module) and also submitted a Merge Request.
    • Added an icon for the upload to DL feature (just did this locally on a separate branch)

    Week Five

    And I’ll give it an E!

    • Designed two mockups for the “upload file dialog”
    • Coded one part of the upload file dialog(the choose file dialog)
    • Tried adding the token auth information to Gtranslator, but didn’t succeed.

    Week Six

    I can’t lie oh, it’s a V
    I’m currently on this week and it finishes in one day as of the time I am writing this blogpost

    • Stored the DL information on every translation file downloaded from DL
    • Added custom headers to a file with the DL information if it’s downloaded from DL
    • Started creating the “upload file” dialog (WIP)

    A few things have helped me scale through, let’s check them out…


    • After my project review session, I go through the discussion my mentor and I (and maybe some other community members) just had, pick out all the key points mentioned and summarize them in my Evernote sheet. My mentor usually does a recap of the goals I need to achieve for the week, which is also very helpful
    • I read through my summary several times to make sure I don’t miss out on any detail
    • When a task gets hard, I never give up. I just take breaks and push hard until I succeed.
    • I do extensive research and go through documentation before implementing stuff
    • I submit Merge Requests as early as possible so I can get quick feedback (kudos to my mentor for this)

    It’s been an interesting journey and I’m very happy with all the new things I get to learn every day.
    The end!!
    See you on the next one!! Thanks.

    January 09, 2021

    Dynamic Home Route in a Flutter App

    In any production app, the user is directed to a route based on some authentication logic whenever the app is opened. In our Flutter App, we have at least two routes, Login and Dashboard. The problem is: how can we decide which route a user should be redirected to?

    In this app, we will check the value of a locally stored boolean variable to dynamically decide the home route. We can use any method for writing our authentication logic, like checking the validity of the API token, but for the sake of simplicity, we will explore a simple logic.

    Flutter Dynamic Home Route
    Flutter Dynamic Home Route Flowchart


    1. Installing Dependencies

    In our pubspec.yaml, let us add the following dependencies that we will be using in our Flutter application:

      shared_preferences: ^0.5.12+4
      async: ^2.4.2

    Make sure to install the latest version of the dependencies.

    Shared Preferences is a simple Flutter plugin for reading and writing simple key-value pairs to the local storage. Async contains the utility functions and classes related to the dart:async library.

    After adding these dependencies, it is now time to install them. In the terminal, let us execute the following command:

    flutter pub get 

    2. Writing Code

    In our main.dart, let us add the following code:

    import 'package:flutter/material.dart';
    import 'package:shared_preferences/shared_preferences.dart';

    void main() async {
      // handle exceptions caused by making main async
      WidgetsFlutterBinding.ensureInitialized();
      // init a shared preferences variable
      SharedPreferences prefs = await SharedPreferences.getInstance();
      // get the locally stored boolean variable (false if it was never set)
      bool isLoggedIn = prefs.getBool('is_logged_in') ?? false;
      // define the initial route based on whether the user is logged in or not
      String initialRoute = isLoggedIn ? '/' : 'login';
      // create a flutter material app as usual
      // (Dashboard and Login stand in for the app's own page widgets)
      Widget app = MaterialApp(
        initialRoute: initialRoute,
        routes: {
          '/': (context) => Dashboard(),
          'login': (context) => Login(),
        },
      );
      // mount and run the flutter app
      runApp(app);
    }

    The code is pretty self-explanatory. All we are doing is getting the value of the is_logged_in boolean variable, and then deciding the value of the initialRoute in our Flutter Material App.

    One important thing in the above code is the use of the async-await pattern. We can also use then but it makes the code a little messy and that’s what we are trying to avoid here. Making our main() function asynchronous can cause some exceptions, so to solve this, we need to add WidgetsFlutterBinding.ensureInitialized().


    That’s it. We have successfully written code that allows us to redirect the user to the Dashboard page if they are logged in, otherwise to the Login page. If you have any doubts or appreciation for our team, let us know in the comments below.

    January 07, 2021

    Rift CV1 – Adventures in Kalman filtering Part 2

    In the last post I had started implementing an Unscented Kalman Filter for position and orientation tracking in OpenHMD. Over the Christmas break, I continued that work.

    A Quick Recap

    When reading below, keep in mind that the goal of the filtering code I’m writing is to combine 2 sources of information for tracking the headset and controllers.

    The first piece of information is acceleration and rotation data from the IMU on each device, and the second is observations of the device position and orientation from 1 or more camera sensors.

    The IMU motion data drifts quickly (at least for position tracking) and can’t tell which way the device is facing (yaw), but can detect gravity and get pitch/roll.

    The camera observations can tell exactly where each device is, but arrive at a much lower rate (52Hz vs 500/1000Hz) and can take a long time (hundreds of milliseconds) to analyse when acquiring or re-acquiring a lock on the tracked device(s).

    The goal is to acquire tracking lock, then use the motion data to predict the motion closely enough that we always hit the ‘fast path’ of vision analysis. The key here is closely enough – the more closely the filter can track and predict the motion of devices between camera frames, the better.
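The predict/correct cycle behind this can be sketched in Python with a deliberately simplified filter. This is a plain one-dimensional linear Kalman filter, not the Unscented Kalman Filter used in OpenHMD. It tracks a single position value, it takes a velocity as its motion input instead of raw IMU acceleration, and all the numbers are made up.

```python
class ToyPositionFilter:
    """1-D linear Kalman filter: position estimate plus its variance."""

    def __init__(self, position: float, variance: float):
        self.position = position
        self.variance = variance

    def predict(self, velocity: float, dt: float, process_noise: float) -> None:
        """Dead-reckon from motion data; uncertainty grows with every step."""
        self.position += velocity * dt
        self.variance += process_noise

    def correct(self, observed: float, observation_noise: float) -> None:
        """Blend in a camera observation, weighted by relative uncertainty."""
        gain = self.variance / (self.variance + observation_noise)
        self.position += gain * (observed - self.position)
        self.variance *= 1.0 - gain


f = ToyPositionFilter(position=0.0, variance=1e-6)
# Made-up numbers: 20 motion steps at 1000 Hz between two camera frames.
for _ in range(20):
    f.predict(velocity=0.5, dt=0.001, process_noise=1e-7)
variance_before = f.variance
f.correct(observed=0.012, observation_noise=1e-6)
```

Each prediction step lets the estimate drift a little further from the truth, and each camera observation pulls it back and shrinks the uncertainty. The better the prediction between frames, the smaller that correction has to be.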

    Integration in OpenHMD

    When I wrote the last post, I had the filter running as a standalone application, processing motion trace data collected by instrumenting a running OpenHMD app and moving my headset and controllers around. That’s a really good way to work, because it lets me run modifications on the same data set and see what changed.

    However, the motion traces were captured using the current fusion/prediction code, which frequently loses tracking lock when the devices move – leading to big gaps in the camera observations and more interpolation for the filter.

    By integrating the Kalman filter into OpenHMD, the predictions are improved, leading to generally much better results. Here’s one trace of me moving the headset around reasonably vigorously with no tracking loss at all.

    Headset motion capture trace

    If it worked this well all the time, I’d be ecstatic! The predicted position matched the observed position closely enough for every frame for the computer vision to match poses and track perfectly. Unfortunately, this doesn’t happen every time yet, and definitely not with the controllers – although I think the latter largely comes down to the current computer vision having more trouble matching controller poses. They have fewer LEDs to match against compared to the headset, and the LEDs are generally more side-on to a front-facing camera.

    Taking a closer look at a portion of that trace, the drift between camera frames when the position is interpolated using the IMU readings is clear.

    Headset motion capture – zoomed in view

    This is really good. Most of the time, the drift between frames is within 1-2mm. The computer vision can only match the pose of the devices to within a pixel or two – so the observed jitter can also come from the pose extraction, not the filtering.

    The worst tracking is again on the Z axis – distance from the camera in this case. Again, that makes sense – with a single camera matching LED blobs, distance is the most uncertain part of the extracted pose.

    Losing Track

    The trace above is good – the computer vision spots the headset and then the filtering + computer vision track it at all times. That isn’t always the case – the prediction goes wrong, or the computer vision fails to match (it’s definitely still far from perfect). When that happens, it needs to do a full pose search to reacquire the device, and there’s a big gap until the next pose report is available.

    That looks more like this

    Headset motion capture trace with tracking errors

    This trace has 2 kinds of errors – gaps in the observed position timeline during full pose searches and erroneous position reports where the computer vision matched things incorrectly.

    Fixing the errors in position reports will require improving the computer vision algorithm and would fix most of the plot above. Outlier rejection is one approach to investigate on that front.

    Latency Compensation

    There is an inherent delay involved in processing the camera observations. Every 19.2ms, the headset emits a radio signal that triggers each camera to capture a frame. At the same time, the headset and controller IR LEDs light up brightly to create the light constellation being tracked. After the frame is captured, it is delivered over USB over the next 18ms or so and then submitted for vision analysis. In the fast case where we’re already tracking the device, the computer vision is complete in a millisecond or so. In the slow case, it’s much longer.

    Overall, that means that there’s at least a 20ms offset between when the devices are observed and when the position information is available for use. In the plot above, this delay is ignored and position reports are fed into the filter when they are available. In the worst case, that means the filter is being told where the headset was hundreds of milliseconds earlier.

    To compensate for that delay, I implemented a mechanism in the filter where it keeps extra position and orientation entries in the state that can be used to retroactively apply the position observations.

    The way that works is to make a prediction of the position and orientation of the device at the moment the camera frame is captured and copy that prediction into the extra state variable. After that, it continues integrating IMU data as it becomes available while keeping the auxiliary state constant.

    When the camera frame analysis is complete, that delayed measurement is matched against the stored position and orientation prediction in the state and the error is used to correct the overall filter. The cool thing is that in the intervening time, the filter covariance matrix has been building up the right correction terms to adjust the current position and orientation.
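
    The delayed-state idea described above can be sketched with a much simpler filter. The following is a minimal illustration only, assuming a 1-D linear Kalman filter with state [position, velocity, delayed_position] rather than the Unscented Kalman Filter used in OpenHMD; the class name, method names and noise values are hypothetical and not taken from the OpenHMD code.

```python
def mat_mul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*a)]


class DelayedKalman1D:
    """Hypothetical 1-D filter. State: [position, velocity, delayed_position]."""

    def __init__(self, accel_var=1.0, meas_var=1e-4):
        self.x = [0.0, 0.0, 0.0]
        self.P = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        self.accel_var = accel_var  # assumed IMU acceleration noise variance
        self.meas_var = meas_var    # assumed camera position noise variance

    def predict(self, accel, dt):
        """Integrate one IMU sample; the delayed slot is held constant."""
        F = [[1.0, dt, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
        g = [0.5 * dt * dt, dt, 0.0]  # how acceleration enters each state
        p, v, d = self.x
        self.x = [p + v * dt + g[0] * accel, v + g[1] * accel, d]
        FPFt = mat_mul(mat_mul(F, self.P), transpose(F))
        self.P = [[FPFt[i][j] + self.accel_var * g[i] * g[j]
                   for j in range(3)] for i in range(3)]

    def capture_frame(self):
        """Copy the current position (and its covariance) into the slot."""
        T = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
        self.x[2] = self.x[0]
        self.P = mat_mul(mat_mul(T, self.P), transpose(T))

    def apply_delayed_observation(self, observed_position):
        """Measure the delayed slot; cross-covariance corrects the rest."""
        s = self.P[2][2] + self.meas_var
        gain = [self.P[i][2] / s for i in range(3)]
        innovation = observed_position - self.x[2]
        slot_row = list(self.P[2])
        self.x = [self.x[i] + gain[i] * innovation for i in range(3)]
        self.P = [[self.P[i][j] - gain[i] * slot_row[j]
                   for j in range(3)] for i in range(3)]
```

    In this sketch, calling capture_frame() when the camera exposes, continuing to call predict() for each IMU sample, and then calling apply_delayed_observation() once the pose arrives corrects the current position and velocity as well as the stored one, because the covariance between the slot and the live state keeps evolving during the predict steps.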

    Here’s a good example of the difference:

    Before: Position filtering with no latency compensation
    After: Latency-compensated position reports

    Notice how most of the disconnected segments have now slotted back into position in the timeline. The ones that haven’t can either be attributed to incorrect pose extraction in the computer vision, or to not having enough auxiliary state slots for all the concurrent frames.

    At any given moment, there can be a camera frame being analysed, one arriving over USB, and one awaiting “long term” analysis. The filter needs to track an auxiliary state variable for each frame that we expect to get pose information from later, so I implemented a slot allocation system and multiple slots.

    The downside is that each slot adds 6 variables (3 position and 3 orientation) to the covariance matrix on top of the 18 base variables. Because the covariance matrix is square, the size grows quadratically with new variables. 5 new slots means 30 new variables – leading to a 48 x 48 covariance matrix instead of 18 x 18. That is a 7-fold increase in the size of the matrix (48 x 48 = 2304 vs 18 x 18 = 324) and unfortunately about a 10x slow-down in the filter run-time.

    At that point, even after some optimisation and vectorisation on the matrix operations, the filter can only run about 3x real-time, which is too slow. Using fewer slots is quicker, but allows for fewer outstanding frames. With 3 slots, the slow-down is only about 2x.
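
    The matrix-size arithmetic above is easy to reproduce for any number of slots. This small sketch uses only the figures quoted in the post (18 base variables, 6 variables per slot); the function and constant names are made up for illustration, and it says nothing about run-time, which depends on the implementation.

```python
BASE_STATE_VARS = 18  # base filter state size quoted in the post
VARS_PER_SLOT = 6     # 3 position + 3 orientation per delayed slot


def covariance_shape(slots):
    """Return (dimension, number of entries) of the covariance matrix."""
    n = BASE_STATE_VARS + VARS_PER_SLOT * slots
    return n, n * n


_, base_entries = covariance_shape(0)
for slots in (0, 3, 5):
    n, entries = covariance_shape(slots)
    ratio = entries / base_entries
    print(f"{slots} slots: {n} x {n} = {entries} entries ({ratio:.1f}x base)")
```

    With 3 slots the matrix is 36 x 36, four times the number of entries of the base matrix, compared with roughly seven times for 5 slots.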

    There are some other possible approaches to this problem:

    • Running the filtering delayed, only integrating IMU reports once the camera report is available. This has the disadvantage of not reporting the most up-to-date estimate of the user pose, which isn’t great for an interactive VR system.
    • Keeping around IMU reports and rewinding / replaying the filter for late camera observations. This limits the overall increase in filter CPU usage to double (since we at most replay every observation twice), but potentially with large bursts when hundreds of IMU readings need replaying.
    • It might be possible to only keep 2 “full” delayed measurement slots with both position and orientation, and to keep some position-only slots for others. The orientation of the headset tends to drift much more slowly than position does, so when there’s a big gap in the tracking it would be more important to be able to correct the position estimate. Orientation is likely to still be close to correct.
    • Further optimisation in the filter implementation. I was hoping to keep everything dependency-free, so the filter implementation uses my own naive 2D matrix code, which only implements the features needed for the filter. A more sophisticated matrix library might perform better – but it’s hard to say without doing some testing on that front.
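
    The rewind / replay option from the list above can be illustrated with a toy tracker. This is a sketch under heavy assumptions: it uses plain 1-D dead reckoning instead of a Kalman filter, it simply overwrites the position with the late observation instead of weighting it, and all names are hypothetical. It only shows the bookkeeping: keep the IMU history, rewind to the capture time, then replay.

```python
from collections import deque


class ReplayTracker:
    """Toy 1-D dead-reckoning tracker that can rewind and replay IMU data."""

    def __init__(self, history=1000):
        self.time = 0.0
        self.pos = 0.0
        self.vel = 0.0
        # Each entry: (time, pos, vel) before the sample, then (accel, dt).
        self.history = deque(maxlen=history)

    def add_imu(self, accel, dt):
        """Record the pre-sample state, then integrate the sample."""
        self.history.append((self.time, self.pos, self.vel, accel, dt))
        self.pos += self.vel * dt + 0.5 * accel * dt * dt
        self.vel += accel * dt
        self.time += dt

    def add_late_observation(self, obs_time, obs_pos):
        """Apply a position observed at obs_time; return samples replayed."""
        entries = list(self.history)
        # First stored sample that starts at or after the observation time.
        # An observation older than the buffer lands on the oldest entry.
        idx = next((i for i, e in enumerate(entries)
                    if e[0] >= obs_time - 1e-9), None)
        if idx is None:
            # Newer than every IMU sample: nothing to replay.
            self.pos = obs_pos
            return 0
        start_time, _, start_vel, _, _ = entries[idx]
        # Rewind to the state at capture time, with the corrected position.
        self.time, self.pos, self.vel = start_time, obs_pos, start_vel
        self.history = deque(entries[:idx], maxlen=self.history.maxlen)
        # Replay every IMU sample that arrived after the capture.
        for _, _, _, accel, dt in entries[idx:]:
            self.add_imu(accel, dt)
        return len(entries) - idx
```

    The return value makes the cost visible: a pose that arrives after a long full search forces a correspondingly long burst of replayed samples.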


    So far in this post, I’ve only talked about the headset tracking and not mentioned controllers. The controllers are considerably harder to track right now, but most of the blame for that is in the computer vision part. Each controller has fewer LEDs than the headset, fewer are visible at any given moment, and they often aren’t pointing at the camera front-on.

    Oculus Camera view of headset and left controller.

    This screenshot is a prime example. The controller is the cluster of lights at the top of the image, and the headset is lower left. The computer vision has gotten confused and thinks the controller is the ring of random blue crosses near the headset. It corrected itself a moment later, but those false readings make life very hard for the filtering.

    Position tracking of left controller with lots of tracking loss.

    Here’s a typical example of the controller tracking right now. There are some very promising portions of good tracking, but they are interspersed with bursts of tracking losses, and wild drifting from the computer vision giving wrong poses – leading to the filter predicting incorrect acceleration and hence cascaded tracking losses. Particularly (again) on the Z axis.

    Timing Improvements

    One of the problems I was looking at in my last post is variability in the arrival timing of the various USB streams (Headset reports, Controller reports, camera frames). I improved things in OpenHMD on that front, to use timestamps from the devices everywhere (removing USB timing jitter from the inter-sample time).

    There are still potential problems in when IMU reports from controllers get updated in the filters vs the camera frames. That can be on the order of 2-4ms jitter. Time will tell how big a problem that will be – after the other bigger tracking problems are resolved.


    All the work that I’m doing implementing this positional tracking is a combination of my free time, hours contributed by my employer Centricular and contributions from people via GitHub Sponsorships. If you’d like to help me spend more hours on this and fewer on other paying work, I appreciate any contributions immensely!

    Next Steps

    The next things on my todo list are:

    • Integrate the delayed-observation processing into OpenHMD (at the moment it is only in my standalone simulator).
    • Improve the filter code structure – this is my first Kalman filter and there are some implementation decisions I’d like to revisit.
    • Publish the UKF branch for other people to try.
    • Circle back to the computer vision and look at ways to improve the pose extraction and better reject outlying / erroneous poses, especially for the controllers.
    • Think more about how to best handle / schedule analysis of frames from multiple cameras. At the moment each camera operates as a separate entity, capturing frames and analysing them in threads without considering what is happening in other cameras. That means any camera that can’t see a particular device starts doing full pose searches – which might be unnecessary if another camera still has a good view of the device. Coordinating those analyses across cameras could yield better CPU consumption, and let the filter retain fewer delayed observation slots.

    A shell UX update

    Last month I shared an updated activities overview design, which is planned for the next GNOME release, version 40.

    The new design has prompted a lot of interest and comment, which we’re all really thrilled about. In this post I wanted to provide an update of where the initiative is at. I also want to take the opportunity to answer some of the common questions that have come up.

    Where we’re at

    Development work has moved rapidly since I blogged last, thanks mostly to a big effort by Georges. As a result, a lot of the basic elements of the design are now in place in a development branch. The following is a short screencast of the development branch (running in a VM), to give an idea of where the development effort has got to:

    There are still work items remaining and the branch has noticeable polish issues. Anyone testing it should bear this in mind – as it stands, it isn’t a complete reflection of the actual design.

    On the design side, we’ve been reviewing the feedback that has been provided on the design so far, and are tracking the main points as they’ve emerged. This is all really valuable, but we’d also suggest that people wait to try the new design before jumping to conclusions about it. We plan on making it easier to test the development version, and will provide details about how to do so in the near future.

    The roadmap from here is to develop the branch with the new design, open it up to testing, and have an intensive period of bug fixing and evaluation prior to the UI freeze in about a month’s time. As we progress it will become easier for people to get involved both in terms of design and development.

    What the design means for users

    In the rest of this post, I’m going to address some of the common questions and concerns that we’ve heard from people about the new design. My main goal here is to clear up any confusion or uncertainty that people might have.

    Argh, change!

    A good portion of the comments that we’ve had about the design reflect various concerns about existing workflows being disrupted by the design changes. We understand these concerns and an effort has been made to limit the scale and disruptiveness of the updated design. As a result, the changes that are being introduced are actually quite limited.

    Everything about the shell remains the same except for the overview, and even that is structurally the same as the previous version. The overview contains the same key elements – windows overview, search, the dash, the app grid – which are accessed in the same sequence as before. The old features that are tied to muscle memory will work just as before: the super key will open the overview, search will function as before, and the existing shortcuts for workspaces will continue to be supported.

    One piece of feedback that we got from initial testing is that testers often didn’t notice a massive difference with the new design. If you’re concerned about potential disruption, we’d encourage you to wait to try the design, and see how it behaves in practice. You might be surprised at how seamless the transition is.

    Advantages of the new design

    A few users have asked me: “so how is the new design better for me?” Which is a fair question! I’ll run through what I see as the main advantages here. Users should bear in mind that some of the improvements are particularly relevant to new rather than existing users – there are some positive impacts which you might not personally benefit from.

    Boot experience

    The boot experience is something that we’ve struggled with throughout GNOME 3, and with the new design we think we’ve cracked it. Instead of being greeted by a blank desktop (and then, a blank overview), when you boot into the new design, you’ll be presented with the overview and your favourite apps that you can launch. Overall, it’s a more welcoming experience, and is less work to use.

    I have been asked why this change isn’t possible with the existing shell UI. Couldn’t we just show the overview on boot, without making these other changes? Theoretically we could, but the new overview design is much better suited to being shown after boot: the layout provides a focus for action and places app launching more centrally. In contrast, the old shell design places launching on the periphery and does not guide the user into their session as effectively.

    Touchpad gestures

    Touchpad gestures can be incredibly effective for navigation, yet our gestures for navigating the shell have historically been difficult to use and lacking a clear schema. The new design changes that, by providing a simple, easy and coherent set of touchpad gestures for moving around the system. Up and down moves in and out of the overview and app grid. Left and right moves between workspaces. If you’re primarily using the touchpad, this is going to be a huge win and it’s a very easy way to move around.

    Easy workspaces

    In our user testing, the new workspace design demonstrated itself to be more engaging and easier to get to grips with than the old one. New users could easily understand workspaces as “screens” and found it easier to get started with them, compared to the current design which wasn’t as accessible.

    Feel and organisation

    Designers often talk about mental and spatial models, and the new design is stronger in both regards. What does this translate to for users? Mostly, that the new design will generally feel better. Everything fits together better, and is more coherent. Moving around the system should be more natural and intuitive.

    Other advantages

    Beyond those main advantages, the new design has some more minor plus points:

    • Personalised app grid – you can now fully rearrange the app grid to your liking, using drag and drop. This is something that we’ve been working on independently to the other changes, but has continued to evolve and improve this cycle, and it fits very nicely with the other overview changes.
    • App icons in the window overview – the window overview now shows the app icon for each window, to help with identification.
    • Improved app titles – we have a new behaviour for GNOME 40, which shows the full title of the application when hovering its launcher.

    Q & A

    The following are some of the other questions that have come up in comments about the designs. Many of these have been answered in place, and it seemed worthwhile to share the answers more widely.

    How will window drag and drop between workspaces work?

    The current design works by zooming out the view to show all workspaces when a window is dragged:

    Will I be able to search from the overview after pressing super?

    Yes, that won’t change.

    Will there be an option to restore the old design?

    We don’t plan on supporting this option, largely because of the work involved. However, there could of course be community extensions which restore some aspects of the old design (say, having a vertical dash along the side). We’re happy to work with extension developers to help this to happen.

    Please keep the hot corner!

    OK. 🙂 (We weren’t planning on removing it.)

    How will the new design affect multi-display setups?

    It should have very little impact on multi-monitor. The same behaviour with regards to workspaces will be supported that we currently have: by default, only the primary display has workspaces and secondary displays are standalone. We don’t anticipate any major regressions and have some ideas for how to improve multi-monitor support, but that might need to wait until a future release.

    Will the new design work OK with vertical displays?

    Yes, it will work just fine.


    That’s it for now. With this initiative proceeding quickly, we hope to have more updates soon. We also aim to provide another post with details on our user research in the not too distant future.

    January 06, 2021

    Think About your audience

    The new year has got me thinking of how much I have learnt so far about open source and GJS. Usually, contributing to an open source project for the first time is like stepping into the unknown: not knowing how the community will welcome you, how helpful the community members will be, or whether the skills you have are good enough for a start. In this blog post I will talk about how my journey has been so far, which might be useful if you are thinking of contributing to the GJS debugger.

    Some months ago, I submitted an initial application for the May 2020 round of the Outreachy internships, not knowing exactly if I was ready for the journey ahead. Unfortunately I didn’t get through to the contributions phase, but the little experience I gained from going through the lists of organisations that participated in previous rounds and checking out some of their repositories helped me become more familiar with open source contribution.

    When the contributions phase for the December 2020 – March 2021 round started, so much confusion set in. Which organisation should I choose, why should I choose it, and what strategy should I use to get selected? These were the questions I kept asking myself. Before long, I decided to choose something that, in my opinion, was not only challenging but would also make me feel like part of something great. Because I had developed a special relationship with JavaScript, mainly because it was the first language that helped me deeply understand what programming meant, I chose the GJS project and another project which required JavaScript. I finally put all of my eggs in the GJS basket when I realized that it was a developer tool, and because of how quickly my mentor and other community members helped me out when I got stuck. This experience alone made me understand that the community is open to everyone. You just need to be willing to put in the time and be open enough to ask questions.

    The GJS community is part of the GNOME community, so it is more appropriate to mention GJS together with GNOME. GJS is GNOME’s very own JavaScript binding, built on the SpiderMonkey JavaScript engine (see https://mozilla-spidermonkey.github.io/ to learn more). My project involves working on the GJS debugger to improve its debugging experience; a debugger is a computer program that lets you uncover and diagnose problems in other computer programs. For clarity’s sake, GNOME is an easy-to-use graphical user interface and a set of desktop applications for Unix-based operating systems, including Gedit (text editor), Builder (IDE) and Polari (chat application), to mention just a few. If you install a Linux distribution like Ubuntu or Fedora, then what you see on your desktop is GNOME (see https://www.gnome.org/gnome-3/). For more information about GNOME, visit its Wikipedia page at https://en.wikipedia.org/wiki/GNOME.

    During the last couple of weeks, I have learnt a lot, from better coding practices to new terms such as stack frame, backtrace and bindings. When I started contributing to this project, drawing a line between a developer using the tool and a developer developing it was confusing. This is partly because I only started using it when I started contributing to it. Now I clearly understand why user acceptance testing is very important in developing any application. There are so many things a user can see that the developer of an application will not see. To anyone who plans on contributing to GJS or any other project, I strongly suggest you test it and try to understand it as a normal user would, so that you can clearly see the modifications that need to be made before immersion in the code takes away the fresh perspective of an end user.

    From all that has been said, here are the key points to note as a new contributor to the GJS project:

    • The project does seem more challenging than it really is in the beginning. It is generally a good practice to give yourself some room to try and fail so that you can then be able to ask questions from the errors that you get.
    • Again, it is good to test the project as a normal user so it helps give you ideas on possible modifications that can be made without letting the fear of how challenging it might be to achieve them limit you. Users are free minded and are often only concerned with something doing what they want it to do and not how it was made to do that.

    January 05, 2021

    Quick review of Lenovo Yoga 9i laptop

    Some time ago I pondered getting a new laptop. Eventually I bought a Lenovo Yoga 9i, which ticked pretty much all the boxes. I also considered a Dell 9310 but decided against it for two reasons. Firstly, several reviews say that the keyboard feels bad, with too shallow a movement. Secondly, Dell's web site for Finland does not actually sell computers to individuals, only corporations, and their retailers did not have any of the new models available.

    The hardware

    It's really nice. Almost everything you need is there, such as USB A and C, touch screen, pen, 16GB of RAM, Tiger Lake CPU, Xe graphics and so on. The only real missing things are a microSD card slot and an HDMI port. The trackpad is nice, with multitouch working flawlessly in e.g. Firefox. You can only do right click by clicking on the right edge rather than clicking with two fingers, but that's probably a software limitation (of Windows?). The all glass trackpad surface is a bit of a fingerprint magnet, though.

    There are two choices for the screen, either FullHD or 4k. I took the latter because once you have experienced retina, you'll never go back. This reduces battery life, but even the 4k version gets 4-8 hours of battery life, which is more than I need. The screen itself is really, really nice apart from the fact that it is extremely glossy, almost like a mirror. Colors are very vibrant (to the point of being almost too saturated in some videos) and bright. Merely looking at the OS desktop background and app icons feels nice because the image is so sharp and calm. As a negative point just looking at Youtube videos makes the fan spin up. 

    The touchscreen and pen work as expected, though pen input is broken in Windows Krita by default. You need to change the input protocol from the default to the other option (whose actual name I don't remember).

    When it comes to laptop keyboards, I'm very picky. I really like the 2015-era MBPro and Thinkpad keyboards. This keyboard is not either of those two but it is very good. The key travel is slightly shallower and the resistance is crisper. It feels pleasant to type on.

    Linux support

    This is ... not good. Fedora live USBs do not even boot, and an Ubuntu 20.10 live USB has a lot of broken stuff, but surprisingly wifi works nicely. Things that are broken include:
    • Touchscreen
    • 3D acceleration (it uses LLVM softpipe instead)
    • Trackpad
    • Pen
    The trackpad bug is strange. Clicking works, but motion does not unless you push it with a very, very, very specific amount of pressure that is incredibly close to the strength needed to activate the click. Once the click activates, motion breaks again. In practice it is unusable.

    All of these are probably due to the bleeding-edgeness of the hardware and will probably be fixed in the future. For the time being, though, it is not really usable as a Linux laptop.

    In conclusion

    This is the best laptop I have ever owned. It may even be the best one I have ever used.

    Devhelp on Fedora Silverblue

    I have recently switched to Fedora Silverblue. The recommended way to do development is to use Fedora Toolbox containers, so I have started using them and installed the various development packages there. I like the Devhelp application for browsing the API documentation, so I installed it via GNOME Software. The problem is that the Devhelp application started from GNOME Shell doesn’t see the documentation files, which are located inside the Toolbox container. This is probably expected, but it is annoying. Starting the Devhelp application from the terminal with toolbox run flatpak run org.gnome.Devhelp is pretty cumbersome.

    To solve this issue, I simply copied the org.gnome.Devhelp.desktop file from /var/lib/flatpak/app/org.gnome.Devhelp/current/active/files/share/applications to ~/.local/share/applications and modified the Exec and DBusActivatable lines the following way:

    Exec=toolbox run flatpak run org.gnome.Devhelp

    Now I can easily start the Devhelp application from GNOME Shell to see all the documentation files from the Toolbox container. I hope that this short post helps other people. Let me know in the comments if there is a better way to achieve this…
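
    For anyone who would rather script the edit, here is a minimal sketch of the same change. The Exec value is the one given above; the new DBusActivatable value is not shown in the post, so setting it to false is an assumption made here (so that the Exec line is actually used), and the function name is hypothetical.

```python
def toolbox_desktop_entry(desktop_text,
                          exec_line="toolbox run flatpak run org.gnome.Devhelp"):
    """Return desktop-file text with Exec and DBusActivatable rewritten.

    Only keys in the main [Desktop Entry] group are touched, so any
    [Desktop Action ...] groups keep their own Exec lines.
    """
    out = []
    in_main_group = False
    for line in desktop_text.splitlines():
        if line.startswith("["):
            in_main_group = line.strip() == "[Desktop Entry]"
        elif in_main_group and line.startswith("Exec="):
            line = "Exec=" + exec_line
        elif in_main_group and line.startswith("DBusActivatable="):
            # Assumption: disable D-Bus activation so Exec takes effect.
            line = "DBusActivatable=false"
        out.append(line)
    return "\n".join(out) + "\n"
```

    The result would be written to a copy of org.gnome.Devhelp.desktop under ~/.local/share/applications, as described above.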

    Welcome to 2021!

    Since mid-November, we’ve been running a fundraiser that ends today, January 5. We’re thankful for everyone who donated to support our work.

    Welcome to 2021! A new year feels like a time for new beginnings, even though the challenges from 2020 still hang over us. But in the midst of all this, we continue to build free software and a welcoming, supportive community. We do this because we know that even in a world with issues that are so immediately pressing, we must also ensure that the foundations of technology are things that empower people, that people can trust, and that we can continue to use for the hard, amazing, inspiring work still needed.

    GNOME helps users. We believe strongly that in order to create good technology, it must be trustworthy. We do this through the creation of world class technology that meets the needs of users — GNOME works for everyday people. This also means that people know a technology is working in their best interests. With rigorous scientific methods and passionate end user advocacy, GNOME is designed for users, by users.

    We dedicated 2020 to making sure that GNOME software works for everyone through a focus on accessibility. This work is certainly not finished, but we’re proud of how far we’ve come. With the newest release of GTK4, we’ve completely revamped our accessibility toolkit. The updated layout implementation creates new possibilities for designing interfaces for a variety of user needs and preferences. We know that GNOME must be usable by everyone, whether the barrier is disability or simply geography. There are more than 140 translations of GNOME in progress, which helps us reach the billions of people who do not speak English.

    GNOME helps people making technology. GTK, a GNOME Foundation project, is a complete set of UI elements implemented to make all sorts of software usable. Since everything in GNOME is free software, not only is it available to people working on software, but the parts, the code, and the designs are available as well. Anyone can look at how any part of GNOME is constructed and reuse that work. We’re excited to hear more about the ways you use GNOME tools to build a better world.

    GNOME helps its contributors. We cannot stress enough the impact GNOME has on the lives of individual community members. With both mentorship and internships, GNOME helps people break into tech and move to the next stage of their careers, whether that calls for technical skills, social skills, public speaking, project management, writing, or anything else required to make a project as large and complete as GNOME succeed. Working on GNOME builds confidence for contributors. People learn to trust their skills and intuitions. They learn that what they do is valuable to the world at large.

    GNOME is not just an end, but a means to give people the tools, skills, and resources they need to create a brighter future. We expect 2021 will be a challenging year, but one we have high hopes for. We’re going to continue to build amazing things thanks to the support of our donors, contributors, and supporters.

    A new release of nsntrace

    During the holidays I managed to get some time to hack on nsntrace. I have mentioned nsntrace on this blog in the past but you might need a refresher:

    The nsntrace application uses Linux network namespaces and iptables magic to perform network traces of a single application. The traces are saved as pcap files and can be analyzed by, for instance, wireshark or tshark.

    In May of 2020 nsntrace moved to a collaborative maintenance model and to its own GitHub organisation. This helped a lot with my guilt for neglecting its maintenance. We even managed to get some pull requests merged and some issues solved.

    For the new release we landed some notable pull requests.

    Handle loopback nameservers

    On many systems the nameserver functionality is handled by an application such as systemd-resolved or dnsmasq, and the nameserver address in resolv.conf is a loopback address (in the 127.x.y.z range) where that application listens for incoming DNS queries.

    This would not work for us since the network namespace we create has its own separate loopback device and cannot reach the one where systemd-resolved or dnsmasq listens. If somebody reading this has a solution to this problem, please let us know in the comments or via GitHub!

    We ended up doing two things. First, we added a warning for when we detect only loopback nameservers. Second, we now offer a new --use-public-dns command line option to override the system resolv.conf in our namespace.

    $ sudo ./src/nsntrace ping lwn.net
    Warning: only loopback (127.x.y.z) nameservers found.
    This means we will probably not be able to resolve hostnames.
    Consider passing --use-public-dns to use public nameservers.

    Starting network trace of 'ping' on interface wlp58s0.
    Your IP address in this trace is
    Use ctrl-c to end at any time.
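
    The loopback check behind that warning can be sketched roughly as follows. This is an illustration in Rust, not code taken from nsntrace; the function names and the sample addresses in the usage note are assumptions made for the example.

```rust
use std::net::IpAddr;

/// Collect the addresses from the `nameserver` lines of resolv.conf-style text.
/// Lines that are comments, other directives, or unparsable addresses are skipped.
fn nameservers(resolv_conf: &str) -> Vec<IpAddr> {
    resolv_conf
        .lines()
        .filter_map(|line| {
            let mut words = line.split_whitespace();
            match (words.next(), words.next()) {
                (Some("nameserver"), Some(addr)) => addr.parse::<IpAddr>().ok(),
                _ => None,
            }
        })
        .collect()
}

/// True when at least one nameserver is configured and every one of them
/// is a loopback address (127.0.0.0/8 or ::1).
fn only_loopback_nameservers(resolv_conf: &str) -> bool {
    let servers = nameservers(resolv_conf);
    !servers.is_empty() && servers.iter().all(|ip| ip.is_loopback())
}
```

    Feeding it a resolv.conf that contains only "nameserver 127.0.0.53" would trigger the warning, while one that also lists a public resolver such as 1.1.1.1 would not.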

    In order to have our own nameservers for our network namespace we will bind mount our own resolv.conf file, with public nameservers from Cloudflare, OpenDNS and Quad9, over the one in /etc/resolv.conf.

    But we do not want to affect other applications outside of our namespaced application, so we enter a mount namespace (CLONE_NEWNS) before we do the bind mount.

    This, however, is not enough. Before the bind mount we need to remount the root partition with special mount flags to make sure our changes to the subtree are not propagated to the outside mount namespace.

    This is basically how the ip-netns tool allows for namespace specific nameservers as well.

    Treat lone dash ("-") output filename as stdout

    This started as an issue from a user that reported:

    I am using nsntrace to dump and analyze a processes network traffic in real time. In order to do that, I need to get the network dump piped to another process, not written to a file.

    The user requested that nsntrace follow the convention of treating a lone dash (‘-’) as a filename to mean output to stdout. As an aside, does anyone know where that convention originated?

    We got this implemented and we made sure to write all non-pcap output to stderr in the case of a lone dash filename being detected. We also made sure that the application under trace behaves by replacing its stdout file descriptor with stderr, using the dup2 syscall.
    This change enables us to do live packet capture using tshark.

    $ sudo nsntrace -o - wget lwn.net | tshark -r - -Y 'http.response or http.request'
    13 0.467670 → SSDP 214 M-SEARCH * HTTP/1.1
    16 0.541178 → HTTP 200 GET / HTTP/1.1
    20 0.747923 → HTTP 448 HTTP/1.1 302 Moved Temporarily (text/plain)
    41 1.468987 → SSDP 214 M-SEARCH * HTTP/1.1
    71 2.472515 → SSDP 214 M-SEARCH * HTTP/1.1
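
    A minimal sketch of the lone-dash convention, written in Rust for illustration rather than taken from nsntrace. The `Output` type and the helper names are hypothetical, and the dup2 step for the traced application is deliberately left out.

```rust
use std::fs::File;
use std::io::{self, Write};

/// Where the pcap stream should be written.
#[derive(Debug, PartialEq)]
enum Output {
    Stdout,
    Path(String),
}

/// Treat a lone dash as "write to stdout"; anything else is a file name.
fn parse_output(arg: &str) -> Output {
    if arg == "-" {
        Output::Stdout
    } else {
        Output::Path(arg.to_string())
    }
}

/// Open the chosen destination as a generic writer.
fn open_output(target: &Output) -> io::Result<Box<dyn Write>> {
    match target {
        Output::Stdout => Ok(Box::new(io::stdout())),
        Output::Path(path) => Ok(Box::new(File::create(path)?)),
    }
}

/// Status messages must not pollute the pcap stream, so they move to
/// stderr whenever stdout carries the capture.
fn status(target: &Output, message: &str) {
    match target {
        Output::Stdout => eprintln!("{}", message),
        Output::Path(_) => println!("{}", message),
    }
}
```

    Keeping the decision in one small type means every later write can ask the same question: does stdout belong to the capture or to the user?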

    Give the network namespace a name

    To give a network namespace a name in the eyes of tools like ip-netns we need to bind mount /proc/<pid>/ns/net to /run/netns/NAME. It is described in the man page of ip-netns:

    By convention a named network namespace is an object at /run/netns/NAME that can be opened. The file descriptor resulting from opening /run/netns/NAME refers to the specified network namespace. Holding that file descriptor open keeps the network namespace alive. The file descriptor can be used with the setns(2) system call to change the network namespace associated with a task.

    By implementing this, the nsntrace network namespace will show up when we use ip-netns.

    $ ip netns list

    And we might even use ip-netns to enter the same network namespace as a running nsntrace process.

    By adding a snapcraft.yaml file we enable building a snap package of nsntrace and adding it to the snapcraft app store. This might mean more people can find nsntrace, and it gives the maintenance team a bit more control over the distribution of nsntrace.

    Thanks for reading! And check out the new release!

    January 01, 2021

    That year we’ll all remember

    So we’ve made it through 2020, a year where everyone’s “wrap-up writings” will likely be more similar than ever.

    The Virus

    Let’s first address the microscopic elephant in the unventilated rooms. This section requires no introduction, though.
    Looking back, going to FOSDEM in the first weekend of February now seems completely crazy, especially knowing now that the virus was already in Europe then. I wonder how many of us got it with mild symptoms back then, and assumed we were having the infamous “FOSDEM flu”.

    I am lucky that the confinement apparently didn’t affect me too much. Prior to Kinvolk, I had been working remotely for several years, so I was already used to the loneliness of this way of working. Besides, in Berlin we were living in a house with a small backyard where the kids could play, so we were lucky in that regard as well.
    Of course working with the kids at home is never the same as working alone, and it was not great for the kids to be for such a long time away from their friends. Like everyone, I do have many stories related to the confinement rules, but I will refrain from writing those in this post.

    Two Fladenbrot (a Turkish flatbread). As expected, they're a bit flatter than normal bread, and have dark and light sesame seeds on top.
    I became a world-renowned baker, like everybody else, during the pandemic. Behold my delicious Fladenbrot.

    The return

    After our son was born (almost 4 years ago), we started entertaining the idea of moving back to Portugal. There were several reasons for this: our daughter was starting school (which means moving later would be more complicated for her); we grew up with our grandparents around and would like our children to experience the same; an occasional frustration with Berlin; and the different way we look at our own country after more than 10 years living abroad.
    Of course, Helena’s getting sick last year was also made harder by being away from family, and put things into perspective.
    So this summer we actually moved back!

    As with all the moves we’ve done (we have lived in 4 European countries), the most difficult thing is leaving our friends. Berlin has been the place where we stayed the longest (after University), and despite any of the love/hate feelings towards Berlin, it will always be a special place for us, and the birthplace of our son.

    The move was as stressful as any international move, with the extra concern of crossing 3 EU inner borders that had been closed not long before our departure date.
    Leaving an apartment in Berlin is a whole ordeal (of rules, repairs, and sometimes pettiness), and like many other people will tell you about their experience, we did have some problems with the renting company. It all got solved thanks to the tireless help of our great neighbors, so I must give a heartfelt shout-out to Ilka, Martin, Fernando, and Stefan/Susie, who are simply the best! I hope 2021 will allow us to travel back there at some point (without it being the unrecommended quest it is at the moment).


    A good friend of mine once told me this: all places are the same.
    I realize now that it means that when you move to a different place, there are always better things and worse things in comparison, but there is some kind of balance after one adapts (and thus it means all places can feel the same in the end).

    Besides the whole country, language, and culture, we’ve also changed to a much smaller city (Lagos) where we have my wife’s family around, and that has many advantages for us as parents. But I can leave more details about this for some dedicated post later in the year.


    I continue to proudly work on Kinvolk’s great products, and indeed, I am thankful that Kinvolk is a remote first company.
    Where I initially had some concerns regarding working with friends and moving into such different projects from what I had been doing in recent years, those feelings are gone and I just honestly feel very lucky, excited, and proud to be contributing to an amazing company with nice people.

    Like most companies this year, Kinvolk also had to adapt its plans, but finished the year very positively. Some highlights from Kinvolk are the new company website, the consolidation of Flatcar as the continuation of CoreOS’s original vision, Headlamp (a new Kubernetes UI project), and the great Volks who joined the company in 2020.


    As I wrote last year, I didn’t expect to have any time to devote to tech stuff outside of work and that was certainly true. I even let the GNOME Foundation membership expire during the preparations for the move… But let’s see how the year develops.

    Wrap up

    I will be cautious with the traditional great expectations for the new year this time, so see you later (in some Zoom call I guess)!

    A couple sitting on a bench and hugging each other in front of a landscape of great mountains resembling the Alps a bit during summer.
    Here’s a picture of an old couple in 2020 enjoying the beautiful views in Hornberg, Germany. Taken during our trip to Portugal.

    December 30, 2020

    Search Joplin notes from GNOME Shell

    One of my favourite discoveries of 2020 is Joplin, an open, comprehensive notebook app. I’m slowly consolidating various developer journals, Zettelkasten inspired notes, blog drafts, Pinboard bookmarks and abstract doodles into Joplin notebooks.

    Now that it’s there, I want to search it from the GNOME Shell overview, and that’s pretty fun to implement.

    It’s available from here and needs to be installed manually with Meson. Perhaps one day this can ship with Joplin itself, but there are a few issues to overcome first:

    • It’s not yet possible to open a specific note in Joplin. I suggested adding a commandline option and discovered they plan to add x-callback-url support, which will be great once there’s a design for how it should work.
    • The search provider also appears as an application in the Shell. I think a change in GNOME Shell is needed before we can hide this.

    Here’s to the end of 2020. If you’re bored, here’s a compilation of unusual TV news events from the year, including (my favourite) #9, a guy playing piano to monkeys.

    December 28, 2020

    phosh overview

    phosh is a graphical shell for mobile, touch-based devices like smartphones. It's the default graphical shell on Purism's Librem 5 (and that's where it came to life), but projects like postmarketOS, Mobian and Debian have picked it up, putting it into use on other devices as well and contributing patches.

    This post is meant as a short overview of how things are tied together so further posts can provide more details.

    A PHone SHell

    phosh's overview 2, phosh's lockscreen, phosh's overview 1

    As a mobile shell, phosh provides the interface components commonly found on mobile devices to:

    • launch applications
    • switch between running applications and close them
    • lock and unlock the screen
    • display status information (e.g. network connectivity, battery level)
    • provide quick access to things like torch or Bluetooth
    • show notifications

    It uses the GObject object system and GTK to build up the user interface components. Mobile-specific patterns are brought in via libhandy.

    Since phosh is meant to blend into GNOME as seamlessly as possible it uses the common interfaces present there via D-Bus like org.gnome.Screensaver or org.gnome.keyring.SystemPrompter and retrieves user configuration like keybindings via GSettings from preexisting schema.

    The components of a running graphical session roughly look like this:

    phosh session

    The blue boxes are the very same found on GNOME desktop sessions while the white ones are currently only found on phones.

    feedbackd is explained quickly: It's used for providing haptic or visual user feedback and makes your phone rumble and blink when applications (or the shell) want to notify the user about certain events like incoming phone calls or new messages. What about phoc and squeekboard?

    phoc and squeekboard

    Although some stacks combine the graphical shell with the display server (the component responsible for drawing applications and handling user input) this isn't the case for phosh. phosh relies on a Wayland compositor to be present for that. Keeping shell and compositor apart has some advantages like being able to restart the shell without affecting other applications but also adds the need for some additional communication between compositor and shell. This additional communication is implemented via Wayland protocols. The Wayland compositor used with phosh is called phoc for PHone Compositor.

    One of these additional protocols is wlr-layer-shell. It allows the shell to reserve space on the screen that is not used by other applications and allows it to draw things like the top and bottom bar or lock screen. Other protocols used by phosh (and hence implemented by phoc) are wlr-output-management, to get information on and control properties of monitors, and wlr-foreign-toplevel-management, to get information about other windows on the display. The latter is used to allow switching between running applications.

    However these (and other) Wayland protocols are not implemented in phoc from scratch. phoc leverages the wlroots library for that. The library also handles many other compositor parts like interacting with the video and input hardware.

    The details on how phoc actually puts things up on the screen deserve a separate post. For the moment it's sufficient to note that phosh requires a Wayland compositor like phoc.

    We've not talked about entering text without a physical keyboard yet - phosh itself does not handle that either. squeekboard is the on screen keyboard for text (and emoji) input. It again uses Wayland protocols to talk to the Wayland compositor and it's (like phosh) a component that wants exclusive access to some areas of the screen (where the keyboard is drawn) and hence leverages the layer-shell protocol. Very roughly speaking it turns touch input in that area into text and sends that back to the compositor that then passes it back to the application that currently gets the text input. squeekboard's main author dcz has some more details here.

    The session

    So how does the graphical session in the picture above come into existence? As this is meant to be close to a regular GNOME session it's done via gnome-session that is invoked somewhat like:

    phoc -E 'gnome-session --session=phosh'

    So the compositor phoc is started up, launches gnome-session which then looks at phosh.session for the session's components. These are phosh, squeekboard and gnome-settings-daemon. These then either connect to already running services via D-Bus (e.g. NetworkManager, ModemManager, ...) or spawn them via D-Bus activation when required (e.g. feedbackd).

    Calling conventions

    So when talking about phosh it's good to keep several things apart:

    • phosh - the graphical shell
    • phoc - the compositor
    • squeekboard - the on screen keyboard
    • phosh.session: The session that ties these and GNOME together

    On top of that, people sometimes refer to 'Phosh' as the software collection consisting of the above plus more components from GNOME (Settings, Contacts, Clocks, Weather, Evince, ...) and components that currently aren't part of GNOME but adapt to small screen sizes, use the same technologies and are needed to make a phone fun to use, e.g. Geary for email, Calls for making phone calls and Chats for SMS handling.

    Since just overloading the term Phosh is confusing, GNOME/Phosh Mobile Environment or Phosh Mobile Environment have been used to describe the above collection of software. I've contacted GNOME on how to name this properly, so as not to infringe on the GNOME trademark but also to give proper credit and, hopefully, to be able to move upstream the things that can live upstream.

    That's it for a start. phosh's development documentation can be browsed here but is also available in the source code.

    Besides the projects mentioned above credits go to Purism for allowing me and others to work on the above and other parts related to moving Free Software on mobile Linux forward.

    Advent of Rust 25: Baby Steps

    It’s the final post in the series chronicling my attempt to teach myself the Rust programming language by solving programming puzzles from Advent of Code 2020.

    Day 25, Part 1

    Today’s puzzle is about cracking an encryption key, in order to get at a piece of secret information (called loop size in the puzzle) by taking a piece of known public information (public key) and reversing the algorithm used to generate it. Of course, the algorithm (called transform subject number) is not easy to reverse, and that’s what the puzzle is about.

    The puzzle description suggests guessing the loop size by trial and error which I am skeptical about, but this is the code that would do that by brute force:

    fn transform_subject_number(subject_number: u64, loop_size: usize) -> u64 {
        let mut value = 1;
        for _ in 0..loop_size {
            value *= subject_number;
            value %= 20201227;
        }
        value
    }

    // Try every loop size in turn until the transform matches the public key.
    fn guess_loop_size(public_key: u64) -> usize {
        for loop_size in 1.. {
            if transform_subject_number(7, loop_size) == public_key {
                return loop_size;
            }
        }
        panic!("Not reachable");
    }

    struct Party {
        loop_size: usize,
    }

    impl Party {
        fn public_key(&self) -> u64 {
            transform_subject_number(7, self.loop_size)
        }

        fn encryption_key(&self, other_public_key: u64) -> u64 {
            transform_subject_number(other_public_key, self.loop_size)
        }
    }

    fn main() {
        let card_public_key = 2084668;
        let door_public_key = 3704642;
        let card = Party {
            loop_size: guess_loop_size(card_public_key),
        };
        let door = Party {
            loop_size: guess_loop_size(door_public_key),
        };
        println!("{}", card.encryption_key(door.public_key()));
    }

    This is taking a long time. I’m guessing that this is not the way to do it.

    I notice that, were it not for integer overflow, we’d be able to write the transform subject number result as $S^L \pmod{20201227}$ (where $S$ is the subject number and $L$ is the loop size). So, the total of what we know is this:

    • $P_c \equiv 7^{L_c} \pmod{20201227}$
    • $P_d \equiv 7^{L_d} \pmod{20201227}$
    • $P_c^{L_d} \equiv P_d^{L_c} \pmod{20201227}$

    where P is the public key, and the subscript c or d indicates card or door. The symbol “≡” means “congruent with” although I had to look it up on Wikipedia. Since I’m not even using all of this information in the trial and error implementation, I’m not surprised that it isn’t working.
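
    The third congruence is what makes the handshake work: both sides arrive at the same key. A small sketch can check this with the same trial-free transform as above. The public key 5764801 and the loop sizes 8 and 11 are assumed from the puzzle's worked example and do not appear in this post; 17807724 is the example value used further down.

```rust
const MODULUS: u64 = 20201227;

/// The puzzle's "transform subject number" step: subject^loop_size mod MODULUS,
/// computed by repeated multiplication.
fn transform(subject: u64, loop_size: u64) -> u64 {
    let mut value = 1;
    for _ in 0..loop_size {
        value = value * subject % MODULUS;
    }
    value
}

/// The key one party derives from its own secret loop size and the
/// other party's public key.
fn encryption_key(own_loop_size: u64, other_public_key: u64) -> u64 {
    transform(other_public_key, own_loop_size)
}

/// Both parties should end up with the same key, because
/// (7^Lc)^Ld and (7^Ld)^Lc are the same power of 7.
fn keys_agree(card_loop: u64, door_loop: u64) -> bool {
    let card_public = transform(7, card_loop);
    let door_public = transform(7, door_loop);
    encryption_key(card_loop, door_public) == encryption_key(door_loop, card_public)
}
```

    With the assumed example values, `transform(7, 8)` gives 5764801, `transform(7, 11)` gives 17807724, and `keys_agree(8, 11)` holds.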

    I’m sure this is a solved problem, just like the hexagons yesterday, so I start searching (although I’m not sure what to search for, I try things like “modulo inverse”) and eventually land on the Wikipedia article for Modular exponentiation, which explains “the task of finding the exponent […] when given [the base, modular exponentiation, and modulus] is believed to be difficult.” Thanks, I guess…

    These pages contain a little too much math all at once for my holiday-addled brain, especially because they confusingly use the notation $a^{-1}$ to mean… something that’s not $1/a$… so I decide to read a bit more on the topic hoping that something will sink in.

    Eventually I read the “Modular arithmetic” section on the Wikipedia article for Discrete logarithm and a light begins to dawn. They give an example of how to calculate the possible solutions using Fermat’s Little Theorem.[1] However, this approach turns out not to be useful for me because it already requires knowing one possible solution, and that’s exactly what I don’t have.
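
    To make that limitation concrete, here is a tiny sketch with small numbers chosen for illustration (base 3, target 13, prime 17; the function names are made up). Fermat’s Little Theorem says that $a^{p-1} \equiv 1 \pmod{p}$ for a prime $p$ that does not divide $a$, so once one exponent $k$ is known, $k + n(p-1)$ works too. It gives no help in finding the first $k$.

```rust
/// base^exponent mod modulus by repeated multiplication (fine for small inputs).
fn pow_mod(base: u64, exponent: u64, modulus: u64) -> u64 {
    let mut value = 1 % modulus;
    for _ in 0..exponent {
        value = value * base % modulus;
    }
    value
}

/// Given one known solution `k` of base^k ≡ target (mod prime), list the
/// first `count` exponents of the form k + n * (prime - 1).
fn other_solutions(k: u64, prime: u64, count: u64) -> Vec<u64> {
    (0..count).map(|n| k + n * (prime - 1)).collect()
}
```

    Here 3^4 = 81 leaves remainder 13 when divided by 17, so 4 is one solution, and `other_solutions(4, 17, 3)` yields 4, 20 and 36, each of which also maps to 13.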

    I do some more searching and find the Baby-step giant-step algorithm. I would probably have skipped over this if I had just stumbled upon the Wikipedia article without any context (because I don’t know what a finite abelian group is) but I reached it via another site with a bit more explanation of the missing link connecting the problem at hand to the Wikipedia article.[2]

    The problem is of the form $a^k \equiv b \pmod{n}$, where we have to find $k$. For the example in the puzzle description, we can fill in: $a = 7$, $b = 17807724$, $n = 20201227$.

    The first thing I do is replace transform_subject_number() with a more general pow_m() function using the algorithm that I already have, called the “memory-efficient algorithm” in the Wikipedia article, and check that the tests still pass:

    fn pow_m(base: u64, exponent: usize, modulus: u64) -> u64 {
        if modulus == 1 {
            return 0;
        }
        let mut value = 1;
        for _ in 0..exponent {
            value *= base;
            value %= modulus;
        }
        value
    }

    fn transform_subject_number(subject_number: u64, loop_size: usize) -> u64 {
        pow_m(subject_number, loop_size, 20201227)
    }

    Then I rewrite pow_m() to use the faster “right-to-left binary” algorithm from the Wikipedia article, and again check that the tests still pass:

    fn pow_m(base: u64, exponent: usize, modulus: u64) -> u64 {
        if modulus == 1 {
            return 0;
        }
        let mut value = 1;
        let mut mod_base = base % modulus;
        let mut mod_exponent = exponent;
        while mod_exponent > 0 {
            // Multiply in the current power of the base for every set bit.
            if mod_exponent % 2 == 1 {
                value *= mod_base;
                value %= modulus;
            }
            mod_exponent >>= 1;
            mod_base *= mod_base;
            mod_base %= modulus;
        }
        value
    }

    Next I rewrite guess_loop_size() to use the baby-steps giant-steps algorithm, as described by Wikipedia:

    use std::collections::HashMap;

    fn bsgs(base: u64, modulus: u64, result: u64) -> Option<usize> {
        let m = (modulus as f64).sqrt().ceil() as u64;
        // Baby steps: tabulate base^j for 0 <= j < m.
        let mut table = HashMap::new();
        let mut e = 1;
        for j in 0..m {
            table.insert(e, j);
            e *= base;
            e %= modulus;
        }
        // Giant steps: repeatedly multiply by base^(-m), i.e. base^(modulus - 1 - m).
        let factor = pow_m(base, (modulus - m - 1) as usize, modulus);
        let mut gamma = result;
        for i in 0..m {
            if let Some(j) = table.get(&gamma) {
                return Some((i * m + j) as usize);
            }
            gamma *= factor;
            gamma %= modulus;
        }
        None
    }

    fn guess_loop_size(public_key: u64) -> usize {
        bsgs(7, 20201227, public_key).unwrap()
    }

    The tests still pass, so that means it correctly handles the example code. I run this, it finishes almost immediately, and I get the right answer.
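
    One detail in the giant-step loop is worth spelling out: the exponent `modulus - m - 1` is how the code obtains the inverse of $a^m$. Because 20201227 is prime, Fermat’s Little Theorem gives $a^{p-1} \equiv 1$, so $a^{p-1-m} \cdot a^m \equiv 1$, which is exactly what the $a^{-1}$ notation means. The following sketch checks that property; it is a standalone illustration with made-up helper names, not part of the solution above.

```rust
/// Right-to-left binary modular exponentiation.
fn pow_mod(base: u64, mut exponent: u64, modulus: u64) -> u64 {
    let mut value = 1 % modulus;
    let mut base = base % modulus;
    while exponent > 0 {
        if exponent & 1 == 1 {
            value = value * base % modulus;
        }
        base = base * base % modulus;
        exponent >>= 1;
    }
    value
}

/// Size of one giant step: the ceiling of the square root of the modulus.
fn giant_step_size(modulus: u64) -> u64 {
    (modulus as f64).sqrt().ceil() as u64
}

/// base^(-m) modulo a prime, computed as base^(prime - 1 - m).
/// Only valid when `prime` really is prime and does not divide `base`.
fn inverse_power(base: u64, m: u64, prime: u64) -> u64 {
    pow_mod(base, prime - 1 - m, prime)
}
```

    For the puzzle’s modulus the step size comes out as 4495, and multiplying `pow_mod(7, 4495, 20201227)` by `inverse_power(7, 4495, 20201227)` leaves a remainder of 1.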

    Day 25, Part 2

    I’m not going to spoil the very last puzzle! I solve it without writing any code though.


    I found this puzzle one of the most difficult ones, as I’m not really into cryptography, and it required quickly getting acquainted with an area of mathematics that I had never encountered before. I don’t even think it would have been possible for me to do it, if I hadn’t had enough experience in other areas of mathematics that I at least knew some search terms that brought me in the right direction.

    It was difficult in a different way than the image-assembly puzzle on Day 20, though; that one was more like, you could easily see what needed to be done, but it was so tedious to get right. The hard part today was to find out what needed to be done, and once I had landed on the right Wikipedia page with an explanation of the algorithm, it was simple enough to implement. In a way it was similar to yesterday’s hexagons, but the topic of modular discrete logarithms was so much more difficult to absorb quickly than the topic of hexagons was.

    Reflections on the Whole Thing

    How do I feel about doing 50 puzzles in the past 28 days?
    First of all, if I do this again, I’m not going to keep up this pace and I’m not going to blog about it. It was fun while it lasted, but it’s really quite time-consuming, and it’s not even over yet — Rust is open source, so I have practically obligated myself in January to do the right thing and follow through with submitting all the suggestions for improvement of the Rust tools that I can glean from the earlier posts in the series.

    I enjoyed the VM-like puzzles the most, after that the parser ones,[3] and after that the ones that could be solved with interesting uses of iterators and Itertools. The cryptography puzzles like today’s, I didn’t enjoy that much. I appreciated that there were so many Game of Life variations as an homage to Conway who passed away this year, but as puzzles they were not so interesting; after I wrote the code to solve one, I basically just copied it for the subsequent puzzles. Conway’s Game of Life is so mesmerizing when you can watch it evolving, but since it wasn’t necessary for solving the puzzle I didn’t really feel like spending time building visualizations.

    I didn’t find the Reddit board very helpful unless I was actively looking for a hint from the hint threads, or looking for other people’s visualizations of the Game of Life puzzles. Reading all the posts from people who knew exactly the right algorithm to use, or solved the puzzle in mere minutes, or made fantastic visualizations, made me feel inadequate. Even though there were a few of these puzzles that I thought I did very well at, there were always people on the Reddit board who did better.[4]

    Can I program in Rust now?
    I’m not sure. Probably not. I know just enough that I should now go and read the Rust book, and I will get much more out of it than I would have if I had started out by reading the book.

    Would I recommend learning Rust?
    Certainly I found it very rewarding. It’s refreshing to see how they designed it to avoid or mitigate common classes of mistakes. Its tools and its standard library were a pleasure to use. That said, I would mainly recommend learning it if you are already experienced in another programming language. Put another way, I found learning Rust while knowing C++ as a reference point to be comparable to (what I imagine is the experience of) learning C++ while knowing JavaScript as a reference point. The things that you already know from the other language do serve you well, and give you a better footing in the new language, but there are also many new concepts that have no equivalent in the other language and you just don’t think about them explicitly. For C++, an example would be pointers, and for Rust an example would be borrows. Speaking for myself at least, I wouldn’t want to be learning borrowing at the same time as I was learning for-loops. Not to mention I’ve finished this whole series while still only having a vague idea of what a lifetime is!

    Is the hype justified?
    Although based on limited experience, at this point I believe the advice of using Rust for new projects where you would otherwise be using C or C++ is sound! I’m excited to use Rust for a project in the future. On the other hand, I don’t think Rust quite lives up to the hype of being absolutely safe because I was able to easily write code in almost all of these puzzles that would abort when given any sort of unexpected input, but it is at least true that those cases might instead be buffer overflows in C.

    Would I recommend learning programming by doing these puzzles?
    I’ve seen people recommend this while reading about Advent of Code this month, but honestly? I wouldn’t. If you are learning programming, start with something that’s straightforward and not tricky at all. Programming is tricky enough already by itself.

    To conclude this series, I will leave you with a fast-loading version of the Rust error types diagram that Federico sent me after reading my early posts, rendered from DOT and accessible without Google Drive. This diagram was really helpful for me along the way, big thanks to whoever first made it, so I’m glad I can re-share it in a hopefully improved format.

    Happy New Year, let’s hope for some better times in 2021!

    [1] Yes, that really is the name; is there also Fermat’s Big Theorem? ↩

    [2] I hesitate to link there because they have big banners saying “Get FAANG Ready With [our site]” which promotes a software engineering industry culture that I don’t support ↩

    [3] Hopefully that means I’m in the right job ↩

    [4] Or claimed to, in order to make others feel inadequate[5] ↩

    [5] No, I don’t have a very high opinion of the quality of discourse on Reddit, why do you ask? ↩

    Some things a potential Git replacement probably needs to provide

    Recently there has been renewed interest in revision control systems. This is great as improvements to tools are always welcome. Git is, sadly, extremely entrenched and trying to replace it will be an uphill battle. This is due not to technical but to social issues. What this means is that approaches like "basically Git, but with a mathematically proven model for X" are not going to fly. While having this extra feature is great in theory, in practice it is not sufficient. The sheer amount of work needed to switch a revision control system and the ongoing burden of using a niche, nonstandard system is just too much. People will keep using their existing system.

    What would it take, then, to create a system that is compelling enough to make the change? In cases like these you typically need a "big design thing" that makes the new system 10× better in some way and which the old system can not do. Alternatively the new system needs to have many small things that are better but then the total improvement needs to be something like 20× because the human brain perceives things nonlinearly. I have no idea what this "major feature" would be, but below is a list of random things that a potential replacement system should probably handle.

    Better server integration

    One of Git's design principles was that everyone should have all the history all the time so that every checkout is fully independent. This is a good feature to have and one that should be supported by any replacement system. However, it is not how revision control systems are commonly used. 99% of the time developers are working on some sort of a centralised server, be it GitLab, GitHub or a corporation's internal revision control server. The user interface should be designed so that this common case is as smooth as possible.

    As an example let's look at keeping a feature branch up to date. In Git you have to rebase your branch and then force push it. If your branch had any changes you don't have in your current checkout (because they were done on a different OS, for example), they are now gone. In practice you can't have more than one person working on a feature branch because of this (unless you use merges, which you should not do). This should be more reliable. The system should store, somehow, that a rebase has happened and offer to fix out-of-date checkouts automatically. Once the feature branch gets to trunk, it is ok to throw this information away. But not before that.

    Another thing one could do is that repository maintainers could mandate things like "pull requests must not contain merges from trunk to the feature branch" and the system would then automatically prohibit these. Telling people to remove merges from their pull requests and to use rebase instead is something I have to do over and over again. It would be nice to be able to prohibit the creation of said merges rather than manually detecting and fixing things afterwards.

    Keep rebasing as a first class feature

    One of the reasons Git won was that it embraced rebasing. Competing systems like Bzr and Mercurial did not and advocated merges instead. It turns out that people really want their linear history and that rebasing is a great way to achieve that. It also helps code review as fixes can be done in the original commits rather than new commits afterwards. The counterargument to this is that rebasing loses history. This is true, but on the other hand it also means that your commit history does not get littered with messages like "Some more typo fixes #3, lol." In practice people seem to strongly prefer losing that history to keeping such commits.

    Make it scalable

    Git does not scale. The fact that Git-LFS exists is proof enough. Git only scales in the original, narrow design spec of "must be scalable for a process that only deals in plain text source files where the main collaboration method is sending patches over email" and even then it does not do it particularly well. If you try to do anything else, Git just falls over. This is one of the main reasons why game developers and the like use other revision control systems. The final art assets for a single level in a modern game can be many, many times bigger than the entire development history of the Linux kernel.

    A replacement system should handle huge repos like these effortlessly. By default a checkout should only download those files that are needed, not the entire development history. If you need to do something like bisection, then files missing from your local cache (and only those) should be downloaded transparently during checkout operations. There should be a command to download the entire history, of course, but it should not be done by default.

    Further, it should be possible to do only partial checkouts. People working on low level code should be able to get just their bits and not have to download hundreds of gigs of textures and videos they don't need to do their work.

    Support file locking

    This is the one feature all coders hate: the ability to lock a file in trunk so that no-one else can edit it. It is disruptive, annoying and just plain wrong. It is also necessary. Practice has shown that artists at large either can not or will not use revision control systems. There are many studios where the revision control system for artists is a shared network drive, with file names like character_model_v3_final_realfinal_approved.mdl. It "works for them" and trying to mandate a more process heavy revision control system can easily lead to an open revolt.

    Converting these people means providing them with a better work flow. Something like this:
    1. They open their proprietary tool, be it Photoshop, Final Cut Pro or whatever.
    2. Click on GUI item to open a new resource.
    3. A window pops up where they can browse the files directly from the server as if they were local.
    4. They open a file.
    5. They edit it.
    6. They save it. Changes go directly in trunk.
    7. They close the file.
    There might be a review step as well, but it should be automatic. Merge requests should be filed and kept up to date without the need to create a branch or to even know that such a thing exists. Anything else will not work. Specifically, doing any sort of conflict resolution does not work, even if it were the "right" thing to do. The only way around this (that we know of) is to provide file locking. Obviously it should be possible to limit locking to binary files.
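
    As an illustration of locking that is limited to binary files, here is a minimal in-memory sketch. All names (`LockTable`, `LockError`, `BINARY_EXTENSIONS`) are hypothetical, and deciding what is "binary" by file extension is an assumption made to keep the example short; a real system would need persistent, server-side state and a better classification.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum LockError {
    /// The path is not a binary file, so it may not be locked.
    NotLockable,
    /// Somebody else already holds the lock.
    HeldBy(String),
}

/// Hypothetical lock table mapping a file path to the user holding its lock.
#[derive(Default)]
pub struct LockTable {
    locks: HashMap<String, String>,
}

impl LockTable {
    pub fn new() -> Self {
        Self::default()
    }

    /// Assumption: a file counts as binary if it has one of these extensions.
    fn is_binary(path: &str) -> bool {
        const BINARY_EXTENSIONS: [&str; 4] = ["psd", "mdl", "png", "mov"];
        match path.rsplit_once('.') {
            Some((_, ext)) => {
                let ext = ext.to_ascii_lowercase();
                BINARY_EXTENSIONS.contains(&ext.as_str())
            }
            None => false,
        }
    }

    /// Takes the lock for `user`, or reports why that is not possible.
    pub fn lock(&mut self, path: &str, user: &str) -> Result<(), LockError> {
        if !Self::is_binary(path) {
            return Err(LockError::NotLockable);
        }
        if let Some(owner) = self.locks.get(path) {
            if owner.as_str() != user {
                return Err(LockError::HeldBy(owner.clone()));
            }
        }
        self.locks.insert(path.to_string(), user.to_string());
        Ok(())
    }

    /// Releases the lock; returns false if `user` does not hold it.
    pub fn unlock(&mut self, path: &str, user: &str) -> bool {
        if self.locks.get(path).map(String::as_str) == Some(user) {
            self.locks.remove(path);
            true
        } else {
            false
        }
    }
}
```

    In the workflow above, opening a file from the artist's tool would call something like `lock`, and closing it would call `unlock`, so that two people can never edit the same asset at once.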

    Provide all functionality via a C API

    The above means that you need to be able to deeply integrate the revision control system with existing artist tools. This means plugins written in native code using a stable plain C API. The system can still be implemented in whatever SuperDuperLanguage you want, but its one true entry point must be a C API. It should be full-featured enough that the official command line client should be implementable using only functions in the public C API.

    Provide transparent Git support

    Even if a project would want to move to something else, the sad truth is that for the time being the majority of contributors only know Git. They don't want to learn a whole new tool just to contribute to the project. Thus the server should serve its data in two different formats: once in its native format and once as a regular Git endpoint. Anyone with a Git client should be able to check out the code and not even know that the actual backend is not Git. They should be able to even submit merge requests, though they might need to jump through some minor hoops for that. This allows you to do incremental upgrades, which is the only feasible way to get changes like these done.

    December 27, 2020

    Advent of Rust 24: A Hexagonal Tribute to Conway

    Today in the penultimate post from the chronicle of teaching myself the Rust programming language by doing programming puzzles from Advent of Code 2020: a hexagonal grid, and another homage to Conway, this time unexpected.

    But before that, I will finally solve that sea monster puzzle from Day 20.

    Day 20, Part 2 (Yet Again)

    First, as promised in the previous post, I will show the refactored code that assembles the full image.

    struct Solver {
        tiles: Vec<Tile>,
        connections: MultiMap<u64, u64>,
        corners: [u64; 4],
        used_tile_ids: HashSet<u64>,
    }

    impl Solver {
        fn new(tiles: &[Tile]) -> Self {
            // Record every pair of tiles that can connect along some side
            let mut connections = MultiMap::new();
            for (tile1, tile2) in tiles.iter().tuple_combinations() {
                if tile1.connection_side(tile2).is_some() {
                    connections.insert(tile1.id, tile2.id);
                    connections.insert(tile2.id, tile1.id);
                }
            }
            // Corners are the tiles with exactly two connections
            let corners: Vec<_> = tiles
                .iter()
                .map(|tile| tile.id)
                .filter(|id| match connections.get_vec(id).unwrap().len() {
                    2 => true,
                    3 | 4 => false,
                    _ => panic!("Impossible"),
                })
                .collect();
            Self {
                tiles: tiles.to_vec(),
                connections,
                corners: corners.try_into().unwrap(),
                used_tile_ids: HashSet::new(),
            }
        }

        fn find_and_orient_tile(&mut self, tile: &Tile, direction: Direction) -> Option<Tile> {
            let tile_connections = self.connections.get_vec(&tile.id).unwrap();
            let maybe_next_tile = self
                .tiles
                .iter()
                .filter(|t| tile_connections.contains(&t.id) && !self.used_tile_ids.contains(&t.id))
                .find_map(|candidate| tile.match_other(candidate, direction));
            if let Some(t) = &maybe_next_tile {
                self.used_tile_ids.insert(t.id);
            }
            maybe_next_tile
        }

        fn arrange(&mut self) -> Array2<u8> {
            // Find top left corner - pick an arbitrary corner tile and rotate it until
            // it has connections on the right and bottom
            let mut tl_corner = self
                .tiles
                .iter()
                .find(|tile| self.corners.contains(&tile.id))
                .unwrap()
                .clone();
            self.used_tile_ids.insert(tl_corner.id);
            let mut tl_corner_connections = vec![];
            for possible_edge in &self.tiles {
                match tl_corner.connection_side(&possible_edge) {
                    None => continue,
                    Some(dir) => tl_corner_connections.push(dir),
                }
            }
            tl_corner = tl_corner.rot90(match (tl_corner_connections[0], tl_corner_connections[1]) {
                (Direction::RIGHT, Direction::BOTTOM) | (Direction::BOTTOM, Direction::RIGHT) => 0,
                (Direction::LEFT, Direction::BOTTOM) | (Direction::BOTTOM, Direction::LEFT) => 1,
                (Direction::LEFT, Direction::TOP) | (Direction::TOP, Direction::LEFT) => 2,
                (Direction::RIGHT, Direction::TOP) | (Direction::TOP, Direction::RIGHT) => 3,
                _ => panic!("Impossible"),
            });
            // Build the top edge
            let mut t_row = vec![tl_corner];
            loop {
                match self.find_and_orient_tile(&t_row[t_row.len() - 1], Direction::RIGHT) {
                    None => break,
                    Some(tile) => {
                        t_row.push(tile);
                    }
                }
            }
            let ncols = t_row.len();
            let nrows = self.tiles.len() / ncols;
            println!("whole image is {}×{}", ncols, nrows);
            // For each subsequent row...
            let mut rows = vec![t_row];
            for row in 1..nrows {
                // Arrange the tiles that connect to the ones in the row above
                rows.push(
                    (0..ncols)
                        .map(|col| {
                            self.find_and_orient_tile(&rows[row - 1][col], Direction::BOTTOM)
                                .unwrap()
                        })
                        .collect(),
                );
            }
            // Concatenate all the image data together
            let all_rows: Vec<_> = rows
                .iter()
                .map(|row| {
                    let row_images: Vec<_> = row.iter().map(|t| t.image.view()).collect();
                    concatenate(Axis(1), &row_images).unwrap()
                })
                .collect();
            concatenate(
                Axis(0),
                &all_rows.iter().map(|row| row.view()).collect::<Vec<_>>(),
            )
            .unwrap()
        }
    }

    There are two main things that I changed here. First is that I noticed I was passing a lot of the same arguments (corners, edges, connections) to the methods that I had, so that was a code smell that indicated that these should be gathered together into a class, which I’ve called Solver.

    The second insight was that I don’t actually need to keep track of which tiles are corners, edges, and middles; each tile can only connect in one way, so I only need to find the corners (both for the answer of Part 1, and to pick the top left corner to start assembling the image from.)

    Now that I have the full image, I have to actually solve the Part 2 puzzle: cross-correlate the image with the sea monster matrix.

    Unlike NumPy, Rust’s ndarray does not have any built-in facilities for cross-correlation, nor is it provided by any packages that I can find. So I will have to write code to do this, but because what I need is actually a very simple form of cross-correlation, I don’t think it will be so hard.

    What I need to do is take a slice of the image at every position, of the size of the sea monster, except where the sea monster would extend outside the boundaries of the image. Then I multiply the slice by the sea monster, and sum all the elements in it, and if that sum is equal to the sum of the elements in the sea monster, then there is a sea monster located there.
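
    The windowed multiply-and-sum described above can be sketched without ndarray, using plain nested vectors. This is only an illustration of the idea under the assumption that both grids contain just 0 and 1 and are non-empty; the function name `count_matches` is made up for the example.

```rust
/// Counts the window positions at which every set cell of `pattern`
/// is also set in `image`. Both grids are assumed to hold only 0 or 1.
pub fn count_matches(image: &[Vec<u8>], pattern: &[Vec<u8>]) -> usize {
    let (rows, cols) = (image.len(), image[0].len());
    let (p_rows, p_cols) = (pattern.len(), pattern[0].len());
    if p_rows > rows || p_cols > cols {
        return 0;
    }
    // The correlation at a position can be at most the number of set
    // pattern cells, and reaches it only on a full match.
    let pattern_sum: u32 = pattern.iter().flatten().map(|&c| c as u32).sum();
    let mut matches = 0;
    // Inclusive bounds: the last valid window touches the bottom/right edge.
    for y in 0..=(rows - p_rows) {
        for x in 0..=(cols - p_cols) {
            let mut correlation = 0u32;
            for py in 0..p_rows {
                for px in 0..p_cols {
                    correlation += (image[y + py][x + px] * pattern[py][px]) as u32;
                }
            }
            if correlation == pattern_sum {
                matches += 1;
            }
        }
    }
    matches
}
```

    With a 2×2 diagonal pattern, for example, this counts every place in the image where a set cell has another set cell diagonally below and to the right of it.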

    I will need to do this operation on each of the eight orientations of the image (rotated four ways, and flipped) until I find one where sea monsters are present. Then to get the answer to the puzzle, I’ll have to subtract the number of sea monsters times the number of pixels in a sea monster, from the sum of the pixels in the image.
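
    For reference, the eight orientations (four rotations, each with and without a flip) can also be generated by copying, as in the following sketch on plain nested vectors. It is not the ndarray approach used in this post, which works with views and avoids copies; the names `Grid`, `rotate_cw`, `flip_horizontal` and `orientations` are illustrative.

```rust
type Grid = Vec<Vec<u8>>;

/// Rotates a grid by 90 degrees clockwise.
pub fn rotate_cw(grid: &Grid) -> Grid {
    let rows = grid.len();
    let cols = grid[0].len();
    (0..cols)
        .map(|c| (0..rows).rev().map(|r| grid[r][c]).collect())
        .collect()
}

/// Mirrors a grid left to right.
pub fn flip_horizontal(grid: &Grid) -> Grid {
    grid.iter()
        .map(|row| row.iter().rev().copied().collect())
        .collect()
}

/// Returns the four rotations of `grid`, each followed by its mirror image.
pub fn orientations(grid: &Grid) -> Vec<Grid> {
    let mut result = Vec::with_capacity(8);
    let mut current = grid.clone();
    for _ in 0..4 {
        result.push(current.clone());
        result.push(flip_horizontal(&current));
        current = rotate_cw(&current);
    }
    result
}
```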

    I write this code:

    fn all_orientations(image: &Array2<u8>) -> [ArrayView2<u8>; 8] {
        [
            image.view(),
            image.view().reversed_axes(),
            image.slice(s![.., ..;-1]),
            image.slice(s![.., ..;-1]).reversed_axes(),
            image.slice(s![..;-1, ..]),
            image.slice(s![..;-1, ..]).reversed_axes(),
            image.slice(s![..;-1, ..;-1]),
            image.slice(s![..;-1, ..;-1]).reversed_axes(),
        ]
    }

    static SEA_MONSTER: [&str; 3] = [
        "                  # ",
        "#    ##    ##    ###",
        " #  #  #  #  #  #   ",
    ];

    fn count_sea_monsters(image: &ArrayView2<u8>) -> (usize, usize) {
        let mon_rows = SEA_MONSTER.len();
        let mon_cols = SEA_MONSTER[0].len();
        // Turn the ASCII art into a 0/1 matrix
        let mut sea_monster = Array2::zeros((mon_rows, mon_cols));
        for (y, line) in SEA_MONSTER.iter().enumerate() {
            for (x, cell) in line.bytes().enumerate() {
                sea_monster[[y, x]] = (cell != b' ') as u8;
            }
        }
        let mon_pixels: u8 = sea_monster.iter().sum();
        let mut monsters = 0;
        let rows = image.nrows();
        let cols = image.ncols();
        // Inclusive upper bounds, so that a monster touching the bottom or
        // right edge of the image is checked as well
        for y in 0..=(rows - mon_rows) {
            for x in 0..=(cols - mon_cols) {
                let slice = image.slice(s![y..(y + mon_rows), x..(x + mon_cols)]);
                let correlation = &slice * &sea_monster.view();
                if correlation.iter().sum::<u8>() == mon_pixels {
                    monsters += 1;
                }
            }
        }
        (monsters, monsters * mon_pixels as usize)
    }

    First I make sure it produces the right answer for the example data, then I add this to main():

    let full_image = solver.arrange();
    let (_, pixels) = all_orientations(&full_image)
        .iter()
        .find_map(|image| {
            let (count, pixels) = count_sea_monsters(image);
            if count != 0 { Some((count, pixels)) } else { None }
        })
        .unwrap();
    println!("{}", full_image.iter().filter(|&&c| c > 0).count() - pixels);

    Sadly, it doesn’t work. When trying to connect up the top left corner I get a panic because it is possible to connect it on three sides, not two! This is obviously a bug in my program (the tile wouldn’t have been in the list of corners if it had been able to connect on three sides!) I should investigate and fix this bug, but I am so done with this puzzle. In one last burst, I decide to paper over the bug by only trying the tiles that I already know should connect, replacing the tl_corner_connections code with the following:

    let tl_corner_connections: Vec<_> = self
        .filter(|t| {
        .map(|candidate| tl_corner.connection_side(&candidate))

    Finally, finally, this gives me the correct answer, and I see the sea monster light up on the Advent of Code map. There is still a bug, but let us close this book and never speak of this code again.

    Day 24, Part 1

    Without that sea monster puzzle weighing on me, I’m happy to start the second-to-last puzzle. It involves a floor tiled with hexagonal tiles. The tiles have a white side and a black side, and can flip from one to the other. The puzzle involves starting from a center tile, and following directions (east, northeast, west, etc.) to get to another tile, which must be flipped. The answer to the puzzle is how many tiles are flipped after following all the directions.

    So! I’ve never had to work with a hexagonal grid before, but so many games have one, it must be a solved problem. I google “represent hex grid in array” and land on a Stack Overflow question, which leads me to this brilliant page, “Hexagonal Grids” by Amit Patel. This is nothing short of a miracle, telling me everything I need to know in order to do things with hexagonal grids.

    After reading that page and thinking about it for a while, I decide that I will use cube coordinates (x, y, z) and that I don’t even need to store a grid. I just need to store the destination coordinates that each instruction from the input takes me to. A tile is white at the end, if its coordinates are reached an even number of times (including 0), and a tile is black if its coordinates are reached an odd number of times.
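
    The parity idea can be tried out in isolation with a standard-library `HashMap` instead of a multiset. The sketch below is illustrative only: `walk` and `count_black` are made-up names, and the per-direction coordinate changes are the same ones used in the code later in this post.

```rust
use std::collections::HashMap;

type Hex = (i32, i32, i32);

/// Follows a direction string such as "esew" from the center tile and
/// returns the cube coordinates of the tile where it ends.
pub fn walk(path: &str) -> Hex {
    let (mut x, mut y, mut z) = (0, 0, 0);
    let mut bytes = path.bytes();
    while let Some(first) = bytes.next() {
        // "n" and "s" are always followed by "e" or "w"
        let second = if first == b'n' || first == b's' {
            bytes.next()
        } else {
            None
        };
        let (dx, dy, dz) = match (first, second) {
            (b'e', None) => (1, -1, 0),
            (b'w', None) => (-1, 1, 0),
            (b's', Some(b'e')) => (0, -1, 1),
            (b's', Some(b'w')) => (-1, 0, 1),
            (b'n', Some(b'w')) => (0, 1, -1),
            (b'n', Some(b'e')) => (1, 0, -1),
            other => panic!("bad direction {:?}", other),
        };
        x += dx;
        y += dy;
        z += dz;
    }
    (x, y, z)
}

/// Counts the tiles that are reached an odd number of times.
pub fn count_black(paths: &[&str]) -> usize {
    let mut visits: HashMap<Hex, u32> = HashMap::new();
    for path in paths {
        *visits.entry(walk(path)).or_insert(0) += 1;
    }
    visits.values().filter(|&&n| n % 2 == 1).count()
}
```

    Note that every step keeps x + y + z equal to zero, which is the defining property of cube coordinates.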

    I could store the destination coordinates in a hashmap from coordinate to count, but I wonder if there is a multiset similar to the multimap I used a few days ago. There is. With that, I can write the code for Part 1:

    use multiset::HashMultiSet;

    type Hex = (i32, i32, i32);

    #[derive(Debug, PartialEq)]
    enum Direction {
        EAST,
        SOUTHEAST,
        SOUTHWEST,
        WEST,
        NORTHWEST,
        NORTHEAST,
    }

    impl Direction {
        // Returns the cube coordinates of the neighbour in this direction
        fn move_rel(&self, (x, y, z): Hex) -> Hex {
            use Direction::*;
            match self {
                EAST => (x + 1, y - 1, z),
                SOUTHEAST => (x, y - 1, z + 1),
                SOUTHWEST => (x - 1, y, z + 1),
                WEST => (x - 1, y + 1, z),
                NORTHWEST => (x, y + 1, z - 1),
                NORTHEAST => (x + 1, y, z - 1),
            }
        }
    }

    fn parse_line(text: &str) -> Vec<Direction> {
        use Direction::*;
        let mut iter = text.bytes();
        let mut retval = Vec::with_capacity(text.len() / 2);
        while let Some(b) = iter.next() {
            retval.push(match b {
                b'e' => EAST,
                b's' => match iter.next() {
                    Some(b2) if b2 == b'e' => SOUTHEAST,
                    Some(b2) if b2 == b'w' => SOUTHWEST,
                    Some(b2) => panic!("bad direction s{}", b2),
                    None => panic!("bad direction s"),
                },
                b'w' => WEST,
                b'n' => match iter.next() {
                    Some(b2) if b2 == b'w' => NORTHWEST,
                    Some(b2) if b2 == b'e' => NORTHEAST,
                    Some(b2) => panic!("bad direction n{}", b2),
                    None => panic!("bad direction n"),
                },
                _ => panic!("bad direction {}", b),
            });
        }
        retval
    }

    fn main() {
        let input = include_str!("input");
        // Collect the destination tile of every line of directions
        let destination_counts: HashMultiSet<_> = input
            .lines()
            .map(|line| {
                parse_line(line)
                    .iter()
                    .fold((0, 0, 0), |hex, dir| dir.move_rel(hex))
            })
            .collect();
        // A tile ends up black if it was reached an odd number of times
        let count = destination_counts
            .distinct_elements()
            .filter(|destination| destination_counts.count_of(destination) % 2 == 1)
            .count();
        println!("{}", count);
    }

    Day 24, Part 2

    In a surprise twist, the second part of the puzzle is yet another Conway’s Game of Life, this time on the hex grid! So no more storing the coordinates of the flipped tiles in a multiset. I will need to have some sort of array to store the grid, and calculate the number of occupied neighbour cells, as we have done on several previous puzzles.

    The Hexagonal Grids page comes through once again. Isn’t this great, that I knew nothing about hexagonal grids before encountering this puzzle, and there’s just a page on the internet that explains all of it well enough that I can use it to solve this puzzle! I will use axial coordinates (meaning, just discard one of the cube coordinates) and store the grid in a rectangular ndarray. The only question is how big the array has to be.

    I decide, as in Day 17, that a good upper bound is probably the size of the starting pattern, plus the number of iterations of the game, extended in each direction. So, for the example pattern in the puzzle description, the highest number in any of the three coordinates is 3 (and −3), and the number of iterations is 100, so we’d want the grid to extend 103 cells from the center in each direction, which makes it 207×207.

    The hex page recommends encapsulating access to the hex grid in a class, so that’s exactly what I do:

    struct Map {
        map: Array2<i8>,
        ref_q: i32,
        ref_r: i32,
    }

    impl Map {
        fn from_counts(counts: &HashMultiSet<Hex>) -> Self {
            let initial_extent = counts.distinct_elements().fold(0, |acc, (x, y, z)| {
                acc.max(x.abs()).max(y.abs()).max(z.abs())
            });
            let extent = initial_extent + 100; // n_iterations = 100
            let size = extent as usize;
            let map = Array2::zeros((2 * size + 1, 2 * size + 1));
            let mut this = Self {
                map,
                ref_q: extent,
                ref_r: extent,
            };
            // Tiles reached an odd number of times start out black
            for &(x, y, _) in counts
                .distinct_elements()
                .filter(|dest| counts.count_of(dest) % 2 == 1)
            {
                this.set(x, y);
            }
            this
        }

        fn set(&mut self, x: i32, y: i32) {
            let q = (x + self.ref_q) as usize;
            let r = (y + self.ref_r) as usize;
            self.map[[q, r]] = 1;
        }

        fn calc_neighbours(map: &Array2<i8>) -> Array2<i8> {
            let shape = map.shape();
            let width = shape[0] as isize;
            let height = shape[1] as isize;
            let mut neighbours = Array2::zeros(map.raw_dim());
            // Add slices of the occupied cells shifted one space in each hex
            // direction
            for &(xstart, ystart) in &[(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)] {
                let xdest = xstart.max(0)..(width + xstart).min(width);
                let ydest = ystart.max(0)..(height + ystart).min(height);
                let xsource = (-xstart).max(0)..(width - xstart).min(width);
                let ysource = (-ystart).max(0)..(height - ystart).min(height);
                let mut slice = neighbours.slice_mut(s![xdest, ydest]);
                slice += &map.slice(s![xsource, ysource]);
            }
            neighbours
        }

        fn iterate(&mut self) {
            let neighbours = Map::calc_neighbours(&self.map);
            // Black tiles with zero or more than two black neighbours flip to white
            let removals = &neighbours.mapv(|count| (count == 0 || count > 2) as i8) * &self.map;
            // White tiles with exactly two black neighbours flip to black
            let additions =
                &neighbours.mapv(|count| (count == 2) as i8) * &self.map.mapv(|cell| (cell == 0) as i8);
            self.map = &self.map + &additions - &removals;
        }

        fn count(&self) -> usize {
            self.map
                .iter()
                .fold(0, |acc, &cell| if cell > 0 { acc + 1 } else { acc })
        }
    }

    I store the map as a 2-dimensional ndarray, with coordinates (q, r) equal to (x, y) in the cube coordinate scheme (I just drop the z coordinate.) I store the offset of the center tile in (ref_q, ref_r).
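
    The mapping from cube coordinates to array indices can be written out on its own. This is a minimal sketch using only the standard library; `grid_side` and `to_index` are illustrative names that do not appear in the code above, and the bounds check is an addition for the example.

```rust
/// Side length of a square array that can hold every tile within `extent`
/// steps of the center tile along both the q and r axes.
pub fn grid_side(extent: i32) -> usize {
    (2 * extent + 1) as usize
}

/// Converts cube coordinates to array indices by dropping z and shifting
/// q and r so that the center tile lands at (extent, extent).
/// Returns None if the tile lies outside the array.
pub fn to_index((x, y, _z): (i32, i32, i32), extent: i32) -> Option<(usize, usize)> {
    let q = x + extent;
    let r = y + extent;
    let side = 2 * extent + 1;
    if (0..side).contains(&q) && (0..side).contains(&r) {
        Some((q as usize, r as usize))
    } else {
        None
    }
}
```

    With an extent of 103, for instance, the array has a side of 207 and the center tile sits at index (103, 103).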

    I make a constructor that takes the multiset from Part 1 as input, and an iterate() method that calculates one iteration of the map and updates the class. The calc_neighbours() and iterate() code is practically copied from Day 11 except that we only shift the map in six directions instead of eight. (Which six directions those are, I get from the hex grids page.)
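
    As a cross-check for the array-based approach, the same flipping rule can be expressed with a set of black tiles in axial coordinates. This is a different technique from the shifted ndarray slices used in this post, shown only as a sketch; the names `Axial`, `NEIGHBOURS` and `step` are made up for the example, while the six offsets and the rule itself are taken from the code above.

```rust
use std::collections::{HashMap, HashSet};

type Axial = (i32, i32);

/// The six neighbour offsets of a hex tile in axial (q, r) coordinates.
const NEIGHBOURS: [Axial; 6] = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)];

/// Computes one iteration: a black tile with zero or more than two black
/// neighbours turns white, and a white tile with exactly two black
/// neighbours turns black.
pub fn step(black: &HashSet<Axial>) -> HashSet<Axial> {
    // Count black neighbours for every tile adjacent to a black tile
    let mut counts: HashMap<Axial, u32> = HashMap::new();
    for &(q, r) in black {
        for &(dq, dr) in &NEIGHBOURS {
            *counts.entry((q + dq, r + dr)).or_insert(0) += 1;
        }
    }
    let mut next = HashSet::new();
    // Black tiles survive with one or two black neighbours
    for &tile in black {
        let n = counts.get(&tile).copied().unwrap_or(0);
        if n == 1 || n == 2 {
            next.insert(tile);
        }
    }
    // White tiles with exactly two black neighbours turn black
    for (&tile, &n) in &counts {
        if n == 2 && !black.contains(&tile) {
            next.insert(tile);
        }
    }
    next
}
```

    A set does not need a size estimate up front, at the cost of hashing every tile; the array version trades that for a fixed grid that must be large enough for all 100 iterations.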

    I’m not a big fan of acc.max(x.abs()).max(y.abs()).max(z.abs()) and wish I knew of a better way to do that.

    I can then write the following code in main() which gives me the answer:

    let mut map = Map::from_counts(&destination_counts);
    for _ in 0..100 {
        map.iterate();
    }
    println!("{}", map.count());


    The Day 20 puzzle was a bit disappointing, since I spent so much more time on it than any of the other puzzles, and the solution wasn’t even particularly good. I’m not sure what made it so much more difficult, but I suspect that I just didn’t find the right data structure to read the data into.

    Day 24, on the other hand, I completed easily with the help of a very informative web site. I suppose this puzzle was a bit of a gimmick, though; if you know how to deal with hexagonal grids then it’s very quick, and if you don’t, then it’s much more difficult. In my case, doing a search for the right thing made all the difference. If I had started out writing code without the knowledge that I had learned, I probably would have used offset coordinates, and, as you can read on the Hexagonal Grids page, that would mean that the directions are different depending on whether you’re in an odd or even-numbered column. The code would have been much more complicated!

    This series will return for one final installment within the next few days.

    December 23, 2020

    Custom docs checkers in yelp-check

    I have a personal goal of getting top-notch docs CI in place for GNOME 40, and part of that is having checks for merge requests. We have a number of great checks already in yelp-check, like checking for broken links and media references, but for large documents sets, you want to be able to enforce your own rules for style and consistency. The most common tool for this is Schematron, and in fact we do have a Schematron file in gnome-help. But it has a lot of boilerplate, most people don’t know how to write it, and the command to run it is unpleasant.

    I’ve been rewriting yelp-check in Python. This has made it easy to implement things that were just not fun with /bin/sh. (It’s also way faster, which is great.) Today I added the ability to define custom checkers in a config file. These checkers use the same idea as Schematron: assert something with XPath and bomb if it’s false. But they’re way easier to write. From my commit message, here’s one of the checks from the gnome-help Schematron file, reworked as a yelp-check checker:

    mal = http://projectmallard.org/1.0/
    select = /mal:page/mal:info
    assert = normalize-space(mal:desc) != ''
    message = Must have non-empty desc

    With the new yelp-check (as yet not released), you’ll be able to just run this command:

    yelp-check gnome-desc

    All of the custom checkers show up in the yelp-check usage blurb, alongside the builtin checks, so you can easily see what’s available.

    The next steps are to let you define sets of commands in the config file, so you can define what’s important for you on merge requests, and then to provide some boilerplate GitLab CI YAML so all docs contributions get checked automatically.

    December 22, 2020

    Christmas Maps

    So, it's that time of year again. And even if this year is a lot different than usual in many ways, I thought we should still follow the tradition of summing up some of the latest updates to GNOME Maps in 2020 (and before the first beta of what will be part of GNOME 40, in the new versioning scheme).

    The biggest change that has landed since the release of 3.38 is the redesigned “place bubbles” by James Westman. James has already written an excellent blog post highlighting this, but I still want to point it out here as well. The bubbles now feature larger thumbnails with images from Wikipedia when places are tagged with Wikipedia articles in OpenStreetMap (and the article features a title image), utilizing the MediaWiki API. This feature has been present for some time, but with the redesign the thumbnails are larger and have a more balanced and prominent place. Furthermore, a short summary of the Wikipedia article is also shown (in the language preferred by the user's locale settings, if the article is translated to that language in Wikipedia). The details are also shown in a nicer list view-style with icons to give visual cues.

    And this is not all there is to it either. James has been starting work on making these adaptive so they can adjust to smaller (narrower) screen sizes, such as on phones:

    This is something we hope to finish up in time for GNOME 40.

    James also implemented better handling for cases where website URLs are malformed in OSM (such as ones missing the protocol, e.g. https:// or http://), skipping the link instead of presenting one that will not work. Along with this, the dialog for editing places in OSM will highlight a malformed website field (and the code has provisions for adding other validators later on, e.g. for phone numbers).
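
    To illustrate the kind of check involved, here is a small sketch of a protocol test for website values. It is not the code used in Maps (which is written in JavaScript), and the function names `has_web_protocol` and `link_to_show` are made up for the example; it only looks at the protocol prefix and does no further URL validation.

```rust
/// Returns true if the value starts with http:// or https:// and has
/// something after the protocol.
pub fn has_web_protocol(url: &str) -> bool {
    let lower = url.trim().to_ascii_lowercase();
    ["http://", "https://"]
        .iter()
        .any(|prefix| lower.starts_with(*prefix) && lower.len() > prefix.len())
}

/// Returns the link to display, or None if the website value should be
/// skipped because it would not produce a working link.
pub fn link_to_show(website: &str) -> Option<&str> {
    if has_web_protocol(website) {
        Some(website.trim())
    } else {
        None
    }
}
```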

    Another thing that had bothered me a bit lately is how we render opening hours. Before we displayed them in a single line:

    As can be seen here the line wrapping happens in a very awkward place right between a day and its time interval.

    So I took the opportunity, now that we have the list with separators, to do some restructuring and render the opening hours in a grid to get a cleaner look:

    I also re-implemented the rendering of times using the localization API from ECMAScript (JS) that we get via GJS. For example this is how it would look in Persian:

    Also since the last time we've had some new contributors. Ravi Shankar improved the detection of invalid URLs, and Anubhav Tyagi has improved loading of shape layer files (GeoJSON, GPX, and KML) by replacing synchronous file I/O with asynchronous loading. He also added a dialog asking the user for confirmation when loading files larger than 20 MB, since these can take some time to load (and parse).

    And last, but not least, Maps old-timer Jonas Danielsson contributed a fix to normalize phone numbers in the links shown for phone numbers when an app is installed that can handle tel: URIs. This allows the Calls app on e.g. the PinePhone to use these links directly from Maps.

    I think that was pretty much it for now!

    And until next time, happy holidays and merry Christmas!

    December 21, 2020

    Portfolio: manage files in your phone

    On mobile phones with GNOME

    Ever since I met @Kekun in Barcelona, during LAS 2019, I got intrigued by this wave of “running GNOME on phones”. It took several months until I could get my teeth into it though. Between my Sugar applications project, Flatseal, a new job and, mostly, due to how hard it is to get a proper Linux-capable phone in Paraguay, I had no time or choice really.

    My first mobile-related project started in August. After many failed attempts to buy a proper Linux-capable phone, I decided that my only way forward was to get a refurbished Moto G4 Play, which has acceptable support thanks to PostmarketOS. The project goal was to provide more clarity on how far we are from a Flathub-powered GNOME community-driven OS for phones. The results were better than I expected, but still far from a daily driver. If you’re curious about this research you can find it here.

    To my surprise, one of the biggest missing pieces was the file manager. I tried all the options that fell into my selection criteria, but none provided a good experience for me. The major issue I found is that the available options seem to land on a “designed for the desktop, but will fit in a small screen with a few tweaks to improve the UI” category.

    Since then, I started to think about what a simple file manager for phones would look like. By simple I mean two things. First, that it provides the ninety percent of things that people need to manage their files and, second, a UI/UX that is specifically crafted for phones and small touch screens.

    A couple of weeks ago I finally got the weekend slots I needed to hack on these ideas.

    Introducing Portfolio

    A minimalist file manager for those who want to use Linux mobile devices.

    That was the best description I could come up with, hah, it’s funny because it’s true. Portfolio is my first application that is one hundred percent designed for mobile devices. It supports the simplest, yet most common, tasks like browsing, opening, moving, copying, deleting and renaming files.

    A whole weekend went into just getting the interaction model right, but I believe it paid off. The UI is clean. The relevant actions are always visible and just one tap away. The application assumes it’s running on a resource limited device and, therefore, sacrifices some speed for improved responsiveness. As seen above, it even provides an About “dialog” fit for small screens.

    Of course, this application is far from perfect and, actually, it’s just a few days old. It urgently needs lazy loading, an explicit way to get out of selection mode, and tests, tests, tests, among other things. But I believe getting the interaction model right was the right priority for these limited weekend slots.

    Portfolio is now available on Flathub. If you have a Linux-capable phone, please give it a try!

    Last but not least, thanks to @eddsalkield and @bertob for their amazing early feedback.

    Creating tutorial videos (the hard way)

    I recently created tutorial videos for Wikimedia Phabricator, the task tracking system primarily used in Wikimedia. These five tutorials cover the basics of creating tasks, working with projects and workboards, searching and listing tasks, and improving personal productivity. The videos are linked from the central Wikimedia Phabricator help page and licensed under CC0.

    Five videos on Wikimedia Commons

    There are different types of technical documentation: Overviews for understanding, how-tos for problem solving, tutorials for learning, API references for information.
    And there are different personal preferences for how to learn (oral, verbal, physical, visual, etc).

    While I’m content with Wikimedia’s written task-oriented documentation (“As a user, I want to know how to…”), it was missing an overview (“What is this? How is it supposed to be used?”) in a format easier to consume.

    So I started to plan tutorial videos.

    Which preparation and decisions does that require? I’m not going to list everything but here are some pointers:

    • Understand what you are getting into: Watch So you want to make videos? by Sarah Ley-Hamilton for how to approach it, and for mistakes to avoid.
    • Install graphical video editing software; learn and understand the basics.
      • I initially played with Pitivi as a graphical video editing tool. While it is quite okay for my needs (and I’ve started to sometimes use it for video editing), at that time I wasn’t fully convinced due to occasional glitches and user interface issues hard to reproduce.
        I decided to give (non-graphical) ffmpeg a try. That will only work out if you have raw video material that does not require exact sequence cuts at a specific millisecond. So you may want to use graphical video editing software instead. (Still, it was fun this way.)
    • Set up your system: Increase mouse pointer size, check how to visualize the pointer location (e.g. for mouse clicks)
    • Play with creating screencasts of browser content: Fullscreen vs window (the latter requires manually calculating the window’s width∶height to end up in 16∶9 after cropping, however while recording it allows you to see other open tabs to switch to), browser content zoom level, etc.
    • Sort out which information should be a screencast versus showing a static screenshot (intro and end slides, static webpages)
    • Plan what to cover. Write your script in a way that it can also be used as subtitles. Gather feedback (what’s missing or unclear?) and proofreading.
    • How much bling to have: Do I want to visually highlight areas, zoom in on videos, set up fade overlays between sequences?
    • Where and when to record audio without much background noise; is the microphone good enough?

    The complete list of steps (setting up Phabricator locally, making customizations, creating test data, preparing the system, recording audio and video, cropping, merging, concatenating, exporting the final videos, creating subtitles, publishing everything) is publicly documented (html source).