April 08, 2020

multi-value webassembly in firefox: a binary interface

Hey hey hey! Hope everyone is staying safe at home in these weird times. Today I have a final dispatch on the implementation of the multi-value feature for WebAssembly in Firefox. Last week I wrote about multi-value in blocks; this week I cover function calls.

on the boundaries between things

In my article on Firefox's baseline compiler, I mentioned that all WebAssembly engines in web browsers treat the function as the unit of compilation. This facilitates streaming, parallel compilation of WebAssembly modules, by farming out compilation of individual functions to worker threads. It also allows for easy tier-up from quick-and-dirty code generated by the low-latency baseline compiler to the faster code produced by the optimizing compiler.

There are some interesting Conway's Law implications of this choice. One is that division of compilation tasks becomes an opportunity for division of human labor; there is a whole team working on the experimental Cranelift compiler that could replace the optimizing tier, and in my hackings on Firefox I have had minimal interaction with them. To my detriment, of course; they are fine people doing interesting things. But the code boundary means that we don't need to communicate as we work on different parts of the same system.

Boundaries are where places touch, and sometimes for fluid crossing we have to consider boundaries as places in their own right. Functions compiled with the baseline compiler, with Ion (the production optimizing compiler), and with Cranelift (the experimental optimizing compiler) are all able to call each other because they actively maintain a common boundary, a binary interface (ABI). (Incidentally the A originally stands for "application", essentially reflecting division of labor between groups of people making different components of a software system; Conway's Law again.) Let's look closer at this boundary-place, with an eye to how it changes with multi-value.

what's in an ABI?

Among other things, an ABI specifies a calling convention: which arguments go in registers, which on the stack, how the stack values are represented, how results are returned to the callers, which registers are preserved over calls, and so on. Intra-WebAssembly calls are a closed world, so we can design a custom ABI if we like; that's what V8 does. Sometimes WebAssembly may call functions from the run-time, though, and so it may be useful to be closer to the C++ ABI on that platform (the "native" ABI); that's what Firefox does. (Incidentally here I think Firefox is probably leaving a bit of performance on the table on Windows by using the inefficient native ABI that only allows four register parameters. I haven't measured though so perhaps it doesn't matter.) Using something closer to the native ABI makes debugging easier as well, as native debugger tools can apply more easily.

One thing that most native ABIs have in common is that they are really only optimized for a single result. This reflects their heritage as artifacts from a world built with C and C++ compilers, where there isn't a concept of a function with more than one result. If multiple results are required, they are represented instead as arguments, typically as pointers to memory somewhere. Consider the AMD64 SysV ABI, used on Unix-derived systems, which carefully specifies how to pass arbitrary numbers of arbitrary-sized data structures to a function (§3.2.3), while only specifying what to do for a single return value. If the return value is too big for registers, the ABI specifies that a pointer to result memory be passed as an argument instead.
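For example, a C function returning a struct too large for the return registers is lowered by a SysV-conforming compiler as if it took a hidden pointer to caller-allocated memory. A rough sketch of that lowering (the _lowered name is just for illustration):

#include <stdint.h>

/* A result too large for the %rax:%rdx register pair. */
struct big_result {
  int64_t a, b, c;
};

/* What the programmer writes... */
struct big_result make_result(void) {
  return (struct big_result){ 1, 2, 3 };
}

/* ...is lowered to something morally equivalent to this: the caller
   allocates space and passes its address as a hidden argument, and the
   callee writes the results through that pointer. */
void make_result_lowered(struct big_result *out) {
  out->a = 1;
  out->b = 2;
  out->c = 3;
}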

So in a multi-result WebAssembly world, what are we to do? How should a function return multiple results to its caller? Let's assume that there are some finite number of general-purpose and floating-point registers devoted to return values, and that if the return values will fit into those registers, then that's where they go. The problem is then to determine which results will go there, and if there are remaining results that don't fit, then we have to put them in memory. The ABI should indicate how to address that memory.
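As a sketch of the general scheme (illustrative only, not Firefox's actual code; note that, as described below, Firefox's baseline compiler actually assigns registers starting from the last result), result allocation might look like this:

#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result-allocation loop: each result gets either a register
   index or an offset into some result memory once the registers run out.
   Where that memory lives, and how the callee finds it, is the ABI
   question discussed in the rest of this post. */
enum where { IN_GPR, IN_FPR, IN_MEMORY };

struct location {
  enum where where;
  unsigned reg_or_offset;
};

void allocate_results(const bool *is_float, size_t count,
                      unsigned gpr_limit, unsigned fpr_limit,
                      struct location *out) {
  unsigned gprs = 0, fprs = 0, offset = 0;
  for (size_t i = 0; i < count; i++) {
    if (is_float[i] && fprs < fpr_limit)
      out[i] = (struct location){ IN_FPR, fprs++ };
    else if (!is_float[i] && gprs < gpr_limit)
      out[i] = (struct location){ IN_GPR, gprs++ };
    else {
      out[i] = (struct location){ IN_MEMORY, offset };
      offset += 8; /* assume 8-byte slots for simplicity */
    }
  }
}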

When looking into a design, I considered three possibilities.

first thought: stack results precede stack arguments

When a function needs some of its arguments passed on the stack, it doesn't receive a pointer to those arguments; rather, the arguments are placed at a well-known offset to the stack pointer.

We could do the same thing with stack results, either reserving space deeper on the stack than stack arguments, or closer to the stack pointer. With the advent of tail calls, it would make more sense to place them deeper on the stack. Like this:

The diagram above shows the ordering of stack arguments as implemented by Firefox's WebAssembly compilers: later arguments are deeper (farther from the stack pointer). It's an arbitrary choice that happens to match up with what the native ABIs do, as it was easier to re-use bits of the already-existing optimizing compiler that way. (Native ABIs use this stack argument ordering because of sloppiness in a version of C from before I was born. If you were starting over from scratch, probably you wouldn't do things this way.)

Stack result order does matter to the baseline compiler, though. It's easier if the stack results are placed in the same order in which they would be pushed on the virtual stack, so that when the function completes, the results can just be memmove'd down into place (if needed). The same concern dictates another aspect of our ABI: unlike calls, registers are allocated to the last results rather than the first results. This is to make it easy to preserve stack invariant (1) from the previous article.

At first I thought this was the obvious option, but I ran into problems. It turns out that stack arguments are fundamentally unlike stack results in some important ways.

While a stack argument is logically consumed by a call, a stack result starts life with a call. As such, if you reserve space for stack results just by decrementing the stack pointer before a call, probably you will need to load the results eagerly into registers thereafter or shuffle them into other positions to be able to free the allocated stack space.

Eager shuffling is busy-work that should be avoided if possible. It's hard to avoid in the baseline compiler. For example, a call to a function with 10 arguments will consume 10 values from the temporary stack; any results will be pushed on after removing argument values from the stack. If there are any stack results, it's almost impossible to avoid a post-call memmove, to move stack results to where they should be before the 10 argument values were pushed on (and probably spilled). So the baseline compiler case is not optimal.

However, things get gnarlier with the Ion optimizing compiler. Like many other optimizing compilers, Ion is designed to compute the necessary stack frame size ahead of time, and to never move the stack pointer during an activation. The only exception is for pushing on any needed stack arguments for nested calls (which are popped directly after the nested call). So in that case, assuming there are a number of multi-value calls in a stack frame, we'll be shuffling in the optimizing compiler as well. Not great.

Besides the need to shuffle, stack arguments and stack results differ as regards ownership and garbage collection. A callee "owns" the memory for its stack arguments; it is responsible for them. The caller can't assume anything about the contents of that memory after a call, especially if the WebAssembly implementation supports tail calls (a whole 'nother blog post, that). If the values being passed are just bits, that's one thing, but with the reference types proposal, some result values may be managed by the garbage collector. The callee is responsible for making stack arguments visible to the garbage collector; the caller is responsible for the results. The caller will need to emit metadata to allow the garbage collector to see stack result references. For this reason, a stack result actually starts life just before a call, because it can become initialized at any point and thus needs to be traced during the entire callee activation. Not all callers can easily add garbage collection roots for writable stack slots, so the need to place stack results in a fixed position complicates calling multi-value WebAssembly functions in some cases (e.g. from C++).

second thought: pointers to individual stack results

Surely there are more well-trodden solutions to the multiple-result problem. If we encoded a multi-value return in C, how would we do it? Consider a function in C that has three 64-bit integer results. The idiomatic way to encode it would be to have one of the results be the return value of the function, and the two others to be passed "by reference":

#include <stdint.h>

/* One result comes back as the return value; the other two are returned
   "by reference" through pointer arguments. */
int64_t foo(int64_t* a, int64_t* b) {
  *a = 1;
  *b = 2;
  return 3;
}

void call_foo(void) {
  int64_t a, b, c;
  c = foo(&a, &b);
}

This program shows us a possibility for encoding WebAssembly's multiple return values: pass an additional argument for each stack result, pointing to the location to which to write the stack result. Like this:

The result pointers are normal arguments, subject to normal argument allocation. In the above example, given that there are already stack arguments, they will probably be passed on the stack, but in many cases the stack result pointers may be passed in registers.

The result locations themselves don't even need to be on the stack, though they certainly will be in intra-WebAssembly calls. However the ability to write to any memory is a useful form of flexibility when e.g. calling into WebAssembly from C++.

The advantage of this approach is that we eliminate post-call shuffles, at least in optimizing compilers. But, having to make an argument for each stack result, each of which might itself become a stack argument, seems a bit offensive. I thought we might be able to do a little better.

third thought: stack result area, passed as pointer

Given that stack results are going to be written to memory, it doesn't really matter where they will be written, from the perspective of the optimizing compiler at least. What if we allocated them all in a block and just passed one pointer to the block? Like this:

Here there's just one additional argument, no matter how many stack results. While we're at it, we can specify that the layout of the stack results should be the same as how they would be written to the baseline stack, to make the baseline compiler's job easier.
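In C terms, the idea is roughly to pass a single pointer to one caller-allocated block holding every stack result; a hedged sketch with a made-up layout:

#include <stdint.h>

/* Hypothetical lowering of a WebAssembly function returning five i64
   results when only one result register is available: the last result
   travels in the return register, the other four are written into a
   single caller-allocated area addressed by one extra pointer argument. */
struct stack_result_area {
  int64_t results[4];
};

int64_t lowered_callee(int64_t arg, struct stack_result_area *area) {
  area->results[0] = arg + 1;
  area->results[1] = arg + 2;
  area->results[2] = arg + 3;
  area->results[3] = arg + 4;
  return arg + 5; /* the final result goes in the return register */
}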

As I started implementation with the baseline compiler, I chose this third approach, essentially because I was already allocating space for the results in a block in this way by bumping the stack pointer.

When I got to the optimizing compiler, however, it was quite difficult to convince Ion to allocate an area on the stack of the right shape.

Looking back on it now, I am not sure that I made the right choice. The thing is, the IonMonkey compiler started life as an optimizing compiler for JavaScript. It can represent unboxed values, which is how it came to be used as a compiler for asm.js and later WebAssembly, and it does a good job on them. However it has never had to represent aggregate data structures like a C++ class, so it didn't have support for spilling arbitrary-sized data to the stack. It took a while staring at the register allocator to convince it to allocate arbitrary-sized stack regions, and then to allocate component scalar values out of those regions. If I had just asked the register allocator to give me one appropriate-sized stack slot for each scalar, and hacked out the ability to pass separate pointers to the stack slots to WebAssembly calls with stack results, then I would have had an easier time of it, and perhaps stack slot allocation could be more dense because multiple results wouldn't need to be allocated contiguously.

As it is, I did manage to hack it in, and I think in a way that doesn't regress. I added a layer over an argument type vector that adds a synthetic stack results pointer argument, if the function returns stack results; iterating over this type with ABIArgIter will allocate a stack result area pointer, either as a register argument or a stack argument. In the optimizing compiler, I added a kind of value allocation corresponding to a variable-sized stack area (using pointer tagging again!), and extended the register allocator to allocate LStackArea and the component stack results. Interestingly, I had to add a kind of definition that starts life on the stack; previously all Ion results started life in registers and were only spilled if needed.

In the end, a function will capture the incoming stack result area argument, either as a normal SSA value (for Ion) or stored to a stack slot (baseline), and when returning will write stack results to that pointer as appropriate. Passing in a pointer as an argument did make it relatively easy to implement calls between WebAssembly and C++; getting the variable-shape result area known to the garbage collector for C++-to-WebAssembly calls was simple in the end, but took me a while to figure out.

Finally I was a bit exhausted from multi-value work and ready to walk away from the "JS API", the bit that allows multi-value WebAssembly functions to be called from JavaScript (they return an array) or for a JavaScript function to return multiple values to WebAssembly (via an iterable) -- but then when I got to thinking about this blog post I preferred to implement the feature rather than document its lack. Avoidance-of-document-driven development: it's a thing!

towards deployment

As I said in the last article, the multi-value feature is about improved code generation and also making a more capable base for expressing further developments in the WebAssembly language.

As far as code generation goes, things are progressing but it is still early days. Thomas Lively has implemented support in LLVM for emitting return of C++ aggregates via multiple results, which is enabled via the -experimental-multivalue-abi cc1 flag. Thomas has also been implementing multi-value support in the binaryen WebAssembly toolchain component, used by the emscripten C++-to-WebAssembly toolchain. I think it will be a few months though before everything lands in a way that end users can take advantage of.

On the specification side, the multi-value feature has been at phase 4 since January, which basically means things are all done there.

Implementation-wise, V8 has had experimental support since 2017 or so, and the feature was staged last fall, although V8 doesn't yet support multi-value in their baseline compiler. WebKit also landed support last fall.

Unlike V8 and SpiderMonkey, JavaScriptCore (the JS and wasm engine in WebKit) actually implements a WebAssembly interpreter as their solution to the one-pass streaming compilation problem. Then on the compiler side, there are two tiers that both operate on basic block graphs (OMG and BBQ; I just puked a little in my mouth typing that). This strategy makes the compiler implementation quite straightforward. It's also an interesting design point because JavaScriptCore's garbage collector scans the stack conservatively; there's no need for the compiler to do bookkeeping on the GC's behalf, which I'm sure was a relief to the hacker. Anyway, multi-value in WebKit is done too.

The new thing of course is that finally, in Firefox, the feature is now fully implemented (woo) and enabled by default on Nightly builds (woo!). I did that! It took me a while! Perhaps too long? Anyway it's done. Thanks again to Bloomberg for supporting this work; large ups to y'all for helping the web move forward.

See you next time with a more general article rounding up compile-time benchmarks on a variety of WebAssembly implementations. Until then, happy hacking!

April 07, 2020

Timelines on Calendar

It’s been a long time since I last wrote a blog post about GNOME Calendar only. That doesn’t mean work has stalled!

Since pretty much its inception, Calendar used copy-pasted code from Evolution to retrieve events from Evolution Data Server (EDS). It was a pair of classes: ECalDataModelSubscriber and ECalDataModel. The first is an interface that classes implement when they handle adding, updating, and removing events; it was implemented by the week, month, and year views. The second, ECalDataModel, is responsible for storing multiple subscribers, the time range of each subscriber, fetching the calendar data from EDS, and keeping subscribers aware of which events they should display.

ECalDataModel is a fairly complicated piece of code, full of threads, locks, and synchronization points, which made it hard to investigate and fix bugs related to it. In addition to that, Calendar tries to use the GDateTime API everywhere, but ECalDataModel (and most Evolution-related code) uses other time types such as time_t and GTimeVal. Over time, those issues made Calendar increasingly painful to maintain.

Even though ECalDataModel and ECalDataModelSubscriber worked mostly well for a long time, I thought it wouldn’t hurt to experiment with a new backend that uses more modern APIs and techniques, threads the heavy stuff away, and is closer to the style and idiosyncrasy of Calendar.

After some testing and validating the core concepts of the new backend, and asking a few community members for targeted testing, I finally landed it. I’ll be fixing a few remaining bugs introduced by it, but so far so good!

Timeline

The core component of this new engine is what I called the “timeline”. Conceptually, a timeline is straightforward: it’s a virtual representation of time.

Events are added to this timeline, and subscribers can subscribe to a well-defined time slice. The timeline matches which events are visible to which subscribers, and updates subscribers accordingly.

In the picture above, the timeline object would detect that “Subscriber 1” should only display “Event 1”, whereas “Subscriber 2” would display all the 3 events available.

The subscriber concept is directly borrowed from Evolution’s design. Anything can be a subscriber, as long as it knows what slice of time it wants to display. The week, month, and year views continue to implement this concept, but so does the search – which uses a more sophisticated calculation of time, but is still time-based – and the shell search provider.

GcalTimeline takes over the responsibility of aggregating events and subscribers, and deciding which events each subscriber should display. Gathering the events from EDS is done in a different class called GcalCalendarMonitor. This class is very specific and limited in what it does, and is where most of the complexity of dealing with multiple threads is handled. It also tries really hard to be efficient and never do heavy operations in tight loops in the main thread.

Augmented Tree

In order to implement that, the data structure used by Calendar to store and query time ranges was further improved to be able to handle a much larger range of dates and times.

Calendar implements an augmented AVL tree that stores data based on ranges instead of single values. This gives us a good compromise between memory efficiency, lookup speed, and insertion and removal speed. For practical purposes, the time ranges this tree is capable of handling are virtually infinite: people will be able to schedule their appointments from year 1 to 9999 in the Gregorian calendar.
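To make the idea concrete, here is a minimal sketch of what such a range-keyed, augmented node can look like; this is illustrative only, not Calendar's actual implementation:

#include <glib.h>

/* Illustrative node of an augmented AVL tree keyed on [start, end) ranges.
 * Each node additionally tracks the maximum end time in its subtree, which
 * lets range queries skip whole subtrees that cannot possibly overlap. */
typedef struct _RangeNode RangeNode;

struct _RangeNode
{
  GDateTime *range_start;
  GDateTime *range_end;
  GDateTime *max_subtree_end;  /* the augmentation */
  GPtrArray *events;           /* data stored for this exact range */
  RangeNode *left;
  RangeNode *right;
  gint       height;           /* AVL balance information */
};

static gboolean
ranges_overlap (GDateTime *a_start, GDateTime *a_end,
                GDateTime *b_start, GDateTime *b_end)
{
  return g_date_time_compare (a_start, b_end) < 0 &&
         g_date_time_compare (b_start, a_end) < 0;
}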

Next Steps

I have a few more improvements in the pipeline, such as the introduction of another data structure to handle ranges and compare them with potentially different comparison strategies, since sometimes we want to compare non-exclusive ranges with exclusive ones, or half-exclusive ranges with each other.

Personally, I’m quite satisfied with this new architecture and how much better a fit it is for Calendar. Slowly but steadily, Calendar is being reworked to be more consistent internally, and ultimately that will mean fewer bugs and, who knows, maybe more features as well.

April 06, 2020

Art vs Design

Over the weekend I was forced to unload all my photos from my phone due to limited storage space. As I went through them, a nice capture of Builder nightly caught my attention and I couldn’t help but post it on twitter.

Cyberdyne Builder

Obviously posting on twitter meant it was misunderstood immediately and quipped with entitled adjectives. And rather than responding on the wrong platform, I finally have an excuse to post on my blog again. So let’s take a look at the horrible situation we ended up with.

Thanks to Flatpak you now have a way to install the stable and development versions of an app concurrently. You can easily tell them apart without resorting to name suffixes in the shell, where the actual name gets horribly truncated due to ellipsization, while still clearly seeing it’s the same app at first glance.

Stable and Nightly Boxes

There’s plenty of apps already making use of this. So how does an app developer get one? We actually have the tooling for that. If you have an app icon, you can easily generate a nightly variant with zero effort in most cases.

So what was the situation twitter was praising? Let’s count how many GNOME applications shipped a custom nightly icon. Umm, how about zero?

A pretty picture an artist spends hours on (modelling, texturing, lighting, adjusting for low-resolution screens) is not a visual framework, nor a reasonable thing to ask app developers to do.

April 05, 2020

Meson manual sales status and price adjustment

The sales dashboard of the Meson manual currently looks like this.

It splits up quite nicely into three parts. The first one is the regular sales from the beginning of the year, which is on average less than one sale per day.

The second part (marked with a line) indicates when I was a guest on CppCast talking about Meson and the book. As an experiment I created a time limited discount coupon so that all listeners could buy it with €10 off. As you can tell from the graph it did have an immediate response, which again proves that marketing and visibility are the things that actually matter when trying to sell any product.

After that we have the "new normal", which means no sales at all. I don't know if this is caused by the coronavirus isolation or whether this is the natural end of life for the product (hopefully the former but you can never really tell in advance).

Price reduction

Thus, effective immediately, the price of the book has been reduced to €24.95. You can purchase it from the official site.

April 04, 2020

High resolution wheel scrolling in the desktop stack

This is a follow-up to the kernel support for high-resolution wheel scrolling, which you totally forgot about because it's already more than a year in the past and seriously, who has the attention span these days to remember this. Anyway, I finally found time and motivation to pick this up again and I started lining up the pieces like cans, for it only to be shot down by the commentary of strangers on the internet. The Wayland merge request lists the various pieces (libinput, wayland, weston, mutter, gtk and Xwayland) but for the impatient there's also a Fedora 32 COPR. For all you weirdos inexplicably not running the latest Fedora, well, you'll have to compile this yourself, just like I did.

Let's recap: in v5.0 the kernel added new axes REL_WHEEL_HI_RES and REL_HWHEEL_HI_RES for all devices. On devices that actually support high-resolution wheel scrolling (Logitech and Microsoft mice, primarily) you'll get multiple hires events before the now-legacy REL_WHEEL events. On all other devices those two are in sync.
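If you're curious what that looks like at the evdev level, here's a minimal sketch that reads events straight from a device node (error handling omitted, device path made up); the high-resolution axis reports movement in units of 1/120 of a detent:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/input.h>

int main(void)
{
    struct input_event ev;
    /* Adjust the event node for your mouse. */
    int fd = open("/dev/input/event0", O_RDONLY);

    while (read(fd, &ev, sizeof ev) == sizeof ev) {
        if (ev.type != EV_REL)
            continue;
        if (ev.code == REL_WHEEL)
            printf("legacy wheel: %d click(s)\n", ev.value);
        else if (ev.code == REL_WHEEL_HI_RES)
            printf("hi-res wheel: %d/120 of a detent\n", ev.value);
    }
    close(fd);
    return 0;
}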

Integrating this into the userspace stack was a bit of a mess at first, but I think the solution is good enough, even if it has a rather verbose explanation on how to handle it. The actual patches to integrate ended up being relatively simple. So let's see why it's a bit weird:

When Wayland started, back in WhoahReallyThatLongAgo, scrolling was specified as the wl_pointer.axis event with a value in pixels. This works fine for touchpads, not so much for wheels. The early versions of Weston decreed that one wheel click was 10 pixels [1] and, perhaps surprisingly, the world kept on turning. When libinput was forked from Weston an early change was that wheel events would have two values - degrees of movement and click count ("discrete steps"). The wayland protocol was expanded to include the discrete steps as wl_pointer.axis_discrete as well. Then backwards compatibility reared its ugly head and Mutter, Weston, GTK all basically said: one discrete step equals 10 pixels so we multiply the discrete value by 10 and, perhaps surprisingly, the world kept on turning.

This worked out well enough for a few years but with high resolution wheels we ran into a problem. Discrete steps are integers, so we can't send partial values. And the protocol is defined in a way that any tweaking of the behaviour would result in broken clients which, perhaps surprisingly, is a Bad Thing. This led to the current proposal of separate events: LIBINPUT_EVENT_POINTER_AXIS_WHEEL and, for Wayland, the wl_pointer.axis_v120 event linked to above. These events are (like the kernel events) a parallel event stream to the previous events and effectively replace the LIBINPUT_EVENT_POINTER_AXIS and Wayland wl_pointer.axis/axis_discrete pair for wheel events (not so for touchpad or button scrolling though).

The compositor side of things is relatively simple: take the events from libinput and pass the hires ones as v120 events and the lowres ones as v120 events with a value of zero. The client side takes the v120 events and uses them over wl_pointer.axis/axis_discrete unless one is zero in which case you can discard all axis events in that wl_pointer.frame. Since most client implementation already have the support for smooth scrolling (because, well, touchpads do exist) it's relatively simple to integrate - the new events just feed into the smooth scrolling code. And since you already have to do wheel emulation for that (because, well, old clients exist) wheel emulation is handled easily too.

All that to provide buttery smooth [2] wheel scrolling. Or not, if your hardware doesn't support it. In which case, well, live with the warm fuzzy feeling that someone else has a better user experience now. Or soon, anyway.

[1] with, I suspect, the scientific measurement of "yeah, that seems about alright"
[2] like butter out of a fridge, so still chunky but at least less so than before

April 03, 2020

This Month in Mutter & GNOME Shell | March 2020

During March, GNOME Shell and Mutter saw their 3.36.0 and 3.36.1 releases, and the beginning of the 3.38 development cycle. We’ve focused most of the development efforts on fixing bugs before starting the new development cycle.

From the development perspective, the 3.36.0 release was fantastic, and the number of regressions relative to the massive amount of changes that happened during the last cycle was remarkably small.

GNOME Shell

GNOME Shell saw continued improvements in its new Extensions app. New APIs were added to the Shell, which allow moving the Extensions app into its own codebase and let the Shell expose fewer interfaces through D-Bus. The Extensions app is now available on Flathub.

A number of other small bugs and crashes were fixed for 3.36.1. Notably, the blur effect now works properly with fractional scaling.

Initial 3.38 work includes improved Bluetooth state reporting, and the use of JavaScript promises to simplify various asynchronous operations.

Mutter

Following the 3.36.0 release, Mutter received various fixes to window streaming support. In contrast to streaming entire monitors, which was working properly, window streaming had a few quirks and misbehaviors. For 3.36.1, we’ve tracked down many issues around it and fixed them. Streaming windows is also done using DMA buffer sharing mechanisms.

On Wayland, new windows would sometimes animate from the wrong position, making the zoom-in animation look broken; this issue was fixed as well. Pasting images from Firefox no longer freezes apps, especially Xwayland apps. We also fixed a series of bugs where Xwayland windows would show a black border when resizing.

Mutter now properly handles hardware cursors when hotplugging GPUs, and cursor hotspots now work correctly again on virtual machines. Sometimes cursors would rotate wrongly when on already rotated displays, and this was also fixed.

On the X11 front, Mutter now respects manually configured RandR panning on X11, and a bug preventing the correct monitor scale from being applied on X11 was also fixed.

Mutter now also finally respects the “middle mouse button” emulation setting exposed via GSettings.

multi-value webassembly in firefox: from 1 to n

Greetings, hackers! Today I'd like to write about something I worked on recently: implementation of the multi-value future feature of WebAssembly in Firefox, as sponsored by Bloomberg.

In the "minimum viable product" version of WebAssembly published in 2018, there were a few artificial restrictions placed on the language. Functions could only return a single value; if a function would naturally return two values, it would have to return at least one of them by writing to memory. Loops couldn't take parameters; any loop state variables had to be stored to and loaded from indexed local variables at each iteration. Similarly, any block that would naturally return more than one result would also have to do so via locals.

This restriction is lifted with the multi-value proposal. Function types now map from result type to result type, where a result type is a sequence of value types. That is to say, just as functions can take multiple arguments, they can return multiple results. Similarly, with the multi-value proposal, block types are now the same as function types: loops and blocks can take arguments and return any number of results. This change improves the expressiveness of WebAssembly as a compilation target; a C++ program compiled to multi-value WebAssembly can be encoded in fewer bytes than before. Multi-value also establishes a base for other language extensions. For example, the exception handling proposal builds on multi-value to pass multiple values to catch blocks.
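As a concrete example, here is a small C function returning an aggregate; with the MVP ABI it has to return the pair through a pointer into linear memory, whereas a multi-value-aware toolchain can lower it to a wasm function of type [i32 i32] -> [i32 i32] and keep both results on the value stack (a sketch; the exact lowering depends on the toolchain):

#include <stdint.h>

struct divmod_result {
  int32_t quotient;
  int32_t remainder;
};

/* With multi-value, both fields can come back as wasm results instead of
   being spilled through a scratch area in linear memory. */
struct divmod_result divmod(int32_t n, int32_t d) {
  return (struct divmod_result){ n / d, n % d };
}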

So, that's multi-value. You would think that relaxing a restriction would be easy, but you'd be wrong! This task took me 5 months and had a number of interesting gnarly bits. This article is part one of two about interesting aspects of implementing multi-value in Firefox, specifically focussing on blocks. We'll talk about multi-value function calls next week.

multi-value in blocks

In the last article, I presented the basic structure of Firefox's WebAssembly support: there is a baseline compiler optimized for low latency and an optimizing compiler optimized for throughput. (There is also Cranelift, a new experimental compiler that may replace the current implementation of the optimizing compiler; but that doesn't affect the basic structure.)

The optimizing compiler applies traditional compiler techniques: SSA graph construction, where values flow into and out of graphs using the usual defs-dominate-uses relationship. The only control-flow joins are loop entry and (possibly) block exit, so the addition of loop parameters means that in multi-value there are some new phi variables at loop entry, and the expansion of the block result count from [0,1] to [0,n] means that you may have more block exit phi variables. But these compilers are built to handle these situations; you just build the SSA and let the optimizing compiler go to town.

The problem comes in the baseline compiler.

from 1 to n

Recall that the baseline compiler is optimized for compiler speed, not compiled speed. If there are only ever going to be 0 or 1 result from a block, for example, the baseline compiler's internal data structures will use something like a Maybe<ValType> to represent that block result.

If you then need to expand this to hold a vector of values, the naïve approach of using a Vector<ValType> would mean heap allocation and indirection, and thus would regress the baseline compiler.

In this case, and in many other similar cases, the solution is to use value tagging to represent 0 or 1 value type directly in a word, and the general case by linking out to an external vector. As block types are function types, they actually appear as function types in the WebAssembly type section, so they are already parsed; the BlockType in that case can just refer out to already-allocated memory.
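The shape of that trick, shown here as a hedged sketch rather than SpiderMonkey's actual BlockType code, is a single word whose low bit says whether it carries an inline value or a pointer to out-of-line storage:

#include <assert.h>
#include <stdint.h>

/* Illustrative tagged word: the common cases (zero or one result type) are
   encoded inline; the general case stores a pointer to already-allocated
   type data, relying on pointer alignment to keep the low bit free.  All
   names here are hypothetical. */
typedef uintptr_t BlockTypeWord;

#define INLINE_TAG ((uintptr_t) 1)

static BlockTypeWord encode_inline(uint8_t valtype_or_none) {
  return ((uintptr_t) valtype_or_none << 1) | INLINE_TAG;
}

static BlockTypeWord encode_out_of_line(const void *type_vector) {
  assert(((uintptr_t) type_vector & INLINE_TAG) == 0);
  return (uintptr_t) type_vector;
}

static int is_inline(BlockTypeWord w) {
  return (w & INLINE_TAG) != 0;
}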

In fact this value-tagging pattern applies all over the place. (The jit/ links above are for the optimizing compiler, but they relate to function calls; I'll write about that next week.) I have a bit of pause about value tagging, in that it's gnarly complexity and I didn't measure the speed of alternative implementations, but it was a useful migration strategy: value tagging minimizes performance risk to existing specialized use cases while adding support for new general cases. Gnarly it is, then.

control-flow joins

I didn't mention it in the last article, but there are two important invariants regarding stack discipline in the baseline compiler. Recall that there's a virtual stack, and that some elements of the virtual stack might be present on the machine stack. There are four kinds of virtual stack entry: register, constant, local, and spilled. Locals indicate local variable reads and are mostly like registers in practice; when registers spill to the stack, locals do too. (Why spill to the temporary stack instead of leaving the value in the local variable slot? Because locals are mutable. A local.get captures a local variable value at its point of execution. If future code changes the local variable value, you wouldn't want the captured value to change.)

Digressing, the stack invariants:

  1. Spilled values precede registers and locals on the virtual stack. If u and v are virtual stack entries and u is older than v, then if u is in a register or is a local, then v is not spilled.

  2. Older values precede newer values on the machine stack. Again for u and v, if they are both spilled, then u will be farther from the stack pointer than v.

There are five fundamental stack operations in the baseline compiler; let's examine them to see how the invariants are guaranteed. Recall that before multi-value, targets of non-local exits (e.g. of the br instruction) could only receive 0 or 1 value; if there is a value, it's passed in a well-known register (e.g. %rax or %xmm0). (On 32-bit machines, 64-bit values use a well-known pair of registers.)

push(v)
Results of WebAssembly operations never push spilled values, neither onto the virtual nor the machine stack. v is either a register, a constant, or a reference to a local. Thus we guarantee both (1) and (2).
pop() -> v
Doesn't affect older stack entries, so (1) is preserved. If the newest stack entry is spilled, you know that it is closest to the stack pointer, so you can pop it by first loading it to a register and then incrementing the stack pointer; this preserves (2). Therefore if it is later pushed on the stack again, it will not be as a spilled value, preserving (1).
spill()
When spilling the virtual stack to the machine stack, you first traverse stack entries from new to old to see how far you need to spill. Once you get to a virtual stack entry that's already on the stack, you know that everything older has already been spilled, because of (1), so you switch to iterating back towards the new end of the stack, pushing registers and locals onto the machine stack and updating their virtual stack entries to be spilled along the way. This iteration order preserves (2). Note that because known constants never need to be on the machine stack, they can be interspersed with any other value on the virtual stack.
return(height, v)
This is the stack operation corresponding to a block exit (local or nonlocal). We drop items from the virtual and machine stack until the stack height is height. In WebAssembly 1.0, if the target continuation takes a value, then the jump passes a value also; in that case, before popping the stack, v is placed in a well-known register appropriate to the value type. Note however that v is not pushed on the virtual stack at the return point. Popping the virtual stack preserves (1), because a stack and its prefix have the same invariants; popping the machine stack also preserves (2).
capture(t)
Whereas return operations happen at block exits, capture operations happen at the target of block exits (the continuation). If no value is passed to the continuation, a capture is a no-op. If a value is passed, it's in a register, so we just push that register onto the virtual stack. Both invariants are obviously preserved.

Note that a value passed to a continuation via return() has a brief instant in which it has no name -- it's not on the virtual stack -- but only a location -- it's in a well-known place. capture() then gives that floating value a name.

Relatedly, there is another invariant, that the allocation of old values on block entry is the same as their allocation on block exit, so that all predecessors of the block exit flow all values via the same places. This is preserved by spilling on block entry. It's a big hammer, but effective.

So, given all this, how do we pass multiple values via return()? We don't have unlimited registers, so the %rax strategy isn't going to work.

The answer for the baseline compiler is informed by our lean into the stack machine principle. Multi-value returns are allocated in such a way that a capture() can push them onto the virtual stack. Because spilled values must precede registers, we therefore allocate older results on the stack, and put the last result in a register (or register pair for i64 on 32-bit platforms). Note that it's possible in theory to allocate multiple results to registers; we'll touch on this next week.

Therefore the implementation of return(height, v1..vn) is straightforward: we first pop register results, then spill the remaining virtual stack items, then shuffle stack results down towards height. This should result in a memmove of contiguous stack results towards the frame pointer. However because const values aren't present on the machine stack, depending on the stack height difference, it may mean a split between moving some values toward the frame pointer and some towards the stack pointer, then filling in by spilling constants. It's gnarly, but it is what it is. Note that the links to the return and capture implementations above are to the post-multi-value world, so you can see all the details there.

that's it!

In summary, the hard part of multi-value blocks was reworking internal compiler data structures to be able to represent multi-value block types, and then figuring out the low-level stack manipulations in the baseline compiler. The optimizing compiler on the other hand was pretty easy.

When it comes to calls though, that's another story. We'll get to that one next week. Thanks again to Bloomberg for supporting this work; I'm really delighted that Igalia and Bloomberg have been working together for a long time (coming on 10 years now!) to push the web platform forward. A special thanks also to Mozilla's Lars Hansen for his patience reviewing these patches. Until next week, then, stay at home & happy hacking!

April 02, 2020

SCaLE 18x

Melissa Wu is organizing the Community Education Challenge. She attended her first conference with the GNOME Foundation at SCaLE.

The 18th annual Southern California Linux Expo (SCaLE) took place on March 5–8, 2020 in Pasadena, CA. As the largest community-run open source and free software conference in North America, it was interesting to see the variety of corporate and non-profit exhibitors all united under their passion for open source.

A photo of the GNOME booth, featuring a blue GNOME table cloth and exciting GNOME swag, including bugs, t-shirts, and a tote bag.

The GNOME presence was felt throughout the conference with a special GNOME Beers and pre-release party on the first day of the conference, Thursday, March 5th. GNOME information flyers were also included inside every attendee bag.

This presence carried on to our booth, where we were able to connect with GNOME community members, contributors, and enthusiasts, as well as show off our merchandise, including a brand new GNOME t-shirt and stickers. Thank you to the many supporters who assisted us at the booth, including Foundation staff Melissa Wu, Caroline Henriksen, Neil McGovern, and Rosanna Yuen, along with Foundation members Matthias Clasen, Sriram Ramkrishna, and Nuritzi Sanchez.

A photo of three people showing off temporary tattoos of the GNOME logo.

A Coloring API for GTK

At GUADEC 2019 we had a vendor themes BoF which expanded into a discussion of application developers’ need to brand their apps with color. We agreed on the need for a recoloring API that apps and vendors could take advantage of.

This week we had the Design Tools Hackfest 2020, virtualized because of COVID-19, where we discussed that recoloring API. We came up with something I think is interesting enough to discuss more widely.

The Need

First let’s define the need:

  • vendors want to inject their branding in applications, this is typically done by shipping a different theme which creates a whole new set of problems;
  • application developers want to use their branding or inject colors in parts of the application’s UI;
  • users want to make the system more their own by setting the system’s colors.

Common patterns in vendor themes are to change the accent color from the default blue to something better matching their brand like orange or green, another is to have a light theme but a dark titlebar as do Ubuntu and Pop!_OS.

Files with Ubuntu's Yaru GTK theme

Application developers want to convey information via color, e.g. Web uses a light blue titlebar to convey you are browsing in private mode, and King’s Cross uses a red titlebar to convey you are in super user mode and a purple one to convey you are not on the host’s shell (e.g. in a container or in SSH).

Web in private mode
King's Cross' titlebar in super user mode and remote mode

Application developers want to inject their branding via color, e.g. Ciano has a cyan titlebar and Tootle has a desaturated blue one. I could easily see GNOME Twitch use a purple titlebar too.

The elementary apps Ciano and Tootle have branded titlebars

Application developers also color the UI to reflect the meaning of the material it presents, e.g. Notes gives a color to each note; when viewing a specific one, almost all of the window takes the note’s color, and I could imagine the whole window taking it.

Notes could have the whole UI match the colors of the note

Users want to set the system’s accent color, to make the device they use feel more like their own; to do so Windows lets its user set an accent color that will be used in the applications as the titlebar color and accent color.

How to Get There

We need to define a set of color variables and what they should be used for, like selected foreground color or unfocused border color, and these color variables should be made public and defined as the coloring API. Adwaita already has a list of color variables which could serve as a base to forge that API.

To add some variety, each of these color variables would come in several variants; during the hackfest we defined three so far: regular, titlebar, and alternate, which would correspond to CSS classes you can use to automatically color elements. regular would be the default color used in the whole UI, titlebar would correspond to the titlebar style class that GTK already gives by default to window titlebars, and the alternate style class would let the application developer apply a third style wherever they want in the UI.

Applications would be able to redefine the colors for all their windows via CSS, like this:

@define-color titlebar_bg_color #f00;

For that to work, themes would only be allowed to use these public color variables in CSS. Given SASS hardcodes its variables’ values, its functions’ results, and the selected conditional branches in the compiled CSS, you can’t affect the style by overriding the public color variables in your application.

In order to change the style by redefining public color variables, you have to make the color computation logic appear directly in the generated CSS, which is fortunately doable using some tricks (see the code example below). The resulting limitation of not being able to use the full power of SASS can pretty easily be mitigated by offering similar color functions in GTK, but color-dependent conditional branching would have to be abandoned.

// Here is how you can call the alpha() GTK CSS color function from SASS.
@function gtkalpha($color, $alpha) {
  @return unquote("alpha(#{$color}, #{$alpha})");
}
// Because CSS doesn't have conditional branching, this bit from Adwaita
// would have to be abandoned, unless we create a very specific color
// function in GTK for each and every case.
@if lightness($c)>95% { @return white; }
@else if lightness($c)>90% { @return transparentize(white, 0.2); }
@else if lightness($c)>80% { @return transparentize(white, 0.5); }
@else if lightness($c)>50% { @return transparentize(white, 0.8); }
@else if lightness($c)>40% { @return transparentize(white, 0.9); }
@else { @return transparentize(white, 0.98); }

Application developers too shouldn’t deviate from this API as their source of colors, unless of course there is a very valid reason to opt out.

This color API would be a contract between the GTK developers, theme developers, and application developers that they all should follow for colors to be tweakable.

Implementing Variants in Adwaita

The bulk of Adwaita is implemented in a file named _common.scss, which is then imported by the files implementing its different variants (light and dark). In it, the label color would be set that way:

label { color: $fg_color; }

$fg_color is a color hardcoded in Adwaita and then exported as @theme_fg_color, if we assume it’s pure white, the following CSS would be generated:

label { color: #fff; }

To implement the color API we have two problems: Adwaita should use the overrideable color variables from the color API in its CSS instead of hardcoded colors, and it should offer the different color variants.

Adwaita implements one of its theme variants that way:

$variant: 'light';

@import 'colors';
@import 'drawing';
@import 'common';
@import 'colors-public';

As you can see, _common.scss is imported via @import 'common';, and variables can be set before importing the file and used in it, like $variant: 'light';. We could import _common.scss for each of the variants of the color API, tweaking a few variables for each import:

$variant: 'light';

@import 'colors';
@import 'drawing';

// The default color variant
$color_selector: '&';
$fg_color: unquote('@theme_fg_color');
@import 'common';

// The titlebar color variant
$color_variant: 'titlebar';
$color_selector: '&.#{$color_variant}, .#{$color_variant} &';
$fg_color: unquote('@titlebar_fg_color');
@import 'common';

// The alternate color variant
$color_variant: 'alternate';
$color_selector: '&.#{$color_variant}, .#{$color_variant} &';
$fg_color: unquote('@alternate_fg_color');
@import 'common';

@import 'colors-public';

Then in _common.scss, every time we use a public color we would do this:

label { #{$color_selector} { color: $fg_color; } }

The following CSS would then be generated:

label { color: @theme_fg_color; }
label.titlebar, .titlebar label { color: @titlebar_fg_color; }
label.alternate, .alternate label { color: @alternate_fg_color; }

By default, Adwaita would use the same color values for all color variants.

Tada~, all labels in a titlebar would have the overrideable titlebar color and all labels marked to use the alternate color would do so.

Settings and Priorities

So far I detailed how such a coloring API could be implemented by themes and taken advantage of by apps, but I didn’t explain how we could support vendor theming or user customization. While I don’t think this should be supported and implemented, and I think this API should only be between the GTK team and application developers, I’ll explain how I think we can extend it to let the vendors and the users set the colors.

GSettings seems like a good candidate to offer that API to vendors and users: GTK could offer to set the color variables from the API via GSettings, which means vendors could override them with a simple GSettings override, and users could override them to set their preferred colors over the default and vendor ones.

GTK allows different sources to provide styling, and each style provider is given a priority: GTK offers the fallback, theme, settings, application, and user priorities, from the lowest to the highest. If we follow these priorities and GTK loads the colors from the settings with the settings priority, it means we offer what I consider the perfect priority order, from the lowest to the highest:

  • theme provided colors (theme priority);
  • vendor provided colors (settings priority);
  • user provided colors (settings priority, overriding the vendor’s default);
  • application provided colors (application priority).

April 01, 2020

Chafa 1.4.0: Now with sixels

April 1st seems like as good a time as any for a new Chafa release though note that Chafa is no joke. At least not anymore, what with the extremely enterprise-ready sixel pipeline and all.

As usual, you can get it from the download page or from Github. There are also release notes. Here are the highlights:

Sixel output

Thanks to this 90s-era technology, you can print excellent-looking graphics directly in the terminal with no need for character cell mosaics or hacky solutions like w3mimagedisplay (from w3m) or Überzug. It works entirely using ANSI escape sequence extensions, so it’s usable over ssh, telnet and that old 2400 baud modem you found in grandma’s shed.

The most complete existing implementation is probably Hayaki Saito’s libsixel, but I chose to write one from scratch for Chafa, since sixel output is remarkably intensive computationally, and I wanted to employ a combination of advanced techniques (parallelism, quantization using a PCA approach, SIMD scaling) and corner-cutting that wouldn’t have been appropriate in that library. This gets me fast animation playback and makes it easier to phase out the ImageMagick dependency in the long term.

There are at least two widely available virtual terminals that support sixels: One is XTerm (when compiled with --enable-sixel), and the other is mlterm. Unfortunately, I don’t think either is widely used compared to distribution defaults like GNOME Terminal and Konsole, so here’s hoping for more mainstream support for this feature.

Glyph import

If sixels aren’t your cup of tea, symbol mode has a new trick for you too. It’s --glyph-file, which allows you to load glyphs from external fonts into Chafa’s symbol map. This can give it a better idea of what your terminal font looks like and allows support for more exotic symbols or custom fonts to suit any respectable retro graphics art project.

Keep in mind that you still need to select the appropriate symbol ranges with --symbols and/or --fill. These options now allow specifying precise Unicode ranges, e.g. --symbols 20,41..5a to emit only ASCII spaces and uppercase letters.

Color extraction

In symbol mode, each cell’s color pair is now based on the median color of the underlying pixels instead of the average. Now this isn’t exactly a huge feature, at least not in terms of effort, but it can make a big difference for certain images, especially line art. You can get the old behavior back with --color-extractor average.

GTK 3.98.2

When we released 3.98.0, we promised more frequent snapshots, as the remaining GTK 4 features are landing. Here we are a few weeks later, and 3.98.1 and 3.98.2 snapshots have quietly made it out.

So, what is new ?

Features

There is still work left to do, but a few more big features have landed.

The first is that we have completed the reimplementation of GtkPopovers as xdg-popup surfaces, and split up the GdkSurface API into separate GdkToplevel and GdkPopup interfaces (there’s a GdkDragSurface interface too), which reflect the different roles of surfaces:

  • Toplevels are sovereign windows that are placed by the user and can be maximized, fullscreened, etc.
  • Popups are positioned relative to a parent surface and often grab input, e.g. when used for menus.

In GTK, popovers have lost their :relative-to property, since they are now part of the regular hierarchy like any other widget, and GtkWindow has lost its :window-type property, since all instances of GTK_WINDOW_POPUP have been converted to popovers, and windows are just used for proper toplevels.

Another major feature is the new infrastructure for keyboard shortcuts. In the past, GTK has had a plethora of APIs to implement key bindings, mnemonics and accelerators. In GTK 4, all of this is handled by event controllers. GtkShortcutController is a bit more complex than typical event controllers, since it handles all the different kinds of shortcuts with a unified API.

Thankfully, most of the complexity is hidden. For widget implementors, the important APIs are the variants of gtk_widget_class_add_shortcut(), which are used to add key bindings. For applications, mnemonics and global accels (with gtk_application_set_accels_for_action()) work the same as before. Additionally, it is possible to create shortcut controllers and shortcuts in ui files.
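For application code that looks something like this; a small sketch against the public gtk_application_set_accels_for_action() API, with a made-up "app.quit" action name:

#include <gtk/gtk.h>

/* Bind Ctrl+Q to a hypothetical "app.quit" action.  This call predates
 * GTK 4 and keeps working the same way on top of the new shortcut
 * infrastructure. */
static void
setup_accels (GtkApplication *app)
{
  const char *quit_accels[] = { "<Control>q", NULL };

  gtk_application_set_accels_for_action (app, "app.quit", quit_accels);
}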

A set of smaller features has landed in the form of a few GtkTextTag properties that expose new pango features such as overlines, visible rendering of spaces and control over hyphenation. These can now be controlled in a GtkTextView via tags. In entries, they can already be controlled by directly adding pango attributes.

Completions

When I wrote about 3.98, I said that the Drag-and-Drop refactoring was complete. That turned out to be not quite correct, and another round of DND work has landed since. These changes were informed by developer feedback on the Drag-and-Drop API. Yay for user testing!

We introduced separate GtkDropTarget and GtkDropTargetAsync event controllers, with the former being simplified to avoid all async API, which makes it very easy to handle local cases.

We also cleaned up internals of the DND implementation to group DND events into event sequences, handle them in just the same way as normal motion events, and introduced GtkDropControllerMotion, which is an event controller that is designed to handle things like tab switching during a DND operation.
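A sketch of what the simplified, non-async GtkDropTarget looks like from the application side, accepting dropped files on a widget (based on the API as it eventually stabilized; details were still settling during the 3.98 snapshots):

#include <gtk/gtk.h>

/* Handle a file dropped onto the widget; no async API involved. */
static gboolean
on_drop (GtkDropTarget *target,
         const GValue  *value,
         double         x,
         double         y,
         gpointer       user_data)
{
  GFile *file = g_value_get_object (value);

  g_print ("dropped: %s\n", g_file_peek_path (file));
  return TRUE;
}

static void
attach_drop_target (GtkWidget *widget)
{
  GtkDropTarget *target = gtk_drop_target_new (G_TYPE_FILE, GDK_ACTION_COPY);

  g_signal_connect (target, "drop", G_CALLBACK (on_drop), NULL);
  gtk_widget_add_controller (widget, GTK_EVENT_CONTROLLER (target));
}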

Finally, we could remove the remnants of X11-style property and selection APIs; GtkSelectionData and GdkAtom are gone.

Cleanups and fixes

As always, there’s a large number of smaller cleanups and fixes that have happened.

The biggest group of cleanups happened in the file chooser, where a number of marginally useful APIs (extra widgets, overwrite confirmation, :local-only, GTK_FILE_CHOOSER_ACTION_CREATE_FOLDER, etc) have been dropped. To make up for it, the portal implementation of the native file chooser supports selecting folders now.

Another big cleanup was that GdkEvent is now an immutable boxed type. This was mainly an internal cleanup; the effect on application-level APIs is small, since event controllers have replaced direct event handling for the most part.

One new such event controller is GtkEventControllerFocus, which was split off from the key event controller to provide just focus handling.

GtkMenuButton lost its ability to have mnemonics when it was turned from a GtkButton subclass into a plain widget. This functionality has been reinstated, with a :use-underline property.

The HighContrast and HighContrastInverse themes that are included in GTK are now derived from Adwaita, for a much reduced maintenance burden and improved quality. Trying these themes out in gtk4-widget-factory is now easier, since we added a style menu.

The new HighContrast theme has also been backported to GTK 3.

What’s ahead

We will continue our snapshots and hope to get more developer feedback on the new APIs and features described above.

Here are things that we still want to integrate before GTK 4:

  • Row-recycling list and grid views
  • Revamped accessibility infrastructure
  • Animation API

If you want to follow the GTK 4 work, go here.

PAM testing using pam_wrapper and dbusmock

On the road to libfprint and fprintd 2.0, we've been fixing some long-standing bugs, including one that required porting our PAM module from dbus-glib to sd-bus, systemd's D-Bus library implementation.

As you can imagine, I have confidence in my ability to write bug-free code at the first attempt, but the foresight to know that this code will be buggy if it's not tested (and to know there's probably a bug in the tests if they run successfully the first time around). So we will have to test that PAM module, thoroughly, before and after the port.

Replacing fprintd

First, to make it easier to run and instrument, we needed to replace fprintd itself. For this, we used dbusmock, which is both a convenience Python library and a way to write instrumentable D-Bus services, and wrote a template for it. There are a number of existing templates for a lot of session and system services, in case you want to test the integration of your code with NetworkManager, low-memory-monitor, or any number of other services.
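To give an idea of what this looks like in practice, here is a minimal sketch of a python-dbusmock based test, using the existing networkmanager template as a stand-in (the fprintd template we wrote is used the same way; the test name and the NetworkingEnabled parameter are just examples):

import subprocess
import unittest

import dbus
import dbusmock


class TestWithMockedNetworkManager(dbusmock.DBusTestCase):
    @classmethod
    def setUpClass(cls):
        # Spin up a private system bus so the real one is never touched
        cls.start_system_bus()
        cls.dbus_con = cls.get_dbus(system_bus=True)

    def setUp(self):
        # Start a mocked NetworkManager from the template shipped with dbusmock
        (self.p_mock, self.obj_nm) = self.spawn_server_template(
            'networkmanager', {'NetworkingEnabled': True}, stdout=subprocess.PIPE)

    def tearDown(self):
        self.p_mock.terminate()
        self.p_mock.wait()

    def test_mock_is_on_the_bus(self):
        # Query the mocked service like any client of the real one would
        props = dbus.Interface(self.obj_nm, dbus.PROPERTIES_IFACE)
        self.assertTrue(props.Get('org.freedesktop.NetworkManager', 'NetworkingEnabled'))


if __name__ == '__main__':
    unittest.main()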

We then used this to write tests for the command-line utilities, so we can both test our new template and test the command-line utilities themselves.

Replacing gdm

Now that we've got a way to replace fprintd and a physical fingerprint reader, we need to write some tests for the (old) PAM module, which means replacing the programs that would normally drive it: sudo, gdm, or the login authentication services.

Co-workers Andreas Schneider and Jakub Hrozek worked on pam_wrapper, an LD_PRELOAD library to mock the PAM library, and on Python helpers to write simple PAM services. This LWN article explains how to test PAM applications and PAM modules.
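The basic mechanism is simple enough to sketch: pam_wrapper is driven entirely through environment variables, and the wrapped process then talks to a private PAM service directory instead of the system one. Everything below is specific to your setup: the pam-test-client binary, the test-fprintd service name, and the services/ directory are all hypothetical, and pam_wrapper's Python helpers (described in the LWN article) offer a nicer interface than calling a binary by hand.

import os
import subprocess

# Preload pam_wrapper and point PAM at a private service directory, so the
# system's real PAM configuration is never touched by the tests.
env = dict(os.environ)
env.update({
    "LD_PRELOAD": "libpam_wrapper.so",
    "PAM_WRAPPER": "1",
    "PAM_WRAPPER_SERVICE_DIR": os.path.abspath("services"),
})

# "pam-test-client" stands in for whatever small program exercises
# pam_authenticate() against the wrapped stack, using the "test-fprintd"
# service file placed in ./services/.
subprocess.run(["./pam-test-client", "test-fprintd"], env=env, check=True)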

After fixing a few bugs in pam_wrapper, and combining with the fprintd dbusmock work above, we could wrap and test the fprintd PAM module like it never was before.

Porting to sd-bus

Finally, porting the PAM module to sd-bus was pretty trivial, a loop of 1) writing tests that work against the old PAM module, 2) porting a section of the code (like the fingerprint reader enumeration, or the timeout support), and 3) testing against the new sd-bus based code. The result was no regressions that we could test for.

Conclusion

Both dbusmock and pam_wrapper are useful tools to have in your arsenal for writing tests, and given the (fairly) easy-to-use CI in GNOME's and FreeDesktop.org's GitLab instances, it would be a shame not to use them.

You might also be interested in umockdev, which mocks a number of device types, and mocklibc (which, combined with dbusmock, powers polkit's unattended CI).

March 31, 2020

Sandboxing WebKitGTK Apps

When you connect to a Wi-Fi network, that network might block your access to the wider internet until you’ve signed into the network’s captive portal page. An untrusted network can disrupt your connection at any time by blocking secure requests and replacing the content of insecure requests with its login page. (Of course this can be done on wired networks as well, but in practice it mainly happens on Wi-Fi.) To detect a captive portal, NetworkManager sends a request to a special test address (e.g. http://fedoraproject.org/static/hotspot.txt) and checks to see whether the content has been replaced. If so, GNOME Shell will open a little WebKitGTK browser window to display http://nmcheck.gnome.org, which, due to the captive portal, will be hijacked by your hotel or airport or whatever to display the portal login page. Rephrased in security lingo: an untrusted network may cause GNOME Shell to load arbitrary web content whenever it wants. If that doesn’t immediately sound dangerous to you, let’s ask me from four years ago why that might be bad:

Web engines are full of security vulnerabilities, like buffer overflows and use-after-frees. The details don’t matter; what’s important is that skilled attackers can turn these vulnerabilities into exploits, using carefully-crafted HTML to gain total control of your user account on your computer (or your phone). They can then install malware, read all the files in your home directory, use your computer in a botnet to attack websites, and do basically whatever they want with it.

If the web engine is sandboxed, then a second type of attack, called a sandbox escape, is needed. This makes it dramatically more difficult to exploit vulnerabilities.

The captive portal helper will pop up and load arbitrary web content without user interaction, so there’s nothing you as a user could possibly do about it. This makes it a tempting target for attackers, so we want to ensure that users are safe in the absence of a sandbox escape. Accordingly, beginning with GNOME 3.36, the captive portal helper is now sandboxed.

How did we do it? With basically one line of code (plus a check to ensure the WebKitGTK version is new enough). To sandbox any WebKitGTK app, just call webkit_web_context_set_sandbox_enabled(). Ta-da, your application is now magically secure!

No, really, that’s all you need to do. So if it’s that simple, why isn’t the sandbox enabled by default? It can break applications that use WebKitWebExtension to run custom code in the sandboxed web process, so you’ll need to test to ensure that your application still works properly after enabling the sandbox. (The WebKitGTK sandbox will become mandatory in the future when porting applications to GTK 4. That’s thinking far ahead, though, because GTK 4 isn’t supported yet at all.) You may need to use webkit_web_context_add_path_to_sandbox() to give your web extension access to directories that would otherwise be blocked by the sandbox.

The sandbox is critically important for web browsers and email clients, which are constantly displaying untrusted web content. But really, every app should enable it. Fix your apps! Then thank Patrick Griffis from Igalia for developing WebKitGTK’s sandbox, and the bubblewrap, Flatpak, and xdg-desktop-portal developers for providing the groundwork that makes it all possible.

API changes in Tracker 3.0

 

This article has been updated to correct a misunderstanding I had about the CONSTRAINT feature. Apps will not need to explicitly add this to their queries; it will be added implicitly by the xdg-tracker-portal process.

Lots has happened in the 2 months since my last post, most notably the global coronavirus pandemic … in Spain we’re in week 3 of quarantine lockdown already and no one knows when it is going to end.

Let’s take our mind off the pandemic and talk about Tracker 3.0. At the start of the year Carlos worked on some key API changes which are now merged. It’s a good opportunity to recap what’s really changing in the new version.

I made the developer documentation for Tracker 3.0 available online. Thanks to GitLab, this can be updated every time we merge a change in Git. The documentation is a work in progress and we'd appreciate your help to improve it.

The documentation contains a migration guide, but let’s have a broader look at some common use cases.

Tracker 3.0 is still in development and things may change! We very much welcome feedback from app developers who are going to use this API.

Browsing and searching

The big news in Tracker 3.0 is decentralization. Each app can now manage its own private database! There’s no single “Tracker store” any longer.

Tracker 3.0 will index content from the filesystem to facilitate searching and browsing, as it does now. The filesystem miner will keep this in its own database, and Flatpak apps will access this database through a portal (currently in development).

Apps access this data using a TrackerSparqlConnection just like now, but when we create the connection we need to specify that we want to connect to the filesystem miner’s database.

Here’s a Python example of listing all the music files in the user’s ~/Music directory:

from gi.repository import Tracker

conn = Tracker.SparqlConnection.bus_new(
    "org.freedesktop.Tracker3.Miner.Files", None, None)
cursor = conn.query(
    'SELECT ?url { ?r a nmm:MusicPiece ; nie:url ?url }')
print("Found music files:\n")
while cursor.next():
    print(cursor.get_string(0)[0])

Running a full text search will be similar. Here’s how you’d look for “bananas” in every file in the user's ~/Documents folder:

cursor = conn.query(
    'SELECT ?url fts:snippet(?r) { '
    '    ?r a nfo:Document ; '
    '        nie:url ?url ; '
    '        fts:match "Bananas" '
'}')
print("Found document files:\n")
while cursor.next():
    print("   url: {}".format(cursor.get_string(0)[0]))
    print("   snippet: {}".format(cursor.get_string(1)[0]))

If you are running inside a Flatpak sandbox then there will be a portal between you and the org.freedesktop.Tracker3.Miner.Files database. The read-only /.flatpak-info file inside the sandbox, which is created when building the Flatpak, will declare what graphs your app can access. The xdg-tracker-portal will add that information into the SPARQL query using a Tracker-specific CONSTRAINT GRAPH syntax, and the database will enforce the constraint, ensuring that your app really does only see the graphs that it has requested access to.

Storing your own data

Tracker can be used as a data store by applications. One principle behind the design of Tracker 1.x was that by using a centralized store and a common vocabulary, different apps could easily share data. For example, when you create an album in GNOME Photos, it’s stored in the Tracker database using the standard nfo:DataContainer class. Any other app, perhaps a file manager, or a photos app from a different platform, can show and edit albums stored in this way without having to know specifics about GNOME Photos. Playlists in GNOME Music and starred files in Nautilus are also stored this way.

This approach had some downsides. Having all data in a single database creates a single point of failure. It’s hard to back up the valuable user data without backing up the search and indexing data too – but since the index can be recreated from the filesystem, it’s a waste of resources to include that in a backup. Apps were also forced to share a single database schema which was maintained in the tracker.git repository.

In Tracker 3.0, each app creates a private database for storing its own data. It can use the ontology (database schema) from Tracker, or it can provide its own version. Here’s how a photos app written in Python could store photo albums:

from gi.repository import Gio, GLib, Tracker
import pathlib

def app_database_dir():
    data_dir = pathlib.Path(GLib.get_user_data_dir())
    return data_dir.joinpath('my-photos-app/db')

location = Gio.File.new_for_path(str(app_database_dir()))
conn = Tracker.SparqlConnection.new(
    Tracker.SparqlConnectionFlags.NONE, location, None)

conn.update(
    'INSERT {  a nfo:DataContainer, nie:DataObject ; '
    '           nie:title "My Album" }',
    0, None)

Now let’s insert a photo into this album. Remember that the user’s photos are indexed by the filesystem miner. We can use the SERVICE statement to connect the filesystem miner’s database to our app’s private database, like this:

conn.update(
    'CONSTRAINT GRAPH  '
    'INSERT { '
        '   SELECT ?photo { '
        '       SERVICE  { '
        '           ?photo nie:url  '
        '       } '
        '   }, '
        '   ?photo nie:isPartOf  . ',
    '}',
    0, None)

Now let’s display the contents of the album:

cursor = conn.query(
    'CONSTRAINT GRAPH  '
    'SELECT ?url { '
    '    SELECT ?photo ?url { '
    '        SERVICE  { '
    '            ?photo a nmm:Photo ; nie:url ?url . '
    '        } '
    '    } '
    '    ?photo nie:isPartOf . '
    '}')
while cursor.next():
    print(cursor.get_string(0)[0])

Notice again that the app has to request permission to access the Photos graph. If our example app is running in Flatpak, this will require a special permission.

It’s still possible for one app to share data with another, but it will require coordination at the app level. Using the example of photo albums, GNOME Photos can opt to make its database available to other apps. If a different app wants to see the user’s photo albums, they’ll need to connect to the org.gnome.Photos database over D-Bus. As usual, Flatpak apps would need permission to do this.

Is it a good time to port my app to Tracker 3.0?

It’s a good time to start porting your app. You will definitely be able to help us with testing and stabilising the library and the documentation if you start now.

There are some API changes still unmerged at the time of writing, primarily the Flatpak portal and the CONSTRAINT feature, as well as the details of how you specify which ontology to use.

Some functionality is no longer exposed in C libraries, because libtracker-control and libtracker-miner have been made private. As far as we know libtracker-miner is unused outside Tracker, but some apps are currently using libtracker-control to display status updates for the Tracker daemons and trigger indexing of removable devices. We have an open issue about improving the story for on-demand removable device indexing. For status monitoring you may use the underlying D-Bus signals, and I’m also hoping to make these more useful.

Ideally I’d like to add a new helper library for Tracker 3.0 which would conveniently wrap the high level features that apps use. My volunteer time is limited though. I can share ideas for this if you are looking for a way to contribute!

What about a hackfest?

At some point we need to finish the Tracker 3.0 work and make sure that apps that use Tracker are all ported and working. The best case is that we do this in time for the upcoming GNOME 3.38 release. We discussed holding a hackfest at some point between now and GNOME 3.38 to make sure things are settled; it may now be that an in-person hackfest won’t be feasible in light of the coronavirus pandemic, but a series of online meetings would be a good alternative. We can only wait and see!

March 30, 2020

How to create border maps for your projects

When working with data projects it is usual to need administrative maps. In my experience it is not trivial to find the cartographic files as open access or open source data sources, so after some searching I found a method to create an ad-hoc map for any administrative region coded into OpenStreetMap. It’s not a trivial method, but it is not as complex as it seems at first sight. I’ll try to introduce the essential concepts so the recipe is easy to understand. If you know other methods as good as or better than this one, please give me some feedback.

I used this method with geodata for Spain so I guess it works with any other administrative region coded in OSM.

First you need to know an OpenStreetMap concept: the relation. In our case we’ll use multipolygon relations, used to code the borders of the areas of interest. The important thing to remember here is that you are going to use an OSM relation.

Second you’ll want to select the region of your interest and you’ll need to figure out how it has been mapped in OSM. So you need to find the related OSM relation. As example I’ll use Alamedilla, my parents’ town in the province of Granada, Spain.

the method

Go to https://www.openstreetmap.org and search for the region of your interest. For example Alamedilla:

example screenshot

Click on the correct place and you’ll see something like this:

example screenshot

Look at the URL box in the browser and you’ll see something like this: https://www.openstreetmap.org/relation/343442. The number you need for the next steps is the one in the URL after the relation keyword. In this example it is 343442.

Then visit the Overpass Turbo service, a powerful web-based data query and filtering tool for OpenStreetMap:

example screenshot

The white box on the left is where you write the code of your query for Overpass. There is a wizard tool in the menu, but it’s not trivial either. Instead you can copy exactly this code:


[out:json][timeout:2500];
(
    relation(343442)({{bbox}});
);
out body;
>;
out skel qt;

example screenshot

In your case you need to replace the 343442 number (used for the Alamedilla example) with the relation number you got before. If you modify the query, keep in mind that the default timeout (25 seconds) may not be enough for your case.

Now, clicking the Run button will execute your query. Keep in mind the resulting data set could be really big, depending on how big the area is.

So, here it is:

example screenshot

Zoom the map to have a better view:

example screenshot

Now you’ll find the resulting data set in GeoJSON format ready at the Data tab (right side). If this format is fine for you you are done. But if you need some other you are lucky enough because when clicking into Export button you’ll find some other formats to export: GPX, KML and OSM data.

In this example we’ll use the KML format used by Google Earth, Maps and many others.

example screenshot

importing into Google Earth

Open Google Earth:

example screenshot

and open our KML file via [File] → [Open]:

example screenshot

and here it is:

example screenshot

Note: I modified the color (at the object properties) to make it more visible in the screenshot.

So, it is done. Now you can use the KML file in your application, import it into any GIS software, or convert it to another format if required.

importing into Google Maps

Go to Google My Maps and create a new map. Import a new layer and select your KML file:

example screenshot

Here it is:

example screenshot

conclusion

Now you are able to create maps of any region added to OpenStreetMap, export them to any of the said formats, and import them into your applications. Hope this helps.

If you finally use data from the OSM project remember to add the correct credits:

We require that you use the credit “© OpenStreetMap contributors”.

See credit details at osm.org/copyright.

This is an example of how AWESOME OpenStreetMap is and of the extraordinary work these people do. Big thanks to all of the contributors for this impressive service.

DevConf.CZ 2020

Once again, DevConf.CZ was our meeting-while-freezing winter conference in Brno. This year I cooked up two talks:

An hour-long talk about Portals during the first day of the conference. The room was almost full and the questions were very relevant. A few attendees met me after the talk seeking help to make their apps start using Portals and with ideas for new Portals.  You can watch the recordings below:

On the last conference day, I had a quick twenty-minute talk about GNOME Boxes in the virtualization track. The audience wasn’t the usual known faces from the desktop talks, so I got the chance to show Boxes to a bunch of people for the first time. I did a quick presentation with live demos and Q&A. It was a success IMHO. Check the recordings below:

Besides, I participated in the “Diversity and Inclusion” and “Women in Open source” meetups. It was a good opportunity to see what other teams are doing to be more diverse and also to share my personal experiences with mentoring with Outreachy.

Langdon White had a talk on Fedora Silverblue raising important questions about its development workflow. I was glad some of these issues were already addressed and fixed, but I recommend that those who didn’t attend this talk watch the recording. It is important feedback.

I felt honored to be mentioned in Rebecca Fernandez’s talk about “Growing your career via open source contributions”, where she had slides showing people’s stories, including mine.

I managed to catch up with the developments of the virgil driver on Windows in order to support Direct3D, and discuss other future developments with folks from the SPICE team.

Other than that, I attended many podman/containers talks to better understand their development workflows and how we could accommodate these workflows in Silverblue. I spoke to Red Hatters from other teams that need CodeReadyContainers to test their applications, and how we could improve their workflow in Fedora Workstation.

Lastly, I had a great time with [delicious] food and drinks at the DevConf Party in Fleda, which is 200 meters away from our flat. :-)

GNOME Infrastructure updates

As you may have noticed from the outage and maintenance notes we sent out last week, the GNOME Infrastructure has been undergoing a major redesign due to the need to move to a different datacenter. It’s probably a good time to update the Foundation membership, contributors and generally anyone consuming the multitude of services we maintain on what we’ve been up to during these past months.

New Data Center

One of the core projects for 2020 was moving services from the previous DC (located in PHX2, Arizona) over to the Red Hat community cage located in RAL3. This specific task was made possible right after we received a new set of machines that allowed us to refresh some of the ancient hardware we had (with the average box dating back to 2013). The new layout is composed of a total of 5 (five) bare-metal machines and 2 (two) core technologies: Openshift (v. 3.11) and Ceph (v. 4).

The major improvements worth mentioning:

  1. VMs can be easily scheduled across the hypervisor stack without having to copy disks between the hypervisors themselves; VM disks and data are now hosted within Ceph.
  2. IPv6 is available (not yet enabled/configured at the OS, Openshift router level)
  3. Overall better external internet uplink bandwidth
  4. Most of the VMs that we had running were turned into pods and are now successfully running from within Openshift

RHEL 8 and Ansible

One of the things we had to take into account was running Ceph on top of RHEL 8 to benefit from its containerized setup. This originally presented a challenge due to the fact that RHEL 8 ships with a much newer Puppet release than the one RHEL 7 provides. At the same time we didn’t want to invest much time in upgrading our Puppet code base, given the number of VMs we were able to migrate to Openshift and the general willingness to slowly move to Ansible (client-side, with no more server-side pieces to maintain). In this regard we:

  1. Landed support for RHEL 8 provisioning
  2. Started experimenting with Image Based deployments (much faster than Cobbler provisioning)
  3. Cooked a set of base Ansible roles to support our RHEL 8 installs, including IDM, chrony, Satellite, Dell OMSA, NRPE, etc.

Openshift

As originally announced, the migration to the Openshift Container Platform (OSCP) has progressed and we now count a total of 34 tenants (including the entirety of GIMP websites). This allowed us to:

  1. Retire running VMs and avoid the need to upgrade their OS whenever they’re close to EOL. Also, in general, less maintenance burden
  2. Allow the community to easily provision services on top of the platform with total autonomy by choosing from a wide variety of frameworks, programming languages and database types (currently Galera and PSQL, both managed outside of OSCP itself)
  3. Easily scale the platform by adding more nodes/masters/routers whenever that is made necessary by additional load
  4. Data replicated and made redundant across a GlusterFS cluster (next on the list will be introducing Ceph support for pods persistent storage)
  5. Easily set up services such as Rocket.Chat and Discourse without having to mess much around with Node.JS or Ruby dependencies

Special thanks

I’d like to thank BartÅ‚omiej Piotrowski for all the efforts in helping me out with the migration during the past couple of weeks and Milan Zink from the Red Hat Storage Team who helped out reviewing the Ceph infrastructure design and providing useful information about possible provisioning techniques.

March 29, 2020

Maps in GNOME 3.36

It's been quite a while since the last blog post. Since then 3.36.0 was released, and the first update for the stable 3.36 branch, 3.36.1, has been released as well.
As I've written about before, one of the main features in 3.36 is the support for trip planning for public transit using third party services, as shown here from Paris:


We also support a few other areas for now, such as TriMet in Portland, Sweden (using the Resrobot API), and Switzerland (using the opendata.ch API). A full list is available on a subpage of https://live.gnome.org/Apps/Maps

Another feature implemented by James Westman is that the current location marker no longer flickers when the location is live-updating (e.g. when you have an actual GPS receiver on your device).

James has also implemented the first step towards a responsive UI that can scale to phone and narrow tablet portrait displays. This was finished after the UI change freeze for 3.36, so it will have to wait until 3.38.


Oh, and another thing: in these times when physical traveling is not an option, browsing around in the Maps application is another way to explore. And don't forget to take advantage of the interlinking with Wikipedia from the OpenStreetMap database (Maps will show a link to a Wikipedia article for a place, if available, when you press the “Show more“ three-dots icon in a place info bubble). And if it's missing you can always add it yourself.

Be safe everyone!

March 27, 2020

Meet the GNOMEies: Regina Nkemchor Adejo

In addition to recently repping the GNOME project at Open Source Festival Africa, Regina Nkemchor Adejo is organizing the Pan African GNOME Summit in Port Harcourt, Nigeria.

A photo of Regina, who is smartly dressed in black with a green jacket, standing in a conference center. Behind her is a man with a dog.
Photo courtesy of Regina Nkemchor Adejo

Tell us a little bit more about yourself.

Well, my full name is Regina Nkemchor Adejo, and I am Nigerian. I am a technology enthusiast who transitioned into the sciences from an arts background. I currently work as a database and application specialist in a tax organization. I am a YouTube content creator; I create technical videos related to database and Linux administration.

Most importantly, I love computers! I spend most of my time on them.

What is your role within the GNOME community?

I am a member, currently working as an engagement team volunteer.

Why did you get involved in GNOME?

I am a GNOME user, and GNOME consistently shines for its open source contributions and friendly members and volunteers.

Why are you still involved with GNOME?

It’s an interesting community so many skills to learn around building strong communities and managing projects.

What are you working on right now?

The Pan African GNOME Summit (PAGS)! It is a project I am passionate about: to drive GNOME into the African tech space and introduce people to how they can make open source contributions in GNOME. Although the first event is happening in Nigeria, I hope to expand this into other African countries as well, and hopefully one day we’ll have GUADEC in Africa!

What are you excited about right now – either in GNOME or free and open source software in general?

PAGS, GUADEC, and the Linux App Summit (LAS)!

What is a major challenge you see for the future of GNOME?

I won’t call it a challenge, I will say it is more like a concern about managing more volunteers as GNOME pushes for greater numbers of contributors. There may be a need to have more mentors in the foundation to help guide newcomers.

What do you think GNOME should focus on next?

Africa!

Reducing memory consumption in librsvg, part 4: compact representation for Bézier paths

Let's continue with the enormous SVG from the last time, a map extracted from OpenStreetMap.

According to Massif, peak memory consumption for that file occurs at the following point during the execution of rsvg-convert. I pasted only the part that refers to Bézier paths:

    --------------------------------------------------------------------------------
      n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
    --------------------------------------------------------------------------------
1    33 24,139,598,653    1,416,831,176    1,329,943,212    86,887,964            0
2   ->24.88% (352,523,448B) 0x4A2727E: alloc (alloc.rs:84)
    | ->24.88% (352,523,448B) 0x4A2727E: alloc (alloc.rs:172)
    |   ->24.88% (352,523,448B) 0x4A2727E: allocate_in<rsvg_internals::path_builder::PathCommand,alloc::alloc::Global> (raw_vec.rs:98)
    |     ->24.88% (352,523,448B) 0x4A2727E: with_capacity<rsvg_internals::path_builder::PathCommand> (raw_vec.rs:167)
    |       ->24.88% (352,523,448B) 0x4A2727E: with_capacity<rsvg_internals::path_builder::PathCommand> (vec.rs:358)
    |         ->24.88% (352,523,448B) 0x4A2727E: <alloc::vec::Vec<T> as alloc::vec::SpecExtend<T,I>>::from_iter (vec.rs:1992)
    |           ->24.88% (352,523,448B) 0x49D212C: from_iter<rsvg_internals::path_builder::PathCommand,smallvec::IntoIter<[rsvg_internals::path_builder::PathCommand; 32]>> (vec.rs:1901)
    |             ->24.88% (352,523,448B) 0x49D212C: collect<smallvec::IntoIter<[rsvg_internals::path_builder::PathCommand; 32]>,alloc::vec::Vec<rsvg_internals::path_builder::PathCommand>> (iterator.rs:1493)
    |               ->24.88% (352,523,448B) 0x49D212C: into_vec<[rsvg_internals::path_builder::PathCommand; 32]> (lib.rs:893)
    |                 ->24.88% (352,523,448B) 0x49D212C: smallvec::SmallVec<A>::into_boxed_slice (lib.rs:902)
3   |                   ->24.88% (352,523,016B) 0x4A0394C: into_path (path_builder.rs:320)
    |
4   ->03.60% (50,990,328B) 0x4A242F0: realloc (alloc.rs:128)
    | ->03.60% (50,990,328B) 0x4A242F0: realloc (alloc.rs:187)
    |   ->03.60% (50,990,328B) 0x4A242F0: shrink_to_fit<rsvg_internals::path_builder::PathCommand,alloc::alloc::Global> (raw_vec.rs:633)
    |     ->03.60% (50,990,328B) 0x4A242F0: shrink_to_fit<rsvg_internals::path_builder::PathCommand> (vec.rs:623)
    |       ->03.60% (50,990,328B) 0x4A242F0: alloc::vec::Vec<T>::into_boxed_slice (vec.rs:679)
    |         ->03.60% (50,990,328B) 0x49D2136: smallvec::SmallVec<A>::into_boxed_slice (lib.rs:902)
5   |           ->03.60% (50,990,328B) 0x4A0394C: into_path (path_builder.rs:320)

Line 1 has the totals, and we see that at that point the program uses 1,329,943,212 bytes on the heap.

Lines 3 and 5 give us a hint that into_path is being called; this is the function that converts a temporary/mutable PathBuilder into a permanent/immutable Path.

Lines 2 and 4 indicate that the arrays of PathCommand, which are inside those immutable Paths, use 24.88% + 3.60% = 28.48% of the program's memory; between both they use 352,523,448 + 50,990,328 = 403,513,776 bytes.

That is about 400 MB of PathCommand. Let's see what's going on.

What is in a PathCommand?

A Path is a list of commands similar to PostScript, which get used in SVG to draw Bézier paths. It is a flat array of PathCommand:

pub struct Path {
    path_commands: Box<[PathCommand]>,
}

pub enum PathCommand {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    CurveTo(CubicBezierCurve),
    Arc(EllipticalArc),
    ClosePath,
}

Let's see the variants of PathCommand:

  • MoveTo: 2 double-precision floating-point numbers.
  • LineTo: same.
  • CurveTo: 6 double-precision floating-point numbers.
  • EllipticalArc: 7 double-precision floating-point numbers, plus 2 flags (see below).
  • ClosePath: no extra data.

These variants vary a lot in terms of size, and each element of the Path.path_commands array occupies the maximum of their sizes (i.e. sizeof::<EllipticalArc>).

A more compact representation

Ideally, each command in the array would only occupy as much space as it needs.

We can represent a Path in a different way, as two separate arrays:

  • A very compact array of commands without coordinates.
  • An array with coordinates only.

That is, the following:

pub struct Path {
    commands: Box<[PackedCommand]>,
    coords: Box<[f64]>,
}

The coords array is obvious; it is just a flat array with all the coordinates in the Path in the order in which they appear.

And the commands array?

PackedCommand

We saw above that the biggest variant in PathCommand is Arc(EllipticalArc). Let's look inside it:

pub struct EllipticalArc {
    pub r: (f64, f64),
    pub x_axis_rotation: f64,
    pub large_arc: LargeArc,
    pub sweep: Sweep,
    pub from: (f64, f64),
    pub to: (f64, f64),
}

There are 7 f64 floating-point numbers there. The other two fields, large_arc and sweep, are effectively booleans (they are just enums with two variants, with pretty names instead of just true and false).

Thus, we have 7 doubles and two flags. Between the two flags there are 4 possibilities.

Since no other PathCommand variant has flags, we can have the following enum, which fits in a single byte:

#[repr(u8)]
enum PackedCommand {
    MoveTo,
    LineTo,
    CurveTo,
    ArcSmallNegative,
    ArcSmallPositive,
    ArcLargeNegative,
    ArcLargePositive,
    ClosePath,
}

That is, simple values for MoveTo/etc. and four special values for the different types of Arc.

Packing a PathCommand into a PackedCommand

In order to pack the array of PathCommand, we must first know how many coordinates each of its variants will produce:

impl PathCommand {
    fn num_coordinates(&self) -> usize {
        match *self {
            PathCommand::MoveTo(..) => 2,
            PathCommand::LineTo(..) => 2,
            PathCommand::CurveTo(_) => 6,
            PathCommand::Arc(_) => 7,
            PathCommand::ClosePath => 0,
        }
    }
}

Then, we need to convert each PathCommand into a PackedCommand and write its coordinates into an array:

impl PathCommand {
    fn to_packed(&self, coords: &mut [f64]) -> PackedCommand {
        match *self {
            PathCommand::MoveTo(x, y) => {
                coords[0] = x;
                coords[1] = y;
                PackedCommand::MoveTo
            }

            // etc. for the other simple commands

            PathCommand::Arc(ref a) => a.to_packed_and_coords(coords),
        }
    }
}

Let's look at that to_packed_and_coords more closely:

impl EllipticalArc {
    fn to_packed_and_coords(&self, coords: &mut [f64]) -> PackedCommand {
        coords[0] = self.r.0;
        coords[1] = self.r.1;
        coords[2] = self.x_axis_rotation;
        coords[3] = self.from.0;
        coords[4] = self.from.1;
        coords[5] = self.to.0;
        coords[6] = self.to.1;

        match (self.large_arc, self.sweep) {
            (LargeArc(false), Sweep::Negative) => PackedCommand::ArcSmallNegative,
            (LargeArc(false), Sweep::Positive) => PackedCommand::ArcSmallPositive,
            (LargeArc(true), Sweep::Negative) => PackedCommand::ArcLargeNegative,
            (LargeArc(true), Sweep::Positive) => PackedCommand::ArcLargePositive,
        }
    }
}

Creating the compact Path

Let's look at PathBuilder::into_path line by line:

impl PathBuilder {
    pub fn into_path(self) -> Path {
        let num_commands = self.path_commands.len();
        let num_coords = self
            .path_commands
            .iter()
            .fold(0, |acc, cmd| acc + cmd.num_coordinates());

First we compute the total number of coordinates using fold; we ask each command cmd its num_coordinates() and add it into the acc accumulator.

Now we know how much memory to allocate:

        let mut packed_commands = Vec::with_capacity(num_commands);
        let mut coords = vec![0.0; num_coords];

We use Vec::with_capacity to allocate exactly as much memory as we will need for the packed_commands; adding elements will not need a realloc(), since we already know how many elements we will have.

We use the vec! macro to create an array of 0.0 repeated num_coords times; that macro uses with_capacity internally. That is the array we will use to store the coordinates for all the commands.

        let mut coords_slice = coords.as_mut_slice();

We get a mutable slice out of the whole array of coordinates.

        for c in self.path_commands {
            let n = c.num_coordinates();
            packed_commands.push(c.to_packed(coords_slice.get_mut(0..n).unwrap()));
            coords_slice = &mut coords_slice[n..];
        }

For each command, we see how many coordinates it will generate and we put that number in n. We get a mutable sub-slice from coords_slice with only that number of elements, and pass it to to_packed for each command.

At the end of each iteration we move the mutable slice to where the next command's coordinates will go.

        Path {
            commands: packed_commands.into_boxed_slice(),
            coords: coords.into_boxed_slice(),
        }
    }

At the end, we create the final and immutable Path by converting each array into_boxed_slice like the last time. That way each of the two arrays, the one with PackedCommands and the one with coordinates, occupy the minimum space they need.

An iterator for Path

This is all very well, but we also want it to be easy to iterate on that compact representation; the PathCommand enums from the beginning are very convenient to use and that's what the rest of the code already uses. Let's make an iterator that unpacks what is inside a Path and produces a PathCommand for each element.

pub struct PathIter<'a> {
    commands: slice::Iter<'a, PackedCommand>,
    coords: &'a [f64],
}

We need an iterator over the array of PackedCommand so we can visit each command. However, to get elements of coords, I am going to use a slice of f64 instead of an iterator.

Let's look at the implementation of the iterator:

impl<'a> Iterator for PathIter<'a> {
    type Item = PathCommand;

    fn next(&mut self) -> Option<Self::Item> {
        if let Some(cmd) = self.commands.next() {
            let cmd = PathCommand::from_packed(cmd, self.coords);
            let num_coords = cmd.num_coordinates();
            self.coords = &self.coords[num_coords..];
            Some(cmd)
        } else {
            None
        }
    }
}

Since we want each iteration to produce a PathCommand, we declare it as having the associated type Item =  PathCommand.

If the self.commands iterator has another element, it means there is another PackedCommand available.

We call PathCommand::from_packed with the self.coords slice to unpack a command and its coordinates. We see how many coordinates the command consumed and re-slice self.coords according to that number of coordinates, so that it now points to the coordinates for the next command.

We return Some(cmd) if there was an element, or None if the iterator is empty.

The implementation of from_packed is obvious and I'll just paste a bit from it:

impl PathCommand {
    fn from_packed(packed: &PackedCommand, coords: &[f64]) -> PathCommand {
        match *packed {
            PackedCommand::MoveTo => {
                let x = coords[0];
                let y = coords[1];
                PathCommand::MoveTo(x, y)
            }

            // etc. for the other variants in PackedCommand

            PackedCommand::ArcSmallNegative => PathCommand::Arc(EllipticalArc::from_coords(
                LargeArc(false),
                Sweep::Negative,
                coords,
            )),

            PackedCommand::ArcSmallPositive => // etc.

            PackedCommand::ArcLargeNegative => // etc.

            PackedCommand::ArcLargePositive => // etc.
        }
    }
}

Results

Before the changes (this is the same Massif heading as above):

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 33 24,139,598,653    1,416,831,176    1,329,943,212    86,887,964            0
                                       ^^^^^^^^^^^^^
                                           boo

After:

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 28 26,611,886,993    1,093,747,888    1,023,147,907    70,599,981            0
                                       ^^^^^^^^^^^^^
                                          oh yeah

We went from using 1,329,943,212 bytes down to 1,023,147,907 bytes, that is, we knocked it down by 300 MB.

However, that is for the whole program. Above we saw that Path data occupies 403,513,776 bytes; how about now?

->07.45% (81,525,328B) 0x4A34C6F: alloc (alloc.rs:84)
| ->07.45% (81,525,328B) 0x4A34C6F: alloc (alloc.rs:172)
|   ->07.45% (81,525,328B) 0x4A34C6F: allocate_in<f64,alloc::alloc::Global> (raw_vec.rs:98)
|     ->07.45% (81,525,328B) 0x4A34C6F: with_capacity<f64> (raw_vec.rs:167)
|       ->07.45% (81,525,328B) 0x4A34C6F: with_capacity<f64> (vec.rs:358)
|         ->07.45% (81,525,328B) 0x4A34C6F: rsvg_internals::path_builder::PathBuilder::into_path (path_builder.rs:486)

Perfect. We went from occupying 403,513,776 bytes to just 81,525,328 bytes. Instead of Path data amounting to 28.48% of the heap, it is just 7.45%.

I think we can stop worrying about Path data for now. I like how this turned out without having to use unsafe.

March 26, 2020

It's not what programming languages do, it's what they shepherd you to

How many of you have listened, read or taken part in a discussion about programming languages that goes like the following:

Person A: "Programming language X is bad, code written in it is unreadable and horrible."

Person B: "No it's not. You can write good code in X, you just have to be disciplined."

Person A: "It does not work, if you look at existing code it is all awful."

Person B: "No! Wrong! Those are just people doing it badly. You can write readable code just fine."

After this the discussion repeats from the beginning until either one gets fed up and just leaves.

I'm guessing more than 99% of you readers have seen this, often multiple times. The sad part of this is that even though this thing happens all the time, nobody learns anything and the discussion begins anew all the time. Let's see if we can do something about this. A good way to go about it is to try to come up with a name and a description for the underlying issue.
shepherding: An invisible property of a programming language and its ecosystem that drives people into solving problems in ways that are natural for the programming language itself rather than in ways that are considered "better" in some sense. These may include things like long-term maintainability, readability and performance.

This is a bit abstract, so let's look at some examples.

Perl shepherds you into using regexps

Perl has several XML parsers available and they are presumably good at their jobs (I have never actually used one so I wouldn't know). Yet, in practice, many Perl scripts do XML (and HTML) manipulation with regexes, which is brittle and "wrong" for lack of a better term. This is a clear case of shepherding. Text manipulation in Perl is easy. Importing, calling and using an XML parser is not. And really all you need to do is to change that one string to a different string. It's tempting. It works. Surely it could not fail. Let's just do it and get on with other stuff. Boom, just like that you have been shepherded.

Note that there is nothing about Perl that forces you to do this. It provides all the tools needed to do the right thing. And yet people don't, because they are being shepherded (unconsciously) into doing the thing that is easy and fast in Perl.

Make shepherds you into embedding shell pipelines in Makefiles

Compiling code with Make is tolerable, but it fails quite badly when you need to generate source code, data files and the like. The sustainable solution would be to write a standalone program in a proper scripting language that has all the code logic needed and call that from Make with your inputs and outputs. This rarely happens. Instead people think "I know, I have an entire Unix userland available [1], I can just string together random text mangling tools in a pipeline, write it here and be done". This is how unmaintainability is born.

Nothing about Make forces people to behave like this. Make shepherds people into doing this. It is the easy, natural outcome when faced with the given problem.

Other examples

  • C shepherds you into manipulating data via pointers rather than value objects.
  • C++ shepherds you into providing dependencies as header-only libraries.
  • Java does not shepherd you into using classes and objects, it pretty much mandates them.
  • Turing complete configuration languages shepherd you into writing complex logic with them, even though they are usually not particularly good programming environments.
[1] Which you don't have on Windows. Not to mention that every Unix has slightly different command line arguments and semantics for basic commands meaning shell pipelines are not actually portable.

March 25, 2020

2020-03-25 Wednesday.

  • Mail chew, S&M call, partner call.
  • Really pleased to see Collabora Office 4.2.1 for Android out - with a lot of rather important fixes, performance improvements, UI prettifications and much more.

There is No “Linux” Platform (Part 2)

This is Part 2 of a series on what’s wrong with the free desktop app ecosystem and how we can fix it, based on the talk Jordan Petridis and I gave at LAS 2019 in Barcelona.

In Part 1 we looked at all the different elements making up a platform, and found that there is only one “complete” platform in the free software desktop world at the moment. This is because desktops control the developer platforms, while packaging and system integration is managed by separate communities, the distributions, for historical reasons. This additional layer of middlemen is a key reason why we don’t have real platforms.

Power to the Makers

The problems outlined in Part 1 are of course not new, and people have been working on solutions to them for a long time. Some of these solutions have really started to come together over the last few years, empowering the people making the software to distribute it directly to the people using it.

Thanks to the work of many amazing people in our community you can now develop an app in GNOME Builder, submit it to Flathub, get it reviewed, and have it available for people to install right away. Once it’s on there you can also update it on a schedule you control. No more waiting 6 months for the next distribution release!

Thanks to GNOME Builder’s Flatpak integration, “works on my machine” is largely a thing of the past now!

But though this is all very awesome, Flatpak is unfortunately not a complete solution to the platform conundrum discussed earlier in this series.

Flatpak is Not Enough

Flatpak does solve a number of the issues around app distribution very elegantly, because app developers do their own packaging, and control their release schedule. It’s also a unified package format that works across different host systems, and the Flatpak runtimes are clearly defined development targets to do QA against.

But that doesn’t magically fix all our problems. The two elephants in the room are

  1. The Host still matters: Flatpak only solves part of the issues with distro packaged apps
  2. Downstream Drama: Flatpak does not address the conflicts between desktops and distributions

1. The Host Still Matters

Even with Flatpak there are still some unpredictable variables on the host system which affect app developers. On the technical side a number of things can go wrong, from an outdated Flatpak version (which can mean some portals that apps rely on may be missing), to missing/incompatible system APIs such as password storage, calendar, or address book.

These things can lead to applications not working properly, or at all. For example, this is why new versions of GNOME Contacts cannot access any contacts on Debian 10, why recent GNOME Calendar cannot access any calendars on Ubuntu 18.04, or why Fractal doesn’t remember your password across restarts on some non-GNOME environments.

There are also user-facing integration points where applications interface with the system. These include things like notifications, the application menu, search providers, the old systray, and the design patterns used in individual apps.

For example, when the system UI or design guidelines change, applications follow the platform and change their UI accordingly. This means if you install newer apps on an older system, there are going to be weird edge cases. For example, if you install new apps on Debian 10 you get a confusing mix of the old and new application menu paradigms because the design guidelines were changed with GNOME 3.32 (early 2019).

Before GNOME 3.32 applications had global menu items in the application menu in the Shell top bar, but now they are in the primary menu, inside the app window.

Flatpak also applies the host GTK stylesheet and icon set to apps. This means that if the host distribution overrides the system stylesheet, Flatpak will happily apply random, never-tested CSS to every app. Obviously this leads to lots of issues, ranging from ugly but relatively harmless glitches to real usability issues, such as illegible text on buttons. For more background on this particular issue, see this blog post.

Some of these issues could be fixed with more standardization, changes to Flatpak, or new portals. However, fundamentally, in order to be a real platform you need a clearly defined environment to develop and test for. Flatpak alone is not enough to achieve that.

Just like “write once, run everywhere” is always an illusion, it’s never going to be possible to completely split apps from the OS. You always need app developers to do some extra work to support different environments, and currently every distribution represents yet another extra environment to support.

2. Downstream Drama

Flatpak does not completely solve the issues app developers face in shipping their software, because these can not be isolated from the ones desktop developers face. In order to fix the app developer story we need real platforms. In order to get those we need to resolve the desktop/distribution dilemma.

The issues here roughly match the ones with traditional distribution packaging mentioned in Part 1, and can be grouped into three broad categories:

  • Structural issues inherent to having distributions and desktops be separate projects.
  • Fragmentation issues because we have multiple of everything so there’s duplication and/or bad abstraction layers.
  • Configuration issues, primarily around settings and other defaults, which have to be set at the distribution level but affect the user experience.

Structural Issues

One of the biggest structural issues is distribution release schedules not being aligned with the upstream one (or between different distributions). GNOME releases every 6 months, but distributions can take anywhere from a few weeks to several years to ship these releases.

This category also includes distributions overriding upstream decisions around system UX, as well as theming/branding issues, due to problematic downstream incentives. This means there is no clear platform visual identity developers can target.

For example, Ubuntu 18.04 (the current LTS) ships with GNOME 3.28 (from March 2018), includes significant changes to system UX and APIs (e.g. Unity-style dock, desktop icons, systray extension), and ships a branded stylesheet that breaks even in core applications.

Ubuntu 18.04 overrides the GTK system stylesheet, which results in the “Create” button on the new folder dialog in Files being invisible (among many many other issues, especially in third party apps).

Fragmentation Issues

Having multiple implementations of everything means we either need to do tons of duplicate work, or try to abstract over the different implementations.

On one end of the spectrum there are OS installers: There is no GNOME installer, so every distribution builds their own. Unfortunately, most of these installers are not very good, and don’t integrate well with the rest of the desktop experience (e.g. they use different design patterns than the OS itself). This can be either due to a lack of resources (e.g. not every downstream has their own GNOME designers), or because different distributions have specific downstream goals and motivations (e.g. Fedora and RHEL share an installer, which introduces lots of complexity).

The famously awkward Fedora installer is a good example of why such core parts of the experience should be designed and developed upstream. Unfortunately this isn’t really feasible due to distribution fragmentation.

In other areas we have the opposite problem, because we’re trying to abstract over the fragmentation with a single component. For example, PackageKit is meant to abstract over different package formats, but in practice it only works for a handful of them, and even for those it’s often buggy. The PackageKit maintainers have officially given up on this approach.

Configuration Issues

This includes the default apps, the fonts shipped with the system by default, the terminal shell and prompt, and the UX around things like Plymouth. All of these things are usually configured at the distribution level and are therefore often not great, because these choices need to be made in concert with the rest of the platform UX.

Forging Platforms

Given the constraint of there being multiple different desktops projects and technology stacks (and the host still mattering), we’ll never have a single “Linux” or “FreeDesktop” platform. We could have one platform per desktop though.

From an app developer point of view, testing for GNOME, KDE, and elementary isn’t as nice as testing only for a single platform, but it’s not impossible. However, testing for Debian, Fedora, multiple Ubuntu releases, OpenSUSE, Arch, Endless, and dozens more is not and never will be feasible, even with Flatpak. Multiple different distributions, even ones that ship the same desktop environment, don’t add up to a platform. But exactly that is what we need, one way or another.

The question is, how do we get there?

The Nuclear Option

When we look at it from a Flatpak context, the solution seems obvious. Flatpak is solving the middleman problem for app developers by circumventing the distributions and providing a direct channel between developers and end users. What if we could do the same thing for the OS itself?

Of course the situation isn’t exactly the same, so what would that mean in practice?

With Flatpak runtimes there is no extra “distribution” abstraction layer. There are no Debian or Fedora runtimes, just GNOME and KDE, because those are the technology stacks app developers target.

These runtimes are already more or less full-fledged distributions which are controlled by the desktops, we’re just not using them as such. The Freedesktop SDK (which most runtimes are based on) is not based on any distro, but built directly from upstream sources using BuildStream as the build tool, and it already has most of the things you need to make a basic operating system.

There is an early-stage effort to make bootable nightly GNOME OS images for development/testing, built on top of the Freedesktop SDK. From there it wouldn’t be a huge leap to actually make an independent, consumer-facing platform OS for GNOME (and KDE, and other platforms).

However, though this is likely to become a very attractive solution in the future, there are a number of hurdles to be overcome:

  • An OS needs an installer, OS updates, a Plymouth theme, etc. All of these are being worked on for the nightly GNOME OS images, but are not quite there yet.
  • A “real” OS needs a dedicated group of people doing things like release management, security tracking, and QA. These are being done to some degree for the Flatpak runtimes, but a consumer OS would need more manpower.
  • It’s an OSTree-based immutable system, which means there is no traditional package management. Apps are installed via Flatpak, and server/developer workflows need to happen in containers. Though projects like Silverblue’s toolbox have come a long way over the past few years, there’s still work to be done before immutable OSes can painlessly replace systems with old-school package managers for all use cases.

It takes time to start a new operating system from scratch, especially when it’s using cutting-edge technology. So while things like GNOME OS could be amazing in the longer-term future, it’s likely going to take a few more years before this becomes a viable alternative.

Squaring the Circle

What could we do within the constraints of the technology, ecosystem, and communities we have today, then? If we can’t go around distributions with a platform OS, the only alternative is to meld the distributions into a meta platform OS.

Technically there’s nothing stopping a group of separate distributions from acting more or less like a unified platform OS together. It would require extraordinary discipline and compromise on all sides (admittedly not things our communities are usually known for), but given how important it is that we fix this problem, it’s at least worth thinking about.

To get an idea what this could look like in practice, let’s think through some of the specific issues mentioned earlier:

Release Schedule: This is probably among the thorniest issues since release cycles vary wildly in length and structure, and changing them is very difficult. It’s not unimaginable that at least some progress could be made here though. For example, GNOME could have long term support releases every 2-3 years for “stable” distributions like RHEL and Ubuntu LTS. Distributions could then agree to either be on the regular 6 month schedule, or the 2 year “LTS” schedule. Alternatively, all distributions could find a single compromise schedule that can work for everyone (e.g. maybe one release per year, like mobile operating systems do).

Theming/Branding: Some distributions want ways to customize the OS experience such that their system looks recognizably different from others. This is not necessarily a problem, as long as this is done using APIs that are supported and intended to be used in this way (which unfortunately is currently not happening in many cases).

Creating more branding opportunities that do not break the APIs apps rely upon (especially third-party apps shipped via Flatpak) is certainly possible, and there have been discussions in this direction (e.g. GTK accent colors). Whether distributions would limit themselves to these APIs once they exist is of course an open question, but at least there is an ongoing dialog about this.

System UX/API Changes: Some distributions make significant changes to the core system, which fragments the visual identity of the platform at best, and severely damages the app ecosystem at worst. This includes things like adding a permanent dock, icons on the desktop, re-enabling the systray, or a “dark mode” setting which just changes the system stylesheet from under apps.

The solution here is simple in theory: If you think a change to the system UX is needed to fix a specific problem, don’t just patch it downstream, but instead help to address the actual underlying issue (We already touched on this in Part 1). For example, if you find that new users are confused by the empty desktop at startup, don’t just ship an extension that completely breaks the structure of the shell. Bring the problem to the upstream designers and developers, figure out a solution together, and help implement it upstream.

In practice it’s not always that easy, but a lot can be done by simply adopting an upstream-first UX mindset. It can take a while to get used to, especially for companies with more, uh, “traditional” internal processes, but it’s definitely possible seeing as it’s working well for Red Hat and Purism, for example.

OS Installer: It may not be doable to have a single code base, but we could definitely share at least the design (and possibly some UI code) for the installers used by different distributions. A cross-distribution initiative for nice, native GNOME installers across the major distributions would probably not be easy logistically, but it’s not unimaginable.

Software Installation & Updates: GNOME Software and PackageKit’s “abstract across distros” strategy has clearly failed, and we need a new approach here. For applications there is a relatively easy solution: Distributions stop packaging apps, and work together on a common repository of developer-submitted Flatpaks (e.g. something like Flathub). We’d need to work out how this common solution can accommodate various distribution policies around e.g. proprietary software, but this seems very doable and most of it already exists in Flathub.

The resources currently going into repackaging every app for every distribution could be pooled to review the apps submitted by developers to the common Flatpak repository.

Seeing as most distributions are not (yet) image-based like e.g. Silverblue or Endless, we would still also need a way to update the packages that make up the core system. For this there’s probably no way around backend duplication.

System Default Configuration: Making progress in this area is likely not too difficult comparatively. The main thing we’d need is better coordination between the various parties so that these defaults can actually be synchronized (which is of course easier said than done). Having some kind of common forum where the upstream design and release teams, as well as the people in charge of major distributions, can discuss and standardize defaults across the entire ecosystem might work for that.

The Bottom Line

If we want a future with real platforms we can either go around the distributions or have them all work together (or potentially both), but one way or another we need to vertically integrate.

Neither path is straightforward or easy, and there’s a huge amount of work ahead either way. However, the first and most important step is acknowledging that this problem exists, and that we need to radically change our approach if we’re serious about building attractive app ecosystems.

The good news is that many people across different projects are already working towards enabling this future. We hope that you’ll join us.

Happy hacking :)

2020: the fecal matter is colliding with the rotary oscillator

Many friends of mine, including a significant portion of GNOME contributors, are in the United States, and I’m personally worried they (or those around them) will face particularly deep trouble this year and beyond. It seems nobody dares talk openly about it, so what the heck, I’m sharing my concern here and getting it off my chest (then, after worrying about death, I can move on to worrying about taxes). Maybe I’ll be able to sleep a bit better.

As you most probably know, Europe is taking a serious beating and is struggling as we speak… but if you thought the US would fare any better, just wait. Shiitake is about to hit the fan, and the case of the United States of America is particularly concerning for the many reasons I extensively documented here a couple of days ago. Not only is the US’s preparation for this pandemic very much insufficient, and not only does it lack a true safety net for its citizens, but it also has unique societal factors that, compared to all the other countries in the world, put it at risk of suffering extremely deep social disruption and pervasive hardship.

I wish you the best of luck in the fight against the SARS-coronavirus-2, just as I am wishing good luck to the rest of the world. I hope I will be incredibly wrong (so far the trends seem to be confirming my predictions, however) and that some unforeseen radical solutions will turn the tide, but I’m not holding my breath here. The US needs more than band-aid quick-fixes.

Let’s hope that this time, the sheer scale of the problem will bring about real positive change in the system. Not just a bigger economic bubble at the expense of the people and planet. It would be about time.


24 Mar 2020

Yak Shaving - Swift Edition

At the TensorFlow summit last year, I caught up with Chris Lattner who was at the time working on Swift for TensorFlow - we ended up talking about concurrency and what he had in mind for Swift.

I recognized some of the actor ideas to be similar to those from the Pony language, which I had learned about just a year before on a trip to Microsoft Research in the UK. Of course, I pointed out that Pony had some capabilities that languages like C# and Swift lacked: in those languages anyone could poke at data that did not belong to them without doing too much work, and the whole thing would fall apart.

For example, if you build something like this in C#:

class Chart {
  float [] points;
  public float [] Points { get { return points; } }
}

Then anyone with a reference to Chart can go and poke at the internals of the points array that you have surfaced. For example, this simple Plot implementation accidentally modifies the contents:

void Plot (Chart myChart)
{
   // This code accidentally modifies the data in myChart
   var p = myChart.Points;
   for (int i = 0; i < p.Length; i++) {
       Plot (0, p [i]++);   // assumes a Plot (x, y) overload elsewhere
   }
}

This sort of problem is avoidable, but comes at a considerable development cost. For instance, in .NET you can find plenty of ad-hoc collections and interfaces whose sole purpose is to prevent data tampering/corruption. If those are consistently and properly used, they can prevent the above scenario from happening.

This is where Chris politely pointed out to me that I had not quite understood Swift - in fact, Swift supports a copy-on-write model for its collections out of the box - meaning that the above problem is just not present in Swift as I had wrongly assumed.
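
To make that concrete, here is a tiny sketch of the value semantics Chris was describing (my own illustration, not code from the post; the Chart struct and plot function are hypothetical):

struct Chart {
    var points: [Float] = []
}

func plot(_ chart: Chart) {
    // Arrays are value types: this is logically a copy of the caller's data.
    var p = chart.points
    for i in p.indices {
        p[i] += 1   // copy-on-write happens here; the caller's array is untouched
    }
    print(p)
}

let chart = Chart(points: [1, 2, 3])
plot(chart)
print(chart.points)   // still [1.0, 2.0, 3.0]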

It is interesting that I had read the Swift specification some three or four times, and I was collaborating with Steve on our Swift-to-.NET binding tool, and yet I had completely missed the significance of this design decision in Swift.

This subtle design decision was eye opening.

It was then that I decided to gain some real hands-on experience in Swift. And what better way to learn Swift than to start with a small, fun project for a couple of evenings.

Rather than building a mobile app, which would have been 90% mobile design and user interaction, and little Swift, I decided to port my gui.cs console UI toolkit from C# to Swift and called it TermKit.

Both gui.cs and TermKit borrow extensively from Apple’s UIKit design - it is a design that I have enjoyed. It notably avoids auto layout, and instead uses a simpler layout system that I quite love and had a lot of fun implementing (You can read a description of how to use it in the C# version).

This journey was filled with a number of very pleasant capabilities in Swift that helped me find some long-term bugs in my C# libraries. I remain firmly a fan of compiled languages, and the more checking, the better.

Dear reader, I wish I had kept a log of those so I could share them all with you, but that is now code I wrote a year ago and I did not take copious notes. Suffice it to say that I ended up with a warm and cozy feeling - knowing that the compiler was looking out for me.

There is plenty to love about Swift technically, and I will not enumerate all of those features; other people have done that. But I want to point out a few interesting bits that I had missed because I was not a practitioner of the language, and was more of an armchair observer.

The requirement that constructors fully initialize all the fields in a type before calling the base constructor took me a while to digest. My mental model was that calling the superclass to initialize itself should be done before any of my own values are set - this is what C# does. Yet the Swift approach prevents a bug where the base constructor calls a virtual method that you override, and your override might not be ready to handle it. So eventually I just learned to embrace and love this capability.
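
Here is a minimal sketch of that rule in action (hypothetical types, nothing from TermKit): because the subclass must assign its stored properties before calling super.init(), even a virtual call made from the base initializer lands in an override that sees fully initialized state:

class Widget {
    init() {
        configure()   // dynamically dispatches to the subclass override
    }
    func configure() {}
}

class Slider: Widget {
    var minimum: Double
    init(minimum: Double) {
        self.minimum = minimum   // must happen before super.init()
        super.init()
    }
    override func configure() {
        print("minimum is \(minimum)")   // safe: minimum is already set
    }
}

_ = Slider(minimum: 0)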

Another thing that I truly enjoyed was the ability to create a typealias, which once defined is visible as a new type. A capability that I have wanted in C# since 2001 and have yet to get.

I have a love/hate relationship with Swift protocols and extensions. I love them because they are incredibly powerful, and I hate them because it has been so hard to surface them to .NET - but in practice they are a pleasure to use.

What won my heart is just how simple it is to import C code into Swift: bring in the type definitions from a header file, and call into the C code transparently from Swift. This really is a gift of the gods to humankind.

I truly enjoyed having the Character data type in Swift which allowed my console UI toolkit to correctly support Unicode on the console for modern terminals.

Even gui.cs with my port of Go’s Unicode libraries to C# suffers from being limited to Go-style Runes and not having support for emoji (or as the nerd-o-sphere calls it “extended grapheme clusters”).
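
As a trivial illustration of my own: a single Character in Swift can be a whole extended grapheme cluster, even one built from several Unicode scalars:

let family: Character = "👨‍👩‍👧‍👦"              // one Character, one grapheme cluster
print(String(family).count)                // 1
print(String(family).unicodeScalars.count) // several scalars joined by zero-width joiners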

Beyond the pedestrian controls like buttons, entry lines and checkboxes, there are two useful controls that I wanted to develop. An xterm terminal emulator, and a multi-line text editor.

In the C# version of my console toolkit my multi-line text editor was a quick hack. A List<T> holds all the lines in the buffer, and each line contains the runes to display. Inserting characters is easy, inserting lines is easy, and you can get this done in a couple of hours in the evening (which is the sort of time I can devote to these fun explorations). Of course, the problem is cutting regions of text across lines, and inserting text that spans multiple lines. What looked like a brilliant coup of simple design turns out to be ugly, repetitive and error-prone code that takes forever to debug - I did not enjoy writing that code in the end.

For my Swift port, I decided that I needed something better. Of course, in the era of web scale, you gotta have a web scale data structure. I was about to implement a Swift version of the Rope data structure when someone pointed me to a blog post from the Visual Studio Code team titled “Text Buffer Reimplementation”. I read it avidly, found their arguments convincing, and in the end figured that if it is good enough for Visual Studio Code, it should be good enough for the gander.

During my vacation last summer, I decided to port the TypeScript implementation of the Text Buffer to Swift, and named it TextBufferKit. Once again, porting this code from TypeScript to Swift turned out to be a great learning experience for me.

By the time I was done with this and was ready to hook it up to TermKit, I got busy, and also started to learn SwiftUI, and started to doubt whether it made sense to continue work on a UIKit-based model, or if I should restart and do a SwiftUI version. So while I pondered this decision, I did what every other respected yak shaver would do, I proceeded to my xterm terminal emulator work.

Since about 2009 or so, I have wanted a reusable terminal emulator control for .NET. In particular, I wanted one to embed into MonoDevelop, so a year or two ago I looked for a terminal emulator that I could port to .NET - I needed something that was licensed under the MIT license, so it could be used in a wide range of situations, and was modern enough. After surveying the space, I found that “xterm.js” fit the bill, so I ported it to .NET and modified it to suit my requirements. The result was XtermSharp - a terminal emulator engine that can have multiple UIs and hook up multiple backends.

For Swift, I took the XtermSharp code, and ported it over to Swift, and ended up with SwiftTerm. It is now in quite a decent shape, with only a few bugs left.

I have yet to build a TermKit UI for SwiftTerm, but in my quest for the perfect shaved yak, now I need to figure out if I should implement SwiftUI on top of TermKit, or if I should repurpose TermKit completely from the ground up to be SwiftUI driven.

Stay tuned!

March 24, 2020

Reducing memory consumption in librsvg, part 3: slack space in Bézier paths

We got a bug with a gigantic SVG of a map extracted from OpenStreetMap, and it has about 600,000 elements. Most of them are <path>, that is, specifications for Bézier paths.

A <path> can look like this:

<path d="m 2239.05,1890.28 5.3,-1.81"/>

The d attribute contains a list of commands to create a Bézier path, very similar to PostScript's operators. Librsvg has the following to represent those commands:

pub enum PathCommand {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    CurveTo(CubicBezierCurve),
    Arc(EllipticalArc),
    ClosePath,
}

Those commands get stored in an array, a Vec inside a PathBuilder:

pub struct PathBuilder {
    path_commands: Vec<PathCommand>,
}

Librsvg translates each of the commands inside a <path d="..."/> into a PathCommand and pushes it into the Vec in the PathBuilder. When it is done parsing the attribute, the PathBuilder remains as the final version of the path.

To let a Vec grow efficiently as items are pushed into it, Rust makes the Vec grow by powers of 2. When we add an item, if the capacity of the Vec is full, its buffer gets realloc()ed to twice its capacity. That way there are only O(log₂n) calls to realloc(), where n is the total number of items in the array.

However, this means that once we are done adding items to the Vec, there may still be some free space in it: the capacity exceeds the length of the array. The invariant is that vec.capacity() >= vec.len().

First I wanted to shrink the PathBuilders so that they have no extra capacity in the end.

First step: convert to Box<[T]>

A "boxed slice" is a contiguous array in the heap, that cannot grow or shrink. That is, it has no extra capacity, only a length.

Vec has a method into_boxed_slice which does exactly that: it consumes the vector and converts it into a boxed slice without extra capacity. In its innards, it does a realloc() on the Vec's buffer to match its length.
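
As a standalone illustration of both points (a toy Vec of integers, not librsvg code):

fn main() {
    let mut v: Vec<i32> = Vec::new();
    for i in 0..33 {
        v.push(i);
    }
    // The capacity grows in powers of two, so it now exceeds the length.
    assert_eq!(v.len(), 33);
    assert!(v.capacity() >= v.len()); // typically 64 at this point

    // into_boxed_slice() drops the spare capacity; a boxed slice has only a length.
    let b: Box<[i32]> = v.into_boxed_slice();
    assert_eq!(b.len(), 33);
}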

Let's see the numbers that Massif reports:

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 23 22,751,613,855    1,560,916,408    1,493,746,540    67,169,868            0
                                       ^^^^^^^^^^^^^
                                           before

 30 22,796,106,012    1,553,581,072    1,329,943,324   223,637,748            0
                                       ^^^^^^^^^^^^^
                                           after

That is, we went from using 1,493,746,540 bytes on the heap to using 1,329,943,324 bytes. Simply removing extra capacity from the path commands saves about 159 MB for this particular file.

Second step: make the allocator do less work

However, the extra-heap column in that table has a number I don't like: there are 223,637,748 bytes in malloc() metadata and unused space in the heap.

I suppose that so many calls to realloc() make the heap a bit fragmented.

It would be good to be able to read most of the <path d="..."/> data into temporary buffers that don't need so many calls to realloc(), and that in the end get copied to exact-sized buffers, without extra capacity.

We can do just that with the smallvec crate. A SmallVec has the same API as Vec, but it can store small arrays directly in the stack, without an extra heap allocation. Once the capacity is full, the stack buffer "spills" into a heap buffer automatically.

Most of the d attributes in the huge file in the bug have fewer than 32 commands. That is, if we use the following:

pub struct PathBuilder {
    path_commands: SmallVec<[PathCommand; 32]>,
}

We are saying that there can be up to 32 items in the SmallVec without causing a heap allocation; once that is exceeded, it will work like a normal Vec.
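
Here is a quick sketch of that spilling behavior, with a small inline capacity of 4 instead of 32 just for illustration (assumes the smallvec crate as a dependency):

use smallvec::SmallVec;

fn main() {
    let mut v: SmallVec<[u32; 4]> = SmallVec::new();
    for i in 0..4 {
        v.push(i); // these four items live inline, with no heap allocation
    }
    assert!(!v.spilled());

    v.push(4); // the fifth item spills the contents into a heap buffer
    assert!(v.spilled());
}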

At the end we still do into_boxed_slice to turn it into an independent heap allocation with an exact size.

This reduces the extra-heap quite a bit:

--------------------------------------------------------------------------------
  n        time(i)         total(B)   useful-heap(B) extra-heap(B)    stacks(B)
--------------------------------------------------------------------------------
 33 24,139,598,653    1,416,831,176    1,329,943,212    86,887,964            0
                                                        ^^^^^^^^^^

Also, the total bytes shrink from 1,553,581,072 to 1,416,831,176 — we have a smaller heap because there is not so much work for the allocator, and there are a lot fewer temporary blocks when parsing the d attributes.

Making the code prettier

I put in the following:

/// This one is mutable
pub struct PathBuilder {
    path_commands: SmallVec<[PathCommand; 32]>,
}

/// This one is immutable
pub struct Path {
    path_commands: Box<[PathCommand]>,
}

impl PathBuilder {
    /// Consumes the PathBuilder and converts it into an immutable Path
    pub fn into_path(self) -> Path {
        Path {
            path_commands: self.path_commands.into_boxed_slice(),
        }
    }
}

With that, PathBuilder is just a temporary struct that turns into an immutable Path once we are done feeding it. Path contains a boxed slice of the exact size, without any extra capacity.

Next steps

All the coordinates in librsvg are stored as f64, double-precision floating point numbers. The SVG/CSS spec says that single-precision floats are enough, and that 64-bit floats should be used only for geometric transformations.

I'm a bit scared to make that change; I'll have to look closely at the results of the test suite to see if rendered files change very much. I suppose even big maps require only as much precision as f32 — after all, that is what OpenStreetMap uses.

2020-03-24 Tuesday.

  • Mail, call with Mike, some hacking; tech team call. Dug into unit tests, pwrt. parallelizing them. Movie in the evening.

March 23, 2020

Initial release of Jcat

Today I released the first official tarball of Jcat, version 0.1.0. I’ve started the process to get the package into Fedora as it will almost certainly be a hard requirement in the next major version of fwupd.

Since I announced Jcat a few weeks ago, I’ve had a lot of positive feedback about the general concept and, surprisingly, even one hardware vendor suggested they might start self-signing their firmware before uploading to the LVFS (which is great!). More LVFS announcements coming soon, I promise…

The LVFS has been including Jcat files in archives and generating them for metadata for about three weeks now, and we’ve had no issues reported. Once the package is available in Fedora 32 I’ll merge the fwupd pull request to make it a hard dep. All you other distro package maintainers, please go do your packaging thing!

If anyone finds any oddities or weird behavior, please file an issue. I’m not expecting to make API breaks now, but will if we find a design bug. Most of the code is imported from fwupd, and so I’m pretty comfortable with the general design. Comments welcome.

March 20, 2020

GObject Class Private Data

It can be very handy to store things you might otherwise do with metaprogramming in your GObjectClass's private data (see G_TYPE_CLASS_GET_PRIVATE()).

Doing so is perfectly fine, but you need to be aware of how GTypeInstance initialization works. Each of your parent classes' instance init functions is called before your subclass's instance init (and in order of the type hierarchy). What might seem non-obvious, though, is that the GTypeInstance.g_class pointer is updated as each successive _init() function is called.

That means if you have my_widget_init() and your parent class is GtkWidget, gtk_widget_init() does not know it's instantiating a subclass. Furthermore, GTK_WIDGET_GET_CLASS() called from gtk_widget_init() will get you the base class's GtkWidgetClass, not the subclass's GtkWidgetClass.
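
To see this outside of GTK, here is a small self-contained sketch (my own example, not code from Bonsai or GTK) with a BaseObject/DerivedObject pair; the base class's _init() still reports the base type because g_class has not been switched to the subclass yet:

#include <glib-object.h>

#define BASE_TYPE_OBJECT (base_object_get_type ())
G_DECLARE_DERIVABLE_TYPE (BaseObject, base_object, BASE, OBJECT, GObject)

struct _BaseObjectClass { GObjectClass parent_class; };

G_DEFINE_TYPE (BaseObject, base_object, G_TYPE_OBJECT)

static void
base_object_class_init (BaseObjectClass *klass)
{
}

static void
base_object_init (BaseObject *self)
{
  /* While constructing a DerivedObject this still prints "BaseObject",
   * because g_class has not been updated to the subclass yet. */
  g_print ("base init sees: %s\n", G_OBJECT_TYPE_NAME (self));
}

#define DERIVED_TYPE_OBJECT (derived_object_get_type ())
G_DECLARE_FINAL_TYPE (DerivedObject, derived_object, DERIVED, OBJECT, BaseObject)

struct _DerivedObject { BaseObject parent_instance; };

G_DEFINE_TYPE (DerivedObject, derived_object, BASE_TYPE_OBJECT)

static void
derived_object_class_init (DerivedObjectClass *klass)
{
}

static void
derived_object_init (DerivedObject *self)
{
  /* By now g_class points at DerivedObjectClass: prints "DerivedObject". */
  g_print ("derived init sees: %s\n", G_OBJECT_TYPE_NAME (self));
}

int
main (void)
{
  g_object_unref (g_object_new (DERIVED_TYPE_OBJECT, NULL));
  return 0;
}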

There are ways around this if you don’t use G_DEFINE_TYPE(), but honestly, who wants to do that.

One technique around this, which I used in Bonsai's DAO, is to use a singly-linked list where the head is in each subclass, but the tail exists in each of the parent classes. That way you share all the parent structures, but the subclasses can access all of theirs. You'll still want to defer most setup work until constructed(), though, so you can get the full class information of the subclass and hierarchy.

It's templates all the way down

Benjamin Tissoires and I have been busy anthophila and working on the freedesktop CI templates. This post is primarily of interest if you're working on GitLab, specifically if your repo is hosted on gitlab.freedesktop.org. If either of those applies, prepare to be distracted from the current pandemic, otherwise maybe just prepare to be entertained. I'll do my best to be less miserable than the news.

We all know that CI/CD really helps with finding bugs early. If you don't know that yet, insert a jedi handwave before the previous sentence and now you do. GitLab is the git forge now used by freedesktop.org and it comes with a built-in CI system. I'm leaving out the difficult bits such as actually setting the thing up because this is obviously all handled by Heinzelmännchen and just readily available, hooray. I'm also going to assume that you roughly know how to write GitLab CI jobs or, failing that, at least know how to read YAML without screaming. So for this post, we start with the basic problem that your .gitlab-ci.yml is getting unwieldy, repetitive or generally just kinda sucks to maintain. Which is roughly where libinput and libevdev were a while back which caused Benjamin to start the ci-templates.

Now, what do we want? (other than a COVID-19 cure) Reproducible tests, possibly on different distributions, with the same base system across tests. For my repos the goal was basically "test on the common distributions to catch certain bugs early". [1] For Mesa, the requirement is closer to "have a fixed set of images that 'never' change so tests are reproducible". Both goals have much in common.

Your first venture into CI will look like this:


myjob:
  image: fedora:31
  before_script:
    - dnf update -y
    - dnf install -y onepackage twopackage threepackage floor
  script:
    - meson builddir && ninja -C builddir test

So, in short: take a Fedora 31 docker image, update it [2], install the required packages and then run the actual test part - meson and ninja. Easy.

This works fine but it takes approximately forever because dnf update is slow and you're potentially pulling down gigs of packages on every test run. Which is fun, but less so when you have 10 different jobs and they all do that. So let's call this step 1 and pretend we're more advanced than that. Step 2 is where you start building an image you re-use, steps 3 to N are the bits where you learn more than you want to know about docker, podman, skopeo and how many typos you can put into a YAML file. So, ad break, and we jump right to the part where enlightenment is just around the corner or wherever enlightenment lurks these days.

Using the CI Templates

Here's the .gitlab-ci.yml to build a Fedora 31 images with ci-templates and run the test on that image:


include:
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/fedora.yml'

variables:
  # project name of the upstream repo
  FDO_UPSTREAM_REPO: someproject/name

stages:
  - prep
  - test

myimage:
  extends: .fdo.container-build@fedora
  stage: prep
  variables:
    FDO_DISTRIBUTION_VERSION: '31'
    FDO_DISTRIBUTION_PACKAGES: 'onepackage twopackage threepackage floor'
    FDO_DISTRIBUTION_TAG: '2020-03-20.0'

myjob:
  extends: .fdo.distribution-image@fedora
  stage: test
  script:
    - meson builddir && ninja -C builddir test
  variables:
    FDO_DISTRIBUTION_VERSION: '31'
    FDO_DISTRIBUTION_TAG: '2020-03-20.0'

Now, you guessed correctly that the .fdo and FDO_ prefixes are used by the templates. There is a bunch of stuff hidden here. Basically, this will:
  • check if the image exists in your personal project's registry and use that, but if not
  • check if the image exists in the given upstream project's registry and use that, but if not
  • create a Fedora 31 image with the given packages installed and pushes it with the tag to the registry
  • use that image (whether newly created or pre-existing) and run the tests on it
There are a few more details too, but that's roughly the summary of it. For existing tags, the myimage job effectively becomes a noop and the myjob job will re-use the image. The image will be in your registry so you can podman run it locally to reproduce a bug.

To build a new image, simply change the tag. Either because you want newer packages or because you need extra (or fewer) packages. And the nice thing here: you will build a new image as part of your merge request and run the CI against that new image. But upstream and every other MR will keep using the old image - right up until your MR is merged, at which point every (future) MR will use that new updated image.

Want to build a Debian Stretch image? Replace fedora and 31 with debian and stretch. Same for Ubuntu, CentOS, Alpine and Arch, though for the last two you don't need a version number.

Templating the templates

"But, but, Peter, I want to test on eleventy different distribution like you do" I hear you say. Well, fear not, for this is where the ci-fairy comes in. How about we *gasp* generate the .gitlab-ci.yml file from a base configuration? That can't possibly be a bad idea, so let's do that! First, we save our configuration into the .gitlab-ci/config.yml:


distributions:
  - name: fedora
    tag: 12345
    version: 30
  - name: ubuntu
    tag: abcde
    version: '19.10'
  # and so on, and so forth

packages:
  - curl
  - wget
  - gcc

There is no specific requirement on the structure of the config file; ci-fairy simply loads it and passes it to Jinja2. Your template could thus look like this .gitlab-ci/ci.template file:

include:
{% for d in distributions %}
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/{{d.name}}.yml'
{% endfor %}

stages:
  - prep
  - test

{% for d in distributions %}

.{{d.name}}.{{d.version}}:
  variables:
    FDO_DISTRIBUTION_VERSION: '{{d.version}}'
    FDO_DISTRIBUTION_TAG: '{{d.tag}}'

myimage.{{d.name}}.{{d.version}}:
  extends:
    - .fdo.container-build@{{d.name}}
    - .{{d.name}}.{{d.version}}
  stage: prep
  variables:
    FDO_DISTRIBUTION_PACKAGES: "{{' '.join(packages)}}"

myjob.{{d.name}}.{{d.version}}:
  extends:
    - .fdo.distribution-image@{{d.name}}
    - .{{d.name}}.{{d.version}}
  stage: test
  script:
    - meson builddir && ninja -C builddir
{% endfor %}

And to locally generate our .gitlab-ci.yml, all we need to do is

$ pip3 install git+http://gitlab.freedesktop.org/freedesktop/ci-templates
$ cd path/to/project
$ ci-fairy generate-template
$ ci-fairy lint # checks the resulting YAML for syntax errors
$ git commit .gitlab-ci.yml
And, for reference, the file we generated here looks like this:

include:
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/fedora.yml'
  - project: 'freedesktop/ci-templates'
    ref: 123456deadbeef
    file: '/templates/ubuntu.yml'

stages:
  - prep
  - test

.fedora.30:
  variables:
    FDO_DISTRIBUTION_VERSION: '30'
    FDO_DISTRIBUTION_TAG: '12345'

myimage.fedora.30:
  extends:
    - .fdo.container-build@fedora
    - .fedora.30
  stage: prep
  variables:
    FDO_DISTRIBUTION_PACKAGES: "curl wget gcc"

myjob.fedora.30:
  extends:
    - .fdo.distribution-image@fedora
    - .fedora.30
  stage: test
  script:
    - meson builddir && ninja -C builddir

.ubuntu.19.10:
  variables:
    FDO_DISTRIBUTION_VERSION: '19.10'
    FDO_DISTRIBUTION_TAG: 'abcde'

myimage.ubuntu.19.10:
  extends:
    - .fdo.container-build@ubuntu
    - .ubuntu.19.10
  stage: prep
  variables:
    FDO_DISTRIBUTION_PACKAGES: "curl wget gcc"

myjob.ubuntu.19.10:
  extends:
    - .fdo.distribution-image@ubuntu
    - .ubuntu.19.10
  stage: test
  script:
    - meson builddir && ninja -C builddir

Aside from the templating, a new thing here is e.g. the .fedora.30 template that we extend from. This is an easy way to avoid having to set things like the distribution version and the tag multiple times. A few things of note: the tag is job-specific (not distribution-specific), so you could have two Fedora 30 images with two different tags. This is also just an example I typed out; a real-world .gitlab-ci.yml will look more complex and different, so only rely on the above to get an idea of what's possible.

A word for non-gitlab.freedesktop.org users: You can use the remote: include directive to use the templates from elsewhere. ci-fairy isn't tied to freedesktop.org either but you'll have to provide more flags to get what you want instead of relying on the default behaviours.

The documentation for CI Templates has more, go and peruse my pretties.

[1] For months the CI was basically just a build test because I couldn't run the test suite in a container
[2] Updating isn't always required but sooner or later you run into a dependency issue if you don't

March 16, 2020

Maintain release info easily in MetaInfo/Appdata files

This article isn’t about anything “new”, like the previous ones on AppStream – it rather exists to shine a spotlight on a feature I feel is underutilized. From conversations it appears that the reason is simply that people don’t know it exists, and of course that’s a pretty bad reason not to make your life easier 😉

Mini-Disclaimer: I’ll be talking about appstreamcli, part of AppStream, in this blogpost exclusively. The appstream-util tool from the appstream-glib project has a similar functionality – check out its help text and look for appdata-to-news if you are interested in using it instead.

What is this about?

AppStream permits software projects to add release information to their MetaInfo files to describe current and upcoming releases. This feature has the following advantages:

  • Distribution-agnostic format for release descriptions
  • Provides versioning information for bundling systems (Flatpak, AppImage, …)
  • Release texts are short and end-user-centric, not technical as the ones provided by distributors usually are
  • Release texts are fully translatable using the normal localization workflow for MetaInfo files
  • Releases can link artifacts (built binaries, source code, …) and have additional machine-readable metadata e.g. one can tag a release as a development release

The disadvantage of all this is that humans have to maintain the release information. Also, people need to write XML for it. Of course, once humans are involved with any technology, things get a lot more complicated. That doesn’t mean we can’t make things easier for people to use, though.

Did you know that you don’t actually have to edit the XML in order to update your release information? To make creating and maintaining release information as easy as possible, the appstreamcli utility has a few helpers built in. And the best thing is that appstreamcli, being part of AppStream, is available pretty ubiquitously on Linux distributions.

Update release information from NEWS data

The NEWS file is a loosely defined text file that lists “user-visible changes worth mentioning” for each version. This maps pretty well to what AppStream release information should contain, so let’s generate that from a NEWS file!

Since the NEWS format is not well defined, but we need to parse it somehow, the range of layouts appstreamcli can parse is very limited. We support a format in this style:

Version 0.2.0
~~~~~~~~~~~~~~
Released: 2020-03-14

Notes:
 * Important thing 1
 * Important thing 2

Features:
 * New/changed feature 1
 * New/changed feature 2 (Author Name)
 * ...

Bugfixes:
 * Bugfix 1
 * Bugfix 2
 * ...

Version 0.1.0
~~~~~~~~~~~~~~
Released: 2020-01-10

Features:
 * ...

When parsing a file like this, appstreamcli will tolerate a lot of errors/“imperfections” and account for quite a few style and string variations. You will need to check whether this format works for you. You can see it in use in AppStream itself, and in libxmlb for a slightly different style.

So, how do you convert this? We first create our NEWS file, e.g. with this content:

Version 0.2.0
~~~~~~~~~~~~~~
Released: 2020-03-14

Bugfixes:
 * The CPU no longer overheats when you hold down spacebar

Version 0.1.0
~~~~~~~~~~~~~~
Released: 2020-01-10

Features:
 * Now plays a "zap" sound on every character input

For the MetaInfo file, we of course generate one using the MetaInfo Creator. Then we can run the following command to get a preview of the generated file: appstreamcli news-to-metainfo ./NEWS ./org.example.myapp.metainfo.xml - Note the single dash at the end – this is the explicit way of telling appstreamcli to print the result to stdout. This is what the result looks like:

<?xml version="1.0" encoding="utf-8"?>
<component type="desktop-application">
  [...]
  <releases>
    <release type="stable" version="0.2.0" date="2020-03-14T00:00:00Z">
      <description>
        <p>This release fixes the following bug:</p>
        <ul>
          <li>The CPU no longer overheats when you hold down spacebar</li>
        </ul>
      </description>
    </release>
    <release type="stable" version="0.1.0" date="2020-01-10T00:00:00Z">
      <description>
        <p>This release adds the following features:</p>
        <ul>
          <li>Now plays a "zap" sound on every character input</li>
        </ul>
      </description>
    </release>
  </releases>
</component>

Neat! If we want to save this to a file instead, we just exchange the dash with a filename. And maybe we don’t want to add all releases of the past decade to the final XML? No problem too, just pass the --limit flag as well: appstreamcli news-to-metainfo --limit=6 ./NEWS ./org.example.myapp.metainfo.tmpl.xml ./result/org.example.myapp.metainfo.xml

That’s nice on its own, but we really don’t want to do this by hand… The best way to ensure the MetaInfo file is updated is to simply run this command at build time to generate the final MetaInfo file. For the Meson build system you can achieve this with a code snippet like the one below (for CMake this shouldn’t be an issue either – you could even make a nice macro for it there):

ascli_exe = find_program('appstreamcli')
metainfo_with_relinfo = custom_target('gen-metainfo-rel',
    input : ['./NEWS', 'org.example.myapp.metainfo.xml'],
    output : ['org.example.myapp.metainfo.xml'],
    command : [ascli_exe, 'news-to-metainfo', '--limit=6', '@INPUT0@', '@INPUT1@', '@OUTPUT@']
)

In order to also translate releases, you will need to add this to your .pot file generation workflow, so (x)gettext can run on the MetaInfo file with translations merged in.

Release information from YAML files

Since parsing a “no structure, somewhat human-readable file” is hard without baking an AI into appstreamcli, there is also a second option available: generate the XML from a YAML file. YAML is easy for humans to write, but can also be parsed by machines. The YAML structure used here is specific to AppStream, but somewhat maps to the NEWS file contents as well as the MetaInfo file data. That makes it more versatile, but in order to use it, you will need to opt into using YAML for writing news entries. If that’s something you would consider, read on!

A YAML release file has this structure:

---
Version: 0.2.0
Date: 2020-03-14
Type: development
Description:
- The CPU no longer overheats when you hold down spacebar
- Fixed bugs ABC and DEF
---
Version: 0.1.0
Date: 2020-01-10
Description: |-
  This is our first release!

  Now plays a "zap" sound on every character input

As you can see, the release date has to be an ISO 8601 string, just like it is assumed for NEWS files. Unlike in NEWS files, releases can be marked as either stable or development by specifying a Type field. If no Type field is present, stable is implicitly assumed. Each release has a description, which can either be free-form multi-paragraph text, or a list of entries.

Converting the YAML example from above is as easy as using the exact same command that was used before for plain NEWS files: appstreamcli news-to-metainfo --limit=6 ./NEWS.yml ./org.example.myapp.metainfo.tmpl.xml ./result/org.example.myapp.metainfo.xml If appstreamcli fails to autodetect the format, you can help it by specifying it explicitly via the --format=yaml flag. This command would produce the following result:

<?xml version="1.0" encoding="utf-8"?>
<component type="console-application">
  [...]
  <releases>
    <release type="development" version="0.2.0" date="2020-03-14T00:00:00Z">
      <description>
        <ul>
          <li>The CPU no longer overheats when you hold down spacebar</li>
          <li>Fixed bugs ABC and DEF</li>
        </ul>
      </description>
    </release>
    <release type="stable" version="0.1.0" date="2020-01-10T00:00:00Z">
      <description>
        <p>This is our first release!</p>
        <p>Now plays a "zap" sound on every character input</p>
      </description>
    </release>
  </releases>
</component>

Note that the 0.2.0 release is now marked as development release, a thing which was not possible in the plain text NEWS file before.

Going the other way

Maybe you like writing XML, or have some other tool that generates the MetaInfo XML, or you have received your release information from some other source and want to convert it into text. AppStream also has a tool for that! Using appstreamcli metainfo-to-news <metainfo-file> <news-file> you can convert a MetaInfo file that has release entries into a text representation. If you don’t want appstreamcli to autodetect the right format, you can specify it via the --format=<text|yaml> switch.

Future considerations

The release handling is still not something I am entirely happy with. For example, the release information has to be written and translated at release time of the application. For some projects, this workflow isn’t practical. That’s why issue #240 exists in AppStream which basically requests an option to have release notes split out to a separate, remote location (and also translations, but that’s unlikely to happen). Having remote release information is something that will highly likely happen in some way, but implementing this will be a quite disruptive, if not breaking change. That is why I am holding this change back for the AppStream 1.0 release.

In the meantime, besides improving the XML form of release information, I also hope to support a few more NEWS text styles if they can be autodetected. The format of the systemd project may be a good candidate. The YAML release-notes format variant will also receive a few enhancements, e.g. for specifying a release URL. For all of these things, I very much welcome pull requests or issue reports. I can best implement and maintain the things I use myself, so if I don’t use something or don’t know about a feature many people want, I won’t suddenly implement it or start adding features at random because “they may be useful”. That would be a recipe for disaster. This is why, for these features in particular, contributions from people who are using them in their own projects or who want their new use case represented are very welcome.

March 15, 2020

How to use Sysprof to… Part II

In the previous article of this series we covered Sysprof basics to help you use the tooling. Now I want to take a moment to show you how to use the command line tooling to profile systems like GNOME Shell.

Record an existing session

The easiest way to get started is to record your existing GNOME Shell session. With sysprof-cli, you can use the --gnome-shell option and it will attempt to connect to your active GNOME Shell instance over D-Bus to stream COGL pipeline information over a private file-descriptor.
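
For example, something like this should do it (the capture ends up in capture.syscap in the current directory by default, as noted further below; stop the recording with Ctrl+C when you are done):

$ sysprof-cli --gnome-shell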

This information can be combined with callgraphs to see what is happening during the duration of a COGL mark.

The details page can also provide some quick overview information about the marks and their duration. You will find this helpful when comparing patches to see if they really improved things over time.

The details button in the top right will show you information about marks and their min/max/avg duration.

Basic Shell Recording

Running something like a desktop session is complex. You have a D-Bus daemon, a compositor, a series of background daemons, settings infrastructure, and programs saving to your home directory. For this reason you cannot really run two of them for the same user at the same time, or even nested.

Because of this, it is handy to log out of your desktop session and switch to a VT to profile GNOME Shell. Sysprof provides a sysprof-cli binary you can use to profile in complicated setups like this.
Start by switching to another VT, for example with Control+Alt+F3. I recommend stopping the current display server just so that it doesn’t get in the way of profiling, but usually it’s okay not to. Then we can enter our JHBuild environment with a new D-Bus session before we start Sysprof and GNOME Shell.

Fedora 32 (Workstation Edition)
Kernel 5.6.0-0.rc4.git0.1.fc32.x86_64 (tty3)

startdust login: christian
Password: 
$ sudo service gdm stop
$ dbus-run-session jhbuild shell
$ 

At this point, we can spawn GNOME Shell with Sysprof to start recording.

You can use -- to specify the command you want sysprof-cli to execute while it records. When that application exits, sysprof-cli will extract all the known symbols and finish its recording.

I want to mention briefly that the --gnome-shell option only works with an existing GNOME session. I hope to fix that in the near future though.

$ sysprof-cli -- gnome-shell --wayland --display-server

At this point, GNOME Shell will have spawned and you can exercise it to exhibit the behavior you’d like to improve. When done, open a terminal window to kill GNOME shell so that the profiler can clean up.

kill -9 $(pidof gnome-shell) seems to work well for me

Now you’ll have a capture.syscap file in your current directory. Open that up with Sysprof to view the contents of your profiling session. Often I just spawn Sysprof directly to open the syscap file and explore.

Recording JavaScript stacks

Sometimes you want to profile JavaScript instead of the C code from Shell, Mutter, and friends. To do this, use the --gjs command line option. Currently, this can give mixed results if you also sample callstacks with the Linux perf support, as the timings are not guaranteed to be equivalent. My recommendation is to disable perf when sampling JavaScript using the --no-perf option.

$ sysprof-cli --gjs --no-perf -- gnome-shell --wayland --display-server

Now when you open the callgraph in Sysprof, you’ll see JavaScript samples.

JavaScript callgraph example

Recording Energy Consumption

On Linux, we have support for tracking energy usage via “Running Average Power Limit”, or RAPL for short. Sysprof can include this information in your capture if you have the turbostat utility available. It provides power information per “package”, such as the GPU and CPU.

Keeping power consumption low is an important part of a modern desktop that aims to be useful on laptops and smaller form factors. It’s useful to check in now and again to ensure that we’re keeping things tip top.

$ sysprof-cli --rapl --no-perf -- gnome-shell --wayland --display-server

You might want to disable sampling while testing power consumption because that could have a larger effect in terms of wattage than the thing you’re profiling.

Don’t forget to check the counter and energy menus for additional graphs.

Reducing Memory Allocations

Plugging memory leaks is a great thing to do. But sometimes it’s better to never allocate things to begin with. The --memprof option can help you find extraneous allocations in your program. For example, I tested the --memprof option on GNOME Shell when writing it and immediately found a way to reduce temporary allocations by hundreds of MiB per minute of use.

$ sysprof-cli --memprof -- gnome-shell --wayland --display-server

Avoiding Main Loop Stalls

This one requires you to build Sysprof yourself until our next release, but you can use the --speedtrack option to find things running on your main loop that may not be a good idea. It will also insert marks for how long the main loop iterations run, to find periods of time where you aren’t staying interactive.

$ sysprof-cli --speedtrack -- gnome-shell --wayland --display-server

Anyway, that does it for now! Hope you found this brain dump insightful enough to help us all push forward on the performance curve.

Shortwave – First stable release

Today, after nearly two years of development I’m very proud to say: The first stable version of Shortwave is now available! I have put a lot of time and effort into this project, now it is finally time to make it available for everyone :-).

What is Shortwave?

Shortwave is an internet radio player that provides access to a station database with over 25,000 stations.

Automatic recording of songs

When a station is being played, everything gets automatically recorded in the background. You hear a song you like? No problem, you can save the song afterwards and play it with your favorite music player. Songs are automatically detected based on the stream metadata.

Streaming

It’s possible to stream the audio playback to a network device, which implements the Google Cast protocol (e.g. Chromecast). So you can easily listen to your favorite stations e.g. from a TV.

Adaptive interface

The interface of Shortwave is completely adaptive and adapts to all screen sizes. So you can use it on the desktop, but also on your Linux (not Android!) based smartphone.

Access to a huge database

Shortwave uses the internet service radio-browser.info as station database. It contains more than 25,000 stations. This ensures that you will find every radio station, whether a known or an exotic one.

System Integration

Shortwave integrates into GNOME Shell by providing an MPRIS applet and a proper PulseAudio implementation.

Gradio???

… is definitely dead now. But don’t worry, you can migrate the data easily to Shortwave.

Gradio: Application Menu -> “Library” -> “Export” -> “Gradio Database Format”

Shortwave: “Import stations from Gradio”

Download

Shortwave is already available to download from Flathub!

Or install it with:

flatpak install flathub de.haeckerfelix.Shortwave

Have fun with it! And many thanks to all who supported me during the development. Especially the fabulous GNOME Podcasts team :-)

March 13, 2020

Building and testing GTK

… or: how GTK developers check their work on the toolkit.

Since GNOME’s collective move to GitLab, GTK has taken advantage of the features provided by that platform—especially when it comes to its continuous integration pipeline.

In days of old, the only way to check that our changes to the toolkit were correct was to wait for the Continuous build bot to notify us of any breakage on the main development branches. While this was better than nothing, it didn’t allow us to prevent breakage during the development phase of anything—from features to bug fixes, from documentation improvements to adding new tests.

These days, the CI pipeline available in GitLab is run on every branch and merge request, long before the changes reach the public development branches used by everybody else.

Topic branches and merge requests

When developing a topic branch against the GTK 4 main development one, we run a CI pipeline that starts with a simple coding style check for the changes applied in the branch. The style check uses clang-format, which is often good enough for the GTK coding style; the coding style has a few “special” caveats, and clang-format can raise false negatives and false positives. For that reason, the style check is allowed to fail, but contributors and reviewers are strongly encouraged to check the logs in case of failure.

Once the style check is passed, we run the build phase, which currently contains three separate jobs:

  • a Linux debug build, using a Fedora container
  • an MSYS2 build on Windows
  • a Linux release build

The Linux debug build is pretty standard fare.

The MSYS2 build catches any issue with a GNU toolchain on Windows.

The release build is necessary to ensure that we don’t rely on side effects of the debugging code we have in place during development.

All of these jobs run the GTK test suite.

We publish the test reports both as a JUnit file, taking advantage of GitLab’s support, and as an HTML report stored as a pipeline artifact. This makes it easier for us to check what failed and what succeeded.

Ideally, we want to add more environments:

  • Linux builds based on other mainstream distributions
  • a Windows build using the MSVC toolchain
  • a macOS build, once the GDK backend is fixed

After the build and testing jobs pass, we step into an analysis phase. We run the Clang static analysis tool on GTK’s code base and generate a report. In the near future we could also run sanitizer tools like UBSan and ASan; fuzzing tools for our parsers, like GtkBuilder and CSS; or tools that verify that our UI definitions contain the appropriate accessibility descriptors.

Just like the tests, we publish the analysis reports as GitLab artifacts for review.

Once the analysis phase is passed, we build the API references, and check the result so that newly added symbols are properly documented.

Finally, we have manual CI jobs to build Flatpak bundles for the GTK demo application; the widget factory; and the icon browser. This allows designers to immediately test changes in Adwaita, or newly added widgets, without necessarily building GTK from a scratch on their systems.

Mainline development branches

Once the CI pipeline for a topic branch/merge request passes, we can merge the changes into the main development branch with a certain level of confidence that the code is correct and does what we want.

The main development branch runs the same pipeline as previously described, except that the Flatpak jobs are not manual any more—thus it is always possible to test the current bleeding edge of GTK locally. Additionally, the documentation is published online, so it’s always up to date.

The GTK CI pipeline

What about GTK 3?

In the GTK 3 branch we have a simpler pipeline that runs the following jobs:

  • a full Meson debug build on Linux and Windows/MSYS2, for both static and shared libraries artifacts, on the current stable versions of Fedora and Debian
  • a full Meson release build on Linux, which also generates the API reference
  • an Autotools build on Linux and Windows/MSYS2
  • an optional Autotools distcheck build on Linux

The Autotools jobs will be in place for as long as GTK 3 supports Autotools. Ideally, we want to add other jobs for macOS and Windows/MSVC, taking advantage of the Meson build.

The GTK3 CI pipeline

Once the GTK 4 CI pipeline reaches a certain level of features and stability, we’re going to backport it to GTK 3, so we can be even more confident that the current stable branch does not regress.


For more information, you can check the GTK repository.

gedit – 36 things to do, maybe planning a crowdfunding

GNOME 3.36 has been released. And gedit 3.36 too!

In the small corner of the Universe where I live, when we say “36” it actually means “a lot”. When we have 36 things to do today, or when we cannot do 36 things at the same time. In the case of gedit, there are also 36 things to do, as you can imagine.

I now have more time that I can devote to GNOME, especially gedit. But I’m partly living on my savings.

Maybe planning to do a crowdfunding for gedit!

Do you think it would work? Is there still a wide interest for gedit?

gedit is the default text editor of GNOME, that is installed by default with many Linux distributions, so it ought to be a great app. But to be a great app, gedit needs a lot of work in my opinion. There are lots of imperfections and bugs, and the state of the code … could be improved significantly.

To give you an idea of possible things to improve in gedit, here is the roadmap (the items are in no particular order).

Update: to see what’s new in gedit, see the gedit NEWS file and gedit-plugins NEWS file (read also the 3.35 entries).

Note that the fact that I have more free time, and the fact that I’m maybe planning a crowdfunding, are not related to this corona pandemic thingy (I prefer to make that clear).

March 12, 2020

End of GNOME Outreachy 2019

The Outreachy Program

The Outreachy program ended this past week and we've made great improvements during these four months of work. I'm very happy with the result and with the work of the two interns, and also with the GNOME co-mentors that made this possible.

If we're lucky the interns will continue contributing in the future, and we'll see the GNOME community growing in developers and diversity 🎉.

GNOME translation editor (Gtranslator)

Priyanka Saggu has been working on the new gtranslator search bar. It's a replacement for the old search dialog with a new and modern search bar, inspired by the gnome-builder search.

This is how it looks in the current gtranslator version, and the video is from gtranslator master, which I'll try to release as 3.36.0 this weekend.

I want to thank the other co-mentor of this project, Daniel Mustieles, who has been testing and reviewing this new functionality.

Fractal

Sonja Heinze has been working on the video player for Fractal, so now we can see videos inside the Fractal message history instead of opening them with an external video player.

This is now in master and will appear in the next release, which I'll try to publish soon, maybe during this month. I want to fix some performance issues first.

Jordan Petridis (alatiera) has done great work as co-mentor, guiding the project and helping with the GStreamer side.

I've not presented any project proposal for the next Outreachy; I want to take a break and rest a bit before the Google Summer of Code, when I'll try to get multi-account support implemented in Fractal.

These programs give me a bit of work, reviewing and guiding the interns, but it's really great to have paid people working on free software, so I'm very happy to be a mentor in GNOME and to help boost some free software projects using these resources.

March 11, 2020

GNOME 3.36 user documentation updates

Looks like since the release of GNOME 3.34.0 in September 2019 I made exactly 500 commits in GNOME Git. :)

Localized screenshots shipped in GNOME 3.34 versus the same screenshot in 3.36

  • My main focus was on updating documentation. The user help of cheese, gnome-klotski, gnome-mahjongg, gnome-nibbles, gnome-robots, gnome-terminal, gnome-tetravex, iagno, lightsoff, quadrapassel, rhythmbox, zenity should be up-to-date in 3.36 again.
    (If not, then report issues in GNOME GitLab with the label “8. User Docs” or contribute patches yourself.)

  • This also included updating a majority of outdated screenshots (both English and localized versions when feasible) across projects.

  • I also took the liberty of pushing quite a few trivial markup fixes in some translations when a language was not already reserved for translation on GNOME’s translation platform (as such actions would interfere).

Enjoy 3.36!

Epiphany 3.36 and WebKitGTK 2.28

So, what’s new in Epiphany 3.36?

PDF.js

Once upon a time, beginning with GNOME 3.14, Epiphany supported displaying PDF documents via the Evince NPAPI browser plugin developed by Carlos Garcia Campos. Unfortunately, because NPAPI plugins have to use X11-specific APIs to draw web content, this didn’t suffice for very long. When GNOME switched to Wayland by default in GNOME 3.24 (yes, that was three years ago!), this functionality was left behind. Using an NPAPI plugin also meant the code was inherently unsandboxable and tied to a deprecated technology. Epiphany disabled support for NPAPI plugins by default in Epiphany 3.30, leaving the functionality behind a hidden setting, which has now finally been removed for Epiphany 3.36, killing off NPAPI for good.

Jan-Michael Brummer, who comaintains Epiphany with me, tried bringing back PDF support for Epiphany 3.34 using libevince, but eventually we decided to give up on this approach due to difficulty solving some user experience issues. Also, the rendering occurred in the unsandboxed UI process, which was again not good for security.

But PDF support is now back in Epiphany 3.36, and much better than before! Thanks to Jan-Michael, Epiphany now supports displaying PDFs using the amazing PDF.js. We are thankful for Mozilla’s work in developing PDF.js and open sourcing it for us to use. Viewing PDFs in Epiphany using PDF.js is more convenient than downloading them and opening them in Evince, and because the PDF is rendered in the sandboxed web process, using web technologies rather than poppler, it’s also approximately one bazillion times more secure.

Screenshot of Epiphany displaying a PDF document
Look, it’s a PDF!

One limitation of PDF.js is that it does not support forms. If you need to fill out PDF forms, you’ll need to download the PDF and open it in Evince, just as you would if using Firefox.

Dark Mode

Thanks to Carlos Garcia, it should finally be possible to use Epiphany with dark GTK themes. WebKitGTK has historically rendered HTML elements using the GTK theme, which has not been good for users of dark themes: dark themes broke badly on many websites, usually due to dark text being drawn on dark backgrounds, or various other problems with unexpected dark widgets. Since WebKitGTK 2.28, WebKit will try to manually change to a light GTK theme when it thinks a dark theme is in use, then use the light theme to render web content. (This work has actually been backported to WebKitGTK 2.26.4, so you don’t need to upgrade to WebKitGTK 2.28 to benefit, but the work landed very recently and we haven’t blogged about it yet.) Thanks to Cassidy James from elementary for providing example pages for testing dark mode behavior.

Screenshot demonstrating broken dark mode support
Broken dark mode support prior to WebKitGTK 2.26.4. Notice that the first two pages use dark color schemes when light color schemes are expected, and the dark blue links are hard to read over the dark gray background. Also notice that the text in the second image is unreadable.
Screenshot demonstrating fixed dark mode support in WebKitGTK 2.26.4
Since WebKitGTK 2.26.4, dark mode works as it does in most other browsers. Websites that don’t support dark mode are light, and websites that do support dark mode are dark. Widgets themed using GTK are always light.

Since Carlos had already added support for the prefers-color-scheme media query last year, this now gets us up to dark mode parity with most browsers, except, notably, Safari. Unlike other browsers, Safari allows websites to opt in to rendering dark system widgets, like WebKitGTK used to do before these changes. Whether to support this in WebKitGTK remains to be determined.

Process Swap on Navigation (PSON)

PSON, which debuted in Safari 13, is a major change in WebKit’s process model. PSON is the first component of site isolation, which Chrome has supported for some time, and which Firefox is currently working towards. If you care about web security, you should care a lot about site isolation, because the web browser community has arrived at a consensus that this is the best way to mitigate speculative execution attacks.

Nowadays, all modern web browsers use separate, sandboxed helper processes to render web content, ensuring that the main user interface process, which is unsandboxed, does not touch untrusted web content. Prior to 3.36, Epiphany already used a separate web process to display each browser tab (except for “related views,” where one tab opens another and gains scripting ability over the opened tab, subject to the Same Origin Policy). But in Epiphany 3.36, we now also have a separate web process per website. Each tab will swap between different web processes when navigating between different websites, to prevent any one web process from loading content from different websites.

To make these process swap navigations fast, a pool of prewarmed processes is used to hide the startup cost of launching a new process by ensuring the new process exists before it’s needed; otherwise, the overhead of launching a new web process to perform the navigation would become noticeable. And suspended processes live on after they’re no longer in use because they may be needed for back/forward navigations, which use WebKit’s page cache when possible. (In the page cache, pages are kept in memory indefinitely, to make back/forward navigations fast.)

Due to internal refactoring, PSON previously necessitated some API breakage in WebKitGTK 2.26 that affected Evolution and Geary: WebKitGTK 2.26 deprecated WebKit’s single web process model and required that all applications use one web process per web view, which Evolution and Geary were not, at the time, prepared to handle. We tried hard to avoid this, because we hate to make behavioral changes that break applications, but in this case we decided it was unavoidable. That was the status quo in 2.26, without PSON, which we disabled just before releasing 2.26 in order to limit application breakage to just Evolution and Geary. Now, in WebKitGTK 2.28, PSON is finally available for applications to use on an opt-in basis. (It will become mandatory in the future, for GTK 4 applications.) Epiphany 3.36 opts in. To make this work, Carlos Garcia designed new WebKitGTK APIs for cross-process communication, and used them to replace the private D-Bus server that Epiphany previously used for this purpose.
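
For applications that want to opt in, a minimal sketch might look like the following. This assumes the opt-in is exposed as a construct-only WebKitWebContext property named process-swap-on-cross-site-navigation-enabled in WebKitGTK 2.28; check the 2.28 API documentation for the exact name before relying on it.

    #include <webkit2/webkit2.h>

    /* Sketch: opt in to PSON when constructing the web context. The property
     * is assumed to be construct-only, so it has to be set at creation time. */
    static WebKitWebContext *
    create_pson_enabled_context (void)
    {
      return g_object_new (WEBKIT_TYPE_WEB_CONTEXT,
                           "process-swap-on-cross-site-navigation-enabled", TRUE,
                           NULL);
    }

Web views created from this context (for example with webkit_web_view_new_with_context()) will then swap web processes when navigating across sites.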

WebKit still has a long way to go to fully implement site isolation, but PSON is a major step down that road. Thanks to Brady Eidson and Chris Dumez from Apple for making this work, and to Carlos Garcia for handling most of the breakage (there was a lot). As with any major intrusive change of such magnitude, regressions are inevitable, so don’t hesitate to report issues on WebKit Bugzilla.

highlight.js

Once upon a time, WebKit had its own implementation for viewing page source, but this was removed from WebKit way back in 2014, in WebKitGTK 2.6. Ever since, Epiphany would open your default text editor, usually gedit, to display page source. Suffice it to say that this was not a very satisfactory solution.

I finally managed to implement view source mode at the Epiphany level for Epiphany 3.30, but I had trouble making syntax highlighting work. I tried using various open source syntax highlighting libraries, but most are designed to highlight small amounts of code, not large web pages. The libraries I tried were not fast enough, so I gave up on syntax highlighting at the time.

Thanks to Jan-Michael, Epiphany 3.36 supports syntax highlighting using highlight.js, so we finally have view source mode working fully properly once again. It works much better than my failed attempts with different JS libraries. Please thank the highlight.js developers for maintaining this library, and for making it open source.

Screenshot displaying Epiphany's view source mode
Colors!

Service Workers

Service workers are now available in WebKitGTK 2.28. Our friends at Apple had already implemented service worker support a couple years ago for Safari 11, but we were pretty slow in bringing this functionality to Linux. Finally, WebKitGTK should now be up to par with Safari in this regard.

Cookies!

Patrick Griffis has updated libsoup and WebKitGTK to support SameSite cookies. He’s also tightened up our cookie policy by implementing strict secure cookies, which prevents http:// pages from setting secure cookies (as they could overwrite secure cookies set by https:// pages).
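
As a rough illustration of what this looks like on the libsoup side, here is a minimal sketch, assuming the SoupSameSitePolicy enum and soup_cookie_set_same_site_policy() added around libsoup 2.70:

    #include <libsoup/soup.h>

    /* Sketch: build a cookie that is Secure and SameSite=Lax. With strict
     * secure cookies, a Secure cookie like this can only be set from an
     * https:// context; the name, value and domain here are illustrative. */
    static SoupCookie *
    make_samesite_cookie (void)
    {
      SoupCookie *cookie = soup_cookie_new ("sid", "12345",
                                            "example.org", "/",
                                            -1 /* session cookie */);
      soup_cookie_set_secure (cookie, TRUE);
      soup_cookie_set_same_site_policy (cookie, SOUP_SAME_SITE_POLICY_LAX);
      return cookie;
    }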

Adaptive Design

As usual, there are more adaptive design improvements throughout the browser, to provide a better user experience on the Librem 5. There’s still more work to be done here, but Epiphany continues to provide the best user experience of any Linux browser at small screen sizes. Thanks to Adrien Plazas and Jan-Michael for their continued work on this.

Screenshot showing Epiphany running in mobile mode at small window size.
As before, simply resize your browser window to see Epiphany dynamically transition between desktop mode and mobile mode.

elementary OS

With help from Alexander Mikhaylenko, we’ve also upstreamed many elementary OS design changes, which will be used when running under the Pantheon desktop (and will not impact users on other desktops), so that the elementary developers don’t need to maintain their customizations as separate patches anymore. This will eliminate a few elementary-specific bugs, including some keyboard shortcuts that were previously broken only in elementary, and some odd tab bar behavior. Although Epiphany still doesn’t feel quite as native as an app designed just for elementary OS, it’s getting closer.

Epiphany 3.34

I failed to blog about Epiphany 3.34 when I released it last September. Hopefully you have updated to 3.34 already, and are already enjoying the two big features from this release: the new adblocker, and the bubblewrap sandbox.

The new adblocker is based on WebKit Content Blockers, which was developed by Apple several years ago. Adrian Perez developed new WebKitGTK API to expose this functionality, changed Epiphany to use it, and deleted Epiphany’s older resource-hungry adblocker that was originally copied from Midori. Previously, Epiphany kept a large GHashMap of compiled regexes in every web process, consuming a very significant amount of RAM for each process. It also took time to compile these regexes when launching each new web process. Now, the adblock filters are instead compiled into an efficient bytecode format that gets mmapped between all web processes to avoid excessive resource use. The bytecode is interpreted by WebKit itself, rather than by Epiphany’s web process extension (which Epiphany uses to execute custom code in WebKit’s web process), for greatly improved performance.
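
To give an idea of how an application drives this API, here is a minimal sketch based on WebKitUserContentFilterStore (available since WebKitGTK 2.24, and not Epiphany’s actual code); the JSON rule, storage path and identifier are purely illustrative:

    #include <string.h>
    #include <webkit2/webkit2.h>

    /* Illustrative WebKit content blocker rule: block any URL matching ".*ads.*". */
    static const char *RULES_JSON =
      "[{\"trigger\": {\"url-filter\": \".*ads.*\"},"
      "  \"action\": {\"type\": \"block\"}}]";

    static void
    filter_saved_cb (GObject *source, GAsyncResult *result, gpointer user_data)
    {
      WebKitUserContentManager *manager = WEBKIT_USER_CONTENT_MANAGER (user_data);
      GError *error = NULL;
      /* The compiled filter bytecode lives on disk and is mapped into the web
       * processes, instead of keeping per-process compiled regexes around. */
      WebKitUserContentFilter *filter =
        webkit_user_content_filter_store_save_finish (
          WEBKIT_USER_CONTENT_FILTER_STORE (source), result, &error);

      if (filter == NULL) {
        g_warning ("Failed to compile content filter: %s", error->message);
        g_error_free (error);
        return;
      }

      webkit_user_content_manager_add_filter (manager, filter);
      webkit_user_content_filter_unref (filter);
    }

    static void
    install_adblock_filter (WebKitUserContentManager *manager)
    {
      /* Compiled filters are cached under this (illustrative) directory. */
      WebKitUserContentFilterStore *store =
        webkit_user_content_filter_store_new ("/tmp/filter-store");
      GBytes *source = g_bytes_new_static (RULES_JSON, strlen (RULES_JSON));

      webkit_user_content_filter_store_save (store, "example-adblock", source,
                                             NULL /* cancellable */,
                                             filter_saved_cb, manager);
      g_bytes_unref (source);
      g_object_unref (store);
    }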

Lastly, Epiphany 3.34 enabled Patrick’s bubblewrap sandbox, which was added in WebKitGTK 2.26. Bubblewrap is an amazing sandboxing tool, already used effectively by flatpak and rpm-ostree, and I’m very pleased with Patrick’s decision to use it for WebKit as well. Because enabling the sandbox can break applications, it is currently opt-in for GTK 3 apps (but will become mandatory for GTK 4 apps). If your application uses WebKitGTK, you really need to take some time to enable this sandbox using webkit_web_context_set_sandbox_enabled(). The sandbox has introduced a couple of regressions that we didn’t notice until too late; notably, printing no longer works, which, half a year later, we still haven’t managed to fix. (I’ll try to get to it soon.)
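
If your application uses the default web context, enabling the sandbox can be as simple as the sketch below; the important constraint is that webkit_web_context_set_sandbox_enabled() must be called before any web view is created, i.e. before any web process has been spawned (the window setup here is just a minimal GTK 3 example, not anyone’s actual code):

    #include <gtk/gtk.h>
    #include <webkit2/webkit2.h>

    int
    main (int argc, char **argv)
    {
      gtk_init (&argc, &argv);

      /* Must happen before any WebKitWebView exists, otherwise web processes
       * would already be running unsandboxed. */
      webkit_web_context_set_sandbox_enabled (webkit_web_context_get_default (),
                                              TRUE);

      GtkWidget *window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
      GtkWidget *view = webkit_web_view_new ();
      gtk_container_add (GTK_CONTAINER (window), view);

      webkit_web_view_load_uri (WEBKIT_WEB_VIEW (view), "https://example.org/");
      gtk_widget_show_all (window);
      gtk_main ();
      return 0;
    }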

OK, this concludes your 3.36 and 3.34 updates. Onward to 3.38!

March 10, 2020

Mozilla makes Firefox Beta available on Flathub

I’m glad to see that Mozilla has made significant progress with offering Firefox as a flatpak. Having Firefox as a flatpak was one of our long-term goals.

Three years ago we started a testing flatpak repo with Firefox Developer Edition and soon after that we added Firefox Nightly. For a long time it was the only source of Firefox for Flatpak out there. The user base grew into thousands, a level our hosting could barely deal with. Lately we haven’t had much time for its maintenance and at least the nightly builds were often broken.

That’s why from the very beginning we worked with Mozilla to make official Firefox builds available as flatpaks. The effort was later on picked up by Endless.

Now it’s bearing its first fruits: Mozilla is already shipping Firefox Beta in the beta channel on Flathub. You just need to enable it by installing this file: https://flathub.org/beta-repo/appstream/org.mozilla.firefox.flatpakref

I think it may already be useful for Silverblue users who have relied on our testing repo because they didn’t want to use a package overlay.

There are still a few things to polish before making the stable Firefox available in the stable channel. One of them is localization files, which have always been a difficult thing with Firefox. Mozilla has traditionally provided official localized builds for each language. This is not how localizations are typically handled in Flatpak or even in Linux distro packages.

For the Fedora Firefox RPM, we had to write a script that, on startup, automatically loads a particular localization file in the form of an add-on. I suppose the official Firefox flatpak will work in a similar way.

Last year we also started providing a Firefox flatpak built from Fedora packages. That has been the only stable Firefox for Flatpak around. And even after Mozilla releases their official Firefox for Flatpak, we will stay committed to it, because there is demand among Fedora users for a Firefox that is verified and built by the Fedora Project, and it’s also a requirement for software included in Fedora anyway. So if we want to have Firefox as a default, pre-installed browser, e.g. in Silverblue, we’ll have to build the flatpak ourselves. It also gives us the flexibility to ship crucial fixes and features important to our users faster than upstream (e.g. Firefox in Fedora already runs natively on Wayland by default, not on XWayland like upstream Firefox).

In the future, users will have a choice. They can either stick with the default Firefox provided by us, or switch to the one provided directly by Mozilla in a more convenient and secure way than the current binary tarballs. We will also keep maintaining our testing repo for those who are interested in Nightly and Developer Edition. And we’ll see if there is sufficient interest in it to continue.


March 09, 2020

Recapping my journey at Endless

It seems like only yesterday that I joined Endless as a Papercuts Team member after being a GNOME contributor for a while. I was supposed to fix papercut issues in the OS, package a few Electron apps as flatpaks, and overall work alongside the desktop team. I am happy to report that the experience, almost 2 years now, has been a positive and educational one that has made me a better developer and, by and large, a better human.

Endless is a mission. A very tough one, though. Working closely with the team, I learned more about the mission and goals: how our work was going to address the problem space, and the KPIs to measure the impact made. It was sheer motivation. In other words:

“Being a part of something bigger, a movement.”

Over the course of my time there, I shifted to the core desktop team, where I started working directly on the desktop. I joined the team at a time when the OS had a huge delta with upstream GNOME (we were on GNOME 3.26) and we needed to fast-forward the rebase over three releases (to GNOME 3.32). Naturally, rebasing such a delta is no cakewalk: the desktop team was dealing with fallout from the rebases for the next 3 to 4 rolling releases. Looking back at this particular time period, I think we learned three major lessons.

  • Rebase, Rebase, Rebase every six months

  • Upstream all the things!

  • Adopt an upstream-first strategy wherever possible

The first one is quite obvious, as the rebase onto GNOME 3.32 was a very painful one. We took note of it in the retrospectives and made sure we do not miss a rebase over an upstream release cycle. The second and third are complementary to each other, and Endless has taken great strides in this direction. Projects that initially started as downstream work, based on various requests from field/ground reports, have been upstreamed across various open source projects, be it the kernel, Flatpak, OSTree or many core GNOME components. Endless also proposes plans that might be useful to the community in general; for example, I can recommend the talk given by Robert McQueen on Product Metrics & Respecting Privacy for GNOME from GUADEC 2019, where he talks about how GNOME could benefit on the metrics front using some of the pieces that have already been developed and tested by Endless in the field. This is important, as the community gets some food for thought to design and refine its goals, and it probably helps to develop a roadmap for future releases.

Highlights of the work that I have been doing!

Disk space improvements:

Endless suffered from ENOSPC on low-cost systems where the total disk space is 32 GB. This was due to the OS image containing all the offline knowledge-library content and apps such as encyclopedias. A ton of work went into improving disk space usage across Flatpak, OSTree and GNOME Software.

  • Eliminating flatpak’s temporary use of double the disk space on system pulls - During a ref pull for the system repo, flatpak initially pulled into a temporary child repo and then verified/copied each object into the system repo. This caused transient double-disk-space usage: an encyclopedia app of size 6 GB required 12 GB of free disk space in order to get installed. To eliminate this problem, a special FUSE filesystem, revokefs-fuse, was introduced to mimic revoke(path), together with a trustable-but-not-root user as the initial owner of the files being written to disk. When all objects have been pulled in, the FUSE mount is unmounted (this makes sure there are no open FDs to the path, so no changes can be made to the underlying pulled-in objects), the objects are verified and their permissions canonicalized (checking for setuid bits), and finally they are hard-linked into the system repo (hard links are faster than copying). Overall this improves system-pull speeds while reducing I/O by ~50% at the same time. A more detailed version of the approach is documented here, and you can also check out the flatpak source.

  • “min-free-space-size=X GB” config parameter for the OSTree repo - A gatekeeper check that makes sure a pull doesn’t fill up the user’s entire disk space in the background. It plays an important role when the user has opted in to auto-updates of apps/runtimes or the OS itself.

  • Low-disk-space heuristics - Disk-space heuristics and special behavior in GNOME Software to detect whether the user will run out of disk space if they proceed with an install or update operation (see the sketch after this list).
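
As an illustration of the kind of check involved (a GIO-based sketch under my own assumptions, not GNOME Software’s actual heuristics; the reserve threshold is made up), querying free space before an install might look like this:

    #include <gio/gio.h>

    /* Sketch: return TRUE if installing `needed_bytes` under `path` would leave
     * less than `reserve_bytes` of free space. The thresholds are illustrative. */
    static gboolean
    would_run_out_of_space (const char *path,
                            guint64     needed_bytes,
                            guint64     reserve_bytes)
    {
      GFile *file = g_file_new_for_path (path);
      GError *error = NULL;
      GFileInfo *info =
        g_file_query_filesystem_info (file, G_FILE_ATTRIBUTE_FILESYSTEM_FREE,
                                      NULL, &error);
      gboolean result = FALSE;

      if (info == NULL) {
        g_warning ("Could not query free space for %s: %s", path, error->message);
        g_error_free (error);
      } else {
        guint64 free_bytes =
          g_file_info_get_attribute_uint64 (info, G_FILE_ATTRIBUTE_FILESYSTEM_FREE);
        result = free_bytes < needed_bytes + reserve_bytes;
        g_object_unref (info);
      }

      g_object_unref (file);
      return result;
    }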

I covered this in detail in my earlier post.

Password peeking support in GNOME Shell:

Endless ground reports always included requests for password-peeking functionality, so it was already one of the top priorities at Endless. Following the upstream-first strategy, GNOME 3.36 is shipping password-peeking support. This was developed entirely on Endless’ work time and was (as I have been told) one of the top items on the list of things to watch out for in the next release at the last GNOME Advisory Board meeting at FOSDEM.

Collaborating on GNOME’s new lock screen:

A few pieces around the new user avatar at login and on the lock screen were contributed out of my Endless work time, under the guidance of the GNOME Shell and Design teams. Not only did I start to enjoy developing the Shell (as it’s quite a challenge, plus fun), I also started to take more initiative for Shell-related work. To get these specific updates, I strongly recommend following the GNOME Shell and Mutter development blog!

Hooking up parental controls across the desktop

Parental controls have been one of the important focus areas for Endless for a while now, and I fondly remember working on them, as it was one of the most productive work periods of the journey. The project is authored and led by Philip Withnall, with the goal of making it available to the wider community.


Needless to say, all this was unconditionally accompanied by chasing endless bugs, crashes, and build or test failures across various system components, resolving release blockers, rebases, packaging, and dabbling with infrastructure to set up the pipeline.

Lately I have been doing some basic metrics restructuring work, which has been refreshing and also a nice break from doing GNOME-y work.

Developing and enhancing soft skills

To be honest, 2 years back when I started, I took soft skills for granted and assumed I pretty much had them - “How hard could it be, right?” I couldn’t have been more wrong. My myths were busted time and again over the course. It takes real work to communicate efficiently while constantly making sure that my agreement, disagreement, or comments don’t come across as a personal attack on a colleague, no matter where they sit in the corporate hierarchy or team. I tried to demonstrate a kind and humble attitude towards everyone and the work they did, and I have pretty much got the same in return. Looking back, I see how much growth I have internalized in this particular domain, which I think will help me maintain relationships with the people I have worked with over time.

On remote work and managing stress

My position as a desktop software developer has been 100% remote. I worked from India (UTC+5:30) most of the time. My experience with remote work has been positive so far, and my experience with a couple of Google Summer of Code rounds had already served as an eligibility check for “being able to work remotely” while I was interviewing for the company (Endless is my first company to work for, after graduating in 2017).

Now to the section on managing stress and burnout. It’s pretty much established that, at some point in one’s work life, one faces burnout and doesn’t know what to do about it. This was new for me as well. Work inevitably brings stress, especially when you are working against a release deadline, are stuck in a deep rabbit hole, or things are just not going as you would expect. Initially I used to go silent about things like this, fearing that it would affect my performance reviews and that I would be looked down upon as an engineer; but to my surprise, eventually opening up about it was one of the best things I have done. Not only was I provided with support, but I was made to understand that this is normal and everyone suffers from it at some point in their careers. All we can do is learn from it, listen to the warning signals, and take regular time off to recover.

Before closing on this note, I want to touch upon a couple more points about remote work that I think are valuable lessons learned:

  • Be clear in your communications even if it means over-communicating a bit.

  • Being remote is great and all, but also judge beforehand whether the work is async enough. If the work or its iterations require tight feedback loops and involve 2 or more people in opposite timezones, I (personally) have found it very difficult to work in that setup. This should not be a mandatory rule to follow, but I think prolonged periods of that kind of work might stress you out faster and should be avoided (or made more async), if possible.

On Endless’ next steps

As Endless moves ahead to restructure itself into a full-fledged non-profit foundation, I wish them nothing but huge success on their path forward. Certainly, considering all these years of work that have brought learnings and experiences about how the next billion users will interact with computing, the decisions made in the restructuring make total sense. I also want to give a huge shout-out to the entire team at Endless, who have helped me grow in so many aspects. Thank you so much.

Having said that, I will be looking for a job very soon. If you or your org is looking for a generalist systems developer with a track record in GNOME or associated technologies, I am eager to talk to you. Feel free to drop a comment down below or ping me on Twitter, LinkedIn, or by email at mailumangjain@gmail.com


Until the next post, Happy Hacking everyone!