May 14, 2021

Ramblings about GNOME development

Taking a quick look at Rust

I've looked a little at Rust, and the so-called "high-level ergonomics" of the language don't entirely suit me. Two examples:

  • Optional type inference when declaring a variable, used throughout the reference book and thus, I assume, in many Rust projects. When I read code, I like to know the types of variables, to have a better understanding.
  • Variable shadowing in the same scope, possibly with a different type each time. In short, it breaks the "one variable, one purpose" best practice that keeps code from becoming confusing. When we see a variable used somewhere in Rust code, we cannot look at "its" declaration to know its type; we must look at its last declaration (see the small example after this list).
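
To make the second point concrete, here is a small artificial example (mine, not from any real project) of the kind of shadowing Rust allows:

fn main() {
    let value = "42";                        // `value` is a &str here
    let value: i32 = value.parse().unwrap(); // same name, now an i32
    let value = value as f64 / 2.0;          // same name again, now an f64
    // To know the type of `value` at this point, the reader has to find
    // its *last* declaration, not "the" declaration.
    println!("{}", value);
}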

So, even though Rust has a very good compiler that checks a lot of things for us, what matters most to me when programming is to understand the code I write well, to convince myself that "this code is correct" before running the compiler and testing the change. And when I re-read my own code, a verbose language syntax like C's turns out to help me a lot.

Verbosity and redundancy are actually good and are used a lot in natural languages, both spoken and written. Redundancy in a programming language must be accompanied by a compiler that checks that the redundant parts match, and the text editor can come in handy too to facilitate things (but it must not be too smart, otherwise it gets in the way).

That said, I would like to like Rust better. If you do, please ignore me (the internet is full of contradictory opinions anyway).
Otherwise, adding some "syntactic pepper" to Rust would not hurt, to make its syntax a bit more verbose.

Back to C-plus-GLib: semi-OOP style? or creating GObject subclasses?

I still like the "C + GLib + GTK-Doc + Devhelp" combination for software development. But maybe that's because it's what I've practiced the most during the 2010s, and habits are hard to change.

What I don't really like, though, is creating lots of GObject subclasses, and writing GObject Introspection-friendly APIs (to take care of language bindings). It's a burden that GNOME library developers need to carry.

I said in the previous section that I like a verbose syntax, but subclassing a GObject in C is a little too verbose (boilerplate code). It needs to be generated with a tool (here is the one that I wrote: gobject-boilerplate scripts). And it's not really malleable code.

In the small glib-gtk-book that I wrote several years ago, I described in a chapter the "semi-OOP" C style used by GLib core (not GIO): a kind of simple object-oriented style in C, without using GObject. It doesn't require a lot of code to write your own semi-OOP class in C. But then in later chapters I recommended creating GObject subclasses. Time to revisit my copy :-) ?

The semi-OOP style is also described - and recommended - in the unfinished Scalable C book, coming from the ZeroMQ community (another community that uses the C language and has built libraries on top of it, with a lot of language bindings). The relevant passage from Scalable C:

"Any language that depends on inheritance leads you to build large monoliths. Worse, there are no reliable internal contracts. Change one piece of code and you can break a hundred.
I'll explain later how we design classes in C, so we get neatly isolated APIs. We don't need inheritance. Each class does some work. We wrap that up, expose it to the world. If we need to share code between classes, we make further APIs. This gives us layers of classes."
The "layers of classes" is basically the same idea as described in A DAG of components - for an internal architecture too (an essay that I wrote last year).

Improving C-plus-GLib?

So I think, when I have the choice, I'll still write code in C with GLib, using existing GObject classes like the ones provided in GIO, but without creating GObject subclasses. I can still document the semi-OOP-style classes with GTK-Doc and GI annotations, but without caring about the full GObject Introspection system and language bindings.

However, there is room for improvement in the tooling, to make the code more malleable.

Since each public symbol contains the namespace followed by the class name, renaming a class or the namespace is difficult to do both easily and robustly.

Solution: libclang to the rescue, to write robust refactoring CLI/scriptable tools (that can also be integrated into various text editors). What I implemented several years ago: gobject-renamer.sh (but it doesn't use libclang), and learning-libclang (hey, it's a start!).

Generic Conclusion

When we know something well, we also know its benefits and drawbacks well. We sometimes ask ourselves: is the grass greener elsewhere? It's nice to explore other worlds and see how things can be done differently. And then we come back to where we were, but with a changed outlook, new ideas, and, most importantly, renewed motivation!

Additional notes

You may wonder, what's next? Sometimes I don't even know myself, but it may involve: CMake, TeX and ConTeXt, rewriting the glib-gtk-book(?), libclang, stuff like that. As time permits, if my health condition is good enough, if it rains sufficiently to not blame myself for not going outside enough, if gravitational waves are on my side, and if there is not too much social pressure.

Sometimes I say to myself that computer science is still a young field, relative to other engineering disciplines. Are language features like type inference actually a good thing? Clearly, removing classes of bugs with a good compiler or static code analysis is a good thing. Having self-documenting code is also a good thing. But in a few centuries, developers will laugh at us, for sure :-)

Thoughts about WebAssembly outside-the-browser, inside-the-desktop

Some reflections about WebAssembly, the Bytecode Alliance and desktop application development.

To know more about the Bytecode Alliance (WebAssembly outside-the-browser), you can read this nice article by Mozilla.

Note that I don't plan to work on this; it's just some thoughts about what the future could bring us. If someone is interested, this would be a really, really nice project that would totally change the landscape of native desktop application development. I'm convinced that the solution isn't Rust or C++, or C# or Java, or whichever new language appears in 20 years and makes Rust obsolete; the solution is some piece of infrastructure such as nanoprocesses.

GNOME, C/GObject and language bindings

Historically, the GNOME project has chosen the C language to develop its libraries, because it's easier to implement language bindings: to be able to use these libraries from other languages.

The GObject library, at the heart of GNOME, has been implemented with language bindings in mind. Writing C/GObject code requires extra work from the developer because of language bindings, with the advantage that, afterwards, the library is automatically usable from a wide range of languages thanks to GObject Introspection.

So that is the strategy used by GNOME and GTK to provide an attractive development platform for application developers: the latter can ideally use their favorite language to develop an application.

An attractive platform for application developers

So I've talked about one advantage of the GNOME development platform for application developers: language bindings.

But that alone is not enough to provide an attractive development platform. To be able to write new applications quickly, conveniently, and with enough flexibility, you need high-level, domain-specific libraries, while still having access to lower-level APIs when the need arises.

And this is where GNOME could be much improved! Providing high-level APIs for various kinds of application domains. You don't like that new UI trend? No problem, you can just fork the application, and assemble the UI with different widgets. You want to create a specialized application (in contrast to the general-purpose one that GNOME provides)? No problem, use this framework and configure it that way.

In other words, with an attractive development platform, it's possible to easily create different, but similar, applications in the same domain. Or, in short, a software product line.

Example in a field that I know well, text editors:

  • First choice to make:
    • Either a general-purpose text editor: it can be used for different programming languages.
    • Or a specialized text editor: it targets a specific programming language or platform.
  • Second choice to make:
    • Either a traditional UI.
    • Or a smartphone-like UI.

The two choices are orthogonal. With a software product line, it would be possible to have all four combinations, with each application having a small amount of code.

(It would also put an end to the endless discussions and disagreements about which UI is best: just create one application following the designers' wishes, and if a developer disagrees, he or she can create another application. With the code written as a library, and a small codebase for each application, this is doable.)

But is C/GObject an attractive platform for library developers?

The conclusion of the previous section is that most of the code should be written in a library fashion, like the LLVM project does. This prevents reinventing the wheel and enables new low-hanging fruit. Another name for this is "re-targetable code": you can leverage the codebase for new purposes, as needs arise and as requirements change.

But for GNOME, writing almost all the code in C/GObject is, well, how to say it, a little cumbersome. The C language is not memory-safe, and C/GObject is hard to learn and not used by many developers in the world.

Nanoprocesses, a solution

Back to the WebAssembly and the Bytecode Alliance topic.

With the concept of nanoprocesses applied to a desktop application, this would bring the benefit of being able to write libraries in various languages other than C, while still having good performance (in contrast to using a more heavyweight IPC solution like D-Bus, which already permits developing services in other languages, since the services run in their own OS processes).

Inter-nanoprocess communication isn't much slower than regular function calls, as explained in the Mozilla article linked above. So the primary purpose of C/GObject (language bindings) would no longer be relevant.
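
As a purely illustrative sketch (not something GNOME is doing; the module and function names are made up for the example), this is roughly what calling into a WebAssembly "library" from a native host looks like with the Bytecode Alliance's wasmtime crate: a direct, typed call into sandboxed code, with no separate OS process and no serialization step as with D-Bus.

use wasmtime::{Engine, Instance, Module, Store};

fn main() -> anyhow::Result<()> {
    let engine = Engine::default();
    // A tiny wasm "library", written inline in the text format for the example.
    let module = Module::new(
        &engine,
        r#"(module
             (func (export "add") (param i32 i32) (result i32)
               local.get 0
               local.get 1
               i32.add))"#,
    )?;
    let mut store = Store::new(&engine, ());
    let instance = Instance::new(&mut store, &module, &[])?;
    // Look up the exported function with a typed signature and call it.
    let add = instance.get_typed_func::<(i32, i32), i32>(&mut store, "add")?;
    println!("2 + 3 = {}", add.call(&mut store, (2, 3))?);
    Ok(())
}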

On a more personal note, I don't know recent web technologies well, but Mozilla's article about the Bytecode Alliance has convinced me. I'm also convinced that I don't want to develop primarily in C/GObject for at least the next decade. I developed heavily in C/GObject from 2012 or so, for at least 6 years, and then less heavily, with ups and downs. My focus during these years was to make more code re-usable (which means creating new libraries with higher-level APIs, and losing a few feathers in the process). And, no, sometimes I don't have the same opinion as most other GNOME developers or designers. But I know that I did good work, that I was going in the right direction architecture-wise, and I have received many thank-you emails from users over the years.

This article outlines potential solutions, actionable things, to stay relevant compared to web applications, the cloud, etc. Because I still prefer "normal" desktop applications, for which I own my data, which are privacy-respecting, without ads, etc. etc. ;-)

Short tutorial: Digital Television with GStreamer (ATSC setup)

GStreamer support for Digital Television (DTV) is still working. If you follow a few very simple steps you can get DTV signals tuned to, captured, decoded and displayed with your computer and a few accessories.

This short article presents basic instructions for a working DTV setup. I use ATSC in the US as an example, but the short process should work equally well for other terrestrial delivery systems and countries.

What hardware do you need?

Beyond a computer, you will need an antenna and a capture device. I recently tested a Hauppauge WinTV-dualHD (ATSC/QAM) USB dongle with a 5-year-old Amazon Basics flat indoor antenna (that looks pretty much like this one) and it worked quite well at roughly 60 km from the repeater station.

Installation

The hardware setup is simple. Install the antenna pointing to your repeater station and as high as you can. Your local telecommunications and/or broadcasting association should have data about repeaters near you. You can get this information by ZIP code from the Federal Communications Commission (FCC) if you are in the US.

Don't stress about the "as high as you can" part of the antenna installation. It's great to have an uninterrupted and direct line of sight to the repeating station's antenna, but this is hard to accomplish in urban areas. My current test system is indoors, on a first floor, and it works fine for a few channels.

Software

You will need GStreamer. I'm building it from the master branch with gst-build but the DTV code hasn't changed much in the last few years so any recently-packaged version should work for you. You will also need dvbv5-scan from DVBv5 tools.

TL;DR

Build a channels.conf configuration file by scanning an initial set of frequencies for your area with dvbv5-scan. You can get initial-frequencies files from the dtv-scan-tables repository. Here I'm using my Mountain View, California file as an example:

$ dvbv5-scan -o channels.conf ./dtv-scan-tables/atsc/us-CA-Mountain-View

If you grep the resulting file you can see what channels were detected:

$ grep "\\[" channels.conf

The output should look something like this:


[KNTV-HD]
[Hai Le]
[KPIX-TV]
[KQED-HD]
[KTVU-HD]
[ION]
[KFSF-HD]
[KICU-HD]


If you get no channels and you have reason to believe your hardware is working correctly, try repositioning your antenna and rescanning until you get some. Small direction and position changes can have big effects on VHF/UHF reception with semi-directional antennas like the flat one I'm using for this example.

To play back a channel with GStreamer you can use the generated channels.conf configuration file and any of the scanned-channel names to let playbin/dvbbasebin figure out all necessary parameters at runtime:

$ GST_DVB_CHANNELS_CONF=channels.conf gst-play-1.0 dvb://KNTV-HD

And that's it.

There are lots of details, pitfalls and options I don't write about in this short tutorial but if there's some interest I will try to discuss the subject in more depth in future ones.

I leave you with two screenshots of the HD feed by KNTV in the Bay Area.

May 13, 2021

cross-module inlining in guile

Greetings, hackers of spaceship Earth! Today's missive is about cross-module inlining in Guile.

a bit of history

Back in the day... what am I saying? I always start these posts with loads of context. Probably you know it all already. 10 years ago, Guile's partial evaluation pass extended the macro-writer's bill of rights to Schemers of the Guile persuasion. This pass makes local function definitions free in many cases: if they should be inlined and constant-folded, you are confident that they will be. peval lets you write clear programs with well-factored code and still have good optimization.

The peval pass did have a limitation, though, which wasn't its fault. In Guile, modules have historically been a first-order concept: modules are a kind of object with a hash table inside, which you build by mutating. I speak crassly but that's how it is. In such a world, it's hard to reason about top-level bindings: what module do they belong to? Could they ever be overridden? When you have a free reference to a, and there's a top-level definition of a in the current compilation unit, is that the a that's being referenced, or could it be something else? Could the binding be mutated in the future?

During the Guile 2.0 and 2.2 cycles, we punted on this question. But for 3.0, we added the notion of declarative modules. For these modules, bindings which are defined once in a module and which are not mutated in the compilation unit are declarative bindings, which can be reasoned about lexically. We actually translate them to a form of letrec*, which then enables inlining via peval, contification, and closure optimization -- in descending order of preference.

The upshot is that with Guile 3.0, top-level bindings are no longer optimization barriers, in the case of declarative modules, which are compatible enough with historic semantics and usage that they are on by default.

However, module boundaries have still been an optimization barrier. Take (srfi srfi-1), a little utility library on lists. One definition in the library is xcons, which is cons with arguments reversed. It's literally (lambda (cdr car) (cons car cdr)). But does the compiler know that? Would it know that (car (xcons x y)) is the same as y? Until now, no, because no part of the optimizer will look into bindings from outside the compilation unit.

mr compiler, tear down this wall

But no longer! Guile can now inline across module boundaries -- in some circumstances. This feature will be part of a future Guile 3.0.8.

There are actually two parts to this. One is that the compiler can identify a set of "inlinable" values from a declarative module. An inlinable value is a small copyable expression. A copyable expression has no identity (it isn't a fresh allocation), and doesn't reference any module-private binding. Notably, lambda expressions can be copyable, depending on what they reference. The compiler then extends the module definition that's residualized in the compiled file to include a little procedure that, when passed a name, will return the Tree-IL representation of that binding. The design of that was a little tricky; we want to avoid overhead when using the module outside of the compiler, even relocations. See compute-encoding in that module for details.

With all of that, we can call ((module-inlinable-exports (resolve-interface '(srfi srfi-1))) 'xcons) and get back the Tree-IL equivalent of (lambda (cdr car) (cons car cdr)). Neat!

The other half of the facility is the actual inlining. Here we lean on peval again, causing <module-ref> forms to trigger an attempt to copy the term from the imported module to the residual expression, limited by the same effort counter as the rest of peval.

The end result is that we can be absolutely sure that constants in imported declarative modules will inline into their uses, and fairly sure that "small" procedures will inline too.

caveat: compiled imported modules

There are a few caveats about this facility, and they are sufficiently sharp that I should probably fix them some day. The first one is that for an imported module to expose inlinable definitions, the imported module needs to have been compiled already, not loaded from source. When you load a module from source using the interpreter instead of compiling it first, the pipeline is optimized for minimizing the latency between when you ask for the module and when it is available. There's no time to analyze the module to determine which exports are inlinable and so the module exposes no inlinable exports.

This caveat is mitigated by automatic compilation, enabled by default, which will compile imported modules as needed.

It could also be fixed for modules by topologically ordering the module compilation sequence; this would allow some parallelism in the build but less than before, though for module graphs with cycles (these exist!) you'd still have some weirdness.

caveat: abi fragility

Before Guile supported cross-module inlining, there was only explicit inlining across modules in Guile, facilitated by macros. If you write a module that has a define-inlinable export and you think about its ABI, then you know to consider any definition referenced by the inlinable export, and you know by experience that its code may be copied into other compilation units. Guile doesn't automatically recompile a dependent module when a macro that it uses changes, currently anyway. Admittedly this situation leans more on culture than on tools, which could be improved.

However, with automatically inlinable exports, this changes. Any definition in a module could be inlined into its uses in other modules. This may alter the ABI of a module in unexpected ways: you think that module C depends on module B, but after inlining it may depend on module A as well. Updating module B might not update the inlined copies of values from B into C -- as in the case of define-inlinable, but less lexically apparent.

At higher optimization levels, even private definitions in a module can be inlinable; these may be referenced if an exported macro from the module expands to a term that references a module-private variable, or if an inlinable exported binding references the private binding. But these optimization levels are off by default precisely because I fear the bugs.

Probably what this cries out for is some more sensible dependency tracking in build systems, but that is a topic for another day.

caveat: reproducibility

When you make a fresh checkout of Guile from git and build it, the build proceeds in the following way.

Firstly, we build libguile, the run-time library implemented in C.

Then we compile a "core" subset of Scheme files at optimization level -O1. This subset should include the evaluator, reader, macro expander, basic run-time, and compilers. (There is a bootstrap evaluator, reader, and macro expander in C, to start this process.) Say we have source files S0, S1, S2 and so on; generally speaking, these files correspond to Guile modules M0, M1, M2 etc. This first build produces compiled files C0, C1, C2, and so on. When compiling a file S2 implementing module M2, which happens to import M1 and M0, it may be that M1 and M0 are provided by compiled files C1 and C0, or possibly they are loaded from the source files S1 and S0, or C1 and S0, or S1 and C0.

The bootstrap build uses make for parallelism, with each compile process starting afresh, importing all the modules that comprise the compiler and then using them to compile the target file. As the build proceeds, more and more module imports can be "serviced" by compiled files instead of source files, making the build go faster and faster. However this introduces system-specific nondeterminism as to the set of compiled files available when compiling any other file. This strategy works because it doesn't really matter whether module M1 is provided by compiled file C1 or source file S1; the compiler and the interpreter implement the same language.

Once the compiler is compiled at optimization level -O1, Guile then uses that freshly built compiler to build everything at -O2. We do it in this way because building some files at -O1 then all files at -O2 takes less time than going straight to -O2. If this sounds weird, that's because it is.

The resulting build is reproducible... mostly. There is a bug in which some unique identifiers generated as part of the implementation of macros can be non-reproducible in some cases; disabling parallel builds seems to solve the problem. The issue is that gensym (or equivalent) might be called a different number of times depending on whether you are loading a compiled module, or whether you need to read and macro-expand it. The resulting compiled files are equivalent under alpha-renaming but not bit-identical. This is a bug to fix.

Anyway, at optimization level -O1, Guile will record inlinable definitions. At -O2, Guile will actually try to do cross-module inlining. We run into two issues when compiling Guile; one is if we are in the -O2 phase, and we compile a module M which uses module N, and N is not in the set of "core" modules. In that case depending on parallelism and compile order, N may be loaded from source, in which case it has no inlinable exports, or from a compiled file, in which case it does. This is not a great situation for the reliability of this optimization. I think probably in Guile we will switch so that all modules are compiled at -O1 before compiling at -O2.

The second issue is more subtle: inlinable bindings are recorded after optimization of the Tree-IL. This is more optimal than recording inlinable bindings before optimization, as a term that is not inlinable due to its size in its initial form may become small enough to inline after optimization. However, at -O2, optimization includes cross-module inlining! A term that is inlinable at -O1 may become not inlinable at -O2 because it gets slightly larger, or vice-versa: terms that are too large at -O1 could shrink at -O2. We don't even have a guarantee that we will reach a fixed point even if we repeatedly recompile all inputs at -O2, because we allow non-shrinking optimizations.

I think this probably calls for a topological ordering of module compilation inside Guile and perhaps in other modules. That would at least give us reproducibility, provided we avoid the feedback loop of keeping around -O2 files compiled from a previous round, even if they are "up to date" (their corresponding source file didn't change).

and for what?

People who have worked on inliners will know what I mean that a good inliner is like a combine harvester: ruthlessly efficient, a qualitative improvement compared to not having one, but there is a pointy end with lots of whirling blades and it's important to stop at the end of the row. You do develop a sense of what will and won't inline, and I think Dybvig's "Macro writer's bill of rights" encompasses this sense. Luckily people don't lose fingers or limbs to inliners, but inliners can maim expectations, and cross-module inlining more so.

Still, what it buys us is the freedom to be abstract. I can define a module like:

(define-module (elf)
  #:export (ET_NONE ET_REL ET_EXEC ET_DYN ET_CORE))

(define ET_NONE		0)		; No file type
(define ET_REL		1)		; Relocatable file
(define ET_EXEC		2)		; Executable file
(define ET_DYN		3)		; Shared object file
(define ET_CORE		4)		; Core file

And if a module uses my (elf) module and references ET_DYN, I know that the module boundary doesn't prevent the value from being inlined as a constant (and possibly unboxed, etc).

I took a look and on our usual microbenchmark suite, cross-module inlining doesn't make a difference. But that's both a historical oddity and a bug: firstly that the benchmark suite comes from an old Scheme world that didn't have modules, and so won't benefit from cross-module inlining. Secondly, Scheme definitions from the "default" environment that aren't explicitly recognized as primitives aren't inlined, as the (guile) module isn't declarative. (Probably we should fix the latter at some point.)

But still, I'm really excited about this change! Guile developers use modules heavily and have been stepping around this optimization boundary for years. I count 100 direct uses of define-inlinable in Guile, a number of them inside macros, and many of these are to explicitly hack around the optimization barrier. I really look forward to seeing if we can remove some of these over time, to go back to plain old define and just trust the compiler to do what's needed.

by the numbers

I ran a quick analysis of the modules included in Guile to see what the impact was. Of the 321 files that define modules, 318 of them are declarative, and 88 contain inlinable exports (27% of the total). Of the 6519 total bindings exported by declarative modules, 566 of those are inlinable (8.7%). Of the inlinable exports, 388 (69%) are functions (lambda expressions), 156 (28%) are constants, and 22 (4%) are "primitives" referenced by value and not by name, meaning definitions like (define head car) (instead of re-exporting car as head).

On the use side, 90 declarative modules import inlinable bindings (29%), resulting in about 1178 total attempts to copy inlinable bindings. 902 of those attempts are to copy a lambda expression in operator position, which means that peval will attempt to inline their code. 46 of these attempts fail, perhaps due to size or effort constraints. 191 other attempts end up inlining constant values. 20 inlining attempts fail, perhaps because a lambda is used for a value. Somehow, 19 copied inlinable values get elided because they are evaluated only for their side effects, probably to clean up let-bound values that become unused due to copy propagation.

All in all, an interesting endeavor, and one to improve on going forward. Thanks for reading, and catch you next time!

.C as a file extension for C++ is not portable

Some projects use .C as a file extension for C++ source code. This is ill-advised, because it can't really be made to work automatically and reliably. Suppose we have a file source.C with the following contents:

class Foo {
public:
    int x;
};

Let's compile this with the default compiler on Linux:

$ cc -c -o /dev/null source.C

Note that that command is using the C compiler, not the C++ one. Still, the compiler will autodetect the type from the extension and compile it as C++. Now let's do the same thing using Visual Studio:

$ cl /nologo /c source.C
source.C(1): Error C2061 Syntax error: Identifier 'Foo'
<a bunch of other errors>

In this case Visual Studio has chosen to compile it as plain C. The defaults of the two compilers are exact opposites, and that leads to problems.

How to fix this?

The best solution is to change the file extension to an unambiguous one. The following is a simple ZSH snippet that does the renaming part:

for i in **/*.C; do git mv ${i} ${i:r}.cpp; done

Then you need to do the equivalent in your build files with search-and-replace.

If that is not possible, you need to use the /TP compiler switch with Visual Studio to make it compile the source as C++ rather than C. Note that if you use this argument on a target, then all files are built as C++, even the plain C ones. This is unreliable and can lead to weird bugs. Thus you should rename the files instead.

May 11, 2021

DJI FPV Video Out

Glad I refrained from buying the overpriced DJI Smart Controller just to get video out from the FPV goggles. Turns out somebody figured out how it does it.

May 10, 2021

Adventures in graphics APIs

Various people are working on porting desktop virtualization UIs to GTK4. This typically involves virgl, and the GTK3 solution was to use GtkGLArea.

With GTK4, rendering is happening in GL anyway, so it should be enough to just wrap your content in a GdkTexture and hand it to GTK, either by using it as a paintable with GtkPicture, or with a GskTextureNode in your own snapshot() implementation.

dmabuf detour

This is a nice theory, but the practice is more complicated – the content is typically available as a dmabuf object, and with 4k rendering, you really want to avoid extra copies if you can help it. So we had to look at the available solutions for importing dmabufs as textures into GL without copies.

This turned into a quick tour through the maze of graphics APIs: OpenGL, EGL, GL ES, GLX, DRI, … the list goes on. In the end, it turns out that you can use EGL  to wrap a dmabuf into an EGLImage, and use the GL_OES_EGL_image extension to create a GL texture from it.

GLX to EGL

This works fine with our Wayland backend, which uses EGL. Unfortunately, our much older X11 backend has a GL implementation using GLX, and there doesn’t seem to be a way to get a dmabuf imported into a GLX context.

So we had to do a little bit of extra work, and make our X11 backend use EGL as well. Thankfully Emmanuele had an old unfinished branch with such a conversion from a few years ago, which could be made to work (after some initial head scratching about why it would not render anything – as is always the case when doing GL work).

The solution

It turns out that importing dmabufs with EGL can be done outside of GTK just fine, so we don’t need to add Linux-specific API for it. To save you the trouble of writing such code yourself, here is what I’ve come up with.

After we had already decided to port the X11 backend to EGL, I learned that another possibility for importing dmabufs might be to use DRI3PixmapFromBuffer to create an X11 pixmap from a dmabuf, turn it into a GLXPixmap and use glXBindTexImageEXT to make a texture.

Aren’t graphics APIs wonderful! :-)

May 09, 2021

Changing hidden/locked BIOS settings under Linux

This all started with a Mele PCG09. Before testing Linux on it, I took a quick look under Windows, and the device manager there showed an exclamation mark next to a Realtek 8723BS bluetooth device, so BT did not work. Under Linux I quickly found out why: the device actually uses a Broadcom Wifi/BT chipset, attached over SDIO resp. an UART for the Wifi and BT parts. The UART-connected BT part was described in the ACPI tables with a HID (Hardware-ID) of "OBDA8723", not good.

Now I could have easily fixed this with an extra initrd with a DSDT-override, but that did not feel right. There was an option in the BIOS, named "WIFI", which actually controls what HID gets advertised for the Wifi/BT. It was set to "RTL8723", which obviously is wrong, but the option was grayed out. So instead of going for the DSDT-override, I really wanted to be able to change that BIOS option and set it to the right value. Some duckduckgo-ing found this blogpost on changing locked BIOS settings.

The flashrom packaged in Fedora dumped the BIOS in one go, and after building UEFITool and ifrextract from source from their git repos I could extract the interface description for the BIOS Setup menus without issues (as described in the blogpost). Here is the interesting part of the IFR for changing the Wifi/BT model:


0xC521 One Of: WIFI, VarStoreInfo (VarOffset/VarName): 0x110, VarStore: 0x1, QuestionId: 0x1AB, Size: 1, Min: 0x0, Max 0x2, Step: 0x0 {05 91 53 03 54 03 AB 01 01 00 10 01 10 10 00 02 00}
0xC532 One Of Option: RTL8723, Value (8 bit): 0x1 (default) {09 07 55 03 10 00 01}
0xC539 One Of Option: AP6330, Value (8 bit): 0x2 {09 07 56 03 00 00 02}
0xC540 One Of Option: Disabled, Value (8 bit): 0x0 {09 07 01 04 00 00 00}
0xC547 End One Of {29 02}



So to fix the broken BT I need to change the byte at offset 0x110 in the "Setup" EFI variable, which contains the BIOS settings, from 0x01 to 0x02. Easy; one problem though: the "dd on /sys/firmware/efi/efivars/Setup-..." method described in the blogpost does not work on most devices. Most devices protect the BIOS settings from being modified this way by having 2 Setup-${GUID} EFI variables (with different GUIDs), hiding the real one and leaving a fake one which is only a couple of bytes large.

But the BIOS Setup menu itself is just another EFI executable, so how can this access the real Setup variable? The trick is that the hiding happens when the OS calls ExitBootServices to tell EFI it is ready to take over control of the machine. This means that under Linux the real Setup EFI variable has been hidden early on during Linux boot, but when grub is running it is still available! And there is a patch adding a new setup_var command to grub, which allows changing BIOS settings from within grub.

The original setup_var command picks the first Setup EFI variable it finds, but as mentioned already, in most cases there are 2, so later an improved setup_var_3 command was added which skips Setup EFI variables that are too small (as the fake ones are only a few bytes). After building an EFI version of grub with the setup_var* commands added, it is just a matter of booting into a grub commandline and then running "setup_var_3 0x110 2". From then on the BIOS shows the WIFI type as being AP6330, the ACPI tables report "BCM2E67" as HID for the BT, and just like that the bluetooth issue has been fixed.


For your convenience I've uploaded a grubia32.efi and a grubx64.efi with the setup_var patches added here. This is built from this branch at this commit (this was just a random branch which I had checked out while working on this).

The Mele PCG09 use-case for modifying hidden BIOS settings is a bit of a corner case. Intel Bay- and Cherry-Trail SoCs come with an embedded OTG XHCI controller to allow them to function as a USB device/gadget rather than only being capable of operating as a USB host. Since most devices ship with Windows and Windows does not really do anything useful with USB device controllers, this controller is disabled by most BIOSes and there is no visible option to enable it. The same approach from above can be used to enable the "USB OTG" option in the BIOS so that we can use this under Linux. Let's take the Teclast X89 (Windows version) tablet as an example. Extracting the IFR and then looking for the "USB OTG" function results in finding this IFR snippet:


0x9560 One Of: USB OTG Support, VarStoreInfo (VarOffset/VarName): 0xDA, VarStore: 0x1, QuestionId: 0xA5, Size: 1, Min: 0x0, Max 0x1, Step: 0x0 {05 91 DE 02 DF 02 A5 00 01 00 DA 00 10 10 00 01 00}
0x9571 Default: DefaultId: 0x0, Value (8 bit): 0x1 {5B 06 00 00 00 01}
0x9577 One Of Option: PCI mode, Value (8 bit): 0x1 {09 07 E0 02 00 00 01}
0x957E One Of Option: Disabled, Value (8 bit): 0x0 {09 07 3B 03 00 00 00}
0x9585 End One Of {29 02}



And then running "setup_var_3 0xda 1" on the grub commandline results in a new "00:16.0 USB controller: Intel Corporation Atom Processor Z36xxx/Z37xxx Series OTG USB Device" entry showing up in lspci.

Actually using this requires a kernel with UDC (USB Device Controller) support enabled, as well as some USB gadget drivers; at least the Fedora kernel does not have these enabled by default. On Bay Trail devices an external device-mode USB PHY is necessary for device-mode to actually work. On a kernel with UDC enabled you can check if your hardware has such a phy by doing: "cat /sys/bus/ulpi/devices/dwc3.4.auto.ulpi/modalias". If there is a phy, this will usually return "ulpi:v0451p1508". If you get "ulpi:v0000p0000" instead then your hardware does not have a device-mode phy and you cannot use gadget mode.

On Cherry Trail devices the device-mode phy is built into the SoC, so on most Cherry Trail devices this just works. There is one caveat though: the x5-z83?0 Cherry Trail SoCs only have one set of USB3 superspeed data lines, and this is part of the USB datalines meant for the OTG port. So if you have a Cherry Trail device with an x5-z83?0 SoC and it has a superspeed (USB3) USB-A port, then that port is using the OTG superspeed lines. When the OTG XHCI controller is enabled and the micro-USB gets switched to device-mode (which it also does when charging!), this will also switch the superspeed datalines to device-mode, disconnecting any superspeed USB device connected to the USB-A port. So on these devices you need to choose: you can either use the micro-USB in device-mode, or get superspeed on the USB-A port; you cannot use both at the same time.

If you have a kernel built with UDC support, a quick test is to run a USB-A to micro-B cable from a desktop or laptop to the tablet and then do "sudo modprobe g_serial" on the tablet. After this you should see a bunch of messages in dmesg on the desktop/laptop about a USB device showing up, ending with something like "cdc_acm 1-3:2.0: ttyACM0: USB ACM device". If you want you can run a serial-console on the tablet on /dev/ttyGS0 and then connect to it on the desktop/laptop at /dev/ttyACM0.

May 07, 2021

tar::Builder isn’t Send

I recently made a new project in Rust that is generating multiple bootable operating system disk image types from a "pristine" image with the goal of deduplicating storage.

At one point I decided to speed it up using rayon. Each thread here is basically taking a pristine base (read-only), doing some nontrivial computation and writing a new version derived from it. The code is using .par_iter().try_for_each(); here the rayon crate handles spinning up worker threads, etc.

That all worked fine.

Then later, due to some other constraints I realized it was better to support writing to stdout in addition. (This code needs to run in a container, and it’s easier to podman run --rm -i myimage --output stdout > newfile.iso instead of dealing with bind mounts.)

I came up with this:

enum OutputTarget<W: std::io::Write> {
    Stdout(W),
    Tar(tar::Builder<W>),
}

Basically if you’re only asking for one file, we output it directly. If you ask for multiple, we wrap them in a tarball.

But, it didn’t compile – there was an error message about tar::Builder not having the Send trait that pointed at the closure being passed to the rayon try_for_each(). I’ve been using Rust long enough that I understand Send and immediately realized the problem: multiple worker threads trying to concurrently write to the same tar stream just can’t work. (And the same is true for stdout, but the compiler can’t know for sure there’s only one thread in that case.)

But, I still wanted the parallelism from doing the actual file generation. Some refactoring to more cleanly split up "generate files" from "output files" would have been cleanest, and probably not hard.

But this project was still in the fast iteration/prototyping phase, so I decided to just wrap the OutputTarget enum in an Arc<Mutex<OutputTarget>> – and that compiled and worked fine. The worker threads still parallelize generation, then serialize output.
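
For the curious, here is a minimal sketch of that shape (the generate_file() helper, the bounds and the error handling are mine, not from the real project): generation runs in parallel on rayon's worker threads, and only the final write takes the lock.

use std::io::Write;
use std::sync::{Arc, Mutex};
use rayon::prelude::*;

enum OutputTarget<W: std::io::Write> {
    Stdout(W),
    Tar(tar::Builder<W>),
}

// Hypothetical stand-in for the expensive per-image work.
fn generate_file(name: &str) -> std::io::Result<Vec<u8>> {
    Ok(name.as_bytes().to_vec())
}

// W must be Send so that the Mutex is shareable across rayon's worker threads.
fn write_all<W: Write + Send>(
    names: &[String],
    target: &Arc<Mutex<OutputTarget<W>>>,
) -> std::io::Result<()> {
    names.par_iter().try_for_each(|name| {
        // Parallel part: generate the content without holding any lock.
        let data = generate_file(name)?;
        // Serialized part: lock only long enough to write the result out.
        let mut target = target.lock().unwrap();
        match &mut *target {
            OutputTarget::Stdout(w) => w.write_all(&data),
            OutputTarget::Tar(builder) => {
                let mut header = tar::Header::new_gnu();
                header.set_size(data.len() as u64);
                header.set_cksum();
                builder.append_data(&mut header, name, data.as_slice())
            }
        }
    })
}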

Other languages don’t do this

This project is one of those that honestly could have easily started in bash or Python too. Or Go. But those languages don’t have built-in concurrency protections.

Out of curiosity I just wrote a quick Python program to write to a tarfile from multiple threads. As expected, it silently generated a corrupted tarball with intermixed content. (At this point hopefully everyone knows basically to avoid threads in Python since they’re mostly useless due to the GIL, of course)

And also as expected, a lightly modified example of the code from the Go archive/tar example compiles fine, and generates corrupted output. Now this is a well known problem in Go given its heavy focus on concurrent goroutines, and to be fair go run -race does correctly find errors here. But there’s a bunch of tradeoffs involved there; the race detector is only probabilistic, you have to remember to use it in your CI tests, etc.

I’m really not saying anything here that hasn’t been said before of course. But this was my experience this week. And it’s experiences like this that remind me why I sunk so much time into learning Rust and using it for new projects.

The Internals of Fractal-Next

As you probably know already we are in the process of rewriting Fractal. Since my previous blogpost about Fractal-Next a lot has happened. We are now at the point where the bare minimum is working, including login, the sidebar, and the room history. These things are not totally complete yet, but it’s already possible to run Fractal-Next, and read and send messages.
The previous post was kept for a general audience, but this one goes into more technical details of the foundation we put in place for Fractal-Next. In this post I will try to provide a more detailed look into the machinery, to give newcomers a general idea of how Fractal works. This is just an introduction, and we are working on improving the docs for the entire project.

Fractal-Next uses GTK4 and therefore also glib's GObjects. Every major part of Fractal is implemented as a subclass of a GObject or a GtkWidget. This being said, anything that doesn't depend on GTK or glib will go, whenever possible, into the matrix-rust-sdk so that other projects can benefit from it. In return, this reduces the code maintained by the Fractal team and keeps a clear separation between UI code and backend code.
Fractal's old code base doesn't use subclassing or composite templates, because when Fractal was started the Rust bindings for GTK didn't have support for them yet. Today we can conveniently create subclassed objects with great support for composite templates to define the UI.

Subclassing and composite templates

We make heavy use of subclassing and composite templates. This allows us to create a GObject that contains the data and lets the object worry about keeping the data consistent with the Matrix homeserver. The object itself makes the appropriate calls to the SDK and then exposes the information as GObject properties. This way, without going into too much detail, any GtkWidget can simply bind to the object's property and doesn't have to worry about updating, requesting, or even the actual data it displays, because the binding keeps the widget and the object property in sync automatically.
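As a rough sketch of what such a binding looks like on the widget side (the "display-name" property is the one described here; the exact builder API differs a bit between gtk-rs releases):

use gtk::glib;
use gtk::prelude::*;

// Keep a GtkLabel showing an object's "display-name" property with no manual
// update code: whenever the property changes, the label follows.
fn bind_display_name(room: &impl IsA<glib::Object>, label: &gtk::Label) {
    room.bind_property("display-name", label, "label")
        .flags(glib::BindingFlags::SYNC_CREATE)
        .build();
}
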
We create GObject wrappers for many cases, but not for everything. Matrix is built around sending events to logical places called Rooms. Generally, Matrix events can't change their content, so we don't need to make sure they are kept up-to-date, and we can use the Rust struct directly instead of wrapping every single Matrix event in a GObject. Wrapping them would mean writing a lot of additional code for the definition and creation of each object. In short, it would result in a lot of overhead without any real benefit. Therefore we use the Rust structs returned by the SDK directly to create the event widgets.

Async calls

You may wonder how we handle async functions from the matrix-rust-sdk. It's simple: we created do_async() (thanks to slomo for the help making the function much nicer). The function takes two async closures: one that is executed on the tokio runtime, and a second one that is spawned on the main context used by GTK. Obviously this function can only be called from the thread where the main context is running, but that's not a problem for us since GTK can only run on a single thread and Fractal is mostly single-threaded as well. We only use tokio with multiple threads for the SDK.

For example, the method to get the display name of a room looks like the following code snippet. self.set_display_name() sets the display-name property and emits a notify signal.


fn load_display_name(&self) {
  let priv_ = imp::Room::from_instance(&self);
  let matrix_room = priv_.matrix_room.get().unwrap().clone();
  do_async(
    async move { matrix_room.display_name().await },
    clone!(@weak self as obj => move |display_name| async move {
      match display_name {
        Ok(display_name) => obj.set_display_name(Some(display_name)),
        Err(error) => error!("Couldn't fetch display name: {}", error),
      };
   }),
  );
}

Important objects and widgets

Let’s have a look at some of the important GObjects and GtkWidgets we created. In some cases we create subclassed objects and in some cases we subclass widgets that keep track of the data produced by the SDK. This is because in many cases the data has only one consumer, the widget, so it doesn’t make sense to create a GObject and a GtkWidget that essentially contain the same data. Note that a widget is also a GObject. However, whenever we have information that is consumed by multiple widgets (e.g. a Matrix room) we create an object so that we don’t need to perform the same calls to the SDK multiple times and aggregate it for each single widget.

  • Login is a widget that handles the login flow and produces a Session.
  • The Session is the core of the client. It handles the login to the Matrix server and the sync, obviously via the matrix-rust-sdk. It also creates the UI and every widget needed to display a logged-in matrix account is a child of the Session or a submodule in the rust world.
  • The Room is an object that stores all information pertaining to a Matrix room, e.g. the display name and timeline, and makes sure that the correct values are exposed as GObject properties.
    • The Timeline is a GListModel, the data structure behind GtkListView, containing all events that are shown in the room history.
  • User is an object that tracks a Matrix user and exposes the display name and the avatar conveniently as GObject properties. To work properly it needs to receive room member state events. A User needs to be specific to each room because Matrix users can have different names and avatars for different rooms.
  • Sidebar is the widget that contains the sidebar, built around a GtkListView.
  • Content is the widget that displays a room's content, including the room history. The room history uses a GtkListView that is connected to the Timeline.

All objects and widgets can be found in the Rust docs. Even though most entries don’t have any description yet, it should help you gain a deeper insight into the structure of the code.

Final words

Up to this point it was pretty much impossible to contribute to Fractal-Next because not much was set in stone and many parts still needed to be figured out. However, it's now at the point where people can start contributing and help make Fractal-Next as awesome as the old Fractal, and of course way beyond that. We already have a few new contributors. To name two: Veli Tasalı, who worked on improving things around meson and added the generation of the Rust docs to the CI, and Kévin Commaille, who worked on improving the look and feel of the sidebar, among other things.

Fractal-Next wouldn’t be possible without the awesome people working on the GTK Rust bindings, libadwaita, matrix-rust-sdk, and many more. Therefore a big thanks goes out to them. And of course, a huge thank you to NLnet and NEXT GENERATION INTERNET for sponsoring my work.

May 06, 2021

Soft unbricking Bay- and Cherry-Trail tablets with broken BIOS settings

As you may know I've been doing a lot of hw-enablement work on Bay- and Cherry-Trail tablets as a side-project for the last couple of years.

Some of these tablets have one interesting feature intended to "flash" Android on them. When turned on with both the volume-up and the volume-down buttons pressed at the same time, they enter something called DNX mode, which the tablet will then also print on the LCD panel; this is really just a variant of the Android fastboot protocol built into the BIOS. Quite a few models support this, although on Bay Trail it sometimes seems to be supported (it gets shown on the screen) but does not work, since many models which only shipped with Windows lack the external device/gadget-mode phy which the Bay Trail SoC needs to be able to work in device/gadget mode (on Cherry Trail the gadget phy has been integrated into the SoC).

So on to the topic of this blog post: I recently used DNX mode to unbrick a tablet which was dead due to its BIOS settings getting corrupted in a way where it would not boot and it was also impossible to enter the BIOS setup. After some duckduckgo-ing I found a thread about how in DNX mode you can upload a replacement for the efilinux.efi bootloader normally used for "fastboot boot", and how you can use this to upload a binary to flash the BIOS. I did not have a BIOS image of this tablet, so that approach did not work for me. But it did point me in the direction of a different, safer (no BIOS flashing involved) solution to unbrick the tablet.

If you run the following 2 commands on a PC with a Bay- or Cherry-Trail connected in DNX mode:

fastboot flash osloader some-efi-binary.efi
fastboot boot some-android-boot.img

Then the tablet will execute the some-efi-binary.efi. At first I tried getting an EFI shell this way, but this failed because the EFI binary gets passed some arguments about where in RAM it can find the some-android-boot.img. Then I tried booting a grubx64.efi file and that resulted in a grub commandline. But I had no way to interact with it, and replacing the USB connection to the PC with an OTG / USB-host cable with a keyboard attached did not result in working input.

So my next step was to build a new grubx64.efi with "-c grub.cfg" added to the commandline for the final grub2-mkimage step, embedding a grub.cfg with a single line in there: "fwsetup". This will cause the tablet to reboot into its BIOS setup menu. Note on some tablets you still will not have keyboard input if you just let the tablet sit there while it is rebooting. But during the reboot there is enough time to swap the USB cable for an OTG adapter with a keyboard attached before the reboot completes and then you will have working keyboard input. At this point you can select "load setup defaults" and then "save and exit" and voila the tablet works again.

For your convenience I've uploaded a grubia32.efi and a grubx64.efi with the necessary "fwsetup" grub.cfg here. This is build from this branch at this commit (this was just a random branch which I had checked out while working on this).

Note the image used for the "fastboot boot some-android-boot.img" command does not matter much, but it must be a valid android boot.img format file otherwise fastboot will refuse to try to boot it.

More doorbell adventures

Back in my last post on this topic, I'd got shell on my doorbell but hadn't figured out why the HTTP callbacks weren't always firing. I still haven't, but I have learned some more things.

Doorbird sell a chime, a network connected device that is signalled by the doorbell when someone pushes a button. It costs about $150, which seems excessive, but would solve my problem (ie, that if someone pushes the doorbell and I'm not paying attention to my phone, I miss it entirely). But given a shell on the doorbell, how hard could it be to figure out how to mimic the behaviour of one?

Configuration for the doorbell is all stored under /mnt/flash, and there's a bunch of files prefixed 1000eyes that contain config (1000eyes is the German company that seems to be behind Doorbird). One of these was called 1000eyes.peripherals, which seemed like a good starting point. The initial contents were {"Peripherals":[]}, so it seemed likely that it was intended to be JSON. Unfortunately, since I had no access to any of the peripherals, I had no idea what the format was. I threw the main application into Ghidra and found a function that had debug statements referencing "initPeripherals and read a bunch of JSON keys out of the file, so I could simply look at the keys it referenced and write out a file based on that. I did so, and it didn't work - the app stubbornly refused to believe that there were any defined peripherals. The check that was failing was pcVar4 = strstr(local_50[0],PTR_s_"type":"_0007c980);, which made no sense, since I very definitely had a type key in there. And then I read it more closely. strstr() wasn't being asked to look for "type":, it was being asked to look for "type":". I'd left a space between the : and the opening " in the value, which meant it wasn't matching. The rest of the function seems to call an actual JSON parser, so I have no idea why it doesn't just use that for this part as well, but deleting the space and restarting the service meant it now believed I had a peripheral attached.

The mobile app that's used for configuring the doorbell now showed a device in the peripherals tab, but it had a weird corrupted name. Tapping it resulted in an error telling me that the device was unavailable, and on the doorbell itself generated a log message showing it was trying to reach a device with the hostname bha-04f0212c5cca and (unsurprisingly) failing. The hostname was being generated from the MAC address field in the peripherals file and was presumably supposed to be resolved using mDNS, but for now I just threw a static entry in /etc/hosts pointing at my Home Assistant device. That was enough to show that when I opened the app the doorbell was trying to call a CGI script called peripherals.cgi on my fake chime. When that failed, it called out to the cloud API to ask it to ask the chime[1] instead. Since the cloud was completely unaware of my fake device, this didn't work either. I hacked together a simple server using Python's HTTPServer and was able to return data (another block of JSON). This got me to the point where the app would now let me get to the chime config, but would then immediately exit. adb logcat showed a traceback in the app caused by a failed assertion due to a missing key in the JSON, so I ran the app through jadx, found the assertion and from there figured out what keys I needed. Once that was done, the app opened the config page just fine.

Unfortunately, though, I couldn't edit the config. Whenever I hit "save" the app would tell me that the peripheral wasn't responding. This was strange, since the doorbell wasn't even trying to hit my fake chime. It turned out that the app was making a CGI call to the doorbell, and the thread handling that call was segfaulting just after reading the peripheral config file. This suggested that the format of my JSON was probably wrong and that the doorbell was not handling that gracefully, but trying to figure out what the format should actually be didn't seem easy and none of my attempts improved things.

So, new approach. Rather than writing the config myself, why not let the doorbell do it? I should be able to use the genuine pairing process if I could mimic the chime sufficiently well. Hitting the "add" button in the app asked me for the username and password for the chime, so I typed in something random in the expected format (six characters followed by four zeroes) and a sufficiently long password and hit ok. A few seconds later it told me it couldn't find the device, which wasn't unexpected. What was a little more unexpected was that the log on the doorbell was showing it trying to hit another bha-prefixed hostname (and, obviously, failing). The hostname contains the MAC address, but I hadn't told the doorbell the MAC address of the chime, just its username. Some more digging showed that the doorbell was calling out to the cloud API, giving it the 6 character prefix from the username and getting a MAC address back. Doing the same myself revealed that there was a straightforward mapping from the prefix to the mac address - changing the final character from "a" to "b" incremented the MAC by one. It's actually just a base 26 encoding of the MAC, with aaaaaa translating to 00408C000000.
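
As a rough illustration of that mapping, here is a small sketch based purely on the observations above ('a' = 0, the last character is least significant, and aaaaaa corresponds to 00408C000000):

def prefix_to_mac(prefix):
    """Map a 6-character chime username prefix to a MAC address.

    Assumes the behaviour described above: the prefix is read as a
    base-26 number ('a' = 0 ... 'z' = 25, most significant character
    first) and added to the base address 00:40:8C:00:00:00.
    """
    value = 0
    for ch in prefix.lower():
        value = value * 26 + (ord(ch) - ord("a"))
    return format(0x00408C000000 + value, "012x")

print(prefix_to_mac("aaaaaa"))  # 00408c000000
print(prefix_to_mac("aaaaab"))  # 00408c000001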

That explained how the hostname was being generated, and in turn I was able to work backwards to figure out which username I should use to generate the hostname I was already using. Attempting to add it now resulted in the doorbell making another CGI call to my fake chime in order to query its feature set, and by mocking that up as well I was able to send back a file containing X-Intercom-Type, X-Intercom-TypeId and X-Intercom-Class fields that made the doorbell happy. I now had a valid JSON file, which cleared up a couple of mysteries. The corrupt name was because the name field isn't supposed to be ASCII - it's base64 encoded UTF16-BE. And the reason I hadn't been able to figure out the JSON format correctly was because it looked something like this:

{"Peripherals":[]{"prefix":{"type":"DoorChime","name":"AEQAbwBvAHIAYwBoAGkAbQBlACAAVABlAHMAdA==","mac":"04f0212c5cca","user":"username","password":"password"}}]}


Note that there's a total of one [ in this file, but two ]s? Awesome. Anyway, I could now modify the config in the app and hit save, and the doorbell would then call out to my fake chime to push config to it. Weirdly, the association between the chime and a specific button on the doorbell is only stored on the chime, not on the doorbell. Further, hitting the doorbell didn't result in any more HTTP traffic to my fake chime. However, it did result in some broadcast UDP traffic being generated. Searching for the port number led me to the Doorbird LAN API and a complete description of the format and encryption mechanism in use. Argon2i is used to turn the first five characters of the chime's password (which is also stored on the doorbell itself) into a 256-bit key, and this is used with ChaCha20 to decrypt the payload. The payload then contains a six character field describing the device sending the event, and then another field describing the event itself. Some more scrappy Python and I could pick up these packets and decrypt them, which showed that they were being sent whenever any event occurred on the doorbell. This explained why there was no storage of the button/chime association on the doorbell itself - the doorbell sends packets for all events, and the chime is responsible for deciding whether to act on them or not.
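
The decryption step itself is simple enough to sketch in Python. Note that this sketch uses the argon2-cffi and pycryptodome packages, and that the salt, nonce and Argon2 cost parameters below are placeholders; the real values and the exact packet layout come from the published Doorbird LAN API document.

from argon2.low_level import hash_secret_raw, Type
from Crypto.Cipher import ChaCha20

def decrypt_event(password, salt, nonce, ciphertext,
                  time_cost=4, memory_cost=8192, parallelism=1):
    # Only the first five characters of the chime's password feed the KDF
    key = hash_secret_raw(
        secret=password[:5].encode(),
        salt=salt,
        time_cost=time_cost,        # placeholder cost parameters
        memory_cost=memory_cost,
        parallelism=parallelism,
        hash_len=32,                # 256-bit ChaCha20 key
        type=Type.I,                # Argon2i
    )
    return ChaCha20.new(key=key, nonce=nonce).decrypt(ciphertext)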

On closer examination, it turns out that these packets aren't just sent if there's a configured chime. One is sent for each configured user, avoiding the need for a cloud round trip if your phone is on the same network as the doorbell at the time. There was literally no need for me to mimic the chime at all, suitable events were already being sent.

Still. There's a fair amount of WTFery here, ranging from the strstr()-based JSON parsing and the invalid JSON to the symmetric encryption that uses device passwords as the key (requiring the doorbell to be aware of the chime's password) and the use of only the first five characters of the password as input to the KDF. It doesn't give me a great deal of confidence in the rest of the device's security, so I'm going to keep playing.

[1] This seems to be to handle the case where the chime isn't on the same network as the doorbell


May 05, 2021

Is the space increase caused by static linking a problem?

Most recent programming languages want to link all of their dependencies statically rather than using shared libraries. This has many implications, but for now we'll only focus on one: executable size. It is generally accepted that executables created in this way are bigger than when using shared linking. The question is how much bigger, and whether it even matters. Proponents of static linking say the increase is irrelevant given current computers and gigabit networks. Opponents are of the, well, opposite opinion. Unfortunately there are very few real-world measurements around for this.

Instead of arguing about hypotheticals, let's try to find some actual facts. Can we find a case where, within the last year or so, a major proponent of static linking has voluntarily switched to shared linking due to issues such as bandwidth savings? If such a case can be found, it would indicate that, yes, the binary size increase caused by static linking is a real issue.

Android WebView, Chrome and the Trichrome library

Last year (?) Android changed the way it provides both the Chrome browser and the System WebView app [1]. Originally the two were fully isolated, but after the change both depend on a new library called Trichrome, which is basically just a single shared library. According to news sites, the reasoning was this:

"Chrome is no longer used as a WebView implementation in Q+. We've moved to a new model for sharing common code between Chrome and WebView (called "Trichrome") which gives the same benefits of reduced download and install size while having fewer weird special cases and bugs."

Google has, for a long time, favored static linking. Yet, in this case, they have chosen to switch from static linking to shared linking on their flagship application on their flagship operating system. More importantly their reasons seem to be purely technical. This would indicate that shared linking does provide real world benefits compared to static linking.

[1] I originally became aware of this issue since this change broke updates on both of these apps and I had to fix it manually with apkmirror.

May 04, 2021

Spring Maps

Since it's been a while since the release of GNOME 40, I thought it might be time for another post.

Since the 40.0 release there has just been a bug fix release (40.1) which, among other things, fixed a bug where toggling a place as a favorite and then “unfavoriting” it again made it impossible to select that place again until restarting Maps.

And in master, leading towards 41, there have also been some goings-on.

 

A new Bookmarks button icon

First, one issue that was brought to our attention is that we've been using the star icon both for the favorites menu button in the header bar and for the button to mark a place as a favorite in a “place bubble”, and this gets a bit confusing when the new adaptive mobile mode is active, since in this case the favorites button moves to the bottom bar and lines up with the favorite button in the place bar shown above it when selecting a place.

So, to mitigate this we decided to adopt the same bookmark icon as used in GNOME Web (Epiphany) for the favorites menu button to keep them apart.

In the desktop layout it looks like this:


And when in the mobile “narrow mode”:


Here, the buttons can be seen horizontally aligned, with the top one being the button to mark the place as a favorite.

An overhaul of the search result icons

For the icons shown in the search results list we have always relied on the icons provided by the libgeocode-glib library. This has had some issues, however. For one, this icon set is quite limited, so most places have just received a generic pin icon.

So now these have all been replaced with fresh new icons from the GNOME icon catalog:


Jakub Steiner also drew up some brand new icons for, among other things, cafes (shown above), pubs, bars, cities, and towns.

These icons are also proper symbolic icons, meaning they adapt to the dark (night mode) theme as well:


In contrast to the old icons, which didn't look too great against a dark background:


Furthermore, bus stops, train stations, and similar will now use the same icons as were already used to render journey plans, so this looks more consistent.

I also couldn't resist adding a bit of fun as well, so now zoos get this little penguin icon:



Validating Wikipedia refs when editing OSM POIs

I also added validation for Wikipedia references when editing places. So now it will warn you if the reference does not follow the style language-code:Article title (such as “en:Article title“ for example).


The light bulb, as before, will show a hint explaining the format. And, like before, you can also paste in a URL from Wikipedia and it will automagically re-format it, so for example it converts https://en.wikipedia.org/wiki/The_article_title → en:The article title.
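
To make the expected format concrete, here is a small standalone sketch of that conversion and validation. Maps itself is written in GJS, so this is only an illustration of the transformation, not the actual code:

import re
from urllib.parse import unquote, urlparse

WIKI_REF = re.compile(r"^[a-z][a-z0-9-]*:.+$")  # e.g. "en:Article title"

def url_to_wiki_ref(url):
    """Turn https://en.wikipedia.org/wiki/The_article_title into 'en:The article title'."""
    parts = urlparse(url)
    lang = parts.netloc.split(".")[0]
    title = unquote(parts.path[len("/wiki/"):]).replace("_", " ")
    return f"{lang}:{title}"

def is_valid_wiki_ref(ref):
    return bool(WIKI_REF.match(ref))

print(url_to_wiki_ref("https://en.wikipedia.org/wiki/The_article_title"))
# -> en:The article title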

Improvements to libshumate

And over in libshumate (the new library we're working on to enable migrating to GTK 4.x and away from using Clutter), Georges Basile Stavracas Neto and James Westman have been working on a number of improvements. Smooth scrolling now works (meaning that when you let go of the mouse or touchpad gesture while in motion, the map view continues moving a bit with inertia), as does fractionally scaled zooming, which will be built upon later on when adding support for smooth animated zoom and pinch gestures for touch screens.


And that's that until next time!

May 03, 2021

Automating Project Maintenance on Github


Manual Maintenance is tough and boring

Most of the effort in the software business goes into the maintenance of the code that already exists. Once the software is built, many factors affect its performance over time. We need to fix bugs, address security vulnerabilities, make performance improvements, and decrease technical debt.

Managing a single piece of software is easy, but as developers we often have to deal with more than one, and this is exactly where maintenance gets hard. The best way to handle maintenance debt is to regularly upgrade the dependencies our projects rely on.

All these problems can be solved by automation. We at RavSam use Github for hosting our code and for CI/CD purposes. There are tools like Github Apps and Github Actions that allow us to automate our software maintenance.

Renovate - Automated Dependency Updates

Renovate is one of those packages that make our idea of automated maintenance a reality. It is a free, open-source, customizable Github app that automatically updates the dependencies in our software projects by opening pull requests, and it supports multiple languages.

Dependency Updates by Renovate

The best part is that we can write a single config and use it for all of our projects in our Github organization. Here is a config that we use at RavSam:

{
  "extends": ["config:base"],
  "labels": ["dependencies"],
  "major": {
    "enabled": false
  },
  "packageRules": [
    {
      "matchUpdateTypes": ["patch", "pin", "digest"],
      "automerge": true
    }
  ],
  "prCreation": "not-pending",
  "schedule": ["every weekend"],
  "stabilityDays": 3
}

We have configured Renovate to run only on weekends to prevent noise and distractions. We have enabled auto-merge when the update type is one of the following: patch, pin or digest.

Imgbot - Automated Image Optimization

The performance of a web app often depends on its images, so it is crucial to optimize them or risk losing customers. Another advantage of optimized images is that they reduce bandwidth costs, both for us and for our visitors.

We love Imgbot. It optimizes the images and creates a pull request against our default branch. Imgbot is verified by GitHub, so there is little reason to worry about its security.

Images optimized by ImgBot

RavSam Bot - A Github Probot App

We have built our own custom serverless Github Probot app, RavSam Bot, our employee #1. It helps us automate various tasks: raising the priority of issues, assigning confirmed issues to developers, assigning reviewers to pull requests, auto-merging pull requests once the changes have been approved, and more.

Approved pull request merged by RavSam Bot

Probot apps are easy to write, deploy, and share. We have deployed our app on Netlify Functions and it spends the entire day doing mundane tasks for us tirelessly.

GNOME Internet Radio Locator 5.0.0 with BBC (United Kingdom) on Fedora Core 34

GNOME Internet Radio Locator 5.0.0 with BBC (United Kingdom) features English and Asian language translations, a new, improved map marker palette with 188 other radio stations from around the world, and live audio streaming from the BBC implemented through GStreamer.

BBC – Radio 1
BBC – Radio 2
BBC – Radio 3
BBC – Radio 4
BBC – Radio 4 LW (UK only)
BBC – Radio 4 LW (non-UK)
BBC – Radio 5 live (UK only)
BBC – Radio 5 live (non-UK)
BBC – Radio 6 Music
BBC – Radio 1Xtra
BBC – Radio 4 Extra
BBC – Radio 5 Live sports extra (UK only)
BBC – Radio Asian Network
BBC – BBC CWR
BBC – BBC Essex
BBC – BBC Hereford Worcester
BBC – Radio Berkshire
BBC – Radio Bristol
BBC – Radio Cambridge
BBC – Radio Cornwall
BBC – Radio Cumbria
BBC – Radio Cymru
BBC – Radio Cymru 2
BBC – Radio Derby
BBC – Radio Devon
BBC – Radio Foyle
BBC – Radio Gloucestershire
BBC – Radio Guernsey
BBC – Radio Humberside
BBC – Radio Jersey
BBC – Radio Kent
BBC – Radio Lancashire
BBC – Radio Leeds
BBC – Radio Leicester
BBC – Radio Lincolnshire
BBC – Radio London
BBC – Radio Manchester
BBC – Radio Merseyside
BBC – Radio nan Gaidheal
BBC – Radio Newcastle
BBC – Radio Norfolk
BBC – Radio Northampton
BBC – Radio Nottingham
BBC – Radio Orkney
BBC – Radio Oxford
BBC – Radio Scotland FM
BBC – Radio Scotland MW
BBC – Radio Sheffield
BBC – Radio Shropshire
BBC – Radio Solent
BBC – Radio Solent West Dorset
BBC – Radio Somerset Sound
BBC – Radio Stoke
BBC – Radio Suffolk
BBC – Radio Surrey
BBC – Radio Sussex
BBC – Radio Tees
BBC – Radio Ulster
BBC – Radio Wales
BBC – Radio Wiltshire
BBC – Radio WM
BBC – Radio York
BBC – Three Counties Radio
BBC – BBC World Service (London, United Kingdom)

The project lives on www.gnomeradio.org and the Fedora 34 RPM packages of  version 5.0.0 of GNOME Internet Radio Locator are now also available for free:

gnome-internet-radio-locator.spec

gnome-internet-radio-locator-5.0.0-1.fc34.src.rpm

gnome-internet-radio-locator-5.0.0-1.fc34.x86_64.rpm

To install GNOME Internet Radio Locator 5.0.0 on Fedora Core 34 in GNOME Terminal, run the following installation command to resolve all dependencies:

sudo dnf install http://www.gnomeradio.org/~ole/fedora/RPMS/x86_64/gnome-internet-radio-locator-5.0.0-1.fc34.x86_64.rpm

To run GNOME Internet Radio Locator from GNOME Terminal, run the command

/usr/bin/gnome-internet-radio-locator

To inspect the source code and build the version 5.0.0 source tree, run

sudo dnf install gnome-common
sudo dnf install intltool libtool gtk-doc geoclue2-devel yelp-tools
sudo dnf install gstreamer1-plugins-bad-free-devel geocode-glib-devel
sudo dnf install libchamplain-devel libchamplain-gtk libchamplain
git clone http://gitlab.gnome.org/GNOME/gnome-internet-radio-locator
cd gnome-internet-radio-locator/
./autogen.sh
make install

Introducing Regento, marketing for FLOSS-centric companies and transitioning industries

In this blog post, I’m taking a quick break from my GTG blogging frenzy to talk about another one of my non-software projects from the past few years (estimated reading time: 3 ½ minutes).


Some may remember I previously introduced the Atypica collective as a dedicated place where I could showcase some of my video production work (instead of having everything and the kitchen sink into my own website).

Launching Atypica was a long-standing project of mine that had been put on the back-burner because of some of my intensive CMO work in recent years (such as this, for instance). Awakening from the lucid dream allowed me to re-enter a long R&D phase where I could not only shave a ton of “infrastructure and productivity” yaks (that’s a story for another blog post, including the ridiculous tale of the months of work it took to fix my personal blog), but also realign my business objectives and pick up where I had left off “the last time”, hence the delayed announcement of Atypica’s website launch.

But Atypica, besides being the kind of initiative that works much better outside a pandemic, is merely one of my creative interests, and therefore part of my broader business orientations.


Today I’m revealing part two (of potentially many): Regento, the fractional CMO agency regency. You could say we are Riskbreakers, but in the business sense. Because in business, it’s all fun and games until an expensive wyvern loses an eye.

We break down silos and reduce risk. Although we can dream of being digital nomads some day, this is no Vagrant Story.

“Why another brand, don’t you already do idéemarque,” you ask? Because, in addition to various practical and technical considerations:

  • This area of work is less about creative services and more about deeply involved strategic marketing and strategic business consulting than what you’d find in a typical tactical agency service offering.
  • It allows me to articulate my specific “CMO for hire” offering in a simplified way, instead of once again overloading the scope of my personal homepage
    • Until now, the feedback I received from some of the people who looked at my personal homepage was, “You do too many things, it’s unbelievable… therefore we don’t believe you, or don’t understand what you do”.
    • The thing is, my personal home page is a reflection of who I am as a human, laid bare for anyone who cares to dig… yet visitors landing on that page may all have different interests from each other (for example, some will only care about my open-source work, some will only care about my curriculum, some only care about my businesses, and some weird people might actually be interested in seeing my old illustration works), so it can’t really niche down that much without watering down some of its purpose and personality. As major Kusanagi once said, overspecialization is slow death.
  • Separating Regento from the rest lets me build different kinds of partnerships, the kind that are otherwise not available in a typical agency setting. Different team, different clientèle. And indeed, I am not alone in this endeavour (more on that below).

Double Regento all the way?
But what does it mean?! 🌈

One of the main reasons behind this name is that “regento” is the esperanto word for regent, which matches how I see my services as an “interim CMO for hire”, if you will pardon the royal metaphor here: I take care of your kingdom (business) by structuring and strengthening it until the heir (my replacement) comes of age and takes over at some point in time. This is, I posit, one of the most interesting ways for me to make an impact on the world of business: by helping multiple businesses flourish instead of through one endless reign in Leá Monde one industry.

When it comes to strategic work, I am very picky about who I surround myself with, so I am super excited to highlight the fact that (as can be seen on the Regento about/team page) I have partnered up with Halle Baksh, a good friend of mine whose business experience and wisdom I have great admiration for; our personalities also have great affinity. She is a wonderful human being (not a rogue A.I. like me) and a kindred spirit, and I look forward to continuing to work with her on making the world a slightly better place, one client’s business at a time.

Watch that space!

Although the Regento initiative has been going on since June 2019, it is only now that I’m announcing its existence (again: yak shavings). The Regento website is pretty embryonic (you could almost say it’s a placeholder), but it’s better to have something than to wait for perfection. The design is not where I would want it to be and the contents are “bare minimum” (there are only two articles published for now, and no case studies or testimonials have been written yet), but “perfect” is the enemy of “good”, and all will come in due time.


If you are interested in insights about business management & business development best practices, high-tech startups (the good, the bad, the ugly), sustainable businesses, or any other topic you may suggest (send me an email!), you may decide to subscribe to the contents email notification list on Regento’s website. As usual: no spam, no data capitalism, just the convenience of not having to remember to check periodically “manually” for updates. And we always enjoy knowing that whatever we write gets read.

April 29, 2021

How to Symbolic Icon

Icon Tooling

Unlike application icons, the symbolic icons don’t convey application identity, but rely on visual metaphor to describe an action (what a button in the UI does).

GNOME has not used full-color icons in toolbars and most of the UI for many years. Instead we use symbols, adjusting their legibility and rendering the same way we do with text (recoloring the foreground and background, as you can see by switching the dark theme on this blog post).


But how does one create such an icon? Here’s a walkthrough of the process, using our 2021 tooling. While the actual drawing of shapes still happens in Inkscape, the workflow is now heavily supported by a suite of design tools.

Before we dive into creation though, let’s start with a more common case: developers often just want to pick and use an existing icon rather than attempting to create one or commission a designer.

Finding an Icon

Historically, looking up icons has been a matter of familiarizing yourself with the icon naming spec, which was built on the concept of semantic naming. However, it turns out developers really just want that symbol that looks like a door, rather than adhering to the strict semantic constraints. Combined with icon themes and evolving visual trends this semantic dream gradually faded over time.

The very basic set of icons is provided directly by the toolkit. For the most part it still adheres to the semantic names such as edit-copy or menu-open rather than descriptive names like scissors or pencil. The coverage of the set is quite conservative and you’re likely to need something that isn’t provided by GTK itself.

Sounds like a lot of work? Lucky for you we have Icon Library to make this easy!

Icon Library

The app can help you not only to search for an appropriate icon using keywords, it can also assist you in distributing the assets along with your app (as a gresource). While it’s on you as a developer to maintain the assets you bundle, your app will no longer break because you used the wrench icon when the designers followed a trend of using a pegged wheel for preferences or decided to drop an old or unused icon.

Icon Library

It’s worth noting the app is also quite useful for designers, since you can also copy & paste icons into mockups easily.

The application is currently being ported to GTK 4 so there’s a lot of moving parts in the air at the moment, but the latest release on Flathub is perfectly usable.

Creation

So let’s assume the available icons don’t really fit your needs and you need to create one from scratch. I should boldly proclaim that, with no API-like maintenance burden, the design team is more than happy to fulfill app developers’ requests. This goes especially for apps aiming to be listed in GNOME Circle; feel free to request an icon on Gitlab.

The first step is always to figure out the metaphor. Sketching out some ideas on paper is a solid recommendation. Even if you think you’re no good at sketching, try it. In fact, especially when you’re not good at sketching, the process will help you identify very strong metaphors that convey the meaning even with a few squiggly lines. The more definition you provide, the harder it is to tell whether the function is being overshadowed by the form.

Sketching

Once you’re somewhat convinced a fitting metaphor can be executed on the 16x16 pixel grid, it’s time to reach for the tool that will guide you throughout the process, Symbolic Preview.

A Handful

Let’s assume you have a small app that is very early in development and you decided to make it easy to report issues with it and need a report bug icon for the headerbar.

Pencil

Symbolic Preview allows you to create either a single symbolic or a sheet with many symbolics. In our first case, we’ll go for just one.

Single Symbolic

After providing a name for your icon (prefixing with app name can avoid some theming issues) a template is saved. Currently you have to do some file system surfing to open up the asset in Inkscape.

File Naming

Symbolic preview can be used as its name suggests. Currently it doesn’t auto-reload on changes, but hopefully that will come soon.

There are some basic guidelines on how to produce a GNOME symbolic icon, but to provide a tl;dr:

  • Use filled shapes to provide enough contrast
  • Maintain the overall contrast not going edge to edge
  • The main outline should not use a hairline stroke; 2px strokes are advised
  • Convert strokes to outlines. The old toolchain allowed the use of strokes as long as no fill was used on the same object, but this has been the source of many misrenderings. Just convert all strokes to outlines with Path>Stroke to Path in Inkscape.

Symbolic Preview

The app provides different color contexts and displays your icon among existing symbolics, not unlike what App Icon Preview does for app icons. Again, this is when you only need to ship a handful of icons with your app.

Your needs may, however, exceed just a few. Maintaining lots of individual icons in separate files becomes daunting. Also, there is no need to see your icons in the context of others when they form a large enough set on their own.

The same app we just used to create and test a single symbolic icon can be used to maintain a large set of symbolics in a single library file. The icon-development-kit that is the source for most of the icons you can search with Icon Library is also maintained this way. Let’s take a look at how you can maintain a whole set of symbolic icons.

A Lot

Unlike with individual icons, for preview and export to work this workflow is very Inkscape-centered, and there are numerous constraints to be mindful of, so let’s dive in.

Export Conventions

There are a few conventions you need to follow in order for the export to work. The icon devel kit describes the conventions in detail, but let’s sum them up here (a small validation sketch follows the list):

  • Each icon is a group (<g>). To force its size to be 16x16px, the icon should include a <rect> with no stroke or fill, that is exactly 16x16 and aligns to the pixel grid. This rectangle is removed during export.
  • The name of the icon is taken from the title property of the group in the object properties in Inkscape (Object>Object Properties). (<title/>).
  • If you wish to omit an icon in the preview and export, don’t provide a name for it or append -old to it. This can be useful for some ‘build assets’.
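
As a quick way to double-check those conventions, here is a small sketch (standard library only) that lists which groups in a sheet would actually be picked up for export; the file name is made up:

import xml.etree.ElementTree as ET

SVG = "{http://www.w3.org/2000/svg}"

def exportable_icons(path):
    """Return the names of icons in a sheet that would be exported."""
    names = []
    for group in ET.parse(path).getroot().iter(SVG + "g"):
        title = group.find(SVG + "title")
        if title is None or not (title.text or "").strip():
            continue  # unnamed groups are skipped by the export
        name = title.text.strip()
        if name.endswith("-old"):
            continue  # '-old' icons are deliberately omitted
        names.append(name)
    return names

print(exportable_icons("symbolics-sheet.svg"))  # hypothetical file name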

Metadata

While it’s mainly useful for maintaining a collection like icon-development-kit, you will save yourself time looking up your own assets if you provide some metadata describing your icon.

Sheet Preview

Keywords for an icon are given as space separated words in the label attribute in object properties in Inkscape (inkscape:label attribute).

Acknowledgments

Jokingly we refer to the design tool page as the Bilal page. A huge thank you goes to Bilal Elmoussaoui for writing brilliant apps for GNOME and making the design process a little less punishing.

April 28, 2021

Friends of GNOME Update – April 2021

Welcome to the April 2021 Friends of GNOME Update

GNOME 40

At the end of March we released GNOME 40! Some highlights include:

  • new touchpad gestures
  • core apps
  • better wifi settings

You can try it out on GNOME OS Nightly, Fedora, and openSUSE. Check it out online or watch the release video!

GNOME On the Road

We might not be on the road, but Director of Operations Rosanna Yuen recently curated imakefoss. As part of this, she gave her perspective on things like her FOSS origin story and newcomers to the FOSS community. Check it out!

Events with GNOME and Friends

Linux App Summit is coming up. Join us and KDE from May 13 – 15 to learn and grow the Linux app ecosystem. Keynote speakers include GNOME Foundation member and former executive director Karen Sandler and Kathy Giori, who has built her own Linux powered private smart home. The schedule is online and registration is open.

We’ve opened registration for GUADEC 2021. This year’s conference will take place online, using our BigBlueButton installation. You can read the schedule and then register online to attend! Highlights from the schedule include 24 sessions on all sorts of topics, the GNOME Foundation annual members meeting, and keynotes by Hong Phuc Dang and Shauna Gordon-McKeon.

Challenge Winners Announced!

After a number of exciting months, the Community Engagement Challenge wrapped up on April 7 with a showcase of projects and the announcement of the Challenge winner. Congratulations to our winner Big Open Source Sibling and the runner up Open UK Kids Course and Digital Camp.

Part of running the Challenge was building the infrastructure for it, which we now have set up and is ready to go. If you have ideas for future Challenges that match up well with the mission and work of GNOME, please email us!

GNOME.org updates

We recently updated the GNOME web site with a new WordPress instance! Previously, we used a combination of WordPress pages and static pages, but the new site is all on WordPress. The project was started by Britt Yazel, and happened with the help of Evan Welsh and Claudio Wunder.

Summer Internships

This summer we are participating in Google Summer of Code and Outreachy. Mentors have been working with potential interns on their applications and first contributions to GNOME. Accepted interns will be announced in the upcoming weeks.

Technical Updates

Emmanuele Bassi, Core GTK Developer, works on a tool called gi-docgen, which generates API references from introspection data. He’s made some updates and documented them. Additionally, there have been a lot of updates to the GTK documentation.

Emmanuele is also working on fixing issues in the GTK4 accessibility infrastructure. He is replacing the shared accessibility bus with a peer-to-peer connection between GTK4 applications and assistive technologies.

Thank you!

Thank you for your support! Whether you’re a Friend of GNOME, a contributor, a user, or just casually interested in our work, we appreciate your time and interest in building a great GNOME community!

April 26, 2021

fwupd 1.6.0

I’ve just released the first release of the 1.6.x series, and since 1.5.x some internal plugin API has been changed and removed. Although we’ve tested this release on all the hardware we have regression tests for, bugs may have crept in; please report failures to the issue tracker as required.

There are several new plugins adding support for new hardware and a lot of code has been migrated to the new plugin API. The public libfwupd API also has some trivial additions, although no action is required.

This release adds the following notable features:

  • Add a composite ID that is used to identify dock device components
  • Add an Intel Flash Descriptor parser
  • Add API to allow the device to report its own battery level
  • Add API to refcount why the device is non-updatable
  • Add lspcon-i2c-spi programmer support
  • Add more hardware support to the pixart-rf plugin
  • Add some more new category types for firmware to use
  • Add support for downloading the SPI image from the Intel eSPI device
  • Add support for some Analogix hardware
  • Add support for writing SREC firmware
  • Add the firmware-sign command to fwupdtool to allow resigning archives
  • Split UEFI EFI binary into a subproject
  • Use an OFD or Unix lock to prevent more than one fwupdtool process

This release fixes the following bugs:

  • Actually write the bcm57xx stage1 version into the file
  • Add option to disable the UEFI capsule splash screen generation
  • Avoid use-after-free when specifying the VID/PID in dfu-tool
  • Cancel the GDBusObjectManager operation to fix a potential crash
  • Check PixArt firmware compatibility with hardware before flashing
  • Do not check for native dependencies as target dependencies
  • Do not use help2man to build manual pages
  • Fix a crash when shutting down the daemon
  • Fix build on musl
  • Fix build when using BSD
  • Fix /etc/os-release ID_LIKE field parsing
  • Force the synaptics-rmi hardware into IEP mode as required
  • Never allow D-Bus replacement when a firmware update is in operation
  • Offer the user to refresh the remote after enabling
  • Remove unused, unsafe and deprecated functions from libfwupdplugin
  • Simplify asking the user about reviews
  • Write BMP data directly without using PIL
  • Write synaptics-rmi files with valid checksum data

From a Fedora point of view, I’m waiting for the new fwupd-efi package to be reviewed and approved, and then I’ll upload the new version for Rawhide only. I might put 1.6.x into Fedora 34 after a couple more minor releases, but at the moment I’m keeping it on the 1.5.x stable branch, which has all the important fixes backported. There’s a lot of new code in 1.6.x which needs to settle.

April 25, 2021

Documentation transforms in JavaScript

Three weeks ago, I had a project idea while napping. I tried to forget it, but ended up prototyping it. Now I named it Hookbill and put it on GitLab. From the README:

I appear to be the last XSLT programmer in the world, and sometimes people get very cross with me when I say those four letters. So I had an idea. What if we did it in JavaScript using Mustache templates?

Just to give a glimpse, here’s the Hookbill template for paragraphs:

{{~#if (if_test element)~}}
{{~#var 'cls' normalize=true}}
  {{element.localname}}
  {{#if (contains element.attrs.style 'lead')}}lead{{/if}}
  {{if_class element}}
{{/var~}}
<p class="{{vars.cls}}" {{call 'html-lang-attrs'}}>
  {{~call 'html-inline' element.children~}}
</p>
{{~/if~}}

And here’s the equivalent XSLT for Mallard in yelp-xsl:

<xsl:template mode="mal2html.block.mode" match="mal:p">
  <xsl:variable name="if"><xsl:call-template name="mal.if.test"/></xsl:variable><xsl:if test="$if != ''">
  <p>
    <xsl:call-template name="html.class.attr">
      <xsl:with-param name="class">
        <xsl:text>p</xsl:text>
        <xsl:if test="contains(concat(' ', @style, ' '), ' lead ')">
          <xsl:text> lead</xsl:text>
        </xsl:if>
        <xsl:if test="$if != 'true'">
          <xsl:text> if-if </xsl:text>
          <xsl:value-of select="$if"/>
        </xsl:if>
      </xsl:with-param>
    </xsl:call-template>
    <xsl:call-template name="html.lang.attrs"/>
    <xsl:apply-templates mode="mal2html.inline.mode"/>
  </p>
</xsl:if>
</xsl:template>

Is the Hookbill template shorter? For sure. Is it better? In this case, yeah. For some other things, maybe not as much. It does force me to put a lot of logic in JavaScript, outside the templates. And maybe that’s a good thing for getting other people playing with the HTML and CSS.

I don’t know if I’ll continue working on this. It would take a fairly significant time investment to reach feature parity with yelp-xsl, and I have too many projects as it is. But it was fun to play with, and I thought I’d share it.

April 23, 2021

An accidental bootsplash

Back in 2005 we had Debconf in Helsinki. Earlier in the year I'd ended up invited to Canonical's Ubuntu Down Under event in Sydney, and one of the things we'd tried to design was a reasonable graphical boot environment that could also display status messages. The design constraints were awkward - we wanted it to be entirely in userland (so we didn't need to carry kernel patches), and we didn't want to rely on vesafb[1] (because at the time we needed to reinitialise graphics hardware from userland on suspend/resume[2], and vesa was not super compatible with that). Nothing currently met our requirements, but by the time we'd got to Helsinki there was a general understanding that Paul Sladen was going to implement this.

The Helsinki Debconf ended up being an extremely strange event, involving me having to explain to Mark Shuttleworth what the physics of a bomb exploding on a bus were, many people being traumatised by the whole sauna situation, and the whole unfortunate water balloon incident, but it also involved Sladen spending a bunch of time trying to produce an SVG of a London bus as a D-Bus logo and not really writing our hypothetical userland bootsplash program, so on the last night, fueled by Koff that we'd bought by just collecting all the discarded empty bottles and returning them for the deposits, I started writing one.

I knew that Debian was already using graphics mode for installation despite having a textual installer, because they needed to deal with more complex fonts than VGA could manage. Digging into the code, I found that it used BOGL - a graphics library that made use of the VGA framebuffer to draw things. VGA had a pre-allocated memory range for the framebuffer[3], which meant the firmware probably wouldn't map anything else there and hitting those addresses probably wouldn't break anything. This seemed safe.

A few hours later, I had some code that could use BOGL to print status messages to the screen of a machine booted with vga16fb. I woke up some time later, somehow found myself in an airport, and while sitting at the departure gate[4] I spent a while staring at VGA documentation and worked out which magical calls I needed to make to have it behave roughly like a linear framebuffer. Shortly before I got on my flight back to the UK, I had something that could also draw a graphical picture.

Usplash shipped shortly afterwards. We hit various issues - vga16fb produced a 640x480 mode, and some laptops were not inclined to do that without a BIOS call first. 640x400 worked basically everywhere, but meant we had to redraw the art because circles don't work the same way if you change the resolution. My brief "UBUNTU BETA" artwork that was me literally writing "UBUNTU BETA" on an HP TC1100 shortly after I'd got the Wacom screen working did not go down well, and thankfully we had better artwork before release.

But 16 colours is somewhat limiting. SVGALib offered a way to get more colours and better resolution in userland, retaining our prerequisites. Unfortunately it relied on VM86, which doesn't exist in 64-bit mode on Intel systems. I ended up hacking the X.org x86emu into a thunk library that exposed the same API as LRMI, so we could run it without needing VM86. Shockingly, it worked - we had support for 256 colour bootsplashes in any supported resolution on 64 bit systems as well as 32 bit ones.

But by now it was obvious that the future was having the kernel manage graphics support, both in terms of native programming and in supporting suspend/resume. Plymouth is much more fully featured than Usplash ever was, but relies on functionality that simply didn't exist when we started this adventure. There's certainly an argument that we'd have been better off making reasonable kernel modesetting support happen faster, but at this point I had literally no idea how to write decent kernel code and everyone should be happy I kept this to userland.

Anyway. The moral of all of this is that sometimes history works out such that you write some software that a huge number of people run without any idea of who you are, and also that this can happen without you having any fucking idea what you're doing.

Write code. Do crimes.

[1] vesafb relied on either the bootloader or the early stage kernel performing a VBE call to set a mode, and then just drawing directly into that framebuffer. When we were doing GPU reinitialisation in userland we couldn't guarantee that we'd run before the kernel tried to draw stuff into that framebuffer, and there was a risk that that was mapped to something dangerous if the GPU hadn't been reprogrammed into the same state. It turns out that having GPU modesetting in the kernel is a Good Thing.

[2] ACPI didn't guarantee that the firmware would reinitialise the graphics hardware, and as a result most machines didn't. At this point Linux didn't have native support for initialising most graphics hardware, so we fell back to doing it from userland. VBEtool was a terrible hack I wrote to try to re-execute the system's graphics hardware through a range of mechanisms, and it worked in a surprising number of cases.

[3] As long as you were willing to deal with 640x480 in 16 colours

[4] Helsinki-Vantaan had astonishingly comfortable seating for time


Getting Fractal up to speed

Fractal is a popular Matrix chat client for the GNOME desktop which has been in development since 2017. Fractal was designed to work well for collaboration in large groups, such as free software projects. However, there are still two main areas where it would benefit from improvements: its performance and maintainability are limited, and it lacks some important features such as end-to-end encryption (E2EE).

Adding E2EE will increase the confidence that users have in the privacy of their conversations, making it nearly impossible for their conversations to be accessed by others. E2EE aims to prevent even the service provider from being able to decrypt the messages, because the encryption keys are stored only on the end-user’s device. The direct consequence of this is that some work is delegated to the client. Some of this functionality is the same for each and every Matrix client, and includes technical components that could easily be implemented in the wrong way (especially the various encryption and security features). Most security researchers agree that redoing this work is a bad idea, as it can lead to vulnerabilities. More generally, reimplementing the same functionality for each client doesn’t make much sense. On the other hand, sharing it with others allows projects that use it to contribute their expertise and polish it together instead of competing on a multitude of implementations. That shared work is called an SDK and could be considered the future “engine” of Fractal.

When Fractal was created, there was no existing code that we could rely on. We had to implement bits of the Matrix protocol ourselves in Fractal, at a low level. In the meantime, the Matrix Foundation has kickstarted matrix-rust-sdk, a library to channel the Matrix community efforts into a common, efficient library. This library, still in development for now, will allow us to drop a lot of our own code.

The current Fractal engine, which handles all the interactions with the servers, is entangled with the parts that handle the user interface. This severely impairs the evolutionary potential of Fractal and has a significant impact on performance. At first we attempted to untangle the user interface and engine. Alejandro worked at the same time on refactoring our messy code and porting it to the SDK, but it was just too much of a hassle and breakage kept happening. We weighed the pros and cons and ended up deciding that a rewrite was the better alternative. We will also use this opportunity to update Fractal to use a newer version of GTK, the toolkit which Fractal uses for its user interface, which brings many performance improvements. The rewrite we decided upon is a perfect opportunity to transition to GTK4 and design for it.

We launched the Fractal-next initiative to carry out all of this work: rebuild Fractal from scratch, with improved foundations. Those foundations are GTK4 for the user interface code, and matrix-rust-sdk for the engine code. Fractal-next is only just acquiring basic functionality, so it is not ready for general testing yet.

Sharing the work with others

The ultimate goal for the matrix-rust-sdk is to be feature-complete. This means the SDK aims to provide an easy way for developers to interact with servers using every feature offered by the Matrix protocol.

The SDK is still in its early stages, but is already feature-rich enough to be useful. In particular, it already implements the end-to-end encryption features we need to support in Fractal. We want to contribute to it by implementing the features we want to rely on for Fractal, but which are not exclusive to Fractal.

A major part that the SDK is lacking is a high-level API: a description of what the engine can do and how to use it. This would allow developers without knowledge of the underlying technology to work with Matrix. This API would translate simple calls such as “send this message to this person” into the actual technical action compliant with the Matrix protocol. In the Matrix world, almost everything is an event that is sent to a Room, a conceptual place to send and receive events from. Someone sends a message to a room: that’s an event. Someone changes their display name: that’s an event. Someone is kicked: you got it, that’s an event.

I already started working on designing and improving the API. So far I wrote the code that developers will use to interact with Matrix rooms and their membership. I also have plans to do something similar with Matrix events, but the design requires more discussion.

I also added some convenience methods to the API. The SDK can now easily retrieve the avatar for users and for matrix rooms. Future improvements include optimising the process for retrieving the avatar and caching the image so that it doesn’t need to be re-downloaded every time it is needed. This and other optimizations will happen behind the scenes inside the SDK, without requiring the application developer to do anything.

Code is only one aspect of contributions: a lot of the actual work consists of providing feedback and discussions around improving the code. I started the conversation about how the SDK should store Matrix events. So far, the SDK mostly stores and cares about end-to-end encryption events. The final goal is for the SDK to store most information locally; this will be especially crucial for search in encrypted rooms, since it’s performed locally because the server doesn’t have access to the content of a conversation.

Progress so far

Rewriting Fractal from scratch is a huge undertaking, but it has to happen and we are on the right path. While the contributions to matrix-rust-sdk are important as it is foundational work for Fractal, I could start working on the application itself too.

Most of the boilerplate is set up. That is quite generic code, which is mostly the same for every GTK app. This includes the folder and module structure we will use in Fractal’s codebase. Now that the core structure is in place we can incrementally implement each piece of Fractal until we reach a feature-rich Matrix client. If you run Fractal-next right now you will already see most of the UI, even though it’s pretty much an empty shell.

I already implemented login via password in Fractal-next, but there is no single sign-on support yet. This was done on purpose: the login flow of Matrix is expected to change in the near future, and the current login is enough to start making use of matrix-rust-sdk.

Furthermore, I spent significant time on finding a good approach to connect Fractal and the matrix-rust-sdk. They are both written in Rust, but Fractal relies heavily on GTK and GLib. As a result, they use quite different paradigms. The interconnection of Fractal and matrix-rust-sdk needs to be carefully thought through to make use of their full potential.

After logging in to Fractal-next, you can currently see the list of your rooms. The sidebar is mostly complete, but doesn’t contain room categories such as Favorites yet. I will add the base for this feature to the SDK before I can implement it in Fractal.

I will cover the code of the sidebar and the rest of the architecture of Fractal-next in much more detail in a future blogpost.

Milestones and next steps

So what is next on my TODO list? Well, as I previously mentioned, we have a list of rooms, but no interaction with the items in this list is possible yet. So the next step will be to add the room history and allow switching between rooms from within the sidebar. Clicking on a room should bring up the history for the selected room. Once this is implemented, we will add the possibility to send messages. At that point Fractal-next will be a huge step closer to a client, but will still be missing a lot before it can be considered a feature-rich client for instant messaging.

I would like to thank NLnet for funding my work on Fractal and Fractal-next, to build a feature-rich and polished client that makes the instant messaging experience on Matrix as pleasant as possible. The funding from NLnet also allows me to focus on the SDK, so that other people will be able to build great software on top of the SDK and the Matrix ecosystem.

Flatseal 1.7.0

A big new release of Flatseal is out! It took a bit longer, due to that little Portfolio detour 😅, but it finally happened. This release introduces some subtle improvements to the UI, along with one major new feature: initial support for Portals.

Starting with the UI improvements, a massive libhandy’fication happened. Main widgets such as the window, groups and permissions rows are now using libhandy widgets. This helped to remove a lot of custom code, but also brought neat little details like rounded corners, the ability to toggle permissions by simply tapping the rows, among other things. Thanks to @BrainBlasted for starting this work.

The applications search entry can now be toggled, and its position is now independent of the scrolled window. Not having to scroll back to the top to change the search terms, makes it a bit more usable.

This release comes with an offline version of Flatseal’s documentation, thanks to a newly-added viewer. For now, the documentation is only available in English. I am hoping to improve the document before any attempt to translate it. Thanks to Apostrophe for such nice exporters 🙂.

Moving on to the new big feature, yes, portals! With this release, it’s now possible to review and manage dynamic permissions. Thanks to @hferreiro for requesting this feature.

This initial support includes six of the most basic permissions. The plan is to support as many of these as it makes sense to, of course. Having said that, a few important notes:

  • Not all dynamic permissions rely on stored “decisions” and, therefore, there’s nothing for Flatseal to manage.
  • Not all Flatpak applications use portals currently and, therefore, many of these new options won’t be very useful until portals get more adoption.
  • It is expected that many of these permissions will appear “grayed out”. Check the tooltips for the specific reason, but it is likely because some of these portals have not been used yet and, therefore, have not set up any data for Flatseal to manage.

Well, that’s it for now. Any suggestions, ideas, bugs, I am always available. And if you haven’t tried Flatseal yet, I recommend getting it from Flathub.

Last but never least, thanks to everyone who contributed to this release, @BrainBlasted, @cho2, @Vistaus, @ovari, @eson57, @TheEvilSkeleton, @AsciiWolf and @MiloCasagrande.

April 20, 2021

AF_ALG support in GnuTLS

The Linux kernel implements a set of cryptographic algorithms to be used by other parts of the kernel. These algorithms can be accessed through the internal API; notable consumers of this API are encrypted network protocols such as WireGuard, as well as data encryption as in fscrypt. The kernel also provides an interface for user-space programs to access the kernel crypto API.

GnuTLS has recently gained a new crypto backend that uses the kernel interface in addition to the user-space implementation. There are a few benefits of having it. The most obvious one is performance improvement: while the existing user-space assembly implementation has comparable performance to the in-kernel software emulation, the kernel crypto implementation also enables workload offloading to hardware accelerators, such as Intel QAT cards. Secondly, it brings support for a wider variety of CPU architectures: not only IA32 and AArch64, but also PowerPC and s390. The last but not least is that it could be used as a potential safety net for the crypto algorithms implementation: deferring the crypto operations to the kernel means that we could have an option to workaround any bugs or compliance (such as FIPS140) issues in the library.

As for the implementation, the kernel interface is exposed through the AF_ALG socket family, along with a Netlink interface to retrieve information about algorithms; although it is not straightforward to work directly with the interface, libkcapi provides a nice abstraction over the underlying system calls, which we use as a basis for the integration with GnuTLS. František Krenželok in our team picked up the initial patch set provided by Stephan Mueller and has successfully moved it towards the finish line.
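
To give a feel for the interface itself (this is not how the GnuTLS backend is written, which goes through libkcapi), here is a minimal example of asking the kernel to hash a buffer over an AF_ALG socket, using nothing but Python's standard library on Linux:

import socket

# Bind a transform socket to the kernel's sha256 implementation,
# then accept() an operation socket, feed it data and read the digest.
with socket.socket(socket.AF_ALG, socket.SOCK_SEQPACKET, 0) as algo:
    algo.bind(("hash", "sha256"))
    op, _ = algo.accept()
    with op:
        op.sendall(b"hello, kernel crypto")
        print(op.recv(32).hex())  # 32-byte SHA-256 digest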

With the upcoming 3.7.2 release, GnuTLS user programs could enjoy a performance boost (under certain circumstances) through this new crypto backend. Next up, we are aiming to integrate KTLS as well. Stay tuned.

Focusrite is hostile to Linux, avoid if possible

Last year, I acquired a Focusrite Scarlett 4i4. The main purpose was to improve the quality of my live coding sessions, and also to allow me to experiment with recording my own songs.

It was a pain from the moment I plugged this card into my laptop, until now.

As of today, I’m happy that I’m finally getting rid of it.

Allow me to explain how much of a disaster their approach is. Most USB digital audio interfaces are compatible with industry standards – they’re class compliant. That means they advertise features, inputs, outputs, etc, using a standard USB protocol.

Not Focusrite.

Focusrite decided they didn’t like hardware buttons. So they removed them, and switched to software-controlled features.

For some reason that I’m yet to understand, Focusrite decided they wouldn’t use any standard protocol to advertise these features. So they invented a proprietary protocol only to control these features. This protocol is only usable through their Focusrite Control software – which, as you might have guessed, is proprietary, and only runs on Windows and Mac.

Focusrite decided they didn’t want their hardware to work on Linux, so not even minimal documentation about routing was published. That makes it even harder for the heroes trying to reverse-engineer their cards.

I’ve contacted them and shared my thoughts about it. Their response, while clear and unambiguous, is still disappointing:

Focusrite clearly states on their website that Linux is not supported. And that’s okay; very few digital audio interface manufacturers claim that. But it wouldn’t be a problem if their hardware were at least class compliant. It is not. With that, there is enough evidence that Focusrite is hostile to Linux users, and we should not support their business practices until they change this stance.

April 19, 2021

Simple HTTP profiling of applications using sysprof

This is a quick write-up of a feature I added last year to libsoup and sysprof which exposes basic information about HTTP/HTTPS requests to sysprof, so they can be visualised in GNOME Builder.

Prerequisites

  • libsoup compiled with sysprof support (-Dsysprof=enabled), which should be the default if libsysprof-capture is available on your system.
  • Your application (and ideally its dependencies) uses libsoup for its HTTP requests; this won’t work with other network libraries.

Instructions

  • Run your application in Builder under sysprof, or under sysprof on the CLI (sysprof-cli -- your-command) then open the resulting capture file in Builder.
  • Ensure the ‘Timings’ row is visible in the ‘Instruments’ list on the left, and that the libsoup row is enabled beneath that.

Results

You should then be able to see a line in the ‘libsoup’ row for each HTTP/HTTPS request your application made. The line indicates the start time and duration of the request (from when the first byte of the request was sent to when the last byte of the response was received).

The details of the event contain the URI which was requested, whether the transaction was HTTP or HTTPS, the number of bytes sent and received, the Last-Modified header, If-Modified-Since header, If-None-Match header and the ETag header (subject to a pending fix).

What’s that useful for?

  • Seeing what requests your application is making, across all its libraries and dependencies — often it’s more than you think, and some of them can easily be optimised out. A request which is not noticeable to you on a fast internet connection will be noticeable to someone on a slower connection with higher request latencies.
  • Quickly checking that all requests are HTTPS rather than HTTP.
  • Quickly checking that all requests from the application set appropriate caching headers (If-Modified-Since, If-None-Match) and that all responses from the server do too (Last-Modified, ETag) — if an HTTP request can result in a cache hit, that’s potentially a significant bandwidth saving for the client, and an even bigger one for the server (if it’s seeing the same request from multiple clients). A minimal example of setting one of these request headers with libsoup follows this list.
  • Seeing a high-level overview of what bandwidth your application is using, and which HTTP requests are contributing most to that.
  • Seeing how it all ties in with other resource usage in your application, courtesy of sysprof.
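
On the caching point above, here is a minimal client-side sketch using the libsoup 2.x synchronous API; the URL and the ETag value are placeholders for whatever your application has cached:

#include <libsoup/soup.h>

int main (void)
{
    SoupSession *session = soup_session_new ();
    SoupMessage *msg = soup_message_new ("GET", "https://example.org/feed.xml");

    /* Pretend we cached this resource before; "abc123" stands in for the
     * ETag the server gave us last time. */
    soup_message_headers_append (msg->request_headers,
                                 "If-None-Match", "\"abc123\"");

    guint status = soup_session_send_message (session, msg);

    if (status == SOUP_STATUS_NOT_MODIFIED) {
        g_print ("304: reuse the locally cached copy\n");
    } else {
        const char *etag = soup_message_headers_get_one (msg->response_headers,
                                                         "ETag");
        g_print ("Status %u; new ETag to remember: %s\n",
                 status, etag ? etag : "(none)");
    }

    g_object_unref (msg);
    g_object_unref (session);
    return 0;
}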

Yes that seems useful

There’s plenty of scope for building this out into a more fully-featured way of inspecting HTTP requests using sysprof. Doing it from inside the process, using sysprof – rather than from outside, using Wireshark – allows for visibility into TLS-encrypted conversations.

Deploy a Serverless Probot/Github App on Netlify Functions

Automation is love. We all love automating repetitive things, and that is exactly what Probot helps with. Probot is one of the most popular frameworks for developing GitHub Apps using JavaScript. It is easy to set up, since most of the plumbing (setting up authentication, registering webhooks, managing permissions) is handled by Probot itself. We just need to write the code that responds to different events.

In this article, we will learn how we can build a serverless bot and deploy it to Netlify Functions. The advantage of using Netlify Functions is that it is free for up to 125,000 requests per month, which is more than enough for a small startup or organization.

Prerequisites

We can follow this guide on how to set up Probot.

1. Writing Application Logic

Once our Probot is set up, we need to make some changes to our directory structure to deploy it to Netlify Functions.

Let us create a src directory and put our application logic in a new file, app.js:

/**
 * This is the main entrypoint to your Probot app
 * @param {import('probot').Probot} app
 */

module.exports = (app) => {
  app.log.info('App has loaded');

  app.on('issues.opened', async (context) => {
    context.octokit.issues.createComment(
      context.issue({
        body: 'Thanks for opening this issue!',
      })
    );
  });
};

The above code is really simple. Whenever a new issue is opened, it creates an issue comment thanking the issue author.

Netlify Functions are AWS Lambda functions, but their deployment to AWS is handled by Netlify. For deploying our Probot on Netlify, we can use the AWS Lambda adapter for Probot:

npm install @probot/adapter-aws-lambda-serverless --save

The next thing we need to do is to create a functions directory that will be used by Netlify to deploy our serverless functions. Every JS file in the functions directory is deployed as an individual function which can be accessed via <domain>/.netlify/functions/<function_name>.

In the functions directory, let us create an index.js file and add the following code:

const { createLambdaFunction, createProbot } = require('@probot/adapter-aws-lambda-serverless');
const app = require('../src/app');

module.exports.handler = createLambdaFunction(app, {
  probot: createProbot(),
});

Our code is finally done and the next step is to deploy our application to Netlify.

2. Deploying on Netlify Functions

Before proceeding with the deployment, we need to create a configuration file, netlify.toml, at the root of the project to tell Netlify a few important things to consider when deploying our bot.

Let us add the following content in netlify.toml:

[build]
command = "npm install --production"
functions = "./functions"

We are telling Netlify to run npm install before deploying the functions present in the functions directory.

To deploy on Netlify, we can use Netlify Dev. For that, we need to install netlify-cli:

npm install netlify-cli -g

Let us now log in to our Netlify account:

netlify login

Once we are logged in, let us connect our current directory to a Netlify site. We can either link an existing site or create a new one:

netlify init

Once our site is connected, we can build it locally and deploy it to Netlify:

netlify build
netlify deploy --prod

We can also connect our Github Repository to our Netlify project or use Github Actions to deploy our bot to Netlify.

3. Updating Webhook URL

Once our Probot is deployed, we need to update the Webhook URL to tell Github where to send the event payloads. We can visit https://github.com/settings/apps/<app-name> and update the Webhook URL with our Netlify function URL (following the <domain>/.netlify/functions/<function_name> pattern from earlier).

Results

Let us test our bot by creating an issue on a repository where we installed our Github app, and see whether our bot responds.

Setting up our master branch on Github

Awesome! We can see that our bot welcomed us with the message that we wrote earlier. There are many things to automate on Github, like auto-assigning users to issues, auto-assigning reviewers to pull requests, auto-merging pull requests created by Dependabot, and much more.

April 14, 2021

2021-04-14 Wednesday

  • Catch up with Tor, sales call with Eloy - prodded at a nasty mis-feature with web view mode in Online and code read through a nasty with Ash. M's ballet re-starting; fun.

April 13, 2021

2021-04-13 Tuesday

  • Mail chew, sync with Kendy; chased customer bugs, and contractuals; catch-up with Pedro.

April 12, 2021

Rift CV1 – Getting close now…

It’s been a while since my last post about tracking support for the Oculus Rift in February. There have been big improvements since then – it’s working really well a lot of the time. It’s gone from “If I don’t make any sudden moves, I can finish an easy Beat Saber level” to “You can’t hide from me!” quality.

Equally, there are still enough glitches and corner cases that I think I’ll still be at this a while.

Here’s a video from 3 weeks ago of someone (not me) playing Beat Saber on the Expert+ setting, showing just how good things can be now:

Beat Saber – Skunkynator playing Expert+, Mar 16 2021

Strap in. Here’s what I’ve worked on in the last 6 weeks:

Pose Matching improvements

Most of the biggest improvements have come from improving the computer vision algorithm that’s matching the observed LEDs (blobs) in the camera frames to the 3D models of the devices.

I split the brute-force search algorithm into 2 phases. It now does a first pass looking for ‘obvious’ matches. In that pass, it does a shallow graph search of blobs and their nearest few neighbours against LEDs and their nearest neighbours, looking for a match using a “Strong” match metric. A match is considered strong if expected LEDs match observed blobs to within 1.5 pixels.
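
As a rough illustration of that metric (not the actual OpenHMD code; the types and names here are made up), the check boils down to asking, for each projected LED, whether some observed blob lies within 1.5 pixels:

#define STRONG_MATCH_THRESHOLD 1.5

typedef struct { double x, y; } vec2;

/* Count how many projected LED positions have an observed blob within the
 * strong-match threshold. The arrays stand in for whatever the real
 * tracker produces for a candidate pose and camera frame. */
static int
count_strong_matches (const vec2 *projected_leds, int n_leds,
                      const vec2 *blobs, int n_blobs)
{
    int matches = 0;

    for (int i = 0; i < n_leds; i++) {
        for (int j = 0; j < n_blobs; j++) {
            double dx = projected_leds[i].x - blobs[j].x;
            double dy = projected_leds[i].y - blobs[j].y;

            if (dx * dx + dy * dy <=
                STRONG_MATCH_THRESHOLD * STRONG_MATCH_THRESHOLD) {
                matches++;
                break;   /* one blob is enough for this LED */
            }
        }
    }

    return matches;
}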

Coupled with checks on the expected orientation (matching the gravity vector detected by the IMU) and the pose prior (expected position and orientation are within predicted error bounds), this short-circuit on the search is hit a lot of the time, and often completes within 1 frame duration.

In the remaining tricky cases, where a deeper graph search is required in order to recover the pose, the initial search reduces the number of LEDs and blobs under consideration, speeding up the remaining search.

I also added an LED size model to the mix – for a candidate pose, it tries to work out how large (in pixels) each LED should appear, and uses that as a bound when matching blobs to LEDs. This helps reduce mismatches as devices move further from the camera.

LED labelling

When a brute-force search for pose recovery completes, the system now knows the identity of various blobs in the camera image. One way it avoids a search next time is to transfer the labels into future camera observations using optical-flow tracking on the visible blobs.

The problem is that, even sped up, the search can still take a few frame durations to complete. Previously LED labels would be transferred from frame to frame as they arrived, but there’s now a unique ID associated with each blob that allows the labels to be transferred even several frames later, once their identity is known.

IMU Gyro scale

One of the problems with reverse engineering is the guesswork around exactly what different values mean. I was looking into why the controller movement felt “swimmy” under fast motions, and one thing I found was that the interpretation of the gyroscope readings from the IMU was incorrect.

The touch controllers report IMU angular velocity readings directly as a 16-bit signed integer. Previously the code would take the reading and divide by 1024 and use the value as radians/second.

From teardowns of the controller, I know the IMU is an Invensense MPU-6500. From the datasheet, the reported value is actually in degrees per second and appears to be configured for the +/- 2000 °/s range. That yields a calculation of Gyro-rad/s = Gyro-raw * (2000 / 32768) * (π/180) – or a divisor of 938.734.

The 1024 divisor was under-estimating rotation speed by about 10% – close enough to work until you start moving quickly.
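
In code, the fix amounts to swapping one constant for another. A small sketch of the corrected conversion (not the exact OpenHMD source):

#include <stdint.h>

/* MPU-6500 configured for the +/- 2000 deg/s range:
 * raw int16 -> deg/s is a factor of (2000 / 32768), and deg/s -> rad/s is
 * (pi / 180). Combined, that is the same as dividing the raw reading by
 * roughly 938.734 (instead of the old 1024). */
static double
gyro_raw_to_rad_per_sec (int16_t raw)
{
    const double pi = 3.14159265358979323846;

    return (double) raw * (2000.0 / 32768.0) * (pi / 180.0);
}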

Limited interpolation

If we don’t find a device in the camera views, the fusion filter predicts motion using the IMU readings – but that quickly becomes inaccurate. In the worst case, the controllers fly off into the distance. To avoid that, I added a limit of 500ms for ‘coasting’. If we haven’t recovered the device pose by then, the position is frozen in place and only rotation is updated until the cameras find it again.
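
A minimal sketch of that cut-off, with made-up names (the real filter code is more involved):

#include <stdbool.h>
#include <stdint.h>

#define MAX_COAST_TIME_MS 500

typedef struct {
    uint64_t last_observed_ms;  /* last time any camera confirmed the pose */
    bool     position_frozen;   /* when true, only orientation is updated */
} device_track_state;

/* Called on each tracking update in which no camera observed the device:
 * after 500 ms of IMU-only prediction, stop trusting the predicted position
 * and freeze it in place; the gyro keeps the orientation going. */
static void
apply_coasting_limit (device_track_state *dev, uint64_t now_ms)
{
    dev->position_frozen = (now_ms - dev->last_observed_ms > MAX_COAST_TIME_MS);
}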

Exponential filtering

I implemented a 1-Euro exponential smoothing filter on the output poses for each device. This is an idea from the Project Esky driver for Project North Star/Deck-X AR headsets, and almost completely eliminates jitter in the headset view and hand controllers shown to the user. The tradeoff is against introducing lag when the user moves quickly – but there are some tunables in the exponential filter to play with for minimising that. For now I’ve picked some values that seem to work reasonably.
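
For reference, here is a compact scalar version of the published 1-Euro filter idea (the generic algorithm, not the code from the OpenHMD branch); min_cutoff, beta and d_cutoff are the tunables mentioned above, and each pose component would get its own filter state:

#include <math.h>

typedef struct {
    double min_cutoff;   /* Hz: how aggressively jitter is smoothed at low speed */
    double beta;         /* how quickly the cutoff opens up as speed increases */
    double d_cutoff;     /* cutoff used for the derivative estimate */
    double x_prev;       /* previous filtered value */
    double dx_prev;      /* previous filtered derivative */
    int    initialized;
} one_euro;

static double
smoothing_factor (double cutoff, double dt)
{
    /* Convert a cutoff frequency into an exponential smoothing factor. */
    double tau = 1.0 / (2.0 * 3.14159265358979323846 * cutoff);
    return 1.0 / (1.0 + tau / dt);
}

static double
exp_smooth (double alpha, double x, double x_prev)
{
    return alpha * x + (1.0 - alpha) * x_prev;
}

/* Filter one scalar sample; dt is the time since the previous sample, in seconds. */
static double
one_euro_filter (one_euro *f, double x, double dt)
{
    if (!f->initialized) {
        f->initialized = 1;
        f->x_prev = x;
        f->dx_prev = 0.0;
        return x;
    }

    /* Estimate how fast the signal is changing, and smooth that estimate. */
    double dx = (x - f->x_prev) / dt;
    double dx_hat = exp_smooth (smoothing_factor (f->d_cutoff, dt), dx, f->dx_prev);

    /* Fast motion: higher cutoff, less smoothing (less lag).
     * Slow motion: lower cutoff, more smoothing (less jitter). */
    double cutoff = f->min_cutoff + f->beta * fabs (dx_hat);
    double x_hat = exp_smooth (smoothing_factor (cutoff, dt), x, f->x_prev);

    f->x_prev = x_hat;
    f->dx_prev = dx_hat;
    return x_hat;
}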

Non-blocking radio

Communications with the touch controllers happens through USB radio command packets sent to the headset. The main use of radio commands in OpenHMD is to read the JSON configuration block for each controller that is programmed in at the factory. The configuration block provides the 3D model of LED positions as well as initial IMU bias values.

Unfortunately, reading the configuration block takes a couple of seconds on startup, and blocks everything while it’s happening. Oculus saw that problem and added a checksum in the controller firmware. You can read the checksum first and if it hasn’t changed use a local cache of the configuration block. Eventually, I’ll implement that caching mechanism for OpenHMD but in the meantime it still reads the configuration blocks on each startup.

As an interim improvement I rewrote the radio communication logic to use a state machine that is checked in the update loop – allowing radio communications to be interleaved without blocking the regular processing of events. It still interferes a bit, but no longer causes a full multi-second stall as each hand controller turns on.

Haptic feedback

The hand controllers have haptic feedback ‘rumble’ motors that really add to the immersiveness of VR by letting you sense collisions with objects. Until now, OpenHMD hasn’t had any support for applications to trigger haptic events. I spent a bit of time looking at USB packet traces with Philipp Zabel and we figured out the radio commands to turn the rumble motors on and off.

In the Rift CV1, the haptic motors have a mode where you schedule feedback events into a ringbuffer – effectively they operate like a low frequency audio device. However, that mode was removed for the Rift S (and presumably in the Quest devices) – and deprecated for the CV1.

With that in mind, I aimed for implementing the unbuffered mode, with explicit ‘motor on + frequency + amplitude’ and ‘motor off’ commands sent as needed. Thanks to already having rewritten the radio communications to use a state machine, adding haptic commands was fairly easy.

The big question mark is around what API OpenHMD should provide for haptic feedback. I’ve implemented something simple for now, to get some discussion going. It works really well and adds hugely to the experience. That code is in the https://github.com/thaytan/OpenHMD/tree/rift-haptics branch, with a SteamVR-OpenHMD branch that uses it in https://github.com/thaytan/SteamVR-OpenHMD/tree/controller-haptics-wip

Problem areas

Unexpected tracking losses

I’d say the biggest problem right now is unexpected tracking losses and incorrect pose extractions. My right controller especially will suddenly glitch and start jumping around. Looking at a video of the debug feed, it’s not obvious why that’s happening:

To fix cases like those, I plan to add code to log the raw video feed and the IMU information together so that I can replay the video analysis frame-by-frame and investigate glitches systematically. Those recordings will also work as a regression suite to test future changes.

Sensor fusion efficiency

The Kalman filter I have implemented works really nicely – it does the latency compensation, predicts motion and extracts sensor biases all in one place… but it has a big downside of being quite expensive in CPU. The Unscented Kalman Filter CPU cost grows at O(n^3) with the size of the state, and the state in this case is 43 dimensional – 22 base dimensions, and 7 per latency-compensation slot. Running 1000 updates per second for the HMD and 500 for each of the hand controllers adds up quickly.

At some point, I want to find a better / cheaper approach to the problem that provides low-latency motion predictions for the user while still providing the same benefits around latency compensation and bias extraction.

Lens Distortion

To generate a convincing illusion of objects at a distance in a headset that’s only a few centimetres deep, VR headsets use some interesting optics. The image from the LCD/OLED panels gets distorted heavily by the lenses before it hits the user’s eyes. The software needs to compensate by applying the right inverse distortion to the output video.

Everyone that tests the CV1 notices that the distortion is not quite correct. As you look around, the world warps and shifts annoyingly. Sooner or later that needs fixing. That’s done by taking photos of calibration patterns through the headset lenses and generating a distortion model.

Camera / USB failures

The camera feeds are captured using a custom user-space UVC driver implementation that knows how to set up the special synchronisation settings of the CV1 and DK2 cameras, and then repeatedly schedules isochronous USB packet transfers to receive the video.

Occasionally, some people experience failures to re-schedule those transfers. The kernel rejects them with an out-of-memory error, failing to set aside DMA memory (even though the capture may have been running fine for quite some time). It’s not clear why that happens – but the end result at the moment is that the USB traffic for that camera dies completely and there’ll be no more tracking from that camera until the application is restarted.

Often once it starts happening, it will keep happening until the PC is rebooted and the kernel memory state is reset.

Occluded cases

Tracking generally works well when the cameras get a clear shot of each device, but there are cases like sighting down the barrel of a gun where we expect that the user will line up the controllers in front of one another, and in front of the headset. In that case, even though we probably have a good idea where each device is, it can be hard to figure out which LEDs belong to which device.

If we already have a good tracking lock on the devices, I think it should be possible to keep tracking even down to 1 or 2 LEDs being visible – but the pose assessment code will have to be aware that’s what is happening.

Upstreaming

April 14th marks 2 years since I first branched off OpenHMD master to start working on CV1 tracking. How hard can it be, I thought? I’ll knock this over in a few months.

Since then I’ve accumulated over 300 commits on top of OpenHMD master that eventually all need upstreaming in some way.

One thing people have expressed as a prerequisite for upstreaming is to try and remove the OpenCV dependency. The tracking relies on OpenCV to do camera distortion calculations, and for their PnP implementation. It should be possible to reimplement both of those directly in OpenHMD with a bit of work – possibly using the fast LambdaTwist P3P algorithm that Philipp Zabel wrote, that I’m already using for pose extraction in the brute-force search.

Others

I’ve picked the top issues to highlight here. https://github.com/thaytan/OpenHMD/issues has a list of all the other things that are still on the radar for fixing eventually.

Other Headsets

At some point soon, I plan to put a pin in the CV1 tracking and look at adapting it to more recent inside-out headsets like the Rift S and WMR headsets. I implemented 3DOF support for the Rift S last year, but getting to full positional tracking for that and other inside-out headsets means implementing a SLAM/VIO tracking algorithm to track the headset position.

Once the headset is tracking, the code I’m developing here for CV1 to find and track controllers will hopefully transfer across – the difference with inside-out tracking is that the cameras move around with the headset. Finding the controllers in the actual video feed should work much the same.

Sponsorship

This development happens mostly in my spare time and partly as open source contribution time at work at Centricular. I am accepting funding through Github Sponsorships to help me spend more time on it – I’d really like to keep helping Linux have top-notch support for VR/AR applications. Big thanks to the people that have helped get this far.

April 11, 2021

guile's reader, in guile

Good evening! A brief(ish?) note today about some Guile nargery.

the arc of history

Like many language implementations that started life when you could turn on the radio and expect to hear Def Leppard, Guile has a bottom half and a top half. The bottom half is written in C and exposes a shared library and an executable, and the top half is written in the language itself (Scheme, in the case of Guile) and somehow loaded by the C code when the language implementation starts.

Since 2010 or so we have been working at replacing bits written in C with bits written in Scheme. Last week's missive was about replacing the implementation of dynamic-link from using the libltdl library to using Scheme on top of a low-level dlopen wrapper. I've written about rewriting eval in Scheme, and more recently about how the road to getting the performance of C implementations in Scheme has been sometimes long.

These rewrites have a quixotic aspect to them. I feel something in my gut about rightness and wrongness and I know at a base level that moving from C to Scheme is the right thing. Much of it is completely irrational and can be out of place in a lot of contexts -- like if you have a task to get done for a customer, you need to sit and think about minimal steps from here to the goal and the gut doesn't have much of a role to play in how you get there. But it's nice to have a project where you can do a thing in the way you'd like, and if it takes 10 years, that's fine.

But besides the ineffable motivations, there are concrete advantages to rewriting something in Scheme. I find Scheme code to be more maintainable, yes, and more secure relative to the common pitfalls of C, obviously. It decreases the amount of work I will have when one day I rewrite Guile's garbage collector. But also, Scheme code gets things that C can't have: tail calls, resumable delimited continuations, run-time instrumentation, and so on.

Taking delimited continuations as an example, five years ago or so I wrote a lightweight concurrency facility for Guile, modelled on Parallel Concurrent ML. It lets millions of fibers exist on a system. When a fiber would need to block on an I/O operation (read or write), instead it suspends its continuation, and arranges to restart it when the operation becomes possible.

A lot had to change in Guile for this to become a reality. Firstly, delimited continuations themselves. Later, a complete rewrite of the top half of the ports facility in Scheme, to allow port operations to suspend and resume. Many of the barriers to resumable fibers were removed, but the Fibers manual still names quite a few.

Scheme read, in Scheme

Which brings us to today's note: I just rewrote Guile's reader in Scheme too! The reader is the bit that takes a stream of characters and parses it into S-expressions. It was in C, and now is in Scheme.

One of the primary motivators for this was to allow read to be suspendable. With this change, read-eval-print loops are now implementable on fibers.

Another motivation was to finally fix a bug in which Guile couldn't record source locations for some kinds of datums. It used to be that Guile would use a weak-key hash table to associate datums returned from read with source locations. But this only works for fresh values, not for immediate values like small integers or characters, nor does it work for globally unique non-immediates like keywords and symbols. So for these, we just wouldn't have any source locations.

A robust solution to that problem is to return annotated objects rather than using a side table. Since Scheme's macro expander is already set to work with annotated objects (syntax objects), a new read-syntax interface would do us a treat.

With read in C, this was hard to do. But with read in Scheme, it was no problem to implement. Adapting the expander to expect source locations inside syntax objects was a bit fiddly, though, and the resulting increase in source location information makes the output files bigger by a few percent -- due somewhat to the increased size of the .debug_lines DWARF data, but also due to serialized source locations for syntax objects in macros.

Speed-wise, switching to read in Scheme is a regression, currently. The old reader could parse around 15 or 16 megabytes per second when recording source locations on this laptop, or around 22 or 23 MB/s with source locations off. The new one parses more like 10.5 MB/s, or 13.5 MB/s with positions off, when in the old mode where it uses a weak-key side table to record source locations. The new read-syntax runs at around 12 MB/s. We'll be noodling at these in the coming months, but unlike when the original reader was written, at least now the reader is mainly used only at compile time. (It still has a role when reading s-expressions as data, so there is still a reason to make it fast.)

As is the case with eval, we still have a C version of the reader available for bootstrapping purposes, before the Scheme version is loaded. Happily, with this rewrite I was able to remove all of the cruft from the C reader related to non-default lexical syntax, which simplifies maintenance going forward.

An interesting aspect of attempting to make a bug-for-bug rewrite is that you find bugs and unexpected behavior. For example, it turns out that since the dawn of time, Guile always read #t and #f without requiring a terminating delimiter, so reading "(#t1)" would result in the list (#t 1). Weird, right? Weirder still, when the #true and #false aliases were added to the language, Guile decided to support them by default, but in an oddly backwards-compatible way... so "(#false1)" reads as (#f 1) but "(#falsa1)" reads as (#f alsa1). Quite a few more things like that.

All in all it would seem to be a successful rewrite, introducing no new behavior, even producing the same errors. However, this is not the case for backtraces, which can expose the guts of read in cases where that previously wouldn't happen because the C stack was opaque to Scheme. Probably we will simply need to add more sensible error handling around callers to read, as a backtrace isn't a good user-facing error anyway.

OK enough rambling for this evening. Happy hacking to all and to all a good night!

April 10, 2021

Calliope, slowly building steam

I wrote in December about Calliope, a small toolkit for building music recommendations. It can also be used for some automation tasks.

I added a bandcamp module which lists the albums in your Bandcamp collection. I sometimes buy albums and then don’t download them, because maybe I forgot or I wasn’t at home when I bought them. So I want to compare my Bandcamp collection against my local music collection and check if something is missing. Here’s how I did it:

# Albums in your online collection that are missing from your local collection.

ONLINE_ALBUMS="cpe bandcamp --user ssssam collection"
LOCAL_ALBUMS="cpe tracker albums"
#LOCAL_ALBUMS="cpe beets albums"

cpe diff --scope=album <($ONLINE_ALBUMS | cpe musicbrainz resolve-ids -) <($LOCAL_ALBUMS) 


Like all things in Calliope this outputs a playlist as a JSON stream, in this case, a list of all the albums I need to download:

{
  "album": "Take Her Up To Monto",
  "bandcamp.album_id": 2723242634,
  "location": "https://roisinmurphy.bandcamp.com/album/take-her-up-to-monto",
  "creator": "Róisín Murphy",
  "bandcamp.artist_id": "423189696",
  "musicbrainz.artist_id": "4c56405d-ba8e-4283-99c3-1dc95bdd50e7",
  "musicbrainz.release_id": "0a79f6ee-1978-4a4e-878b-09dfe6eac3f5",
  "musicbrainz.release_group_id": "d94fb84a-2f38-4fbb-971d-895183744064"
}
{
  "album": "LA OLA INTERIOR Spanish Ambient & Acid Exoticism 1983-1990",
  "bandcamp.album_id": 3275122274,
  "location": "https://lesdisquesbongojoe.bandcamp.com/album/la-ola-interior-spanish-ambient-acid-exoticism-1983-1990",
  "creator": "Various Artists",
  "bandcamp.artist_id": "3856789729",
  "meta.warnings": [
    "musicbrainz: Unable to find release on musicbrainz"
  ]
}

There are some interesting complexities to this, and in 12 hours of hacking I didn’t solve them all. Firstly, Bandcamp artist and album names are not normalized. Some artist names have spurious “The”, some album names have “(EP)” or “(single)” appended, so they don’t match your tags. These details are of interest only to librarians, but how can software tell the difference?

The simplest approach is to use Musicbrainz, specifically cpe musicbrainz resolve-ids. By comparing IDs where possible we get mostly good results. There are many albums not on Musicbrainz, though, which for now turn up as false positives. Resolving Musicbrainz IDs is a tricky process, too — how do we distinguish Multi-Love (album) from Multi-Love (single) if we only have an album name?

If you want to try it out, great! It’s still aimed at hackers — you’ll have to install from source with Meson and probably fix some bugs along the way. Please share the fixes!

April 09, 2021

New Shortwave release

Ten months later, after 14,330 added and 8,634 deleted lines, Shortwave 2.0 is available! It sports new features and, as always, comes with the usual round of improvements and bugfixes.

Optimised user interface

The user interface got ported from GTK3 to GTK4. During this process, many elements were improved or recreated from scratch. For example the station detail dialog window got completely overhauled:

New station details dialog

Huge thanks to Maximiliano, who did the initial port to GTK4!

Adaptive interface – taken to the next level

New mini player window mode

Shortwave has been designed from the beginning to handle any screen size. In version 2.0 we have been able to improve this even further. There is now a compact mini player for desktop screens, which still offers access to the most important functions in a tiny window.

Other noteworthy changes

  • New desktop notifications to notify you of new songs.
  • Improved keyboard navigation in the user interface.
  • Inhibit sleep/hibernate mode during audio playback.

Download

Shortwave is available to download from Flathub:

April 08, 2021

dLeyna updates and general project things

I have too many projects. That’s why I also picked up dLeyna, which was lying around looking a bit unloved, smacked fixes, GUPnP 1.2 and Meson support on top, and made new releases. They are available at:

  • https://github.com/phako/dleyna-core/archive/refs/tags/v0.7.0.tar.gz
  • https://github.com/phako/dleyna-connector-dbus/archive/refs/tags/v0.4.0.tar.gz
  • https://github.com/phako/dleyna-server/archive/refs/tags/v0.7.0.tar.gz
  • https://github.com/phako/dleyna-renderer/archive/refs/tags/v0.7.0.tar.gz

Furthermore I have filed an issue on upstream’s dLeyna-core component asking for the project to be transferred to GNOME infrastructure officially (https://github.com/intel/dleyna-core/issues/55).

As for all the other things I do: I was trying to write many versions of this blog post and each one sounded like an apology, which felt wrong. Ever since I changed jobs in 2018 I’m much more involved in coding during my work time again, and that seems to suck my “mental code reservoir” dry, meaning I have very little motivation to spend much time on designing and adding features, limiting most of the work on Rygel, GUPnP and Shotwell to the bare minimum. And I don’t know how and when this might change.

April 07, 2021

Governing Values-Centered Tech Non-Profits; or, The Route Not Taken by FSF

A few weeks ago, I interviewed my friend Katherine Maher on leading a non-profit under some of the biggest challenges an org can face: accusations of assault by leadership, and a growing gap between mission and reality on the ground.

We did the interview at the Free Software Foundation’s Libre Planet conference. We chose that forum because I was hopeful that the FSF’s staff, board, and membership might want to learn about how other orgs had risen to challenges like those faced by FSF after Richard Stallman’s departure in 2019. I, like many others in this space, have a soft spot for the FSF and want it to succeed. And the fact my talk was accepted gave me further hope.

Unfortunately, the next day it was announced at the same conference that Stallman would rejoin the FSF board. This made clear that the existing board tolerated Stallman’s terrible behavior towards others, and endorsed his failed leadership—a classic case of non-profit founder syndrome.

While the board’s action made the talk less timely, much of the talk is still, hopefully, relevant to any value-centered tech non-profit that is grappling with executive misbehavior and/or simply keeping up with a changing tech world. As a result, I’ve decided to present here some excerpts from our interview. They have been lightly edited, emphasized, and contextualized. The full transcript is here.

Sunlight Foundation: harassment, culture, and leadership

In the first part of our conversation, we spoke about Katherine’s tenure on the board of the Sunlight Foundation. Shortly after she joined, Huffington Post reported on bullying, harassment, and rape accusations against a key member of Sunlight’s leadership team.

[I had] worked for a long time with the Sunlight Foundation and very much valued what they’d given to the transparency and open data open government world. I … ended up on a board that was meant to help the organization reinvent what its future would be.

I think I was on the board for probably no more than three months, when an article landed in the Huffington Post that went back 10 years looking at … a culture of exclusion and harassment, but also … credible [accusations] of sexual assault.

And so as a board … we realized very quickly that there was no possible path forward without really looking at our past, where we had come from, what that had done in terms of the culture of the institution, but also the culture of the broader open government space.

Katherine

Practical impacts of harassment

Sunlight’s board saw immediately that an org cannot effectively grapple with a global, ethical technological future if the org’s leadership cannot grapple with its own culture of harassment. Some of the pragmatic reasons for this included:

The [Huffington Post] article detailed a culture of heavy drinking and harassment, intimidation.

What does that mean for an organization that is attempting to do work in sort of a progressive space of open government and transparency? How do you square those values from an institutional mission standpoint? That’s one [pragmatic] question.

Another question is, as an organization that’s trying to hire, what does this mean for your employer brand? How can you even be an organization that’s competitive [for hiring] if you’ve got this culture out there on the books?

And then the third pragmatic question is … [w]hat does this mean for like our funding, our funders, and the relationships that we have with other partner institutions who may want to use the tools?

Katherine

FSF suffers from similar pragmatic problems—problems that absolutely can’t be separated from the founder’s inability to treat all people as full human beings worthy of his respect. (Both of the tweets below lead to detailed threads from former FSF employees.)

Since the announcement of Stallman’s return, all top leadership of the organization have resigned, and former employees have detailed how the FSF staff has (for over a decade) had to deal with Richard’s unpleasant behavior, leading to morale problems, turnover, and even unionization explicitly to deal with RMS.

And as for funding, compare the 2018 sponsor list with the current, much shorter sponsor list.

So it seems undeniable: building a horrible culture has pragmatic impacts on an org’s ability to espouse its values.

Values and harassment

Of course, a values-centered organization should be willing to anger sponsors if it is important for their values. But at Sunlight, it was also clear that dealing with the culture of harassment was relevant to their values, and the new board had to ask hard questions about that:

The values questions, which … are just as important, were… what does this mean to be an organization that focuses on transparency in an environment in which we’ve not been transparent about our past?

What does it mean to be an institution that [has] progressive values in the sense of inclusion, a recognition that participation is critically important? … Is everyone able to participate? How can we square that with the institution that are meant to be?

And what do we do to think about justice and redress for (primarily the women) who are subjected to this culture[?]

Katherine

Unlike Sunlight, FSF is not about transparency, per se, but RMS at his best has always been very strong about how freedom had to be for everyone. FSF is an inherently political project! One can’t advocate for the rights of everyone if, simultaneously, one treats staff disposably and women as objects to be licked without their consent, and half the population (women) responds by actively avoiding the leadership of the “movement”.

So, in this situation, what is a board to do? In Sunlight’s case:

[Myself and fellow board member Zoe Reiter] decided that this was a no brainer, we had to do an external investigation.

The challenges of doing this… were pretty tough. [W]e reached out to everyone who’d been involved with the organization we also put not just as employees but also trying to find people who’ve been involved in transparency camps and other sorts of initiatives that Sunlight had had run.

We put out calls for participation on our blog; we hired a third party legal firm to do investigation and interviews with people who had been affected.

We were very open in the way that we thought about who should be included in that—not just employees, but anyone who had something that they wanted to raise. That produced a report that we then published to the general public, really trying to account for some of the things that have been found.

Katherine

The report Katherine mentions is available in two parts (results, recommendations) and is quite short (nine pages total).

While most of the report is quite specific to the Sunlight Foundation’s specific situation, the FSF board should particularly have read page 3 of the recommendations: “Instituting Board Governance Best Practices”. Among other recommendations relevant to many tech non-profits (not just FSF!), the report says Sunlight should “institute term limits” and “commit to a concerted effort to recruit new members to grow the Board and its capacity”.

Who can investigate a culture? When?

Katherine noted that self-scrutiny is not just something for large orgs:

[W]hen we published this report, part of what we were hoping for was that … we wanted other organizations to be able to approach this in similar challenges with a little bit of a blueprint for how one might do it. Particularly small orgs.

There were four of us on the board. Sunlight is a small organization—15 people. The idea that an even smaller organizations don’t have the resources to do it was something that we wanted to stand against and say, actually, this is something that every and all organizations should be able to take on regardless of the resources available to them.

Katherine

It’s also important to note that the need for critical self scrutiny is not something that “expires” if not undertaken immediately—communities are larger, and longer-lived, than the relevant staff or boards, so even if the moment seems to be in the relatively distant past, an investigation can still be valuable for rebuilding organizational trust and effectiveness.

[D]espite the fact that this was 10 years ago, and none of us were on the board at this particular time, there is an accounting that we owe to the people who are part of this community, to the people who are our stakeholders in this work, to the people who use our tools, to the people who advocated, who donated, who went on to have careers who were shaped by this experience.

And I don’t just mean, folks who were in the space still—I mean, folks who were driven out of the space because of the experiences they had. There was an accountability that we owed. And I think it is important that we grappled with that, even if it was sort of an imperfect outcome.

Katherine

Winding down Sunlight

As part of the conclusion of the report on culture and harassment, it was recommended that the Sunlight board “chart a new course forward” by developing a “comprehensive strategic plan”. As part of that effort, the board eventually decided to shut the organization down—not because of harassment, but because in many ways the organization had been so successful that it had outlived its purpose.

In Katherine’s words:

[T]he lesson isn’t that we shut down because there was a sexual assault allegation, and we investigated it. Absolutely not!

The lesson is that we shut down because as we went through this process of interrogating where we were, as an organization, and the culture that was part of the organization, there was a question of what would be required for us to shift the organization into a more inclusive space? And the answer is a lot of that work had already been done by the staff that were there…

But the other piece of it was, does it work? Does the world need a Sunlight right now? And the answer, I think, in in large part was not to do the same things that Sunlight had been doing. …

The organization spawned an entire community of practitioners that have gone on to do really great work in other spaces. And we felt as though that sort of national-level governmental transparency through tech wasn’t necessarily needed in the same way as it had been 15 years prior. And that’s okay, that’s a good thing.

Katherine

We were careful to say at Libre Planet that I don’t think FSF needs to shut down because of RMS’s terrible behavior. But the reaction of many, many people to “RMS is back on the FSF board” is “who cares, FSF has been irrelevant for decades”.

That should be of great concern to the board. As I sometimes put it—free licenses have taken over the world, and despite that the overwhelming consensus is that open won and (as RMS himself would say) free lost. This undeniable fact reflects very badly on the organization whose nominal job it is to promote freedom. So it’s absolutely the case that shutting down FSF, and finding homes for its most important projects in organizations that do not suffer from deep governance issues, should be an option the current board and membership consider.

Which brings us to the second, more optimistic topic: how did Wikimedia react to a changing world? It wasn’t by shutting down! Instead, it was by building on what was already successful to make sure they were meeting their values—an option that is also still very much available to FSF.

Wikimedia: rethinking mission in a changing world

Wikimedia’s vision is simple: “A world in which every single human can freely share in the sum of all knowledge.” And yet, in Katherine’s telling, it was obvious that there was still a gap between the vision, the state of the world, and how the movement was executing.

We turned 15 in 2016 … and I was struck by the fact that when I joined the Wikimedia Foundation, in 2014, we had been building from a point of our founding, but we were not building toward something.

So we were building away from a established sort of identity … a free encyclopedia that anyone can edit; a grounding in what it means to be a part of open culture and free and libre software culture; an understanding that … But I didn’t know where we were going.

We had gotten really good at building an encyclopedia—imperfect! there’s much more to do!—but we knew that we were building an encyclopedia, and yet … to what end?

Because “a free world in which every single human being can share in the sum of all knowledge”—there’s a lot more than an encyclopedia there. And there’s all sorts of questions:

About what does “share” mean?

And what does the distribution of knowledge mean?

And what does “all knowledge” mean?

And who are all these people—“every single human being”? Because we’ve got like a billion and a half devices visiting our sites every month. But even if we’re generous, and say, that’s a billion people, that is not the entirety of the world’s population.

Katherine

As we discussed during parts of the talk not excerpted here, usage by a billion people is not failure! And yet, it is not “every single human being”, and so WMF’s leadership decided to think strategically about that gap.

FSF’s leadership could be doing something similar—celebrating that GPL is one of the most widely-used legal documents in human history, while grappling with the reality that the preamble to the GPL is widely unheeded; celebrating that essentially every human with an internet connection interacts with GPL-licensed software (Linux) every day, while wrestling deeply with the fact that they’re not free in the way the organization hopes.

Some of the blame for that does in fact lie with capitalism and particular capitalists, but the leadership of the FSF must also reflect on their role in those failures if the organization is to effectively advance their mission in the 2020s and beyond.

Self-awareness for a successful, but incomplete, movement

With these big questions in mind, WMF embarked on a large project to create a roadmap, called the 2030 Strategy. (We talked extensively about “why 2030”, which I thought was interesting, but won’t quote here.)

WMF could have talked only to existing Wikimedians about this, but instead (consistent with their values) went more broadly, working along four different tracks. Katherine talked about the tracks in this part of our conversation:

We ran one that was a research track that was looking at where babies are born—demographics I mentioned earlier [e.g., expected massive population growth in Africa—omitted from this blog post but talked about in the full transcript.]

[Another] was who are our most experienced contributors, and what did they have to say about our projects? What do they know? What’s the historic understanding of our intention, our values, the core of who we are, what is it that motivates people to join this project, what makes our culture essential and important in the world?

Then, who are the people who are our external stakeholders, who maybe are not contributors in the sense of contributors to the code or contributors to the projects of content, but are the folks in the broader open tech world? Who are folks in the broad open culture world? Who are people who are in the education space? You know, stakeholders like that? “What’s the future of free knowledge” is what we basically asked them.

And then we went to folks that we had never met before. And we said, “Why don’t you use Wikipedia? What do you think of it? Why would it be valuable to you? Oh, you’ve never even heard of it. That’s so interesting. Tell us more about what you think of when you think of knowledge.” And we spent a lot of time thinking about what these… new readers need out of a project like Wikipedia. If you have no sort of structural construct for an encyclopedia, maybe there’s something entirely different that you need out of a project for free knowledge that has nothing to do with a reference—an archaic reference—to bound books on a bookshelf.

Katherine

This approach, which focused not just on the existing community but on data, partners, and non-participants, has been extensively documented at 2030.wikimedia.org, and can serve as a model for any organization seeking to re-orient itself during a period of change—even if you don’t have the same resources as Wikimedia does.

Unfortunately, this is almost exactly the opposite of the approach FSF has taken. FSF has become almost infamously insulated from the broader tech community, in large part because of RMS’s terrible behavior towards others. (The list of conference organizers who regret allowing him to attend their events is very long.) Nevertheless, given its important role in the overall movement’s history, I suspect that good faith efforts to do this sort of multi-faceted outreach and research could work—if done after RMS is genuinely at arms-length.

Updating values, while staying true to the original mission

The Wikimedia strategy process led to a vision that extended and updated, rather than radically changed, Wikimedia’s strategic direction:

By 2030, Wikimedia will become the essential infrastructure of the ecosystem of free knowledge, and anyone who shares our vision will be able to join us.

Wikipedia

In particular, the focus was around two pillars, which were explicitly additive to the traditional “encyclopedic” activities:

Knowledge equity, which is really around thinking about who’s been excluded and how we bring them in, and what are the structural barriers that enable that exclusion or created that exclusion, rather than just saying “we’re open and everyone can join us”. And how do we break down those barriers?

And knowledge as a service, which is without thinking about, yes, the technical components of what a service oriented architecture is, but how do we make knowledge useful beyond just being a website?

Katherine

I specifically asked Katherine about how Wikimedia was adding to the original vision and mission because I think it’s important to understand that a healthy community can build on its past successes without obliterating or ignoring what has come before. Many in the GNU and FSF communities seem to worry that moving past RMS somehow means abandoning software freedom, which should not be the case. If anything, this should be an opportunity to re-commit to software freedom—in a way that is relevant and actionable given the state of the software industry in 2021.

A healthy community should be able to handle that discussion! And if the GNU and FSF communities cannot, it’s important for the FSF board to investigate why that is the case.

Checklists for values-centered tech boards

Finally, at two points in the conversation, we went into what questions an organization might ask itself that I think are deeply pertinent for not just the FSF but virtually any non-profit, tech or otherwise. I loved this part of the discussion because one could almost split it out into a checklist that any board member could use.

The first set of questions came in response to a question I asked about Wikidata, which did not exist 10 years ago but is now central to the strategic vision of knowledge infrastructure. I asked if Wikidata had almost been “forced on” the movement by changes in the outside world, to which Katherine said:

Wikipedia … is a constant work in progress. And so our mission should be a constant work in progress too.

How do we align against a north star of our values—of what change we’re trying to effect in the world—while adapting our tactics, our structures, our governance, to the changing realities of the world?

And also continuously auditing ourselves to say, when we started, who, you know, was this serving a certain cohort? Does the model of serving that cohort still help us advance our vision today?

Do we need to structurally change ourselves in order to think about what comes next for our future? That’s an incredibly important thing, and also saying, maybe that thing that we started out doing, maybe there’s innovation out there in the world, maybe there are new opportunities that we can embrace, that will enable us to expand the impact that we have on the world, while also being able to stay true to our mission and ourselves.

Katherine

And to close the conversation, I asked how one aligns the pragmatic and organizational values as a non-profit. Katherine responded that governance was central, with again a great set of questions all board members should ask themselves:

[Y]ou have to ask yourself, like, where does power sit on your board? Do you have a regenerative board that turns over so that you don’t have the same people there for decades?

Do you ensure that funders don’t have outsize weight on your board? I really dislike the practice of having funders on the board, I think it can be incredibly harmful, because it tends to perpetuate funder incentives, rather than, you know, mission incentives.

Do you think thoughtfully about the balance of power within those boards? And are there … clear bylaws and practices that enable healthy transitions, both in terms of sustaining institutional knowledge—so you want people who are around for a certain period of time, balanced against fresh perspective.

[W]hat are the structural safeguards you put in place to ensure that your board is both representative of your core community, but also the communities you seek to serve?

And then how do you interrogate on I think, a three year cycle? … So every three years we … are meant to go through a process of saying “what have we done in the past three, does this align?” and then on an annual basis, saying “how did we do against that three year plan?” So if I know in 15 years, we’re meant to be the essential infrastructure free knowledge, well what do we need to clean up in our house today to make sure we can actually get there?

And some of that stuff can be really basic. Like, do you have a functioning HR system? Do you have employee handbooks that protect your people? … Do you have a way of auditing your performance with your core audience or core stakeholders so that you know that the work of your institution is actually serving the mission?

And when you do that on an annual basis, you’re checking in with yourself on a three year basis, you’re saying this is like the next set of priorities. And it’s always in relation to that that higher vision. So I think every nonprofit can do that. Every size. Every scale.

Katherine

The hard path ahead

The values that the FSF espouses are important and world-changing. And with the success of the GPL in the late 1990s, the FSF had a window of opportunity to become an ACLU of the internet, defending human rights in all their forms. Instead, under Stallman’s leadership, the organization has become estranged and isolated from the rest of the (flourishing!) digital liberties movement, and even from the rest of the software movement it was critical in creating.

This is not the way it had to be, nor the way it must be in the future. I hope our talk, and the resources I link to here, can help FSF and other value-centered tech non-profits grow and succeed in a world that badly needs them.

April 06, 2021

GNOME Internet Radio Locator 4.0.1 with KVRX on Fedora Core 33

GNOME Internet Radio Locator 4.0.1 with KVRX (Austin, Texas) features updated language translations, a new, improved map marker palette, and 125 other radio stations from around the world, with live audio streaming implemented through GStreamer.

The project lives on www.gnomeradio.org and Fedora 33 RPM packages for version 4.0.1 of GNOME Internet Radio Locator are now also available:

gnome-internet-radio-locator.spec

gnome-internet-radio-locator-4.0.1-1.fc33.src.rpm

gnome-internet-radio-locator-4.0.1-1.fc33.x86_64.rpm

To install GNOME Internet Radio Locator 4.0.1 on Fedora Core 33 in Terminal:

sudo dnf install http://www.gnome.org/~ole/fedora/RPMS/x86_64/gnome-internet-radio-locator-4.0.1-1.fc33.x86_64.rpm

“Getting Things GNOME” 0.5 released!

It is time to welcome a new release of the Rebuild of EvanGTGelion: 0.5, “You Can (Not) Improve Performance”!

This release of GTG has been 9 months in the making after the groundbreaking 0.4 release. While 0.4 was a major “perfect storm” overhaul, 0.5 is also a very technology-intensive release, even though it was done in a comparatively short timeframe.

Getting Things GNOME 0.5 brings a truckload of user experience refinements, bugfixes, a completely revamped file format and task editor, and a couple of notable performance improvements. It doesn’t solve every performance problem yet (some remain), but it certainly improves a bunch of them for workaholics like me. If 0.4 felt a bit like a turtle, 0.5 is definitely a much faster turtle.

If that was not enough already, it has some killer new features too. It’s the bee’s knees!

One performance improvement in particular requires the new version of liblarch, 3.1, which we released this month. GTG with the latest liblarch is available all-in-one in a Flatpak update near you. 📦

This release announcement and all that led up to it was, as you can imagine, planned using GTG:

“Who’s laughing now eh, señor Ruiz?”

As you can see, I came prepared. So continue reading below for the summary of improvements, I guarantee it’ll be worth it.


Some brief statistics

Beyond what is featured in these summarized release notes below, GTG itself has undergone roughly 400 changes affecting over 230 files, and received hundreds of bug fixes, as can be seen here. In fact, through this development cycle, we have crossed the 6,000 commits mark!

If you need another indication of how active the project has been in the last 9 months, I have received over 950 GTG-related emails since the 0.4 release last July.

We are thankful to all of our supporters and contributors, past and present, who made GTG 0.5 possible. Check out the “About” dialog for a list of contributors for this release.


New features

Recurring Tasks

"Repeating task" UI screenshot

Thanks to Mohieddine Drissi and Youssef Toukabri, tasks can now be made recurring (repeating). This allows you to keep reminding yourself that you need to clean your room every Saturday, and then not do it because it’s more fun to spend your whole Saturday fixing GTG bugs!

  • In the task editor, recurrence can be toggled with the dedicated menubutton that shows a nice user interface to manage the recurrence (daily, every other day, weekly, monthly and yearly).
  • Recurring tasks can also be added from the main window, using the “quick add” entry’s syntax.

Emojis as tag emblems 🦄

The old tag icons have been replaced by emojis. I decided it would be best for us to switch to emojis instead of icon-based emblems, because:

  • The availability of emblem icons varies a lot from one system to another, and from one icon theme to another;
  • In recent years, emblem icons on Linux have become quite a mishmash of styles, and their numbers in standard themes have also decreased;
  • Emojis are scalable by default, which makes them better for HiDPI displays (while traditional pixmap icons are still hit-or-miss);
  • There is a much, much, much wider choice of emojis (for every conceivable situation!) than there are emblem icons on typical Linux systems;
  • It’s simply easier to deal with, in the long run. 😌

Unfortunately, this means that when you upgrade from 0.4 to 0.5, the old emblem icons will disappear and you will need to choose new emblems for your tasks.

Undead Hamster

The GTG plugin to interface with Hamster has been brought back to life by Francisco Lavin (issue #114 and pull request #465). I have not personally tested this. I’m still a bit worried about the ferrets rising from their graves.

Performance improvements for power users

An elegant search trigger, for a more… civilized age

The filter-as-you-type global live search in GTG now uses a timeout approach: it waits a fraction of a second after you stop typing (a minimal sketch of the idea follows the list below). This changes many things:

  • No more spamming the database and the UI’s treeview
  • The UI stays much more responsive, and you can see your characters being typed smoothly instead of appearing 15 seconds later
  • It won’t hog your CPU nearly as much
  • You get search results much faster: the results now tend to show up within a second after you stop typing, rather than after a delay that grows with the number of tasks. For example, with 1000 tasks, it is now over 5 times faster (e.g. bringing the search time down from 17+ seconds to 3 seconds), and the difference would be even more drastic if you have more tasks than that.
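
The debounce idea itself is simple enough to sketch. Below is a minimal illustration in Python on top of GLib’s main loop; it is not GTG’s actual code, and the widget and callback names are hypothetical. Every keystroke reschedules a timeout, and the expensive search only runs once the timer fires:

    from gi.repository import GLib

    SEARCH_DELAY_MS = 300  # illustrative value: wait this long after the last keystroke

    class DebouncedSearch:
        """Run an expensive search only after the user stops typing."""

        def __init__(self, search_entry, run_search):
            self._run_search = run_search   # hypothetical callback that queries the task tree
            self._timeout_id = 0
            search_entry.connect("changed", self._on_changed)

        def _on_changed(self, entry):
            # Cancel the previously scheduled search, if any...
            if self._timeout_id:
                GLib.source_remove(self._timeout_id)
            # ...and schedule a new one a fraction of a second from now.
            self._timeout_id = GLib.timeout_add(SEARCH_DELAY_MS, self._fire, entry.get_text())

        def _fire(self, query):
            self._timeout_id = 0
            self._run_search(query)         # the expensive part now runs once per pause
            return GLib.SOURCE_REMOVE       # one-shot timeout: do not repeat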

Optimizations to avoid processing all views all the time

As a result of the changes proposed in pull request #530 and follow-up PR #547, for issue #402, most operations switching between various representations of your tasks will now be 20% to 200% faster (depending on how your tasks are structured):

  • Faster startup
  • Faster switching between tags (particularly when inside the “Actionable” view)
  • Faster midnight refreshing
  • It most likely makes the global search feature faster, too

However, changing between global view modes (Open/Actionable/Closed) is now a bit slower the first time around (but is instantaneous afterwards, until a filtering change occurs).

It’s a performance tradeoff, but it seems to be worth it. Especially when you have lots of tasks:

Note that to benefit from these changes, GTG 0.5 depends on the newly released version of liblarch (3.1) we have published this month.

Faster read/write operations on the data file 🚀

The switch to LXML as our parser ensures any operations on the local file format now happen instantly (less than a millisecond for normal/lightweight task lists, and up to 25 milliseconds for heavy-duty 1000-task lists like mine). This was originally such a big deal that we thought we’d nickname this release “¿Que parser, amigo?” … but other major improvements kept happening since then, so this release is not just about the pasa!
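
For illustration, here is roughly what reading and writing an XML task file looks like with lxml. This is a generic sketch, not GTG’s real file format or code; the <task> element and its attributes are made up:

    from lxml import etree

    def load_tasks(path):
        # lxml is a C library under the hood, so even large files parse in milliseconds.
        tree = etree.parse(path)
        return {
            task.get("id"): task.findtext("title", default="")
            for task in tree.iterfind(".//task")   # hypothetical <task> elements
        }

    def save_tasks(tree, path):
        # Serialize back to disk with a proper XML declaration.
        tree.write(path, encoding="UTF-8", xml_declaration=True, pretty_print=True)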

Fewer micro-freezes when focusing task windows

Mr. Bean waiting
Pictured: me, waiting for the task editor windows to un-freeze when I focus them.

Users with a thousand tasks will probably have noticed that GTG 0.4 had a tendency to painfully lock up for a couple of seconds when focusing/defocusing individual task editor windows.

Sure, “It doesn’t catastrophically lock up in your face multiple times per day anymore” doesn’t sound like a feature in theory, but it is a usability feature in practice.

Now, even when typing inside the new Task Editor and making lots of changes, it remains butter-smooth.

Part of it might be because it now uses a timer-based approach to parsing, and part of it is probably due to the code having been refactored heavily and being of higher quality overall. It just tends to be better all around now, and I’ve tried fairly hard to break it during my smoke-testing, especially since a lot of the improvements have also been tied to the major file format redesign that Diego implemented.

All in all, I’m not sure exactly how it accomplishes better reliability, to be honest, so we’ll just assume Diego secretly is a warlock.

50% more awesome task editing

Oh yeah, speaking of the task editor…

Rewriting the Task Editor’s content area and its natural language parser brought many user-visible improvements in addition to the technological improvements:

  • The new timeout-based parser is faster (because we’re not spamming it like barbarians), more reliable, and now gains the ability to recognize subtasks (lines starting with “-”) live, instead of “after pressing Enter.” This also makes copy-pasting from a dashed list easier, as it will be converted into many subtasks in one go (a rough sketch of this line classification follows the list below)
  • Clicking on a tag inside the task editor’s text selects the tag in the main window
  • Tags’ text highlight color now uses the tag’s chosen emblem color
  • Subtasks now have GTK checkboxes!
  • Completed subtasks show up in normal strikethrough text style
  • Opening a subtask takes care of closing the parent task, and the other way around (opening a parent task closes the subtask window), minimizing the amount of utility windows cluttering your view (and reducing the chances of making conflicting changes)
  • Now supports subheadings, in Markdown format (type # title). Now that it supports some basic rich-text formatting features, even Grumpy Sri will be happy with this release:
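
As promised above, here is a rough sketch of the kind of line classification involved (plain Python, not GTG’s actual TaskView code): a line starting with “-” becomes a subtask, and a line starting with “#” becomes a subheading.

    import re

    # Hypothetical rules, mirroring the behaviour described above.
    SUBTASK_RE = re.compile(r"^\s*-\s+(?P<title>.+)$")
    HEADING_RE = re.compile(r"^\s*#\s+(?P<title>.+)$")

    def classify_line(line):
        match = SUBTASK_RE.match(line)
        if match:
            return ("subtask", match.group("title"))
        match = HEADING_RE.match(line)
        if match:
            return ("subheading", match.group("title"))
        return ("text", line)

    # Pasting a dashed list therefore yields one subtask per line:
    for line in ["- buy milk", "# Groceries", "just some notes"]:
        print(classify_line(line))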

Here is a short demonstration video that lets you see the improved “parsing as you type” engine in action:

Other user interface improvements

General bug fixes

Technological changes

  • Eliminate the pyxdg dependency in favor of directly using GLib (a rough sketch of this, and of the new GLib-based command-line handling, follows this list)
  • dbus-python dependency dropped in favor of Gio (except in the Hamster plugin)
  • New and improved file format (see issue 279, “Better XML format [RFC]”, and my previous spooky blog post):
    • Automatically converts upon first startup, nothing to do manually
    • More consistent
    • Uses a single file instead of three separate ones
    • Uses valid XML
    • Fully documented (see docs/contributors/file format.md)
    • Adds file format versioning to make it future-proof
    • Uses LXML as the parser, which is reportedly 40 times faster (see issue #285)
    • Fundamentally allows the task editor to be a lot more reliable for parsing, as we ditch the XML-in-XML matryoshka approach
    • Solves a few other bugs related to (de)serializing
  • Rewrote the Task Editor’s “TaskView” (content area) and its natural language parser, paving the way for more improvements and increased reliability:
    • The timeout-based approach mentioned earlier means that parsing happens less often, “when it needs to”, and is also less error-prone as a result.
    • With the implementation of subheadings, the code now has the foundations for more Markdown features in the future.
    • The taskview supports invisible characters that turn visible when the cursor passes over them, which could be useful for more text tags in the future.
    • Easier to keep performance under control in the future: when running in debug mode (launch.sh -d), the process function now prints the time it took to process the buffer. This will help us stay conscious of performance.
    • Support for linking between tasks with gtg://[TASK_ID] (no UI for this yet, though)
    • Better code overall, a lot more extensible and easier to work on
  • Rewrote the Quick Add entry’s parsing and added unit tests
  • Command line parsing ported to GLib, so GLib/GTK-specific options such as --display are now available
  • GTG is now dbus-activatable. Although no API that other applications can use has been implemented yet (outside of “standard” GTK ones), this opens up possibilities for the future, such as GNOME Shell search integration.
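
As a rough illustration of the pyxdg and command-line items above (with a hypothetical application id and option, not the actual GTG code): PyGObject exposes the XDG directories directly through GLib, and Gio.Application can register options that GLib parses alongside its built-in ones (a Gtk.Application would additionally provide GTK options such as --display).

    import sys
    from gi.repository import GLib, Gio

    # XDG paths come straight from GLib, so pyxdg is no longer needed.
    data_dir = GLib.build_filenamev([GLib.get_user_data_dir(), "gtg"])
    config_dir = GLib.build_filenamev([GLib.get_user_config_dir(), "gtg"])

    class App(Gio.Application):
        def __init__(self):
            super().__init__(application_id="org.example.App",  # hypothetical id
                             flags=Gio.ApplicationFlags.HANDLES_COMMAND_LINE)
            # A custom option parsed by GLib's option machinery.
            self.add_main_option("debug", ord("d"), GLib.OptionFlags.NONE,
                                 GLib.OptionArg.NONE, "Enable debug output", None)

        def do_command_line(self, command_line):
            options = command_line.get_options_dict().end().unpack()
            print("data dir:", data_dir, "| debug:", options.get("debug", False))
            self.activate()
            return 0

        def do_activate(self):
            print("application activated")

    if __name__ == "__main__":
        App().run(sys.argv)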

Developer Experience improvements

Want to be part of a great team? Join BAHRAM GTG!

Here are some changes to primarily benefit users crazy enough to be called developers.

  • New plugin: “Developer Console”.
    Play with GTG’s internals from the comfort of GTG itself! Remember: when exiting your Orbital Frame, wear… a helmet.
  • New parameter in launch.sh: -w to enable warnings
    (in case you still are not wearing a helmet, or want to catch more bugs)
  • Better help message for launch.sh, documenting the available options
  • Include Git commit version in the about window
  • Use the proper exit code when the application quits cleanly
  • We closed all the tickets on Launchpad (over 300 of them) to avoid confusion about where to report bugs and about the app’s current quality and priorities, and to ensure that issues are reported in the bug tracker where development actually happens.

Help out Diego!

It is no secret that Diego has been spending a lot of time and effort making this release happen, just like the 0.4 release. The amount and depth of his changes are quite amazing, so it’s no surprise he’s the top contributor here:

If you would like to show your appreciation and support his maintainership work with some direct monthly donations to him, please consider giving a couple of dollars per month on his gumroad page or his liberapay page!

More coming in the next release

Many new features, plugins, and big code changes have been held back from merging so that we could release 0.5 faster, so a bunch of other changes will land in the development version on the way to 0.6.

“This blog post is ridiculously long! And we still have 10 pull requests almost ready to merge!”
— Diego, five minutes before I published this blog post

Spreading this announcement

We have made some social postings on Twitter, on Mastodon and on LinkedIn that you can re-share/retweet/boost. Please feel free to link to this announcement on forums and blogs as well!

Permanent Revolution

10 years ago today was April 6, 2011.

Windows XP was still everywhere. Smartphones were tiny, and not everyone had one yet. New operating systems were coming out left and right. Android phones had physical buttons, and webOS seemed to have a bright future. There was general agreement that the internet would bring about a better world, if only we could give everyone unrestricted access to it.

This was the world into which GNOME 3.0 was released.

I can’t speak to what it was like inside the project back then; this is all way before my time. I was still in high school, and though I wasn’t personally contributing to any free software projects yet, I remember it being a very exciting moment.

Screenshot of the GNOME 3.0 live ISO showing Settings, Gedit, Calculator, and Evince in the overview

3.0 was a turning point. It was a clear sign that we’d not only caught up to, but confidently overtaken the proprietary desktops. It was the promise that everything old and crufty about computing could be overcome and replaced with something better.

As an aspiring designer and free software activist it was incredibly inspiring to me personally, and I know I’m not alone in that. There’s an entire generation of us who are here because of GNOME 3, and are proud to continue working in that tradition.

Here’s to permanent revolution. Here’s to the hundreds who worked on GNOME 3.

March 31, 2021

OBS Studio on Wayland

As of today, I’m happy to announce that all of the pull requests to make OBS Studio able to run as a native Wayland application, and capture monitors and windows on Wayland compositors, landed.

I’ve been blogging sparsely about my quest to make screencasting on Wayland a fluid and seamless experience for a couple of years now. This required some work throughout the stack: from making Mutter able to hand DMA-BUF buffers to PipeWire, to improving the GTK desktop portal, to creating a plugin for OBS Studio, to fixing bugs in PipeWire. It was a considerable amount of work.

But I think none of it would matter if this feature were not easily accessible to everyone. The built-in screen recorder of GNOME Shell already works, but most importantly, we need to make sure applications are able to capture the screen properly. Sadly our hands are tied when it comes to proprietary apps; there’s just no way to contribute. But free and open source software allows us to do that! Fortunately, not only is OBS Studio distributed under the GPL, it is also a pretty popular app with an active community. That’s why, instead of creating a fork or just maintaining a plugin, I decided to go the long, hard route of proposing everything to OBS Studio itself.

The Road to Native Wayland

Making OBS Studio work on Wayland was a long road indeed, but fortunately other contributors attempted to do it before I did, and my pull requests were entirely based on their fantastic work. It took some time, but eventually the 3 big pull requests making OBS Studio able to run as a native Wayland application landed.

OBS Studio running as a native Wayland client. An important step, but not very useful without a way to capture monitors or windows.

After that, the next step was teaching OBS Studio how to create textures from DMA-BUF information. I wrote about this in the past, but the tl;dr is that implementing a monitor or window capture using DMA-BUFs means we avoid copying buffers from GPU memory to RAM, which is usually the biggest bottleneck when capturing anything. Exchanging DMA-BUFs is essentially about passing a few ids (integers) around, which is evidently much faster than copying dozens of megabytes of image data per second.

Fortunately for us, this particular feature also landed, introducing a new function, gs_texture_create_from_dmabuf(), which enables creating a gs_texture_t from DMA-BUF information. Like many other DMA-BUF APIs, it is pretty verbose since it needs to handle multiplanar buffers, but I believe this API is able to handle pretty much anything related to DMA-BUFs. This API is being documented and will freeze and become stable soon, with the release of OBS Studio 27, so make sure to check it out and see if there’s anything wrong with it!

This was the last missing piece of the puzzle to implement a Wayland-compatible capture.

PipeWire & Portals to the Rescue!

An interesting byproduct of the development of app sandboxing mechanisms is portals. Portals are D-Bus interfaces that provide various functionalities at runtime. For example, the Document portal allows applications with limited access to the filesystem to ask the user to select a file; the user is then presented with a file chooser dialog managed by the host system, and after selecting, the application will have access to that specific file only, and nothing else.

The portal that we’re interested in here is the Desktop portal, which provides, among others, a screencasting interface. With this interface, sandboxed applications can ask users to select either a window or a monitor, and a video stream is returned if the user selects something. The insecure nature of X11 allows applications to completely bypass this interface, but naturally it doesn’t work on Wayland. At most, Xwayland will give you an incomplete capture of some (or no) applications running through it.

It is important to note that despite being born and hosted under the Flatpak umbrella, portals are mostly independent of Flatpak. It is perfectly possible to use portals outside of a Flatpak sandbox, even when running as a Snap or an AppImage. It’s merely a bunch of D-Bus calls after all. Portals are also implemented by important Wayland desktops, such as GNOME, KDE, and wlroots, which should cover the majority of Wayland desktops out there.

Remember that I vaguely mentioned above that the screencast interface returns a video stream? This video stream is actually a PipeWire stream. PipeWire here is responsible for negotiating and exchanging buffers between the video producer (GNOME Shell, Plasma, etc) and the consumer (here, OBS Studio).
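
To make that flow a bit more concrete, here is a highly simplified sketch in Python with Gio (not OBS Studio’s actual C code). The real protocol is asynchronous: each portal method returns a Request object and the results arrive through its Response signal, a handshake that is glossed over here, and the token values are arbitrary.

    from gi.repository import Gio, GLib

    bus = Gio.bus_get_sync(Gio.BusType.SESSION, None)
    screencast = Gio.DBusProxy.new_sync(
        bus, Gio.DBusProxyFlags.NONE, None,
        "org.freedesktop.portal.Desktop",
        "/org/freedesktop/portal/desktop",
        "org.freedesktop.portal.ScreenCast",
        None)

    # 1. CreateSession: the options carry unique tokens. Real code subscribes to
    #    the Response signal of the returned Request object and reads the session
    #    handle from there before doing anything else.
    reply = screencast.call_sync(
        "CreateSession",
        GLib.Variant("(a{sv})", ({
            "handle_token": GLib.Variant("s", "obs_request_1"),
            "session_handle_token": GLib.Variant("s", "obs_session_1"),
        },)),
        Gio.DBusCallFlags.NONE, -1, None)
    print("request handle:", reply.unpack()[0])

    # 2. SelectSources(session, options) asks the user to pick a monitor or window;
    # 3. Start(session, parent_window, options) begins the screencast;
    # 4. OpenPipeWireRemote(session, options) returns a file descriptor that is
    #    handed to PipeWire, which then delivers the actual video frames.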

These mechanisms (portals, and PipeWire) were the basis of my obs-xdg-portal plugin, which was recently merged into OBS Studio itself as part of the built-in capture plugin! Fortunately, it landed just in time for the release of OBS Studio 27, which means soon everyone will be able to use OBS Studio on Wayland.

And, finally, capturing on Wayland works!

Meanwhile at Flatpakland…

While contributing these Wayland-related features, I got sidetracked a bit and did some digging into a Flatpak manifest for OBS Studio.

Thanks to the fantastic work by Bilal Elmoussaoui, there is a GitHub action that allows creating CI workflows that build Flatpaks using flatpak-builder. This made it possible to propose a new workflow for OBS Studio’s CI that generates a Flatpak bundle. It is experimental for now, but as we progress towards feature parity between Flatpak and non-Flatpak, it’ll eventually reach a point where we can propose it as a regular, non-experimental workflow.

In addition to that, Flatpak greatly helps me as a development tool, especially when used with GNOME Builder. The Flatpak manifest of OBS Studio is automatically detected and used to build and run it. Running OBS Studio is literally a one-click action:

Running OBS Studio with Flatpak on GNOME Builder


Next Steps

All these Wayland, PipeWire, and portals pull requests are only the first steps to make screencasting on Wayland better than on X11. There’s still a lot to do and fix, and contributions would be more than welcome.

For a start, the way the screencast interface currently works doesn’t mix well with OBS Studio’s workflow. Each capture pops up a dialog to select a monitor or a window, and that’s not exactly a fantastic experience. If you have complex scenes with many window or screen captures, a swarm of dialogs will pop up. This is clearly not great UX, and improving it would be a good next step. Fortunately, portals are extensible enough to allow implementing a more suitable workflow that, for example, saves and restores previous sessions.

Since this was tested on a relatively small number of hardware setups and environments, I’m sure we’ll need a round of bugfixes once people start using this code more heavily.

There’s also plenty of room for improvements on the Flatpak front. My long-term goal is to make OBS Studio’s CI publish stable releases to Flathub directly, and unstable releases to Flathub Beta, all automatically with flat-manager. This will require fixing some problems with the Flatpak package, such as the obs-browser plugin not working inside a sandbox (it uses CEF, the Chromium Embedded Framework, which apparently doesn’t enjoy the sandbox’s PID remapping) or on Wayland (Chromium barely supports Wayland for now).

Of course, I have no authority over what’s going to be accepted by the OBS Studio community, but these goals don’t seem to be controversial there, and they might be great ways to further improve the state of screencasting on Wayland.

I’d like to thank everyone who has been involved in this effort; from the dozens of contributors that tested the Wayland PRs, to the OBS Studio community that’s been reviewing all this code and often helping me, to the rest of the Flatpak and GNOME communities that built the tooling that I’ve been using to improve OBS Studio.