December 08, 2023

v8's mark-sweep nursery

Today, a followup to yesterday’s note with some more details on V8’s new young-generation implementation, minor mark-sweep or MinorMS.

A caveat again: these observations are just from reading the code; I haven’t run these past the MinorMS authors yet, so any of these details might be misunderstandings.

The MinorMS nursery consists of pages, each of which is 256 kB, unless huge-page mode is on, in which case they are 2 MB. The default total size of the nursery is 72 MB, or 144 MB if pointer compression is off.

There can be multiple threads allocating into the nursery, but let’s focus on the main allocator, which is used on the main thread. Nursery allocation is bump-pointer, whether in a MinorMS page or scavenger semi-space. Bump-pointer regions are called linear allocation buffers, and often abbreviated as Lab in the source, though the class is LinearAllocationArea.

If the current bump-pointer region is too small for the current allocation, the nursery implementation finds another one, or triggers a collection. For the MinorMS nursery, each page collects the set of allocatable spans in a free list; if the free list is non-empty, it pops off one entry as the current Lab and tries again.

Otherwise, MinorMS needs another page, and specifically a swept page: a page which has been visited since the last GC, and whose spans of unused memory have been collected into a free-list. There is a concurrent sweeping task which should usually run ahead of the mutator, but if there is no swept page available, the allocator might need to sweep some. This logic is in MainAllocator::RefillLabMain.
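Putting those two paragraphs together, here is a rough Python-flavoured sketch of the refill flow as I read it; the names are invented, and the real logic is the C++ in MainAllocator::RefillLabMain, so the details may well differ.

def refill_lab(current_page, swept_pages, try_sweep_next_page, size_needed):
    # Hypothetical helpers: swept_pages is the pool the concurrent sweeper
    # feeds, try_sweep_next_page() sweeps one page on demand (or returns None).
    page = current_page
    while page is not None:
        # 1. Pop spans off this page's free list until one is big enough.
        while page.free_list:
            span = page.free_list.pop()
            if span.size >= size_needed:
                return span          # this span becomes the new Lab
        # 2. Otherwise move on to a page the sweeper has already prepared,
        #    or sweep one ourselves if the sweeper has fallen behind.
        page = swept_pages.pop() if swept_pages else try_sweep_next_page()
    # 3. No swept page left: the caller has to trigger a minor collection.
    return None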

Finally, if all pages are swept and there’s no Lab big enough for the current allocation, we trigger collection from the roots. The initial roots are the remembered set: pointers from old objects to new objects. Most of the trace happens concurrently with the mutator; when the nursery utilisation rises over 90%, V8 will kick off concurrent marking tasks.

Then once the mutator actually runs out of space, it pauses, drains any pending marking work, marks conservative roots, then drains again. I am not sure whether MinorMS with conservative stack scanning visits the whole C/C++ stacks or whether it manages to install some barriers (i.e. “don’t scan deeper than 5 frames because we collected then, and so all older frames are older”); dunno. All of this logic is in MinorMarkSweepCollector::MarkLiveObjects.

Marking traces the object graph, setting object mark bits. It does not trace pages. However, the MinorMS space promotes in units of pages. So how to decide what pages to promote? The answer is that sweeping partitions the MinorMS pages into empty, recycled, aging, and promoted pages.

Empty pages have no surviving objects, and are very useful because they can be given back to the operating system if needed or shuffled around elsewhere in the system. If they are re-used for allocation, they do not need to be swept.

Recycled pages have some survivors, but not many; MinorMS keeps the page around for allocation in the next cycle, because it has enough empty space. By default, a page is recyclable if it has 50% or more free space after a minor collection, or 30% after a major collection. MinorMS also promotes a page eagerly if, in the last cycle, we only managed to allocate into 30% or less of its empty space, probably due to fragmentation. These pages need to be swept before re-use.

Finally, MinorMS doesn’t let pages be recycled indefinitely: after 4 minor cycles, a page goes into the aging pool, in which it is kept unavailable for allocation for one cycle, but is not yet promoted. This allows any new allocations on that page in the previous cycle to age out and probably die, preventing premature tenuring.
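As a toy summary of those heuristics, here is how I understand the post-sweep classification of a page; the thresholds come from the prose above, while the names and structure are mine, not V8’s.

def classify_swept_page(page, is_minor_gc, allocated_fraction_last_cycle):
    # page is a hypothetical record with size, live_bytes, free_bytes and
    # minor_cycles_survived fields.
    if page.live_bytes == 0:
        return "empty"            # reusable (or releasable) without sweeping
    recycle_threshold = 0.5 if is_minor_gc else 0.3
    if page.free_bytes / page.size < recycle_threshold:
        return "promoted"         # not enough free space to be worth keeping
    if allocated_fraction_last_cycle <= 0.3:
        return "promoted"         # too fragmented: we barely used it last cycle
    if page.minor_cycles_survived >= 4:
        return "aging"            # parked for one cycle, not yet promoted
    return "recycled"             # stays in the nursery; swept before re-use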

And that’s it. Next time, a note on a way in which generational collectors can run out of memory. Have a nice weekend, hackfolk!

Vivaldi Is Available on Flathub

The Vivaldi browser is now available on Flathub. It’s been available for 2 weeks now actually, but I didn’t find time to blog about it until today.

When Jon S. von Tetzchner, the CEO of Vivaldi, asked users for feedback on Mastodon, I replied that it’d be great to have Vivaldi available as a flatpak on Flathub. And he replied that he’d ask the engineering team to look into it since they had also seen a strong demand for Flatpak support among their Linux users on their forums.

They did look into it, but it was blocked by the conflict of the Chromium and Flatpak sandboxes for a long time. They were not sure if they could rely on Zypak. In the end it was made available on Flathub in a sort of semi-official way. It’s published by Vivaldi employee Ruarí Ødegaard with the company’s approval. And if all works out, the plan is to make it official.

As an old Opera browser fan (I used Opera for over a decade and was very active in the community), I have a soft spot for Vivaldi. I still plan to use Firefox primarily because that’s what we ship and what my team is working on and I want to keep dogfooding, but since I can now easily install Vivaldi on Fedora Silverblue, it will be my second browser.

Firefox and Vivaldi complement each other very well. Firefox aims to be just a browser and strives for simplicity. Vivaldi is truly a spiritual successor to the old Opera. It’s packed with features and offers much more than just a browser (email client, calendar, RSS reader…).

Vivaldi Browser

Both companies also share a similar mission to keep the web open and protect users’ privacy. Vivaldi is a small, employee-owned company with no venture capital that would eventually push them to compromise on their mission, no business model based on questionable crypto or injecting affiliate codes into links.

Vivaldi is off to a good start on Flathub with over 12k downloads already. If you’re looking for a powerful browser or Chromium-based alternative, give it a try.

#125 Portalled USB Devices

Update on what happened across the GNOME project in the week from December 01 to December 08.

Sovereign Tech Fund

Tobias Bernard reports

As part of the GNOME STF (Sovereign Tech Fund) project, a number of community members are working on infrastructure related projects.

Areas we’re currently investigating:

  • Exploring the constraints and options for fractional scaling
  • We are looking at the state of speech synthesizers on Linux, particularly in relation to the screen reader
  • We’re discussing the technical requirements for adaptive dialogs (bottom sheets)

GNOME Core Apps and Libraries

Tracker

A filesystem indexer, metadata storage system and search tool.

Sam Thursfield reports

Some memory usage fixes landed in the Tracker SPARQL database library this week:

  • Fewer memory allocation operations when processing text (!637), thanks to Carlos Garnacho
  • Memory leak in tracker:strip-punctuation() fixed (!639), thanks to Lukáš Tyrychtr

Third Party Projects

Giant Pink Robots! reports

I released the v2023.12.7 version of Varia, a new download manager written with GTK4 and Libadwaita. It was originally released a week prior, but now it has more essential features built in.

It’s available on Flathub: https://flathub.org/apps/io.github.giantpinkrobots.varia

Krafting says

I’m pleased to announce the release of Playlifin, a simple tool that aids in importing your YouTube music playlists to your Jellyfin server!

It’s now available on Flathub.

Phosh

A pure Wayland shell for mobile devices.

Guido announces

Phosh 0.34.0 is out. This includes an updated Wayland compositor (phoc) that works with the recently released wlroots 0.17.0 adding support for new Wayland protocols like security-context-v1 (to limit protocol access for flatpaks). We also fixed night light support and drag and drop on touch screens.

Check the full details here

Miscellaneous

Felipe Borges reports

We are happy to announce that GNOME is sponsoring two Outreachy internship projects for the December 2023 to March 2024 Outreachy internship round where they will be working on implementing end-to-end tests for GNOME OS using openQA.

Dorothy Kabarozi and Tanjuate Achaleke will be working with mentors Sam Thursfield and Sonny Piers.

Stay tuned to Planet GNOME for future updates on the progress of this project!

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

December 07, 2023

GNOME will have two Outreachy interns working on implementing end-to-end tests for GNOME OS using openQA

We are happy to announce that GNOME is sponsoring two Outreachy internship projects for the December 2023 to March 2024 Outreachy internship round where they will be working on implementing end-to-end tests for GNOME OS using openQA.

Dorothy Kabarozi and Tanjuate Achaleke will be working with mentors Sam Thursfield and Sonny Piers.

Stay tuned to Planet GNOME for future updates on the progress of this project!

This is a repost from https://discourse.gnome.org/t/announcement-gnome-will-have-two-outreachy-interns-working-on-implementing-end-to-end-tests-for-gnome-os-using-openqa

the last 5 years of V8's garbage collector

Captain, status report: I’m down here in a Jeffries tube, poking at V8’s garbage collector. However, despite having worked on other areas of the project recently, I find that V8 is now so large that it’s necessary to ignore whole subsystems when working on any given task. But now I’m looking at the GC in anger: what is its deal? What does V8’s GC even look like these days?

The last public article on the structure of V8’s garbage collector was in 2019; fine enough, but dated. Now in the evening of 2023 I think it could be useful to revisit it and try to summarize the changes since then. At least, it would have been useful to me had someone else written this article.

To my mind, work on V8’s GC has had three main goals over the last 5 years: improving interactions between the managed heap and C++, improving security, and increasing concurrency. Let’s visit these in turn.

C++ and GC

Building on the 2018 integration of the Oilpan tracing garbage collector into the Blink web engine, there was some refactoring to move the implementation of Oilpan into V8 itself. Oilpan is known internally as cppgc.

I find the cppgc name a bit annoying because I can never remember what it refers to, because of the other thing that has been happening in C++ integration: a migration away from precise roots and instead towards conservative root-finding.

Some notes here: with conservative stack scanning, we can hope for better mutator throughput and fewer bugs. The throughput comes from not having to put all live pointers in memory; the compiler can keep them in registers, and avoid managing the HandleScope. You may be able to avoid the compile-time and space costs of stack maps (side tables telling the collector where the pointers are). There are also two classes of bug that we can avoid: holding on to a handle past the lifetime of a handlescope, and holding on to a raw pointer (instead of a handle) during a potential GC point.

Somewhat confusingly, it would seem that conservative stack scanning has garnered the acronym “CSS” inside V8. What does CSS have to do with GC?, I ask. I know the answer but my brain keeps asking the question.

In exchange for this goodness, conservative stack scanning means that because you can’t be sure that a word on the stack refers to an object and isn’t just a spicy integer, you can’t move objects that might be the target of a conservative root. And indeed the conservative edge might not point to the start of the object; it could be an interior pointer, which places an additional constraint on the heap: it must be able to resolve interior pointers.

Security

Which brings us to security and the admirable nihilism of the sandbox effort. The idea is that everything is terrible, so why not just assume that no word is safe and that an attacker can modify any word they can address. The only way to limit the scope of an attacker’s modifications is then to limit the address space. This happens firstly by pointer compression, which happily also has some delightful speed and throughput benefits. Then the pointer cage is placed within a larger cage, and off-heap data such as Wasm memories and array buffers go in that larger cage. Any needed executable code or external object is accessed indirectly, through dedicated tables.

However, this indirection comes with a cost of a proliferation in the number of spaces. In the beginning, there was just an evacuating newspace, a mark-compact oldspace, and a non-moving large object space. Now there are closer to 20 spaces: a separate code space, a space for read-only objects, a space for trusted objects, a space for each kind of indirect descriptor used by the sandbox, in addition to spaces for objects that might be shared between threads, newspaces for many of the preceding kinds, and so on. From what I can see, managing this complexity has taken a significant effort. The result is pretty good to work with, but you pay for what you get. (Do you get security guarantees? I don’t know enough to say. Better pay some more to be sure.)

Finally, the C++ integration has also had an impact on the spaces structure, and with a security implication to boot. The thing is, conservative roots can’t be moved, but the original evacuating newspace required moveability. One can get around this restriction by pretenuring new allocations from C++ into the mark-compact space, but this would be a performance killer. The solution that V8 is going for is to use the block-structured mark-compact space that is already used for the old-space, but for new allocations. If an object is ever traced during a young-generation collection, its page will be promoted to the old generation, without copying. Originally called minor mark-compact or MinorMC in the commit logs, it was renamed to minor mark-sweep or MinorMS to indicate that it doesn’t actually compact. (V8’s mark-compact old-space doesn’t have to compact: V8 usually chooses to just mark in place. But we call it a mark-compact space because it has that capability.)

This last change is a performance hazard: yes, you keep the desirable bump-pointer allocation scheme for new allocations, but you lose on locality in the old generation, and the rate of promoted bytes will be higher than with the semi-space new-space. The only relief is that for a given new-space size, you can allocate twice as many objects, because you don’t need the copy reserve.
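To make that last point concrete, here is the back-of-the-envelope arithmetic as I understand it (my numbers, purely illustrative):

# A semi-space new-space of total size S keeps half as the copy reserve, so
# only S/2 is allocatable per cycle; a non-moving MinorMS space of the same
# size can hand out all of S, i.e. twice as much between collections.
S = 16 * 1024 * 1024                 # hypothetical 16 MB new-space
semispace_allocatable = S // 2       # 8 MB usable, 8 MB reserved for copying
minorms_allocatable = S              # 16 MB usable, no copy reserve
assert minorms_allocatable == 2 * semispace_allocatable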

Why do I include this discussion in the security section? Well, because most MinorMS commits mention this locked bug. One day we’ll know, but not today. I speculate that evacuating is just too rich a bug farm, especially with concurrency and parallel mutators, and that never-moving collectors will have better security properties. But again, I don’t know for sure, and I prefer to preserve my ability to speculate rather than to ask for too many details.

Concurrency

Speaking of concurrency, ye gods, the last few years have been quite the ride I think. Every phase that can be done in parallel (multiple threads working together to perform GC work) is now fully parallel: semi-space evacuation, mark-space marking and compaction, and sweeping. Every phase that can be done concurrently (where the GC runs threads while the mutator is running) is concurrent: marking and sweeping. A major sweep task can run concurrently with an evacuating minor GC. And, V8 is preparing for multiple mutators running in parallel. It’s all a bit terrifying but again, with engineering investment and a huge farm of fuzzers, it seems to be a doable transition.

Concurrency and threads means that V8 has sprouted new schedulers: should a background task have incremental or concurrent marking? How many sweepers should a given isolate have? How should you pause concurrency when the engine needs to do something gnarly?

The latest in-progress work would appear to be concurrent marking of the new-space. I think we should expect this work to result in a lower overall pause-time, though I am curious also to learn more about the model: how precise is it? Does it allow a lot of slop to get promoted? It seems to have a black allocator, so there will be some slop, but perhaps it can avoid promotion for those pages. I don’t know yet.

Summary

Yeah, GCs, man. I find the move to a non-moving young generation quite interesting and I wish the team luck as they whittle down the last sharp edges from the conservative-stack-scanning performance profile. The sandbox is pretty fun too. All good stuff and I look forward to spending a bit more time with it; engineering out.

December 06, 2023

Toby is Recovering in ER ICU

Normally I’m posting about code here, but for the past two weeks most of my time has been spent taking care of our 4 year old Australian Shepherd. Toby is very special to me and we even share the same birthday!

Toby recently lost control of his hind legs, which was related to a herniated disk and likely IVDD. For the past two weeks my wife and I have been on full-time care duty. Diapers, sponge baths, the whole gamut.

Previously we had X-Rays done but that type of imaging is not all that conclusive for spinal injuries. So yesterday he had a doggy MRI then rushed into surgery for his L1/L2 discs and a spinal tap. The spinal tap is to make sure the situation wasn’t caused by meningitis. It seems to have gone well but this morning he still hasn’t regained control of his hind legs (and that’s to be expected so soon after surgery). He does have feeling in them though, so that’s a positive sign.

He’s still a very happy boy and we got to spend a half hour with him this morning in the ICU.

Thanks to everyone who has been supporting us through this, especially with watching our 5 month kitten June who can be a bit of a rascal during the day. We wouldn’t be able to get any sleep without y’all.

Toby, a 4 year old Australian Shepherd with a soft plushy toy laying on a microfoam bed to ease the pain on his back.

To keep his mind active, Tenzing started to teach him to sing. He’s already got part of La Bouche’s “Be My Lover” down which is just too adorable not to share with you.

100 Million Firmware Updates Supplied By The LVFS

The LVFS has now supplied over 100 million updates to Linux machines all around the globe. The exact number is unknown, as we allow users to re-distribute updates without any kind of tracking, and also allow large companies or agencies to mirror the entire LVFS so the archive can be used offline; the true number of updates deployed is therefore probably a lot higher. Just 8 years ago Red Hat asked me to “make firmware updates work on Linux” and now we have a thriving set of projects that respect both your freedom and your privacy, and a growing ecosystem of hardware vendors who consider Linux users first class citizens. Every month we have two or three new vendors join; the logistical, security and most importantly commercial implications of not being “on the LVFS” are now too critical for IHVs, ODMs and OEMs to ignore.

Red Hat can certainly take a lot of credit for the undeniable success of LVFS and fwupd, as they have been paying my salary and pushing me forward over the last decade and more. Customer use of fwupd and LVFS is growing and growing – and planning for new fwupd/LVFS device support now happens months in advance to ensure fwupd is ready-to-go in long term support distributions like Red Hat Enterprise Linux. With infrastructure supplied and support paid for by the Linux Foundation, the LVFS really has a stable base that will be used for years to come.

The number of devices supported by the LVFS goes up and up every week, and I’m glad that the community around fwupd is growing at the same pace as its popularity. Google and Collabora have also been amazing partners in encouraging and helping vendors to ship updates on the LVFS and supporting fwupd in ChromeOS, and their trust and support has been invaluable. I’m also glad that “side-projects” like “GNOME Firmware”, “Host Security ID”, “fwupd friendly firmware” and “uSWID as a SBoM format” seem to be flourishing into independent projects in their own right.

Everybody is incredibly excited about the long term future of both fwupd and the LVFS and I’m looking forward to the next 100 million updates. A huge thank you to all that helped.

Calliope 10.0: creating music playlists using Tracker Miner FS

I just published version 10.0 of the open source playlist generation toolkit, Calliope. This fixes a couple of long standing issues I wanted to tackle.

SQLite Concurrency

The first of these only manifested itself as intermittent GitLab CI failures when you submitted pull requests. Calliope uses SQLite to cache data, and a cache may be used by multiple concurrent processes. SQLite has a “Write-Ahead Log” journalling mode that should excel at concurrency, but somehow I kept seeing “database is locked” errors from a test that verified the cache with multiple writers. Well – make sure to explicitly *close* database connections in your Python threads.
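Here is a minimal, self-contained illustration of the fix (not Calliope’s actual code): each writer thread opens its own connection to a WAL-mode database and, crucially, closes it explicitly instead of relying on garbage collection to release the write lock.

import sqlite3
import threading

DB_PATH = "cache.sqlite"        # hypothetical cache file

# Create the schema up front, in WAL mode, from the main thread.
setup = sqlite3.connect(DB_PATH)
setup.execute("PRAGMA journal_mode=WAL;")
setup.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, value TEXT)")
setup.commit()
setup.close()

def writer(n):
    # Each thread gets its own connection; sqlite3 connections must not be
    # shared across threads by default anyway.
    conn = sqlite3.connect(DB_PATH, timeout=30)
    try:
        conn.execute("INSERT OR REPLACE INTO cache VALUES (?, ?)", (f"key-{n}", "value"))
        conn.commit()
    finally:
        # The important bit: close explicitly so the write lock is released
        # promptly, instead of waiting for the connection to be collected.
        conn.close()

threads = [threading.Thread(target=writer, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()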

Content resolution with Tracker Miner FS

The second issue was content resolution using Tracker Miner FS, which worked nicely but very slowly. Some background here: “content resolution” involves finding a playable URL for a piece of music, given metadata such as the artist name and track name. Calliope can resolve content against remote services such as Spotify, and can also resolve against a local music collection using the Tracker Miner FS index. The “special mix” example, which generates nightly playlists of excellent music, takes a large list of songs from ListenBrainz and tries to resolve each one locally, to check it’s available and get the duration. Until now this took over 30 minutes at 100% CPU.

Why so slow? The short answer is: cpe tracker resolve was not using the Tracker FTS (Full Text Search) engine. Why? Because there are some limitations in Tracker FTS that means we couldn’t use it in all cases.

About Tracker FTS

The full-text search engine in Tracker uses the SQLite FTS5 module. Any resource type marked with tracker:fullTextIndexed can be queried using a special fts:match predicate. This is how Nautilus search and the tracker3 search command work internally. Try running this command to search your music collection locally for the word “Baby”:

tracker3 sparql --dbus-service org.freedesktop.Tracker3.Miner.Files \
-q 'SELECT ?title { ?track a nmm:MusicPiece; nie:title ?title; fts:match "Baby" }'

This feature is great for desktop search, but it’s not quite right for resolving music content based on metadata.

Firstly, it is doing a substring match. So if I search for the artist named “Eagles”, it will also match “Eagles of Death Metal” and any other artist that contains the word “Eagles”.

Secondly, symbol matching is very complicated, and the current Tracker FTS code doesn’t always return the results I want. There are at least two open issues, 400 and 171, about these bugs. It is tricky to get this right: is ' (U+0027) the same as ʽ (U+02BD)? What about ՚ (U+055A, the “Armenian Apostrophe”)? This might require some time+money investment in Tracker SPARQL before we get a fully polished implementation.
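A quick illustration of why this is awkward: the three characters above look alike but are distinct codepoints, so a plain string comparison treats titles written with each of them as different strings.

for ch in ("\u0027", "\u02BD", "\u055A"):
    # Prints the codepoint, the character, and whether it equals a plain ASCII
    # apostrophe; only the first one does.
    print(f"U+{ord(ch):04X}", ch, ch == "'")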

My solution in the meantime, sketched in code after the list, is as follows:

  1. Strip all words with symbols from the “artist name” and “track name” fields
  2. If one of the fields is now empty, run the slow lookup_track_by_name query which uses FILTER to do string matching against every track in the database.
  3. Otherwise, run the faster lookup_track_by_name_fts query. This uses both FTS *and* FILTER string matching. If FTS returns extra results, the FILTER query still picks the right one, but we are only doing string matching against the FTS results rather than the whole database.
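Roughly, the decision logic looks like this (an illustrative sketch, not Calliope’s actual code; run_query and the field handling are hypothetical, only the two query names come from the steps above):

import re

SYMBOL = re.compile(r"[^\w\s]")

def strip_symbol_words(text):
    # Step 1: drop any word that contains a non-alphanumeric symbol.
    return " ".join(w for w in text.split() if not SYMBOL.search(w))

def resolve_track(artist_name, track_name, run_query):
    artist_fts = strip_symbol_words(artist_name)
    track_fts = strip_symbol_words(track_name)
    if not artist_fts or not track_fts:
        # Step 2: slow path, FILTER-only matching against every track.
        return run_query("lookup_track_by_name",
                         artist=artist_name, track=track_name)
    # Step 3: fast path, FTS narrows the candidates and FILTER picks the
    # exact match from within them.
    return run_query("lookup_track_by_name_fts",
                     artist=artist_name, track=track_name,
                     artist_fts=artist_fts, track_fts=track_fts)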

Some unscientific profiling shows the special_mix script took 7 minutes to run last night, down from 35 minutes the night before. Success! And it’d be faster still if people could stop writing songs with punctuation marks in the titles.

Screenshot of a Special Mix playlist
Yesterday’s Special Mix.

Standalone SPARQL queries

You might think Tracker SPARQL and Tracker Miners have stood still since the Tracker 3.0 release in 2020. Not so. Carlos Garnacho has done huge amounts of work all over the two codebases bringing performance improvements, better security and better developer experience. At some point we need to do a review of all this stuff.

Anyway, precompiled queries are one area that improved, and it’s now practical to store all of an app’s queries in separate files. Today most Tracker SPARQL users still use string concatenation to build queries, so the query is hidden away in Python or C code in string fragments, and can’t easily be tested or verified independently. That’s not necessary any more. In Tracker Miners we already migrated to using standalone query files (here and here). I took the opportunity to do the same in Calliope.
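For anyone who hasn’t seen the pattern, here is a small sketch of what it looks like from Python, assuming a query file of my own invention named lookup_track_by_name.rq that uses a ~name parameter placeholder; the Tracker API calls are real, the file and query are illustrative.

import gi
gi.require_version("Tracker", "3.0")
from gi.repository import Tracker

# Connect to the Tracker Miner FS index over D-Bus.
conn = Tracker.SparqlConnection.bus_new(
    "org.freedesktop.Tracker3.Miner.Files", None, None)

# The query text lives in its own file, so it can be tested on its own with
# `tracker3 sparql --file` and is compiled to a statement once per process.
with open("lookup_track_by_name.rq") as f:
    stmt = conn.query_statement(f.read(), None)

stmt.bind_string("name", "Be My Lover")
cursor = stmt.execute(None)
while cursor.next(None):
    print(cursor.get_string(0)[0])
cursor.close()
conn.close()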

The advantages are clear:

  • no danger of “SPARQL injection” attacks, nor bugs caused by concatenation mistakes
  • a query is compiled to SQLite bytecode just once per process, instead of happening on every execution
  • you can check and lint the queries at build time (to do: actually write a SPARQL linter)
  • you can run and test the queries independently of the app, using tracker3 sparql --file. (Support for setting query parameters due to land Real Soon).

The only catch is some apps have logic in C or Python that affects the query functionality, which will need to be implemented in SPARQL instead. It’s usually possible but not entirely obvious. I got ChatGPT to generate ideas for how to change the SPARQL. Take the easy option! (But don’t trust anything it generates).


Next steps for Calliope

Version 10.0 is a nice milestone for the project. I have some ideas for more playlist generators but I am not sure when I’ll get more time to experiment. In fact I only got time for the work above because I was stuck on the sofa with a head-cold. In a way this is what success looks like.

December 05, 2023

Why does Gnome fingerprint unlock not unlock the keyring?

There's a decent number of laptops with fingerprint readers that are supported by Linux, and Gnome has some nice integration to make use of that for authentication purposes. But if you log in with a fingerprint, the moment you start any app that wants to access stored passwords you'll get a prompt asking you to type in your password, which feels like it somewhat defeats the point. Mac users don't have this problem - authenticate with TouchID and all your passwords are available after login. Why the difference?

Fingerprint detection can be done in two primary ways. The first is that a fingerprint reader is effectively just a scanner - it passes a graphical representation of the fingerprint back to the OS and the OS decides whether or not it matches an enrolled finger. The second is for the fingerprint reader to make that determination itself, either storing a set of trusted fingerprints in its own storage or supporting being passed a set of encrypted images to compare against. Fprint supports both of these, but note that in both cases all that we get at the end of the day is a statement of "The fingerprint matched" or "The fingerprint didn't match" - we can't associate anything else with that.

Apple's solution involves wiring the fingerprint reader to a secure enclave, an independently running security chip that can store encrypted secrets or keys and only release them under pre-defined circumstances. Rather than the fingerprint reader providing information directly to the OS, it provides it to the secure enclave. If the fingerprint matches, the secure enclave can then provide some otherwise secret material to the OS. Critically, if the fingerprint doesn't match, the enclave will never release this material.
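As a toy model of the contrast (not real code for fprintd, macOS, or any actual enclave), the difference is in what crosses the boundary on a match: a boolean versus key material.

def reader_style_match(scan, enrolled_prints):
    # Whichever side does the matching, all the OS ever learns is yes or no.
    return scan in enrolled_prints

class ToySecureEnclave:
    def __init__(self, enrolled_prints, keyring_secret):
        self._enrolled = enrolled_prints
        self._secret = keyring_secret     # never readable directly by the OS

    def authenticate(self, scan):
        # Only a successful match releases the secret, which the OS can then
        # use to decrypt the keyring; a failed match releases nothing.
        return self._secret if scan in self._enrolled else None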

And that's the difference. When you perform TouchID authentication, the secure enclave can decide to release a secret that can be used to decrypt your keyring. We can't easily do this under Linux because we don't have an interface to store those secrets. The secret material can't just be stored on disk - that would allow anyone who had access to the disk to use that material to decrypt the keyring and get access to the passwords, defeating the object. We can't use the TPM because there's no secure communications channel between the fingerprint reader and the TPM, so we can't configure the TPM to release secrets only if an associated fingerprint is provided.

So the simple answer is that fingerprint unlock doesn't unlock the keyring because there's currently no secure way to do that. It's not intransigence on the part of the developers or a conspiracy to make life more annoying. It'd be great to fix it, but I don't see an easy way to do so at the moment.


December 04, 2023

CapyPDF 0.7.0 released

Version 0.7.0 of CapyPDF is out. There is nothing major as such, just a bunch of minor features. As CapyPDF already supports a fair bit of stuff, the easiest way of finding out what can be done is to look at the unit test file.

Gradients, Gouraud meshes and Coons patches can now be defined in grayscale and CMYK in addition to RGB.

Support has also been added for separations. In PDF lingo this means that you can define new inks in addition to the basic CMYK ones. Typically these include metal inks or foils, Pantone colors and different kinds of glosses and varnishes.

2023-12-04 Monday

  • As part of my 4D chess / cunning plan to convince everyone I'm more foolish even than I seem - I got up super early, and took a train to London/St Pancras to discover I was a day early for my train; bother. Got some quiet code-reading in on the train home at least.
  • Planning call, sync with Naomi & Pedro, finalized annual reviews, and dispatched numbers left & right. Worked late.

December 03, 2023

2023-12-03 Sunday

  • Up early, off to visit Naomi in Loughborough; met her at Holywell Church and enjoyed the service together.
  • Out for a pub lunch together, and back for a guided tour of engineering things - interesting. Out to visit Ruth & boys very briefly to drop some hardware off.
  • Bid 'bye to N. and drove home; dinner, bed early.

Profiling Rust Applications With Sysprof

Sysprof is an enormously helpful tool that can be used for identifying performance problems in applications. However, there are a few things that need to be considered in order to get meaningful and useful results, especially for Rust applications.

With this blog post I want to provide a short step-by-step guide on what requirements need to be met, so that hopefully more people can make use of Sysprof.

For reference, I am using Fedora Silverblue 39, and have GNOME Builder 45 installed as a Flatpak from Flathub. Sysprof must be installed as a regular (RPM) package on the host system: on Silverblue that means layering it with rpm-ostree, and on regular Fedora Workstation installing it with dnf.

As an example for profiling I will use my application Fragments. The process and the results may differ on other distributions / systems, as this depends heavily on, for example, whether frame-pointers are enabled; having frame-pointers enabled is a hard requirement for achieving useful results with Sysprof.

Debug symbols

I start Fragments with Builder and select the “Run with Profiler” option.

Then I can open and inspect the generated syscap file with Sysprof:


I select fragments from the left sidebar so that I can see all the associated descendants. However, the results are not really helpful yet, as instead of the method names only “In File… org.gnome.Sdk….” is shown.

This is because the required debug symbols are missing and have to be installed first.

Make sure that you install the correct .Debug runtime for the corresponding version, which you have also specified in your Flatpak manifest (runtime / runtime-version, in my case master).

Rust and frame-pointers

With the right debug symbols installed, the results look much more readable:

However, it is noticeable that apparently almost exclusively function names of libraries (Gio, Glib, Gtk, …) appear, and nothing of Fragments itself.

This is because Fragments itself must also be compiled with frame-pointers enabled, which we can enforce with the Rust flag force-frame-pointers=yes – so let’s do that.

With objdump we can check if the compiled binary has frame-pointers enabled (and the specific registers aren’t clobbered):

If we have done everything correctly, this command should not return any output. But we do get output, and see some functions from the Rust standard library, even though we have frame-pointers enabled… How come?

The reason for this is that the Rust toolchain, and thus also the standard library, has been compiled without frame-pointers enabled (the Rust SDKs from Flathub repackage the official Rust-provided binaries).

To fix this, we can use -Zbuild-std --target=x86_64-unknown-linux-gnu. This will recompile the standard library during the build process (like a regular crate), while respecting our set compiler options/flags, especially force-frame-pointers=yes.

Since build-std is currently unstable, we have to switch from rust-stable to rust-nightly. Once build-std gets stabilised, or the Rust toolchain is compiled with frame-pointers enabled, we can switch back to rust-stable again.

If we run the same objdump command from above again, we get no output anymore. Which means, hooray, our binary has now been compiled without frame-pointers getting clobbered / optimised away!

Now, for the first time, we are getting results with Sysprof from which we can derive useful information. For example, we see that the Client::torrent_files function takes quite a bit of time, which is because we have to deserialise the JSON we get from transmission-daemon, which can be quite time consuming with long responses (and unfortunately happens synchronously, and not asynchronously).

Bonus: Better function names

We can improve the display of the function names by setting the Rust flag symbol-mangling-version=v0.

This way we use the new Rust v0 mangling scheme instead of the (currently still default) legacy one. This has the advantage for us that the function names can be displayed better and in more detail in Sysprof, since Sysprof has direct support for the Rust v0 scheme and therefore does not have to fall back to the generic C++ demangler.

Before (legacy):

serde_path_to_error::de::deserialize

After (v0):

serde_path_to_error::de::deserialize::<&mut serde_json::de::Deserializer<serde_json::read::StrRead>, transmission_client::rpc::response::RpcResponse<transmission_client::torrent::TorrentFilesList>>

Conclusion

All the changes I have described in this blog post are included in this merge request: Fragments!168

For more detailed information on how to use Sysprof itself, I recommend the Sysprof GNOME Wiki page, which contains some usage examples.

Thanks to Christian for Sysprof, and for the ability to use Sysprof together with Flatpak, which is a great benefit, especially for image-based systems like Silverblue.

December 01, 2023

#124 Fixes and Improvements

Update on what happened across the GNOME project in the week from November 24 to December 01.

Sovereign Tech Fund

Tobias Bernard announces

As part of our infrastructure initiative funded by the Sovereign Tech Fund, a number of community members have been hard at work for the past weeks.

Some highlights of what landed this week:

New projects started this week:

GNOME Core Apps and Libraries

GLib

The low-level core library that forms the basis for projects such as GTK and GNOME.

Philip Withnall says

gtk-doc has been removed from GLib and all the documentation is now generated using gi-docgen. Help is needed to update the syntax in doc comments to re-enable links in the documentation! See https://gitlab.gnome.org/GNOME/glib/-/issues/3037

GNOME Circle Apps and Libraries

Gaphor

A simple UML and SysML modeling tool.

Dan Yeaw says

Gaphor, the simple UML and SysML modeling tool, version 2.22.0 is now out! Some of the major changes include:

  • Add app preferences for overriding dark mode and for forcing the language to English. As part of this we helped improve libadwaita 1.4.0 support on macOS and Windows.
  • Proxy port improvements
  • Add allocations toolbox with allocate relationship item
  • Add members in model browser
  • Make line selection easier by increasing tolerance
  • Make model loading more lenient

The new version is available on Flathub.

Third Party Projects

دانیال بهزادی says

Carburetor 4.2.0 is released with new icons, and it’s now on Flathub too. Carburetor is built upon Libadwaita to let you easily set up a Tor proxy on your session, without getting your hands dirty with system configs. Initially aimed at simplifying life for GNOME enthusiasts on their mobiles, it’s now fully usable with mouse and/or keyboard too.

Pods

Keep track of your podman containers.

Marcus Behrendt reports

I have released Pods in version 2.0.0. The biggest changes are that volumes are now supported and that the basic layout of the interface has been redesigned. Pods now has a completely new look, with a sidebar and new libadwaita 1.4 widgets.

Kooha

Elegantly record your screen.

Dave Patrick says

Aside from a range of critical fixes, Kooha is getting a new UI to address previous design limitations and enhance the overall experience.

Documentation

Emmanuele Bassi announces

A new release of gi-docgen, 2023.2, the documentation generator for C libraries using gobject-introspection. Lots of quality of life improvements, as well as larger changes, like:

  • parse the default value attribute for GObject properties
  • a complete redesign of the search results, with better output on smaller form factors
  • support for admonitions (note, warning, important) inside the documentation blocks
  • display the implemented interfaces in the class pseudocode description
  • add a link in extra content files to their source in the code repository
  • and much more…

You can download gi-docgen 2023.2 from the GNOME file server or from PyPI.

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

November 30, 2023

Reducing Mutter dependencies

Mutter, if you don't know what it is, is a Wayland display server and an X11 window manager and compositor library. It is used by GNOME Shell, Gala, GNOME Kiosk and possibly other old forks out there.

The project was born from merging Clutter, Cogl and Metacity so it has/had a lot of cruft and technical debts carried over time.

In this blog post, I will go through the list of things I worked on the last few months to reduce that and my plans for the upcoming year.

Reducing build/runtime dependencies

After going through the list of dependencies used by Mutter, in a Wayland-only use case, I found the following ones that could potentially be completely removed or made optional:

zenity

A runtime dependency, it was used to display various "complex" X11-specific dialogues that can't be replicated easily using Clutter.

Those dialogs were no longer useful and the dependency got removed in !2370. Note that Mutter still uses Zenity for x11-specific tests.

libcanberra

Used by Mutter to play sounds in some contexts and also provides a public API to do so, in your custom shell. For embedded use cases, the functionality is not useful, especially since libcanberra ends up pulling gstreamer.

With !2375, the functionality can be disabled at build time with -Dsound_player=false making MetaSoundPlayer not produce any sound.

gnome-desktop

Used internally by Mutter to retrieve the monitor vendor name before falling back to "Undefined".

With !2317, the functionality can be disabled with -Dlibgnome_desktop=false

gnome-settings-daemon

A runtime dependency that provides the org.gnome.settings-daemon.peripherals.touchscreen orientation-lock setting, which is used to disable automatically updating the display orientation.

With !2398, Mutter checks if the settings schema is available first, making the runtime dependency optional.

gtk

Under X11, GTK is used for drawing the window decorations. Some parts of the GTK APIs ended up being used elsewhere for simplicity at the time. Ideally, we would soon have the possibility to build Mutter without X11 support, and keeping the GTK dependency in that case doesn't make much sense.

With !2407, the usage of GTK inside Mutter was reduced to drawing the decorations of client windows, which Carlos Garnacho took care of handling in !2175.

json-glib

Clutter could create ClutterActors from a JSON file, but nothing has used that feature since the merge of Clutter into Mutter. The library was also used to serialize ClutterPaintNodes for debugging purposes.

With !3354, both use cases were removed and the dependency was dropped.

gdk-pixbuf

The library was used to provide APIs for creating CoglTextures or CoglBitmap from a file and internally with MetaBackgroundImageCache for caching the background textures.

With !3097, we only have the MetaBackgroundImageCache use case which needs some refactoring.

cairo

When I looked at building Mutter without X11, I noticed that cairo ends up pulling some X11 bits through the cairo-xlib backend and wondered how hard it would be to completely get rid of cairo usage in the Wayland-only code path.

The first step in this adventure was to replace cairo_rectangle_int_t, especially since Mutter had a MetaRectangle type anyways...But that meant replacing cairo_region_t as well, which thankfully is just a ref counted pixman_region32_t wrapper. With those two changes, the cairo usage dropped drastically but was still far from enough.

  • MetaWindow:icon / MetaWindow:mini-icon uses a cairo_surface_t which probably can either be dropped or replaced with something like a CoglTexture
  • ClutterCanvas uses cairo for drawing. It is only used by StDrawingArea, a ClutterActor subclass provided in GNOME Shell. The idea is to merge both in GNOME Shell and drop that API from Mutter
  • pango / pangocairo integration, which opens the question of whether the whole font rendering bits should be moved to GNOME Shell and leave Mutter outside of that business or if it should be made optional
  • Other use cases requiring case-by-case handling

X11

Well, Wayland is no news and Xwayland support allows keeping backward compatibility with X11-only applications. The idea is to allow building Mutter as Wayland-only, with or without Xwayland.

I have opened #2272 which keeps track of the progress as well as the remaining tasks if someone is interested in helping with that!

Replacing/Removing deprecated APIs

Cogl/Clutter before being merged with Metacity contained various deprecated APIs that were intended to be removed in the next major release of those libraries, which never happened. With the help of Zander Brown and Florian Müllner, we have managed to significantly reduce those.

Cogl also contained a CoglObject base class that is not a GObject, for historical reasons. I took it as a personal challenge to get rid of it.

Improving documentation of the public API

Mutter and its various libraries already had a build time option to generate docs using gi-docgen, but the CI wasn't building and publishing them somewhere on GitLab Pages. I have also ported portions of the documentation to use gi-docgen annotations instead of the old gtk-doc way.

There is still a lot of work to be done to improve the quality of the documentation but we are getting there.

The docs are nowadays available at https://gitlab.gnome.org/GNOME/mutter#contributing.

Next year?

When I look back at the amount of changes I have managed to contribute to Mutter the last few months, I don't believe it was that much, but at the same time remind myself there is still a lot to be done, as usual.

For next year, I plan to continue pushing forward the effort to build Mutter without x11 / cairo / gdk-pixbuf and various upstream involvement where possible.

Till then, stay safe

November 29, 2023

Fedora Workstation 39 and beyond

I have not been so active for a while with writing these Fedora Workstation updates, and part of the reason was that I felt I was beginning to repeat myself a lot, which I partly felt was a side effect of writing them so often. But with some time now since my last update I felt the time was ripe again. So what are some of the things we have been working on and what are our main targets going forward? This is not an exhaustive list, but hopefully it contains items you find interesting. Apologies for weird sentences and potential spelling mistakes; it ended up a long post, and when you read your own words over for the Nth time you start going blind to issues :)

PipeWire

PipeWire 1.0 is available! PipeWire keeps the Linux Multimedia revolution rolling

So let’s start with one of your favorite topics, PipeWire. As you probably know, PipeWire 1.0 is now out and I feel it is a project we definitely succeeded with, so big kudos to Wim Taymans for leading this effort. I think the fact that we got both the creator of JACK, Paul Davis, and the creator of PulseAudio, Lennart Poettering, to endorse it means our goal of unifying the Linux audio landscape is being met. I include their endorsement comments from the PipeWire 1.0 release announcement here:

“PipeWire represents the next evolution of audio handling for Linux, taking
the best of both pro-audio (JACK) and desktop audio servers (PulseAudio) and
linking them into a single, seamless, powerful new system.”
– Paul Davis, JACK and Ardour author

“PipeWire is a worthy successor to PulseAudio, providing a feature set
closer to how modern audio hardware works, and with a security model
with today’s application concepts in mind. Version 1.0 marks a
major milestone in completing the adoption of PipeWire in the standard
set of Linux subsystems. Congratulations to the team!”
– Lennart Poettering, Pulseaudio and systemd author

So for new readers, PipeWire is an audio and video server we created for Fedora Workstation to replace PulseAudio for consumer audio and JACK for pro-audio, and to add similar functionality for video to your Linux operating system. So instead of having to deal with two different sound server architectures, users now just have to deal with one, and at the same time they get the same advantages for video handling. Since PipeWire implemented both the PulseAudio API and the JACK API, it is a drop-in replacement for both of them without needing any changes to the audio applications built for those two sound servers. Wim Taymans, alongside the amazing community that has grown around the project, has been hard at work maturing PipeWire and adding any missing feature they could find that blocked anyone from moving to it from either PulseAudio or JACK. Wim’s personal focus recently has been on an IRQ-based ALSA driver for PipeWire to be able to provide 100% performance parity with the old JACK server. So while a lot of pro-audio users felt that PipeWire’s latency was already good enough, this work by Wim shaves off the last few milliseconds to reach the same level of latency as JACK itself had.

In parallel with the work on PipeWire the community and especially Collabora has been hard at work on the new 0.5 release of WirePlumber, the session manager which handles all policy issues for PipeWire. I know people often get a little confused about PipeWire vs WirePlumber, but think of it like this: PipeWire provides you the ability to output audio through a connected speaker, through a bluetooth headset, through an HDMI connection and so on, but it doesn’t provide any ‘smarts’ for how that happens. The smarts are instead provided by WirePlumber which then contains policies to decide where to route your audio or video, either based on user choice or through preset policies making the right choices automatically, like if you disconnect your USB speaker it will move the audio to your internal speaker instead. Anyway, WirePlumber 0.5 will be a major step forward for WirePlumber moving from using lua scripts for configuration to instead using JSON for configuration while retaining lua for scripting. This has many advantages, but I point you to this excellent blog post by Collabora’s Ashok Sidipotu for the details. Ashok got further details about WirePlumber 0.5 that you can find here.

With PipeWire 1.0 out the door I feel we are very close to reaching one of our initial goals with PipeWire, to remove the need for custom pro-audio distributions like Fedora JAM or Ubuntu Studio, and instead just let audio folks be able to use the same great Fedora Workstation as the rest of the world. With 1.0 done Wim plans next to look a bit at things like configuration tools and similar used by pro-audio folks and also dive into the Flatpak portal needs of pro-audio applications more, to ensure that Flatpaks + PipeWire is the future of pro-audio.

On the video handling side it’s been a little slow going, since applications there need to be ported from relying directly on v4l. Jan Grulich has been working with our friends at Mozilla and Google to get PipeWire camera handling support into Firefox and Google Chrome. At the moment it looks like the Firefox support will land first; in fact Jan has set up a COPR that lets you try it out here. For tracking the upstream work in WebRTC to add PipeWire support, Jan set up this tracker bug. Getting the web browsers to use PipeWire is important both to enable the advanced video routing capabilities of PipeWire and to provide applications the ability to use libcamera, which is needed for modern MIPI cameras to work properly under Linux.

Another important application to get PipeWire camera support into is OBS Studio and the great thing is that community member Georges Stavracas is working on getting the PipeWire patches merged into OBS Studio, hopefully in time for their planned release early next year. You can track Georges work in this pull request.

For more information about PipeWire 1.0 I recommend our interview with Wim Taymans in Fedora Magazine and also the interview with Wim on Linux Unplugged podcast.

HDR
HDR, or High Dynamic Range, is another major effort for us. HDR is a technology I think many of you have become familiar with due to it becoming quite common in TVs these days. It basically provides for greatly increased color depth and luminance on your screen. This is a change that entails a lot of changes through the stack, because when you introduce it into an existing ecosystem like the Linux desktop you have to figure out how to combine both new HDR capable applications and content and old non-HDR applications and content. Sebastian Wick, Jonas Ådahl, Olivier Fourdan, Michel Daenzer and more on the team have been working with other members of the ecosystem from Intel, AMD, NVIDIA, Collabora and more to pick and define the standards and protocols needed in this space. A lot of design work was done early in the year, so we have been quite focused on implementation work across the drivers, Wayland, Mesa, GStreamer, Mutter, GTK+ and more. Some of the more basic scenarios, like running a fullscreen HDR application, are close to being ready, while we are still working hard on getting all the needed pieces together for the more complex scenarios like running SDR and HDR windows composited together on your desktop. So getting for instance full screen games to run in HDR mode with Steam should happen shortly, but the windowed support will probably land closer to summer next year.

Wayland remoting
One feature we have also been spending a lot of time on is enabling remote logins to a Wayland desktop. You have been able to share your screen under Wayland more or less from day one, but it required your desktop session to be already active. But let’s say you wanted to access your Wayland desktop running on a headless system: so far you have been out of luck and had to rely on the old X session instead. So putting in place all the pieces for this has been quite an undertaking, with work having been done on PipeWire, on Wayland portals, the GNOME remote desktop daemon, libei (the new input emulation library), GDM and more. The pieces needed are finally falling into place and we expect to have everything needed landed in time for GNOME 46. This support is currently done using a private GNOME API, but a vendor-neutral API is being worked on to replace it.

As a sidenote here, not directly related to desktop remoting: libei has also enabled us to bring XTEST support to XWayland, which was important for various applications including Valve’s gamescope.

NVIDIA drivers
One area we keep investing in is improving the state of NVIDIA support on Linux, partly by being the main company backing the continued development of the Nouveau graphics driver. The challenge with Nouveau is that for the longest while it offered next to no hardware acceleration for 3D graphics. The reason for this was that the firmware that NVIDIA provided for Nouveau to use didn’t expose that functionality, and since recent generations of NVIDIA cards only work with firmware signed by NVIDIA, this left us stuck. So Nouveau was a good tool for doing an initial install of a system, but if you were doing any kind of serious 3D acceleration, including playing games, then you would need to install the NVIDIA binary driver. In the last year the landscape around that has changed drastically, with the release of the new out-of-tree open source driver from NVIDIA. Alongside that driver a new firmware has also been made available, one that does provide full support for hardware acceleration.
Let me quickly inject an explanation of out-of-tree versus in-tree drivers here. An in-tree driver is basically a kernel driver for a piece of hardware that has been merged into the official Linux kernel from Linus Torvalds and is thus maintained as part of the official Linux kernel releases. This ensures that the driver integrates well with the rest of the Linux kernel and that it gets updated in sync with the rest of the Linux kernel. So Nouveau is an in-tree kernel driver which also integrates with the rest of the open source graphics stack, like Mesa. The new NVIDIA open source driver is an out-of-tree driver which ships as a separate source code release on its own schedule, but of course NVIDIA works to keep it working with the upstream kernel releases (which is a lot of work and thus considered a major downside of being an out-of-tree driver).

As of the time of writing this blog post, NVIDIA’s out-of-tree kernel driver and firmware are still a work in progress for display use cases, but that is changing with NVIDIA exposing more and more display features in the driver (and the firmware) with each new release they do. But if you saw the original announcement of the new open source driver from NVIDIA and have been wondering why no distribution relies on it yet, this is why. So what does this mean for Nouveau? Well, our plan is to keep supporting Nouveau for the foreseeable future because it is an in-tree driver, which makes it a lot easier to ensure it keeps working with each new upstream kernel release.

At the same time the new firmware updates allow Nouveau to eventually offer performance levels competitive with the official out-of-tree driver, kind of like how the open source AMD driver with Mesa offers comparable performance to the AMD binary GPU driver userspace. So Nouveau maintainer Ben Skeggs spent the last year working hard on refactoring Nouveau to work with the new firmware, and we now have a new release of Nouveau out showing the fruits of that labor, enabling support for NVIDIA’s latest chipset. Over time we will have it cover more chipsets and expand Vulkan and OpenGL (using Zink) support to be a full-fledged accelerated graphics driver.
Some news here: after having worked tirelessly on keeping Nouveau afloat for so many years, Ben decided he needed a change of pace and thus decided to leave software development behind for the time being. A big thank you to Ben from all of us at Red Hat and Fedora! The good news is that Danilo Krummrich will take over as the development lead, with Lyude Paul working specifically on the display side of the driver. We also expect to have other members of the team chipping in too. They will pick up Ben’s work and continue working with NVIDIA and the community on a bright future for Nouveau.

So, as I mentioned, the new open source driver from NVIDIA is still being matured for the display use case, and until it works fully as a display driver Nouveau will not be able to be a full alternative either, since they share the same firmware. So people will need to rely on the binary NVIDIA driver for some time still. One thing we are looking at there and discussing is whether there are ways for us to improve the experience of using that binary driver with Secure Boot enabled. At the moment that requires quite a bit of manual fiddling with tools like mokutil, but we have some ideas on how to streamline that a bit. It is a hard nut to crack though, due to a combination of policy issues, legal issues, security issues and hardware/UEFI bugs, so I am making no promises at this point, just a promise that it is something we are looking at.

Accessibility
Accessibility is an important feature for us in Fedora Workstation and thus we hired Lukáš Tyrychtr to focus on the issue. Lukáš has been working across the stack fixing issues blocking proper accessibility support in Fedora Workstation and has also participated in various accessibility-related events. There is still a lot to do there, so I was very happy to hear recently that the GNOME Foundation got a million-euro sponsorship from the Sovereign Tech Fund to improve various things across the stack, especially accessibility. The combination of Lukáš's continued efforts and that new investment should make for a much improved accessibility experience in GNOME and in Fedora Workstation going forward.

GNOME Software
Another area that we keep investing in is improving GNOME Software, with Milan Crha working continuously on bugfixing and performance improvements. GNOME Software is actually a fairly complex piece of software, as it has to handle the installation and updating of RPMs, OSTree system images, Flatpaks, fonts and firmware for us, in addition to the formats it handles for other distributions. For some time it felt like GNOME Software was struggling under the load of all those different formats and use cases, becoming slow and throwing a lot of error messages. Milan has been spending a lot of time dealing with those issues one by one and also recently landed some major performance improvements, making the GNOME Software experience a lot better. One major change that Milan is working on, which I think we will be able to land in Fedora Workstation 40/41, is porting GNOME Software to use DNF5. The main improvement end users will probably notice is that it unifies the caches used by GNOME Software and by dnf on the command line, saving you storage space and also ensuring the two are fully in sync on which RPMs are installed or updated at any given time.

Fedora and Flatpaks

Flatpaks are another key element of our strategy for moving the Linux desktop forward, and as part of that we have now enabled all of Flathub to be available if you choose to enable 3rd party repositories when you install Fedora Workstation. This means that the huge universe of applications available on Flathub will be easy to install through GNOME Software alongside the content available in Fedora's own repositories. That said, we have also spent time improving the ease of making Fedora Flatpaks. Owen Taylor jumped in and removed the dependency on a technology called 'modularity', which was initially introduced to Fedora to bring new features around having different types of content and to ease keeping containers up to date. Unfortunately it did not work out as intended and instead became something that everyone just felt made things a lot more complicated, including building Flatpaks from Fedora content. With Owen's updates, building Flatpaks in Fedora has become a lot simpler, which should help energize the effort of building Flatpaks in Fedora.

Toolbx
As we continue marching towards our vision of Fedora Workstation as a highly robust operating system, we keep evolving Toolbx, our tool for running your development environment(s) inside a container. It allows you to keep your host OS pristine and up to date, while at the same time using specific toolchains and tools inside the development container. This is a hard requirement for immutable operating systems such as Fedora Silverblue or Universal Blue, but it is also useful on operating systems like Fedora Workstation as a way to do development for other platforms, like for instance Red Hat Enterprise Linux.

A major focus for Toolbx since its inception has been to get it to a stage where it is robust and reliable. For instance, while we prototyped it as a shell script, today it is written in Go to be more maintainable and also to conform with the rest of the container ecosystem. A recent major step forward for that stability is that starting with Fedora 39, the toolbox image is now a release-blocking deliverable. This means it is now built as part of the nightly compose and the whole Toolbx stack (i.e. the fedora-toolbox image and the toolbox RPM) is part of the release-blocking test criteria. This shows the level of importance we put on Toolbx as the future of Linux software development and its criticality to Fedora Workstation. Earlier, we built the fedora-toolbox image as a somewhat separate and standalone thing, and people interested in Toolbx would try to test and keep the whole thing working, as much as possible, on their own. This was becoming unmanageable because Toolbx integrates with many parts of the distribution, from Mutter (i.e., the Wayland and X sockets) to Kerberos to RPM (i.e., %_netsharedpath in /usr/lib/rpm/macros.d/macros.toolbox) to glibc locale definitions and translations. The list of things that could change elsewhere in Fedora, and end up breaking Toolbx, was growing too large for a small group of Toolbx contributors to keep track of.

With the next release we now also have built-in support for Arch Linux and Ubuntu through the --distro flag in toolbox.git main, thanks again to the community contributors who worked with us on this, allowing us to widen the number of distros supported while keeping with our policy of reliability and dependability. And along the same theme of ensuring Toolbx is a tool developers can rely on, we have added lots and lots of new tests. We now have more than 280 tests that run on CentOS Stream 9, all supported Fedoras and Rawhide, and Ubuntu 22.04.
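
To illustrate how this looks in practice, creating containers for the newly supported distros should be roughly a one-liner each. The invocations below are sketches only; the release values are examples, so check toolbox create --help on your version for what is actually accepted:

# Hypothetical invocations; distro/release values are examples only
toolbox create --distro arch
toolbox create --distro ubuntu --release 22.04
toolbox enter --distro arch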

Another feature that Toolbx maintainer Debarshi Ray put a lot of effort into is setting up full RHEL containers in Toolbx on top of Fedora. Today, thanks to Debarshi's work, you run subscription-manager register --username user@domain.name on the Fedora or RHEL host, and the container is automatically entitled to RHEL content. We are still looking at how we can provide a graphical interface for that process, or at least how to polish up the CLI for doing subscription-manager register. If you are interested in this feature, Debarshi provides a full breakdown here.
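
For reference, the flow looks roughly like the sketch below; the release number is just an example and the exact flags may differ on your version:

# On the Fedora or RHEL host (example values)
sudo subscription-manager register --username user@domain.name
# Create and enter a RHEL container; entitlement is picked up automatically
toolbox create --distro rhel --release 9.2
toolbox enter --distro rhel --release 9.2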

Other nice-to-haves added include support for enterprise FreeIPA set-ups, where the user logs into their machine through Kerberos, and support for automatically generated shell completions for Bash, fish and Z shell.

Flatpak and Foreman & Katello
For those out there using Foreman to manage your fleet of Linux installs we have some good news. We are in the process of implementing support for Flatpaks in these tools, so that you can manage and deploy applications in the Flatpak format using them. This is still a work in progress, but the relevant Pulp and Katello commits are the Pulp commit "Support for Flatpak index endpoints" and the Katello commits "Reporting results of docker v2 repo discovery" and "Support Link header in docker v2 repo discovery".

LVFS
Another effort that Fedora Workstation has brought to the world of Linux, and that is very popular, is the LVFS and fwupd firmware update repository and tools. Thanks to that effort we are soon going to pass one hundred million firmware updates on Linux devices! These firmware updates have helped resolve countless bugs and much improved security for Linux users.

But we are not slowing down. Richard Hughes worked with industry partners this year to define a Bill of Materials definition for firmware updates, allowing users to be better informed about what is included in their firmware updates.

We now support over 1400 different devices on the LVFS (covering 78 different protocols!), with over 8000 public firmware versions from over 150 OEMs and ODMs. We’ve now done over 100,000 static analysis tests on over 2,000,000 EFI binaries in the firmware capsules!

Some examples of recently added hardware:
* AMD dGPUs, Navi3x and above, AVer FONE540, Belkin Thunderbolt 4 Core Hub dock, CE-LINK TB4 Docks, CH347 SPI programmer, EPOS ADAPT 1×5, Fibocom FM101, Foxconn T99W373, SDX12, SDX55 and SDX6X devices, Genesys GL32XX SD readers, GL352350, GL3590, GL3525S and GL3525 USB hubs, Goodix Touch controllers, HP Rata/Remi BLE Mice, Intel USB-4 retimers, Jabra Evolve 65e/t and SE, Evolve2, Speak2 and Link devices, Logitech Huddle, Rally System and Tap devices, Luxshare Quad USB4 Dock, MediaTek DP AUX Scalers, Microsoft USB-C Travel Hub, More Logitech Unifying receivers, More PixartRF HPAC devices, More Synaptics Prometheus fingerprint readers, Nordic HID devices, nRF52 Desktop Keyboard, PixArt BLE HPAC OTA, Quectel EM160 and RM520, Some Western Digital eMMC devices, Star Labs StarBook Mk VIr2, Synaptics Triton devices, System76 Launch 3, Launch Heavy 3 and Thelio IO 2, TUXEDO InfinityBook Pro 13 v3, VIA VL122, VL817S, VL822T, VL830 and VL832, Wacom Cintiq Pro 27, DTH134 and DTC121, One 13 and One 12 Tablets

InputLeap on Wayland
One really interesting feature that landed for Fedora Workstation 39 was the support for InputLeap. It's probably not on most people's radar, but it's an important feature for system administrators, developers and generally anyone with more than a single computer on their desk.

InputLeap is a fork of Barrier, which itself was a fork of Synergy. It allows you to share the same input devices (mouse, keyboard) across different computers (Linux, Windows, macOS) and to move the pointer between the screens of these computers seamlessly, as if they were one.

InputLeap has a client/server architecture with the server running on the main host (the one with the keyboard and mouse connected) and multiple clients, the other machines sitting next to the server machine. That implies two things, the InputLeap daemon on the server must be able to “capture” all the input events to forward them to the remote clients when the pointer reaches the edge of the screen, and the InputLeap client must be able to “replay” those input events on the client host to make it as if the keyboard and mouse were connected directly to the (other) computer. Historically, that relied on X11 mechanisms and neither InputLeap (nor Barrier or even Synergy as a matter of fact) would work on Wayland.

This is one of the use cases that Peter Hutterer had in mind when he started libEI, a low-level library aimed at providing a separate communication channel for input emulation in Wayland compositors and clients (even though libEI is not strictly tied to Wayland). But libEI alone is far from being sufficient to implement InputLeap features; with Wayland we had the opportunity to make things more secure than X11 and to benefit from the XDG portal mechanisms.

On the client side, for replaying input events, it's similar to remote desktop, but we needed to update the existing RemoteDesktop portal to pass the libEI socket. On the server side, it required a brand new portal for input capture. These also required their counterparts in the GNOME portal, for both RemoteDesktop and InputCapture, and of course all that needs to be supported by the Wayland compositor, which in the case of GNOME is mutter. That alone was a lot of work.

Yet, even with all that in place, those are just the basic requirements to support a Synergy/Barrier/InputLeap-like feature: the tools in question need to implement support for the portal and libEI to benefit from the mechanisms we've put in place, and for the whole feature to work and be usable. So libportal was also updated to support the new portal features, and a new "Wayland" backend was contributed to InputLeap alongside the X11, Windows and Mac OS backends.

The merge request in InputLeap was accepted very early, even before the libEI API was completely stabilized and before the rest of the stack was merged, which I believe was a courageous choice from Povilas (who maintains InputLeap) that helped reduce the time to have the feature actually working, considering the number of components and inter-dependencies involved. Of course, there are still features missing in the Wayland backend, like copy/pasting between hosts, but a clipboard interface was fairly recently added to the remote desktop portal and could therefore be used by InputLeap to implement that feature.

Fun fact: Xwayland also grew support for libEI, also using the remote desktop portal, and wires that to the XTEST extension on X11 that InputLeap's X11 backend uses, so it might even be possible to use the X11 backend of InputLeap on the client side through Xwayland. But of course it's better to use the Wayland backend on both the client and server sides.

InputLeap is a great example of collaboration between multiple parties upstream, including key contributions from us at Red Hat, to implement and contribute a feature that has been requested for years upstream.

Thank you to Olivier Fourdan, Debarshi Ray, Richard Hughes, Sebastian Wick and Jonas Ådahl for their contributions to this blog post.

November 27, 2023

GSoCxBMU event India

Namaste Everyone!

This is a quick blog to share my recent experience of speaking at BML Munjal University (BMU), India.

I was invited there as a speaker to shed some light on GSoC and Open Source in general for their GSoCxBMU event.

BMU campus

The event was great: we collectively shared the love of open source, and the students got to learn how they can participate in it and found out about GSoC and similar opportunities (Outreachy, LFX Mentorship, MLH Fellowship, GSoD, etc.).

Near the end, we also discussed the GNOME Foundation and the GNOME Project, where I shared some projects they can contribute to and details about GNOME internships.

GNOME Foundation slide

In the end, there was a short Q&A session where any queries were resolved, after which the students received some GNOME stickers that the engagement team had sent me, and we called it a day :D

Picture of attendees

The event was co-organised by GDSC BMU and Sata BMU in collaboration with GNOME Users Group, Delhi - NCR.

Organising team picture

Let's do more events and spread the love about FOSS :D

November 23, 2023

Local-First Workshop (feat. p2panda)

This week we had a local-first workshop at offline in Berlin, co-organized with the p2panda project. As I’ve written about before, some of us have been exploring local-first approaches as a way to sync data between devices, while also working great offline.

We had a hackfest on the topic in September, where we mapped out the problem space and discussed different potential architectures and approaches. We also realized that while there are mature solutions for the actual data syncing part with CRDT libraries like Automerge, the network and discovery part is more difficult than we thought.

Network Woes

The issues we need to address at the network level are the classic problems any distributed system has, including:

  • Discovering other peers
  • Connecting to the other peers behind a NAT
  • Encryption and authentication
  • Replication (which clients need what data?)

We had sketched out a theoretical architecture for first experiments at the last hackfest, using a WebRTC data channel to send data and hardcoding a public STUN server for rendezvous.

A few weeks after that I met Andreas from p2panda at an event in Berlin. He mentioned that in p2panda they have robust networking already, including mDNS discovery on the local network, remote peer discovery using rendezvous servers, p2p connections via UDP holepunching or relays, data replication, etc. Since we’re very interested in getting a low-fi prototype working sooner rather than later it seemed like a promising direction to explore.

p2panda

The p2panda project aims to provide a batteries-included SDK for easy local-first app development, including all the hard networking stuff mentioned above. It’s been around since about 2020, and is currently primarily developed by Andreas Dzialocha and Sam Andreae.

The architecture consists of nodes and clients. Nodes include networking, materialization, and an SQL database. Clients sign and create data, and interact with the node using a GraphQL API.

As of the latest release there’s TLS transport encryption between nodes, but end-to-end data-encryption using MLS is still being worked on, as well as a capabilities system and privacy-respecting deletion. Currently there’s a single key/value CRDT being used for all data, with no high-level way for apps to customize this.

The Workshop

The idea for the workshop was to bring together people from the GNOME and local-first communities, discuss the problem space, and do some initial prototyping.

For the latter Andreas prepared a little bookmark manager demo project (git repository) that people can open in Workbench and hack on easily. This demo runs a node in the background and accesses the database via GraphQL from a simple GTK frontend, written in Rust. The demo app automatically finds other peers on the local network and syncs the data between them.

Bookmark syncing demo using p2panda, running in Workbench

We had about 10 workshop participants with diverse backgrounds, including an SSB developer, a Mutter developer, and some people completely new to both local-first and GTK development. We didn't get a ton of hacking done due to time constraints (we had enough program for an all-day workshop, realistically :D), but a few people did start projects they plan to pursue after the workshop, including C/GObject bindings for p2panda-rs and an app/demo to sync a list of map locations. We also had some really good discussions on local-first architecture, and the GNOME perspective on this.

Thoughts on Local-First Architectures

The way p2panda splits responsibilities between components is optimized for simple client development, and being able to use it in the browser using the GraphQL API. All of the heavy lifting is done in the node, including networking, data storage, and CRDTs. It currently only supports one CRDT, which is optimized for database-style apps with lots of discrete fields.

One of the main takeaways from our previous hackfest was that data storage and CRDTs should ideally be in the client. Different apps need different CRDTs, because these encode the semantics of the data. For example, a text editor would need a custom text CRDT rather than the current p2panda one.

Longer-term we’ll probably want an architecture where clients have more control over their data to allow for more complex apps and diverse use cases. p2panda can provide these building blocks (generic reducer logic, storage providers, networking stack, etc.) but these APIs still need to be exposed for more flexibility. How exactly this could be done and if/how parts of the stack could be shared needs more exploration :)

Theoretical future architectures aside, p2panda is a great option for local-first prototypes that work today. We’re very excited to start playing with some real apps using it, and actually shipping them in a way that people can try.

What’s Next?

There’s a clear path towards first prototype GNOME apps using p2panda for sync. However, there are two constraints to keep in mind when considering ideas for this:

  • Data is not encrypted end-to-end for now (so personal data is tricky)
  • The default p2panda CRDT is optimized for key / value maps (more complex ones would need to be added manually)

This means that unfortunately you can’t just plug this into a GtkSourceView and have a Hedgedoc replacement. That said, there’s still lots of cool stuff you can do within these constraints, especially if you get creative in the client around how you use/access data. If you’re familiar with Rust, the Workbench demo Andreas made is a great starting point for app experiments.

Some examples of use cases that could be well-suited, because the data is structured, but not very sensitive:

  • Expense splitting (e.g. Splittypie)
  • Meeting scheduling (e.g. Doodle)
  • Shopping list
  • Apartment cleaning schedule

Thanks to Andreas Dzialocha for co-organizing the event and providing the venue, Sebastian Wick for co-writing this blog post, Sonny Piers for his help with Workbench, and everyone who joined the event. See you next time!

November 22, 2023

Linux Desktop Migration Tool 1.3

I made another release of Linux Desktop Migration Tool. This release includes migration of various secrets and certificates.

It can now migrate PKI certificates and the shared NSS database. It also exports, copies over, and imports existing GPG keys and SSH certificates and settings, and migrates GNOME Online Accounts and the GNOME keyring.

For security reasons libsecret has no API to change a keyring password. So I thought that I would have to instruct users to do it manually in Seahorse after the migration, but I was happy to learn that after you open the Login keyring (and most users only have this one) with the old password, it automatically changes it to the current user's password. So after logging in you're prompted to type the old password and that's it.

After the migration, I opened Evolution and everything was ready. All messages were there, all GOA accounts set up, it didn’t prompt me for passwords, my electronic signature certificate and GPG keys were also there to sign my messages.

November 20, 2023

Vilanova / Barcelona 2006: Memories of GUADECs past

Light shaft at Gaudi's Casa Batlló.  Windows with skull shapes and wooden framing, and a door with warm lighting from the inside.

Rear courtyard of the Casa Batlló, with capricious ironwork along the balconies.

Spire and parapet at the Administrative Pavilion at Parc Güell.  Both have curved, tactile shapes and are covered in blue and white broken tiles.

Bare Neo-Gothic windows at Sagrada Familia, where the stained glass has not been set yet.

Jim Gettys demonstrating One Laptop Per Child's X-O laptop.  It has funny ear-like antennae.

November 17, 2023

Status update, 17/11/2023

In Santiago we are just coming out of 33 consecutive days of rain. The all-time record here is 41 consecutive days, but in some less rainy parts of Galicia there have been actual records broken this autumn. It feels like I have been unnaturally busy but at least with some interesting outcomes.

I spent a lot of time helping Outreachy applicants get started with openQA testing for GNOME. I have been very impressed with the effort some candidates have put in. We have come a long way since October when some participants had yet to even install Linux. We already have a couple of contributions merged in openqa-tests (well done to Reece and Riya), and some more which will hopefully land soon. The final participants are announced next Monday 20th November.

Helping newcomers took up most of my free time, but I did also make some progress on openQA testing for apps, in that it's now actually possible 🙂 Let me know if you maintain an app and are interested in becoming an early adopter of openQA!

November 15, 2023

Introducing graphics offload

Some of us in the GTK team have spent the last month or so exploring the world of Linux kernel graphics APIs, in particular dmabufs. We are coming back from this adventure with some frustrations and some successes.

What is a dmabuf?

A dmabuf is a memory buffer in kernel space that is identified by a file descriptor. The idea is that you don’t have to copy lots of pixel data around, and instead just pass a file descriptor between kernel subsystems.

Reality is of course more complicated than this rosy picture: the memory may be device memory that is not accessible in the same way as 'plain' memory, and there may be more than one buffer (and more than one file descriptor), since graphics data is often split into planes (e.g. RGB and A may be separate, or Y and UV).

Why are dmabufs useful?

I’ve already mentioned that we hope to avoid copying the pixel data and feeding it through the GTK compositing pipeline (and with 4k video, that can be quite a bit of data for each frame).

The use cases where this kind of optimization matters are those where frequently changing content is displayed for a long time, such as

  • Video players
  • Virtual machines
  • Streaming
  • Screencasting
  • Games

In the best case, we may be able to avoid feeding the data through the compositing pipeline of the compositor as well, if the compositor supports direct scanout and the dmabuf is suitable for it.  In particular on mobile systems, this may avoid using the GPU altogether, thereby reducing power consumption.

Details

GTK has already been using dmabufs since 4.0: When composing a frame, GTK translates all the render nodes (typically several for each widget) into GL commands, sends those to the GPU, and mesa then exports the resulting texture as a dmabuf and attaches it to our Wayland surface.

But if the only thing that is changing in your UI is the video content that is already in a dmabuf, it would be nice to avoid the detour through GL and just hand the data directly to the compositor, by giving it the file descriptor for the dmabuf.

Wayland has the concept of subsurfaces that let applications defer some of their compositing needs to the compositor: The application attaches a buffer to each (sub)surface, and it is the job of the compositor to combine them all together.

With what is now in git main, GTK will create subsurfaces as-needed in order to pass dmabufs directly to the compositor. We can do this in two different ways: If nothing is drawn on top of the dmabuf (no rounded corners, or overlaid controls), then we can stack the subsurface above the main surface without changing any of the visuals.

This is the ideal case, since it enables the compositor to set up direct scanout, which gives us a zero-copy path from the video decoder to the display.

If there is content that gets drawn on top of the video, we may not be able to get that, but we can still get the benefit of letting the compositor do the compositing, by placing the subsurface with the video below the main surface and poking a translucent hole in the main surface to let it peek through.

The round play button is what forces the subsurface to be placed below the main surface here.

GTK picks these modes automatically and transparently for each frame, without the application developer having to do anything. Once that play button appears in a frame, we place the subsurface below, and once the video is clipped by rounded corners, we stop offloading altogether. Of course, the advantages of offloading also disappear.

The graphics offload visualization in the GTK inspector shows these changes as they happen:

Initially, the camera stream is not offloaded because the rounded corners clip it. The magenta outline indicates that the stream is offloaded to a subsurface below the main surface (because the video controls are on top of it). The golden outline indicates that the subsurface is above the main surface.

How do you use this?

GTK 4.14 will introduce a GtkGraphicsOffload widget, whose only job is to give a hint that GTK should try to offload the content of its child widget by attaching it to a subsurface instead of letting GSK process it like it usually does.
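
As a minimal sketch, assuming you already have a GdkPaintable for your video (here called stream, e.g. obtained from pipewire or gstreamer) and a GtkWindow called window, wiring it up could look roughly like this:

/* Sketch only: wrap the video widget so GTK may offload it to a subsurface.
   "stream" and "window" are assumed to already exist. */
GtkWidget *picture = gtk_picture_new_for_paintable (GDK_PAINTABLE (stream));
GtkWidget *offload = gtk_graphics_offload_new (picture);
gtk_window_set_child (GTK_WINDOW (window), offload);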

To create suitable content for offloading, the new GdkDmabufTextureBuilder wraps dmabufs in GdkTexture objects. Typical sources for dmabufs are pipewire, video4linux or gstreamer. The dmabuf support in gstreamer will be much more solid in the upcoming 1.24 release.
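
A rough sketch of what wrapping a dmabuf might look like follows; the size, format and file descriptor here are assumptions (real code gets them from pipewire, video4linux or gstreamer), and the DRM_FORMAT_* constants come from libdrm's drm_fourcc.h:

/* Sketch only: wrap a single-plane XRGB8888 dmabuf as a GdkTexture.
   "fd" is an already-acquired dmabuf file descriptor (assumed here). */
GdkDmabufTextureBuilder *builder = gdk_dmabuf_texture_builder_new ();
gdk_dmabuf_texture_builder_set_display (builder, gdk_display_get_default ());
gdk_dmabuf_texture_builder_set_width (builder, 1920);
gdk_dmabuf_texture_builder_set_height (builder, 1080);
gdk_dmabuf_texture_builder_set_fourcc (builder, DRM_FORMAT_XRGB8888);
gdk_dmabuf_texture_builder_set_modifier (builder, DRM_FORMAT_MOD_LINEAR);
gdk_dmabuf_texture_builder_set_n_planes (builder, 1);
gdk_dmabuf_texture_builder_set_fd (builder, 0, fd);
gdk_dmabuf_texture_builder_set_offset (builder, 0, 0);
gdk_dmabuf_texture_builder_set_stride (builder, 0, 1920 * 4);

GError *error = NULL;
GdkTexture *texture = gdk_dmabuf_texture_builder_build (builder, NULL, NULL, &error);

Since GdkTexture implements GdkPaintable, the resulting texture can then be shown with a GtkPicture inside the GtkGraphicsOffload widget sketched above.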

When testing this code, we used the GtkMediaStream implementation for pipewire by Georges Basile Stavracas Neto that can be found in pipewire-media-stream and libmks by Christian Hergert and Bilal Elmoussaoui.

What are the limitations?

At the moment, graphics offload will only work with Wayland on Linux. There is some hope that we may be able to implement similar things on MacOS, but for now, this is Wayland-only. It also depends on the content being in dmabufs.

Applications that want to take advantage of this need to play along and avoid doing things that interfere with the use of subsurfaces, such as rounding the corners of the video content. The GtkGraphicsOffload docs have more details for developers on constraints and how to debug problems with graphics offload.

Summary

The GTK 4.14 release will have some interesting new capabilities for media playback. You can try it now, with the just-released 4.13.3 snapshot.

Please try it and let us know what does and doesn’t work for you.

Rewriting nouveau’s Website

Introduction

We spent a whole week rewriting nouveau’s website — the drivers for NVIDIA cards. It started as a one-person effort, but it led to a few people helping me out. We addressed several issues in the nouveau website and improved it a lot. The redesign is live on nouveau.freedesktop.org.

In this article, we’ll go over the problems with the old site and the work we’ve done to fix them.

Problems With Old Website

I’m going to use this archive as a reference for the old site.

The biggest problem with the old site was that the HTML and CSS were written 15 years ago and have never been updated since. So in 2023, we were relying on outdated HTML/CSS code. Obviously, this was no fun from a reader’s perspective. With the technical debt and lack of interest, we were suffering from several problems. The only good thing about the old site was that it didn’t use JavaScript, which I wanted to keep for the rewrite.

Fun fact: the template was so old that it could be built for browsers that don’t support HTML5!

Not Responsive

“Responsive design” in web design means making the website accessible on a variety of screen sizes. In practice, a website should adapt to work on mobile devices, tablets, and laptops/computer monitors.

In the case of the nouveau website, it didn’t support mobile screen sizes properly. Buttons were hard to tap and text was small. Here are some screenshots taken in Firefox on my Razer Phone 2:

Small buttons and text in the navigation bar that are difficult to read and tap.

Small text in a table that forces the reader to zoom in.

No Dark Style

Regardless of style preferences, having a dark style/theme can help people who are sensitive to light and can save battery life on AMOLED displays. Dark styles are useful for those who absolutely need them, and the old site did not offer one.

No SEO

Search Engine Optimization (SEO) is the process of making a website more discoverable on search engines like Google. We use various elements such as title, description, icon, etc. to increase the ranking in search engines.

In the case of nouveau, there were no SEO efforts. If we look at the old nouveau homepage’s <head> element, we get the following:

<head>
<meta charset="utf-8">
<title>nouveau</title>
<link rel="stylesheet" href="style.css" type="text/css">
<link rel="stylesheet" href="xorg.css" type="text/css">
<link rel="stylesheet" href="local.css" type="text/css">
<link rel="alternate" type="application/x-wiki" title="Edit this page" href="https://gitlab.freedesktop.org/nouveau/wiki/-/edit/main/sources/index.mdwn">
</head>

The only thing of note there was a title, which is, obviously, far from desirable. The rest were CSS stylesheets, the wiki source link, and the character set.
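
For comparison, a more search-engine- and mobile-friendly <head> would add at least a viewport and a description, roughly along these lines (the description text and icon path here are made up for illustration, not what actually shipped):

<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta name="description" content="nouveau: open source drivers for NVIDIA GPUs">
<title>nouveau</title>
<link rel="icon" href="favicon.svg">
<link rel="stylesheet" href="style.css" type="text/css">
</head>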

Readability Issues

One of the biggest problems with nouveau’s website (apart from the homepage) is the lack of a maximum width. Large paragraphs stretch across the screen, making it difficult to read.

Process of Rewriting

Before I started the redesign, I talked to Karol Herbst, one of the nouveau maintainers. He had been wanting to redesign the nouveau site for ages, so I asked myself, “How hard can it be?” Well… mistakes were made.

The first step was to look at the repository and learn about the tools freedesktop.org uses for their website. freedesktop.org uses ikiwiki to generate the wiki. Problem is: it’s slow and really annoying to work with. The first thing I did was create a Fedora toolbox container. I installed the ikiwiki package to generate the website locally.

The second step was to rewrite the CSS and HTML template. I took a look at page.tmpl — the boilerplate. While looking at it, I discovered another problem: the template is unreadable. So I worked on that as well.

I ported to modern HTML elements, like <nav> for the navigation bar, <main> for the main content, and <footer> for the footer.

The third step was to rewrite the CSS. In the <head> tag above, we can see that the site pulls CSS from many sources: style.css, xorg.css, and local.css. So what I did was to delete xorg.css and local.css, delete the contents of style.css, and rewrite it from scratch. I copied a few things from libadwaita, namely its buttons and colors.
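
The actual stylesheet is in the merge request linked below, but the kind of CSS that addresses the readability and dark-style problems mentioned earlier is roughly this (the values here are illustrative, not the ones that shipped):

/* Cap line length so paragraphs don't stretch across wide screens */
main {
    max-width: 80ch;
    margin: 0 auto;
    padding: 0 1rem;
}

/* Follow the user's light/dark preference */
@media (prefers-color-scheme: dark) {
    body {
        background: #242424;
        color: #ffffff;
    }
}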

And behold… merge request !29!

Despite the success of the rewrite, I ran into a few roadblocks. I couldn’t figure out how to make the freedesktop.org logo dark style. Luckily, my friend kramo helped me out by providing an SVG file of the logo that adapts to dark style, based on Wikipedia’s. They also adjusted the style of the website to make it look nicer.

I also couldn’t figure out what to do with the tables because the colors were low contrast. Also, the large table on the Feature Matrix page was limited in maximum width, which would make it uncomfortable on large monitors. Lea from Fyra Labs helped with the tables and fixed the problems. She also adjusted the style.

After that, the rewrite was mostly done. Some reviewers came along and suggested some changes. Karol wanted the rewrite so badly that he opened a poll asking if he should merge it. It was an overwhelming yes, so… it got merged!

Conclusion

As Karol puts it:

“check out the nouveau repo, then cry, then reconsider your life choices”

In all seriousness, I’ve had a great time working on it. While this is the nouveau site in particular, I plan to eventually rewrite the entire freedesktop.org site. However, I started with nouveau because it was hosted on GitLab. Meanwhile, other sites/pages are hosted on freedesktop.org’s cgit instance, which were largely inaccessible for me to contribute to.

Ideally, we’d like to move from ikiwiki to something more modern, like a framework or a better generator, but we’ll have to see who’s willing to work on it and maintain it.

November 11, 2023

AppStream 1.0 released!

Today, 12 years after the meeting where AppStream was first discussed and 11 years after I released a prototype implementation I am excited to announce AppStream 1.0! 🎉🎉🎊

Check it out on GitHub, or get the release tarball or read the documentation or release notes! 😁

Some nostalgic memories

I was not in the original AppStream meeting, since in 2011 I was extremely busy with finals preparations and ball organization in high school, but I still vividly remember sitting at school in the students’ lounge during a break and trying to catch the really choppy live stream from the meeting on my borrowed laptop (a futile exercise, I watched parts of the blurry recording later).

I was extremely passionate about getting software deployment to work better on Linux and to improve the overall user experience, and spent many hours on the PackageKit IRC channel discussing things with many amazing people like Richard Hughes, Daniel Nicoletti, Sebastian Heinlein and others.

At the time I was writing a software deployment tool called Listaller – this was before Linux containers were a thing, and building it was very tough due to technical and personal limitations (I had just learned C!). Then in university, when I intended to recreate this tool, but for real and better this time as a new project called Limba, I needed a way to provide metadata for it, and AppStream fit right in! Meanwhile, Richard Hughes was tackling the UI side of things while creating GNOME Software and needed a solution as well. So I implemented a prototype and together we pretty much reshaped the early specification from the original meeting into what would become modern AppStream.

Back then I saw AppStream as a necessary side-project for my actual project, and didn't even consider myself the maintainer of it for quite a while (I hadn't been at the meeting, after all). All those years ago I had no idea that ultimately I was developing AppStream not for Limba, but for a new thing that would show up later, with an even more modern design, called Flatpak. I also had no idea how incredibly complex AppStream would become, how many features it would have, how much more maintenance work it would be – and also not how ubiquitous it would become.

The modern Linux desktop uses AppStream everywhere now, it is supported by all major distributions, used by Flatpak for metadata, used for firmware metadata via Richard’s fwupd/LVFS, runs on every Steam Deck, can be found in cars and possibly many places I do not know yet.

What is new in 1.0?

API breaks

The most important thing that's new with the 1.0 release is a bunch of incompatible changes. For the shared libraries, all deprecated API elements have been removed and a bunch of other changes have been made to improve the overall API and especially make it more binding-friendly. That doesn't mean that the API is completely new and nothing looks like before, though; when possible the previous API design was kept, and some changes that would have been too disruptive have not been made. Regardless of that, you will have to port your AppStream-using applications. For some larger ones I already submitted patches to build with both AppStream versions, the 0.16.x stable series as well as 1.0+.

For the XML specification, some older compatibility for XML that had no or very few users has been removed as well. This affects for example release elements that reference downloadable data without an artifact block, which has not been supported for a while. For all of these, I checked to remove only things that had close to no users and that were a significant maintenance burden. So as a rule of thumb: If your XML validated with no warnings with the 0.16.x branch of AppStream, it will still be 100% valid with the 1.0 release.

Another notable change is that the generated output of AppStream 1.0 will always be 1.0 compliant, you can not make it generate data for versions below that (this greatly reduced the maintenance cost of the project).

Developer element

For a long time, you could set the developer name using the top-level developer_name tag. With AppStream 1.0, this changes a bit. There is now a developer tag with a name child (which can be translated unless the translate="no" attribute is set on it). This allows future extensibility and also allows setting a machine-readable id attribute in the developer element, which lets software centers group software by developer more easily, without having to use heuristics. If we decide to extend the per-app developer information in future, this is now possible too. Do not worry though, the developer_name tag is still read, so there is no high pressure to update. The old 0.16.x stable series also has this feature backported, so it can be available everywhere. Check out the developer tag specification for more details.
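
In metainfo XML, that would look roughly like the sketch below; the id value is just an example, not a prescribed format:

<!-- new form; the old developer_name tag is still read for compatibility -->
<developer id="example.org">
  <name translate="no">Example Project</name>
</developer>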

Scale factor for screenshots

Screenshot images can now have a scale attribute, to indicate an (integer) scaling factor to apply. This feature was a breaking change and therefore we could not have it for the longest time, but it is now available. Please wait a bit for AppStream 1.0 to become deployed more widespread though, as using it with older AppStream versions may lead to issues in some cases. Check out the screenshots tag specification for more details.

Screenshot environments

It is now possible to indicate the environment a screenshot was recorded in (GNOME, GNOME Dark, KDE Plasma, Windows, etc.) via an environment attribute on the respective screenshot tag. This was also a breaking change, so use it carefully for now! If projects want to, they can use this feature to supply dedicated screenshots depending on the environment the application page is displayed in. Check out the screenshots tag specification for more details.
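
Put together, a screenshot entry using both of the new attributes might look something like the following; the URL, sizes and the exact environment value are placeholders here, so check the screenshots tag specification for the allowed values:

<screenshot type="default" environment="gnome:dark">
  <image type="source" width="1600" height="900" scale="2">https://example.org/screenshot-dark.png</image>
  <caption>The main window, dark variant</caption>
</screenshot>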

References tag

This is a feature more important for the scientific community and scientific applications. Using the references tag, you can associate the AppStream component with a DOI (Digital object identifier) or provide a link to a CFF file to provide citation information. It also allows to link to other scientific registries. Check out the references tag specification for more details.

Release tags

Releases can have tags now, just like components. This is generally not a feature that I expect to be used much, but in certain instances it can become useful with a cooperating software center, for example to tag certain releases as long-term supported versions.

Multi-platform support

Thanks to the interest and work of many volunteers, AppStream (mostly) runs on FreeBSD now, a NetBSD port exists, support for macOS was written and a Windows port is on its way! Thank you to everyone working on this 🙂

Better compatibility checks

For a long time I thought that the AppStream library should just be a thin layer above the XML and that software centers should just implement a lot of the actual logic. This has not been the case for a while, but there was still a lot of complex AppStream features that were hard for software centers to implement and where it makes sense to have one implementation that projects can just use.

The validation of component relations is one such thing. This was implemented in 0.16.x as well, but 1.0 vastly improves upon the compatibility checks, so you can now just run as_component_check_relations and retrieve a detailed list of whether the current component will run well on the system. Besides better API for software developers, the appstreamcli utility also has much improved support for relation checks, and I wrote about these changes in a previous post. Check it out!

With these changes, I hope this feature will be used much more, and beyond just drivers and firmware.

So much more!

The changelog for the 1.0 release is huge, and there are many papercuts resolved and changes made that I did not talk about here, like us using gi-docgen (instead of gtkdoc) now for nice API documentation, or the many improvements that went into better binding support, or better search, or just plain bugfixes.

Outlook

I expect the transition to 1.0 to take a bit of time. AppStream has not broken its API for many, many years (since 2016), so a bunch of places need to be touched even if the changes themselves are minor in many cases. In hindsight, I should have also released 1.0 much sooner and it should not have become such a mega-release, but that was mainly due to time constraints.

So, what’s in it for the future? Contrary to what I thought, AppStream does not really seem to be “done” and feature complete at any point; there is always something to improve, and people come up with new use cases all the time. So, expect more of the same in future: bugfixes, validator improvements, documentation improvements, better tools and the occasional new feature.

Onwards to 1.0.1! 😁

November 10, 2023

PSA: For Xorg GNOME sessions, use the xf86-input-wacom driver for your tablets

TLDR: see the title of this blog post, it's really that trivial.

Now that GodotWayland has been coming for ages and all new development focuses on a pile of software that steams significantly less, we're seeing cracks appear in the old Xorg support. Not intentionally, but there's only so much time that can be spent on testing, and things that are more niche fall through. One of these was a bug I just had the pleasure of debugging, triggered by a GNOME on Xorg user using the xf86-input-libinput driver for tablet devices.

On the surface of it, this should be fine because libinput (and thus xf86-input-libinput) handles tablets just fine. But libinput is the new kid on the block. The old kid on said block is the xf86-input-wacom driver, older than libinput by slightly over a decade. And oh man, history has baked things into the driver that are worse than raisins in apple strudel [1].

The xf86-input-libinput driver was written as a wrapper around libinput and makes use of fancy things that (from libinput's POV) have always been around: things like input device hotplugging. Fancy, I know. For tablet devices the driver creates an X device for each new tool the first time it comes into proximity. Future events from that tool will go through that device. A second tool, be it a new pen or the eraser on the original pen, will create a second X device and events from that tool will go through that X device. Configuration on any device will thus only affect that particular pen. Almost like the whole thing makes sense.

The wacom driver of course doesn't do this. It pre-creates X devices for some possible types of tools (pen, eraser, and cursor [2] but not airbrush or artpen). When a tool goes into proximity the events are sent through the respective device, i.e. all pens go through the pen tool, all erasers through the eraser tool. To actually track pens there is the "Wacom Serial IDs" property that contains the current tool's serial number. If you want to track multiple tools you need to query the property on proximity in [4]. At the time this was within a reasonable error margin of a good idea.

Of course and because MOAR CONFIGURATION! will save us all from the great filter you can specify the "ToolSerials" xorg.conf option as e.g. "airbrush;12345;artpen" and get some extra X devices pre-created, in this case a airbrush and artpen X device and an X device just for the tool with the serial number 12345. All other tools multiplex through the default devices. Again, at the time this was a great improvement. [5]
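
For those who have never seen it, such a snippet lives in xorg.conf (or an xorg.conf.d drop-in) and looks roughly like this; the serial number is obviously made up:

Section "InputClass"
        Identifier "Wacom tool serials"
        MatchDriver "wacom"
        Option "ToolSerials" "airbrush;12345;artpen"
EndSection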

Anyway, where was I? Oh, right. The above should serve as a good approximation of a reason why the xf86-input-libinput driver does not try to be fully compatible with the xf86-input-wacom driver. In everyday use these things barely matter [6] but for the desktop environment, which needs to configure these devices, all these differences mean multiple code paths. Those paths need to be tested but they aren't, so things fall through the cracks.

So quite a while ago, we made the decision that until Xorg goes dodo, the xf86-input-wacom driver is the tablet driver to use in GNOME. So if you're using a GNOME on Xorg session [7], do make sure the xf86-input-wacom driver is installed. It will make both of us happier and that's a good aim to strive for.

[1] It's just a joke. Put the pitchforks down already.
[2] The cursor is the mouse-like thing Wacom sells. Which is called cursor [3] because the English language has a limited vocabulary and we need to re-use words as much as possible lest we run out of them.
[3] It's also called puck. Because [2].
[4] And by "query" I mean "wait for the XI2 event notifying you of a property change". Because of lolz the driver cannot update the property on proximity in but needs to schedule that as idle func so the property update for the serial always arrives at some unspecified time after the proximity in but hopefully before more motion events happen. Or not, and that's how hope dies.
[5] Think about this next time someone says they long for some unspecified good old days.
[6] Except the strip axis which on the wacom driver is actually a bit happily moving left/right as your finger moves up/down on the touch strip and any X client needs to know this. libinput normalizes this to...well, a normal value but now the X client needs to know which driver is running so, oh deary deary.
[7] e.g. because you're stockholmed into it by your graphics hardware

November 09, 2023

Hackweek 23

Hack Week is the time SUSE employees experiment, innovate & learn interruption-free for a whole week! Across teams or alone, but always without limits.

Hack Week 23 was from November 6th to November 10th, and my project was to give some love to the GNOME Project.

Before the start of Hack Week I asked in the GNOME devs Matrix channel which projects needed some help, and they gave me some ideas. In the end I decided to work on GNOME Calendar, more specifically on improving the test suite and fixing issues related to timezones, DST, etc.

GNOME Calendar

GNOME Calendar is a Gtk4 application, written in C, that heavily uses the evolution-data-server library. It's a desktop calendar application with a modern user interface that can handle local and remote calendars, and it's integrated into the GNOME desktop.

The current gnome-calendar project has some unit tests, using the GLib testing framework. But right now there are just a few tests, so the main goal right now is to increase the number of tests as much as possible, to detect new problems and regressions.

Testing a desktop application is not easy. Unit tests can check basic operations, structures and methods, but user interaction and how it's represented is hard to test. So the best approach is to try to replicate user interactions and check the outputs.

A more sophisticated approach could be to start to use the accessibility stack in tests, so it's possible to verify the UI widgets output without the need of rendering the app.

With gnome-calendar there's another point of complexity for tests, because it relies on evolution-data-server running: the app communicates with it using D-Bus, so to be able to do more complex tests we should mock evolution-data-server and create fake data for testing.

My contribution

By the end of the week I had created four merge requests, three of which have been merged by now, and I'll continue working on this project in the following weeks/months.

I'm happy with the work that I was able to do during this Hack Week. I've learned a bit about testing with GLib in C, and a lot about the evolution-data-server, timezones and calendar problems.

It's just a desktop calendar application, how hard could it be? Have you ever dealt with dates, times and different regions and time zones? It's a nightmare. There are a lot of edge cases working with dates that can cause problems: operations with dates in different time zones, changes in dates for daylight saving time... if I have an event created for October 29th 2023 at 2:30, will it happen twice?

A lot of problems can happen, and there are a lot of bugs reported for gnome-calendar related to these kinds of issues, so working on this is not simple; it requires a lot of edge-case testing. And that's the plan: to cover most of those cases with automated tests, because any small change could lead to a time-zone bug that won't be noticed until someone has an appointment at a very specific time.

And this week was very productive thanks to the people working on gnome-calendar. Georges Stavracas reviewed my MRs very quickly, so it was possible to merge them during the week, and Jeff Fortin does great work with the issues in GitLab and led me to the most relevant bugs.

After a week of going deep into the gnome-calendar source code it would be a pity to just forget about it, so I'll try to keep the momentum and continue working on this project. Of course, I'll have just a few hours per week, but any contribution is better than nothing. And maybe next summer I can propose a Google Summer of Code project to get an intern working on this full time.

CapyPDF, performance and shared libraries

People who are extremely performance conscious might not like the fact that CapyPDF ships as a shared library with a C API that hides all internal implementation details. This has several potential sources of slowdown:

  • Function calls can not be inlined to callers
  • Shared library function calls are indirect and thus slower for the CPU to invoke
  • All objects require a memory allocation; they can't be stored on the caller's stack

I have not optimized CapyPDF's performance all that much, so let's see if these things are an issue in practice. As a test we'll use the Python bindings to generate the following PDF document 100 times and measure the execution time.

Running the test tells us that we can create the document roughly 9 times a second on a single core. This does not seem like a lot, but that's because the measurement was done on a Raspberry Pi 3B+. On a i7-1185G7 laptop processor the performance shoots up to 100 times a second.

For comparison I created the same program in native code; the Python version is roughly 10% slower. More interestingly, 40% of the time is not spent on generating the PDF file itself at all, but is instead taken by font subsetting.

A further thing to note is that in PDF any data stream can be compressed, typically with zlib. This test does not use compression at all. If the document had images, compressing the raw pixel data could easily take 10 to 100x longer than all other parts combined.

PDF does support embedding some image formats (e.g. jpegs) directly, but most of the time you need to generate raw pixel data and compress it yourself. In this case compression is easily the dominant term.

Conclusions

A "type hidden" C API does have some overhead, but in this particular case it is almost certainly unmeasurably small. You probably don't want to create a general array or hash map in this way if you need blazing speeds, but for this kind of high level functionality it is unlikely to be a performance bottleneck.

November 02, 2023

Updating GNOME shell extensions to GNOME 45

The new version of the GNOME desktop was released more than one month ago. It takes some time to arrive to end users, because distributions have to integrate, test and release the new desktop, and that's not simple; it has to fit into each distribution's release planning.

So right now, it's possible that only a few people have the latest version of the desktop: people on rolling release distros or people using testing versions of major distributions.

This is one of the reasons why a lot of gnome-shell extensions aren't updated to work with the latest version of GNOME, even after a few months: developers themselves don't have the latest version of the desktop, and it's not something "easy" to install without a virtual machine or something like that. Even if the update is just a change in metadata.json, there has to be someone to test the extension, and even someone to request the update, and that will only happen once the major distributions release a new version.

I'm using Tumbleweed, which is a rolling release, so GNOME 45 arrived here just after the official release, but of course a lot of extensions are not working, and in the infamous list of non-working extensions were the three that I'm maintaining right now:

  • mute-spotify-ads, extension to mute the spotify app when it's playing ads.
  • calc, a simple calculator in the alt+F2 input.
  • hide-minimized, hide minimized windows from alt+tab and overview.

The correct way

The correct way to maintain and update a gnome-shell extension would be to test and update "before" the official release of GNOME. That should be easy for me: using Tumbleweed I can just add the GNOME Next repository, install the new desktop when it's in the beta stage, and update the extension there.

But that's something that requires active maintainership... And right now what I do is update the extensions when they break for me, so just when I need them, and that's after I get the new desktop and find some time to update.

I know that's not the correct way and that it produces a bad experience for other people using the extensions, but it is the easiest thing for me to do. Maybe in the future I can do it correctly and provide a tested update before the official release, maybe using snapper to be able to go back to stable, without needing virtual machines.

Update your extension to GNOME 45

The update to GNOME 45 was a bit more complex than previous ones. This version of the shell and gjs changes a lot of things, so the migration isn't just adding a new number to metadata.json; it requires incompatible changes, so extensions that work with GNOME 45 won't work with previous versions and vice versa.

But the people working on gnome-shell do a great job documenting, and there's a really nice guide about how to upgrade your extension.

The most important part is the import section, which now has a different syntax.

// Before GNOME 45
const Main = imports.ui.main;

// GNOME 45
import * as Main from 'resource:///org/gnome/shell/ui/main.js';

There are also some changes to the extension classes, which can now inherit from existing ones that provide common functionality, like the preferences window:

import Adw from 'gi://Adw';

import {ExtensionPreferences, gettext as _} from 'resource:///org/gnome/Shell/Extensions/js/extensions/prefs.js';

export default class MyExtensionPreferences extends ExtensionPreferences {
    fillPreferencesWindow(window) {
        window._settings = this.getSettings();

        const page = new Adw.PreferencesPage();

        const group = new Adw.PreferencesGroup({
            title: _('Group Title'),
        });
        page.add(group);

        window.add(page);
    }
}
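
The extension class itself follows the same pattern: a default-exported class that inherits from Extension. A minimal extension.js sketch could look like this (the class name is a placeholder, and getSettings() assumes the extension ships a GSettings schema, as in the preferences example above):

import {Extension} from 'resource:///org/gnome/shell/extensions/extension.js';

export default class MyExtension extends Extension {
    enable() {
        // Create indicators, connect signals, load settings, etc.
        this._settings = this.getSettings();
    }

    disable() {
        // Undo everything done in enable() so the shell can disable
        // the extension cleanly.
        this._settings = null;
    }
}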

November 01, 2023

Why ACPI?

"Why does ACPI exist" - - the greatest thread in the history of forums, locked by a moderator after 12,239 pages of heated debate, wait no let me start again.

Why does ACPI exist? In the beforetimes power management on x86 was done by jumping to an opaque BIOS entry point and hoping it would do the right thing. It frequently didn't. We called this Advanced Power Management (Advanced because before this power management involved custom drivers for every machine and everyone agreed that this was a bad idea), and it involved the firmware having to save and restore the state of every piece of hardware in the system. This meant that assumptions about hardware configuration were baked into the firmware - failed to program your graphics card exactly the way the BIOS expected? Hurrah! It's only saved and restored a subset of the state that you configured and now potential data corruption for you. The developers of ACPI made the reasonable decision that, well, maybe since the OS was the one setting state in the first place, the OS should restore it.

So far so good. But some state is fundamentally device specific, at a level that the OS generally ignores. How should this state be managed? One way to do that would be to have the OS know about the device specific details. Unfortunately that means you can't ship the computer without having OS support for it, which means having OS support for every device (exactly what we'd got away from with APM). This, uh, was not an option the PC industry seriously considered. The alternative is that you ship something that abstracts the details of the specific hardware and makes that abstraction available to the OS. This is what ACPI does, and it's also what things like Device Tree do. Both provide static information about how the platform is configured, which can then be consumed by the OS and avoid needing device-specific drivers or configuration to be built-in.

The main distinction between Device Tree and ACPI is that Device Tree is purely a description of the hardware that exists, and so still requires the OS to know what's possible - if you add a new type of power controller, for instance, you need to add a driver for that to the OS before you can express that via Device Tree. ACPI decided to include an interpreted language to allow vendors to expose functionality to the OS without the OS needing to know about the underlying hardware. So, for instance, ACPI allows you to associate a device with a function to power down that device. That function may, when executed, trigger a bunch of register accesses to a piece of hardware otherwise not exposed to the OS, and that hardware may then cut the power rail to the device to power it down entirely. And that can be done without the OS having to know anything about the control hardware.

How is this better than just calling into the firmware to do it? Because the fact that ACPI declares that it's going to access these registers means the OS can figure out that it shouldn't, because it might otherwise collide with what the firmware is doing. With APM we had no visibility into that - if the OS tried to touch the hardware at the same time APM did, boom, almost impossible to debug failures (This is why various hardware monitoring drivers refuse to load by default on Linux - the firmware declares that it's going to touch those registers itself, so Linux decides not to in order to avoid race conditions and potential hardware damage. In many cases the firmware offers a collaborative interface to obtain the same data, and a driver can be written to get that. This bug comment discusses this for a specific board.)

Unfortunately ACPI doesn't entirely remove opaque firmware from the equation - ACPI methods can still trigger System Management Mode, which is basically a fancy way to say "Your computer stops running your OS, does something else for a while, and you have no idea what". This has all the same issues that APM did, in that if the hardware isn't in exactly the state the firmware expects, bad things can happen. While historically there were a bunch of ACPI-related issues because the spec didn't define every single possible scenario and also there was no conformance suite (eg, should the interpreter be multi-threaded? Not defined by spec, but influences whether a specific implementation will work or not!), these days overall compatibility is pretty solid and the vast majority of systems work just fine - but we do still have some issues that are largely associated with System Management Mode.

One example is a recent Lenovo one, where the firmware appears to try to poke the NVME drive on resume. There's some indication that this is intended to deal with transparently unlocking self-encrypting drives on resume, but it seems to do so without taking IOMMU configuration into account and so things explode. It's kind of understandable why a vendor would implement something like this, but it's also kind of understandable that doing so without OS cooperation may end badly.

This isn't something that ACPI enabled - in the absence of ACPI firmware vendors would just be doing this unilaterally with even less OS involvement and we'd probably have even more of these issues. Ideally we'd "simply" have hardware that didn't support transitioning back to opaque code, but we don't (ARM has basically the same issue with TrustZone). In the absence of the ideal world, by and large ACPI has been a net improvement in Linux compatibility on x86 systems. It certainly didn't remove the "Everything is Windows" mentality that many vendors have, but it meant we largely only needed to ensure that Linux behaved the same way as Windows in a finite number of ways (ie, the behaviour of the ACPI interpreter) rather than in every single hardware driver, and so the chances that a new machine will work out of the box are much greater than they were in the pre-ACPI period.

There's an alternative universe where we decided to teach the kernel about every piece of hardware it should run on. Fortunately (or, well, unfortunately) we've seen that in the ARM world. Most device-specific code simply never reaches mainline, and most users are stuck running ancient kernels as a result. Imagine every x86 device vendor shipping their own kernel optimised for their hardware, and now imagine how well that works out given the quality of their firmware. Does that really seem better to you?

It's understandable why ACPI has a poor reputation. But it's also hard to figure out what would work better in the real world. We could have built something similar on top of Open Firmware instead but the distinction wouldn't be terribly meaningful - we'd just have Forth instead of the ACPI bytecode language. Longing for a non-ACPI world without presenting something that's better and actually stands a reasonable chance of adoption doesn't make the world a better place.


October 31, 2023

Stuttgart 2005: Memories of GUADECs past

Stuttgart State Opera bathed in golden light, and its reflection on artificial pond

Black and white abstract of a simple modernist gargoyle on a stone-tiled wall

Nokia 770 Internet Tablet, before iPads took over the world

October 28, 2023

FlatSync Status Update

This post gives a long overdue status update on FlatSync's progress.

# Absence

My last update post dates back quite a bit as I'm currently in the process of writing my bachelor's thesis, and I'm thus a bit more occupied. Furthermore, I was struggling along a lot with FlatSync's latest development. Nevertheless, I wanted to give a quick status update on where we currently stand and what to expect going forward.

# FlatSync

# New Features

In my last update post, I mentioned moving away from polling and listening for changes to Flatpak installations instead, as well as respecting monitored networking settings. Both of these features are now implemented in FlatSync and working properly.

A basic GUI is also present, though this is currently being reworked and polished.

# Current Work in Progress

Currently, the authentication is being reworked, swapping from GitHub PATs to using a GitHub App (GitHub's flavor of OAuth2). This required a lot of work: the original provider implementation was not written by me, so I had a hard time understanding its details while having to rewrite much of it.

Luckily, after getting help from the original author, we were able to properly port the implementation to a GitHub App, and future OAuth implementations should be way easier to realize now as well.

The CLI part is already done and working, and the UI part is currently being worked on.

October 27, 2023

A new accessibility architecture for modern free desktops

My name is Matt Campbell, and I’m delighted to announce that I’m joining the GNOME accessibility team to develop a new accessibility architecture. After providing some brief background information on myself, I’ll describe what’s wrong with the current Linux desktop accessibility architecture, including a design flaw that has plagued assistive technology developers and users on multiple platforms, including GNOME, for decades. Then I’ll describe how two of the three current browser engines have solved this problem in their internal accessibility implementations, and discuss my proposal to extend this solution to a next-generation accessibility architecture for GNOME and other free desktops.

Introducing myself

While I’m new to the GNOME development community, I’m no stranger to accessibility. I’m visually impaired myself, and I’ve been working on accessibility in one form or another for more than 20 years. Among other things:

  • I contributed to the community of blind Linux users from 1999 through 2001. I modified the ZipSlack mini-distro to include the Speakup console screen reader, developed the trplayer command-line front-end for RealPlayer, and helped several new users get started.
  • In 2003 to 2004, I developed a talking browser based on the Mozilla Gecko engine; it ran on both Windows and Linux.
  • Starting in 2004, I developed a Windows screen reader, called System Access, for Serotek (which has since been acquired by my current company, Pneuma Solutions).
  • I later worked on the Windows accessibility team at Microsoft, where I contributed to the Narrator screen reader and the UI Automation API, from mid 2017 to late 2020. (Rest assured, the non-compete clause in my employment agreement with Microsoft expired long ago.)
  • For the past two years, I have also been the lead developer of AccessKit, a cross-platform accessibility abstraction for GUI toolkits. My upcoming work on the GNOME accessibility architecture will build on the work I’ve been doing on AccessKit.

The problems we need to solve

The free desktop ecosystem has changed dramatically since the original GNOME accessibility team, led by Sun Microsystems, designed the original Assistive Technology Service Provider Interface (AT-SPI) in the early 2000s. Back then, per-application security sandboxing was, at best, a research project. X11 was the only free windowing system in widespread use, and it was taken for granted that each application would both know and control the position of each of its windows in global screen coordinates. Obviously, with the rise of Flatpak and Wayland, all of these things have changed, and the GNOME accessibility stack must adapt.

But in my opinion, AT-SPI also has the same fatal flaw as most other accessibility APIs, going back to the 1990s. The first programmatic accessibility API, Microsoft Active Accessibility (MSAA), was introduced in 1997. Sun implemented the Java Access Bridge (JAB) for Windows not long after. What MSAA, the JAB, and AT-SPI all have in common is that their performance is severely limited by the latency of multiple inter-process communication (IPC) round trips. The more recent UI Automation API (introduced in 2005) mitigated this problem somewhat with a caching system, as did AT-SPI, but that has never been a complete solution, especially when assistive technologies need to traverse text documents. For decades, those of us who have developed Windows screen readers have been so determined to work around this IPC bottleneck that we’ve relied, to varying degrees, on the ability to inject some of our code into application processes, so we can more efficiently fetch the information we need and do our own IPC. Needless to say, this approach has grave drawbacks for security and robustness, and it’s not an option on any platform other than Windows. We need a better solution.

The solution: Push-based accessibility

We can find such a solution in the internal accessibility architecture of some modern browsers, particularly Chromium and (more recently) Firefox. As you may know, these browsers have a multi-process architecture, where a sandboxed process, known as the content process or renderer process, renders web content and executes JavaScript. These sandboxed processes have all of the information needed to produce an accessibility tree, but for various reasons, it’s still optimal or even necessary to implement the platform accessibility APIs in the main, unsandboxed browser process. So, to prevent the multi-process architecture from further degrading browser performance for assistive technology users, these browsers internally implement what I’ll call a push architecture. There’s still IPC happening internally, but the renderer process initially pushes a complete serialized snapshot of an accessibility tree, followed by incremental tree updates. The platform accessibility API implementations in the main browser process can then respond immediately to queries using a local copy of the accessibility tree. This is in contrast with the pull-based platform accessibility APIs, where assistive technologies or other clients pull information about one node at a time, sometimes incurring IPC round trips for one property at a time. In terms of latency, the push-based approach is far more efficient. You can learn more in Jamie Teh’s blog post about Firefox’s Cache the World project.

Ever since I learned about Chromium’s internal accessibility architecture more than a decade ago, I have believed that assistive technologies would be more robust and responsive if the push-based approach were applied across the platform accessibility stack, all the way from the application to the assistive technology. If you’re a screen reader user, you have likely noticed that if an application becomes unresponsive, you can’t find out anything about what is currently in the application window, while, in a modern, composited windowing system, a sighted user can still see the last rendered frame. A push-based approach would ensure that the latest snapshot of the accessibility tree is likewise always available, in its entirety, to be queried by the assistive technology in any way that the AT developer wants to. And because an AT would have access to a local copy of the accessibility tree, it can quickly perform complex traversals of that tree, without having to go back and forth with the application to gather information. This is especially useful when implementing advanced commands for navigating web pages and other complex documents.

What’s not changing

Before I go further, I want to reassure readers that many of the fundamentals of accessibility are not changing with this new architecture. An accessible UI is still defined by a tree of nodes, each of which has a role, a bounding rectangle, and other properties. Many of these properties, such as name, value, and the various state flags, will be familiar to anyone who has already worked with AT-SPI, the GTK 4 accessibility API, or the legacy ATK. It’s true that we’ll have to add several new properties, especially for text nodes. However, I believe that most of the work that application and toolkit developers have already done to implement accessibility will still be applicable.

Risks and benefits

I’ve written a more detailed proposal for this new architecture, including a discussion of various risks. The risk that concerns me most at this point is that I’m not yet entirely sure how we’ll handle large, complex documents in non-web applications such as LibreOffice. I specifically mention non-web applications here because web applications, such as the various online office suites, are already limited in this respect by the performance of the browser itself, including the internal push-based accessibility implementations of Chromium and Firefox. My guess is that, with this new architecture, applications such as LibreOffice will need to present a virtualized accessible view of the document, similar to what web applications are already doing. We’ll need to make sure that we implement this without giving up features that assistive technology users have come to expect, particularly when it comes to efficiently navigating large documents. But I believe this is feasible.

I’d like to close with a couple of exciting possibilities that my proposal would enable, which I believe make it worth the risks. Let’s start with accessible screenshots. Anyone who uses a screen reader knows how common, and how frustrating, it is to come across screenshots with no useful alternate text (alt text). Even when an author has made an effort to provide useful alt text, it’s still just a flat string. Imagine how much more useful it would be to have access to the full content and structure of the information in the screenshot, as if you were accessing the application that the screenshot came from (assuming the app itself was accessible). With a push-based accessibility architecture, a screenshot tool can quickly grab a full snapshot of the accessibility tree from the window being captured. From there, I don’t think it would be too difficult to propose a way of including a serialized accessibility tree in the screenshot image file. Getting such a thing standardized might be more difficult, but the push architecture would at least eliminate a major technical barrier to accessible screenshots.

Taking this idea a step further, because the proposed push architecture includes incremental tree updates as well as full snapshots, it would also become feasible to implement accessibility in streaming screen-sharing applications. This obviously includes one-on-one remote desktop use cases; imagine extending VNC or RDP with the ability to push accessibility tree updates. But what’s more exciting to me is the potential to add accessibility in one-to-many use cases, such as the screen-sharing features found in online meeting spaces. And while this too might be difficult to standardize, one could even imagine including accessibility information in standard video container formats such as MP4 or WebM, making visual information accessible in everything from conference talks to product demo videos and online courses.

Conclusion

Too often, free desktop platforms struggle to keep up with their proprietary counterparts. That’s not a criticism; it’s just a fact that free desktop projects don’t have the resources of the proprietary giants. But here, we have an opportunity to leap ahead, by betting on a different accessibility architecture. The push approach has already been proven by two of the three major browser engines, and while there are risks in extending this approach outside of the browser context, I strongly believe that the potential benefits are too big to ignore.

Over the next year we will be experimenting with this new approach in a prototype, collecting feedback from various stakeholders across the ecosystem, and hopefully building a new, stronger foundation for state-of-the-art accessibility in the free desktop world. If you have questions or ideas, don't hesitate to reach out or leave a comment!

October 26, 2023

Colorful HIG

The refresh of the Human Interface Guidelines in both content and presentation is something to be proud of, but there were a couple of areas that weren't great. Where we don't quite shine with the blueprint illustration style is contrast in dark mode. While in many cases a single graphic can work in the two contexts just fine, in others it struggles. And while we tried to address it in the HIG, it became clear we need to do better.

Low contrast for HIG blueprint illustrations

Inline SVG Stylesheet

there’s a little trick I learned from razze while working Flathub — a single favicon working in both dark and light mode can be achieved using a single SVG. The SVG doesn’t have inline defined fills, but instead has a simple embedded <style> that defines the g,path,rect,circle and whatnot element styles and sets the fill there. For the dark mode it gets overriden with the @media (prefers-color-scheme: dark){} rule. While generally favicons are a single color stencil, it can work for fullcolor graphics (and more complex rules):

<style>
  rect.fg { fill: #5e5c64; }
  path.bg { fill: #fff; }
  @media (prefers-color-scheme: dark) {
    rect.fg { fill: #fff; } 
    path.bg { fill: #5e5c64; }
  }
</style>
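
Put together, such a dual-mode favicon can be as small as this sketch (a made-up single-rectangle icon, just to show where the embedded stylesheet goes):

<svg xmlns="http://www.w3.org/2000/svg" width="16" height="16" viewBox="0 0 16 16">
  <style>
    rect.fg { fill: #5e5c64; }
    @media (prefers-color-scheme: dark) {
      rect.fg { fill: #ffffff; }
    }
  </style>
  <rect class="fg" x="2" y="2" width="12" height="12" rx="2"/>
</svg>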

This made me think a similar approach could work for inline images as well. Sadly there are two obstacles. While support for inline stylesheets in SVGs seems to be quite wide among browsers, Epiphany only seems to respect prefers-color-scheme when using the image directly (or in the favicon case); it didn't seem to work when embedded inside an HTML page as <img>.

The more severe issue is that producing such SVGs is a little cumbersome, as you have to clean up the document generated by Inkscape, which likes to use the fill attribute or inline CSS in style attributes. While it generally doesn't remove markup, it will reformat your markup, and you will be fighting with it every time you need to edit the SVG visually rather than inside a text editor.

HTML5 Picture

For inline images, the approach that seems more straightforward, and that I've taken on many occasions, is using the HTML5 <picture> element. It works great for providing dark mode variants using a source with a media attribute, as well as for a neat accessibility feature: showing a non-animated image variant for people who opt out of animations:

<picture>
    <source srcset="static.png" 
        media="(prefers-reduced-motion: reduce)" />
    <img src="animated.gif" />
</picture>

Sphinx/RST

The GNOME Human Interface Guidelines are written in reStructuredText/Sphinx, however. Escaping to HTML for images/pictures would be quite cumbersome, but luckily dark mode is supported in the furo theme (and derivatives) using the only-light and only-dark classes. The markup gets a little chatty, but is still quite legible. There are some iterations to be made, but in terms of legibility it's finally a bit more accessible.
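
For reference, the pattern ends up looking roughly like this (file names here are placeholders), with one image per color scheme:

.. image:: img/hig-blueprint.png
   :class: only-light
   :alt: Blueprint illustration (light)

.. image:: img/hig-blueprint-dark.png
   :class: only-dark
   :alt: Blueprint illustration (dark)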

New HIG light New HIG dark

October 25, 2023

Introspection’s edge

tl;dr The introspection data for GLib (and its sub-libraries) is now generated by GLib itself instead of gobject-introspection; and in the near future, libgirepository is going to be part of GLib as well. Yes, there’s a circular dependency, but there are plans for dealing with that. Developers of language bindings should reach out to the GLib maintainers for future planning.


GLib-the-project is made of the following components:

  • glib, the base data types and C portability library
  • gmodule, a portable loadable module API
  • gobject, the object oriented run time type system
  • gio, a grab bag of I/O, IPC, file, network, settings, etc

Just like GLib, the gobject-introspection project is actually three components in a trench coat:

  • g-ir-scanner, the Python and flex based parser that generates the XML description of a GObject-based ABI
  • g-ir-compiler, the “compiler” that turns XML into an mmap()-able binary format
  • libgirepository, the C API for loading the binary format, resolving types and symbols, and calling into the underlying C library

Since gobject-introspection depends on GLib for its implementation, the introspection data for GLib has been, until now, generated during the gobject-introspection build. This has proven to be extremely messy:

  • as the introspection data is generated from the source code, gobject-introspection needs to have access to the GLib headers and source files
  • this means building against an installed copy of GLib for the declarations, and the GLib sources for the docblocks
  • to avoid having access to the sources at all times, there is a small script that extracts all the docblocks from a GLib checkout; this script has to be run manually, and its results committed to the gobject-introspection repository

This means that the gobject-introspection maintainers have to manually keep the GLib data in sync; that any changes to GLib's annotations and documentation typically only land at the end of the development cycle, which also means that any issue with the annotations is almost always discovered too late; and that any and all updates to the GLib introspection data require a gobject-introspection release to end up inside downstream distributions.

The gobject-introspection build can use GLib as a subproject, at the cost of some awful, awful build system messiness, but that does not help anybody working on GLib; and, of course, since gobject-introspection depends on GLib, we cannot build the former as a subproject of the latter.

As part of the push towards moving GLib’s API references towards the use of gi-docgen, GLib now checks for the availability of g-ir-scanner and g-ir-compiler on the installed system, and if they are present, they are used to generate the GLib, GObject, GModule, and GIO introspection data.

A good side benefit of having the actual project providing the ABI be in charge of generating its machine-readable description is that we can now decouple the release cycles of GLib and gobject-introspection; updates to the introspection data of GLib will be released alongside GLib, while the version of gobject-introspection can be updated whenever there’s new API inside libgirepository, instead of requiring a bump at every cycle to match GLib.

On the other hand, though, this change introduces a circular dependency between GLib and gobject-introspection; this is less than ideal, and though circular dependencies are a fact of life for whoever builds and packages complex software, it’d be better to avoid them; this means two options:

  • merging gobject-introspection into GLib
  • making gobject-introspection not depend on GLib

Sadly, the first option isn’t really workable, as it would mean making GLib depend on flex, bison, and CPython in order to build g-ir-scanner, or (worse) rewriting g-ir-scanner in C. There’s also the option of rewriting the C parser in pure Python, but that, too, isn’t exactly strife-free.

The second option is more convoluted, but it has the advantage of being realistically implemented:

  1. move libgirepository into GLib as a new sub-library
  2. move g-ir-compiler into GLib’s tools
  3. copy-paste the basic GLib data types used by the introspection scanner into gobject-introspection

Aside from breaking the circular dependency, this allows language bindings that depend on libgirepository’s API (Python, JavaScript, Perl, etc) to limit their dependency to GLib, since they don’t need to deal with the introspection parser bits; it also affords us the ability to clean up the libgirepository API, provide better documentation, and formalise the file formats.

Speaking of file formats, and to manage expectations: I don’t plan to change the binary data layout; yes, we’re close to saturation when it comes to padding bits and fields, but there hasn’t been a real case in which this has been problematic. The API, on the other hand, really needs to be cleaned up, because it got badly namespaced when we originally thought it could be moved into GLib and then the effort was abandoned. This means that libgirepository will be bumped to 2.0 (matching the rest of GLib’s sub-libraries), and will require some work in the language bindings that consume it. For that, bindings developers should get in touch with GLib maintainers, so we can plan the migration appropriately.

October 24, 2023

Fix Fedora IPU6 not working on Dell laptops with kernel 6.5

There is an issue where the rpmfusion-packaged IPU6 camera stack for Fedora is not working on many Dell laptop models after upgrading to a 6.5.y kernel.

This is caused by a new mainline ov01a10 sensor driver which takes precedence over the akmod ov01a10 driver but lacks VSC integration.

This can be worked around by running the following command:
sudo rm /lib/modules/$(uname -r)/kernel/drivers/media/i2c/ov01a10.ko.xz; sudo depmod -a

After the rm + depmod run:
sudo rmmod ov01a10; sudo modprobe ov01a10

Or reboot. After this your camera will hopefully work again.

I have submitted a pull-request to disable the mainline kernel's non working ov01a10 driver, so after the next Fedora kernel update this workaround should no longer be necessary.

October 23, 2023

Project blogs are now welcome on Planet GNOME!

Historically, Planet GNOME has been the go-to place for those interested in following the work of GNOME community members, but with the rise of social media platforms and the decline of blogs (and RSS feeds), one would think that feed aggregators would lose popularity within our community too. Surprisingly, that doesn't seem to be the case for Planet GNOME.

In a poll recently made by Emmanuele Bassi in GNOME’s Discourse, the Planet was well ranked in the list of answers to the question “How do you find what people are working on?”. Conversations in the GNOME Hackers chat channel also led to this conclusion.

Given how useful Planet GNOME still is to our community, I think we should spend some time giving it some love and continue evolving the platform.

A historical “editorial” choice of Planet GNOME was to only include personal blogs. This was decided before I became an editor. The rationale was that this is a community space and by linking people to what they write, we are nurturing a community that puts people first. After all, GNOME is all about its people!

On the other hand, news.gnome.org was designed to be the go-to place for more formal/impersonal/institutional communication. In practice that didn’t work out so well as the website never reached as broad of an audience as Planet GNOME.

Also, as projects started to create their blogs (such as the Shell blog, GTK blog, etc…), people started to work around the editorial choice by adding multiple author feeds to the same blog. I think this seems to strike the right balance between personal and institutional communication.

So to help information spread broadly in our community, I (wearing my Editor hat) decided to lift this restriction. We have now started accepting project blogs too. Recently we added the Vala Blog, Accessibility Blog, and Flathub Blog. File an issue if you write for a GNOME-related project blog that you wish to see on Planet GNOME.

While we are at it, interesting conversations are happening aimed at improving the overall Planet GNOME experience and reach. Ideas and suggestions are being discussed in Discourse and you are welcome to join the conversation!

October 11, 2023

print("Hello Planet GNOME");

Hello to all of you reading Planet GNOME right now! This is the first post from the Vala Blog now appearing in the big wide world of GNOME.

Big thanks to the Planet GNOME team and Felipe Borges for making this possible!

This blog is operated by the Vala project, and we will be posting here once in a while about new Vala releases, noteworthy changes like new features and how to use them, interesting projects using Vala and their experience with it, and other stuff related to the Vala programming language.

If you have an idea to write about something that fits one of the categories above, reach out to us and we are sure eventually a great article will be released! (Talk to us here)

Stay tuned for the next posts coming out! :)

How and Why to Get Verified on Mastodon

I run a little family Mastodon server so that I can have @cassidy@blaede.family and feel comfortable sharing random photos of my kids with people while remaining in control of how they’re shared. Part of running a Mastodon server is that the admin can curate what hashtags, posts, and links are allowed to trend; e.g. if #furryNSFW is popular across your server’s corner of the Internet, you might not want that to be recommended to your mom on the trending page (true story!). So admins can explicitly allow or disallow each trend before it’s shown to their users. Once a trend is allowed, it will be able to trend again at any point in the future, cutting down on work for seasonal or naturally recurring trends (like #SilentSunday or #SuperbOwl).

This also works for individual links and posts, but a more powerful feature is that admins can explicitly allow (or disallow) certain accounts or publishers to trend. This can greatly cut down on moderation work, as admins can allow-list reputable news sites or people they know are generally good people who stick to the rules and are unlikely to post something they don’t want trending; I personally use this for virtually anyone I actually know in real life or from various open source projects. However, I’ve started becoming a bit more lenient than that, as for my small instance there’s a pretty low risk of abuse: I allow-list anyone with a verified account that looks decent at a glance. I check in on the trends regularly and can always revoke that, but if someone has proven they are who they say they are and aren’t regularly posting things I wouldn’t want to trend, I don’t want to have to go approve every post that might trend!

Example of verified accounts on Mastodon

So my call to anyone and everyone on Mastodon—but especially journalists and well-known personalities—is: get verified! It greatly helps people like me and mods of larger servers know you're you, and makes it more likely that people will find you in search, follow you, and see you on the trending page in their Mastodon app or server.

Okay, but how?

If you’re coming from X (the social network formerly known as Twitter and still available at twitter.com) or a Meta network like Instagram, you might think being verified is some process requiring payment, government ID verification, bribery, etc. With Mastodon, that’s not so! Verification is built on web standards and proves one thing: your account is associated with a specific page on the Internet. That might seem like less of a guarantee (it doesn’t explicitly prove your real human identity), it’s powerful. Here are some recommended ways to use verification to prove you’re, well, you:

  • Personal website: this is the most flexible and in your control; if you have a website where you direct people, use that for verification! It proves that your Mastodon account belongs to the same person as the website.

  • GitHub account: GitHub supports Mastodon links on profiles and will add the right markup to support verification. For open source people, this is a great way to associate your work with your social media presence.

  • Organization about page: if you work for an organization like a news site, see if you can add a link to your Mastodon account on your profile page there. This associates your work at that organization with your Mastodon account and proves your authority when talking about your work.

Another neat thing: you can verify multiple links! So I have cassidyjames.com, blaede.family, and my GitHub account all verified on my profile. The first link is what will show in some contexts like search results, so put the most well-known associated link first.

Technical details

If you’re manually adding a link on your own site (or talking to a web developer to get support added to your organization’s site), here’s what you need to know:

  • The page needs to link back to your public profile, either with an <a> on the page or a <link> in the <head>
  • The link needs to include rel="me" to distinguish it from just linking to someone else’s account
  • The page needs to be secure (served over https)
  • There’s a 1 MB page limit; if your page is too heavy, it won’t work


October 10, 2023

On CVE-2023-43641

As you might already have read about, there was a vulnerability in libcue that took a side gig demonstrating a sandbox escape in tracker-miners.

The good news first, so you can skip the rest: this is fixed in the tracker-miners 3.6.1/3.5.3/3.4.5/3.3.2 versions released on Sept 28th 2023, about a couple of weeks ago. The relevant changes are also in the tracker-miners-3.2 and tracker-miners-3.1 branches, but I didn't get around to doing releases for those. If you haven't updated yet, please do so already.

Background

The seccomp jail in the metadata extractor is far from new; it was introduced during development for tracker-1.12 in 2016 (yup, tracker, before the tracker-miners indexers spun off from the monolithic package). Before that, the file metadata extraction task was already split into a separate process from filesystem structure indexing, for stability and resource consumption reasons.

Seccomp comes in different sizes: it allows anything from creating very permissive sandboxes, where system calls are allowed unless blocked by some rule, to creating paranoid sandboxes where every system call is disallowed by default (with a degree of harshness of your choice) unless allowed by some rule. Tracker took the most restrictive approach, making every system call fail with SIGSYS by default, and carving out the holes necessary for the correct operation of the many dependencies within the operational parameters established by yours truly (no outside network access, no filesystem write access, no execution of subprocesses). The list of rules is quite readable in the source code; every syscall there has been individually and painstakingly collected, evaluated, and pondered for inclusion.
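
For readers who have not used seccomp before, here is a minimal sketch of that kind of default-deny filter using libseccomp; the handful of syscalls listed is purely illustrative and is not tracker's actual rule set:

/* Illustrative only; build with -lseccomp. */
#include <seccomp.h>
#include <errno.h>
#include <stdlib.h>

static void
init_seccomp (void)
{
  /* Anything not explicitly allowed raises SIGSYS in the offending thread. */
  scmp_filter_ctx ctx = seccomp_init (SCMP_ACT_TRAP);
  if (ctx == NULL)
    abort ();

  /* Carve out the individually audited syscalls the code needs. */
  seccomp_rule_add (ctx, SCMP_ACT_ALLOW, SCMP_SYS (read), 0);
  seccomp_rule_add (ctx, SCMP_ACT_ALLOW, SCMP_SYS (write), 0);
  seccomp_rule_add (ctx, SCMP_ACT_ALLOW, SCMP_SYS (openat), 0);
  seccomp_rule_add (ctx, SCMP_ACT_ALLOW, SCMP_SYS (exit_group), 0);

  /* Some syscalls can be made to fail softly with an error code instead
   * of SIGSYS, so callers that handle errors keep working. */
  seccomp_rule_add (ctx, SCMP_ACT_ERRNO (EPERM), SCMP_SYS (socket), 0);

  if (seccomp_load (ctx) < 0)
    abort ();

  seccomp_release (ctx);
}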

But the tracker-extract process as a whole still had to do a number of things that could not work under those parameters:

  1. It wanted to read some settings, and GSettings/Dconf will anyways require r/w access to /run/user/.../dconf/user, or complain loudly.
  2. There is infrastructure to write detailed error reports (visible through the tracker3 status commandline tool) for files where metadata extraction failed. These error reports have helped improve tracker-miners quality and are a valuable resource.
  3. The tracker-extract process depends on GStreamer, and is a likely candidate for being the first to fork and indirectly rewrite the GST registry cache file on gst_init().

While in an ideal world we could work our way through highly specific seccomp rules, in practice the filters based on system call arguments are pretty rudimentary, e.g. not allowing for (sub)string checks. So we cannot carve out allowances for these specific things, not without opening the sandbox significantly.

Rules for thee, but not for me. Given the existing tracker-extract design with threads dedicated to extraction, the pragmatic approach taken back then was to make the main thread (already in charge of dispatching tasks to those dedicated threads) unrestricted and in charge of those other duties, while applying the seccomp ruleset to every thread involved in dealing with the data. While not perfectly airtight, the sandbox in this case is already a defense-in-depth measure, only applicable after other vulnerabilities (most likely in 3rd party libraries) were already successfully exploited; all that can be done at that point is to deter and limit the extent of the damage, so the sandbox's role is to make it not immediately obvious how to do something harmful without needle precision. While the “1-click exploit in GNOME” headline is surely juicy, it had not happened for the 7 years the sandbox has existed, despite our uncontrollable hose of dependencies (hi, GStreamer) and our other dependencies with a history of CVEs (hi Poppler, libXML et al), so the sandbox has at least seemed to fulfill the deterrence role.

But there is that: I knew about the potential for a case like this and didn't act in time. I personally am not sure this is a much better defense than "I didn't know better!".

The serendipity

The discovery of CVE-2023-43641 was doubly serendipitous. On one hand, Kevin Backhouse from GitHub Security Lab quickly struck gold, unwittingly managing to corrupt data in a way that it was the unconstrained thread that stepped on the trap being laid. For me, fortune struck in getting to deal with security issues "the good way". For something that normally is about as pleasant as a tooth extraction, I must give Kevin five stars for management and diligence.

And as said, this didn’t catch me entirely by surprise. I even talked about the potential case with co-maintainer Sam Thursfield as recently as Fosdem this year, and funnily it was high up in my to-do list, only dropped from 3.6 schedule due to time restrictions. Even though the requirements (to reiterate: no indiscriminate network access, no write access, no execution) and the outliers didn’t budge, there were ideas floating on how to handle them. While the fixes were not as straightforward as just “init seccomp on main(), duh”, there was no moments of hesitation in addressing them, you can read the merge request commits for the details on how those were handled.

The end result is a tracker-extract main() function that pretty literally looks like:

main ()
{
  lower_process_priorities();
  init_seccomp();
  do_main();
}

I.e. the seccomp jail will affect the main thread and every other thread spawned from it, while barely extending the seccomp ruleset (mostly to make some syscalls fail softly with error codes, instead of hard through SIGSYS), and not giving up on any of the impositions set.

The aftermath

Of course, one does not go willy-nilly adding paranoid sandbox restrictions to a behemoth like GStreamer and call it an early evening. C library implementations will also call different sets of system calls on different architectures and with different compile-time flags, so transitioning to a fully sandboxed process still means adding new restrictions to a non-trivial amount of code, and it will definitely stir the SIGSYS bug report pot for some time to come due to legit code hitting the newly set restrictions. This is already the case; a thank you to the early reporters. The newly confined code is well-trodden, so optimistically the grounds will settle back to relatively boring after another round of releases or two.

Reflections

While we could start splitting hairs about the tracker-extract process being allowed to do this or that while it shouldn't, I think the current sandbox will keep tracker-miners itself out of the way of CVEs for a long, long time after this *knocks on wood*. I'm sure this will result in more pairs of eyes looking, and patches flowing, and... hahaha, who am I kidding. I'll likely think/work alone on ways to further decrease the exposed surface in future cycles, out of the spotlight, after the waves of armchair comments have long been waded through. For now, any vulnerability exploit already has a much harder time at succeeding in evil.

I’d like to eventually see some tighter control on the chain of dependencies that tracker-miners has, maintaining a paranoid seccomp sandbox has a very high friction with running arbitrarily large amounts of code out of your control, and I’m sure that didn’t just get better. There are reasons we typically commit to only supporting the latest and greatest upstream, we cannot decently assure forward compatibility of the seccomp sandbox.

And regardless, now more than ever I am happy relying on third party libraries to deal with parsers and formats. It still is the only way to maintain a general-purpose indexer and keep a piece of your sanity. To the people willing to step up and rewrite prime Rust fodder, there's gold if you pull on the thread of tracker-extract module dependencies.

Conclusion

I’d personally want to apologize to libcue maintainers, as this likely blew out of proportion to them due to my inaction. Anything less dangerous looking than “1 click exploit” would have likely reduced the severity of the issue and the media coverage significantly.

And thanks again to Kevin for top notch management of the issue. My personal silver lining will be that this was pretty much a triumph of “enough eyes make bugs shallow”.

How to indicate device compatibility for your app in MetaInfo data

At the moment I am hard at work putting together the final bits for the AppStream 1.0 release (hopefully to be released this month). The new release comes with many new features, an improved developer API and removal of most deprecated things (so it carefully breaks compatibility with very old data and the previous C API). One of the tasks for the upcoming 1.0 release was #481, asking about a formal way to distinguish Linux phone applications from desktop applications.

AppStream infamously does not support any "is-for-phone" label for software components; instead, the decision whether something is compatible with a device is based on the device's capabilities and the component's requirements. This allows truly adaptive applications to describe their requirements correctly, and does not lock us into "form factors" going into the future, as there are many, and the feature range between a phone, a tablet and a tiny laptop is quite fluid.

Of course the “match to current device capabilities” check does not work if you are a website ranking phone compatibility. It also does not really work if you are a developer and want to know which devices your component / application will actually be considered compatible with. One goal for AppStream 1.0 is to have its library provide more complete building blocks to software centers. Instead of just a “here’s the data, interpret it according to the specification” API, libappstream now interprets the specification for the application and provides API to handle most common operations – like checking device compatibility. For developers, AppStream also now implements a few “virtual chassis configurations”, to roughly gauge which configurations a component may be compatible with.

To test the new code, I ran it against the large Debian and Flatpak repositories to check which applications are considered compatible with what chassis/device type already. The result was fairly disastrous, with many applications not specifying compatibility correctly (many do, but it's by far not the norm!). Which brings me to the actual topic of this blog post: very few seem to really know how to mark an application compatible with certain screen sizes and inputs! This is most certainly a matter of incomplete guides and a lack of good templates, so maybe this post can help with that a bit:

The ultimate cheat-sheet to mark your app “chassis-type” compatible

As a quick reminder, compatibility is indicated using AppStream's relations system: A requires relation indicates that the software will not run at all, or will run terribly, if the requirement is not met. If the requirement is not met, it should not be installable on a system. A recommends relation means that it would be advantageous to have the recommended items, but it's not essential to run the application (it may run with a degraded experience without the recommended things though). And a supports relation means a given interface/device/control/etc. is supported by this application, but the application may work completely fine without it.

I have a desktop-only application

A desktop-only application is characterized by needing a larger screen to fit the application, and requiring a physical keyboard and accurate mouse input. This type is assumed by default if no capabilities are set for an application, but it’s better to be explicit. This is the metadata you need:

<component type="desktop-application">
  <id>org.example.desktopapp</id>
  <name>DesktopApp</name>
  [...]
  <requires>
    <display_length>768</display_length>

    <control>keyboard</control>
    <control>pointing</control>
  </requires>
  [...]
</component>

With this requires relation, you require a small-desktop sized screen (at least 768 device-independent pixels (dp) on its smallest edge) and require a keyboard and mouse to be present / connectable. Of course, if your application needs more minimum space, adjust the requirement accordingly. Note that if the requirement is not met, your application may not be offered for installation.

Note: Device-independent / logical pixels

One logical pixel (= device independent pixel) roughly corresponds to the visual angle of one pixel on a device with a pixel density of 96 dpi (for historical X11 reasons) and a distance from the observer of about 52 cm, making the physical pixel about 0.26 mm in size. When using logical pixels as unit, they might not always map to exact physical lengths as their exact size is defined by the device providing the display. They do however accurately depict the maximum amount of pixels that can be drawn in the depicted direction on the device’s display space. AppStream always uses logical pixels when measuring lengths in pixels.
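
To make that concrete: a 1920×1080 laptop panel running at 200% scaling exposes 960×540 logical pixels, so its shortest edge is 540 dp; it would satisfy a <display_length>360</display_length> requirement, but not the 768 dp requirement from the desktop example above.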

I have an application that works on mobile and on desktop / an adaptive app

Adaptive applications have fewer hard requirements, but a wide range of support for controls and screen sizes. For example, they support touch input, unlike desktop apps. An example MetaInfo snippet for these kinds of apps may look like this:

<component type="desktop-application">
  <id>org.example.adaptive_app</id>
  <name>AdaptiveApp</name>
  [...]

  <requires>
    <display_length>360</display_length>
  </requires>

  <supports>
    <control>keyboard</control>
    <control>pointing</control>
    <control>touch</control>
  </supports>
  [...]
</component>

Unlike the pure desktop application, this adaptive application requires a much smaller lowest display edge length, and also supports touch input, in addition to keyboard and mouse/touchpad precision input.

I have a pure phone/tablet app

Making an application a pure phone application is tricky: We need to mark it as compatible with phones only, while not completely preventing its installation on non-phone devices (even though its UI is horrible, you may want to test the app, and software centers may allow its installation when requested explicitly even if they don’t show it by default). This is how to achieve that result:

<component type="desktop-application">
  <id>org.example.phoneapp</id>
  <name>PhoneApp</name>
  [...]

  <requires>
    <display_length>360</display_length>
  </requires>

  <recommends>
    <display_length compare="lt">1280</display_length>
    <control>touch</control>
  </recommends>
  [...]
</component>

We require a phone-sized display minimum edge size (adjust to a value that is fit for your app!), but then also recommend the screen to have a smaller edge size than a larger tablet/laptop, while also recommending touch input and not listing any support for keyboard and mouse.

Please note that this blog post is of course not a comprehensive guide, so if you want to dive deeper into what you can do with requires/recommends/suggests/supports, you may want to have a look at the relations tags described in the AppStream specification.

Validation

It is still easy to make mistakes with the system requirements metadata, which is why AppStream 1.0 will provide more commands to check MetaInfo files for system compatibility. Current pre-1.0 AppStream versions already have an is-satisfied command to check if the application is compatible with the currently running operating system:

:~$ appstreamcli is-satisfied ./org.example.adaptive_app.metainfo.xml
Relation check for: */*/*/org.example.adaptive_app/*

Requirements:
 • Unable to check display size: Can not read information without GUI toolkit access.
Recommendations:
 • No recommended items are set for this software.
Supported:
 ✔ Physical keyboard found.
 ✔ Pointing device (e.g. a mouse or touchpad) found.
 • This software supports touch input.

In addition to this command, AppStream 1.0 will introduce a new one as well: check-syscompat. This command will check the component against libappstream’s mock system configurations that define a “most common” (whatever that is at the time) configuration for a respective chassis type.

If you pass the --details flag, you can even get an explanation why the component was considered or not considered for a specific chassis type:

:~$ appstreamcli check-syscompat --details ./org.example.phoneapp.metainfo.xml
Chassis compatibility check for: */*/*/org.example.phoneapp/*

Desktop:
  Incompatible
 • recommends: This software recommends a display with its shortest edge
   being << 1280 px in size, but the display of this device has 1280 px.
 • recommends: This software recommends a touch input device.

Laptop:
  Incompatible
 • recommends: This software recommends a display with its shortest edge 
   being << 1280 px in size, but the display of this device has 1280 px.
 • recommends: This software recommends a touch input device.

Server:
  Incompatible
 • requires: This software needs a display for graphical content.
 • recommends: This software needs a display for graphical content.
 • recommends: This software recommends a touch input device.

Tablet:
 ✔ Compatible (100%)

Handset:
 ✔ Compatible (100%)

I hope this is helpful for people. Happy metadata writing! 😀