November 03, 2024

Igalia and WebKit: status update and plans (2024)

It’s been more than 2 years since the last time I wrote something here, and in that time a lot of things have happened. Among those, one of the main highlights was me moving back to Igalia’s WebKit team, but this time I moved as part of Igalia’s support infrastructure to help with other types of tasks such as general coordination, team facilitation and project management, among other things.

On top of those things, I’ve also been presenting our work around WebKit in different venues, such as the Embedded Open Source Summit or the Embedded Recipes conference. Of course, that included presenting our work in the WebKit community as part of the WebKit Contributors Meeting, a small and technically focused event that happens every year, normally around the Bay Area (California). That’s often a pretty dense presentation where, over the course of 30-40 minutes, we go through all the main areas that we at Igalia contribute to in WebKit, trying to summarize our main contributions in the previous 12 months. This includes work not just from the WebKit team, but also from other ones such as our Web Platform, Compilers or Multimedia teams.

So far I have done that only a couple of times, both last year on October 24th and this year, just a couple of weeks ago in the latest instance of the WebKit Contributors Meeting. I believe the session was interesting and informative, but unfortunately it does not get recorded, so this time I thought I’d write a blog post to make it more widely accessible to people not attending that event.

This is a long read, so maybe grab a cup of your favorite beverage first…

Igalia and WebKit

So first of all, what is the relationship between Igalia and the WebKit project?

In a nutshell, we are the lead developers and the maintainers of the two Linux-based WebKit ports, known as WebKitGTK and WPE. These ports share a common baseline (e.g. GLib, GStreamer, libsoup) and also some goals (e.g. performance, security), but other than that their purpose is different, with WebKitGTK being aimed at the Linux desktop, while WPE is mainly focused on embedded devices.

This means that, while WebKitGTK is the go-to solution to embed Web content in GTK applications (e.g. GNOME Web/Epiphany, Evolution), and therefore integrates well with that graphical toolkit, WPE does not even provide a graphical toolkit, since its main goal is to run well on embedded devices that often don’t have a lot of memory or processing power, or even the usual mechanisms for I/O that we are used to in desktop computers. This is why WPE is designed with flexibility in mind, with a backends-based architecture, why it aims at using as few resources as possible, and why it tries to depend on as few libraries as possible, so you can integrate it in virtually any kind of embedded Linux platform.

Besides that port-specific work, which is what our WebKit and Multimedia teams focus a lot of their effort on, we also contribute at a different level in the port-agnostic parts of WebKit, mostly around the area of Web standards (e.g. contributing to Web specifications and implementing them) and the JavaScript engine. This work is carried out by our Web Platform and Compilers teams, which tirelessly contribute to the different parts of WebCore and JavaScriptCore that affect not just the WebKitGTK and WPE ports, but also the rest of them to a bigger or smaller degree.

Last but not least, we also devote a considerable amount of our time to other topics such as accessibility, performance, bug fixing, QA... and also to make sure WebKit works well on 32-bit devices, which is an important thing for a lot of WPE users out there.

Who are our users?

At Igalia we distinguish 4 main types of users of the WebKitGTK and WPE ports of WebKit:

Port users: this category would include anyone that writes a product directly against the port’s API, that is, apps such as a desktop Web browser or embedded systems that rely on a fullscreen Web view to render its Web-based content (e.g. digital signage systems).

Platform providers: in this category we would have developers that build frameworks with one of the Linux ports at its core, so that people relying on such frameworks can leverage the power of the Web without having to directly interface with the port’s API. RDK could be a good example of this use case, with WPE at the core of the so-called Thunder plugin (previously known as WPEFramework).

Web developers: of course, Web developers willing to develop and test their applications against our ports need to be considered here too, as they come with a different set of needs that need to be fulfilled, beyond rendering their Web content (e.g. using the Web Inspector).

End users: And finally, the end user is the last piece of the puzzle we need to pay attention to, as that’s what makes all this effort a task worth undertaking, even if most of them most likely don’t know what WebKit is, which is perfectly fine :-)

We like to make this distinction of 4 possible types of users explicit because we think it’s important to understand the complexity of the amount of use cases and the diversity of potential users and customers we need to provide service for, which is behind our decisions and the way we prioritize our work.

Strategic goals

Our main goal is that our product, the WebKit web engine, is useful for more and more people in different situations. Because of this, it is important that the platform is homogeneous and that it can be used reliably with all the engines available nowadays, and this is why compatibility and interoperability are a must, and why we work with the standards bodies to help with the design and implementation of several Web specifications.

With WPE, it is very important to be able to run the engine in small embedded devices, and that requires good performance and being efficient in multiple hardware architectures, as well as great flexibility for specific hardware, which is why we provided WPE with a backend-based architecture, and reduced dependencies to a minimum.

Then, it is also important that the QA Infrastructure is good enough to keep the releases working and with good quality, which is why I regularly maintain, evolve and keep an eye on the EWS and post-commit bots that keep WebKitGTK and WPE building, running and passing the tens of thousands of tests that we need to check continuously, to ensure we don’t regress (or that we catch issues soon enough, when there’s a problem). Then of course it’s also important to keep doing security releases, making sure that we release stable versions with fixes to the different CVEs reported as soon as possible.

Finally, we also make sure that we keep evolving our tooling as much as possible (see for instance the release of the new SDK earlier this year), as well as improving the documentation for both ports.

Last, all this effort would not be possible if we did not also make it a goal of ours to maintain an efficient collaboration with the rest of the WebKit community in different ways, from making sure we re-use and contribute to other ports as much code as possible, to making sure we communicate well in all the forums available (e.g. Slack, mailing list, annual meeting).

Contributions to WebKit in numbers

Well, first of all the usual disclaimer: number of commits is for sure not the best possible metric, and therefore should be taken with a grain of salt. However, the point here is not to focus too much on the actual numbers but on the more general conclusions that can be extracted from them, and from that point of view I believe it’s interesting to take a look at this data at least once a year.

Igalia contributions to WebKit (2024)

With that out of the way, it’s interesting to confirm that once again we are still the 2nd biggest contributor to WebKit after Apple, with ~13% of the commits landed in this past 12-month period. More specifically, we landed 2027 patches out of the 15617 that landed during the past year, only surpassed by Apple and their 12456 commits. The remaining 1134 patches were landed mostly by Sony, followed by Red Hat and several other contributors.

Now, if we remove Apple from the picture, we can observe how this year our contributions represented ~64% of all the non-Apple commits, a figure that grew by ~11% compared to the past year. This confirms once again our commitment to WebKit, a project we started contributing to about 14 years ago, and where we have systematically been the 2nd top contributor for a while now.
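
For readers who want to double-check the percentages, here is a small sketch that redoes the arithmetic using only the commit counts quoted above. The variable names are made up for this illustration.

```python
# Commit counts quoted in the post for the past 12-month period.
total_commits = 15617
apple_commits = 12456
igalia_commits = 2027
other_commits = 1134  # Sony, Red Hat and several other contributors

# Sanity check: the three groups add up to the total.
assert apple_commits + igalia_commits + other_commits == total_commits

# Share of all commits landed by Igalia.
igalia_share = igalia_commits / total_commits

# Share of the non-Apple commits landed by Igalia.
non_apple_commits = total_commits - apple_commits
igalia_non_apple_share = igalia_commits / non_apple_commits

print(f"Igalia share of all commits: {igalia_share:.1%}")
print(f"Igalia share of non-Apple commits: {igalia_non_apple_share:.1%}")
```

Both results round to the ~13% and ~64% figures mentioned in the text.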

Main areas of work

The 10 main areas we have contributed to in WebKit in the past 12 months are the following ones:

  • Web platform
  • Graphics
  • Multimedia
  • JavaScriptCore
  • New WPE API
  • WebKit on Android
  • Quality assurance
  • Security
  • Tooling
  • Documentation

In the next sections I’ll talk a bit about what we’ve done and what we’re planning to do next for each of them.

Web Platform

content-visibility:auto

This feature allows skipping the painting and rendering of off-screen sections, which is particularly useful to avoid the browser spending time rendering parts of large pages, as content outside of the view doesn’t get rendered until it becomes visible.

We completed the implementation and it’s now enabled by default.

Navigation API

This is a new API to manage browser navigation actions and examine history, which we started working on in the past cycle. There’s been a lot of work happening here and, while it’s not finished yet, the current plan is that Apple will continue working on that in the next months.

hasUAVisualTransition

This is an attribute of the NavigateEvent interface, which is meant to be true if the User Agent has performed a visual transition before a navigation event. We have also finished implementing it, and it is now enabled by default as well.

Secure Curves in the Web Cryptography API

In this case, we worked on fixing several Web Interop related issues, as well as on increasing test coverage within the Web Platform Tests (WPT) test suites.

On top of that we also moved the X25519 feature to the “prepare to ship” stage.

Trusted Types

This work is related to reducing DOM-based XSS attacks. Here we finished the implementation and this is now pending to be enabled by default.

MathML

We continued working on the MathML specification by working on the support for padding, border and margin, as well as by increasing the WPT score by ~5%.

The plan for next year is to continue working on core features and improve the interaction with CSS.

Cross-root ARIA

Web components have accessibility-related issues with native Shadow DOM as you cannot reference elements with ARIA attributes across boundaries. We haven’t worked on this in this period, but the plan is to work in the next months on implementing the Reference Target proposal to solve those issues.

Canvas Formatted Text

Canvas has no solution for adding formatted and multi-line text, so we would also like to work on exploring and prototyping the Canvas Place Element proposal in WebKit, which allows better text in canvas and more extended features.

Graphics

Completed migration from Cairo to Skia for the Linux ports

If you have followed the latest developments, you probably already know that the Linux WebKit ports (i.e. WebKitGTK and WPE) have moved from Cairo to Skia for their 2D rendering library, which was a pretty big and important decision taken after a long time trying different approaches and experiments (including developing our own HW-accelerated 2D rendering library!), as well as running several tests and measuring results in different benchmarks.

The results in the end were pretty overwhelming and we decided to give Skia a go, and we are happy to say that, as of today, the migration has been completed: we covered all the use cases in Cairo, achieving feature parity, and we are now working on implementing new features and improvements built on top of Skia (e.g. GPU-based 2D rendering).

On top of that, Skia has been the default backend for WebKitGTK and WPE since 2.46.0, released on September 17th, so if you’re building a recent version of those ports you’ll already be using Skia as their 2D rendering backend. Note that Skia is using its GPU-based backend only on desktop environments. On embedded devices the situation is trickier and for now the default is the CPU-based Skia backend, but we are actively working to narrow the gap and to enable GPU-based rendering also on embedded.

Architecture changes with buffer sharing APIs (DMABuf)

We did a lot of work here, such as a big refactoring of the fencing system to control the access to the buffers, or the continued work towards integrating with Apple’s DisplayLink infrastructure.

On top of that, we also enabled more efficient composition using damaging information, so that we don’t need to pass that much information to the compositor, which would slow the CPU down.
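
To give a rough idea of why damage information helps, here is a minimal sketch that reduces a set of changed rectangles to a single bounding region and compares it with the full buffer. This is only an illustration: the Rect type, the function names and the numbers are assumptions made up for the example, and it does not reflect how WebKit actually represents or propagates damage.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    width: int
    height: int


def clip_to_buffer(rect: Rect, buf_w: int, buf_h: int) -> Optional[Rect]:
    """Clip a rectangle to the buffer bounds; return None if nothing is left."""
    x0, y0 = max(rect.x, 0), max(rect.y, 0)
    x1 = min(rect.x + rect.width, buf_w)
    y1 = min(rect.y + rect.height, buf_h)
    if x1 <= x0 or y1 <= y0:
        return None
    return Rect(x0, y0, x1 - x0, y1 - y0)


def bounding_damage(rects: Iterable[Rect], buf_w: int, buf_h: int) -> Optional[Rect]:
    """Return the smallest rectangle covering all damaged areas, or None."""
    clipped = [clip_to_buffer(r, buf_w, buf_h) for r in rects]
    visible = [c for c in clipped if c is not None]
    if not visible:
        return None
    x0 = min(c.x for c in visible)
    y0 = min(c.y for c in visible)
    x1 = max(c.x + c.width for c in visible)
    y1 = max(c.y + c.height for c in visible)
    return Rect(x0, y0, x1 - x0, y1 - y0)


# Hypothetical frame: two small updates, plus one rectangle that is off-screen.
changed = [Rect(10, 10, 100, 50), Rect(200, 300, 40, 40), Rect(-500, 0, 100, 100)]
damage = bounding_damage(changed, 1920, 1080)
ratio = (damage.width * damage.height) / (1920 * 1080)
print(damage, f"covers {ratio:.1%} of the buffer")
```

In this made-up frame only a small fraction of the buffer needs to be handed over for composition, which is the kind of saving the damage information makes possible.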

Enablement of the GPUProcess

On this front, we enabled by default the compilation for WebGL rendering using the GPU process, and we are currently working on reviewing its performance and enabling it for other types of rendering.

New SVG engine (LBSE: Layer-Based SVG Engine)

If you are not familiar with this, here the idea is to make sure that we reuse the graphics pipeline used for HTML and CSS rendering, and use it also for SVG, instead of having its own pipeline. This means, among other things, that SVG layers will be supported as a 1st-class citizen in the engine, enabling HW-accelerated animations, as well as support for 3D transformations for individual SVG elements.

On this front, in this cycle we added support for the missing features in the LBSE, namely:

  • Implemented support for gradients & patterns (applicable to both fill and stroke)
  • Implemented support for clipping & masking (for all shapes/text)
  • Implemented support for markers
  • Helped review implementation of SVG filters (done by Apple)

Besides all this, we also improved the performance of the new layer-based engine by reducing repaints and re-layouts as much as possible (further optimizations still possible), narrowing the performance gap with the current engine for MotionMark. While we are still not at the same level of performance as the current SVG engine, we are confident that there are several key places where, with the right funding, we should be able to improve the performance to at least match the current engine, and therefore be able to push the new engine through the finish line.

General overhaul of the graphics pipeline, touching different areas (WIP):

On top of everything else commented above, we also worked on a general refactor and simplification of the graphics pipeline. For instance, we have been working on the removal of the Nicosia layer now that we are not planning to have multiple rendering implementations, among other things.

Multimedia

DMABuf-based sink for HW-accelerated video

We merged the DMABuf-based sink for HW-accelerated video in the GL-based GStreamer sink.

WebCodecs backend

We completed the implementation of audio/video encoding and decoding, and this is now enabled by default in 2.46. As for the next steps, we plan to keep working on the integration of WebCodecs with WebGL and WebAudio.

GStreamer-based WebRTC backends

We continued working on GstWebRTC, bringing it to a point where it can be used in production in some specific use cases, and we will still be working on this in the next months.

Other

Besides the points above, we also added an optional text-to-speech backend based on libspiel to the development branch, and worked on general maintenance around the support for Media Source Extensions (MSE) and Encrypted Media Extensions (EME), which are crucial for the use case of WPE running in set-top boxes. This is a permanent task that we will continue to work on in the next months.

JavaScriptCore

ARMv7/32-bit support:

A lot of work happened around 32-bit support in JavaScriptCore, especially around WebAssembly (WASM): we ported the WASM BBQJIT and ported/enabled concurrent JIT support, and we also completed 80% of the implementation for the OMG optimization level of WASM, which we plan to finish in the next months. If you are unfamiliar with what the OMG and BBQ optimization tiers in WASM are, I’d recommend you to take a look at this article in webkit.org: Assembling WebAssembly.

We also contributed to the JIT-less WASM, which is very useful for embedded systems that can’t support JIT for security or memory related constraints, and also did some work on the In-Place Interpreter (IPInt), which is a new version of the WASM Low-level interpreter (LLInt) that uses less memory and executes WASM bytecode directly without translating it to LLInt bytecode (and should therefore be faster to execute).

Last, we also contributed most of the implementation for the WASM GC, with the exception of some Kotlin tests.

As for the next few months, we plan to investigate and optimize heap/JIT memory usage in 32-bit, as well as to finish several other improvements on ARMv7 (e.g. IPInt).

New WPE API

The new WPE API aims at making it easier to use WPE in embedded devices by removing the hassle of having to handle several libraries in tandem (e.g. WPEWebKit, libWPE and WPEBackend-FDO, all available from WPE’s releases page), and by providing a more modern API in general, better aimed at the most common use cases of WPE.

A lot of effort happened this year along these lines, including the fact that we finally upstreamed and shipped its initial implementation with WPE 2.44, back in the first half of the year. Now, while we recommend users to give it a try and report feedback as much as possible, this new API is still not set in stone, with regular development still ongoing, so if you have the chance to try it out and share your experience, comments are welcome!

Besides shipping its initial implementation, we also added support for external platforms, so that other ones can be loaded beyond the Wayland, DRM and “headless” ones, which are the default platforms already included with WPE itself. This means for instance that a GTK4 platform, or another one for RDK could be easily used with WPE.

Then of course a lot of API additions were included in the new API in the latest months:

  • Screens management API: API to handle different screens, ask the display for the list of screens with their device scale factor, refresh rate, geometry…
  • Top level management API: This API allows a greater degree of control, for instance by allowing more than one WebView for the same top level, as well as allowing to retrieve properties such as size, scale or state (i.e. full screen, maximized…).
  • Maximized and minimized windows API: API to maximize/minimize a top level and monitor its state, mainly used by WebDriver.
  • Preferred DMA-BUF formats API: enables asking the platform (compositor or DRM) for the list of preferred formats and their intended use (scanout/rendering).
  • Input methods API: allows platforms to provide an implementation to handle input events (e.g. virtual keyboard, autocompletion, auto correction…).
  • Gestures API: API to handle gestures (e.g. tap, drag).
  • Buffer damaging: WebKit generates information about the areas of the buffer that actually changed and we pass that to DRM or the compositor to optimize painting.
  • Pointer lock API: allows the WebView to lock the pointer so that the movement of the pointing device (e.g. mouse) can be used for a different purpose (e.g. first-person shooters).

Last, we also added support for testing automation, and we can support WebDriver now in the new API.

With all this done so far, the plan now is to complete the new WPE API, with a focus on the Settings API and accessibility support, write API tests and documentation, and then also add an external platform to support GTK4. This is done on a best-effort basis, so there’s no specific release date.

WebKit on Android

This year was also a good year for WebKit on Android, also known as WPE Android, as this is a project that sits on top of WPE and its public API (instead of developing a fully-fledged WebKit port).

In case you’re not familiar with this, the idea here is to provide a WebKit-based alternative to the Chromium-based Web view on Android devices, in a way that leverages HW acceleration when possible and that integrates natively (and nicely) with the several Android subsystems, and of course with Android’s native mainloop. Note that this is an experimental project for now, so don’t expect production-ready quality quite yet, but hopefully something that can be used to start experimenting with selected use cases.

If you’re adventurous enough, you can already try the APKs yourself from the releases page in GitHub at https://github.com/Igalia/wpe-android/releases.

Anyway, as for the changes that happened in the past 12 months, here is a summary:

  • Updated WPE Android to WPE 2.46 and NDK 27 LTS
  • Added support for WebDriver and included WPT test suites
  • Added support for instrumentation tests, and integrated with the GitHub CI
  • Added support for the remote Web inspector, very useful for debugging
  • Enabled the Skia backend, bringing HW-accelerated 2D rendering to WebKit on Android
  • Implemented prompt delegates, allowing implementing things such as alert dialogs
  • Implemented WPEView client interfaces, allowing responding to things such as HTTP errors
  • Packaged a WPE-based Android WebView in its own library and published in Maven Central. This is a massive improvement as now apps can use WPE Android by simply referencing the library from the gradle files, no need to build everything on their own.
  • Other changes: enabled HTTP/2 support (via the migration to libsoup3), added support for the device scale factor, improved the virtual on-screen keyboard, general bug fixing…

On top of that, we published 3 different blog posts covering different topics, from a general intro to a deeper dive into the internals, and showing some demos. You can check them out in Jani’s blog at https://blogs.igalia.com/jani

As for the future, we’ll focus on stabilization and regular maintenance for now, and then we’d like to work towards achieving production-ready quality for specific cases if possible.

Quality Assurance

On the QA front, we had a busy year but in general we could highlight the following topics.

  • Fixed a lot of API tests failures in the bots that were limiting our test coverage.
  • Fixed lots of assertions-related crashes in the bots, which were slowing down the bots as well as causing other types of issues, such as bots exiting early due to too many failures.
  • Enabled assertions in the release bots, which will help prevent crashes in the future, as well as make our debug bots healthier.
  • Moved all the WebKitGTK and WPE bots to building now with Skia instead of Cairo. This means that all the bots running tests are now using Skia, and there’s only one bot still using Cairo to make sure that the compilation is not broken, but that bot does not run tests.
  • Moved all the WebKitGTK bots to use GTK4 by default. As with the move to Skia, all the WebKit bots running tests now use GTK4 and the only one remaining building with GTK3 does not run tests, it only makes sure we don’t break the GTK3 compilation for now.
  • Working on moving all the bots to use the new SDK. This is still work in progress and will likely be completed during 2025, as it requires implementing several changes in the infrastructure that will take some time.
  • General gardening and bot maintenance

In the next months, our main focus will be a revamp of the QA infrastructure to make sure that we can get all the bots (including the debug ones) to a healthier state, finish the migration of all the bots to the new SDK and, ideally, be able to bring back the ready-to-use WPE images that we used to have available in wpewebkit.org.

Security

The current release cadence has been working well, so we continue issuing major releases every 6 months (March, September), and then minor and unstable development releases happening on-demand when needed.

As usual, we kept aligning releases for WebKitGTK and WPE, with both of them happening at the same time (see https://webkitgtk.org/releases and https://wpewebkit.org/release), and then also publishing WebKit Security Advisories (WSA) when necessary, both for WebKitGTK and for WPE.

Last, we also shortened the time before including security fixes in stable releases this year, and we have removed support for libsoup2 from WPE, as that library is no longer maintained.

Tooling & Documentation

On tooling, the main piece of news is that this year we released the initial version of the new SDK, which is developed on top of OCI-based containers. This new SDK fixes the issues with the current existing approaches based on JHBuild and flatpak, where one of them was great for development but poor for testing and QA, and the other one was great for testing and QA, but not very convenient for development.

This new SDK is regularly maintained and currently runs on Ubuntu 24.04 LTS with GCC 14 & Clang 18. It has been made public on GitHub and announced to the public in May 2024 in Patrick’s blog, and is now the officially recommended way of building WebKitGTK and WPE.

As for documentation, we didn’t do as much as we would have liked here, but we still landed a few contributions in docs.webkit.org, mostly related to WebKitGTK (e.g. Releases and Versioning, Security Updates, Multimedia). We plan to do more in this regard in the next months, though, mostly by writing/publishing more documentation and perhaps also some tutorials.

Final thoughts

This has been a fairly long blog post but, as you can see, it’s been quite a year for WebKit here at Igalia, with many exciting changes happening on several fronts, and so there was quite a lot of stuff to comment on here. That said, you can always check the slides of the presentation in the WebKit Contributors Meeting here if you prefer a more concise version of the same content.

In any case, what’s clear is that the next months are probably going to be quite interesting as well with all the work that’s already going on in WebKit and its Linux ports, so it’s possible that in 12 months from now I might be writing an equally long essay. We’ll see.

Thanks for reading!

Profiling w/o Frame Pointers

A couple years ago the Fedora council denied a request by Meta engineers to build the distribution with frame-pointers. Pretty immediately I pushed back by writing a number of articles to inform the council members why frame-pointers were necessary for a good profiling experience.

Profiling is used by developers, system administrators, and when we’re lucky by bug reporters!

Since then, many people have discussed other options. For example in the not too distant future we’ll probably see SFrame unwinding provide a reasonable way to unwind stacks w/o frame-pointers enabled and more importantly, without copying the contents of the stack.

Until then, it can be helpful to have a way to unwind stacks even without the presence of frame-pointers. This past week I implemented that for Sysprof based on a prototype put together by Serhei Makarov in the elfutils project called eu-stacktrace.

This prototype works by taking samples of the stack from perf (say 16KB-32KB worth) and resolving enough of the ELF data for DWARF/CFI (Call-frame-information)/etc to unwind the stacks in memory using a copy of the registers. From this you create a callchain (array of instruction pointers) which can be sent to Sysprof for recording.

I say “in memory” because the stack and register content doesn’t hit disk. It only lands inside the mmap()-based ring buffer used to communicate with Linux’s perf event subsystem. The (much smaller) array of instruction pointers eventually lands on disk if you’re not recording to a memfd.

I expanded upon this prototype with a new sysprof-live-unwinder process which does roughly the same thing as eu-stacktrace while fitting into the Sysprof infrastructure a bit more naturally. It consumes a perf data stream directly (eu-stacktrace consumed Sysprof-formatted data) and then provides that to Sysprof to help reduce overhead.

Additionally, eu-stacktrace only unwinds the user-space side of things. On x86_64, at least, you can convince perf to give you both callchains (PERF_SAMPLE_CALLCHAIN) as well as sample stack/registers (PERF_SAMPLE_STACK_USER|PERF_SAMPLE_REGS_USER). If you peek for the location of PERF_CONTEXT_USER to find the context switch, blending them is quite simple. So, naturally, Sysprof does that. The additional overhead for frame-pointer unwinding user-space is negligible when you don’t have frame-pointers to begin with.
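
To make the blending step more concrete, here is a minimal sketch of the idea in Python. It is not Sysprof’s actual code: the function name and the sample addresses are hypothetical, and only the two marker values come from the perf_callchain_context enum in linux/perf_event.h.

```python
# Marker values from enum perf_callchain_context in <linux/perf_event.h>,
# as they appear when stored as unsigned 64-bit entries in the callchain.
U64_MASK = 0xFFFFFFFFFFFFFFFF
PERF_CONTEXT_KERNEL = -128 & U64_MASK
PERF_CONTEXT_USER = -512 & U64_MASK


def blend_callchain(perf_callchain, unwound_user_frames):
    """Keep the kernel part of perf's callchain and swap in the unwound user part.

    perf_callchain: entries as delivered with PERF_SAMPLE_CALLCHAIN, including
        the PERF_CONTEXT_* markers.
    unwound_user_frames: user-space instruction pointers recovered by unwinding
        the sampled stack and registers (innermost frame first).
    """
    if not unwound_user_frames:
        # Unwinding produced nothing, so keep whatever perf gave us.
        return list(perf_callchain)
    try:
        marker = perf_callchain.index(PERF_CONTEXT_USER)
    except ValueError:
        # No user marker in the sample: append one followed by our frames.
        return list(perf_callchain) + [PERF_CONTEXT_USER] + list(unwound_user_frames)
    # Everything up to and including the marker is kept, the rest is replaced.
    return list(perf_callchain[: marker + 1]) + list(unwound_user_frames)


# Hypothetical sample: without frame pointers, perf's own user-space part of
# the callchain stops after the first frame.
kernel_part = [PERF_CONTEXT_KERNEL, 0xFFFFFFFF81000010, 0xFFFFFFFF81000020]
perf_chain = kernel_part + [PERF_CONTEXT_USER, 0x401000]
unwound = [0x401000, 0x401200, 0x401400]

blended = blend_callchain(perf_chain, unwound)
print([hex(ip) for ip in blended])
```

The kernel frames are taken as-is from perf, and only the entries after the PERF_CONTEXT_USER marker are replaced by the frames obtained from unwinding.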

I should start by saying that this still has considerable overhead compared to frame-pointers. Locally on my test machine (a Thinkpad X1 Carbon Gen 3 from around 2015, so not super new) that is about 10% of samples. I imagine I can shave a bit of that off by tracking the VMAs differently than libdwfl, so we’ll see.

Here is an example of it working on CentOS Stream 10 which does not have frame-pointers enabled. Additionally, this build is debuginfod-enabled so after recording it will automatically locate enough debug symbols to get appropriate function names for what was captured.

This definitely isn’t the long term answer to unwinding. But if you don’t have frame-pointers on your production operating system of choice, it might just get you by until SFrame comes around.

The code is at wip/chergert/translate but will likely get cleaned up and merged this next week.

November 01, 2024

#172 Valencia

Update on what happened across the GNOME project in the week from October 25 to November 01.

Open source is more than just writing code; it’s about people and the community. Right now, the world faces numerous crises, and this past week, another tragedy occurred - one that also affects members of the GNOME community.

Manu (he/they/she) says

The Valencian Country, among other Spanish autonomies, has been hit by the worst natural disaster in its history. Entire villages have been completely flooded. There are more than 200 deaths so far that we know of, and more than 2000 people missing.

If you wish to help, Caritas is a trustworthy organization to donate to:

ES02 2100 8734 6113 0064 8236

Any donation Apostrophe receives in the next two months will also be donated to one of the local Horta Sud associations that are working in the field. I will also be helping where help is needed.

GNOME Core Apps and Libraries

Libadwaita

Building blocks for modern GNOME apps using GTK4.

Alice (she/her) reports

Peter Eisenmann added the :visible-page-tag property to AdwNavigationView - a helper for checking the current page by its tag

Alice (she/her) reports

the style class .dim-label that has always had a misleading name has been soft deprecated in favor of .dimmed. The old style still works same as before

Third Party Projects

Jan-Willem announces

This week I released Java-GI version 0.11.0! Java-GI is a GObject-Introspection bindings generator for Java, using the brand new foreign function interface of OpenJDK 22. It can be used to develop GNOME apps in Java.

This release features a lot of fixes and improvements, so make sure to check out the release notes.

For more information about Java-GI, visit the website, where you will find code samples in both Java and Kotlin. Additionally, the Gtk “Getting started guide” has been ported to Java and is now available here, and a couple new examples were added to the java-gi-examples repository.

Gianni Rosato says

We’ve got a new Aviator release for you! It’s packed with small bug fixes and an SVT-AV1-PSY update.

Bug Fixes:

  • We fixed the hicolor icon that was mislabeled as scalable.
  • We also fixed the audio bitrate resetting when you open a new file.

SVT-AV1-PSY v2.3.0 Improvements for Aviator:

  • You can now encode with odd (non-mod2) dimensions.
  • You can also encode at resolutions lower than 64x64, all the way down to 4x4.
  • The color reproduction and overall picture quality have improved when “Perceptual Tuning” is disabled.
  • There will be general perceptual fidelity improvements when you enable “Perceptual Tuning.”
  • There will be general performance improvements, especially on ARM platforms.

Other Changes:

  • We removed the sharpness usage when “Perceptual Tuning” is enabled.
  • We’ve updated FFmpeg to version 7.1.
  • We’ve updated LLVM to version 19 for project compilation.
  • We’ve updated to GNOME SDK 47.

Enjoy the new Aviator release! ✈️

Parabolic

Download web video and audio.

Nick says

Parabolic V2024.10.3 is here! This update introduces some new features, including the ability to select a batch txt file with multiple URLs for validation, and fixes many bugs regarding website validation, localization of the app on Windows, and crashes on Linux.

Here’s the full changelog:

  • Added support for selecting a batch file with multiple URLs to validate instead of validating a single URL at a time
  • Added a recovery mode where downloads that were running/queued will be restored when the application is restarted after a crash
  • User entered file names will now be correctly normalized and validated in the Add Download dialog
  • Fixed an issue where YouTube tabs were not correctly validated
  • Fixed an issue where the app’s documentation was not accessible
  • Fixed an issue where UTF-8 characters were not displayed correctly on Windows
  • Fixed an issue where playlist names were not normalized on Windows
  • Fixed an issue where the row animations were choppy using aria2c on Linux
  • Fixed an issue where the app would crash when stopping all downloads on Linux
  • Updated yt-dlp to 2024.10.22

Fractal

Matrix messaging app for GNOME written in Rust.

Kévin Commaille reports

😱 What’s that behind you⁉️ Oh, that’s the new Fractal 9 release❣️ 😁 🎃

  • We switched to the glycin library (the same one used by GNOME Image Viewer) to load images, allowing us to fix several issues, like supporting more animated formats and SVGs and respecting EXIF orientation.
  • The annoying bug where some rooms would stay as unread even after opening them is now a distant memory.
  • The media cache uses its own database that you can delete if you want to free some space on your system. It will also soon be able to clean up unused media files to prevent it from growing indefinitely.
  • Sometimes the day separators would show up with the wrong date, not anymore!
  • We migrated to the new GTK 4.16 and libadwaita 1.6 APIs, including CSS variables, AdwButtonRow and AdwSpinner.
  • We used to only rely on the secrets provider to tell us which Matrix accounts are logged-in, which caused issues for people sharing their secrets between devices. Now we also make sure that there is a data folder for a given session before trying to restore it.
  • Our notifications are categorized as coming from an instant messenger, so graphical shells that support it, such as Phosh, can play a sound for them.
  • Some room settings are hidden for direct chats, because it does not make sense to change them in this type of room.
  • The size of the headerbar would change depending on whether the room has a topic or not. This will not happen anymore.

As usual, this release includes other improvements and fixes thanks to all our contributors, and our upstream projects.

We want to address special thanks to the translators who worked on this version. We know this is a huge undertaking and have a deep appreciation for what you’ve done. If you want to help with this effort, head over to Damned Lies.

This version is available right now on Flathub.

We have a lot of improvements in mind for our next release, but if you want a particular feature to make it, the surest way is to implement it yourself! Start by looking at our issues or just come say hello in our Matrix room.

Dev Toolbox

Dev tools at your fingertips

Alessandro Iepure announces

After a long year of coding on and off (and balancing a lot of real life and university work!), I’m happy to finally share a new Dev Toolbox update! 🎉

This release packs a completely revamped UI for a smoother experience and new search functionality, making it even easier to find exactly the tool you need. You can also now mark your favorite tools to keep them in their own special menu, ready for quick access. And the fun doesn’t stop there; let me introduce three new tools:

  • JavaScript & CSS Minifiers to help you shrink those files down
  • A handy Base64 Encoder (huge thanks to @amersaw for the contribution!)

Plus, a handful of smaller improvements and bug fixes are sprinkled in to make your experience even better.

A huge shoutout to the translators who helped make this app accessible to more people around the world! 🌍

Dev Toolbox is available right now on Flathub.

Events

devrtz reports

📢 🎉 This week the FOSS on mobile devices Call for Proposals has been opened for FOSDEM 2025 🎉 📢

We are excited to have your presentations, demos and more! 📈 Showcase (and witness) the latest and greatest in Mobile Linux technologies ☎️ next year in Brussels 🚀

For more information, see devrtz’s Fosstodon post.

\o/ We hope to see you there \o/

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

A million portals

Approximately four years ago, I published the first release of ASHPD, one of my first Rust libraries, with the simple goal of making it easy to use XDG portals from Rust.

Since then, the library has grown to support all available portals and even includes a demo application showcasing some of these features.

Let's look at an example: the org.freedesktop.portal.Account portal. From the client side, an API end-user can request user information with the following code:

use ashpd::desktop::account::UserInformation;

async fn run() -> ashpd::Result<()> {
    let response = UserInformation::request()
        .reason("App would like to access user information")
        .send()
        .await?
        .response()?;

    println!("Name: {}", response.name());
    println!("ID: {}", response.id());

    Ok(())
}

This code calls the org.freedesktop.portal.Account.GetUserInformation D-Bus method, which xdg-desktop-portal will "redirect" to any portal backend implementing the org.freedesktop.impl.portal.Account D-Bus interface.

So, how can you provide an implementation of org.freedesktop.impl.portal.Account in Rust? That's exactly what Maximiliano and I have been working on, building on the solid foundations we established earlier. I’m thrilled to announce that we finally shipped this functionality in the 0.10 release!

The first step is to implement the D-Bus interface, which we hide from the API’s end-user using traits.

use ashpd::{
    async_trait,
    backend::{
        account::{AccountImpl, UserInformationOptions},
        request::RequestImpl,
        Result,
    },
    desktop::account::UserInformation,
    AppID, WindowIdentifierType,
};

pub struct Account;

#[async_trait]
impl RequestImpl for Account {
    async fn close(&self) {
        // Close the dialog
    }
}

#[async_trait]
impl AccountImpl for Account {
    async fn get_user_information(
        &self,
        _app_id: Option<AppID>,
        _window_identifier: Option<WindowIdentifierType>,
        _options: UserInformationOptions,
    ) -> Result<UserInformation> {
        Ok(UserInformation::new(
            "user",
            "User",
            url::Url::parse("file://user/icon").unwrap(),
        ))
    }
}

Pretty straightforward! With the D-Bus interface implemented using ASHPD wrapper types, the next step is to export it on the bus.

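use futures_util::future::pending;

// An async runtime is needed to drive main(); tokio is assumed here,
// but any executor supported by ASHPD would do.
#[tokio::main]
async fn main() -> ashpd::Result<()> {
    // Claim the well-known name and export the Account implementation on the bus.
    ashpd::backend::Builder::new("org.freedesktop.impl.portal.desktop.mycustomportal")?
        .account(Account)
        .build()
        .await?;

    // Keep the process alive so it can keep serving requests.
    loop {
        pending::<()>().await;
    }
}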

And that’s it: you’ve implemented your first portal backend!

Currently, the backend feature doesn't yet support session-based portals, but we hope to add that functionality in the near future.

With over 1 million downloads, ASHPD has come a long way, and it wouldn’t have been possible without the support and contributions from the community. A huge thank you to everyone who has helped make this library what it is today.

October 31, 2024

The Bargain-Finder-inator 5000: One programmer's quest for a new flat

Or how I managed to get a reasonably priced apartment offer despite estate agencies

I think every one of us has had to go through the hell that is searching for a new place to live. The reasons may be of all kinds, starting with moving between jobs or random life events, ending with your landlord wanting to raise your rent for fixing his couch despite your 3 years of begging for him to do so. You can guess my reasoning from that totally not suspiciously specific example. One thing's for certain: many of us, not lucky enough to be on our own yet, have to go through that not very delightful experience.

One major problem when scraping those online market websites, is that you're not the only one desperately doing so. And if it was only for the fellow lost souls who are trying to make ends meet, oh no - many real estate agencies say hello there as well. So when a very good offer finally comes up, one that you've been dreaming your whole life kind of one, you grab that phone and call them not maybe, but may they please-oh-lord pick up. Despite you wasting no breath, chances are that when you enthusiastically call them (after correcting the typos in the phone number you made out of excitement), you're already too late. Even though you ended up manually checking the damn website every 20 minutes (yup, I set an alarm), and you called after only a quarter, you were still not fast enough and there are already four people in line before you. Which in case of a good offer means it's as good as doughnuts at work you heard they were giving out to buy your sympathy for the corporate - gone even faster than they have probably arrived. Yup, that's basically the housing market situation in Poland, yay \o/

But do not abandon all hope, ye who enter here - after I'd had only a couple of mental breakdowns, my friend sent me a link to a program on GitHub that was supposed to scrape our local market website and give instant notice about new offers. The web page did have a similar function, but it only worked in theory - the emails about the "latest" offers came only once a day, not to mention the fact that they were from the day before. Oh well, in that case saying goodbye to the 20 minute alarm sounded like a dream come true, so I tried to configure the program, olx-scraper, to my needs. However, it turned out to be pretty useless as well - it would repeatedly fetch the whole list of offers from only one page of search results, and compare its size between iterations. If the length of that list increased, it would theoretically mean that there were new offers, and the program would send a mail notification that contained the whole list. While this approach kinda worked for searches that return only a few results, the whole idea fell apart when there were more than could fit on one page. In that case the number of offers would seem to remain constant, and new offers would be missed. Other room for improvement was the lack of a way to ignore certain kinds of offers, such as ads, and the not so helpful emails, which could just give you what you're looking for - the newest offer - instead of the whole list.
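To see why comparing list lengths breaks down, here is a minimal Rust sketch. It is not olx-scraper's code: offers are modelled as plain numeric ids (higher means newer), and `first_page` and the page size of 3 are made up for the illustration.

```rust
/// Return the first page of search results: newest first, capped at `page_size`.
/// Offers are modelled as numeric ids where a higher id means a newer offer
/// (an assumption made only for this sketch).
fn first_page(all_offers: &[u32], page_size: usize) -> Vec<u32> {
    let mut sorted = all_offers.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a));
    sorted.truncate(page_size);
    sorted
}

fn main() {
    let page_size = 3;
    let before = first_page(&[1, 2, 3, 4, 5], page_size);
    // A new offer (id 6) appears between two checks.
    let after = first_page(&[1, 2, 3, 4, 5, 6], page_size);

    // The page is already full, so its length does not change...
    println!("length changed: {}", before.len() != after.len());
    // ...but the newest entry does.
    println!("newest changed: {}", before.first() != after.first());
}
```

Once the page is full, its length stays at 3 while the newest entry goes from 5 to 6, so a length check sees nothing and a check on the newest entry catches the change.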

Here comes the sun in the form of the Bargain-Finder-inator 5000 to the rescue! I quickly realized that a few patches were not enough to fix the old program for my (or frankly anyone's) use case and re-wrote the whole searching algorithm, eventually leading to a whole new program. The original name was "Wyszukiwator-Mieszkań 5000", inspired by Dr. Doofenshmirtz's various schemes and inventions, and roughly translates to "Searcher-Of-Flats 5000". However, as the project grew beyond the real estate market, I needed a new name that would reflect that - it also needed to be slightly more accessible for foreigners than our oh how beautiful Polish words. So I came up with the current one, with the best fitting abbreviation: bf5000. I think it's kind of neat :)

Totally accurate photograph of me giving birth to Bargain-Finder-inator 5000 circa 2024, colorized


What Bargain-Finder-inator 5000 dutifully does is monitor a link you serve to it, pointing to an online marketplace, be it for a real estate market or any other you can think of. The catch is that the marketplace needs to be supported, but writing a new backend shouldn't be too much of a hassle. Once it is supported, you can simply copy and paste the URL of your search with all the necessary filters specified, and give it to bf5000. You also need to specify the delay between checks for new offers; each check consists of fetching only the latest offer and comparing it with the previous "latest". If they don't match, then we are in for some goodies - an email notification with the link to the latest offer will be sent, so you need to specify the email title, addresses and the provider too. For more information, check out the repository on GitLab.
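The "compare with the previous latest" check can be sketched in a few lines of Rust. This is only an illustration, not bf5000's actual code: the `Offer` struct, the `Watcher` type and its `check` method are hypothetical names, and fetching the page and sending the email are left out.

```rust
/// Hypothetical minimal offer record; a real backend would parse it
/// from the marketplace's search results.
#[derive(Debug, Clone, PartialEq)]
struct Offer {
    url: String,
    title: String,
}

/// Remembers the most recent offer seen so far.
struct Watcher {
    last_seen: Option<Offer>,
}

impl Watcher {
    fn new() -> Self {
        Watcher { last_seen: None }
    }

    /// Compare the newest offer on the page with the previous "latest".
    /// Returns the offer when it differs, meaning a notification is due.
    fn check(&mut self, latest: Offer) -> Option<Offer> {
        if self.last_seen.as_ref() == Some(&latest) {
            return None;
        }
        self.last_seen = Some(latest.clone());
        Some(latest)
    }
}

fn main() {
    let mut watcher = Watcher::new();
    let a = Offer { url: "https://example.com/offer/1".into(), title: "Flat A".into() };
    let b = Offer { url: "https://example.com/offer/2".into(), title: "Flat B".into() };

    // In this sketch the very first offer also counts as new.
    for latest in [a.clone(), a, b] {
        match watcher.check(latest) {
            Some(offer) => println!("notify: {} ({})", offer.title, offer.url),
            None => println!("nothing new"),
        }
    }
}
```

Whether the very first check should notify or only record a baseline is a design choice; this sketch notifies.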

So, don't wait no more for better days, and be part of the change now! We can take back what's rightfully ours from those money-hungry real estate agencies! When I say Bargain, you say Finder-inator 5000! You get the idea.


October 30, 2024

Happenings at work

A few months ago this happened.

Which, for those of you not up to date on your 1960s British television, is to say that I've resigned. I'm currently enjoying the unemployed lifestyle. No, that is not me being cheeky or ironic. I'm actually enjoying being able to focus on my own free time projects and sleeping late.

Since I'm not a millionaire, at some point I'll probably have to get a job again. But not for at least six months. Maybe more, maybe less, we'll see what happens.

This should not affect Meson users in any significant way. I plan to spend some time to work on some fundamental issues in the code base to make things better all round. But the most important thing for now is to land the option refactor monster.

October 27, 2024

Towards a GNOME Mobile Test Suite

GNOME Mobile

Making GNOME adapt to form factors beyond desktop and laptop computers is an ongoing trend that can be dated as early as the late 2000s, when Maemo provided a GNOME-based UI to phones like the Nokia N810 or the Nokia N900. Later, prototype versions of GNOME Shell had a netbook-friendly design that got course-corrected for its first release in 2011, keeping GNOME competent on larger screens.

GNOME 3 was designed with touchscreens in mind, especially touchscreen-equipped netbooks and laptops, with some foray into large touch-only devices like kiosks. With its touch capabilities and minimalist touch-friendly design, GNOME 3 offered a good base to adapt to even smaller touch-only form factors like tablets and smartphones.

In the late 2010s, two Linux smartphones were developed concurrently: Purism’s Librem 5 and Pine64’s PinePhone. Purism chose GNOME as the UI for its phone and invested in the development of the Phosh mobile-first shell for GNOME and in making GNOME apps adapt to smartphones. The GNOME community pretty widely embraced adaptiveness, which led to the creation of GNOME’s platform library libadwaita. At the same time, community-driven projects like postmarketOS and Mobian offered support for these smartphones and contributed to the development of this mobile-friendly software stack, including contributions to GNOME.

While these devices’ reception was polarizing, the Linux community was motivated enough to pursue what they initiated, leading to the birth of GNOME Shell Mobile and to the broadening of supported devices. While GNOME Shell got forked to make it fit smartphones, this is only to prototype mobile support freely. Ultimately the goal is to merge this support into Shell, making it adapt from desktops to smartphones. This still largely prototypal support for modern smartphones from GNOME, and the initiative behind it, are colloquially referred to as GNOME Mobile.

Testing GNOME Mobile

The GNOME release team defines what constitutes the canonical core GNOME stack, and describes it in the gnome-build-meta repository. GNOME OS is built based on this description and is used to test GNOME, ensuring its components are correctly integrated and interact well together. openQA is a high-level and automated OS testing tool, and in 2021, Codethink brought GNOME an openQA instance that is used to test GNOME OS automatically rather than manually. The tests are run in virtual machines thanks to QEMU.

Testing GNOME on smartphones implies testing its mobile-specific stack on smartphone-like devices the same way we test the rest of GNOME. Hardware requirements for GNOME are pretty loosely defined, and the only real requirement for smartphones is that apps designed for them should fit in a 360 × 294px window, so they can fit a 360px wide screen in portrait mode and a 360px tall screen in landscape mode, minus the space reserved for Shell. To that we can add some safe assumptions: a smartphone reports a handset chassis type, has a touchscreen as its main input method, should work without a keyboard and a pointing device, has a screen that is 9:16 or taller, and has a high pixel density that should be used with a matching integer scaling factor. For reference, here is the pixel density of some de-facto reference GNOME smartphones.

Device | Diagonal | Resolution | Density | UI Scale
--- | --- | --- | --- | ---
Librem 5 | 5.7” | 720 × 1440px | 282 ppi | 200%
PinePhone | 5.95” | 720 × 1440px | 270 ppi | 200%
PinePhone Pro | 6” | 720 × 1440px | 268 ppi | 200%
OnePlus 6 | 6.28” | 1080 × 2280px | 401 ppi | 300%
OnePlus 6T | 6.41” | 1080 × 2340px | 402 ppi | 300%
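The density and scale columns follow from simple arithmetic, which this small Rust sketch illustrates. The function names `ppi` and `logical_size` are made up for the example, and values computed this way can differ by one from the listed densities depending on rounding.

```rust
/// Pixel density in pixels per inch, from the resolution and the
/// screen diagonal in inches.
fn ppi(width_px: u32, height_px: u32, diagonal_in: f64) -> f64 {
    let diagonal_px = ((width_px as f64).powi(2) + (height_px as f64).powi(2)).sqrt();
    diagonal_px / diagonal_in
}

/// Logical resolution that apps see at an integer scaling factor
/// (2 for 200%, 3 for 300%).
fn logical_size(width_px: u32, height_px: u32, scale: u32) -> (u32, u32) {
    (width_px / scale, height_px / scale)
}

fn main() {
    // Librem 5: 720 × 1440 px on a 5.7" diagonal, used at 200%.
    println!("{:.0} ppi", ppi(720, 1440, 5.7));
    println!("{:?}", logical_size(720, 1440, 2));

    // OnePlus 6: 1080 × 2280 px, used at 300%.
    println!("{:?}", logical_size(1080, 2280, 3));
}
```

Both devices end up with a logical width of 360 px, which is why apps are expected to fit a 360 px wide window.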

Building an automated test suite for GNOME Mobile in openQA has already been attempted earlier this year by Dorothy Kabarozi and Tanju Acheleke, and they built the gnome_mobile test suite. Last month, Codethink offered me the opportunity to continue that effort; thanks to them for sponsoring that work.

I learned that Dorothy and Tanju encountered various issues that prevented them from doing proper mobile tests, so the suite they produced tests apps on a regular desktop but with their windows resized to smartphone-like sizes. The goal of my project was to make the test VM provide a smartphone-like screen size and chassis type.

Pixel Density

I first tweaked the VM’s screen to be 360 × 720, but such a small resolution isn’t supported and the tests automatically fail. No big deal: smartphones have high density screens and we want to test UI scaling, so I decided to switch to 720 × 1440 with 200% scaling… except of course the tests weren’t scaled, why would they be?

To set the scaling factor, we first have to complete the system’s initial setup unscaled, and then once finally logged into GNOME Shell, we discover Settings doesn’t let us change it. This happens because Mutter enables changing the scaling factor only on arbitrarily large-enough resolutions, and 720 × 1440@2 is below the required threshold. At this point, I faced the same issues as Dorothy and Tanju and didn’t go any further, but let’s dig a bit more.

Besides Mutter’s arbitrary limitation, we need to set the display’s physical size or pixel density so the OS can adapt to it from the very beginning. The best way to do this is to have an EDID declaring our display’s resolution and physical size; we just need to find the best way to generate it and to use it. We could use a tool like qemu-edid to generate the EDID we want, inject it into the OS, and override the one from the virtual machine, but it would be a messy and dirty workaround.

Our test suite uses QEMU with virtio-vga which offers the following properties:

#define VIRTIO_GPU_BASE_PROPERTIES(_state, _conf)                       \
    DEFINE_PROP_UINT32("max_outputs", _state, _conf.max_outputs, 1),    \
    DEFINE_PROP_BIT("edid", _state, _conf.flags, \
                    VIRTIO_GPU_FLAG_EDID_ENABLED, true), \
    DEFINE_PROP_UINT32("xres", _state, _conf.xres, 1280), \
    DEFINE_PROP_UINT32("yres", _state, _conf.yres, 800)

We already use xres and yres to set the display’s resolution, but there is also the edid property, which openQA toggles on to make QEMU generate an EDID describing the virtual machine’s screen. QEMU has all that’s needed to generate and expose an EDID with the right pixel density, except for a way to let the user override the pixel density, which QEMU defaults to 100 DPI.

We could imagine exposing the dpi parameter as a virtio-vga property, making QEMU able to emulate devices with a high density screen, and helping us run mobile tests.
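An EDID declares a physical size, so the pixel density has to be turned into millimetres at some point. The Rust sketch below shows only that arithmetic; it is not QEMU code, and `px_to_mm` is a name invented for the example.

```rust
/// Physical length in millimetres of `pixels` pixels at `dpi` dots per inch.
/// One inch is 25.4 mm.
fn px_to_mm(pixels: u32, dpi: u32) -> f64 {
    pixels as f64 * 25.4 / dpi as f64
}

fn main() {
    let (xres, yres) = (720, 1440);

    // At the 100 DPI default, the screen is declared far larger than a phone.
    println!("100 DPI: {:.1} x {:.1} mm", px_to_mm(xres, 100), px_to_mm(yres, 100));

    // At a Librem 5-like density, the declared size becomes phone-sized.
    println!("282 DPI: {:.1} x {:.1} mm", px_to_mm(xres, 282), px_to_mm(yres, 282));
}
```

With 720 × 1440 pixels, 100 DPI describes a panel roughly 183 × 366 mm, while 282 DPI describes one roughly 65 × 130 mm, which is the information the OS needs to pick a 200% scale by itself.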

Chassis Type

Then I looked at giving the VM a smartphone’s chassis type. The chassis type is defined in the SMBIOS, so let’s read about it in the reference specification:

7.4 System Enclosure or Chassis (Type 3)

The information in this structure (see Table 16) defines attributes of the system’s mechanical enclosure(s). For example, if a system included a separate enclosure for its peripheral devices, two structures would be returned: one for the main system enclosure and the second for the peripheral device enclosure. The additions to this structure in version 2.1 of this specification support the population of the CIM_Chassis class.

Table 16 – System Enclosure or Chassis (Type 3) structure

Offset | Name | Length | Value | Description
--- | --- | --- | --- | ---
05h | Type | BYTE | Varies | Bit 7: Chassis lock is present if 1. Otherwise, either a lock is not present or it is unknown if the enclosure has a lock. Bits 6:0: Enumeration value; see below.

7.4.1 System Enclosure or Chassis Types

Table 17 shows the byte values for the System Enclosure or Chassis Types field. NOTE Refer to 6.3 for the CIM properties associated with this enumerated value.

Table 17 – System Enclosure or Chassis Types

Byte Value | Meaning
--- | ---
01h | Other
0Bh | Hand Held

For our QEMU VM to declare being a handheld device, we need to set the SMBIOS structure type 3’s Type field to 0x0B.
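As a small illustration of the layout quoted above, this Rust sketch splits the Type byte into its lock bit and its enumeration value. The function and constant names are invented for the example; only the bit layout and the two byte values come from the specification excerpt.

```rust
/// Chassis type values quoted from Table 17.
const CHASSIS_OTHER: u8 = 0x01;
const CHASSIS_HAND_HELD: u8 = 0x0B;

/// Split the Type byte (offset 05h of the type 3 structure) into
/// (chassis lock present, enumeration value):
/// bit 7 is the lock flag, bits 6:0 are the chassis type.
fn decode_chassis_type(byte: u8) -> (bool, u8) {
    let lock_present = (byte & 0x80) != 0;
    let chassis = byte & 0x7F;
    (lock_present, chassis)
}

fn main() {
    for byte in [CHASSIS_OTHER, CHASSIS_HAND_HELD, 0x80 | CHASSIS_HAND_HELD] {
        let (lock, chassis) = decode_chassis_type(byte);
        let name = match chassis {
            CHASSIS_OTHER => "Other",
            CHASSIS_HAND_HELD => "Hand Held",
            _ => "not covered by this sketch",
        };
        println!("{byte:#04x}: lock present = {lock}, chassis = {name}");
    }
}
```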

According to its documentation, QEMU lets us set some of the SMBIOS fields conveniently via the -smbios parameter. For type 3 we are allowed -smbios type=3[,manufacturer=str][,version=str][,serial=str][,asset=str][,sku=str], so unfortunately it doesn’t let us set the chassis type. QEMU also lets us set the whole SMBIOS via -smbios file=binary, so we could write the SMBIOS ourselves and feed it to QEMU, but it would be a dirty workaround to an issue that can be fixed. QEMU has all that’s needed to generate an SMBIOS with the right chassis type, except for a way to let the user override the chassis type, which QEMU defaults to 0x01, meaning “Other”.

We could imagine adding a chassis=… parameter to -smbios type=3, making QEMU able to fake device types, and helping us run mobile tests.

Clearing The Way

Adding the dpi and chassis parameters to QEMU’s CLI shouldn’t be too hard: the internals are there, and it’s just a matter of exposing these variables. The important part is of course to work with the QEMU project, making sure they are happy with the proposed modifications. If you want to work on that, please let me know! And if you want to contribute to GNOME Mobile’s automated test suite, feel free to do so on the related issue on GNOME’s GitLab instance.

Thanks again to Codethink for sponsoring that work.

October 25, 2024

2024-10-25 Friday

  • Call with Dave, partner call, lunch with E. and J. Chased people's annual reviews. Sync with Sven & Caolan.
  • Impressed to read about the expulsion of Russian maintainers from the Linux Kernel by gregkh, with rationale from James. Raising awareness of the origin of critical software used in your supply-chain is important, even if FLOSS - sad as it may be for various blameless individuals.
  • Tea with J. and E. - who left to the StAG Travs weekend. Poke at some code.

#171 Point of Interest

Update on what happened across the GNOME project in the week from October 18 to October 25.

GNOME Core Apps and Libraries

Maps

Maps gives you quick access to maps all across the world.

mlundblad says

Maps now has a redesigned UI for editing points of interest in OpenStreetMap, using libadwaita widgets and AdwDialog.

GLib

The low-level core library that forms the basis for projects such as GTK and GNOME.

Philip Withnall announces

Jialu Zhou has renamed the methods in GUnixMountEntry in GLib so they can be introspected properly (previously they couldn’t be used easily from introspected languages); see https://gitlab.gnome.org/GNOME/glib/-/merge_requests/4337

GNOME Incubating Apps

Pablo Correa Gomez announces

Qiu Wenbo implemented using colors for highlight annotations in Papers! This was an often requested feature made possible due to the new mockups, and lots of refactoring under-the-hood

Pablo Correa Gomez reports

@omthorat implemented the new annotation window mockups in Papers, giving annotated documents a great new appearance! You can get the latest development snapshot in gnome-nightly

GNOME Circle Apps and Libraries

Warp

Fast and secure file transfer.

Fina says

Warp 0.8 was released today. Warp allows you to securely send files to each other via the internet or local network by exchanging a word-based code, or scanning a QR code.

This release features quite a few stability improvements for QR code scanning, and lots of translation updates! The flatpak release was updated to the latest runtime and now supports accent colors.

Biblioteca

Read GNOME documentation offline

Akshay Warrier reports

Biblioteca 1.5 is now available on Flathub!

The notable changes in this release are:

  • Updated Flatpak runtime to GNOME 47
  • Added gom documentation
  • Updated various library docs (libportal 0.8.1, vte 0.78, libshumate 1.3.0, libspelling 0.4.2)
  • Added support to persist window size across sessions
  • Removed irrelevant context menu entries in the webview

Amberol

Plays music, and nothing else.

Emmanuele Bassi announces

Amberol 2024.2 is now available on Flathub! Amberol is a small music player with no delusions of grandeur, focused on the task of playing your local music in the simplest way possible. In this new release you’ll find:

  • support for external album cover images named folder (in both PNG and JPEG image file formats) in the same directory as the songs
  • the file selection dialogs used to add songs and folders will start from the XDG Music directory, if one is set
  • an updated MPRIS implementation, dropping the unmaintained mpris-player crate in favour of the mpris-server one
  • an updated playback implementation, using the GstPlay API instead of the deprecated GstPlayer
  • various style updates to reflect the changes in libadwaita UI elements
  • lots of localisation updates

As usual, you can install Amberol from Flathub.

Third Party Projects

Konstantin Tutsch says

Lock is now available on Flathub! Lock is a graphical front-end for GnuPG (GPG) making use of a beautiful LibAdwaita GUI.

Process text and files:

  • Encryption
  • Decryption
  • Signing
  • Verification

Manage your GnuPG keyring:

  • Generate new keypairs
  • Import keys
  • Export public keys
  • View expiry dates
  • Remove keys

Download on Flathub.

Hari Rana | TheEvilSkeleton says

Upscaler 1.4.0 was just released! Upscaler is an app that allows you to upscale and enhance images, be it your photos, digital art, and more.

This release introduces scaling factors, allowing you to upscale between a factor of 2 and 4. The image loading system has been reworked to decrease overall memory consumption. The preview image now has a drop shadow to make it easier to distinguish. Lastly, it fixes a bug where small window sizes made the preview image disappear.

Miscellaneous

Alice (she/her) announces

I blogged about implementing Steam Deck gamepad support in libmanette: https://blogs.gnome.org/alicem/2024/10/24/steam-deck-hid-and-libmanette-adventures/

GNOME Websites

Allan Day reports

GNOME’s wiki was officially retired this week. Its functions have already been replaced by other sites and, if you do need information from the old wiki, a static archive of the site is still available at wiki.gnome.org.

For more information, see the wiki migration guide.

Events

Kristi Progri announces

We’re excited to announce that registration for GNOME ASIA 2024 is now open! For more details, feel free to check out our blogpost: https://foundation.gnome.org/2024/10/23/registration-now-open-for-gnome-asia-2024/

GNOME Foundation

Allan Day reports

The GNOME Foundation Board is looking for input on the future of GUADEC. Please comment on the Discourse thread with your opinions on where GUADEC should be located, and what its focus should be.

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

October 24, 2024

2024-10-24 Thursday

  • Tech planning call, prep. call with sr. tech. & mgmt team. Mail chew left & right. Call with potential partner.
  • J. out to see B&A for much of the day; struck by the nice WiFi network name: TellMyWifiLoverHer. Somehow managed to recover my thunderbird calendar tab: somehow I close it and don't easily find it again - each time a search; perhaps if the left-hand-side-bar buttons were to the right of the tabs there would be less wasted h-space, and more find-ability.
  • Cyrille & Alex over for home-group, short study on Revelation 10.

October 23, 2024

Steam Deck, HID, and libmanette adventures

Screenshot of gamepad preferences in Highscore, showing Steam Deck gamepad

Recently, I got a Steam Deck OLED. Obviously, one of the main reasons for that is to run a certain yet to be announced here emulation app on it, so I installed Bazzite instead of SteamOS, cleaned up the preinstalled junk and got a clean desktop along with the Steam session/gaming mode.

For the most part, it just works (in desktop mode, at least), but there was one problematic area: input.

Gamepad input

Gamepads in general are difficult. While you can write generic evdev code dealing with, say, keyboard input and be reasonably sure it will work with at least the majority of keyboards, that’s not the case for gamepads. Buttons will use random input codes. Gamepads will assign different input types to the same control (for example, a D-pad can be presented as 4 buttons, 2 hat axes or 2 absolute axes). The Linux kernel includes specialized HID drivers for some gamepads which will work reasonably well out of the box, but in general all bets are off.

Projects like SDL have gamepad mapping databases – normalizing input for all gamepads into a standardized list of inputs.
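To give an idea of what such normalization involves, here is a hedged Rust sketch that folds the three D-pad representations mentioned above into one set of four booleans. It is not how SDL or libmanette do it; the types are invented for the example, and the sign convention (negative means up or left) and the quarter-range threshold are assumptions.

```rust
/// Normalized D-pad state, whatever the device originally reported.
#[derive(Debug, Default, PartialEq)]
struct Dpad {
    up: bool,
    down: bool,
    left: bool,
    right: bool,
}

/// The three ways a D-pad may show up, as described above.
enum DpadInput {
    Buttons { up: bool, down: bool, left: bool, right: bool },
    /// Hat axes report -1, 0 or 1 (assumed: negative is up/left).
    Hat { x: i32, y: i32 },
    /// Absolute axes report a position within [min, max].
    Absolute { x: i32, y: i32, min: i32, max: i32 },
}

fn normalize(input: DpadInput) -> Dpad {
    match input {
        DpadInput::Buttons { up, down, left, right } => Dpad { up, down, left, right },
        DpadInput::Hat { x, y } => Dpad { up: y < 0, down: y > 0, left: x < 0, right: x > 0 },
        DpadInput::Absolute { x, y, min, max } => {
            // Treat the axis as pressed once it leaves the middle half of its range.
            let center = (min + max) / 2;
            let threshold = (max - min) / 4;
            Dpad {
                up: y < center - threshold,
                down: y > center + threshold,
                left: x < center - threshold,
                right: x > center + threshold,
            }
        }
    }
}

fn main() {
    println!("{:?}", normalize(DpadInput::Hat { x: -1, y: 0 }));
    println!("{:?}", normalize(DpadInput::Absolute { x: 255, y: 0, min: 0, max: 255 }));
}
```

A mapping database then only has to say which representation a given gamepad uses; the app consumes the normalized state.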

However, even that doesn’t guarantee they will work. Gamepads will pretend to be other gamepads (for example, it’s very common to emulate an Xbox gamepad) and will use incorrect mapping as a result. Some gamepads will even use identical IDs and provide physically different sets of buttons, meaning there’s no way to map both at the same time.

As such, apps have to expect that a gamepad may or may not work correctly and that the user may or may not need to remap their gamepad.

Steam controllers

Both the standalone Steam Controller and Steam Deck’s internal gamepad pose a unique challenge: in addition to being gamepads with every problem mentioned above, they also emulate keyboard and pointer input. To make things more complicated, Steam has a built-in userspace HID driver for these controllers, with subtly different behavior between it and the Linux kernel driver. SteamOS and Bazzite both autostart Steam in background in desktop mode.

If one tries to use evdev in a generic way, same as for other gamepads, the results will not be pretty:

In desktop mode Steam emulates a virtual XInput (Xbox) gamepad. This gamepad works fine, except it lacks access to Steam and QAM buttons, as well as the 4 back buttons (L4, L5, R4, R5). This works perfectly fine for most games, but fails for emulators where in addition to the in-game controls you need a button to exit the game/open menu.

It also provides 2 action sets: Desktop and Gamepad. In desktop action set none of the gamepad buttons will even act like gamepad buttons, and instead will emulate keyboard and mouse. D-pad will act as arrow keys, A button will be Enter, B button will be Esc and so on. This is called “lizard mode” for some reason, and on Steam Deck is toggled by holding the Menu (Start) button. Once you switch to gamepad action set, gamepad buttons will act as a gamepad, with the caveat mentioned above.

Gamepad action set also makes the left touchpad behave differently: instead of scrolling and performing a middle click on press, it does a right click on press while moving finger on it does nothing.

hid-steam

Linux kernel includes a driver for these controllers, called hid-steam, so you don’t have to be running Steam for it to work. While it does most of the same things Steam’s userspace driver does, it’s not identical.

Lizard mode is similar. The only differences are that haptic feedback on the right touchpad stops right after lifting the finger instead of after the cursor stops, and that the left touchpad scrolls with a different speed and does nothing on press.

The gamepad device is different tho – it’s now called “Steam Deck” instead of “Microsoft X-Box 360 pad 0” and this time every button is available, in addition to touchpads – presented as a hat and a button each (tho there’s no feedback when pressing).

The catch? It disables touchpads’ pointer input.

The driver was based on Steam Deck HID code from SDL, and in SDL it made sense – it’s made for (usually fullscreen) games, if you’re playing it with a gamepad, you don’t need a pointer anyway. It makes less sense in emulators or otherwise desktop apps tho. It would be really nice if we could have gamepad input AND touchpads. Ideally automatically, without needing to toggle modes manually.

libmanette

libmanette is the GNOME gamepad library, originally split from gnome-games. It’s very simple and basically acts as a wrapper around evdev and SDL mappings database, and has API for mapping gamepads from apps.

So, I decided to add support for Steam Deck properly. This essentially means writing our own HID driver.

Steam udev rules

First, hidraw access is currently blocked by default and you need a udev rule to allow it. This is what the well known Steam udev rules do for Valve devices as well as a bunch of other well known gamepads.

There are a few interesting developments in kernel, logind and xdg-desktop-portal, so we may have easier access to these devices in future, but for now we need udev rules. That said, it’s pretty safe to assume that if you have a Steam Controller or Steam Deck, you already have those rules installed.

Writing a HID driver

Finally, we get to the main part of the article; everything before this was introduction.

We need to do a few things:

1. Disable lizard mode on startup
2. Keep disabling it every now and then, so that it doesn’t get reenabled (this is unfortunately necessary and SDL does the same thing)
3. Handle input ourselves
4. Handle rumble

Both SDL and hid-steam will be excellent references for most of this, and we’ll be referring to them a lot.

For the actual HID calls, we’ll be using hidapi.

Before that, we need to find the device itself. Raw HID devices are exposed differently from evdev ones, as /dev/hidraw* instead of /dev/input/event*, so first libmanette needs to search for those (either using gudev, or monitoring /dev when in flatpak).

Since we’re doing this for a very specific gamepad, we don’t need to worry about filtering out other input devices – this is an allowlist, so we just don’t include those. So we just match by vendor ID and product ID. Steam Deck is 28DE:1205 (at least OLED, but as far as I can tell the PID is the same for LCD).
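The allowlist match can be sketched roughly as follows. This is Python for illustration only, not libmanette's actual C code. It assumes the IDs are read from the `HID_ID=bus:vendor:product` line that sysfs typically exposes in a HID device's `uevent` file, and the helper names are hypothetical.

```python
# Illustrative sketch: match a HID device against an allowlist of VID:PID pairs.
# Assumption: the uevent text contains a line like "HID_ID=0003:000028DE:00001205".

STEAM_DECK_IDS = {(0x28DE, 0x1205)}  # VID:PID quoted in the post


def parse_hid_id(uevent_text):
    """Return (vendor, product) parsed from a HID uevent blob, or None."""
    for line in uevent_text.splitlines():
        key, _, value = line.partition("=")
        if key != "HID_ID":
            continue
        parts = value.split(":")
        if len(parts) != 3:
            return None
        try:
            _bus, vendor, product = (int(p, 16) for p in parts)
        except ValueError:
            return None
        return vendor, product
    return None


def is_steam_deck(uevent_text):
    """True only for devices on the allowlist; everything else is ignored."""
    return parse_hid_id(uevent_text) in STEAM_DECK_IDS
```

Because this is an allowlist, a device with a missing or unparsable ID is simply not matched, which is the safe default here.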

However, there are 3 devices like that: the gamepad itself, but also its emulated mouse and keyboard. Well, sort of. Only hid-steam uses those devices, Steam instead sends them via XTEST. Since that obviously doesn’t work on Wayland, there’s instead a uinput device provided by extest.

SDL code tells us that only the gamepad device can actually receive HID reports, so the right device is the one we can successfully read from.

Disabling lizard mode

Next, we need to disable lizard mode. SDL sends an ID_CLEAR_DIGITAL_MAPPINGS report to disable keyboard/mouse emulation, then changes a few settings: namely, disables touchpads. As mentioned above, hid-steam does the same thing – it was based on this code.

However, we don’t want to disable touchpads here.

What we want to do instead is to send an ID_LOAD_DEFAULT_SETTINGS feature report to reset settings changed by hid-steam, and then only disable scrolling for the left touchpad. We’ll make it right click instead, like Steam does.

This will keep the right touchpad moving pointer, but the previous ID_CLEAR_DIGITAL_MAPPINGS report had disabled touchpad clicking, so we also need to restore it. For that, we need to use the ID_SET_DIGITAL_MAPPINGS report. SDL does not have an existing struct for its payload (likely because of struct padding issues), so I had to figure it out myself. The structure is as follows, after the standard zero byte and the header:

  • 8 bytes: buttons bitmask
  • 1 byte: emulated device type
  • 1 byte: a mouse button for DEVICE_MOUSE, a keyboard key for DEVICE_KEYBOARD, etc. Note that the SDL MouseButtons struct starts from 0 while the IDs Steam Deck accepts start from 1, so MOUSE_BTN_LEFT should be 1, MOUSE_BTN_RIGHT should be 2 and so on.

Then the structure repeats, up to 6 times in the same report.

ID_GET_DIGITAL_MAPPINGS returns the same structure.

So, setting digital mappings for:

  • STEAM_DECK_LBUTTON_LEFT_PAD, DEVICE_MOUSE, MOUSE_BTN_RIGHT
  • STEAM_DECK_LBUTTON_RIGHT_PAD, DEVICE_MOUSE, MOUSE_BTN_LEFT

(with the mouse button enum fixed to start from 1 instead of 0)

reenables clicking. Now we have working touchpads even without Steam running, with the rest of gamepad working as a gamepad, automatically.
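To make the payload layout described above concrete, here is a small Python sketch that packs the repeated 10-byte entries. The numeric values for the button bits and the device type are placeholders, not the real constants from SDL or hid-steam, and little-endian byte order is an assumption. Only the field sizes, the 6-entry limit and the 1-based mouse button IDs come from the text. The leading zero byte and the report header are left out.

```python
import struct

# Placeholder values for illustration only; NOT the real SDL/hid-steam constants.
STEAM_DECK_LBUTTON_LEFT_PAD = 1 << 0   # hypothetical bit
STEAM_DECK_LBUTTON_RIGHT_PAD = 1 << 1  # hypothetical bit
DEVICE_MOUSE = 2                       # hypothetical device type ID

# Per the post, the IDs the Steam Deck accepts start from 1.
MOUSE_BTN_LEFT = 1
MOUSE_BTN_RIGHT = 2

MAX_MAPPINGS = 6  # the structure repeats up to 6 times per report


def pack_digital_mappings(mappings):
    """Pack (button_mask, device_type, target) triples into 10-byte entries.

    Each entry is an 8-byte bitmask, a 1-byte device type and a 1-byte
    target. Byte order is assumed to be little-endian.
    """
    if len(mappings) > MAX_MAPPINGS:
        raise ValueError("at most 6 mappings fit in one report")
    return b"".join(
        struct.pack("<QBB", mask, device, target)
        for mask, device, target in mappings
    )


payload = pack_digital_mappings([
    (STEAM_DECK_LBUTTON_LEFT_PAD, DEVICE_MOUSE, MOUSE_BTN_RIGHT),
    (STEAM_DECK_LBUTTON_RIGHT_PAD, DEVICE_MOUSE, MOUSE_BTN_LEFT),
])
```

Using an explicit `struct` format string sidesteps the padding issue mentioned above, since each entry is exactly 10 bytes with no alignment gaps.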

Keeping it disabled

We also need to periodically do this again to prevent hid-steam from reenabling it. SDL does it every 200 updates, so about every 800 ms (update rate is 4 ms), and the same rate works fine here. Note that SDL doesn’t reset the same settings as initially, but only SETTING_RIGHT_TRACKPAD_MODE. I don’t know why, and doing the same thing did not work for me, so I just use the same code as detailed above instead and it works fine. It does mean that clicks from touchpad presses are ended and immediately restarted every 800 ms, but it doesn’t seem to cause any issues in practice, even with e.g. drag-n-drop.
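The timing works out as 200 updates × 4 ms = 800 ms. A minimal sketch of such a counter-based refresh, with a hypothetical class name and a callback standing in for the real "disable lizard mode" reports, could look like this:

```python
UPDATE_INTERVAL_MS = 4         # poll rate mentioned in the post
REFRESH_EVERY_N_UPDATES = 200  # same cadence SDL uses


class LizardModeKeeper:
    """Calls `disable` once at startup and again every N updates."""

    def __init__(self, disable, every=REFRESH_EVERY_N_UPDATES):
        self._disable = disable
        self._every = every
        self._count = 0
        self._disable()  # initial disable on startup

    def on_update(self):
        """Call once per poll; re-sends the disable sequence when due."""
        self._count += 1
        if self._count >= self._every:
            self._count = 0
            self._disable()


refresh_period_ms = UPDATE_INTERVAL_MS * REFRESH_EVERY_N_UPDATES
```

Counting updates instead of using a separate timer keeps the refresh on the same code path as input polling, which appears to be how SDL structures it.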

Handling gamepad input

This part was straightforward. Every 4 ms we poll the gamepad and receive the entire state in a single struct: buttons as a bitmask, stick coordinates, trigger values, but also touchpad coordinates, touchpad pressure, accelerometer and gyro.

Right now we only expose a subset of buttons, as well as stick coordinates. There are some very interesting values in the button mask though – for example whether sticks are currently being touched, and whether touchpads are currently being touched and/or pressed. We may expose that in future, e.g. having API to disable touchpads like SDL does and instead offer the raw coordinates and pressure. Or do things on touch and/or click. Or send haptic feedback. We’ll see.

libmanette event API is pretty clunky, but it wasn’t very difficult to wrap these values and send them out.

Rumble

For rumble we’re doing the same thing as SDL: sending an ID_TRIGGER_RUMBLE_CMD report. There are a few magic numbers involved, e.g. for the left and right gain values – originated presumably in SDL, copied into hid-steam and now into libmanette as well ^^

Skipping duplicate devices

The evdev device for Steam Deck is still there, as is the virtual gamepad if Steam is running. We want to skip both of them. Thankfully, that’s easily done via checking VID/PID: Steam virtual gamepad is 28DE:11FF, while the evdev device has the same PID as the hidraw one. So, now we only have the HID device.
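As a rough illustration of the filtering above (Python, with a hypothetical tuple representation of devices, not libmanette's real data structures):

```python
STEAM_DECK = (0x28DE, 0x1205)             # shared by the hidraw and evdev devices
STEAM_VIRTUAL_GAMEPAD = (0x28DE, 0x11FF)  # Steam's emulated gamepad


def filter_devices(devices):
    """Drop evdev duplicates of the Steam Deck; keep everything else.

    `devices` is an iterable of (backend, vendor, product) tuples, where
    backend is "evdev" or "hidraw" (a made-up representation for this sketch).
    """
    kept = []
    for backend, vendor, product in devices:
        ids = (vendor, product)
        if backend == "evdev" and ids in (STEAM_DECK, STEAM_VIRTUAL_GAMEPAD):
            continue  # handled by the dedicated HID driver instead
        kept.append((backend, vendor, product))
    return kept
```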

Behavior

So, how does all of this work now?

When Steam is not running, libmanette will automatically switch to gamepad mode, and enable touchpads. Once the app exits, it will revert to how it was before.

When Steam is running, libmanette apps will see exactly the same gamepad instead of the emulated one. However, we cannot disable lizard mode automatically in this state, so you’ll have to hold Menu button, or you’ll get input from both the gamepad and keyboard. Since Steam doesn’t disable touchpads in gamepad mode, they will still work as expected, so the only caveat is needing to hold Menu button.

So, it’s not perfect, but it’s a big improvement from how it was before.

Mappings

Now that libmanette has bespoke code specifically for Steam Deck, there are a few more questions. This gamepad doesn’t use mappings, and apps can safely assume it has all the advertised controls and nothing else. They can also know exactly what it looks like. So, libmanette now has ManetteDeviceType enum, currently with 2 values: MANETTE_DEVICE_GENERIC for evdev devices, and MANETTE_DEVICE_STEAM_DECK, for Steam Deck. In future we’ll likely have more dedicated HID drivers and as such more device types. For now though, that’s it.


The code is here, though it’s not merged yet.

Big thanks to people who wrote SDL and the hid-steam driver – I would definitely not be able to do this without being able to reference them. ^^

wireless_status kernel sysfs API

(I worked on this feature last year, before being moved off desktop related projects, but I never saw it documented anywhere other than in the original commit messages, so here's the opportunity to shine a little light on a feature that could probably see more use)

    The new usb_set_wireless_status() driver API function can be used by drivers of USB devices to export whether the wireless device associated with that USB dongle is turned on or not.

    To quote the commit message:

This will be used by user-space OS components to determine whether the
battery-powered part of the device is wirelessly connected or not,
allowing, for example:
- upower to hide the battery for devices where the device is turned off
  but the receiver plugged in, rather than showing 0%, or other values
  that could be confusing to users
- Pipewire to hide a headset from the list of possible inputs or outputs
  or route audio appropriately if the headset is suddenly turned off, or
  turned on
- libinput to determine whether a keyboard or mouse is present when its
  receiver is plugged in.
This is not an attribute that is meant to replace protocol specific
APIs [...] but solely for wireless devices with
an ad-hoc “lose it and your device is e-waste” receiver dongle.
 

    Currently, the only 2 drivers to use this are the ones for the Logitech G935 headset and the Steelseries Arctis 1 headset. Adding support for other Logitech headsets would be possible if they export battery information (the protocols are usually well documented), and support for more Steelseries headsets should be feasible if the protocol has already been reverse-engineered.

    As for consumers of this sysfs attribute, I filed a bug against Pipewire (link) to use it to treat the receiver dongle as good as unplugged if the headset is turned off, which would avoid audio being sent to headsets that won't hear it.

    UPower supports this feature since version 1.90.1 (although it had a bug that makes 1.90.2 the first viable release to include it), and batteries will appear and disappear when the device is turned on/off.
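    For user-space consumers, reading the attribute could look roughly like the sketch below. It assumes the attribute is a file named `wireless_status` inside the USB interface's sysfs directory, that it contains `connected` or `disconnected`, and that it is absent for devices without such a receiver. Please check the kernel's sysfs ABI documentation rather than relying on this. The function name is hypothetical.

```python
from pathlib import Path


def read_wireless_status(interface_dir):
    """Return True if connected, False if disconnected, None otherwise.

    `interface_dir` is the sysfs directory of a USB interface. None covers
    both a missing attribute and an unrecognized value.
    """
    attr = Path(interface_dir) / "wireless_status"
    try:
        value = attr.read_text().strip()
    except OSError:
        return None
    return {"connected": True, "disconnected": False}.get(value)
```

    A consumer such as a battery indicator could then hide the battery when this returns False, and show it as usual when it returns True or None.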

A turned-on headset

Registration Now Open for GNOME Asia 2024

Registration for GNOME Asia 2024 is now open! This year’s summit will be held from December 6-8, 2024, in the dynamic city of Bangalore, India, with both in-person and remote participation options.

GNOME Asia 2024 will feature a fantastic lineup of presentations and workshops centered around the latest innovations in the GNOME ecosystem and its community. Whether you’re attending on-site in Bangalore or joining online from anywhere in the world, there’s something for everyone.

The full conference schedule, including session and speaker details, will soon be available on the event website.

Registration is open to everyone—whether you’re an experienced developer, new to the open-source world, or simply curious about what’s happening in GNOME. We look forward to welcoming you, both in person and online, from December 6-8!

Become a GNOME Asia 2024 Sponsor!

We’re still looking for sponsors for this year’s summit. If you or your company are interested in sponsoring GNOME Asia 2024, please find more details and our sponsorship brochure on the event website or reach out to asia@gnome.org.

October 22, 2024

Why bootc doesn’t require “/usr merge”

The systemd docs talk about UsrMerge, and while bootc works nicely with this, it does not require it and never will. In this blog we’ll touch on the rationale for that a bit.

The first stumbling block is pretty simple: For many people shipping “/usr merge” systems, a lot of backwards compatibility symlinks are required, like /bin → /usr/bin etc. Those symbolic links are pretty load bearing, and we really want them to also not just be sitting there as random mutable state.
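To illustrate how much rides on those links, here is a small Python sketch that checks whether a root tree has the expected compatibility symlinks. Only /bin → /usr/bin is taken from the text above; the other entries in the table are assumptions about a typical “/usr merge” layout, and the function name is made up.

```python
import os

# Only "bin" comes from the post; the rest are assumed typical entries.
COMPAT_LINKS = {
    "bin": "usr/bin",
    "sbin": "usr/sbin",
    "lib": "usr/lib",
    "lib64": "usr/lib64",
}


def missing_compat_links(root, links=COMPAT_LINKS):
    """Return names under `root` that are not symlinks to the expected target."""
    missing = []
    for name, target in links.items():
        path = os.path.join(root, name)
        # Accept both relative ("usr/bin") and absolute ("/usr/bin") targets.
        if not os.path.islink(path) or os.readlink(path).lstrip("/") != target:
            missing.append(name)
    return missing
```

If the root filesystem is mutable, nothing stops one of these links from being removed or replaced at runtime, which is the concern raised here.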

This problem domain really scope creeps into “how does / (aka the root filesystem) work?”

There are multiple valid models; one that is viable for many use cases is where it’s ephemeral (i.e. a tmpfs) as encouraged by things like systemd-volatile-root. One thing I don’t like about that is that / is just sitting there mutable, given how important those symlinks are. It clashes a bit with things like wanting to ensure all read files are only from verity-protected paths and things like that. These things are closer to quibbles though, and I’m sure some folks are successfully shipping systems where they don’t have those compatibility symlinks at all.

The bigger problem though is all the things that never did “/usr move”, such as /opt. And for many things in there we actually really do want it to be read-only at runtime (and more generally, versioned with the operating system content).

Finally, /opt is just a symptom of a much larger issue that there’s no “/usr merge” requirement for building application containers (docker/podman/kube style) and a toplevel, explicit goal of bootc is to be compatible with that world.

It’s for these reasons that while historically the ostree project encouraged “/usr merge”, it never required it and in fact the default / is versioned with the operating system – defining /etc and /var as the places to put persistent machine local state.

The way bootc works by default is to continue that tradition, but as of recently we default to composefs which provides a strong and consistent story for immutability for everything under / (including /usr and /opt and arbitrary toplevels). There’s more about this in our filesystem docs.

In conclusion I think what we’re doing in bootc is basically more practical, and I hope it will make it easier for people to adopt image-based systems!

October 18, 2024

Shortwave 4.0

It was long overdue, but better late than never! Shortwave 4.0 is now available on Flathub:

Get it on Flathub

General

  • New MPRIS media controls implementation with improved CPU usage
  • Song notifications are disabled by default now
  • No more loading on startup, stations now get directly retrieved from cached data
  • Fixed issue which sometimes prevented loading more than 8 stations from library
  • Refreshed user interface by making use of new Libadwaita widgets
  • Large parts of the app were reworked, providing a solid foundation for the next upcoming features

Playback

  • Last station now gets restored on app launch
  • Redesigned player sidebar, allowing to control volume more easily
  • New recording indicator showing whether the current playback is being recorded
  • Fixed buffering issue which prevented playing new stations, especially after switching stations too fast
  • Fixed issues which sometimes prevented that a song gets recorded
  • Fixed issue that volume remains muted after unmuting

Station Covers

  • More supported image file format for station covers
  • Enhanced security by loading station covers using sandboxed Glycin image library
  • Non square covers automatically get a blurred background
  • New generated fallback for stations without any cover image
  • Improved disk usage by automatically purging no longer needed cached data

Browse / Search

  • More useful station suggestions by respecting configured system language / region
  • Suggestions now get updated with every start, no longer always showing the same stations
  • More accessible search feature, no longer hidden in a subpage
  • Search results are no longer limited at 250 stations
  • Faster and more efficient search by using new grid widgets

Chromecast

  • Shortwave is now a registered Google Cast app, no longer relying on the generic media player
  • New backend which greatly improves communication stability with cast devices
  • Improved discovery of cast devices with lower CPU and memory usage
  • Now possible to change the volume of a connected cast device

Enjoy!

October 17, 2024

GNOME Infrastructure migration to AWS

1. Some historical background

The GNOME Infrastructure has been hosted in one of Red Hat’s datacenters for over 15 years now. The “community cage”, which is how we usually refer to the hosting platform that backs multiple Open Source projects including OSCI, is made of a set of racks living within the RAL3 datacenter (located in Raleigh). Red Hat has not only been contributing to GNOME by keeping Red Hat’s Desktop Team operational and sponsoring events (such as GUADEC), but has also been supporting the project with hosting, internet connectivity, machines, and subscriptions for RHEL and many other Red Hat products. When the infrastructure was originally stood up, it was primarily composed of a set of bare metal machines; workloads were not yet virtualized at the time and many services were running directly on top of the physical nodes. The advent of virtual machines and later containers reshaped how we managed and operated every component. What remained the same over time, however, was the networking layout of these services: a single L2 domain and a public internet L3 domain shared with other tenants (with both IPv4 and IPv6).

Recent challenges

When GNOME’s Openshift 4 environment was built back in 2020 we had to make specific calls:

  1. We’d run an Openshift Hyperconverged setup (with storage (Ceph), control plane and workloads running on top of the same subset of nodes)
  2. The total amount of nodes we received budget for was 3, this meant running with masters.schedulable=true
  3. We’d have kept using our former Ceph cluster (as it had slower disks, a good combination for certain workloads we run), this is however not supported by ODF (Openshift Data Foundation) and would have required some glue to make it completely functional
  4. Migrating GNOME’s private L2 network to L3 would have required an effort from Red Hat’s IT Network Team who generally contributes outside of their working hours, no changes were planned in this regard
  5. No changes were planned on the networking equipment side to make links redundant, that means a code upgrade on switches would have required a full services downtime

Over time and with GNOME’s users and contributors base growing (46k users registered in GitLab, 7.44B requests and 50T of traffic per month on services we host on Openshift and kindly served by Fastly’s load balancers) we started noticing some of our original architecture decisions weren’t positively contributing to platform’s availability, specifically:

  1. Every time an Openshift upgrade was applied, it resulted in cluster downtime due to the unsupported double ODF cluster layout (one internal and one external to the cluster). The symptom was stuck block devices preventing the machines from rebooting, with associated high IO (and general SELinux labeling mismatches). With the same nodes also hosting OCP’s control plane, this resulted in the API and other OCP components becoming unavailable
  2. With no L3 network, we had to create a next-hop on our own to effectively give internet access through NAT to machines without a public internet IP address, this was resulting in connectivity outages whenever the target VM would go down for a quick maintenance

Migration to AWS

With budget season for FY25 approaching, we struggled to find the necessary funds to finally optimize and fill the gaps of our previous architecture. With this in mind we reached out to the AWS Open Source Program and received a substantial amount, enough for us to fully transition GNOME’s Infrastructure to the public cloud.

What we achieved so far:

  1. Deployed and configured VPC related resources, this step will help us resolve the need to have a next-hop device we have to maintain
  2. Deployed an Openshift 4.17 cluster (which uses a combination of network and classic load balancers, x86 control plane and arm64 workers)
  3. Deployed IDM nodes that are using a Wireguard tunnel between AWS and RAL3 to remain in sync
  4. Migrated several applications including SSO, Discourse, Hedgedoc

What’s upcoming:

  1. Migrate away from Splunk and use a combination of rsyslog/promtail/loki
  2. Keep migrating further applications, the idea is to fully decommission the former cluster and GNOME’s presence within Red Hat’s community cage during Q1FY25
  3. Introduce a replacement for master.gnome.org and GNOME tarballs installation
  4. Migrate applications to GNOME’s SSO
  5. Retire services such as GNOME’s wiki (MoinMoin, a static copy will instead be made available), NSD (authoritative DNS servers were outsourced and replaced with ClouDNS and GitHub’s pipelines for DNS RRs updates), Nagios, Prometheus Blackbox (replaced by ClouDNS endpoints monitoring service), Ceph (replaced by EBS, EFS, S3)
  6. Migrate smtp.gnome.org to OSCI in order to maintain current public IP’s reputation

And benefits of running GNOME’s services in AWS:

  1. Scalability, we can easily scale up our worker nodes pool
  2. We run our services on top of AWS SDN and can easily create networks, routing tables, benefit from faster connectivity options, redundant networking infrastructure
  3. Use EBS/EFS, don’t have to maintain a self-managed Ceph cluster, easily scale volumes IOPS
  4. Use a local to-the-VPC load balancer, less latency for traffic to flow between the frontend and our VPC
  5. Have access to AWS services such as AWS Shield for advanced DDOS protection (with one bringing down GNOME’s GitLab just a week ago)

I’d like to thank AWS (Tom “spot” Callaway, Mila Zhou) for their sponsorship and the massive opportunity they are giving GNOME’s Infrastructure to improve and provide resilient, stable and highly available workloads to GNOME’s users and contributors base. And a big thank you to Red Hat for the continued sponsorship over more than 15 years, keeping GNOME’s Infrastructure running smoothly and efficiently. It’s crucial for me to emphasize how critical Red Hat’s long term support has been.

October 16, 2024

Status update, 16/10/2024

I’ve participated in two internships this year, and interns — who are usually busy full-time students — often ask “How do you get time to contribute to open source?”.

And the truth is that there’s no secret formula. It’s tricky to get paid to work on something that you give away for free, isn’t it? Mostly I contribute to open source in free time, either after work hours, or occasionally during periods of downtime.

To my complete surprise I managed to buy a house this year and so I suddenly don’t have any time after work. During the day most of my time is spent on proprietary customer-specific work, and after work I go to look at the house and try to figure out where to start with the whole thing. (By the way, does anyone around Santiago need a load of 1980s-style furniture made from chipboard?).

I’ll still be participating in GNOME around desktop search and the openQA tests, answering questions and triaging bug reports, but I won’t be driving any new stuff forwards.

Anyway, why is it interesting to blog about things I’m not doing?

I read this quote in LWN the other day:

Make it easy to quit – Actively celebrate people who step back from maintainer positions. Celebrate what they accomplished and what they are moving on to. Don’t punish or otherwise shame quitting. This also incentivizes other people to step up, knowing that they don’t necessarily have to do it forever.

Rich Bowen, “Open Source Summit Vienna 2024”

At least in GNOME, we often don’t do this. We don’t celebrate what people *have achieved*, with I think one exception (the legendary “Pants of Thanks” ceremony).

We should do better at this. It’s not that we don’t appreciate each other’s work. But mostly we require the person doing the work to also be the one shouting loudly about it, before we notice. Is there a better way?

Another thing we don’t do, by the way, is celebrate corporate participation. The great exception to this is the STF grant, and everyone involved in that did an excellent job of highlighting work which the STF grant enabled. We’re less good at crediting all the work that happens thanks to paid engineers from Red Hat, Endless, Canonical, SUSE, and so on.

Another quote from this article:

Each generation of a project (ie open source but not only open source) is responsible for mentoring the next generation. When you mentor someone, spend time emphasizing that it’s their job to mentor the next person, otherwise they will assume that it’s your job. A failure to communicate this will result in the eventual attrition and death of the community.

Rich Bowen, “Open Source Summit Vienna 2024”

I quite like giving conference talks and I’ve been wondering what I could speak about, if I’m not driving any new development myself.

We now have 25 years of history in GNOME and it would be nice to give some talks about “How $thing works.” Desktop search comes to mind here, of course. I also learned (against my will) a lot about initial-setup this year. So I might propose some talks along these lines. It also seems like a nice way to look back at work that’s been done over the years, and give credit to the people who have worked on these things over time, doing stuff that’s often invisible.

On that topic, I want to highlight the excellent work done over the summer by our two GSoC interns Divyansh Jain and Rachel Tam, adding a web-based IDE to TinySPARQL that can run queries against the GNOME search database. You can read more about that both on Rachel’s blog and on Demigod’s blog. The idea behind this was making it easier to visualize how the LocalSearch index actually works, what is stored there, and what you can do with it. Hopefully this can lead into some interesting talks about search!

If you like this post, please leave a comment! You can use the form below, or reply on the Fediverse to @samthursfield.wordpress.com@samthursfield.wordpress.com. I’m also on LinkedIn.

October 15, 2024

Fedora at LinuxDays 2024

Last weekend I went to Prague to represent Fedora at LinuxDays 2024. It’s the biggest Linux event in the country with more than a thousand attendees and the Fedora booth is busy there every year.

Like last year the Fedora booth was colocated with the Red Hat booth. It made sense not only because there is a relationship between the two, but also for very practical reasons: I was the only person representing and staffing the Fedora booth and I appreciated help from my colleagues who watched over the Fedora booth when I took a break to have a meal or give a talk.

The biggest magnet at our booth was again a MacBook running Fedora Asahi Remix. I gave a talk about it which was only 20 minutes long and was intended as a teaser: here is an overview of the project and if you’d like to know and see more, come to our booth.

Fortunately, just two days before the conference, the Asahi Linux project announced support for Steam via the Fex/muvm emulation, so I could utilize a large library of games I have a license for on Steam. During the talk someone asked if it could run the Factorio game and it could, indeed.

We also had a Fedora conference box which includes a Fedora Slimbook laptop. It was a nice contrast to the MacBook because Slimbook focuses on Linux whereas Apple doesn’t care about Linux at all.

The booth was so busy that it took me 2 hours to finish a post about our presence, because I couldn’t find even a few minutes to work on it.

I also did a bit of user support. An older gentleman approached our booth stating that he had traveled 100km to get help. He had a dual boot of Fedora and Ubuntu and an Ubuntu update had broken the bootloader. Regenerating the GRUB configuration resolved the issue.

Pavel Píša, a doctor from Czech University of Technology, invited me to their booth to check out Fedora Linux running on a Milk-V box with a RISC-V CPU. I left a flyer regarding an open Fedora QA position for RISC-V because Red Hat is currently looking for someone to test Fedora Linux on RISC-V.

Me with the RISC-V box. Original post.

Overall, the conference was a great experience, albeit tiring. I hope to attend next year again.

October 11, 2024

debuginfod-enabled Sysprof

Based on some initial work by Barnabás Pőcze, Sysprof gained support for symbolizing stack traces using debuginfod.

If you don’t want to install debuginfo packages for your entire system but still want really useful function names, this is for you. The system-configured debuginfod servers will provide you access to those debuginfo-enabled ELF binaries so that we can discover the appropriate symbol names.

Much like in gdb, you can cancel the downloads if you don’t care, and those symbols will fall back to the “In File lib.so+offset” you’re used to.
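For readers curious what a lookup involves, here is a rough Python sketch. It is not Sysprof's code. It assumes the servers come from the space-separated `DEBUGINFOD_URLS` environment variable and that debuginfo is requested at the `/buildid/<BUILDID>/debuginfo` path of the debuginfod web API. The function names and the exact format of the fallback offset are made up for illustration.

```python
import os


def debuginfod_urls(build_id, env=None):
    """Build candidate debuginfo URLs for a build ID, one per configured server."""
    env = os.environ if env is None else env
    servers = env.get("DEBUGINFOD_URLS", "").split()
    return [f"{s.rstrip('/')}/buildid/{build_id}/debuginfo" for s in servers]


def describe_symbol(name, library, offset):
    """Use the resolved name if there is one, else fall back to file+offset."""
    if name is not None:
        return name
    return f"In File {library}+{offset:#x}"
```

With no servers configured, or a cancelled download, `name` stays None and the caller ends up with the familiar file+offset label.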

A screenshot showing a popover of symbols being downloaded.

I expect a bit more UI work around this before GNOME 48 like preferences to disable it or configure alternate debuginfod servers.

Happy profiling!

Making it easy to generate fwupd device emulation data

We’re trying to increase the fwupd coverage score, so we can mercilessly refactor and improve code upstream without risks of regressions. To do this we run thousands of unit tests for each part of the libfwupd public API and libfwupdplugin private API. This gets us a long way, but what we really want to do is emulate the end-to-end firmware update of every real device we support.

It’s not trivial (or quick) connecting hundreds of devices to a specific CI machine, and so for some time we’ve supported recording USB device enumeration, re-plug, firmware write, re-plug and re-enumeration. For fwupd 2.0.0 we added support for all sysfs-based devices too, which allows us to emulate a real world NVMe disk doing actual ioctls() and reads() in every submitted CI job. We’re now going to ask vendors to record emulations of the firmware update for existing plugins so we can run those in CI too.

The device emulation docs are complicated and there are lots of things that the user can do wrong. What I really wanted was a “click, click, save-as, click” user experience that doesn’t need to use the command line. The tl;dr is that we’ve now added the needed async API in fwupd 2.0.1 (probably going to be released on Monday) and added the click, click UI to gnome-firmware.

There’s a slight niggle when the user starts recording the first “internal” device (e.g. an NVMe disk): we need to ask the user to restart the daemon or the computer. This is because we can’t just hotplug the internal non-removable device, and need to “start recording” then “enumerate device(s)” rather than the other way around. Recording all the device enumeration isn’t free in CPU or RAM (and is possibly a security problem too), and so we don’t turn it on by default. All the emulation is also controlled using polkit now, so you need the root password to do anything remotely interesting.

Some of the strings are a bit unhelpful, and some a bit clunky, so if you see anything that doesn’t look awesome or is hard to translate please tell us and we can fix it up. Of course, even better would be a merge request with a better string.

If you want to try it out there’s a COPR with all the right bits for Fedora 41. It might also work on Fedora 40 if you remove gnome-software. I’ll probably switch the Flathub build to 48.alpha when fwupd 2.0.1 is released too. Feedback welcome.

October 10, 2024

2024-2025 budget and economic review

Dear community members,

As promised in the previous communication, the Board would like to share some more details on our current financial situation and the budget for our 2024-2025 financial year, which runs from 1st October 2024 to 30th September 2025.

Background

  • The Foundation needs an approved budget in place because our spending policies use the budget to authorise what staff and committees are allowed to spend money on. This year we passed the budget on time for the start of the financial year, which was thanks to a lot of detailed and particularly challenging work by Richard, which the board is grateful for.
  • We consider the budget in 2 distinct parts:
    • Budget for our fiscally-sponsored projects. We consider their income, but not their expenses. The reason for that is that the Foundation takes a small part of the income as the fiscal sponsorship fee, supporting our administrative and operating costs. Funds received on behalf of other projects are tracked separately, called “reserved funds”, and the Foundation cannot spend money that belongs to the other projects.
    • General operating budget for the GNOME Foundation, which is what this post is all about! At any later point, when talking about the budget, we’re talking about the general/unrestricted operating funds and it is safe to assume that income for fiscally-sponsored projects is not included.
  • The budget for the previous 2023-2024 fiscal year was presented to the board as a roughly balanced break-even budget, anticipating $1.201M of revenue and $1.195M of expenses. The board considered two fundraising scenarios proposed by our previous ED, with the most ambitious scenario planning to raise an additional $2M for the Foundation, and one more conservative which anticipated an additional $475k of revenue from various sources (donations, grants, event sponsorship). This more conservative scenario was included in the budget, but in practice things did not work out as planned. This additional funding was not raised, meaning that in practice the Foundation once again ran at a deficit over the past year and used funds from our reserves.
  • The new 2024-2025 budget considers a total income of $586k, and total expense of $550k. Two things are clearly different from last year: the expenses have been greatly reduced, and we have aimed for a surplus instead of the deficit we ended up with last year. Both things were a consequence of the budget from the previous year not being executed as expected. Since our reserve policy requires us to retain enough money to sustain core operations without income for another year (specifically, 1.1 times core spending), we’ve had to reduce expenses to save money and restore our reserves.

So, let’s dig into the details:

Income

  • $205,100 in donations. This number is based on previous years’ income from individual contributions ($75,000), Advisory Board fees ($105,800), and other small contributions ($7,800) like matching donations (where companies double what employees donate). It also includes $16,500 currently pending from Wau Holland Stiftung, an organization we had a historic agreement with to collect tax-deductible funds from European donors. We believe that there is a great potential for the GNOME Foundation to increase the amount of individual contributions received, and this has been included in the Strategic Plan and many board discussions. Unfortunately, without a permanent Executive Director, we cannot guarantee that we will be able to establish a program to do so in the short-term, so we have decided to budget conservatively to ensure economic sustainability.
  • $64,500 from event sponsorship. Most of that money comes from GUADEC ($61,000), with some from LAS and GNOME Asia, which is one of the main reasons why we are able to maintain our events: because they are sponsored separately, they are mostly self-sustaining.
  • $65,500 in fiscal sponsorship fees. This is based on a % fee the GNOME Foundation takes for our operational costs from hosting GIMP and Black Python Devs. This number is uncommonly high because we have been working with GIMP on financial and legal arrangements to receive approx $1M of historical Bitcoin donations. (And sell them immediately – holding Bitcoin assets creates a regulatory/reporting problem for US nonprofits and our accountants have advised us against it.)
  • $1,000 in interest from money in the bank account. This is budgeted higher than previous years, as work is already in progress to change bank accounts to increase this income, as recommended by our auditors.
  • $500 profit from selling T-shirts and other goods ($2,500 income, $2,000 in expenses).
  • $250,000 from the 2nd year of an Endless grant that was approved last year. This grant provides $50,000 for general funds that the Foundation can use at its discretion, and $200,000 that need to be spent on specific tasks. Currently, those are assigned to Flathub, Parental Controls, GNOME Software maintenance, and internships. Some of those will be detailed in the expense section.

Expenditures

  • $10,000 interim ED salary. This is to be able to pay Richard to continue managing the Foundation and staff team until 10th December.
  • $100,000 for development contractors for work associated with the Endless grant. This work includes improvements in Parental Controls and GNOME Software, and is being executed by Philip Withnall (development), Sam Hewitt (design) and potentially one more developer over the coming year. Philip gave an update on the work in his presentation at GUADEC.
  • $110,600 in contractor costs for program staff, including events and infrastructure. This covers Kristi’s work which is the backbone of events such as GUADEC, LAS and GNOME.Asia, and Bart’s work running GNOME and Flathub infrastructure. The Flathub portion of this work is funded by the Endless grant.
  • $32,000 in Outreachy internships. This is a long-term partnership with Conservancy and commitment by the GNOME Foundation as the original birthplace of the Outreachy initiative. They are supported this year by reallocating some of the Endless grant, with their permission. This will pay for a total of 4 interns between the winter and summer cohorts.
  • $20,000 in contractor support. This is allocated for part-time contracting of Thibault Martin and Dawid Jankowiak to support the STF team and work on a crowdfunding platform for our development fundraising. Some of this is funded by the Endless grant and will be spent on coordinating the next steps of the Flathub payments/donations launch.
  • $158,000 in employment/contractor costs for operations and admin staff, supporting the GNOME Foundation across finances, events and community initiatives.
  • $47,500 in professional services, i.e. legal and accounting. These include a reserve for legal fees ($10,000), an external accounts audit for the previous financial year ($17,500), which is required due to our income (mostly due to STF) being over the $2M threshold, and accounting fees ($20,000). Some of the financial and legal costs are driven by work setting up Flathub LLC and are covered by the Endless grant.
  • $3,200 in office expenses, mostly related to postal expenses required for sending material between contractors, staff, and event organisers.
  • $54,000 in conferences and travel. These include the budget for the conferences themselves ($30,000), which includes GUADEC, GNOME Asia, and hackathons around the globe, but also travel for staff ($12,000) and community ($12,000). Travel in particular has been significantly reduced from the previous year, but should still allow for staff/organisers to attend our events, and for the travel committee to support some community travel to GUADEC and GNOME Asia.
  • $15,000 in other fees. These include banking costs for sending money from the US to Europe, PayPal fees, and insurance. They might seem high, but are in total less than 1.5% of the cash flow of the Foundation, which is within the expected value for any organization.

Balance

  • As of the preparation of this budget, we have approx $140,000 in GNOME Foundation reserves. There’s a lot more money in the bank, but they are reserved funds held for GIMP and BPD.
  • We need to ensure that we meet our reserve policy of retaining 1.1 times core spending. Unfortunately, core spending is fairly loosely defined. This year, we have considered: Events and minimal staff travel, part-time infrastructure support, minimal staff, and some fees and professional services. In total, we accounted that we would need at least $158,000 at the end of the year to be able to meet the policy.
  • The approved budget should put our reserves around $176,000 at the year end, which is slightly above our reserve policy. Considering we used a very limited interpretation of the reserves policy, it’s better to include a small safety margin for any unanticipated costs.
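
As a quick cross-check of the figures above, the line items can be summed in a few lines of Python. This is only a sketch: the variable names are ours, and it assumes that year-end reserves are simply the opening reserves plus the budgeted surplus, which is not stated explicitly above.

```python
# All figures in US dollars, copied from the Income and Expenditures lists above.
income = {
    "donations": 205_100,
    "event sponsorship": 64_500,
    "fiscal sponsorship fees": 65_500,
    "bank interest": 1_000,
    "merchandise profit": 500,
    "Endless grant, 2nd year": 250_000,
}
expenditures = {
    "interim ED salary": 10_000,
    "development contractors": 100_000,
    "program staff": 110_600,
    "Outreachy internships": 32_000,
    "contractor support": 20_000,
    "operations and admin staff": 158_000,
    "professional services": 47_500,
    "office expenses": 3_200,
    "conferences and travel": 54_000,
    "other fees": 15_000,
}
opening_reserves = 140_000   # "approx $140,000" from the Balance section
required_reserves = 158_000  # minimum needed to meet the reserve policy

total_income = sum(income.values())
total_expenditures = sum(expenditures.values())
surplus = total_income - total_expenditures

# Assumption: reserves grow by exactly the budgeted surplus.
projected_reserves = opening_reserves + surplus

print(f"income:             ${total_income:,}")        # $586,600
print(f"expenditures:       ${total_expenditures:,}")  # $550,300
print(f"surplus:            ${surplus:,}")             # $36,300
print(f"projected reserves: ${projected_reserves:,}")  # $176,300
print("meets reserve policy:", projected_reserves >= required_reserves)
```

The totals round to the $586k income, $550k expense and roughly $176,000 year-end reserves quoted above.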

Conclusion

With limited time from our interim Executive Director (ED), Richard Littauer, who is working part-time, the board is prioritising: recruiting our new ED, delivering our current project/grant commitments (to STF and to Endless), and fundraising for development work. This includes working with the community to launch our development fund crowdfunder/platform and plan a follow-up project for the STF grant, so that the GNOME Foundation can support and grow its direct investment in project development.

Keen readers will note that there is nothing in the current budget for the ED’s salary. We are in discussions with a potential donor to see whether we can find support for the salary for the ED for the first year. In any case, transparently sharing our financial situation and fundraising needs is an essential part of any ED recruitment process, so we could still recruit somebody with “raise money for your own salary” being their first priority.

Hopefully this additional detail helps to show the challenges of our current situation, and why we had to make really tough decisions, like parting ways with some greatly appreciated members of our staff team. We hope this sheds some more light on why those decisions were taken, provides confidence on the work done by the board and the ED, and where we currently stand. We are also very relieved to be able to provide a surplus budget for the first time in many years, and doing so while still being able to support the community: events, infrastructure, internships, travel funding, and meeting our commitment to donors for work done in some parts of the stack, e.g.: Flathub, parental controls and GNOME Software.

We welcome any feedback and questions from the GNOME community. Thanks to all of our GNOME members, contributors, donors, sponsors and advisory board members!

The GNOME Foundation Board of Directors

October 09, 2024

PyCon España 2024

This year PyCon Spain happened just down the road from me in Vigo, and I was able to attend with a few folk from Codethink. Python conferences are usually great as there’s so much variety in the talks and the bar for quality is also quite high.

I’m not sure when the videos will be out and they’ll anyway be in Spanish, but here are some notes for the talks I found most interesting. (Codethink sponsors our attendance at conferences on the basis that we write an internal report afterwards, which is where most of these notes came from).

Mapping illegal tourist apartments with Python

https://pretalx.com/pycones-2024/talk/PWEFGB/

Presented by Juan Luis Cano Rodríguez, who is a co-founder of Python España and PyCon ES. This talk was activism with some data engineering, in the context of the housing crisis that’s affecting most Spanish cities and tourist destinations.

This talk is the story of how he accidentally got involved as an activist after he noticed someone in his own apartment building in Madrid was doing short-term rentals to tourists, and wondered whether they had a license.

He started by asking the audience to guess how many long-term rental properties are available on Idealista for Madrid. Is it 7,000, 21,000 or 43,000? Then he asked to guess how many short-term rentals are available on AirBnB.

Tourist apartments in Madrid are licensed only after a professional inspection. The council publish data on how many licenses they’ve issued, and there are 1008 properties listed. Meanwhile the property register, which doesn’t require any inspection, lists 11,000 short-term rental properties. AirBnB meanwhile has 26,000 offers in Madrid. He realized that illegal apartment rentals are far worse than he had imagined.

He used Python (of course) to compare the register of properties against the license data to identify which properties were unlicensed, and participated in a group effort to send 10,000 individual reports to the council in one day. This made the national news back in June. The mayor of Madrid even praised this “citizens effort”.
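
The talk’s actual code isn’t in my notes, but the core of that comparison can be sketched as a set difference. Everything here is an assumption for illustration: the function name, the idea that both datasets share a common property identifier, and the sample identifiers, which are invented.

```python
def find_unlicensed(registered_ids, licensed_ids):
    """Return registered properties that have no matching license, sorted."""
    return sorted(set(registered_ids) - set(licensed_ids))


# Invented sample identifiers, not real data from either dataset.
register = ["M-001", "M-002", "M-003", "M-004"]
licenses = ["M-002", "M-004"]

print(find_unlicensed(register, licenses))  # ['M-001', 'M-003']
```

In practice the hard part is likely matching records between datasets that don’t share a clean identifier, which this sketch skips entirely.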

AirBnB has a “license” field for short-term rentals in Spain, but they don’t do any checking of whether it’s valid. He experimented with their “Report” button with an apartment he knew was unlicensed, and AirBnB’s internal “investigation team” reported back a couple of days later that they had found nothing wrong.

Towards the end of the talk he offered the audience a choice: he could explain the data wrangling with Python in more detail, or give his thoughts on the housing crisis. The vote was around 90% for the latter, and his message was that (unsurprisingly) a lot more pressure is needed before anything is going to change. He finished by announcing a protest happening on the 13th October in Madrid.

The talk was very well received and could probably have filled 3 hours instead of 35 minutes.

How we are removing the GIL in Python

https://pretalx.com/pycones-2024/talk/ESYBVA/

Presented by Pablo Galindo Salgado, a Python core developer and Bloomberg engineer, this was a summary of the incredible work going on to remove CPython’s “Global Interpreter Lock” or GIL. The full technical details of this are in PEP 703; this is my short summary.

Again this talk could have been a full-day workshop but Pablo packed a lot of details into the time, using the simple trick of talking incredibly fast.

CPython’s global lock means that writing multithreaded Python code for CPU-bound work is largely a waste of time, as the threads just get stuck waiting for the global lock. Today people use horrible workarounds such as the ‘multiprocessing’ library.

However, the GIL has advantages too, mainly its simplicity. Deadlocks are a common problem in multithreaded code but currently CPython is immune to all that: a deadlock is caused when two threads each hold a lock the other is waiting for, but of course CPython only has the one lock.

So the big challenge is replacing the GIL with many separate locks, without making CPython slower, and avoiding deadlocks. CPython uses reference counting to manage memory, and this is the hardest piece of the puzzle as sharing objects between threads means every refcount change must be atomic and this can require locking, which limits throughput.

People have been trying to solve this since the mid 1990s and no one has yet succeeded, but the work here is actually being released this week in Python 3.13. It’s behind a compile-time configure flag, and the expectation is to spend around 5 years testing this before it’s enabled by default for everyone.

A minimal overview of how they’ve done it:

  • switch to biased ref counting for Python objects: this means there are now two refcounts per object, a thread-local refcount and a global refcount. The global one can be slow to access, but most of the time a local refcount is good enough and this is a fast path.
  • make common objects immortal: remember Python dates from the golden age of object-oriented programming, and everything is an object including True, False, None, the number 4 and so on. By setting the highest 2 bits of the refcount, these are now marked as immortal.
  • fast locking: using spinlocks from WebKit (the excellently named WTF::Lock) where possible
  • new memory allocator: use mimalloc, which allows linked list segments to be allocated contiguously instead of all over the place.
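
To make the first point more concrete, here is a toy model of biased reference counting in Python. It is a sketch of the idea only, not CPython’s implementation: the class and attribute names are mine, and a `threading.Lock` stands in for the atomic operations the real thing would use on the shared count.

```python
import threading


class BiasedRefCount:
    """Toy model: one count for the owning thread, one for everyone else."""

    def __init__(self):
        self.owner = threading.get_ident()  # thread that created the object
        self.local = 1                      # only touched by the owner, no locking
        self.shared = 0                     # touched by other threads
        self._shared_lock = threading.Lock()

    def incref(self):
        if threading.get_ident() == self.owner:
            self.local += 1                 # fast path: plain increment
        else:
            with self._shared_lock:         # slow path: stands in for an atomic add
                self.shared += 1

    def decref(self):
        if threading.get_ident() == self.owner:
            self.local -= 1
        else:
            with self._shared_lock:
                self.shared -= 1

    def total(self):
        with self._shared_lock:
            return self.local + self.shared


rc = BiasedRefCount()
rc.incref()                                   # owner thread: fast path
other = threading.Thread(target=rc.incref)    # another thread: slow path
other.start()
other.join()
print(rc.local, rc.shared, rc.total())        # 2 1 3
```

The point of the split is that most references to an object come from the thread that created it, so most refcount changes never pay for synchronisation.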

It seems a lot of this work has been funded by corporations, mainly Meta as noted in the Python 3.13 release notes, and we can assume Bloomberg are somewhat involved as well.

I hope Pablo does the same talk in English at some point because it was excellent. In the meantime you can hear him on the Changelog podcast:

The tsunami of disinformation: how Generative AI can help us in the fight against Fake News.

https://pretalx.com/pycones-2024/talk/8ZXZ8Z/

Presented by Rubén Míguez and Agustín Cañas from Newtral, a fact-checking organization in Spain.

There are many bullshit talks around about how generative AI will magically solve some or other intractable problem, as if a text synthesis machine built from Reddit comments is somehow going to succeed where our best scientists and leaders have so far failed. I recommend reading https://pivot-to-ai.com/ on that topic, but let me get back to the talk.

The talk began with an overview of fake news, which I’m sure I don’t need to repeat here, except the excellent statistic taken from Washington Post fact-checkers that during 2016-2020, the then US president made around 30,000 provably false statements, which works out at 21 lies a day, every day for four years including Sundays and bank holidays. So it’s been an interesting time in the world of fact-checkers.

Newtral has existed since 2013 and began in a very analogue way: human fact checkers watching news programs with big notebooks and writing down things to be verified. The goal of the small technology team was not to replace the human fact checkers but to build tools that could make them work 30x as fast.

(So if you were wondering how generative AI tools could possibly detect fake news, when Google’s “state of the art” AI is telling people to throw car batteries into the ocean and put glue on pizza, the answer is of course: it’s not going to, there will still be a human fact checker in the loop).

The fact checking process is roughly as follows:

  1. Monitor media
  2. Spot facts
  3. Verify them
  4. Publish (“exploit”) the result

The talk detailed some tools to automate the first two parts: using speech-to-text models to transcribe political speeches and so on as they happen, and then text analysis tools to identify potentially “verifiable” statements. As well as deciding whether a statement can be verified, they have to analyse whether it matters, whether the misinformation could be dangerous, whether it’s a joke, and so on. There wasn’t much detail on how they do this in the talk, unfortunately.

Then they mentioned a multimodal model named ClaimCheck which has run as a WhatsApp bot since 2020 where people can send forwarded messages to see if they are true. This works using RAG (Retrieval Augmented Generation) to search an existing database of human-verified facts. Again I’d have loved more details on how this works.

Finally they mentioned that Newtral will be launching a technology-focused spinoff named Trueflag, aiming to build the existing tech into a software-as-a-service tool for fact checking that can be used by news organisations, social media platforms and so on.

I found this article online which gives some more info:

I’m still very curious about how far they can get with this tech and will be following closely.

Other notes

There were some talks I didn’t enjoy so much. One was about using Pyomo to model the game Age of Empires 2, delivered by some mathematicians who unfortunately didn’t actually hook up their model to the real game, and the second half of the talk was just graphs when what we obviously wanted was footage from actual Age of Empires 2 games. Pyomo looks interesting though.

I also saw a talk on the Robot Framework for testing. The talk was well delivered but I fundamentally disagree with the premise that by hiding Python code behind a bunch of free form text, “non-programmers” will be able to write and adapt test suites. The tool itself looks fine, but I wonder when we’ll learn that there’s no tool that can magically make testing easy. People just have to learn to write excellent test suites using code, and that’s the only way anyone will be able to produce a good test suite. (I’m considering doing a talk myself called “How to write a good test suite using regular programming tools”.)

Vigo was of course fantastic despite the intermittent rain, minor travel issues and increasingly strange procedures for checking in to tourist apartments. (This is the first time I’ve stayed in a place with a “Virtual key”, a web page linked to a smart lock on the apartment door which means you need working internet to enter the place). At least the rental apartment did have a license.

I’d recommend Python community events to everyone. People use Python for all kinds of different things – science, journalism, art, activism, and so on – so you always get an interesting mix of talks and not just dry technical presentations of screenshots of code. Even the very technical talk on the GIL was not dry at all. Wherever you live you probably have a local Python community and they may well organize an annual conference, which is worth investigating. Those of us outside the UK no doubt also have a national Python community who probably hold a conference from time to time. Get involved!

Thanks as always to Codethink for sponsoring accommodation, food and sponsoring the event itself.

Dev Log September 2024

A long overdue dev log. The last one was for September 2023. That's a year. Stuff in life has happened.

Compiano

In November I switched Compiano to use pipewire directly for sound. This meant removing bits of UI too.

I should look at a release, but I have a few blockers I need to tackle. One key element is that there is a mechanism to download the soundbanks and for now it doesn't ask consent to do so. I don't want a release without this.

Raw thumbnailer

I already posted about it.

libopenraw

A lot has happened on that front. I want it to be a foundation of Niepce and others.

First, adding it to glycin triggered an alpha release on crates.io. That started a long cycle of alpha releases; we are at alpha 8 as of now. Various changes:

  • Added the mp4parse crate directly as a module. A key reason is that I already used a fork, so it complicated things. Maybe I should make these few bits upstreamable and use upstream.
  • Added a mime types API
  • Saved a lot of RAM when doing colour interpolation: it's done in place instead of allocating a new buffer. This is significant as this is done using a 64-bit float per component.
  • Fixed the rendering in many ways. The only thing it still needs is to apply the colour balance.
  • Fixed unpack or decompression of Olympus and Fuji files.
  • Got an external contribution: Panasonic decompression. This made me fix the loading of uncompressed Panasonic raw too. The more recent Panasonic cameras are still failing though, there is a subtle variant that needs to be handled.

Still missing from rendering: recent Nikon and all their exotic variants and compression schemes, Canon CR3, GoPro, Sony.

Niepce

Not so much work directly done on Niepce in the last few months, but still.

Ongoing features

A while ago I started some work towards the import and a rework of the catalog (the main data storage).

The former is the implementation of a workflow that allows importing images into the catalog.

The latter involves reworking the catalog to become a self contained storage as a sqlite3 database. One step I already did was to use it to store the catalog preferences instead of a separate file. This should also include fixing the UI for opening, creating and switching catalogs.

These two big things are user visible and are a step towards what I want to happen as an internal milestone. Then I can start plugging in the library import and maybe import my picture vault. A good starting point towards managing the collection, but not really for photo editing yet. Gotta make choices.

Images

Implemented support for HEIF which is being adopted by camera manufacturers.

I updated the RT engine to 5.11 which came with RawTherapee 5.11. This is still a soft fork of the code base to strip out Gtk3 and use a more recent version of glibmm. The latter patch might no longer be needed as I have since removed gtkmm from Niepce.

I also implemented the GEGL pipeline using gegl-rs, which I took over to make it useful. At some point I shall try to figure out how to write a loader in Rust to use libopenraw with GEGL.

Cleanups

The UI is slowly moving to use blueprint, and I removed all the first-party C++ code outside of bindings. Gtkmm is gone, and Glibmm is only here because RT engine needs it.

Other

Stuff I contributed to.

STF

I took part in the STF effort and worked on fixing issues with the desktop portal. The big chunk of the work related to the USB portal, taking over code by Georges that is itself based on code by Ryan. It spreads through multiple components of the stack: flatpak, xdg-desktop-portal, xdg-desktop-portal-gnome, libportal and ashpd.

I also did a bunch of bug fixes, crashes, memory leaks, etc in flatpak, flatpak-builder, and the rest of the stack.

I also implemented issue #1 for flatpak-builder: easy renaming of MIME files and icons, and also properly fixing the id in the appstream file, a common problem in flatpak.

Glycin

Glycin is a sandboxed image loader. I implemented the raw camera loader using libopenraw. It's written in Rust, and so is libopenraw now. Thank you Sophie for merging it.

Poppler

Jeff was complaining about a file being super slow, with a sysprof flamegraph. That piqued my curiosity and I looked at it. The peculiarity of the document is that it has 16000 pages and a lot of cross references.

This led to two patches:

  • The first one is in 24.06. There was a loop, calling for the length of the container at each iteration. Turns out this is protected by a mutex and that's a lot of time spent for nothing since the value is immutable. Call it once before the loop and voila.

  • The other one, merged for 24.09, changes that loop to be a hash table lookup. The problem is that it wants to locate the page by object reference, but iterates through the page list. Lots of references and lots of pages mean even more iterations. The more complex approach is that when building the page cache (it's done on demand), we build a reference to page index map. The slow code is no longer slow and almost disappears from the flamegraphs.
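
The second patch boils down to replacing a linear scan with a map lookup. Here is a rough sketch of that idea in Python; it is not the Poppler code (which is C++), and the names as well as the (number, generation) tuples standing in for object references are invented for illustration.

```python
def find_page_slow(pages, ref):
    """Linear scan: every lookup walks the whole page list."""
    for index, page_ref in enumerate(pages):
        if page_ref == ref:
            return index
    return -1


class PageCache:
    """Builds a reference -> page index map once, on first use."""

    def __init__(self, pages):
        self._pages = pages
        self._index_by_ref = None  # built on demand

    def find_page(self, ref):
        if self._index_by_ref is None:
            self._index_by_ref = {r: i for i, r in enumerate(self._pages)}
        return self._index_by_ref.get(ref, -1)


# Hypothetical references as (object number, generation) tuples.
pages = [(number, 0) for number in range(100, 110)]
cache = PageCache(pages)
print(find_page_slow(pages, (105, 0)), cache.find_page((105, 0)))  # 5 5
```

With thousands of pages and thousands of cross references, the scan does work proportional to pages times references, while the map does one pass over the pages and then a constant-time lookup per reference.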

This makes Evince and Okular faster to open that document and any document with a similar attribute: a lot of bookmarks.

October 06, 2024

CapyPDF 0.12.0 released

I have just made the 0.12 release of CapyPDF. It does not really have new features, but the API has been overhauled. It is almost guaranteed that no code developed against 0.11 will work without code changes. Such is the joy of not having any users.

Experimental C++ wrapper

CapyPDF has a plain C API. This makes it stable and easy to use from any programming language. That also makes it cumbersome to use. Here is what you need to write to create a PDF file that has a single rectangle:

Given that this is C, you can't really do better. However I was asked if I could create something more ergonomic for C++ users. This seemed like an interesting challenge, so I did some experimentation, which ships with the 0.12 release.

The requirements

The C++ wrapper should fulfill the following requirements:

  • Fully type safe
  • No manual memory management
  • Ideally a single header
  • Zero overhead
  • Fast to compile
  • All objects are move-only
  • IDE code completion friendly
  • Does not need to maintain API or ABI stability (those who need that have to use the C API anyway)

The base

After trying a bunch of different things I eventually came up with this design:

Basically it just stores an underlying CapyPDF object type and a helper class used to deallocate it. The operators merely mean that the object can be cast to a pointer of the underlying type. Typically you want conversion operators to be explicit but these are not, because being implicit removes a ton of boilerplate code.

For example let's look at what the Color object wrapper looks like. First you need the deleter:

The class definition itself is simple:

The class has no other members than the one from the base class. The last thing we need before we can start calling into CapyPDF functions is an error handler. Because all functions in the API have the exact same form, this can be done with a single macro:

This makes the constructor look like this:

Method calls bring all of these things together.

Because the wrapper types are implicitly convertible to the underlying pointer types you can pass them directly to the C API functions. Otherwise the wrapper code would be filled with .get() method calls or the like.

The end result

With all that done, the C code from the beginning of this post can be written like this:

Despite using templates, inheritance and stdlib types the end result can be proven to have zero overhead:

After this what remains is mostly the boring work of typing out all the wrapper calls.

October 05, 2024

Boiling The Ocean Hackfest

Last weekend we had another edition of last year’s post-All Systems Go hackfest in Berlin. This year it was even more of a collaborative event with friends from other communities, particularly postmarketOS. Topics included GNOME OS, postmarketOS, systemd, Android app support, hardware enablement, app design, local-first sync, and many other exciting things.

This left us with an awkward branding question, since we didn’t want to name the event after one specific community or project. Initially we had a very long and unpronounceable acronym (LMGOSRP), but I couldn’t bring myself to use that on the announcement post so I went with something a bit more digestible :)

“Boiling The Ocean” refers to the fact that this is what all the hackfest topics share in common: They’re all very difficult long-term efforts that we expect to still be working on for years before they fully bear fruit. A second, mostly incidental, connotation is that the ocean (and wider biosphere) are currently being boiled thanks to the climate crisis, and that much of our work has a degrowth or resilience angle (e.g. running on older devices or local-first).

I’m not going to try to summarize all the work done at the event since there were many different parallel tracks, many of which I didn’t participate in. Here’s a quick summary of a few of the things I was tangentially involved in; hopefully others will do their own write-ups about what they were up to.

Mobile

Mainline Linux on ex-Android phones was a big topic, since there were many relevant actors from this space present. This includes the postmarketOS crew, Robert with his camera work, and Jonas and Caleb who are still working on Android app support via Alien Dalvik.

To me, one of the most exciting things here is that we’re seeing more well-supported Qualcomm devices (in addition to everyone’s favorite, the Oneplus 6) these days thanks to all the work being done by Caleb and others on that stack. Between this, the progress on cameras, and the Android app support maybe we can finally do the week-long daily driving challenge we’ve wanted to do for a while at GUADEC 2025 :)

Design

On Thursday night we already did a bit of pre-event hacking at a cafe, and I had an impromptu design session with Luca about eSIM support. He has an app for this at the moment, though of course ideally this should just be in Settings longer-term. For now we discussed how to clean up the UI a bit and bring it more in line with the HIG, and I’ll push some updates to the cellular settings mockups based on this soon.

On Friday I looked into a few Papers things with Pablo, in particular highlights/annotations. I pushed the new mockups, including a new way to edit annotations. It’s very exciting to see how energetic the Papers team is, huge kudos to Pablo, Qiu, Markus, et al for revitalizing this app <3

On Saturday I sat down with fellow GNOME design contributor Philipp, and looked at a few design questions in Decibels and Calendar. One of my main takeaways is that we should take a fresh look at the adaptive Calendar layout now that we have Adwaita breakpoints and multi-layout.

47 Release Party

On Saturday night we had the GNOME 47 release party, featuring a GNOME trivia quiz. Thanks to Ondrej for preparing it, and congrats to the winners: Adrian, Marvin, and Stefan :)

Local-First

Adrian and Andreas from p2panda had some productive discussions about a longer-term plan for a local-first sync system, and immediate next steps in that direction.

We have a first collaboration planned in the form of a Hedgedoc-style local-first syncing pad, codenamed “Aardvark” (initial mockups). This will be based on a new, more modular version of p2panda (still WIP, but to be released later this year). Longer-term the idea is to have some kind of shared system level daemon so multiple apps can use the same syncing infrastructure, but for now we want to test this architecture in a self-contained app since it’s much easier to iterate on. There’s no clear timeline for this yet, but we’re aiming to start this work around the end of the year.

GNOME OS

On Sunday we had a GNOME OS planning meeting with Adrian, Abderrahim, and the rest of the GNOME OS team (remote). The notes are here if you’re interested in the details, but the upshot is that the transition to the next-generation stack using systemd sysupdate and homed is progressing nicely (thanks to the work Adrian and Codethink have been doing for our Sovereign Tech Fund project).

If all goes to plan we’ll complete both of these this cycle, making GNOME OS 48 next spring a real game changer in terms of security and reliability.

Community

Despite the very last minute announcement and some logistical back and forth the event worked out beautifully, and we had over 20 people joining across the various days. In addition to the usual suspects I was happy to meet some newcomers, including from outside Berlin and outside the typical desktop crowd. Thanks for joining everyone!

Thanks also to Caleb and Zeeshan for helping with organization, and the venues we had hosting us across the various days:

  • offline, a community space in Neukölln
  • JUCR, for hosting us in their very cool Kreuzberg office and even paying for drinks and food
  • The x-hain hackerspace in Friedrichshain

See you next time!

October 04, 2024

fwupd 2.0.0 and new tricks

Today I tagged fwupd 2.0.0, which includes lots of new hardware support, a ton of bugfixes and more importantly a redesigned device prober and firmware loader that allows it to do some cool tricks. As this is a bigger-than-usual release I’ve written some more verbose release notes below.

The first notable thing is that we’ve removed the requirement of GUsb in the daemon, and now use libusb directly. This allowed us to move the device emulation support from libgusb up into libfwupdplugin, which now means we can emulate devices created from sysfs too. This means that we can emulate end-to-end firmware updates on fake hidraw and nvme devices in CI just like we’ve been able to emulate using fake USB devices for some time. This increases the coverage of testing for every pull request, and makes sure that none of our “improvements” actually end up breaking firmware updates on some existing device.

The emulation code is actually pretty cool; every USB control request, ioctl(), read() (and everything in between) is recorded from a target device and saved to a JSON file with a unique per-request key for each stage of the update process. This is saved to a zip archive and is usually uploaded to the LVFS mirror and used in the device-tests in fwupd. It’s much easier than having a desk full of hardware, and because each emulation is just that, emulated, we don’t need to do the tens of thousands of 5ms sleeps in between device writes, which means most emulations take a few ms to load, decompress, write and verify. This means you can test [nearly] “every device we support” in just a few seconds of CI time.

Another nice change is the removal of GUdev as a dependency. GUdev is a nice GObject abstraction over libudev and then sd_device from systemd, but when you’re dealing with thousands of devices (that you’re poking in weird ways), and tens of thousands of device children and parents the “immutable device state” objects drift from reality and the abstraction layers really start to hurt. So instead of using GUdev we now listen to the netlink socket and parse those events into fwupd FuDevice objects, rather than having an abstract device with another abstract device being used as a data source. It has also allowed us to remove at least one layer of caching (that we had to work around in weird ways), and also reduce the memory requirement both at startup and at runtime at the expense of re-implementing the netlink parsing code. It also means we can easily start using ueventd, which makes it possible to run fwupd on Android. More on that another day!

[Figure: dependency graph showing lots of things. Caption: The old]
[Figure: dependency graph showing a lot less things. Caption: The new]

The biggest change, and the feature that’s been requested the most by enterprise customers is the ability to “stream” firmware from archives into devices. What fwupdmgr used to do (and what 1_9_X still does) is:

  • Send the cabinet archive to the daemon as a file descriptor
  • The daemon then loads the input stream into memory (copy 1)
  • The memory blob is parsed as a cabinet archive, and the blocks-with-header are re-assembled into whole files (copy 2)
  • The payload is then typically chunked into pieces, with each chunk being allocated as a new blob (copy 3)
  • Each chunk is sent to the device being updated

This worked fine for a 32MB firmware payload — we allocate ~100MB of memory and then free it, no bother at all.

Where this fails is for one of two cases: huge firmware or underpowered machine — or in the pathological case, huge video conferencing camera firmware with inexpensive Google ChromeBook. In that example we might have a 1.5GB firmware file (it’s probably a custom Android image…) on a 4GB-of-RAM budget ChromeBook. The running machine has a measly 1GB free system memory, and then fwupd immediately OOMs when just trying to parse the archive, let alone deploy the firmware.

So what can we do to reduce the number of in memory copies, or maybe even remove them all completely? There are two tricks that fwupd 2.0.x uses to load firmware now, and those two primitives we now use all over the source tree:

Partial Input Stream:

This models an input stream (which you can think of like a file descriptor) that is made up of a part of a different input stream at a specific offset. So if you have a base input stream of [123456789] you can build two partial input streams of, say, [234] and [789]. If you try and read() 5 bytes from the first partial stream you just get 3 bytes back. If you seek to offset 0x1 on the second partial input stream you get the two bytes of [89].

Composite Input Stream

This models a different kind of input stream, which is made up of one or more partial input streams. In some cases there can be hundreds of partial streams making up one composite stream. So if you take the first two partial input streams defined a few lines before, and then add them to a composite input stream you get [234789] — and reading 8 bytes at offset 0x0 from that would give you what you expect.
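To make the two primitives concrete, here is a minimal sketch in C. It is not the fwupd implementation: an in-memory buffer stands in for the base input stream, and the type and function names (PartialStream, CompositeStream, partial_read, composite_read) are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A window onto a base buffer; creating one copies no payload bytes. */
typedef struct {
    const unsigned char *base; /* stands in for the base input stream */
    size_t offset;             /* where the window starts in the base */
    size_t size;               /* window length in bytes */
} PartialStream;

/* An ordered list of partial streams that reads as one contiguous stream. */
typedef struct {
    const PartialStream *parts;
    size_t n_parts;
} CompositeStream;

/* Read up to count bytes at position pos of the window; reads are short
 * at the end of the window, just like read() at end of file. */
static size_t partial_read(const PartialStream *ps, size_t pos,
                           void *buf, size_t count)
{
    if (pos >= ps->size)
        return 0;
    if (count > ps->size - pos)
        count = ps->size - pos;
    memcpy(buf, ps->base + ps->offset + pos, count);
    return count;
}

/* Read up to count bytes at position pos of the composite, walking the
 * parts in order and crossing part boundaries as needed. */
static size_t composite_read(const CompositeStream *cs, size_t pos,
                             void *buf, size_t count)
{
    unsigned char *out = buf;
    size_t done = 0;

    assert(cs->parts != NULL || cs->n_parts == 0);
    for (size_t i = 0; i < cs->n_parts && done < count; i++) {
        const PartialStream *ps = &cs->parts[i];
        if (pos >= ps->size) {
            pos -= ps->size; /* requested position lies beyond this part */
            continue;
        }
        done += partial_read(ps, pos, out + done, count - done);
        pos = 0;             /* later parts are read from their start */
    }
    return done;
}
```

With a base of "123456789" and windows at (offset 1, size 3) and (offset 6, size 3), asking the first window for 5 bytes returns the 3 bytes "234", reading the second window from position 1 returns "89", and asking the composite for 8 bytes at position 0 returns the 6 bytes "234789".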

This means the new way of processing firmware archives can be:

  • Send the cabinet archive to the daemon as a file descriptor
  • The daemon parses it as a cab archive header, and adds the data section of each block to a partial stream that references the base stream at a specific offset
  • The daemon “collects” all the partial streams into a composite stream for each file in the archive that spans multiple blocks
  • The payload is split into chunks, with each chunk actually being a partial stream of the composite file stream
  • Each chunk is read from the stream, and sent to the device being updated
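The chunking step in the list above can be sketched as pure bookkeeping: each chunk is only an (offset, size) pair into the composite stream, so no payload bytes move until the chunk is actually read. This is an illustration under that assumption, not fwupd's chunking code; ChunkRange, chunk_count and chunk_at are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* A chunk described only by its position in the payload stream. */
typedef struct {
    size_t offset;
    size_t size;
} ChunkRange;

/* Number of chunks needed to cover total bytes, rounding up. */
static size_t chunk_count(size_t total, size_t chunk_size)
{
    assert(chunk_size > 0);
    return (total + chunk_size - 1) / chunk_size;
}

/* Describe chunk number idx; the last chunk may be shorter than the rest.
 * An out-of-range index yields an empty range. */
static ChunkRange chunk_at(size_t total, size_t chunk_size, size_t idx)
{
    ChunkRange r = { 0, 0 };

    assert(chunk_size > 0);
    if (idx >= chunk_count(total, chunk_size))
        return r;
    r.offset = idx * chunk_size;
    r.size = total - r.offset < chunk_size ? total - r.offset : chunk_size;
    return r;
}
```

Each range would then be turned into a partial stream of the composite file stream and read just before being written to the device.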

Sooo…. We never actually read the firmware payload from the cabinet file descriptor until we actually send the chunk of payload to the hardware. This means we have to seek() all over the place, possibly many times for each chunk, but in the kernel a seek() is really just doing some pointer maths to a memory buffer and so it’s super quick — even faster in real time than the “simple” process we used in 1_9_X. The only caveat is that you have to use uncompressed cabinet archives (the default for the LVFS) — as using MSZIP decompression currently does need a single copy fallback.

blocks in a cab archive

This means we can deploy a 1.5GB firmware payload using an amazingly low 8MB of RSS, and using less CPU than copying 1.5GB of data around a few times. That means you can now deploy that huge firmware to that $3,000 meeting room camera from a $200 ChromeBook, and it also means we can do the same in RHEL for 5G mobile broadband radios on low-power, low-cost IoT hardware.

Making such huge changes to fwupd meant we could justify branching a new release, and because we bumped the major version it also made sense to remove all the deprecated API in libfwupd. All the changes are documented in the README file, but I’ve already sent patches for gnome-firmware, gnome-software and kde-discover to make the tiny changes needed for the library bump.

My plan for 2.0.x is to ship it in Flathub, and in Fedora 42 — but NOT Fedora 41, RHEL 9 or RHEL 10 just yet. There is a lot of new code that’s only had a little testing, and I fully expect to do a brown paperbag 2.0.1 release in a few days because we’ve managed to break some hardware for some vendor that I don’t own, or we don’t have emulations for. If you do see anything that’s weird, or have hardware that used to be detected, and now isn’t — please let us know.

Anyway, enough talking for now, enjoy!

HIDIOCREVOKE merged for kernel 6.12

TLDR: if you know what EVIOCREVOKE does, the same now works for hidraw devices via HIDIOCREVOKE.

The HID standard is the most common hardware protocol for input devices. In the Linux kernel HID is typically translated to the evdev protocol which is what libinput and all Xorg input drivers use. evdev is the kernel's input API and used for all devices, not just HID ones.

evdev is mostly compatible with HID but there are quite a few niche cases where they differ a fair bit. And some cases where evdev doesn't work well because of different assumptions, e.g. it's near-impossible to correctly express a device with 40 generic buttons (as opposed to named buttons like "left", "right", ...[0]). In particular for gaming devices it's quite common to access the HID device directly via the /dev/hidraw nodes. And of course for configuration of devices accessing the hidraw node is a must too (see Solaar, openrazer, libratbag, etc.). Alas, /dev/hidraw nodes are only accessible as root - right now applications work around this by either "run as root" or shipping udev rules tagging the device with uaccess.

evdev too can only be accessed as root (or the input group) but many many moons ago when dinosaurs still roamed the earth (version 3.12 to be precise), David Rheinsberg merged the EVIOCREVOKE ioctl. When called the file descriptor immediately becomes invalid, any further reads/writes will fail with ENODEV. This is a cornerstone for systemd-logind: it hands out a file descriptor via DBus to Xorg or the Wayland compositor but keeps a copy. On VT switch it calls the ioctl, thus preventing any events from reaching said X server/compositor. In turn this means that a) X no longer needs to run as root[1] since it can get input devices from logind and b) X loses access to those input devices at logind's leisure so we don't have to worry about leaking passwords.

Fast-forward to 2024: kernel 6.12 has now gained the HIDIOCREVOKE ioctl for /dev/hidraw nodes. The corresponding logind support has also been merged. The principle is the same: logind can hand out an fd to a hidraw node and can revoke it at will so we don't have to worry about data leakage to processes that should no longer receive events. This is the first of many steps towards more general HID support in userspace. It's not immediately usable since logind will only hand out those fds to the session leader (read: compositor or Xorg) so if you as an application want that fd you need to convince your display server to give it to you. For that we may have something like the inputfd Wayland protocol (or maybe a portal but right now it seems a Wayland protocol is more likely). But that aside, let's hooray nonetheless. One step down, many more to go.
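For illustration, this is roughly what the revoking side looks like from C. It is a sketch, not logind's code: the helper names are hypothetical, and passing a NULL argument to HIDIOCREVOKE is an assumption that mirrors the EVIOCREVOKE convention. The hidraw call is only compiled in when the installed kernel headers define HIDIOCREVOKE (6.12 or later).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <linux/input.h>  /* EVIOCREVOKE */
#include <linux/hidraw.h> /* HIDIOCREVOKE, with 6.12+ kernel headers */

/* Revoke an evdev fd. Returns 0 on success or a negative errno value. */
static int revoke_evdev_fd(int fd)
{
    return ioctl(fd, EVIOCREVOKE, NULL) == 0 ? 0 : -errno;
}

/* Revoke a hidraw fd. Returns 0 on success or a negative errno value;
 * -ENOTSUP means the headers used at build time predate HIDIOCREVOKE. */
static int revoke_hidraw_fd(int fd)
{
#ifdef HIDIOCREVOKE
    /* Assumption: like EVIOCREVOKE, the ioctl takes no real argument. */
    return ioctl(fd, HIDIOCREVOKE, NULL) == 0 ? 0 : -errno;
#else
    (void)fd;
    return -ENOTSUP;
#endif
}
```

The process that keeps the original fd (logind in the description above) would call one of these on VT switch; the copy held by the other process refers to the same open file, so it stops working too.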

One of the other side-effects of this is that logind now has an fd to any device opened by a user-space process. With HID-BPF this means we can eventually "firewall" these devices from malicious applications: we could e.g. allow libratbag to configure your mouse's buttons but block any attempts to upload a new firmware. This is very much an idea for now, there's a lot of code that needs to be written to get there. But getting there we can now, so full of optimism we go[2].

[0] to illustrate: the button that goes back in your browser is actually evdev's BTN_SIDE and BTN_BACK is ... just another button assigned to nothing particular by default.
[1] and c) I have to care less about X server CVEs.
[2] mind you, optimism is just another word for naïveté

October 03, 2024

preliminary notes on a nofl field-logging barrier

When you have a generational collector, you aim to trace only the part of the object graph that has been allocated recently. To do so, you need to keep a remembered set: a set of old-to-new edges, used as roots when performing a minor collection. A language run-time maintains this set by adding write barriers: little bits of collector code that run when a mutator writes to a field.

Whippet’s nofl space is a block-structured space that is appropriate for use as an old generation or as part of a sticky-mark-bit generational collector. It used to have a card-marking write barrier; see my article diving into V8’s new write barrier, for more background.

Unfortunately, when running whiffle benchmarks, I was seeing no improvement for generational configurations relative to whole-heap collection. Generational collection was doing fine in my tiny microbenchmarks that are part of Whippet itself, but when translated to larger programs (that aren’t yet proper macrobenchmarks), it was a lose.

I had planned on doing some serious tracing and instrumentation to figure out what was happening, and thereby correct the problem. I still plan on doing this, but for this issue I used the old noggin technique instead: just, you know, thinking about the thing, eventually concluding that unconditional card-marking barriers are inappropriate for sticky-mark-bit collectors. As I mentioned in the earlier article:

An unconditional card-marking barrier applies to stores to slots in all objects, not just those in oldspace; a store to a new object will mark a card, but that card may contain old objects which would then be re-scanned. Or consider a store to an old object in a more dense part of oldspace; scanning the card may incur more work than needed. It could also be that Whippet is being too aggressive at re-using blocks for new allocations, where it should be limiting itself to blocks that are very sparsely populated with old objects.

That’s three problems. The second is well-known. But the first and last are specific to sticky-mark-bit collectors, where pages mix old and new objects.

a precise field-logging write barrier

Back in 2019, Steve Blackburn’s paper Design and Analysis of Field-Logging Write Barriers took a look at the state of the art in precise barriers that record not regions of memory that have been updated, but the precise edges (fields) that were written to. He ends up re-using this work later in the 2022 LXR paper (see §3.4), where the write barrier is used for deferred reference counting and a snapshot-at-the-beginning (SATB) barrier for concurrent marking. All in all field-logging seems like an interesting strategy. Relative to card-marking, work during the pause is much less: you have a precise buffer of all fields that were written to, and you just iterate that, instead of iterating objects. Field-logging does impose some mutator cost, but perhaps the payoff is worth it.

To log each old-to-new edge precisely once, you need a bit per field indicating whether the field is logged already. Blackburn’s 2019 write barrier paper used bits in the object header, if the object was small enough, and otherwise bits before the object start. This requires some cooperation between the collector, the compiler, and the run-time that I wasn’t ready to pay for. The 2022 LXR paper was a bit vague on this topic, saying just that it used “a side table”.

In Whippet’s nofl space, we have a side table already, used for a number of purposes:

  1. Mark bits.

  2. Iterability / interior pointers: is there an object at a given address? If so, it will have a recognizable bit pattern.

  3. End of object, to be able to sweep without inspecting the object itself

  4. Pinning, allowing a mutator to prevent an object from being evacuated, for example because a hash code was computed from its address

  5. A hack to allow fully-conservative tracing to identify ephemerons at trace-time; this re-uses the pinning bit, since in practice such configurations never evacuate

  6. Bump-pointer allocation into holes: the mark byte table serves the purpose of Immix’s line mark byte table, but at finer granularity. Because of this though, it is swept lazily rather than eagerly.

  7. Generations. Young objects have a bit set that is cleared when they are promoted.

Well. Why not add another thing? The nofl space’s granule size is two words, so we can use two bits of the byte for field logging bits. If there is a write to a field, a barrier would first check that the object being written to is old, and then check the log bit for the field being written. The old check will be to a byte that is nearby or possibly the same as the one to check the field logging bit. If the bit is unset, we call out to a slow path to actually record the field.

preliminary results

I disassembled the fast path as compiled by GCC and got something like this on x86-64, in AT&T syntax, for the young-generation test:

mov    %rax,%rdx
and    $0xffffffffffc00000,%rdx
shr    $0x4,%rax
and    $0x3ffff,%eax
or     %rdx,%rax
testb  $0xe,(%rax)

The first five instructions compute the location of the mark byte, from the address of the object (which is known to be in the nofl space). If it has any of the bits in 0xe set, then it’s in the old generation.
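Written back out as C, the same computation might look like the sketch below. The constants are read straight off the disassembly (a 4 MB slab mask, a shift by 4 for 16-byte granules, an 18-bit index and the 0xe test); the function names are mine and not necessarily what Whippet calls them.

```c
#include <stdint.h>

/* Address of the metadata (mark) byte for an object in the nofl space:
 * keep the 4 MB-aligned slab base and use the granule index within the
 * slab as the byte offset. */
static uintptr_t mark_byte_address(uintptr_t obj)
{
    uintptr_t slab_base = obj & ~(uintptr_t)0x3fffff; /* and $0x...c00000 */
    uintptr_t index = (obj >> 4) & 0x3ffff;           /* shr $4; and $0x3ffff */
    return slab_base | index;                         /* or */
}

/* The old-generation test from the last instruction: any bit of 0xe set. */
static int mark_byte_is_old(uint8_t mark_byte)
{
    return (mark_byte & 0x0e) != 0;                   /* testb $0xe */
}
```

For example, an object at address 0x12345678 lives in the slab starting at 0x12000000 and has granule index 0x34567, so its metadata byte is at 0x12034567.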

Then to test a field logging bit it’s a similar set of instructions. In one of my tests the data type looks like this:

struct Node {
  uintptr_t tag;
  struct Node *left;
  struct Node *right;
  int i, j;
};

Writing the left field will be in the same granule as the object itself, so we can just test the byte we fetched for the logging bit directly with testb against $0x80. For right, we should be able to know it’s in the same slab (aligned 4 MB region) and just add to the previously computed byte address, but the C compiler doesn’t know that right now and so recomputes. This would work better in a JIT. Anyway I think these bit-swizzling operations are just lost in the flow of memory accesses.

For the general case where you don’t statically know the offset of the field in the object, you have to compute which bit in the byte to test:

mov    %r13,%rcx
mov    $0x40,%eax
shr    $0x3,%rcx
and    $0x1,%ecx
shl    %cl,%eax
test   %al,%dil
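In C, that bit selection is roughly the following sketch. It assumes that %r13 holds the address of the field being written and %dil the metadata byte already fetched for its granule, and it follows the description above in treating an unset bit as "not logged yet". The names are hypothetical.

```c
#include <stdint.h>

/* Log bit for a field: 0x40 for the first word of a two-word granule,
 * 0x80 for the second (mov $0x40; shr $3; and $1; shl %cl). */
static uint8_t field_log_bit(uintptr_t field_addr)
{
    return (uint8_t)(0x40u << ((field_addr >> 3) & 1));
}

/* Nonzero when the field's log bit is unset in its granule's metadata
 * byte, i.e. when the barrier has to call out to the slow path. */
static int field_needs_logging(uint8_t metadata_byte, uintptr_t field_addr)
{
    return (metadata_byte & field_log_bit(field_addr)) == 0;
}
```

For the Node struct above on a 64-bit target, tag sits at offset 0 and left at offset 8 of the same granule, so left maps to 0x80, which matches the testb against $0x80 mentioned earlier.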

Is it good? Well, it improves things for my whiffle benchmarks, relative to the card-marking barrier, seeing a 1.05×-1.5× speedup across a range of benchmarks. I suspect the main advantage is in avoiding the “unconditional” part of card marking, where a write to a new object could cause old objects to be added to the remembered set. There are still quite a few whiffle configurations in which the whole-heap collector outperforms the sticky-mark-bit generational collector, though; I hope to understand this a bit more by building a more classic semi-space nursery, and comparing performance to that.

Implementation links: the barrier fast-path, the slow path, and the sequential store buffers. (At some point I need to make it so that allocating edge buffers in the field set causes the nofl space to page out a corresponding amount of memory, so as to be honest when comparing GC performance at a fixed heap size.)

Until next time, onwards and upwards!

October 02, 2024

IPU6 camera support in Fedora 41

I'm happy to announce that the last tweaks have landed and that the fully FOSS libcamera software ISP based IPU6 camera support in Fedora 41 now has no known bugs left. See the Changes page for testing instructions.

Supported hardware

Unlike USB UVC cameras, where all cameras work with a single kernel driver, MIPI cameras like the Intel IPU6 cameras require multiple drivers. The IPU6 input-system CSI receiver driver is common to all laptops with an IPU6 camera, but different laptops use different camera sensors and each sensor needs its own driver. Then there are glue ICs like the LJCA USB IO-expander and the iVSC (Intel Visual Sensing Controller), and there is also the ipu-bridge code, which translates Windows-oriented ACPI tables with sensor info into the fwnodes which the Linux drivers expect.

This means that even though IPU6 support has landed in Fedora 41 not all laptops with an IPU6 camera will work. Currently the IPU6 integrated in the following CPU models works if the sensor + glue hw/sw is also supported:

  • Tiger Lake
  • Alder Lake
  • Raptor Lake

Jasper Lake and Meteor Lake also have an IPU6 but there is some more integration work necessary to get things to work there. Getting Meteor Lake IPU6 cameras to work is high on my TODO list.

The mainline kernel IPU6 CSI receiver + libcamera software ISP has been successfully tested on the following models:

  • Various Lenovo ThinkPad models with ov2740 (INT3474) sensor (1)
  • Various Dell models with ov01a10 (OVTI01A0) sensor
  • Dell XPS 13 Plus with ov13b10 (OVTIDB10/OVTI13B1)
  • Some HP laptops with hi556 sensor (INT3537)

To see which sensor your laptop has, run "ls /sys/bus/i2c/devices"; this will show e.g. "i2c-INT3474:00" if you have an ov2740, with INT3474 being the ACPI Hardware ID (HID) for the sensor. See here for a list of currently known HID to sensor mappings. Note that not all of these have upstream drivers yet. In that case chances are that there is a sensor driver for your sensor here.
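If you would rather script this than read the directory listing by eye, pulling the ACPI HID out of a device name is a small string operation. The helper below is a hypothetical sketch: its name is made up, and it assumes the "i2c-<HID>:<instance>" naming seen in the example above.

```c
#include <stddef.h>
#include <string.h>

/* Extract the ACPI HID from an I2C device name such as "i2c-INT3474:00".
 * Returns 0 and writes a NUL-terminated HID to out on success, or -1 if
 * the name does not have that shape or out is too small. */
static int acpi_hid_from_i2c_name(const char *name, char *out, size_t out_len)
{
    static const char prefix[] = "i2c-";
    const char *start, *colon;
    size_t len;

    if (strncmp(name, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    start = name + sizeof(prefix) - 1;
    colon = strchr(start, ':');
    if (colon == NULL || colon == start)
        return -1; /* e.g. plain adapters like "i2c-0" have no HID part */
    len = (size_t)(colon - start);
    if (len + 1 > out_len)
        return -1;
    memcpy(out, start, len);
    out[len] = '\0';
    return 0;
}
```

Feeding it each entry name from /sys/bus/i2c/devices (for instance via opendir() and readdir()) gives the HIDs to look up in the sensor mapping list.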

We could really use help with people submitting drivers from there upstream. So if you have a laptop with a sensor which is not in the mainline but is available there, you know a bit of C-programming and you are willing to help, then please drop me an email so that we can work together to get the driver upstream.

1) on some ThinkPads the ov2740 sensor fails to start streaming most of the time. I plan to look into this next week and hopefully I can come up with a fix.

MIPI camera Integration work done for Fedora 41

After landing the kernel IPU6 CSI receiver and libcamera software ISP support upstream early in the Fedora 41 cycle, there still was a lot of work to do with regards to integrating this into the rest of the stack so that the cameras can actually be used outside of the qcam test app.

The whole stack looks like this: "kernel → libcamera → pipewire | pipewire-camera-consuming-app", where the 2 currently supported pipewire-camera consuming apps are Firefox and GNOME Snapshot.

Once this was all up and running testing found quite a few bugs which have all been fixed now:

  • Firefox showing 13 different cameras in its camera selection pulldown for a single IPU6 camera (fix).
  • Installing pipewire-plugin-libcamera leads to UVC cameras being powered on all the time causing significant battery drain (bug, bug, discussion, fix).
  • Pipewire does not always recognize cameras on login (bug, bug, bug, fix).
  • Pipewire fails to show cameras with relative controls (fix).
  • spa_libcamera_buffer_recycle sometimes fails, causing stream to freeze on first frame (bug, fix)
  • Firefox chooses bad default resolution of 640x480. I worked with Jan Grulich to get this fixed and this is fixed as of firefox-130.0.1-3.fc41. Thank you Jan!
  • Snapshot prefers 4:3 mode, e.g. 1280x1080 on 16:9 camera sensors capable of 1920x1080 (pending fix)
  • Added intel-vsc-firmware, pipewire-plugin-libcamera, libcamera-ipa to the Fedora 41 Workstation default package-set (pull, pull, pull)




October 01, 2024

Berlin Mini GUADEC 2024

It’s been over two months but I still haven’t gotten around to writing a blog post about this year’s Berlin Mini GUADEC. I still don’t have time to write a longer post, but instead of putting this off forever I thought I’d at least share a few photos.

Overall I think our idea of running this as a self-organized event worked out great. The community (both Berlin locals and other attendees) really came together to make it a success, despite the difficult circumstances. Thanks in particular to Jonas Dreßler for taking care of recording and streaming the talks, Ondřej Kolín and Andrei Zisu for keeping things on track during the event, and Sonny Piers for helping with various logistical things before the event.

 

 

Thanks to everyone who helped to make it happen, and see you next year!

September 27, 2024

Graphics improvements in WebKitGTK and WPEWebKit 2.46

WebKitGTK and WPEWebKit recently released a new stable version 2.46. This version includes important changes in the graphics implementation.

Skia

The most important change in 2.46 is the introduction of Skia to replace Cairo as the 2D graphics renderer. Skia supports rendering using the GPU, which is now the default, but we also use it for CPU rendering using the same threaded rendering model we had with Cairo. The architecture hasn’t changed much for GPU rendering: we use the same tiled rendering approach, but buffers for dirty regions are rendered in the main thread as textures. The compositor waits for textures to be ready using fences and copies them directly to the compositor texture. This was the simplest approach that already resulted in much better performance, especially on the desktop with more powerful GPUs. In embedded systems, where GPUs are not so powerful, it’s still better to use the CPU with several rendering threads in most cases. It’s still too early to announce anything, but we are already experimenting with different models to improve the performance even more and make better use of the GPU in embedded devices.

Skia has received several GCC-specific optimizations lately, but it’s always more optimized when built with clang. The optimizations are more noticeable in performance when using the CPU for rendering. For this reason, since version 2.46 we recommend building WebKit with clang for the best performance. GCC is still supported, of course, and performance when built with GCC is quite good too.

HiDPI

Even though there aren’t specific changes about HiDPI in 2.46, users of high resolution screens using a device scale factor bigger than 1 will notice much better performance thanks to scaling being a lot faster on the GPU.

Accelerated canvas

The 2D canvas can be accelerated independently of whether the CPU or the GPU is used for painting layers. In 2.46 there’s a new setting WebKitSettings:enable-2d-canvas-acceleration to control the 2D canvas acceleration. In some embedded devices the combination of CPU rendering for layer tiles and GPU for the canvas gives the best performance. The 2D canvas is normally rendered into an image buffer that is then painted in the layer as an image. We changed that for the accelerated case, so that the canvas is now rendered into a texture that is copied to a compositor texture to be directly composited instead of painted into the layer as an image. In 2.46 the offscreen canvas is enabled by default.

There are also cases where accelerating the canvas is not desired: for example, when the canvas is not big enough it’s faster to use the CPU, and the same applies when there are going to be many operations to “download” pixels from the GPU. Since this is not always easy to predict, in 2.46 we added support for the willReadFrequently canvas setting, so that when the application sets it when creating the canvas, the canvas is always unaccelerated.

Filters

All the CSS filters are now implemented using Skia APIs, and accelerated when possible. The most noticeable change here is that sites using blur filters are no longer slow.

Color spaces

Skia brings native support for color spaces, which allows us to greatly simplify the color space handling code in WebKit. WebKit uses color spaces in many scenarios, but especially in the case of SVG and filters. For some filters, color spaces are necessary as some operations are simpler to perform in linear sRGB. A good example of that is the feDiffuseLighting filter: it yielded wrong visual results for a very long time in the Cairo-based implementation, as Cairo doesn’t have support for color spaces. At some point, however, the Cairo-based WebKit implementation was fixed by converting pixels to linear in-place before applying the filter and converting pixels in-place back to sRGB afterwards. Such workarounds are not necessary anymore, as with Skia all the pixel-level operations are handled in a color-space-transparent way as long as proper color space information is provided. This not only impacts the results of some filters that are now correct, but also improves performance and opens new possibilities for acceleration.

Font rendering

Font rendering is probably the most noticeable visual change after the Skia switch with mixed feedback. Some people reported that several sites look much better, while others reported problems with kerning in other sites. In other cases it’s not really better or worse, it’s just that we were used to the way fonts were rendered before.

Damage tracking

WebKit already tracks the area of the layers that has changed to paint only the dirty regions. This means that we only repaint the areas that changed, but the compositor incorporates them and the whole frame is always composited and passed to the system compositor. In 2.46 there’s experimental code to track the damage regions and pass them to the system compositor in addition to the frame. Since this is experimental it’s disabled by default, but can be enabled with the runtime feature PropagateDamagingInformation. There’s also the UnifyDamagedRegions feature, which can be used in combination with PropagateDamagingInformation to unify the damage regions into one before passing it to the system compositor. We still need to analyze the impact of damage tracking on performance before enabling it by default. We have also started an experiment to use the damage information in the WebKit compositor and avoid compositing the entire frame every time.

GPU info

Working on graphics can be really hard on Linux: there are too many variables that can result in different outputs for different users, such as the driver version, the kernel version, the system compositor, the EGL extensions available, etc. When something doesn’t work for some people but works for others, it’s key for us to gather as much information as possible about the graphics stack. In 2.46 we have added more useful information to webkit://gpu, like the DMA-BUF buffer format and modifier used (for the GTK port, and for WPE when using the new API). Very often the symptom is the same, nothing is rendered in the web view, even though the causes can be very different. In those cases it’s even more difficult to gather the info, because webkit://gpu doesn’t render anything either. In 2.46 it’s possible to load webkit://gpu/stdout to get the information as JSON directly on stdout.

Sysprof

Another common symptom for people having problems is that a particular website is slow to render, while for others it works fine. In these cases, in addition to the graphics stack information, we need to figure out where we are slower and why. This is very difficult to fix when you can’t reproduce the problem. We added initial support for profiling in 2.46 using sysprof. The code already has some marks so that when run under sysprof we get useful information about timings of several parts of the graphics pipeline.

Next

This is just the beginning: we are already working on changes that will allow us to make better use of both the GPU and CPU for the best performance. We also have plans to make other changes in the graphics architecture to improve synchronization, latency and security. Now that we have adopted sysprof for profiling, we are also working on improvements and new tools.

New Cambalache Release 0.92.0!

I am pleased to announce a new Cambalache stable release, version 0.92.0!

This comes with two major dependency changes: the first is a very basic port to Adwaita, and the second is the replacement of webkit/broadway with a custom Wayland compositor widget based on wlroots.

What’s new:

    • Basic port to Adwaita
    • Use Casilda compositor widget for workspace
    • Update widget catalogs to SDK 47
    • Improved Drag&Drop support
    • Improve workspace performance
    • Enable workspace animations
    • Fix window ordering
    • Support new desktop dark style
    • Support 3rd party libraries
    • Streamline headerbar
    • Lots of bug fixes and minor improvements

Adwaita

The port to Adwaita gives Cambalache a new, modern look and enables dark mode support.
The headerbar is simplified, keeping only the most commonly used actions; everything else was moved to the main menu.

Cambalache editing Cambalache UI
Cambalache editing Cambalache UI in dark mode

Casilda Compositor

Up until this release, Cambalache showed windows from a different process in its workspace by running the broadwayd or gtk4-broadwayd backend, depending on the GTK version of your project, and using a WebView to connect to it and show the windows in an HTML canvas.

Workspace diagram using Broadwayd and WebKit WebView

All of this was replaced with a simple Wayland compositor widget, which reduces hard dependencies a lot.

On top of that we get all the optimizations from using Wayland instead of a protocol meant to go over the internet.

With Broadway, the client would render the window in memory, the Broadway backend would compress the image and send it over TCP to the WebView, which had to uncompress it and render it on an HTML5 canvas.

Now, the client just renders in shared memory, which is directly available to the compositor widget. This also leaves the option to further improve performance by adding support for dmabuf, which would allow offloading the composition to the host compositor, reducing the number of memory copies needed to show the windows on the screen.

This allowed me to re-enable GTK animations, since they no longer impact the workspace performance.

Special thanks to emersion, kennylevinsen, vyivel and the wlroots community for their support and awesome project, I would not have been able to do this without wlroots and their help.

You can read more about Casilda in my previous post

3rd party libraries

Cambalache now loads 3rd party catalogs from GLib.get_system_data_dirs()/cambalache/catalogs and ~/.cambalache/catalogs.
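
For a feel of which directories that expands to, here is a small Python sketch that mimics the lookup without depending on GLib. It assumes the usual Unix behaviour of GLib.get_system_data_dirs() (the XDG_DATA_DIRS variable, falling back to /usr/local/share/ and /usr/share/). The catalog_dirs helper and the order in which directories are listed are hypothetical, not taken from Cambalache.

```python
from pathlib import PurePosixPath
from typing import List, Mapping

# Fallback used when XDG_DATA_DIRS is unset or empty.
DEFAULT_DATA_DIRS = "/usr/local/share/:/usr/share/"


def catalog_dirs(env: Mapping[str, str], home: PurePosixPath) -> List[PurePosixPath]:
    """List the directories where 3rd party catalogs would be looked up."""
    raw = env.get("XDG_DATA_DIRS") or DEFAULT_DATA_DIRS
    system = [
        PurePosixPath(entry) / "cambalache" / "catalogs"
        for entry in raw.split(":")
        if entry  # skip empty entries such as a trailing ':'
    ]
    # The per-user location named in the text comes last in this sketch.
    return system + [home / ".cambalache" / "catalogs"]
```

Passing os.environ and the user's home directory gives the list for the current session.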

These catalog files are generated from Gir data with a new tool bundled in Cambalache called cmb-catalog-gen. This used to be an internal tool and still lacks proper documentation, but you can see an example of how it’s used internally here

So what is a catalog anyway?

A catalog is an XML file with all the necessary data for Cambalache to produce UI files with widgets from a particular library. This includes the different GTypes with their properties, signals and everything else except the actual object implementations.

Runtime objects are created in the workspace by loading the GI namespace specified in the catalog.

Feel free to contact me on matrix if you are interested in adding support for a 3rd party library.

Improved Drag&Drop

After the extensive rework done porting the main widget hierarchy from GtkTreeView to GtkColumnView, and implementing several GListModel interfaces to avoid maintaining multiple lists, I was able to reimplement and extend the Drag&Drop code, so now it’s possible to drop widgets in different parents.

Data Model

History handling for Undo/Redo was simplified from multiple history tables (one per table tracked) into one history table by adding a few extra columns to store data changes in JSON format.

CREATE TABLE history (
  history_id INTEGER PRIMARY KEY,
  command TEXT NOT NULL,
  range_id INTEGER REFERENCES history,
  table_name TEXT,
  column_name TEXT,
  message TEXT,
+  table_pk JSON,
+  new_values JSON,
+  old_values JSON
);

This is the current history table. Entries are populated automatically by triggers each time something in the project is created, changed or removed.

This data is then used to implement Undo and Redo commands.
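
To illustrate how a single JSON-based history table can drive Undo, here is a small self-contained Python sketch using the standard sqlite3 module. It is not Cambalache's code: the object table, the update_name function (which stands in for the triggers and records the change by hand) and undo_last are all hypothetical, and only UPDATE commands are handled.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE object (object_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE history (
  history_id INTEGER PRIMARY KEY,
  command TEXT NOT NULL,
  range_id INTEGER REFERENCES history,
  table_name TEXT,
  column_name TEXT,
  message TEXT,
  table_pk JSON,
  new_values JSON,
  old_values JSON
);
""")


def update_name(object_id, new_name):
    """Change a name and record old/new values, as a trigger would."""
    (old_name,) = db.execute(
        "SELECT name FROM object WHERE object_id = ?", (object_id,)
    ).fetchone()
    db.execute("UPDATE object SET name = ? WHERE object_id = ?",
               (new_name, object_id))
    db.execute(
        "INSERT INTO history (command, table_name, column_name,"
        " table_pk, new_values, old_values) VALUES (?, ?, ?, ?, ?, ?)",
        ("UPDATE", "object", "name",
         json.dumps({"object_id": object_id}),
         json.dumps({"name": new_name}),
         json.dumps({"name": old_name})),
    )


def undo_last():
    """Revert the most recent history entry; return False if there is none."""
    row = db.execute(
        "SELECT history_id, command, table_name, table_pk, old_values"
        " FROM history ORDER BY history_id DESC LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    history_id, command, table, pk_json, old_json = row
    if command != "UPDATE":
        raise NotImplementedError(command)
    pk, old = json.loads(pk_json), json.loads(old_json)
    # Identifiers come from our own history rows, never from user input.
    assignments = ", ".join(f"{column} = ?" for column in old)
    where = " AND ".join(f"{column} = ?" for column in pk)
    db.execute(f"UPDATE {table} SET {assignments} WHERE {where}",
               [*old.values(), *pk.values()])
    # Simplification: dropping the row means this sketch cannot Redo.
    db.execute("DELETE FROM history WHERE history_id = ?", (history_id,))
    return True
```

Because table_pk and old_values are plain JSON objects, the same undo_last works for any tracked table without a dedicated history table per table, which is the simplification described above. Supporting Redo would mean keeping the row and replaying new_values instead of deleting it.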

Where to get it?

You can get it from Flathub

flatpak remote-add --if-not-exists flathub https://dl.flathub.org/repo/flathub.flatpakrepo

flatpak install flathub ar.xjuan.Cambalache

or directly from gitlab

git clone https://gitlab.gnome.org/jpu/cambalache.git

Matrix channel

Have any questions? Come chat with us at #cambalache:gnome.org

Mastodon

Follow me on Mastodon @xjuan to get news related to Cambalache development.

Happy coding!

September 26, 2024

needed-bits optimizations in guile

Hey all, I had a fun bug this week and want to share it with you.

numbers and representations

First, though, some background. Guile’s numeric operations are defined over the complex numbers, not over e.g. a finite field of integers. This is generally great when writing an algorithm, because you don’t have to think about how the computer will actually represent the numbers you are working on.

In practice, Guile will represent a small exact integer as a fixnum, which is a machine word with a low-bit tag. If an integer doesn’t fit in a word (minus space for the tag), it is represented as a heap-allocated bignum. But sometimes the compiler can realize that e.g. the operands to a specific bitwise-and operation are within (say) the 64-bit range of unsigned integers, and so therefore we can use unboxed operations instead of the more generic functions that do run-time dispatch on the operand types, and which might perform heap allocation.

Unboxing is important for speed. It’s also tricky: under what circumstances can we do it? In the example above, there is information that flows from defs to uses: the operands of logand are known to be exact integers in a certain range and the operation itself is closed over its domain, so we can unbox.

But there is another case in which we can unbox, in which information flows backwards, from uses to defs: if we see (logand n #xff), we know:

  • the result will be in [0, 255]

  • that n will be an exact integer (or an exception will be thrown)

  • we are only interested in a subset of n’s bits.

Together, these observations let us transform the more general logand to an unboxed operation, having first truncated n to a u64. And actually, the information can flow from use to def: if we know that n will be an exact integer but don’t know its range, we can transform the potentially heap-allocating computation that produces n to instead truncate its result to the u64 range where it is defined, instead of just truncating at the use; and potentially this information could travel farther up the dominator tree, to inputs of the operation that defines n, their inputs, and so on.

needed-bits: the |0 of scheme

Let’s say we have a numerical operation that produces an exact integer, but we don’t know the range. We could truncate the result to a u64 and use unboxed operations, if and only if only the low 64 bits of the result are ever used. So we need to compute, for each variable in a program, what bits are needed from it.
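
The reason truncation is safe here can be checked with a few lines of Python, whose integers are arbitrary-precision like Guile's. This is only a sketch of the arithmetic fact, not of Guile's compiler; the function names are made up. It covers addition and multiplication. The same holds for subtraction and bitwise operations, but not for right shifts or division, where high bits influence low ones.

```python
U64_MASK = (1 << 64) - 1


def u64(x: int) -> int:
    """Truncate an arbitrary-precision integer to its low 64 bits."""
    return x & U64_MASK


def low_byte_exact(a: int, b: int) -> int:
    """Compute with bignums and keep only the low byte of the result."""
    return (a * b + a) & 0xFF


def low_byte_truncated(a: int, b: int) -> int:
    """Same computation, truncating inputs and intermediates to u64."""
    return u64(u64(a) * u64(b) + u64(a)) & 0xFF
```

Both functions agree for every pair of integers, including negative ones and values far beyond 64 bits, because the low bits of a sum or product depend only on the low bits of the operands.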

I think this is generally known as a needed-bits analysis, though both Google and my textbooks are failing me at the moment; perhaps this is because dynamic languages and flow analysis don’t get so much attention these days. Anyway, the analysis can be local (within a basic block), global (all blocks in a function), or interprocedural (larger than a function). Guile’s is global. Each CPS/SSA variable in the function starts as needing 0 bits. We then compute the fixpoint of visiting each term in the function; if a term causes a variable to flow out of the function, for example via return or call, the variable is recorded as needing all bits, as is also the case if the variable is an operand to some primcall that doesn’t have a specific needed-bits analyser.

Currently, only logand has a needed-bits analyser, and this is because sometimes you want to do modular arithmetic, for example in a hash function. Consider Bob Jenkins’ lookup3 string hash function:

#define rot(x,k) (((x)<<(k)) | ((x)>>(32-(k))))
#define mix(a,b,c) \
{ \
  a -= c;  a ^= rot(c, 4);  c += b; \
  b -= a;  b ^= rot(a, 6);  a += c; \
  c -= b;  c ^= rot(b, 8);  b += a; \
  a -= c;  a ^= rot(c,16);  c += b; \
  b -= a;  b ^= rot(a,19);  a += c; \
  c -= b;  c ^= rot(b, 4);  b += a; \
}
...

If we transcribe this to Scheme, we get something like:

(define (jenkins-lookup3-hashword2 str)
  (define (u32 x) (logand x #xffffFFFF))
  (define (shl x n) (u32 (ash x n)))
  (define (shr x n) (ash x (- n)))
  (define (rot x n) (logior (shl x n) (shr x (- 32 n))))
  (define (add x y) (u32 (+ x y)))
  (define (sub x y) (u32 (- x y)))
  (define (xor x y) (logxor x y))

  (define (mix a b c)
    (let* ((a (sub a c)) (a (xor a (rot c 4)))  (c (add c b))
           (b (sub b a)) (b (xor b (rot a 6)))  (a (add a c))
           (c (sub c b)) (c (xor c (rot b 8)))  (b (add b a))
           ...)
      ...))

  ...

These u32 calls are like the JavaScript |0 idiom, to tell the compiler that we really just want the low 32 bits of the number, as an integer. Guile’s compiler will propagate that information down to uses of the defined values but also back up the dominator tree, resulting in unboxed arithmetic for all of these operations.

(When writing this, I got all the way here and then realized I had already written quite a bit about this, almost a decade ago. Oh well, consider this your lucky day, you get two scoops of prose!)

the bug

All that was just prelude. So I said that needed-bits is a fixed-point flow analysis problem. In this case, I want to compute, for each variable, what bits are needed for its definition. Because of loops, we need to keep iterating until we have found the fixed point. We use a worklist to represent the conts we need to visit.

Visiting a cont may cause the program to require more bits from the variables that cont uses. Consider:

(define-significant-bits-handler
    ((logand/immediate label types out res) param a)
  (let ((sigbits (sigbits-intersect
                   (inferred-sigbits types label a)
                   param
                   (sigbits-ref out res))))
    (intmap-add out a sigbits sigbits-union)))

This is the sigbits (needed-bits) handler for logand when one of its operands (param) is a constant and the other (a) is variable. It adds an entry for a to the analysis out, which is an intmap from variable to a bitmask of needed bits, or #f for all bits. If a already has some computed sigbits, we add to that set via sigbits-union. The interesting point comes in the sigbits-intersect call: the bits that we will need from a are first the bits that we infer a to have, by forward type-and-range analysis; intersected with the bits from the immediate param; intersected with the needed bits from the result value res.
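
A rough Python rendering of the same idea may help if Scheme isn't your thing. None plays the role of #f (all bits needed), and a plain dict stands in for the intmap. The default of 0 for variables not yet in the analysis is an assumption based on variables starting as needing 0 bits, and visit_logand_immediate is a hypothetical name, not Guile's API.

```python
from typing import Dict, Optional

Sigbits = Optional[int]  # None means "all bits are needed" (Guile's #f)


def sigbits_union(x: Sigbits, y: Sigbits) -> Sigbits:
    """Bits needed by either use; 'all bits' absorbs everything."""
    if x is None or y is None:
        return None
    return x | y


def sigbits_intersect(*masks: Sigbits) -> Sigbits:
    """Bits allowed by every mask; 'all bits' constrains nothing."""
    result: Sigbits = None
    for mask in masks:
        if mask is None:
            continue
        result = mask if result is None else result & mask
    return result


def visit_logand_immediate(out: Dict[str, Sigbits], a: str, res: str,
                           param: int, inferred_a: Sigbits) -> bool:
    """Update the bits needed from `a` for res = (logand a param).

    Returns True when the analysis changed, i.e. when more work is needed.
    """
    needed = sigbits_intersect(inferred_a, param, out.get(res, 0))
    updated = sigbits_union(out[a], needed) if a in out else needed
    changed = a not in out or out[a] != updated
    out[a] = updated
    return changed
```

For example, if the result is needed in full and the immediate is #xff, then only the low 8 bits of a are needed. If the result itself is only needed for its low 4 bits, a shrinks to those 4 bits as well.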

If the intmap-add call is idempotent—i.e., out already contains sigbits for a—then out is returned as-is. So we can check for a fixed-point by comparing out with the resulting analysis, via eq?. If they are not equal, we need to add the cont that defines a to the worklist.

The bug? The bug was that we were not enqueuing the def of a, but rather the predecessors of label. This works when there are no cycles, provided we visit the worklist in post-order; and regardless, it works for many other analyses in Guile where we compute, for each labelled cont (basic block), some set of facts about all other labels or about all other variables. In that case, enqueuing a predecessor on the worklist will cause all nodes up to and including the variable’s definition to be visited, because each step adds more information (relative to the analysis computed on the previous visit). But it doesn’t work for this case, because we aren’t computing a per-label analysis.

The solution was to rewrite that particular fixed-point to enqueue labels that define a variable (possibly multiple defs, because of joins and loop back-edges), instead of just the predecessors of the use.
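
Here is a toy Python model of the difference between the two enqueuing strategies. It is a sketch under heavy assumptions, not Guile's implementation: the four-term straight-line program, the PREDS and DEFS tables and the needed_bits function are all invented, and the worklist is deliberately visited in forward order so that the stale result shows up even without a loop.

```python
from collections import deque

ALL = None  # stands in for Guile's #f: every bit is needed


def union(x, y):
    return ALL if x is ALL or y is ALL else x | y


def intersect(x, y):
    if x is ALL:
        return y
    if y is ALL:
        return x
    return x & y


# label -> (op, dst, src, immediate)
PROGRAM = {
    0: ("logand", "x", "n", 0xFF),   # x = (logand n #xff)
    1: ("nop", None, None, None),    # unrelated term between def and use
    2: ("logand", "y", "x", 0x0F),   # y = (logand x #x0f)
    3: ("return", None, "y", None),  # y escapes: all of its bits are needed
}
PREDS = {0: [], 1: [0], 2: [1], 3: [2]}
DEFS = {"x": [0], "y": [2]}  # n is a parameter, so it has no defining term


def needed_bits(enqueue="defs"):
    """Backward needed-bits fixpoint; `enqueue` picks what to revisit."""
    out = {}
    worklist = deque(sorted(PROGRAM))
    while worklist:
        label = worklist.popleft()
        op, dst, src, imm = PROGRAM[label]
        if op == "nop":
            continue
        needed = ALL if op == "return" else intersect(imm, out.get(dst, 0))
        new = union(out[src], needed) if src in out else needed
        if src in out and out[src] == new:
            continue  # nothing changed, nothing to revisit
        out[src] = new
        if enqueue == "defs":
            worklist.extend(DEFS.get(src, []))  # the fix: revisit src's defs
        else:
            worklist.extend(PREDS[label])       # the bug: revisit predecessors
    return out
```

With enqueue="defs" the analysis finds that n, like x, is needed for its low 4 bits. With enqueue="preds" the revisit stops at the unrelated term, which changes nothing, so the term defining x is never reconsidered and n is left at 0 needed bits.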

Et voilà ! If you got this far, bravo. Type at y’all again soon!

September 25, 2024

GUADEC 2024

GUADEC was in Denver this year! I meant to write an update right after the conference, but Real Life™ got in the way and it took a while to finish this post. I finally found a little spare time to collect my thoughts and finish writing this.

It was a smaller crowd than normal this year. There were ~100 people registered, though unfortunately a number of people were unable to make it at the last minute due to CrowdStrike- and visa-related issues.

Denver City Hall

I gave two talks: Crosswords, Year Three (slides) and a spur-of-the-moment lightning talk on development docs. The first talk was nominally about authoring crosswords, but I also presented the architecture we used to create the game. Although rushed, I hope I got most of the points about our design across. It’s definitely worth a full blog post at a future date.

Other highlights of the conference included Martin’s very funny (and brave) live demo of gameeky, Scott’s talk about being bold with design, the AGM, and a fabulous Thunderbird keynote about the power of money. That last one spurred conversations about putting a fundraising request popup in GNOME itself to raise funds. The yearly popup in Thunderbird appears to continue being wildly successful. Since GUADEC, I see that KDE has attempted to do that as well. I’d love for GNOME to do something similar. Maybe this is something the new board can pick up.

Original Nikola Tesla generator in the Tivoli Brewing Co.

It was a very chill GUADEC, and I enjoyed the change of pace. I had never spent time in Denver (other than at the airport), and found it to be a surprisingly intimate city with a very walkable downtown. The venue was absolutely fabulous. Every conference should have a pub on-site, and the Tivoli Brewing Co definitely surpassed expectations. It even has an original Nikola Tesla generator in its basement.

Reflections

It was really nice having GUADEC relatively close to me for once. There was a different crowd than normal: there were long-time GNOME people I haven’t seen in a very long time (Hi Owen, Behdad, and Michael!) as well as numerous new folks (welcome Richard!). Holding it in North America opened us up to different contributors, and maybe let us reengage with long-time gnomies.

On the other hand, it did feel like the community was split. Sam said it extremely well in his blog post.

Let’s not pretend that a video conference or a hybrid BOF is the same as an in-person meetup. Once you’ve sung karaoke with someone, or explored Meow Wolf, or camped in the desert in Utah together, your relationship is richer than when you only interacted via Gitlab pull requests and BigBlueButton. You have more empathy and you can resolve conflicts better.

Fragmentation is always a danger with distributed endeavors and any group bigger than two will have politics, but it feels like our best tool to deal with those issues is fragmenting too.

Personally, as someone who has schlepped across the Atlantic for over two decades to meet with other folks, it doesn’t feel great to have comparatively few people come the other direction. There are plenty of good individual decisions that lead to this, but collectively it felt like a misfire.

I also really appreciate the commitment of our South American / Asian / African developers who have tough travel routes to get to the Euro/American events.

The first GUADEC poster

In some sense, it feels like we’ve gone full-circle. When GNOME started, development was strongly centered in North America. The GIMP started in Berkeley, and GNOME itself was founded in Mexico, and there were quite a few other pockets of GNOME activity (Boston, North Carolina, etc). Proportionally, Europe was underrepresented — so GUADEC was proposed as a way to build a European community. It took sustained engagement to build it up. Twenty-four years on, it appears we need to do the reverse.

What’s next? Well for me, it’s time to look more local. We used to have a Bay Area GNOME community and it has fallen on hard times. Maybe it’s worth trying to push some local enthusiasm. If you’re a Bay Area GNOME person, drop me a note. We should hold a release party!

Nonograms

While in Denver, ptomato and I nerd-sniped each other into writing a nonogram game. Nonograms are a popular puzzle-type, and are quite common on existing mobile platforms. Conceptually, they’re pen-and-paper grid-based games and could easily be implemented as an .ipuz extension.

I’ve been slowly changing the libipuz API over the summer to work with gobject-introspection, and was excited at the chance to get someone to test it out. Meanwhile, Philip had been wanting to write an app with TypeScript. So, I sketched out an extension and put together an API for Philip to use. With a little back-and-forth, he got something to render. Exciting!

Nonogram

I don’t think it is playable yet but it’s lovely to see the potential emerging. There’s a lot of great pixel art floating around GNOME. Some of it might make the basis for a really fun nonogram game.

As a bonus, Philip has been experimenting with using the stateless design we use in Crosswords. I’m hoping he’ll be able to provide additional validation and feedback to our architectural approach.

September 23, 2024

GNOME 47 Wallpapers

With GNOME 47 out, it’s time for my bi-annual wallpaper deep dive. For many, these may seem like simple background images, but GNOME wallpapers are the visual anchors of the project, defining its aesthetic and identity. The signature blue wallpaper with its dark top bar remains a key part of that.

GNOME 47 Wallpapers

In this release, GNOME 47 doesn’t overhaul the default blue wallpaper. It’s more of a subtle tweak than a full redesign. The familiar rounded triangles remain, but here’s something neat: the dark variant mimics real-world camera behavior. When it’s darker, the camera’s aperture widens, creating a shallower depth of field. A small but nice touch for those who notice these things.

The real action this cycle, though, is in the supplemental wallpapers.

We haven’t had to remove much this time around, thanks to the JXL format keeping file sizes manageable. The focus has been on variety rather than cutting old designs. We aim to keep things fresh, though you might notice that photographic wallpapers are still missing (we’ll get to that eventually, promise).

In terms of fine-tuning, the classic Pixels has been updated to feature newer apps from GNOME Circle.

The dark variant of Pills also got some love with lighting and shading tweaks, including a subtle subsurface scattering effect.

As for the new wallpapers, there are a few cool additions this release. I collaborated with Dominik Baran to create a tube-map-inspired vector wallpaper, which I’m particularly into. There’s also Mollnar, a nod to Vera Molnar, using simple geometric shapes in SVG format.

Most of our wallpapers are still bitmaps, largely because our rendering tools don’t yet handle color banding well with vectors. For now, even designs that would work better as vectors—like mesh gradients—get converted to bitmaps.

We’ve introduced some new abstract designs as well – meet Sheet and Swoosh. And for fans of pixel art, we’ve added LCD and its colorful sibling, LCD-rainbow. Both give off that retro screen vibe, even if the color gradient realism isn’t real-world accurate.

Lastly, there’s Symbolic Soup, which is, well… a bit chaotic. It might not be everyone’s cup of tea, but it definitely adds variety.

Preview

LCD, Pills, Map, Mollnar, LCD Rainbow, Pixels, Sheet, Swoosh, Symbolic Soup

If you’re wondering about the strange square aspect ratio, take a look at the wallpaper sizing guide in our GNOME Interface Guidelines.

Also worth noting is the fact that all of these wallpapers have been created by humans. While I’ve experimented with image generation for some parts of the workflow in some of my personal projects, all this work is AIgen-free and explicitly credited.

It’s like cp -R but for your GUI

As a JavaScript engine developer at Igalia I don’t find myself writing much plain C code anymore. I’m either writing JS or TypeScript, or hacking on large compiler codebases in C++ [1], or writing ECMAScript specification language. Frankly, that is fine with me. C’s time may not be over yet, but I wouldn’t be sad if I never had to write another line of it. (Hopefully this post conveys why.)

However, while working on modernizing an app written in C for the GNOME platform, that I hack on in my spare time, I wanted to copy a folder recursively using the GIO async APIs. Like cp -R at the shell, but without freezing up your GUI while it works.

C’s callback style for async programming, combined with lack of capturing variables in closures, is like going back to the dark ages if you’ve gotten used to languages with async/await style or even C++’s lambdas. I would’ve avoided writing this if I could, but apparently no one else had done it publicly on the internet that I could find. [2] So here it is for your enjoyment.

#include <stdbool.h>
#include <gio/gio.h>

typedef struct {
	GFile *dest_folder;
	GQueue *files_to_copy;
	GQueue *folders_to_copy;
	GFileCopyFlags flags;
} CopyRecursiveClosure;

/* Pre-declare so we can read them in the order they are executed: */
static void on_recursive_make_dir_finish(GFile *file, GAsyncResult *res, GTask *data);
static void on_recursive_file_enumerate_finish(GFile* file, GAsyncResult *res, GTask *data);
static void on_recursive_file_next_files_finish(GFileEnumerator *children, GAsyncResult *res, GTask *data);
static void copy_file_queue_async(GTask *task);
static void on_recursive_file_copy_finish(GFile *file, GAsyncResult *result, GTask *data);
static void on_recursive_folder_copy_finish(GFile *file, GAsyncResult *result, GTask *data);
static void copy_folder_queue_async(GTask *task);
static void copy_recursive_closure_free(CopyRecursiveClosure *ptr);

/**
 * copy_recursive_async:
 * @src: The source folder
 * @dest: Destination folder in which to place the copy of @src
 * @flags: #GFileCopyFlags to apply to copy operations
 * @prio: I/O priority, e.g. #G_PRIORITY_DEFAULT
 * @cancel: #GCancellable that will interrupt the operation when triggered
 * @done_cb: Function to call when the operation is finished
 * @data: Pointer to pass to @done_cb
 *
 * Copy the folder @src and all of the files and subfolders in it into the
 * folder @dest, asynchronously.
 *
 * The only @flags supported are #G_FILE_COPY_NONE and #G_FILE_COPY_OVERWRITE.
 */
void
copy_recursive_async(GFile *src, GFile *dest, GFileCopyFlags flags, int prio, GCancellable *cancel,
	GAsyncReadyCallback done_cb, void *data)
{
	g_return_if_fail(G_IS_FILE(src));
	g_return_if_fail(G_IS_FILE(dest));
	g_return_if_fail(flags == G_FILE_COPY_NONE || flags == G_FILE_COPY_OVERWRITE);
	g_return_if_fail(!cancel || G_IS_CANCELLABLE(cancel));

	g_autoptr(GTask) task = g_task_new(src, cancel, done_cb, data);
	g_task_set_priority(task, prio);

	CopyRecursiveClosure *task_data = g_new0(CopyRecursiveClosure, 1);
	g_autofree char *basename = g_file_get_basename(src);
	task_data->dest_folder = g_file_get_child(dest, basename);
	task_data->files_to_copy = g_queue_new();
	task_data->folders_to_copy = g_queue_new();
	task_data->flags = flags;
	g_task_set_task_data(task, task_data, (GDestroyNotify)copy_recursive_closure_free);

	g_file_make_directory_async(task_data->dest_folder, prio, cancel,
		(GAsyncReadyCallback)on_recursive_make_dir_finish, g_steal_pointer(&task));
}

/**
 * copy_recursive_finish:
 * @src: The source folder
 * @result: The #GAsyncResult passed to the callback
 * @error_out: (nullable): Return location for a #GError
 *
 * Complete the asynchronous copy operation started by copy_recursive_async().
 *
 * Returns: %TRUE if the operation completed successfully, %FALSE on error.
 */
bool
copy_recursive_finish(GFile *src, GAsyncResult *result, GError **error_out)
{
	g_return_val_if_fail(G_IS_FILE(src), false);
	g_return_val_if_fail(G_IS_TASK(result), false);
	g_return_val_if_fail(g_task_is_valid(result, src), false);

	return g_task_propagate_boolean(G_TASK(result), error_out);
}

static void
on_recursive_make_dir_finish(GFile *file, GAsyncResult *result, GTask *task_ptr)
{
	g_autoptr(GTask) task = g_steal_pointer(&task_ptr);
	g_autoptr(GError) error = NULL;
	GCancellable *cancel = g_task_get_cancellable(task);
	int prio = g_task_get_priority(task);

	if (!g_file_make_directory_finish(G_FILE(file), result, &error)) {
		/* With the OVERWRITE flag, don't error out when the folder already
		 * exists. (Hopefully plopping all the files in the existing folder is
		 * sufficient. If not, another way to do this would be to delete the
		 * existing folder recursively, so that extra existing files not in the
		 * source don't remain in the destination.) */
		CopyRecursiveClosure *data = g_task_get_task_data(task);
		bool overwrite = !!(data->flags & G_FILE_COPY_OVERWRITE);
		if (!overwrite || !g_error_matches(error, G_IO_ERROR, G_IO_ERROR_EXISTS)) {
			g_autofree char *path = g_file_get_path(file);
			g_task_return_prefixed_error(task, g_steal_pointer(&error),
				"Error creating destination folder %s: ", path);
			return;
		}
	}

	GFile *src = g_task_get_source_object(task);
	g_file_enumerate_children_async(src, "standard::*", G_FILE_QUERY_INFO_NONE, prio, cancel,
		(GAsyncReadyCallback)on_recursive_file_enumerate_finish, g_steal_pointer(&task));
}

static void
on_recursive_file_enumerate_finish(GFile *file, GAsyncResult *result, GTask *task_ptr)
{
	g_autoptr(GTask) task = g_steal_pointer(&task_ptr);
	g_autoptr(GError) error = NULL;
	GCancellable *cancel = g_task_get_cancellable(task);
	int prio = g_task_get_priority(task);

	g_autoptr(GFileEnumerator) children = g_file_enumerate_children_finish(G_FILE(file), result, &error);
	if (!children) {
		g_autofree char *path = g_file_get_path(file);
		g_task_return_prefixed_error(task, g_steal_pointer(&error),
			"Error reading folder %s: ", path);
		return;
	}

	g_file_enumerator_next_files_async(children, 10, prio, cancel,
		(GAsyncReadyCallback)on_recursive_file_next_files_finish, g_steal_pointer(&task));
}

static void
on_recursive_file_next_files_finish(GFileEnumerator *children, GAsyncResult *result, GTask *task_ptr)
{
	g_autoptr(GTask) task = g_steal_pointer(&task_ptr);
	g_autoptr(GError) error = NULL;
	GCancellable *cancel = g_task_get_cancellable(task);
	int prio = g_task_get_priority(task);

	g_autolist(GFileInfo) next_files = g_file_enumerator_next_files_finish(children, result, &error);
	if (error) {
		g_autofree char *path = g_file_get_path(g_file_enumerator_get_container(children));
		g_task_return_prefixed_error(task, g_steal_pointer(&error),
			"Error reading files from folder %s: ", path);
		return;
	}

	CopyRecursiveClosure *data = g_task_get_task_data(task);

	if (next_files) {
		for (GList *iter = next_files; iter != NULL; iter = g_list_next(iter)) {
			GFileInfo *info = G_FILE_INFO(iter->data);
			GFileType type = g_file_info_get_file_type(info);
			g_autoptr(GFile) file = g_file_enumerator_get_child(children, info);
			switch (type) {
				case G_FILE_TYPE_DIRECTORY:
					g_queue_push_tail(data->folders_to_copy, g_steal_pointer(&file));
					break;
				case G_FILE_TYPE_REGULAR:
					g_queue_push_tail(data->files_to_copy, g_steal_pointer(&file));
					break;
				default:
					g_warning("Unhandled file type %d in recursive copy: %s", type, g_file_info_get_name(info));
					continue;
			}
		}

		g_file_enumerator_next_files_async(children, 10, prio, cancel,
			(GAsyncReadyCallback)on_recursive_file_next_files_finish, g_steal_pointer(&task));

		return;
	}

	copy_file_queue_async(g_steal_pointer(&task));
}

static void
copy_file_queue_async(GTask *task_ptr)
{
	g_autoptr(GTask) task = task_ptr;
	CopyRecursiveClosure *data = g_task_get_task_data(task);

	g_autoptr(GFile) file = g_queue_pop_head(data->files_to_copy);
	if (file) {
		GCancellable *cancel = g_task_get_cancellable(task);
		int prio = g_task_get_priority(task);

		g_autofree char *basename = g_file_get_basename(file);
		g_autoptr(GFile) dest = g_file_get_child(data->dest_folder, basename);
		g_file_copy_async(file, dest, data->flags, prio, cancel,
			/* progress_callback = */ NULL, NULL,
			(GAsyncReadyCallback)on_recursive_file_copy_finish, g_steal_pointer(&task));
		return;
	}

	copy_folder_queue_async(g_steal_pointer(&task));
}

static void
on_recursive_file_copy_finish(GFile *file, GAsyncResult *result, GTask *task_ptr)
{
	g_autoptr(GTask) task = task_ptr;
	g_autoptr(GError) error = NULL;
	if (!g_file_copy_finish(file, result, &error)) {
		g_autofree char *path = g_file_get_path(file);
		g_task_return_prefixed_error(task, g_steal_pointer(&error),
			"Error copying file %s: ", path);
		return;
	}
	copy_file_queue_async(g_steal_pointer(&task));
}

static void
copy_folder_queue_async(GTask *task_ptr)
{
	g_autoptr(GTask) task = task_ptr;
	CopyRecursiveClosure *data = g_task_get_task_data(task);

	g_autoptr(GFile) folder = g_queue_pop_head(data->folders_to_copy);
	if (folder) {
		GCancellable *cancel = g_task_get_cancellable(task);
		int prio = g_task_get_priority(task);

		copy_recursive_async(folder, data->dest_folder, data->flags, prio, cancel,
			(GAsyncReadyCallback)on_recursive_folder_copy_finish, g_steal_pointer(&task));
		return;
	}

	g_task_return_boolean(task, true);
}

static void
on_recursive_folder_copy_finish(GFile *folder, GAsyncResult *result, GTask *task_ptr)
{
	g_autoptr(GTask) task = task_ptr;
	g_autoptr(GError) error = NULL;
	if (!copy_recursive_finish(folder, result, &error)) {
		g_autofree char *path = g_file_get_path(folder);
		g_task_return_prefixed_error(task, g_steal_pointer(&error),
			"Error copying folder %s: ", path);
		return;
	}
	copy_folder_queue_async(g_steal_pointer(&task));
}

static void
copy_recursive_closure_free(CopyRecursiveClosure *ptr) {
	g_object_unref(ptr->dest_folder);
	g_queue_free_full(ptr->files_to_copy, g_object_unref);
	g_queue_free_full(ptr->folders_to_copy, g_object_unref);
	g_free(ptr);
}

You are welcome to take this code and customize it to your needs. I’m putting it into the public domain so hopefully nobody else has to go through this.

Although if you really want to, it could be improved by implementing progress callbacks like g_file_copy_async() has.

Just so you can understand what’s going on at a glance, here’s what it would look like in about 30 lines of JavaScript, with async/await style:

async function copyRecursive(src, dest, flags, prio, cancel) {
  const destFolder = dest.get_child(src.get_basename());
  const overwrite = !!(flags & Gio.FileCopyFlags.OVERWRITE);

  try {
    await destFolder.make_directory_async(prio, cancel);
  } catch (error) {
    if (!overwrite || !error.matches(Gio.IOErrorEnum, Gio.IOErrorEnum.EXISTS))
      throw error;
  }

  const children = await src.enumerate_children_async('standard::*',
    Gio.FileQueryInfoFlags.NONE, prio, cancel);
  let nextFiles;
  const filesToCopy = [];
  const foldersToCopy = [];
  while ((nextFiles = await children.next_files_async(10, prio, cancel)).length) {
    const {
      [Gio.FileType.REGULAR]: files,
      [Gio.FileType.DIRECTORY]: folders,
    } = Object.groupBy(nextFiles, info => info.get_file_type()); 
    foldersToCopy.push(...folders?.map(info => children.get_child(info)) ?? []);
    filesToCopy.push(...files?.map(info => children.get_child(info)) ?? []);
  }

  for (const file of filesToCopy) {
    const dest = destFolder.get_child(file.get_basename());
    await file.copy_async(dest, flags, prio, cancel, null, null);
  }

  for (const folder of foldersToCopy)
    await copyRecursive(folder, destFolder, flags, prio, cancel);
}

(This excludes the imports and calls to Gio._promisify that you would have to do; hopefully we’ll get native async operations in GNOME 48!)

[1] C++ before C++11 used to be a worse experience than C. However, I don’t have to deal with that because the three major JS engines use C++17. It’s … its own category of special, but better.

[2] No, ChatGPT couldn’t do it either; it made up GIO APIs that don’t exist. If that programming technique is on the table, then sure, it’d have been a lot easier.

September 22, 2024

How to fork: Best practices and guide

New raw thumbnailer

I have resurrected my old camera raw thumbnailer so that I can browse directories full of camera raw images in Nautilus. This is version 47.0.1, because GNOME 47 is out.

Like the old one, it uses libopenraw to extract the previews from the raw files.

But it now supports more raw formats, and if needed it will render the raw image to generate a preview, as it has to for my old Ricoh GR Digital II images (the one from 2007). This leverages libopenraw 0.4.0 (still in alpha stage), which has been rewritten in Rust.

Sadly, the only good way to get it into the hands of users is a distribution package. At the time of writing there is none, but I put together something that allowed me to build a package for Fedora 40 to install on my big rig.

If you feel like it, you can download the source code from GNOME, and the repository is on GNOME GitLab.

This is how it looks with Nautilus 46 with a bunch of images from my old Olympus E-P1:

Nautilus directory view with ORF images thumbnails

September 20, 2024

Image Viewing and Editing in GNOME 47 and Beyond

Loupe is GNOME’s default image viewer since GNOME 45. It is powered by the newly written safe image loading and editing library glycin.

What’s new in 47

With GNOME 47, Loupe version 47 is available as well. This release mostly consists of a lot of subtle changes. For JPEGs, the image rotation feature now writes the new orientation to the image file. While Loupe 46 was still defaulting to an older GTK renderer, the new version is using the same defaults as all other apps. Thanks to work by Benjamin Otte in GTK, Loupe now also handles very large images (larger than 256 megapixels) reliably on systems with limited VRAM while also increasing the loading speed.

Loupe and the underlying image loading and editing library glycin now support much better error reporting if it is not possible to load an image. The new glycin version uses a different decoder for JPEG images, improving loading speed and fixing all known compatibility issues. As part of my work on the GNOME STF grant, glycin now also provides bindings for languages other than Rust, including C, GJS, Python, and Vala. If necessary, glycin now automatically disables its sandbox features in Flatpak development environments, simplifying development.

What we are working on

But there is more! We have already merged the first GNOME 48 features, which will be released in March 2025. Allan Day worked on a new design for overlay controls, especially zoom. This is already implemented and merged. It allows the selection of zoom levels like 100% without using keyboard shortcuts like Ctrl+1 and additionally gives the option to select arbitrary zoom levels. There is also a new experimental design for dragging images into the Loupe window.

On the more technical side, Hubert Figuière has written an initial loader implementation for raw image formats which is now merged into glycin. Last but not least, I’m planning to finally have some initial image editing features beyond image rotation in Loupe 48. I’m currently working on all the basics and an image cropping feature.

App window with cropping selection over an image and an open menu to select the aspect ratio.

A huge thanks goes out to everyone who contributed to this work including all the people that are kind enough to support my work financially! If you want to get weekly behind-the-scenes development updates or just support my work financially, you can do so via Patreon, Ko-Fi, GitHub, or OpenCollective.

Header Image © Friedrich Haag / Wikimedia Commons, CC BY-SA 4.0

Understanding GNOME Shell’s focus stealing prevention

Focus stealing prevention exists for two main reasons: One is security, since we need to prevent rogue apps from deceiving users into e.g. typing their password into another window. If apps can silently claim keyboard focus and open their own window over the currently focused one, this enables phishing and other similar attacks. The other is user experience: Even if an app isn’t maliciously taking over your focus, it can be annoying to have a new window popping up while you’re typing something and have half your sentence end up in the wrong app.

At the same time there are cases where you want apps to be able to request focus, for example when clicking a link in a chat app and wanting it to open in the browser. In this case you want the focus to move to the browser window.

This is why our compositor library mutter implements focus stealing prevention mechanisms, which allow the currently focused app to request that a specific other app be allowed to claim focus now.

<App> is ready??

Most users have probably seen an “<App> is ready” notification in GNOME Shell at some point. Unfortunately this notification doesn’t really explain why it’s being shown and what’s happening, which may cause confusion.

Because of this, there have been proposals to disable focus stealing prevention until it works better (mutter issue 673), as well as a number of GNOME Shell extensions.

Screenshot of a GNOME Shell notification showing that Telegram Desktop Media viewer is ready

These are the main cases where the notification is shown:

  • A new window is opened and either the launcher app or the launched app doesn’t implement the XDG Activation protocol or the startup notification specification
  • An app requests focus for one of its windows, but was not activated in a valid way (e.g. because it wasn’t started by a user action)
  • An app requests focus for a new window, but it’s slow to start and in the meantime there are additional user interactions. In this case we don’t want to interrupt, and show the notification so people can switch at their convenience.
  • An app is launched from an environment that isn’t able to use the XDG Activation protocol (e.g. a terminal)

The protocol responsible for this, XDG Activation (the Wayland equivalent of the X11-specific startup notification spec), was introduced somewhat recently (2020) and needs to be adopted by UI toolkits. GNOME 46 and 47 saw a few fixes, and the feature was polished both on the client side (GTK and xdg-desktop-portal) and in the compositor implementation (mutter), but there are still cases where XDG activation isn’t hooked up properly.

How XDG activation works

Flow xdg activation protocol.
XDG activation flow for moving focus between two existing windows

The way the protocol works is that the currently focused app asks the compositor to create a token linked to the focused window (Wayland surface) and the most recent user interaction (an input event serial associated with a seat).

This token is then used by the app that should receive focus when it requests to be activated. In GNOME Shell, activation means that the window receives focus and is placed on top of other windows. An activation token may still be rejected, for example if the window linked to the token doesn’t have focus or when the linked user interaction isn’t recent enough.
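As a rough illustration of those two rejection conditions, here is a toy model in C. This is not mutter’s implementation: the ActivationToken and CompositorState structs, their fields, and token_is_acceptable() are all made up for this sketch, and “recent enough” is assumed to mean that no newer input event has happened since the token was created (serial wraparound is ignored).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of an activation token; not mutter's real data structure. */
typedef struct {
	int surface_id;        /* window (Wayland surface) the token is linked to */
	uint32_t event_serial; /* input event serial of the linked user interaction */
} ActivationToken;

/* Hypothetical snapshot of compositor state when activation is requested. */
typedef struct {
	int focused_surface_id;       /* window that currently has focus */
	uint32_t latest_input_serial; /* most recent user interaction seen */
} CompositorState;

bool
token_is_acceptable(const ActivationToken *token, const CompositorState *state)
{
	/* Rejected: the window linked to the token no longer has focus. */
	if (token->surface_id != state->focused_surface_id)
		return false;

	/* Rejected (assumed rule): the user interacted again after the token
	 * was created, so the linked interaction isn't recent enough. */
	if (token->event_serial < state->latest_input_serial)
		return false;

	return true;
}
```

A token that fails either check would lead to the “<App> is ready” notification instead of a focus change.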

In addition to handling focus, GNOME Shell also tracks app launching. Until the new app window is actually shown, GNOME Shell uses a “loading spinner” mouse cursor to indicate to the user that the app is loading. If the app doesn’t implement the XDG Activation protocol, the loading indicator only disappears after a timeout because GNOME Shell doesn’t know that the application finished loading and has presented the target window.

The protocol doesn’t define how tokens are given to the target app. One reason for this is because it depends on how the app is started. The main options are:

  • Setting the XDG_ACTIVATION_TOKEN environment variable
  • D-Bus Activation using the platform-data field, which contains the activation token
  • XDG portals that will launch an app (e.g. the OpenURI or OpenFile portals)

The target app then needs to collect the token and use it to have its window activated to receive focus and to signal to the compositor that it started successfully.
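For the environment variable case, collecting the token could look roughly like the following sketch in plain C. The helper name take_activation_token() is hypothetical, and toolkits that implement the protocol normally handle this step themselves. Unsetting the variable afterwards is an assumption of this sketch, made so that child processes don’t inherit an already-used token.

```c
/* setenv()/unsetenv() are POSIX, not ISO C. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of the activation token,
 * or NULL if none was passed. The caller releases the result with free(). */
char *
take_activation_token(void)
{
	const char *value = getenv("XDG_ACTIVATION_TOKEN");
	if (value == NULL || value[0] == '\0')
		return NULL;

	/* Copy the value first: unsetenv() invalidates the pointer from getenv(). */
	size_t len = strlen(value) + 1;
	char *token = malloc(len);
	if (token == NULL)
		return NULL;
	memcpy(token, value, len);

	/* Assumed convention: don't leak the token to processes we spawn later. */
	unsetenv("XDG_ACTIVATION_TOKEN");
	return token;
}
```

The returned string is what the app would then present to the compositor when it asks for its window to be activated.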

Not smart enough

When I started looking into how our focus prevention mechanism works to investigate the issues mentioned above, I was initially pretty confused. There were a lot of cases where the focus window switch worked fine, but other times it wouldn’t. I realized quickly that with existing windows, the “<App> is ready” notification is shown, but new windows would get focus immediately.

This struck me as odd: Why are new windows allowed to do whatever, but existing windows are restricted in the way they can take over focus?

I first thought this was some sort of bug, but then I discovered that the behavior was by design: Mutter has a gsettings property called focus-new-windows that controls the focus stealing prevention mechanism. This property can be strict or smart (the latter being the default).

  • smart means that in most cases new windows get focus (even without asking for it) and are raised to the top of the window stack
  • strict means they get focus (are “activated”, in technical terms) only when they are actually supposed to
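The difference between the two modes can be summed up in a small decision function. This is a deliberately simplified sketch, not mutter’s actual logic: the FocusMode enum and window_gets_focus() are hypothetical names, and the “in most cases” of smart mode is reduced to “always” for new windows.

```c
#include <stdbool.h>

/* Mirrors the two values of focus-new-windows described above. */
typedef enum {
	FOCUS_MODE_SMART,
	FOCUS_MODE_STRICT
} FocusMode;

/* Hypothetical, simplified policy.
 * Returns true if the window gets focus, false if the user would see
 * the "<App> is ready" notification instead. */
bool
window_gets_focus(FocusMode mode, bool is_new_window, bool has_valid_token)
{
	/* A valid activation token is honored in both modes. */
	if (has_valid_token)
		return true;

	/* smart: new windows get focus even without asking for it. */
	if (mode == FOCUS_MODE_SMART && is_new_window)
		return true;

	/* strict, or an existing window without a token: no focus change. */
	return false;
}
```

In this model, window_gets_focus(FOCUS_MODE_SMART, true, false) is true while the same call with FOCUS_MODE_STRICT is false, which is exactly the gap between the two modes.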

The smart mode exists in part because there are some cases where our current focus prevention system does not work well. These issues include:

  • Launching apps via terminal (vte issue #2788). The main issue is that the terminal executing a command does not know whether that process will present a window or not. For example, if you launch vim there’s no new window, but if you launch firefox there is.
  • Launching apps via Run a Command in GNOME Shell (gnome-shell issue #7704) has similar issues to running apps from the terminal
  • Apps launched via custom keyboard shortcut (e.g. set up in Settings > Keyboard > Keyboard Shortcuts)
  • The lack of implementation of the appropriate protocols in apps or toolkits

Because the cases where a new window is opened are a significant percentage of the overall cases where focus prevention is triggered, this smart mode is making it appear as though apps actually implement the XDG Activation protocol, even if they don’t. While it does somewhat reduce annoyance for users, it gives developers the false impression that they don’t have to do anything.

It also makes it harder to debug issues where something doesn’t work as expected or is missing the correct implementation. For example, even in GTK4 focus transfer is broken in some cases, and this took a long time to be discovered (gtk issue #6711).

Security implications

Unfortunately the current situation with smart as the default means that we’re not getting most of the benefits of focus stealing prevention. Apps are able to spawn a new window over your current one and grab keyboard focus, because the smart mode just gives the new window focus, circumventing the safety measures. This is trivial to exploit by malicious apps: All they need to do is open a new window, and focus stealing prevention doesn’t apply.

Next steps

While some people have asked for focus stealing prevention to be disabled completely until it’s implemented by most apps and toolkits, I’m not sure this is the best way forward. If we did that, nobody would notice which apps don’t implement it, so there’d be no reason for toolkits to do so.

On the other hand, there are some remaining issues around terminal applications and similar use cases that we don’t have a plan for yet, so just switching to strict to flush out app bugs isn’t ideal either at the moment.

There is currently no consensus in the team as to how to proceed. The two main directions we could take are:

  • Switch to strict mode by default (mutter issue #3486) once a few remaining issues are resolved, perhaps with a “flag day” deadline so apps have time to implement it.
  • Slowly make the smart mode stricter over time.

Either way we need to raise more awareness of the issue to get app and toolkit developers interested in improving things in this area, which this blogpost is a part of 🙂

It’d also be helpful if more people (especially developers) turn on strict mode on their system, so we get more testing for which apps work and which don’t. This is the relevant gsetting:

gsettings set org.gnome.desktop.wm.preferences focus-new-windows 'strict'

Thanks

Thanks to the Sovereign Tech Fund for allowing me to take the time to properly work through this as part of my broader effort around improving notifications. Thanks also to Sonny Piers and Tobias Bernard for organizing the STF project, Florian Müllner, Sebastian Wick, Carlos Garnacho, and the rest of the GNOME Shell team for reviewing my MRs, and Jonas Dreßler and Jonas Ådahl for reviewing the blogpost.