May 05, 2023

GNOME will be mentoring 9 new contributors in Google Summer of Code 2023

We are happy to announce that GNOME was assigned nine slots for Google Summer of Code projects this year!

GSoC is a program focused on bringing new contributors into open source software development. A number of long term GNOME developers are former GSoC interns, making the program a very valuable entry point for new members in our project.

In 2023 we will be mentoring the following projects:

  • Make GNOME platform demos for Workbench: Akshay Warrier (mentors: Sonny Piers, Andy Holmes)
  • Rust and GTK 4 Bustle Rewrite: Dave Patrick Caberto (mentors: Bilal Elmoussaoui, Maximilian)
  • Create a New “System” panel in GNOME Settings: Gotam Gorabh (mentor: Felipe Borges)
  • Implement backlog search in Polari IRC client: Gurmannat Sohal (mentors: Carlos Garnacho, Florian Müllner)
  • Integrate GNOME Network Displays features into GNOME Settings: Pedro Sader Azevedo (mentors: Felipe Borges, Claudio Wunder, Jonas Ådahl, Anupam Kumar)
  • GNOME Crosswords Anagram Support: Pratham Gupta (mentor: jrb)
  • Make GNOME Platform Demos for Workbench: Sriyansh Shivam (mentors: Sonny Piers, Andy Holmes)
  • Add Acrostic Puzzles to GNOME Crosswords: Tanmay Patil (mentor: jrb)
  • Flatpak synching between machines: Tim FB (mentor: Rasmus Thomsen)

As part of their acceptance into GSoC, contributors are expected to actively participate in the Community Bonding period (May 4 – 28). The Community Bonding period is intended to help prepare contributors to start contributing at full speed starting May 29.

The new contributors will soon get their blogs added to Planet GNOME making it easy for the GNOME community to get to know them and the projects that they will be working on.

We would like to also thank our mentors for supporting GSoC and helping new contributors enter our project.

If you have any doubts, feel free to reply to this Discourse topic or message us privately at soc-admins@gnome.org

** This is a repost from https://discourse.gnome.org/t/announcement-gnome-will-be-mentoring-9-new-contributors-in-google-summer-of-code-2023/15232

#94 Configuring Columns

Update on what happened across the GNOME project in the week from April 28 to May 05.

GNOME Core Apps and Libraries

Files

Providing a simple and integrated way of managing your files and browsing your file system.

antoniof reports

There is a new interface for configuring the columns of the Files list! After almost 20 years of the same column chooser, the deprecation of GtkTreeView encouraged Corey Berla to replace the column chooser with one built with modern widgets and design. This enhancement, with additional contributions by Peter Eisenmann, also allows changing visible columns either globally or only for the current folder, without the old misleading interface duplication in the Preferences.

GNOME Incubating Apps

Loupe

A simple and modern image viewer.

Sophie 🏳️‍🌈 🏳️‍⚧️ ✊ reports

While it hasn’t been a priority for this cycle, Loupe should already deliver a solid experience on mobile devices with this week’s update. Apart from the complete interface now being adaptive, typical features like one-finger swipe, double-tap to zoom in and out, pinch zoom, and panning are already supported.

This week’s update includes:

  • Adopt the properties view for smaller form factors using the latest libadwaita Breakpoint feature.
  • Hide the HeaderBar and mouse cursor in fullscreen after a moment.
  • Skip unsupported image formats when browsing images.
  • Many more minor tweaks and fixes.

GNOME Circle Apps and Libraries

NewsFlash feed reader

Follow your favorite blogs & news sites.

Jan Lukas announces

NewsFlash 2.3.0 was released today. The only visible change is the fancy new icon by Tobias Bernard. Under the hood, WebKitGTK was upgraded to the latest version, which should fix a lot of problems. But the most work went into the new content grabber. It should be a lot faster than the JavaScript library used before, it is better documented, and it makes it easier to provide custom extraction rules.

gtk-rs

Safe bindings to the Rust language for fundamental libraries from the GNOME stack.

Bilal Elmoussaoui announces

The Rust bindings generator gir is now capable of embedding docs for virtual methods and class methods from the corresponding GIR files.

Third Party Projects

0xMRTT reports

I’ve released Bavarder, an app for chatting with AI. With Bavarder, you can ask a question to different providers like Hugging Chat, BAI Chat, OpenAI GPT-3.5-turbo, etc. It has been designed to be minimalist, giving you access to a chatbot without needing a browser or an account.

Remember that AI can produce fake content and should not be used in a fraudulent way.

You can download Bavarder from Flathub, or from either GitHub or Codeberg

Flare

An unofficial Signal GTK client.

schmiddi reports

Flare 0.8.0 was released, which brings big improvements to the message list. Instead of the previous “Load More” button, content now gets loaded dynamically as needed, leading to an improved experience when using Flare with longer chats. Since the last update, message deletion has also been implemented. And, as always, many bug fixes and minor features have been developed to make sure Flare works as expected.

GNOME Foundation

Caroline Henriksen says

This week, the GNOME Foundation has been wrapping up LAS 2023 tasks and focusing on GUADEC organization. We recently shared the full schedule of talks, which can be viewed on guadec.org, and hope to share more details about social events and keynote speakers soon. One fun event item I’ve been working on is the design for GUADEC 2023 t-shirts! We’ll share that on shop.gnome.org as soon as it’s ready.

Registration is now open for GUADEC 2023! Let us know you’re attending, either in-person in Riga, Latvia, or remotely by signing up online. More details and links can be found on GNOME Foundation News.

We’re still looking for GUADEC 2023 sponsors! If you or your company would like to sponsor this year’s conference, you can find our brochure and learn more on guadec.org.

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

May 04, 2023

Twitter's e2ee DMs are better than nothing

Elon Musk appeared in an interview with Tucker Carlson last month, with one of the topics being the fact that Twitter could be legally compelled to hand over users' direct messages to government agencies, since they're held on Twitter's servers and aren't encrypted. Elon talked about how they were in the process of implementing proper encryption for DMs that would prevent this - "You could put a gun to my head and I couldn't tell you. That's how it should be."

tl;dr - in the current implementation, while Twitter could subvert the end-to-end nature of the encryption, it could not do so without users being notified. If any user involved in a conversation were to ignore that notification, all messages in that conversation (including ones sent in the past) could then be decrypted. This isn't ideal, but it still seems like an improvement over having no encryption at all. More technical discussion follows.

For context: all information about Twitter's implementation here has been derived from reverse engineering version 9.86.0 of the Android client and 9.56.1 of the iOS client (the current versions at time of writing), and the feature hasn't yet launched. While it's certainly possible that there could be major changes in the protocol between now and launch, Elon has asserted that they plan to launch the feature this week, so it's plausible that this reflects what'll ship.

For it to be impossible for Twitter to read DMs, they need to not only be encrypted, they need to be encrypted with a key that's not available to Twitter. This is what's referred to as "end-to-end encryption", or e2ee - it means that the only components in the communication chain that have access to the unencrypted data are the endpoints. Even if the message passes through other systems (and even if it's stored on other systems), those systems do not have access to the keys that would be needed to decrypt the data.

End-to-end encrypted messengers were initially popularised by Signal, but the Signal protocol has since been incorporated into WhatsApp and is probably much more widely used there. Millions of people per day are sending messages to each other that pass through servers controlled by third parties, but those third parties are completely unable to read the contents of those messages. This is the scenario that Elon described, where there's no degree of compulsion that could force the parties relaying the messages to decrypt them afterwards.

But for this to be possible, both ends of the communication need to be able to encrypt messages in a way the other end can decrypt. This is usually performed using AES, a well-studied encryption algorithm with no known significant weaknesses. AES is a form of what's referred to as symmetric encryption, one where encryption and decryption are performed with the same key. This means that both ends need access to that key, which presents us with a bootstrapping problem. Until a shared secret is obtained, there's no way to communicate securely, so how do we generate that shared secret? A common mechanism for this is something called Diffie-Hellman key exchange, which makes use of asymmetric encryption. In asymmetric encryption, an encryption key can be split into two components - a public key and a private key. Both devices involved in the communication combine their private key and the other party's public key to generate a secret that can only be decoded with access to the private key. As long as you know the other party's public key, you can now securely generate a shared secret with them. Even a third party with access to all the public keys won't be able to identify this secret. Signal makes use of a variation of Diffie-Hellman called Extended Triple Diffie-Hellman that has some desirable properties, but it's not strictly necessary for the implementation of something that's end-to-end encrypted.

Although it was rumoured that Twitter would make use of the Signal protocol, and in fact there are vestiges of code in the Twitter client that still reference Signal, recent versions of the app have shipped with an entirely different approach that appears to have been written from scratch. It seems simple enough. Each device generates an asymmetric keypair using the NIST P-256 elliptic curve, along with a device identifier. The device identifier and the public half of the key are uploaded to Twitter using a new API endpoint called /1.1/keyregistry/register. When you want to send an encrypted DM to someone, the app calls /1.1/keyregistry/extract_public_keys with the IDs of the users you want to communicate with, and gets back a list of their public keys. It then looks up the conversation ID (a numeric identifier that corresponds to a given DM exchange - for a 1:1 conversation between two people it doesn't appear that this ever changes, so if you DMed an account 5 years ago and then DM them again now from the same account, the conversation ID will be the same) in a local database to retrieve a conversation key. If that key doesn't exist yet, the sender generates a random one. The message is then encrypted with the conversation key using AES in GCM mode, and the conversation key is then put through Diffie-Hellman with each of the recipients' public device keys. The encrypted message is then sent to Twitter along with the list of encrypted conversation keys. When each of the recipients' devices receives the message it checks whether it already has a copy of the conversation key, and if not performs its half of the Diffie-Hellman negotiation to decrypt the encrypted conversation key. Once it has the conversation key, it decrypts the message and shows it to the user.
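
To make the moving parts concrete, here's a minimal sketch of that flow in Python using the cryptography package. The shape follows the description above, but the details (the HKDF step, the nonce handling, and all of the names) are my own illustration rather than Twitter's actual code:

import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each device generates a NIST P-256 keypair; the public half goes to
# Twitter's key registry. Here we just generate both sides locally.
sender_device_key = ec.generate_private_key(ec.SECP256R1())
recipient_device_key = ec.generate_private_key(ec.SECP256R1())

# One symmetric key per conversation, generated once and then reused for
# every message in that conversation.
conversation_key = AESGCM.generate_key(bit_length=256)

# Each message is encrypted with the conversation key using AES in GCM mode.
nonce = os.urandom(12)
ciphertext = AESGCM(conversation_key).encrypt(nonce, b"hello", None)

# The conversation key is "put through Diffie-Hellman" with each recipient
# device key: derive a shared secret, turn it into a wrapping key (the HKDF
# step is an assumption on my part), and encrypt the conversation key.
shared_secret = sender_device_key.exchange(
    ec.ECDH(), recipient_device_key.public_key())
wrapping_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                    info=b"dm-key-wrap").derive(shared_secret)
wrap_nonce = os.urandom(12)
wrapped_key = AESGCM(wrapping_key).encrypt(wrap_nonce, conversation_key, None)

# Twitter relays (nonce, ciphertext) plus the wrapped conversation keys.
# Anyone holding a registered device private key can unwrap the conversation
# key and, because it never rotates, decrypt the entire conversation history.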

What would happen if Twitter changed the registered public key associated with a device to one where they held the private key, or added an entirely new device to a user's account? If the app were to just happily send a message with the conversation key encrypted with that new key, Twitter would be able to decrypt that and obtain the conversation key. Since the conversation key is tied to the conversation, not any given pair of devices, obtaining the conversation key means you can then decrypt every message in that conversation, including ones sent before the key was obtained.

(An aside: Signal and WhatsApp make use of a protocol called Sesame which involves additional secret material that's shared between every device a user owns, hence why you have to do that QR code dance whenever you add a new device to your account. I'm grossly over-simplifying how clever the Signal approach is here, largely because I don't understand the details of it myself. The Signal protocol uses something called the Double Ratchet Algorithm to implement the actual message encryption keys in such a way that even if someone were able to successfully impersonate a device they'd only be able to decrypt messages sent after that point even if they had encrypted copies of every previous message in the conversation)

How's this avoided? Based on the UI that exists in the iOS version of the app, in a fairly straightforward way - each user can only have a single device that supports encrypted messages. If the user (or, in our hypothetical, a malicious Twitter) replaces the device key, the client will generate a notification. If the user pays attention to that notification and verifies with the recipient through some out of band mechanism that the device has actually been replaced, then everything is fine. But, if any participant in the conversation ignores this warning, the holder of the subverted key can obtain the conversation key and decrypt the entire history of the conversation. That's strictly worse than anything based on Signal, where such impersonation would simply not work, but even in the Twitter case it's not possible for someone to silently subvert the security.

So when Elon says Twitter wouldn't be able to decrypt these messages even if someone held a gun to his head, there's a condition applied to that - it's true as long as nobody fucks up. This is clearly better than the messages just not being encrypted at all in the first place, but overall it's a weaker solution than Signal. If you're currently using Twitter DMs, should you turn on encryption? As long as the limitations aren't too limiting, definitely! Should you use this in preference to Signal or WhatsApp? Almost certainly not. This seems like a genuine incremental improvement, but it'd be easy to interpret what Elon says as providing stronger guarantees than actually exist.


Vivid colors in Brno

Co-authored by Sebastian Wick & Jonas Ådahl.

From April 24 to 26, Red Hat invited people working on compositors and display drivers to come together to collaborate on bringing the Linux graphics stack to the next level. Three high-level topics were discussed at length: Color Management, High Dynamic Range (HDR), and Variable Refresh Rate (VRR). This post will go through the discussions that took place, and the occasional rough consensus reached among the people who attended.

The event itself aimed to be as inclusive and engaging as possible, meaning participants could attend either in person, in the Red Hat office in Brno, Czech Republic, or remotely via a video link. The format of the event was structured to give remote and physical attendees an equal opportunity to participate in discussions. While the hallway track can be a great way to collaborate, discussions accessible remotely were prioritized by having two available rooms, each with its own video link.

This meant that if the main room wanted to continue on the same topic, while some wanted to do a breakout session, they could go to the other room, and anyone attending remotely could tag along by connecting to the other video link. In the end, the break out room became the room where people collaborated on various things in a less structured manner, leaving the main room to cover the main topics. A reason for this is that the microphones in both rooms were a bit too good, effectively catching any conversation anyone had anywhere in the room. Making one of the rooms a bit more chaotic, while the other focused, also allowed for both ways of collaborating.

For the kernel side, people working on AMD, Intel and NVIDIA drivers were among the attendees, and for user space there was representation from gamescope, GNOME, KDE, smithay, Wayland, weston and wlroots. Some of those people are community contributors and some of them were attending on behalf of Red Hat, Canonical, System76, sourcehut, Collabora, Blue Systems, Igalia, AMD, Intel, Google, and NVIDIA. We had a lot of productive discussion, ending up in total with a 20 (!) page document of notes.

Discussion with remote attendees during the hackfest

Color management & HDR

Wayland

Color management in the Linux graphics stack is shifting in the way it is implemented, away from the style used in X11, where the display server (X.org) takes a hands-off approach and the end result depends on individual client capabilities, to an architecture where the Wayland display server takes an active role in ensuring that all clients, be they color aware or not, show up on screen correctly.

Pekka Paalanen and Sebastian Wick gave a summary of the current state of digital color on Linux and Wayland. For full details, see the Color and HDR documentation repository.

They described the in-development color-representation and color-management Wayland protocols. The color-representation protocol lets clients describe the way color channels are encoded, and the color-management protocol lets clients describe the color channels’ meaning, to completely describe the appearance of surfaces. It also gives clients information about how they can optimize their content for the target monitor's capabilities, to minimize the color transformations in the compositor.

Another key aspect of the Wayland color protocols in development is that compositors will be able to choose what they want to support. This makes it possible, for example, to implement HDR without involving ICC workflows.

There is already a broad consensus that this type of active color management aligns with the Wayland philosophy and while work is needed in compositors and client toolkits alike, the protocols in question are ready for prototyping and review from the wider community.

Colors in kernel drivers & compositors

There are two parts to HDR and color management for compositors. The first is creating content from different SDR and HDR sources using color transformations. The second is signaling the monitor to enter the desired mode. Given the current state of kernel API capabilities, compositors are in general required to handle all of their color transformations using shaders during composition. In the short term we will focus on removing the last blockers for HDR signaling; in the long term we will work on making it possible to offload color space conversions to the display hardware, which should ideally make it possible to power down the GPU while playing e.g. a movie.

Short term

Entering HDR mode is done by setting the colorimetry (KMS Colorspace property) and overriding the transfer characteristics (KMS HDR_OUTPUT_METADATA property).

Unfortunately the design of the Colorspace property does not mix well with the current broader KMS design where the output format is an implementation detail of the driver. We’re going to tweak the behavior of the Colorspace property such that it doesn’t directly control the InfoFrame but lets the driver choose the correct variant and transparently convert to YCC using the correct matrix if required. This should allow AMD to support HDR signaling upstream as well.

The HDR_OUTPUT_METADATA property is a bit weird as well and should be documented. Changing it might require a mode set and changing the transfer characteristics part of the blob will make monitors glitch, while changing other parameters must not require a mode set and must not glitch.

Both landing support upstream for the AMD driver, and improvements to the documentation should happen soon, enabling proper upstream HDR signaling.

Vendor specific uAPI for color pipelines

Recently a proposal for adding vendor-specific properties for exposing hardware color pipelines via KMS has been posted, and while it is great to see work being done to improve the situation in the Linux kernel, there are concerns that this opens the door to per-vendor APIs that end up being necessary for compositors to implement, effectively reintroducing per-vendor GPU drivers in userspace outside of mesa.

Still, upstream support in the kernel has its upsides, as it for example makes it much easier to experiment. A way forward discussed is to propose that vendor-specific color pipeline properties be handled with care, by requiring them to be clearly documented as experimental, and disabled by default behind both a build configuration option and an off-by-default module parameter.

A proposal for this will be sent by Harry Wentland to the relevant kernel mailing lists.

Color pipelines in KMS

Long term, KMS should support color pipelines without any experimental flags, and there is a wide agreement that it should be done with a vendor agnostic API. To achieve this, a proposal was discussed at length, but to summarize it, the goal is to introduce a new KMS object for color operations. A color operation object exposes a low level mathematical function (e.g. Matrix multiplication, 1D or 3D look up tables) and a link to the next operation. To declare a color pipeline, drivers construct a linked list of these operations, for example 1D LUT → Matrix → 1D LUT to describe the current DEGAMMA_LUT → CTM → GAMMA_LUT KMS pipeline.
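
To make the linked-list idea concrete, here is a toy model of such a pipeline in Python. It only illustrates the structure, a chain of color operations applied in order; the real proposal is kernel uAPI expressed as KMS objects and properties, not anything resembling this code:

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ColorOp:
    # One low-level mathematical stage plus a link to the next stage.
    apply: Callable[[list], list]
    next: Optional["ColorOp"] = None

def run_pipeline(op: Optional[ColorOp], pixel: list) -> list:
    # Walk the linked list, applying each operation in turn.
    while op is not None:
        pixel = op.apply(pixel)
        op = op.next
    return pixel

# DEGAMMA_LUT -> CTM -> GAMMA_LUT as a chain of three operations
# (power functions standing in for 1D LUTs, an identity matrix for the CTM).
gamma = ColorOp(apply=lambda rgb: [c ** (1 / 2.2) for c in rgb])
ctm = ColorOp(apply=lambda rgb: [sum(m * c for m, c in zip(row, rgb))
                                 for row in ((1, 0, 0), (0, 1, 0), (0, 0, 1))],
              next=gamma)
degamma = ColorOp(apply=lambda rgb: [c ** 2.2 for c in rgb], next=ctm)

print(run_pipeline(degamma, [0.5, 0.25, 0.75]))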

The discussions primarily focused on per plane color pipelines for the pre-blending stage, but the same concept should be reusable for the post blending stage on the CRTC.

Eventually this work should also make it possible to cleanly separate KMS properties which change the colors (i.e. color operations) from properties changing the mode and signaling to sinks, such as Broadcast RGB, Colorspace, max_bpc.

It was also agreed that user space needs more control over the output format, i.e. what is transmitted over the wire. Right now this is a driver implementation detail and chosen such that the bandwidth requirements of the selected mode will be satisfied. In particular making it possible to turn off YCC subsampling, specifying the minimum bit depth and specifying the compression strength for DCC seems to have consensus.

There are a lot more details that handle all the quirks that hardware may have. For more details and further discussion about the color pipeline proposal, head over to the RFC that Simon Ser just sent to the relevant mailing lists.

Testing & VKMS

Testability of color pipelines and KMS in general was a topic that was brought up as well, with two areas of interest: testing compositors and the generic DRM layer in the kernel using VKMS, and testing actual kernel drivers.

The state of VKMS is to some degree problematic; it currently lacks a large enough pool of established contributors who can take on maintainership responsibilities, i.e. reviewing and landing code, but at the same time there is an urge to make it a more central part of GPU driver development in general, where it can take a more active role in ensuring cross-driver conformance. Ways to create more incentive for both kernel developers and compositor developers to help out were discussed, and while the ability to test compositors is a relatively good incentive, another idea was to require new DRM properties to come with a VKMS implementation in order to land. This is, however, not easy, since a significant amount of bootstrapping is needed to make that viable. Some ideas were thrown around, and hopefully something will come out of it; keep an eye on the relevant mailing lists for something related to this area.

For testing actual drivers, the usage of Chamelium was discussed, and while everyone agreed it’s something that is definitely nice to have, it takes a significant amount of resources to maintain wired up CI runners for the community to rely on. Ideally a setup that can be shared across the different compositors and GPU drivers would be great, but it’s a significant task to handle.

Variable Refresh Rate

Smoothing out refresh rate changes

Variable Refresh Rate monitors driven at a certain mode have a minimum and maximum refresh cycle duration and the actual duration can be chosen for every refresh cycle. One problem with most existing VRR monitors however is that when the refresh duration changes too quickly, they tend to produce visible glitches. They appear as brightness changes for a fraction of a second and can be very jarring. To avoid them, each refresh cycle must change the duration only up to some fixed amount. The amount however varies between monitors, with some having no restriction at all.

A VESA certification program is currently being rolled out, aiming to certify monitors where any change in the refresh cycle duration does not result in glitches. For all other monitors, the increase and decrease in duration that does not result in glitches is unknown unless provided by optional EDID/DisplayID data blocks.

Driving monitors glitch-free without machine-readable information therefore requires another approach. One idea is to make the limits configurable. Requiring all users to tweak and fiddle to make it work well enough, however, is not very user friendly, so another idea that was discussed is to maintain a database similar to the one used by libinput, but in libdisplay-info, containing the required information about monitors, even if no such information is made available by the vendor.

With all of the required information, the smoothing of refresh rate changes still needs to happen somewhere. It was debated whether this should be handled transparently by the kernel, or whether it should be completely up to user space. There are pros and cons to both ways, for example better timing ability in the kernel, but less black-box magic if handled by user space. In the end, the conclusion is for user space components (i.e. compositors) to handle this themselves first, and then reconsider at some point in the future whether that is enough, or whether new kernel uAPI is needed.
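
A compositor-side version of that smoothing could be as simple as clamping how much the refresh cycle duration changes per frame. A sketch (illustrative only; the max_step_us limit would come from the monitor database or the EDID/DisplayID blocks mentioned above):

def next_refresh_duration(current_us: float, target_us: float,
                          max_step_us: float) -> float:
    # Move the refresh cycle duration toward the target, but change it by
    # at most max_step_us per cycle to avoid visible brightness glitches.
    delta = target_us - current_us
    if abs(delta) <= max_step_us:
        return target_us
    return current_us + (max_step_us if delta > 0 else -max_step_us)

# Ramping from 144 Hz (~6944 us) toward 60 Hz (~16667 us), 500 us per cycle:
duration = 1_000_000 / 144
while duration != 1_000_000 / 60:
    duration = next_refresh_duration(duration, 1_000_000 / 60, 500.0)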

Low Framerate Compensation

The usual frame rates that a VRR monitor can achieve typically do not cover a bunch of often used low frame rates, such as 30, 25, or 24 Hz. To still be able to show such content without stutter, the display can be driven at a multiple of the target frame rate and present new content on every n-th refresh cycle.
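
For example, 24 Hz content on a panel with a 48–120 Hz VRR range can be shown by driving the panel at 48 Hz and flipping new content every second refresh cycle. A sketch of that multiplier selection (illustrative logic, not the actual driver code):

def lfc_refresh_rate(content_hz: float, vrr_min_hz: float,
                     vrr_max_hz: float) -> float:
    # Pick the smallest integer multiple of the content rate that falls
    # within the panel's supported VRR range.
    n = 1
    while content_hz * n < vrr_min_hz:
        n += 1
    rate = content_hz * n
    if rate > vrr_max_hz:
        raise ValueError("content rate cannot be matched within VRR range")
    return rate

print(lfc_refresh_rate(24, 48, 120))  # 48.0: new content every 2nd cycle
print(lfc_refresh_rate(25, 48, 120))  # 50.0
print(lfc_refresh_rate(30, 48, 120))  # 60.0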

Right now this Low Framerate Compensation (LFC) feature is built into the kernel driver, and when VRR is enabled, user space can transparently present content at refresh rates even lower than what the display supports. While this seems like a good idea, there are problems with this approach. For example, the cursor can only be updated when there is a content update, making it very sluggish because of the low rate of content updates, even though the screen refreshes multiple times. This requires either a special KMS commit which does not result in an immediate page flip but ends up on the refresh cycles inserted by LFC, or implementing LFC in user space instead. Like with the refresh rate change smoothing talked about earlier, moving LFC to user space might be possible but also might require some help from the kernel to be able to time page flips well enough.

Wayland

For VRR to work, applications need to provide content updates on a surface in a semi-regular interval. GUI applications for example often only draw when something changed which makes the updates irregular, driving VRR to its minimum refresh rate until e.g. an animation is playing and VRR is ramping up the refresh rate over multiple refresh cycles. This results in choppy mouse cursor movements and animations for some time. GUI applications sometimes do provide semi-regular updates, e.g. during animations or video playback. Some applications, like games, always provide semi-regular updates.

Currently there is no Wayland protocol [1] letting applications advertise that a surface works with VRR at a given moment in time, or at all. Nor is there a way for a compositor to automatically determine whether an app or a surface is suitable for VRR. For Wayland-native applications a protocol to communicate this information could be created, but there are a lot of applications out there that would work fine with VRR but will not get updated to support this protocol.

Maintaining a database similar to the one mentioned above, but for applications, was discussed, but there is no clear winner in how to do so, and where to store the data. Maintaining a list is cumbersome, and complicates the ability for applications to work with VRR on release, or on distributions with out of date databases. Another idea was a desktop file entry stating support, but this too has its downsides. All in all, there is no clear path forward in how to actually enable VRR for applications transparently without causing issues.

[1] Except for a protocol proposal.

Wrap-up

Brno, Czech Republic

The hackfest was a huge success! Not only was this a good opportunity to get everyone up to speed and learn about what everyone is doing, having people with different backgrounds in the discussions made it possible to discuss problems, ideas and solutions spanning all the way from clients over compositors, to drivers and hardware. Especially on the color and HDR topics we came up with good, actionable consensus and a clear path to where we want to go. For VRR we managed to pin-point the remaining issues and know which parts require more experimentation.

For GNOME, Color management, HDR and VRR are all topics that are being actively worked on, and the future is both bright and dynamic, not only when it comes to luminescence and color intensity, but also when it comes to the rate monitors present all these intense colors.

Dor Askayo who has been working on bringing VRR to GNOME attended part of the hackfest, and together we can hopefully bring experimental VRR to GNOME soon. There will be more work needed to iron out the overall experience, as covered above, but getting the fundamental building blocks in place is a critical first step.

For HDR, work has been going on to attach color state information to the scene graph, and at the hackfest Georges Basile Stavracas, Sebastian Wick and Jonas Ådahl sat down and sketched out a new Clutter rendering API that aims to replace the current Clutter paint nodes API used in Mutter and GNOME Shell, and that will make color transformations a first-class citizen. We will initially focus on using shaders for everything, but down the road the goal is to utilize the future color pipeline KMS uAPI for both performance and power consumption improvements.

We’d like to thank Red Hat for organizing and hosting the hackfest and for allowing us to work on these interesting topics, Red Hat and Collabora for sponsoring food and refreshments, and especially Carlos Soriano Sanchez and Tomas Popela for actually doing all the work making the event happen. It was great. Also thanks to Jakub Steiner for the illustration, and Carlos Soriano Sanchez for the photo from the hackfest.

For another great hackfest write-up, head over to Simon Ser’s blog post.

May 03, 2023

Niepce April 2023 updates

This is the April 2023 update for Niepce. Between outages caused by an unseasonable ice storm and squirrels chewing cables, I discovered how painful being off the grid is. This is not an excuse, just the events that led to downtime and to April not being very productive.

The importer moved a little bit forward. I tried to use the brand new "sort by date" importer tool, which is used to test and exercise the logic. The improved importer addresses a few long-standing issues, including no longer using libgphoto2 for USB Mass Storage devices, including flash card readers. This was a shortcut I had taken, and the result was suboptimal. The new approach is to use libgphoto2 to find the device, and then switch to pure filesystem operations. That was issue 26.

I picked up my camera, which I hadn't done much since the pandemic started. After a firmware upgrade on the Fujifilm X-T3 (it's a 4-year-old model, and the firmware is from this year, unlike most smartphones), I noticed a new feature that allows setting a picture as a favourite. My first question was "how is it stored?". I put that on my list for later.

After taking some pictures of the newly bloomed spring flowers, I tried the importer. The new files triggered a bug in metadata parsing that left the importer unable to determine the creation date; I hit it when trying to sort these new pictures.

Metadata parsing

Metadata parsing relies on exempi2 and rexiv2. The former is the Rust bindings for Exempi; the latter is the Rust bindings for gexiv2, which is itself C bindings on top of the exiv2 C++ API [1].

The algorithm works this way: try to load an XMP from the file, and if that fails, fall back on rexiv2. Given that Exempi isn't aware of raw files [2], it should always fall back. Except it doesn't, and the XMP it finds only contains one value: Rating. I think I just answered the question as to where the camera stores whether the image is favourited or not: in an XMP packet in the raw file, a packet found by Exempi's packet scanner. So I need to fix the logic by making sure that camera raw files are always parsed by the rexiv2 fallback.
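
Niepce itself is written in Rust, but the corrected control flow is easy to sketch; this Python pseudo-version uses hypothetical placeholder names, not the actual Niepce API:

RAW_EXTENSIONS = {".raf", ".cr2", ".nef", ".arw", ".mrw", ".dng"}

def is_camera_raw(path: str) -> bool:
    import os.path
    return os.path.splitext(path)[1].lower() in RAW_EXTENSIONS

def load_xmp_with_exempi(path):
    ...  # placeholder for the exempi2 binding

def load_with_rexiv2(path):
    ...  # placeholder for the rexiv2/gexiv2 binding

def load_metadata(path: str):
    # Camera raw files can carry a stray XMP packet (e.g. just the in-camera
    # Rating) that Exempi's packet scanner finds, so the Exempi path would
    # "succeed" with nearly empty metadata. Route raw files straight to the
    # rexiv2 path instead of trusting that Exempi will fail on them.
    if is_camera_raw(path):
        return load_with_rexiv2(path)
    xmp = load_xmp_with_exempi(path)
    return xmp if xmp is not None else load_with_rexiv2(path)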

libopenraw

And looking at Fujifilm RAF files I ended up peeling the onion and improving support for parsing these. Things are a bit more robust.

I released libopenraw 0.3.5. One of the key changes is that the tarball now vendors the Rust crates.

I also moved further along with the Rust port by resyncing camera support with the main branch and adding support for Minolta MRW, which had been left behind so far, bringing the same level of support to the C++ mainline and the Rust port. And this keeps uncovering bugs in the C++ implementation.

User interface

To integrate everything, the UI needed to be adjusted. I can now tell the directory importer to copy files on import. I also cleaned up a few leftovers from great ideas from a decade ago.

In conclusion

There are so many things to do that I try to focus on specific tasks from the user standpoint and implement what is necessary to perform them, which often means working on a solid foundation to support it. But in the end, I still haven't run the importer successfully on these pictures.

Until next month...

[1] At that point, I ponder removing gexiv2 and using cxx for exiv2, albeit the use of the deprecated std::auto_ptr<> all over the API might make me want to not do that. Just to be clear, it's about removing layers of complexity.

[2] A bit of context. Exempi is an API wrapper around Adobe's XMP SDK (BSD licensed), and it includes all the same features. One would think that the company that also champions DNG and has a flagship product to decode camera raw files would support them in XMP. Ah ah. Sorry for the joke attempt. The XMP SDK will reconcile metadata, so for example for a JPEG with Exif, it will return an XMP packet built out of the Exif. With a raw file, it returns nothing. Except when it does, because the packet scanner found one.

May 02, 2023

structure and interpretation of ark

Hello, dear readers! Today's article describes Ark, a new JavaScript-based mobile development platform. If you haven't read them yet, you might want to start by having a look at my past articles on Capacitor, React Native, NativeScript, and Flutter; having a common understanding of the design space will help us understand where Ark is similar and where it differs.

Ark, what it is

If I had to bet, I would guess that you have not heard of Ark. (I certainly hadn't either, when commissioned to do this research series.) To a first approximation, Ark—or rather, what I am calling Ark; I don't actually know the name for the whole architecture—is a loosely Flutter-like UI library implemented on top of a dialect of JavaScript, with build-time compilation to bytecode (like Hermes) but also with support for just-in-time and ahead-of-time compilation of bytecode to native code. It is made by Huawei.

At this point if you are already interested in this research series, I am sure this description raises more questions than it answers. Flutter-like? A dialect? Native compilation? Targetting what platforms? From Huawei? We'll get to all of these, but I think we need to start with the last question.

How did we get here?

In my last article on Flutter, I told a kind of just-so history of how Dart and Flutter came to their point in the design space. Thanks to corrections from a kind reader, it happened to also be more or less correct. In this article, though, I haven't talked with Ark developers at all; I don't have the benefit of a true claim on history. And yet, the only way I can understand Ark is by inventing a narrative, so here we go. It might even be true!

Recall that in 2018, Huawei was a dominant presence in the smartphone market. They were shipping excellent hardware at good prices both to the Chinese and to the global markets. Like most non-Apple, non-Google manufacturers, they shipped Android, and like most Android OEMs, they shipped Google's proprietary apps (mail, maps, etc.).

But then, over the next couple years, the US decided that allowing Huawei to continue on as before was, like, against national security interests or something. Huawei was barred from American markets, a number of suppliers were forbidden from selling hardware components to Huawei, and even Google was prohibited from shipping its mobile apps on Huawei devices. The effect on Huawei's market share for mobile devices was enormous: its revenue was cut in half over a period of a couple years.

In this position, as Huawei, what do you do? I can't even imagine, but specifically looking at smartphones, I think I would probably do about what they did. I'd fork Android, for starters, because that's what you already know and ship, and Android is mostly open source. I'd probably plan on continuing to use its lower-level operating system pieces indefinitely (kernel and so on) because that's not a value differentiator. I'd probably ship the same apps on top at first, because otherwise you slip all the release schedules and lose revenue entirely.

But, gosh, there is the risk that your product will be perceived as just a worse version of Android: that's not a good position to be in. You need to be different, and ideally better. That will take time. In the meantime, you claim that you're different, without actually being different yet. It's a somewhat ridiculous position to be in, but I can understand how you get here; Ars Technica published a scathing review poking fun at the situation. But, you are big enough to ride it out, knowing that somehow eventually you will be different.

Up to now, this part of the story is relatively well-known; the part that follows is more speculative on my part. Firstly, I would note that Huawei had been working for a while on a compiler and language run-time called Ark Compiler, with the goal of getting better performance out of Android applications. If I understand correctly, this compiler took the Java / Dalvik / Android Run Time bytecodes as its input, and outputted native binaries along with a new run-time implementation.

As I can attest from personal experience, having a compiler leads to hubris: you start to consider source languages like a hungry person looks at a restaurant menu. "Wouldn't it be nice to ingest that?" That's what we say at restaurants, right, fellow humans? So in 2019 and 2020 when the Android rug was pulled out from underneath Huawei, I think having in-house compiler expertise allowed them to consider whether they wanted to stick with Java at all, or whether it might be better to choose a more fashionable language.

Like black, JavaScript is always in fashion. What would it mean, then, to retool Huawei's operating system -- by then known by the name "HarmonyOS" -- to expose a JavaScript-based API as its primary app development framework? You could use your Ark compiler somehow to implement JavaScript (hubris!) and then you need a UI framework. Having ditched Java, it is now thinkable to ditch all the other Android standard libraries, including the UI toolkit: you start anew, in a way. So are you going to build a Capacitor, a React Native, a NativeScript, a Flutter? Surely not precisely any of these, but what will it be like, and how will it differ?

Incidentally, I don't know the origin story for the name Ark, but to me it brings to mind tragedy and rebuilding: in the midst of being cut off from your rich Android ecosystem, you launch a boat into the sea, holding a promise of a new future built differently. Hope and hubris in one vessel.

Two programming interfaces

In the end, Huawei builds two things: something web-like and something like Flutter. (I don't mean to suggest copying or degeneracy here; it's rather that I can only understand things in relation to other things, and these are my closest points of comparison for what they built.)

The web-like programming interface specifies UIs using an XML dialect, HML, and styles the resulting node tree with CSS. You augment these nodes with JavaScript behavior; the main app is a set of DOM-like event handlers. There is an API to dynamically create DOM nodes, but unlike the other systems we have examined, the HarmonyOS documentation doesn't really sell you on using a high-level framework like Angular.

If this were it, I think Ark would not be so compelling: the programming model is more like what was available back in the DHTML days. I wouldn't expect people to be able to make rich applications that delight users, given these primitives, though CSS animation and the HML loop and conditional rendering from the template system might be just expressive enough for simple applications.

The more interesting side is the so-called "declarative" UI programming model which exposes a Flutter/React-like interface. The programmer describes the "what" of the UI by providing a tree of UI nodes in its build function, and the framework takes care of calling build when necessary and of rendering that tree to the screen.

Here I need to show some example code, because it is... weird. Well, I find it weird, but it's not too far from SwiftUI in flavor. A small example from the fine manual:

@Entry
@Component
struct MyComponent {
  build() {
    Stack() {
        Image($rawfile('Tomato.png'))
        Text('Tomato')
            .fontSize(26)
            .fontWeight(500)
    }
  }
}

The @Entry decorator (*) marks this struct (**) as being the main entry point for the app. @Component marks it as being a component, like a React functional component. Components conform to an interface (***) which defines them as having a build method which takes no arguments and returns no values: it creates the tree in a somewhat imperative way.

But as you see the flavor is somewhat declarative, so how does that work? Also, build() { ... } looks syntactically a lot like Stack() { ... }; what's the deal, are they the same?

Before going on to answer this, note my asterisks above: these are concepts that aren't in JavaScript. Indeed, programs written for HarmonyOS's declarative framework aren't JavaScript; they are in a dialect of TypeScript that Huawei calls ArkTS. In this case, an interface is a TypeScript concept. Decorators would appear to correspond to an experimental TypeScript feature, looking at the source code.

But struct is an ArkTS-specific extension, and Huawei has actually extended the TypeScript compiler to specifically recognize the @Component decorator, such that when you "call" a struct, for example as above in Stack() { ... }, TypeScript will parse that as a new expression type EtsComponentExpression, which may optionally be followed by a block. When Stack() is invoked, its children (instances of Image and Text, in this case) will be populated via running the block.

Now, though TypeScript isn't everyone's bag, it's quite normalized in the JavaScript community and not a hard sell. Language extensions like the handling of @Component pose a more challenging problem. Still, Facebook managed to sell people on JSX, so perhaps Huawei can do the same for their dialect. More on that later.

Under the hood, it would seem that we have a similar architecture to Flutter: invoking the components creates a corresponding tree of elements (as with React Native's shadow tree), which then are lowered to render nodes, which draw themselves onto layers using Skia, in a multi-threaded rendering pipeline. Underneath, the UI code actually re-uses some parts of Flutter, though from what I can tell HarmonyOS developers are replacing those over time.

Restrictions and extensions

So we see that the source language for the declarative UI framework is TypeScript, but with some extensions. It also has its restrictions, and to explain these, we have to talk about implementation.

Of the JavaScript mobile application development frameworks we discussed, Capacitor and NativeScript used "normal" JS engines from web browsers, while React Native built their own Hermes implementation. Hermes is also restricted, in a way, but mostly inasmuch as it lags the browser JS implementations; it relies on source-to-source transpilers to get access to new language features. ArkTS—that's the name of HarmonyOS's "extended TypeScript" implementation—has more fundamental restrictions.

Recall that the Ark compiler was originally built for Android apps. There you don't really have the ability to load new Java or Kotlin source code at run-time; in Java you have class loaders, but those load bytecode. On an Android device, you don't have to deal with the Java source language. If we use a similar architecture for JavaScript, though, what do we do about eval?

ArkTS's answer is: don't. As in, eval is not supported on HarmonyOS. In this way the implementation of ArkTS can be divided into two parts, a frontend that produces bytecode and a runtime that runs the bytecode, and you never have to deal with the source language on the device where the runtime is running. Like Hermes, the developer produces bytecode when building the application and ships it to the device for the runtime to handle.

Incidentally, before we move on to discuss the runtime, there are actually two front-ends that generate ArkTS bytecode: one written in C++ that seems to only handle standard TypeScript and JavaScript, and one written in TypeScript that also handles "extended TypeScript". The former has a test262 runner with about 10k skipped tests, and the latter doesn't appear to have a test262 runner. Note, I haven't actually built either one of these (or any of the other frameworks, for that matter).

The ArkTS runtime is itself built on a non-language-specific common Ark runtime, and the set of supported instructions is the union of the core ISA and the JavaScript-specific instructions. Bytecode can be interpreted, JIT-compiled, or AOT-compiled.

On the side of design documentation, it's somewhat sparse. There are some core design docs; readers may be interested in the rationale to use a bytecode interface for Ark as a whole, or the optimization overview.

Indeed ArkTS as a whole has a surfeit of optimizations, to an extent that makes me wonder which ones are actually needed. There are source-to-source optimizations on bytecode, which I expect are useful if you are generating ArkTS bytecode from JavaScript, where you probably don't have a full compiler implementation. There is a completely separate optimizer in the eTS part of the run-time, based on what would appear to be a novel "circuit-based" IR that bears some similarity to sea-of-nodes. Finally the whole thing appears to bottom out in LLVM, which of course has its own optimizer. I can only assume that this situation is somewhat transitory. Also, ArkTS does appear to generate its own native code sometimes, notably for inline cache stubs.

Of course, when it comes to JavaScript, there are some fundamental language semantics and there is also a large and growing standard library. In the case of ArkTS, this standard library is part of the run-time, like the interpreter, compilers, and the garbage collector (generational concurrent mark-sweep with optional compaction).

All in all, when I step back from it, it's a huge undertaking. Implementing JavaScript is no joke. It appears that ArkTS has done the first 90% though; the proverbial second 90% should only take a few more years :)

Evaluation

If you told a younger me that a major smartphone vendor switched from Java to JavaScript for their UI, you would probably hear me react in terms of the relative virtues of the programming languages in question. At this point in my career, though, the only thing that comes to mind is what an expensive proposition it is to change everything about an application development framework. 200 people over 5 years would be my estimate, though of course teams are variable. So what is it that we can imagine that Huawei bought with a thousand person-years of investment? Towards what other local maximum are we heading?

Startup latency

I didn't mention it before, but it would seem that one of the goals of HarmonyOS is in the name: Huawei wants to harmonize development across the different range of deployment targets. To the extent possible, it would be nice to be able to write the same kinds of programs for IoT devices as you do for feature-rich smartphones and tablets and the like. In that regard one can see through all the source code how there is a culture of doing work ahead-of-time and preventing work at run-time; for example see the design doc for the interpreter, or for the file format, or indeed the lack of JavaScript eval.

Of course, this wide range of targets also means that the HarmonyOS platform bears the burden of a high degree of abstraction; not only can you change the kernel, but also the JavaScript engine (using JerryScript on "lite" targets).

I mention this background because sometimes in news articles and indeed official communication from recent years there would seem to be some confusion that HarmonyOS is just for IoT, or aimed to be super-small, or something. In this evaluation I am mostly focussed on the feature-rich side of things, and there my understanding is that the developer will generate bytecode ahead-of-time. When an app is installed on-device, the AOT compiler will turn it into a single ELF image. This should generally lead to fast start-up.

However it would seem that the rendering library that paints UI nodes into layers and then composits those layers uses Skia in the way that Flutter did pre-Impeller, which to be fair is a quite recent change to Flutter. I expect therefore that Ark (ArkTS + ArkUI) applications also experience shader compilation jank at startup, and that they may be well-served by tessellating their shapes into primitives like Impeller does, so that they can precompile a fixed, smaller set of shaders.

Jank

Maybe it's just that apparently I think Flutter is great, but ArkUI's fundamental architectural similarity to Flutter makes me think that jank will not be a big issue. There is a render thread that is separate from the ArkTS thread, so like with Flutter, async communication with main-thread interfaces is the main jank worry. And on the ArkTS side, ArkTS even has a number of extensions to be able to share objects between threads without copying, should that be needed. I am not sure how well-developed and well-supported these extensions are, though.

I am hedging my words, of course, because I am missing a bit of social proof; HarmonyOS is still in infant days, and it doesn't have much in the way of a user base outside China, from what I can tell, and my ability to read Chinese is limited to what Google Translate can do for me :) Unlike other frameworks, therefore, I haven't been as able to catch a feel of the pulse of the ArkUI user community: what people are happy about, what the pain points are.

It's also interesting that unlike iOS or Android, HarmonyOS is only exposing these "web-like" and "declarative" UI frameworks for app development. This makes it so that the same organization is responsible for the software from top to bottom, which can lead to interesting cross-cutting optimizations: functional reactive programming isn't just a developer-experience narrative, but it can directly affect the shape of the rendering pipeline. If there is jank, someone in the building is responsible for it and should be able to fix it, whether it is in the GPU driver, the kernel, the ArkTS compiler, or the application itself.

Peak performance

I don't know how to evaluate ArkTS for peak performance. Although there is a JIT compiler, I don't have the feeling that it is as tuned for adaptive optimization as V8 is.

At the same time, I find it interesting that HarmonyOS has chosen to modify JavaScript. While it is doing that, could they switch to a sound type system, to allow the kinds of AOT optimizations that Dart can do? It would be an interesting experiment.

As it is, though, if I had to guess, I would say that ArkTS is well-positioned for predictably good performance with AOT compilation, although I would be interested in seeing the results of actually running it.

Aside: On the importance of storytelling

In this series I have tried to be charitable towards the frameworks that I review, to give credit to what they are trying to do, even while noting where they aren't currently there yet. That's part of why I need a plausible narrative for how the frameworks got where they are, because that lets me have an idea of where they are going.

In that sense I think that Ark is at an interesting inflection point. When I started reading documentation about ArkUI and HarmonyOS and all that, I bounced out—there were too many architectural box diagrams, too many generic descriptions of components, too many promises with buzzwords. It felt to me like the project was trying to justify itself to a kind of clueless management chain. Was there actually anything here?

But now when I see the adoption of a modern rendering architecture and a bold new implementation of JavaScript, along with the willingness to experiment with the language, I think that there is an interesting story to be told, but this time not to management but to app developers.

Of course you wouldn't want to market to app developers when your system is still a mess because you haven't finished rebuilding an MVP yet. Retaking my charitable approach, then, I can only think that all the architectural box diagrams were a clever blind to avoid piquing outside interest while the app development kit wasn't ready yet :) As and when the system starts working well, presumably over the next year or so, I would expect HarmonyOS to invest much more heavily in marketing and developer advocacy; the story is interesting, but you have to actually tell it.

Aside: O platform, my platform

All of the previous app development frameworks that we looked at were cross-platform; Ark is not. It could be, of course: it does appear to be thoroughly open source. But HarmonyOS devices are the main target. What implications does this have?

A similar question arises in perhaps a more concrete way if we start with the mature Flutter framework: what would it mean to make a Flutter phone?

The first thought that comes to mind is that having a Flutter OS would allow for the potential for more cross-cutting optimizations that cross abstraction layers. But then I think, what does Flutter really need? It has the GPU drivers, and we aren't going to re-implement those. It has the bridge to the platform-native SDK, which is not such a large and important part of the app. You get input from the platform, but that's also not so specific. So maybe optimization is not the answer.

On the other hand, a Flutter OS would not have to solve the make-it-look-native problem; because there would be no other "native" toolkit, your apps won't look out of place. That's nice. It's not something that would make the platform compelling, though.

HarmonyOS does have this embryonic concept of app mobility, where like you could put an app from your phone on your fridge, or something. Clearly I am not doing it justice here, but let's assume it's a compelling use case. In that situation it would be nice for all devices to present similar abstractions, so you could somehow install the same app on two different kinds of devices, and they could communicate to transfer data. As you can see here though, I am straying far from my domain of expertise.

One reasonable way to "move" an app is to have it stay running on your phone while the phone just communicates pixels with your fridge (or whatever); this is the low-level solution. HarmonyOS appears to be going for the higher-level solution, where the app actually runs logic on the device. In that case it would make sense to ship UI assets and JavaScript / extended TypeScript bytecode to the device, which would run the app with an interpreter (for low-powered devices) or use JIT/AOT compilation. The Ark runtime itself would live on all devices, specialized to their capabilities.

In a way this is the Apple WatchOS solution (as I understand it); developers publish their apps as LLVM bitcode, and Apple compiles it for the specific devices. A FlutterOS with a Flutter run-time on all devices could do something similar. As with WatchOS you wouldn't have to ship the framework itself in the app bundle; it would be on the device already.

Finally, publishing apps as some kind of intermediate representation also has security benefits: as the OS developer, you can ensure some invariants via the toolchain that you control. Of course, you would have to ensure that the Flutter API is sufficiently expressive for high-performance applications, while also not having arbitrary machine code execution vulnerabilities; there is a question of language and framework design as well as toolchain and runtime quality of implementation. HarmonyOS could be headed in this direction.

Conclusion

Ark is a fascinating effort that holds much promise. It's also still in motion; where will it be when it anneals to its own local optimum? It would appear that the system is approaching usability, but I expect a degree of churn in the near-term as Ark designers decide which language abstractions work for them and how to, well, harmonize them with the rest of JavaScript.

For me, the biggest open question is whether developers will love Ark in the way they love, say, React. In a market where Huawei is still a dominant vendor, I think the material conditions are there for a good developer experience: people tend to like Flutter and React, and Ark is similar. Huawei "just" needs to explain their framework well (and where it's hard to explain, to go back and change it so that it is explainable).

But in a more heterogeneous market, to succeed Ark would need to make a cross-platform runtime like the one Flutter has and engage in some serious marketing efforts, so that developers don't have to limit themselves to targeting the currently-marginal HarmonyOS. Selling extensions to JavaScript will be much more difficult in a context where the competition is already established, but perhaps Ark will be able to productively engage with TypeScript maintainers to move the language so it captures some of the benefits of Dart that facilitate ahead-of-time compilation.

Well, that's it for my review round-up; hope you have enjoyed the series. I have one more pending article, speculating about some future technologies. Until then, happy hacking, and see you next time.

April 30, 2023

Udev Rules for Dirtywave M8

This post is very likely not for you. It’s for future me.

The little magic box that is the Dirtywave M8 tracker is pretty well supported in Linux. It works great as an audio device (input and output), it does USB MIDI, and you can also use its remote display using laamaa’s m8c, which now also does audio monitoring.

M8c isn’t available as an app, so it’s a bit of a hassle to build it and use it from within a toolbx. Regular Linux distro chore. In addition, updating the firmware, which Timothy pushes very frequently and which brings amazing new functionality, requires adding udev rules to make the device writable by a regular user. Which is what this post is about. I have no clue what I’m doing, but having this config in /etc/udev/rules.d/50-myusb.rules does the job (the first rule is for the regular m8c device and the second for the second stage of the firmware update using tytools):

# The regular device used by m8c.
SUBSYSTEMS=="usb", ATTRS{idVendor}=="16c0", ATTRS{idProduct}=="048a", GROUP="users", MODE="0666"
# The raw HID device used by tytools during the second stage of a firmware update.
KERNEL=="hidraw*", ATTRS{idVendor}=="16c0", GROUP="users", MODE="0666"
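
After saving the file, you can apply the rules without rebooting using the standard udevadm commands (nothing M8-specific here):

sudo udevadm control --reload-rules
sudo udevadm trigger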

Enjoy my latest track, Sines of our fathers, if you don’t care about any of the above ;)

April 29, 2023

The unbearable tightness of printing

Let's say you want to print a full colour comic book in the best possible quality. For simplicity we'll use this image as an example.

As you can probably guess, just putting this image in a PDF does not work, even if it had sufficient resolution. Instead what you need to do is to create two images: one for the linework, which is monochrome and has at least 600 PPI, and one for the colours, which is typically a 300 PPI colour-managed CMYK TIFF.

The colour image is drawn first and then the monochrome image is drawn on top of it. In this way you get both smooth colours and crisp linework. Most people would stop here, but this is where the actual work begins. It is also where things start to wander into undocumented (or, rather, "implementation defined") territory.

Printing in true black

In computer monitors the blackest colour possible is when all colour components are off, or (0, 0, 0) in RGB values. Thus you might expect that the blackest CMYK colour is either (0, 0, 0, 1) or (1, 1, 1, 1). Surprisingly it is neither. The former looks grayish when printed whereas the latter can't be printed at all because of physical limitations. If you put too much ink in one place on the page, the underlying paper gets too wet, warps and might even rip. And tear.

Instead what you need to do is to use a colour called rich black. Each print shop has their own values for this, as the exact amount of inks to use to get the deepest black colour is dependent on the inks, paper and printing machine used. We'll use the value (0.1, 0.1, 0.1, 1.0) for rich black in this text.

Thus we need three different images rather than two.

First the colour image is laid down, then the image holding the areas that should be printed in rich black. This is a 300 PPI colour image with the colour value (0.1, 0.1, 0.1, 0) on pixels that should be painted with rich black. Finally the linework is drawn on top of the other two. The first two images can be combined into one; this is usually done by graphic artists when preparing their artwork for print. However, the middle image can be automatically generated from the linework image with some Python, so we're doing that to reduce manual work and the possibility of human error.
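
That Python step is small enough to sketch here. A minimal version using Pillow, with made-up file names (the erosion at the end implements the trapping shrink discussed in the next section):

from PIL import Image, ImageFilter

# Load the monochrome linework: white background, black lines.
line = Image.open("linework.png").convert("L")
# The mask is white (255) wherever ink will be laid down.
mask = line.point(lambda v: 255 if v < 128 else 0)
# Shrink the mask slightly so the rich black region stays hidden
# under the linework (the trapping step described below).
mask = mask.filter(ImageFilter.MinFilter(5))
mask.save("richblack_mask.png")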

If you create a PDF with these images you are still not done. In fact the output would be identical to the previous setup. There are still more quirks to handle.

Trapping and overprinting

Since all of the colours are printed separately they are subject to misregistration. That is, the various colours might shift relative to each other during the printing process. This causes visual artifacts in the edges between two colours. This is a fairly complicated topic; Wikipedia has more details. This issue can be fixed by trapping, that is, "spreading" the colour under the "edge" of the linework. Like so:

If you look closely at the middle image, the gray area is slightly smaller than in the previous picture. This shrunk image can be automatically generated from the linework image with morphological erode/dilate operations. Now we have everything needed to print things properly, but if you actually try it, it still won't work.

The way the PDF imaging model works is that if you draw on the canvas with any colour, all colour channels of the existing colour on the page get affected. That is, if the existing colour on the canvas is (0.1, 0.1, 0.1, 0) and you draw on top of it with (0, 0, 0, 1) the output is (0, 0, 0, 1). All the work we did getting the proper rich black colour under the linework gets erased as if it was never there.

PDF has a feature called overprinting to handle this exact case (you could also use the "multiply" filter but it requires the use of transparency, which is still prohibited in some workflows). It does pretty much what it says on the tin. When overprinting is enabled any draw operations accumulate over the existing inks. Thus the final step is to enable overprinting for the final line work image and then Bob's your uncle?

In theory yes. In practice lol no, because this part of the PDF specification is about as hand-wavy as things go. There are several toggles that affect how overprinting gets handled. What they actually do is only explained in descriptive text. One of the outcomes of this is that every single generally available PDF viewer renders the output incorrectly. Poppler, Ghostscript, Apple Preview and even Adobe Acrobat Reader all produce outputs that are incorrect in different ways. They don't even warn you that the PDF uses overprinting and that the output might be incorrect. This makes development and debugging this use case somewhat challenging.

The only way to get correct output is to use Adobe Acrobat Pro and tell it to enable overprint simulation. Fortunately I have a friend who has a 10 year old version (remember, back when you could actually buy software once and keep using it as opposed to a monthly license that can get yanked at any time?). After pestering him with an endless flow of test PDFs I finally managed to work out the exact steps needed to make this work:

  • Create a 300 PPI image with the colours, a 300 or 600 PPI monochrome image with the rich black areas and a 600 PPI monochrome image for the linework (the rich black image can be autogenerated from the linework image and/or precomposited in the colour image)
  • Load and draw the colour image as usual
  • Load the rich black image and store it as a PDF ImageMask rather than a plain image
  • Set nonstroke colour to (0.1, 0.1, 0.1, 0), set the rich black image as a stencil and fill it
  • Load the linework image as an imagemask
  • Enable overprinting
  • Set the overprint mode (OPM) to 1
  • Set nonstroke colour to (0, 0, 0, 1)
  • Draw the line image as a stencil
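
In raw PDF content stream terms, the sequence looks roughly like the following sketch (the XObject and ExtGState resource names here are made up; the real ones are defined in the page's resource dictionary):

/Im0 Do                % draw the colour image
0.1 0.1 0.1 0 k        % nonstroke colour: rich black tint, zero K
/Im1 Do                % fill the rich black ImageMask stencil
/GS0 gs                % ExtGState turning on overprint with OPM 1
0 0 0 1 k              % nonstroke colour: pure K
/Im2 Do                % fill the linework ImageMask stencil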

If you deviate from any of the above steps, the output will be silently wrong. If you process the resulting PDF with anything except Adobe's tool suite the end result might become silently wrong. As an example here is the output of colour separation using Adobe Acrobat and Ghostscript.

Acrobat has preserved the rich black values under the linework whereas Ghostscript has instead cleared the colour value to zero losing the "rich" part of black. Interestingly Ghostscript seems to handle overprinting correctly in basic PDF shape drawing operations but not in stencil drawing operations.

Or maybe it does and Acrobat is incorrect here. The only way to know for sure would be to print test samples on a dozen or so commercial offset printing presses, inspecting the plates manually and then seeing what ends up on paper. Sadly I don't have the resources for that.

April 28, 2023

Flatseal 2.0

To celebrate Flatseal reaching 800,000 downloads on Flathub 🤯, a new release is out! Flatseal 2.0 comes with improved visuals powered by GTK 4 and Libadwaita and, with that, a few quality of life improvements and bug fixes.

Kudos to @natasria for the initial work on porting the user interface to GTK 4 and Libadwaita, and to @A6GibKm for helping with the reviews and making a lot of further improvements to that work.

As a result, the user interface looks cleaner and incorporates many of Libadwaita’s goodies like the mobile-friendly about dialog, among other things.

Also, this release includes a few quality of life improvements and bug fixes, e.g. typing to search for applications and a popover with suggestions for auto-completing XDG directories. See the full list here.

There are a few improvements and new features that didn’t make it into the release, e.g. incorporating even more Libadwaita widgets, new translations, and detecting when applications have been installed, removed or updated to allow users to reload the applications list. So expect a follow-up release shortly.

Last but never least, a big thanks to @rusty-snake for always keeping an eye on the issue tracker and answering people’s questions.

#93 Snapshot

Update on what happened across the GNOME project in the week from April 21 to April 28.

GNOME Incubating Apps

Maximiliano 🥑 says

Snapshot was recently accepted into the Incubator group and its first preview release is out. Snapshot aims to be the next-generation camera app for GNOME, supporting both desktop and mobile devices.

You can get Snapshot at Flathub.

GNOME Core Apps and Libraries

Libadwaita

Building blocks for modern GNOME apps using GTK4.

Alice (she/they) announces

after almost a year, breakpoints have finally landed in libadwaita. They allow making arbitrary layout changes depending on the bin/window size and aspect ratio, with the tradeoff of losing automatic minimum size calculation. This finally makes it possible to do things such as adding a bottom bar on narrow sizes without issues, and enables a lot of designs that were previously impractically hard to implement.

Breakpoints can also be used directly on AdwWindow and AdwApplicationWindow for convenience.
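
As a rough sketch of the API shape (based on the in-development API at the time of writing, via PyGObject, so names and details may still change):

import gi
gi.require_version("Adw", "1")
gi.require_version("Gtk", "4.0")
from gi.repository import Adw, Gtk

def build_window(app):
    window = Adw.ApplicationWindow(application=app)
    bottom_bar = Gtk.ActionBar(revealed=False)
    # (content layout omitted for brevity)
    # Reveal the bottom bar only when the window is narrow.
    condition = Adw.BreakpointCondition.parse("max-width: 400sp")
    bp = Adw.Breakpoint.new(condition)
    bp.add_setter(bottom_bar, "revealed", True)
    window.add_breakpoint(bp)
    return window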

GLib

The low-level core library that forms the basis for projects such as GTK and GNOME.

Philip Withnall says

GLib has just acquired an internal list of pending GTasks, for debugging what’s going on in your app using gdb. Use it by calling print g_task_print_alive_tasks() in gdb. See https://gitlab.gnome.org/GNOME/glib/-/merge_requests/3404

GNOME Circle Apps and Libraries

Sophie 🏳️‍🌈 🏳️‍⚧️ ✊ says

This week, Telegraph joined GNOME Circle. With Telegraph, you can translate Morse code back and forth. Congratulations!

Amberol

Plays music, and nothing else.

Emmanuele Bassi reports

You thought I forgot about Amberol, but here we are, with a new release! And what a release this is, with lots and lots of fixes big and small:

  • you can now restore the playlist from your last session
  • background playback can be toggled
  • there’s a quick mute/unmute button
  • no duplicate songs in the playlist
  • the UI has been slightly tweaked to avoid confusing the volume scale with the song position
  • the readability of the drop overlay has been improved
  • the base run time and dependencies have been updated to GNOME 44
  • lots and lots of big and small fixes

Plus, Amberol is now verified on the new Flathub website, and we’re close to 100k downloads in a bit over a year of development!

Third Party Projects

Flatseal

A graphical utility to review and modify permissions of Flatpak applications.

Martín Abente Lahaye says

To celebrate Flatseal reaching 800,000 downloads on Flathub 🤯, a new release is out! Flatseal 2.0 comes with improved visuals powered by GTK 4 and Libadwaita and, with that, a few quality of life improvements and bug fixes.

Kudos to natasria for the initial work on porting the user interface to GTK 4 and Libadwaita, and to A6GibKm for helping with the reviews and making a lot of further improvements to that work.

Download on Flathub!

Miscellaneous

Casper Meijn announces

A long-standing issue with the GNOME CI templates is fixed: Rust apps were built three times by the template. The old template executed a normal build, then rebuilt all the dependencies in preparation for the tests, and then accidentally built again during the test execution. With the recent change, the CI job rewrites the Flatpak manifest to enable run-tests, and Flatpak takes care of executing the test suite during the build.

https://gitlab.gnome.org/GNOME/citemplates/-/issues/5
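
(For context, run-tests is a per-module flatpak-builder option; a hypothetical, abridged module using it could look like this:)

{
    "name": "my-rust-app",
    "buildsystem": "simple",
    "build-commands": ["cargo build --release"],
    "run-tests": true,
    "test-commands": ["cargo test"]
}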

GNOME Foundation

Rosanna says

(Some of the hats I wear here at the Foundation include work on the Travel Committee, the Code of Conduct Committee, the Executive Committee, the Finance Committee, as well as the staff liaison to the Board of Directors. For the sake of clarity, I will avoid using the word “we” without specifying which group I am referring to.)

With another successful LAS conference completed, the Foundation staff spent this week recovering, traveling, or switching gears. With scheduling for GUADEC happening this week, the travel committee is also getting into gear to go through travel requests. I have also been invoicing the generous folks who have committed to sponsoring GUADEC this year.

The Code of Conduct Committee has been made aware that the sending of reports to discourse is not working for some people. The committee and staff are working on an alternative at blogs.gnome.org/coc/. Reports can already be sent via this webpage, but it does need a bit more work. It needs to allow anonymous reports and to link to the code of conduct. Only those on the Code of Conduct Committee and moderators on gitlab.gnome.org have access to these reports. If you wish to report on anyone on that list, you can still personally email a report to the other members of the committee.

I have had some meetings with some nonprofit bookkeepers in an effort to find help to lighten my load. I am looking forward to carving out more time to be able to catch up on some of my other tasks.

In other goings-on this week, I have also corresponded with the 401K accountant to figure out some discrepancies for a tax filing, updated the employee handbook, paid some bills, filled out and uploaded some forms for compliance, and spent time at the bank. Just another regular week working for the Foundation.

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

April 26, 2023

structure and interpretation of flutter

Good day, gentle hackfolk. Like an old-time fiddler I would appear to be deep in the groove, playing endless variations on a theme, in this case mobile application frameworks. But one can only recognize novelty in relation to the familiar, and today's note is a departure: we are going to look at Flutter, a UI toolkit based not on JavaScript but on the Dart language.

The present, from the past

Where to start, even? The problem is big enough that I'll approach it from three different angles: from the past, from the top, and from the bottom.

With the other frameworks we looked at, we didn't have to say much about their use of JavaScript. JavaScript is an obvious choice, in 2023 at least: it is ubiquitous, has high quality implementations, and as a language it is quite OK and progressively getting better. Up to now, "always bet on JS" has had an uninterrupted winning streak.

But winning is not the same as unanimity, and Flutter and Dart represent an interesting pole of contestation. To understand how we got here, we have to go back in time. Ten years ago, JavaScript just wasn't a great language: there were no modules, no async functions, no destructuring, no classes, no extensible iteration, no optional arguments to functions. In addition it was hobbled with a significant degree of what can only be called accidental sloppiness: with, which can dynamically alter a lexical scope; direct eval, which can define new local variables; Function.caller; and so on. Finally, larger teams were starting to feel the need for more rigorous language tooling that could use types to prohibit some classes of invalid programs.

All of these problems in JavaScript have been addressed over the last decade, mostly successfully. But in 2010 or so if you were a virtual machine engineer, you might look at JavaScript and think that in some other world, things could be a lot better. That's effectively what happened: the team that originally built V8 broke off and started to work on what became Dart.

Initially, Dart was targeted for inclusion in the Chrome web browser as an alternate "native" browser language. This didn't work, for various reasons, but since then Dart grew the Flutter UI toolkit, which has breathed new life into the language. And this is a review of Flutter, not a review of Dart, not really anyway; to my eyes, Dart is spiritually another JavaScript, different but in the same family. Dart's implementation has many interesting aspects as well that we'll get into later on, but all of these differences are incidental: they could just as well be implemented on top of JavaScript, TypeScript, or another source language in that family. Even if Flutter isn't strictly part of the JavaScript-based mobile application development frameworks that we are comparing, it is valuable to the extent that it shows what is possible, and in that regard there is much to say.

Flutter, from the top

At its most basic, Flutter is a UI toolkit for Dart. In many ways it is like React. Like React, its interface follows the functional-reactive paradigm: programmers describe the "what", and Flutter takes care of the "how". Also, like the phenomenon in which new developers can learn React without really knowing JavaScript, Flutter is the killer app for Dart: Flutter developers mostly learn Dart at the same time that they pick up Flutter.

In some other ways, Flutter is the logical progression of React, going in the same direction but farther along. Whereas React-on-the-web takes the user's declarative specifications of what the UI should look like and lowers them into DOM trees, and React Native lowers them to platform-native UI widgets, Flutter has its own built-in layout, rasterization, and compositing engine: Flutter draws all the pixels.

This has the predictable challenge that Flutter has to make significant investments so that its applications don't feel out-of-place on their platform, but on the other hand it opens up a huge space for experimentation and potential optimization: Flutter has the potential to beat native at its own game. Recall that with React Native, the result of the render-commit-mount process is a tree of native widgets. The native platform will surely then perform a kind of layout on those widgets, divide them into layers that correspond to GPU textures, paint those layers, then composite them to the screen -- basically, what a web engine will do.

What if we could instead skip the native tree and go directly to the lower GPU layer? That is the promise of Flutter. Flutter has the potential to create much smoother and more complex animations than the other application development frameworks we have mentioned, with lower overhead and energy consumption.

In practice... that's always the question, isn't it? Again, please accept my usual caveat that I am a compilers guy moonlighting in the user interface domain, but my understanding is that Flutter mostly lives up to its promise, but with one significant qualification which we'll get to in a minute. But before that, let's traverse Flutter from the other direction, coming up from Dart.

Dart, from the bottom

To explain some aspects of Dart I'm going to tell a just-so story that may or may not be true. I know and like many of the Dart developers, and we have similar instincts, so it's probably not too far from the truth.

Let's say you are the team that originally developed V8, and you decide to create a new language. You write a new virtual machine that looks like V8, taking Dart source code as input and applying advanced adaptive compilation techniques to get good performance. You can even be faster than JS because your language is just a bit more rigid than JavaScript is: you have traded off expressivity for performance. (Recall from our discussion of NativeScript that expressivity isn't a value judgment: there can be reasons to pay for more "mathematically appealing operational equivalences", in Felleisen's language, in exchange for applying more constraints on a language.)

But, you fail to ship the VM in a browser; what do you do? The project could die; that would be annoying, but you work for Google, so it happens all the time. However, a few interesting things happen around the same time that will cause you to pivot. One is a concurrent experiment by Chrome developers to pare the web platform down to its foundations and rebuild it. This effort will eventually become Flutter; while it was originally based on JS, eventually they will choose to switch to Dart.

The second thing that happens is that recently-invented smart phones become ubiquitous. Most people have one, and the two platforms are iOS and Android. Flutter wants to target them. You are looking for your niche, and you see that mobile application development might be it. As the Flutter people continue to experiment, you start to think about what it would mean to target mobile devices with Dart.

The initial Dart VM was made to JIT, but as we know, Apple doesn't let people do this on iOS. So instead you look to write a quick-and-dirty ahead-of-time compiler, based on your JIT compiler that takes your program as input, parses and baseline-compiles it, and generates an image that can be loaded at runtime. It ships on iOS. Funnily enough, it ships on Android too, because AOT compilation allows you to avoid some startup costs; forced to choose between peak performance via JIT and fast startup via AOT, you choose fast startup.

It's a success, you hit your product objectives, and you start to look further, to a proper ahead-of-time compiler to native code that can stand alone without the full Dart run-time. After all, if you have to compile at build-time, you might as well take the time to do some proper optimizations. You actually change the language to have a sound type system so that the compiler can make program transformations that are valid as long as it can rely on the program's types.

Fun fact: I am told that the shift to a sound type system actually started before Flutter and thus before AOT, because of a Dart-to-JavaScript compiler that you inherited from the time in which you thought the web would be the main target. The Dart-to-JS compiler used to be a whole-program compiler; this enabled it to do flow-sensitive type inference, resulting in faster and smaller emitted JavaScript. But whole-program compilation doesn't scale well in terms of compilation time, so Dart-to-JS switched to separate per-module compilation. But then you lose lots of types! The way to recover the fast-and-small-emitted-JS property was through a stronger, sound type system for Dart.

At this point, you still have your virtual machine, plus your ahead-of-time compiler, plus your Dart-to-JS compiler. Such riches, such bounty! It is not a bad situation to be in, in 2023: you can offer a good development experience via the just-in-time compiled virtual machine. Apparently you can even use the JIT on iOS in developer mode, because attaching ptrace to a binary allows for native code generation. Then when you go to deploy, you make a native binary that includes everything.

For the web, you also have your nice story, even nicer than with JavaScript in some ways: because the type checker and ahead-of-time compiler are integrated in Dart, you don't have to worry about WebPack or Vite or minifiers or uglifiers or TypeScript or JSX or Babel or any of the other things that JavaScript people are used to. Granted, the tradeoff is that innovation is mostly centralized with the Dart maintainers, but currently Google seems to be investing enough so that's OK.

Stepping back, this story is not unique to Dart; many of its scenes also played out in the world of JavaScript over the last 5 or 10 years as well. Hermes (and QuickJS, for that matter) does ahead-of-time compilation, albeit only to bytecode, and V8's snapshot facility is a form of native AOT compilation. But the tooling in the JavaScript world is more diffuse than with Dart. With the perspective of developing a new JavaScript-based mobile operating system in mind, the advantages that Dart (and thus Flutter) has won over the years are also on the table for JavaScript to win. Perhaps even TypeScript could eventually migrate to have a sound type system, over time; it would take a significant investment but the JS ecosystem does evolve, if slowly.

(Full disclosure: while the other articles in this series were written without input from the authors of the frameworks under review, through what I can only think was good URL guesswork, a draft copy of this article leaked to Flutter developers. Dart hacker Slava Egorov kindly sent me a mail correcting a number of misconceptions I had about Dart's history. Fair play on whoever guessed the URL, and many thanks to Slava for the corrections; any remaining errors are wholly mine, of course!)

Evaluation

So how do we expect Flutter applications to perform? If we were writing a new mobile OS based on JavaScript, what would it mean in terms of performance to adopt a Flutter-like architecture?

Startup latency

Flutter applications are well-positioned to start fast, with ahead-of-time compilation. However they have had problems realizing this potential, with many users seeing a big stutter when they launch a Flutter app.

To explain this situation, consider the structure of a typical low-end Android mobile device: you have a small number of not-terribly-powerful CPU cores, but attached to the same memory you also have a decent GPU with many cores. For example, the SoC in the low-end Moto E7 Plus has 8 CPU cores and 128 GPU cores (texture shader units). You could paint widget pixels into memory from either the CPU or the GPU, but you'd rather do it in the GPU because it has so many more cores: in the time it takes to compute the color of a single pixel on the CPU, on the GPU you could do, like, 128 times as many, given that the comparison is often between multi-threaded rasterization on the GPU versus single-threaded rasterization on the CPU.

Flutter has always tried to paint on the GPU. Historically it has done so via a GPU back-end to the Skia graphics library, notably used by Chrome among other projects. But, Skia's API is a drawing API, not a GPU API; Skia is the one responsible for configuring the GPU to draw what we want. And here's the problem: configuring the GPU takes time. Skia generates shader code at run-time for rasterizing the specific widgets used by the Flutter programmer. That shader code then needs to be compiled to the language the GPU driver wants, which looks more like Vulkan or Metal. The process of compilation and linking takes time, potentially seconds, even.

The solution to "too much startup shader compilation" is much like the solution to "too much startup JavaScript compilation": move this phase to build time. The new Impeller rendering library does just that. However to do that, it had to change the way that Flutter renders: instead of having Skia generate specialized shaders at run-time, Impeller instead lowers the shapes that it draws to a fixed set of primitives, and then renders those primitives using a smaller, fixed set of shaders. These primitive shaders are pre-compiled at build time and included in the binary. By switching to this new renderer, Flutter should be able to avoid startup jank.

Jank

Of all the application development frameworks we have considered, to my mind Flutter is the best positioned to avoid jank. It has the React-like asynchronous functional layout model, but "closer to the metal"; by skipping the tree of native UI widgets, it can potentially spend less time for each frame render.

When you start up a Flutter app on iOS, the shell of the application is actually written in Objective-C++. On Android it's the same, except that it's Java. That shell then creates a FlutterView widget and spawns a new thread to actually run Flutter (and the user's Dart code). Mostly, Flutter runs on its own, rendering frames to the GPU resources backing the FlutterView directly.

If a Flutter app needs to communicate with the platform, it passes messages across an asynchronous channel back to the main thread. Although these messages are asynchronous, this is probably the largest potential source of jank in a Flutter app, outside the initial frame paint: any graphical update which depends on the answer to an asynchronous call may lag.

Peak performance

Dart's type system and ahead-of-time compiler optimize for predictable good performance rather than the more variable but potentially higher peak performance that could be provided by just-in-time compilation.

This story should probably serve as a lesson to any future platform. The people that developed the original Dart virtual machine had a built-in bias towards just-in-time compilation, because it allows the VM to generate code that is specialized not just to the program but also to the problem at hand. A given system with ahead-of-time compilation can always be made to perform better via the addition of a just-in-time compiler, so the initial focus was on JIT compilation. On iOS of course this was not possible, but on Android and other platforms where this was available it was the default deployment model.

However, even Android switched to ahead-of-time compilation instead of the JIT model in order to reduce startup latency: doing any machine code generation at all at program startup was more work than was needed to get to the first frame. One could add JIT back again on top of AOT but it does not appear to be a high priority.

I would expect that Capacitor could beat Dart in some raw throughput benchmarks, given that Capacitor's JavaScript implementation can take advantage of the platform's native JIT capability. Does it matter, though, as long as you are hitting your frame budget? I do not know.

Aside: An escape hatch to the platform

What happens if you want to embed a web view into a Flutter app?

If you think on the problem for a moment I suspect you will arrive at the unsatisfactory answer, which is that for better or for worse, at this point it is too expensive even for Google to make a new web engine. Therefore Flutter will have to embed the native WebView. However Flutter runs on its own threads; the native WebView has its own process and threads but its interface to the app is tied to the main UI thread.

Therefore either you need to make the native WebView (or indeed any other native widget) render itself to (a region of) Flutter's GPU backing buffer, or you need to copy the native widget's pixels into their own texture and then composite them in Flutter-land. It's not so nice! The Android and iOS platform view documentation discuss some of the tradeoffs and mitigations.

Aside: For want of a canvas

There is a very funny situation in the React Native world in which, if the application programmer wants to draw to a canvas, they have to embed a whole WebView into the React Native app and then proxy the canvas calls into the WebView. Flutter is happily able to avoid this problem, because it includes its own drawing library with a canvas-like API. Of course, Flutter also has the luxury of defining its own set of standard libraries instead of necessarily inheriting them from the web, so when and if they want to provide equivalent but differently-shaped interfaces, they can do so.

Flutter manages to be more expressive than React Native in this case, without losing much in the way of understandability. Few people will have to reach down to the canvas layer, but it is nice to know it is there.

Conclusion

Dart and Flutter are terribly attractive from an engineering perspective. They offer a delightful API and a high-performance, flexible runtime with a built-in toolchain. Could this experience be brought to a new mobile operating system as its primary programming interface, based on JavaScript? React Native is giving it a try, but I think there may be room to take things further to own the application from the program all the way down to the pixels.

Well, that's all from me on Flutter and Dart for the time being. Next up, a mystery guest; see you then!

April 25, 2023

The life of a GUI application

Some generalities about the life of a GUI application: what problems developers usually encounter, with some solutions.

At time t0 - decisions when creating a new app

When creating a new application, some decisions need to be made. The ones that interest us in this article:

  • Which GUI toolkit to use.
  • Which Human Interface Guidelines (HIG) to follow.

These are often inter-related. And we tend to choose the latest versions of both at the time of the decision.

Examples:

  • In the past, one could choose GTK 2 with the GNOME 2 HIG (with the classic menubar and toolbar).
  • At the time of writing, one could choose GTK 4 (with libadwaita) with the latest GNOME HIG (with adaptive layouts, suitable for both desktop and mobile devices).

(Note that it's also possible to write one backend plus several frontends, to have for instance a GTK frontend, a Qt frontend, terminal frontends (CLI, TUI), and so on, without forgetting the web and the cloud of course).

First bursts of development

Once the GUI toolkit (and version) is chosen, plus the HIG, development starts!

One thing to note already is that the application developer is - in practice - limited by what the GUI toolkit provides (and what is easily consumable). Even though it's possible for the app developer to implement custom widgets based on lower-level API, it's much harder to do so and many app developers don't go that route.

So what the GUI toolkit provides shapes what the application looks like, how features are presented to the user and how they are implemented.

We would prefer to have as little frontend code as possible, to focus on the backend and features, but in practice we realize that the frontend is not that easy to implement, after all. A GUI application is not a batch program! Especially if we want to create a real product with a long lifetime, and a revenue stream (incidentally).

Several years later, the GUI toolkit and HIG evolve

This is a fact of life; design is in constant evolution. As such, the HIG evolves, and the toolkit API typically reflects the changes over time (be it with APIs that become deprecated, or with new major versions that introduce new APIs and new ways of doing things).

But we have created an application! Most users like it the way it is, and perhaps we are earning a living thanks to it.

So, what to do?

By implementing the new design, we know that it will attract potential new users, and that the application won't become completely irrelevant a decade from now. But it has its own risks.

In my humble opinion, the perfect HIG/design doesn't exist. All have strengths and weaknesses. As a consequence, radically changing the design of an application will make some users unhappy and contribute negatively to the churn rate (temporarily, we hope).

From a technical point of view, adapting the codebase to the new HIG and API can involve a lot of work (it depends on how big the application is; an older application usually has more features). When we heavily modify the code, there can be regressions, etc. It's sometimes possible to quickly adapt the code, but then the code is awkward to read and needs a lot of cleanup afterwards (it's a technical debt). If that cleanup is never done, the codebase becomes messier and messier, with bugs lurking a bit everywhere. (General stability also contributes a lot to the popularity of an application, so technical debt must be taken on with care, if at all.)

Some solutions to keep the best of both worlds

Having a clear separation between the backend and the frontend is a good solution. "Only" the frontend code needs to be adapted or rewritten. The backend can be written in a library fashion (it can be an internal/static library with an unstable API, or a shared library with more API stability guarantees). That way, the backend will have a longer lifetime.

Give the user the choice between several design alternatives, like LibreOffice does, for example: when starting the application for the first time, ask users what they prefer (with an option in the preferences dialog to change their mind later).

Or follow a software product line approach: have a different application for each main design/HIG. With as much code in common as possible between the different apps.

When adapting to the latest HIG is too much work

As outlined above, adapting the frontend to a new design paradigm can be a lot of work. It's technically possible, but if the developers lack the time, it won't be done, especially if everything already works fine.

That said, it would still be valuable to port the application to the newer versions of the GUI toolkit. Because at some point the older toolkit versions become obsolete and no longer receive security fixes.

This is where there can be a problem: if the newer GUI toolkit versions have removed the mechanisms for the older HIGs.

Example: GTK 4 and older GNOME HIG mis-support

To make the discussion concrete, let's give an example with GTK and GNOME.

GTK 4 still has some support for the traditional UI with a menubar and toolbar. But the API has completely changed compared to GTK 2 or the first GTK 3 versions. And certain things are just no longer possible to do (as far as I know; correct me if I'm wrong), such as putting icons in menu items, or showing a longer description in the statusbar when a menu item is hovered with the mouse.

So basically, GTK 4 has removed some of the mechanisms needed to implement the older design. As explained, it's just software, so an application is able to implement this itself with the lower-level APIs, but it's a lot of work.

Thus, some application developers are stuck with older versions of the GUI toolkit. Without good solutions at their disposal.

Sandboxing as a solution

We mentioned the security problem of running apps that use a very old version of a GUI toolkit. The solution is to run such apps sandboxed, in a container or virtual machine.

Developing a GUI toolkit - mechanism versus policy

I fully acknowledge that developing a GUI toolkit is not an easy task. Here are some general thoughts / philosophy about how things could be better from the toolkit point of view. (Easier said than done, as usual).

I first heard about "mechanism versus policy" when reading the beginning of Linux Device Drivers (see chapter 1, The Role of the Device Driver). A device driver should expose everything the hardware is capable of doing (the mechanism). Then higher-level software can choose how those capabilities are used (the policy).

For a GUI toolkit, we could envision a clear split between the mechanism (with a low-level API) and the policy (with higher-level API, following a specific HIG).

The mechanism API would be more stable over time, without feature removals. Since it's low-level, it's not the most convenient to use, but it would always be there in case an application needs it.

The policy API would be more subject to changes, since the HIG evolves. But that doesn't prevent older policies' APIs from being kept around.

(For GTK's case, there is already GDK as the low-level API, but GDK 4 is completely different from GDK 3, to follow Wayland's peculiarities more closely. And the mechanical way of creating classic menus in GTK 3 has fallen through the cracks in GTK 4.)

Conclusion

Developing a GUI application that has existed for a long time (several decades) is very different from developing a brand-new app. But the new app will sooner or later face the same kinds of problems. So I wanted to offer some perspective, and give some advice.

If you've followed some courses on software engineering, or just have read some books (on the subject), what I described in this article should be familiar. Since I talked about libraries (both for the application codebase, and obviously the GUI toolkit as well), the problem of re-usability directly comes to mind.

So as a conclusion, and to give you some food for thought, here are some quotes taken from Facts and Fallacies of Software Engineering (Robert Glass):

Reuse-in-the-small (libraries of subroutines) [...] is a well-solved problem.
Reuse-in-the-large (components) remains a mostly unsolved problem, even though everyone agrees it is important and desirable.
Reuse-in-the-large works best in families of related systems and thus is domain-dependent.

Happy hacking!

Even though it's not possible to write comments on this blog, don't hesitate to drop me an email. I do read them, and like to get feedback.

Apologies

Just wanted to say sorry. Some paradoxes.

(less is more)

For more details, read on.

Edit: rephrasing; publish and improve the text that was initially written in an HTML comment.

Some notes

GTK and GNOME in general

GtkObject/GObject and language bindings (back then, before the C++98 stdlib ISO spec and good GCC support) were a pragmatic solution to welcome and unite several developer communities, the C language being more bindable (simpler ABI calling conventions).

GTK 4, with (some) API breaks compared to GTK 3, was a good solution for time-to-market for long-requested features.

Design is actually important, especially when facing the competition. I find GNOME 44 quite good!

Paradoxes

I'm against proprietary software, the MAGMA (GAFAM), etc. But in practice it's very difficult to avoid them all.

The difference between theory and practice tends to be very small in theory, but in practice it is very large indeed.
Anonymous

Quote copied from Open Sources (Michael Tiemann's chapter).

Acknowledgments

The GNOME Brussels Events at (almost) each new GNOME stable release that I came to (from ~2011 onward).

Various people I talked to at the two GUADECs I came to (Strasbourg and Almería). Traveling a long distance is something that scares me a lot.
Especially for Almería, where I had to go to a funeral the day just before taking a "plain" plane at 6am, sleeping (maybe) 1 hour at most during the night in-between. A very sad moment. But not all tears are bad things.

Loads of FOSDEMs I came to.

The local LUG (LouviLUG) that I have the luck to have nearby, and that I've been organizing for a long time now :-) (although I'm not the one who created it; others passed me the baton).

Overall, I meet and I've met really great minds, and I'm thankful for that.

What's next

I don't really know (or, I prefer not telling you everything). But what I'm sure about is that I need to change one or more things in my life.

Putting away the dark side of my past, keeping what's positive. And probably reversing the roles between job and hobbies (so, keeping CS as a hobby only, and doing a retraining and career change). I now approach this with more serenity.

Wish me good luck :-) !

Even though it's not possible to write comments on this blog, don't hesitate to drop me an email. I do read them, and like to get feedback and to exchange ideas.

April 24, 2023

Notes on using Freedom to block digital distractions

“Escape” by Metaphox is licensed under CC BY 2.0, via Openverse.

I’ve been using freedom.to more or less since it launched. I was recently asked by someone how I use it, and since it didn’t fit in the text box on that platform, here’s a mid-ish-long set of notes on how I use it.

  • Everywhere: It’s on all my devices, no exceptions—iPad, phone, desktop work computer.
  • Scheduled sessions: I use what Freedom calls “sessions” to schedule blocks of time that block out distracting sites. My current blocked sessions are:
    • Working hours: Mostly self-explanatory, starts an hour after school drop-off so I can do some messing around or necessary social stuff before work. Also includes a lunch break every day (actually implemented as two sessions), though unfortunately that isn’t synced to my calendar so sometimes I am unblocked while working and then blocked during my actual lunch.
    • Morning self-care block: There’s an hour in the morning (before family breakfast) that blocks social plus work stuff. The ideal (not always respected) is that I use this time mostly for fitness or meditation, not the device. Am sorely tempted to replace this with just keeping the device well away (probably upstairs) and using an Apple TV device for fitness/meditation.
    • Evening family block: From roughly when my son gets home to his bedtime, blocks primarily social. Can still do some work if sorely needed but try to avoid social during that time.
  • Add sites freely: If I’m at all tempted by a site, I add it to the Blocklist. Better to overblock than underblock. (I understand why it probably can’t, but it’d be neat if Freedom monitored web usage and said “I notice that you’ve visited this more than X times today, would you like to block it?” Bonus for: “and add its RSS feed to your feed reader?”) In particular, I mass-block sports, social, and news sites—you simply don’t need those things most of the time.

Because of my recent participation in the Mind Over Tech Accelerator trainings (which I cannot endorse enough if you struggle with digital focus!), I’ve made some experimental changes to my Freedom use. I’m not sure yet if these will stick long-term:

  • Breakfast-time session: I’m experimenting with blocking more things during my family breakfast (6:50-7:50) and just relying on voice control or a shared family kitchen iPad for the few things I need to do during that block. In other words, using Freedom to force me not to touch my own devices during this block of family time.
  • Pomodoro: I’ve been experimenting with Forest for pomodoro timing, but I do wonder if I could use Freedom for that instead.
  • Apps: Freedom just added the ability to block apps, but I don’t use this yet. Not sure if I will, because most of the things that suck my time can be dealt with by blocking their network connections.

My challenges with Freedom (suggestions welcome!):

  • Reliability: Because Freedom is inherently doing something that the operating system doesn’t love (interfering with your use of the device!) it sometimes stops working, and you have to re-engage it. Which is fine, unless… I’m doing something that it wants to block. For example, during a recent trip, it disengaged and it was more convenient for various reasons not to turn it back on. Which was fine except now I’m back home, I should turn it back on, and I’m… not. (I’ll do it right after I post this! Probably!)
  • Social: My work does sometimes genuinely require use of social, which is hard if you’ve… blocked all social with Freedom. I don’t have a great solution to this, but people are mostly reasonable if you say “actually, I have social blocked on this machine, is it OK if [I do it later, you screenshot it, etc.]”
  • Location: This is a very First World Problem, but when I travel to another timezone, Freedom assumes I’m trying to cheat (not unreasonable!) and stays on my home timezone. I wish it would check my actual location (GPS?) and update automatically to match where I actually am.
  • Slack: My most distracting friend group chat is on Slack, and there is no good way (as far as I know) to block only my friend Slack while not blocking work Slack. I solve that on my primary work computer by simply not logging into friend-Slack on that machine, but on other devices I use during the work day (most painfully my iPad) it can be a distraction.

Overall this is an amazing product, and if you struggle with digital distractions I highly recommend it.

Thanks everyone for Linux App Summit 2023!

We just wrapped up the 2023 edition of Linux App Summit here in Brno, and the conference was a blast! I was delighted to see so many friends and new people. I feel the future of Desktop Linux and our ecosystem is in very good hands!

For those who came to Brno and those participating virtually, I hope you also had a positive experience at the conference and in our city. If you missed the conference, you can watch the presentations on the LAS YouTube channel.

I hope to see you all soon at other events, such as GUADEC 2023.

Thank you!

Group photo of the 2023 Linux App Summit attendees

April 23, 2023

Shotwell 0.32.0

Hello everyone, as “teased” over on discourse, April 22nd saw the release of Shotwell’s new stable version, 0.32.0.

New features include:

  • More image formats: AVIF, CR3, HEIF/HEVC, JPEG XL & WebP
  • Some initial geographical data handling (The map display will likely return in 33)
  • Multiple libraries and settings that are isolated from each other
  • Support for more than one account for publishing services
  • Manual tagging of persons
  • Automatic detection and recognition of persons can be enabled at compile time
  • HiDPI support for the image viewer
  • Simpler handling of hierarchical tags

The new version should pop up on Flathub soon. If you were subscribing to the GNOME nightly version, I would also recommend switching to the Flathub version, as the GNOME nightly version will start to become quite unstable soonish.

April 21, 2023

2023-04-21 Friday

  • Out for a run with J. early. Mail chew, plugged away at some consistency checks to detect unhelpful 3rd parties that like to eg. delete the filesystem underneath a running process, and then wonder why things stop working.
  • Various partner calls; into Cambridge to meet up with the very charming Niels of OpenProject fame, and his brother.
  • Took H. home from work, dinner, back to cleaning up loose ends from the day. Fixed up J's counselling website so http works (was never tested), and mended some old blog links.

April 20, 2023

2023-04-20 Thursday

  • Wedding anniversary - 21 years - what a lovely time we've had together: what a blessing. To work!
  • Technical planning call, COOL community call, K8s design call, catch up with Andras, ESC meeting. Apparently my blood says I need more vitamin-D - fair enough.
  • Mihai making good progress, concrete base, bits of roof, fibreglass & epoxy.

How GTK3 themes work in Flatpak

There seems to be a lot of misinformation and low quality content out there on how to use a theme with Flatpak. So I’m going to break down how it all works.

Clients

Before we talk about Flatpak we have to talk about how GTK3 itself decides what theme you use.

Wayland

If you use Wayland this is very simple: GTK3 talks to xdg-desktop-portal-gtk to get your theme name from the host. The setting location on the host is in GSettings:

gsettings set org.gnome.desktop.interface gtk-theme 'Adwaita-dark'

gnome-tweaks sets this value for you, so I’d recommend just using it. Other desktop tools may set this value in their respective settings applications.

X11

If you are on X11 it relies upon a standard known as XSettings. How to configure this is less straightforward. On GNOME it uses gsd-xsettings as part of the gnome-settings-daemon project and it reads the GSettings value discussed above.

If you use a different daemon like xsettingsd you have to set Net/ThemeName in its configuration file. Other desktops may have their own daemons that need to be configured.
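
For example, with xsettingsd the theme is a single line in its configuration file (typically ~/.xsettingsd):

Net/ThemeName "Adwaita-dark"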

Getting themes inside of Flatpak

Now that your theme is actually configured properly, you need to get the theme files inside of Flatpak. You can often run a single command, flatpak update, and everything will just work. It reads the GSetting discussed above and downloads a packaged version of the theme.

Unpackaged themes

If no package for the theme was found, a common direction people go in is modifying permissions to add folders to the sandbox. I don’t recommend this. Instead, here is how you can package your theme yourself:

#!/bin/bash
# Copy the currently configured GTK3 theme into a per-user Flatpak
# extension so sandboxed apps can load it.
DEFAULT_ARCH=$(flatpak --default-arch)
THEME_NAME=$(gsettings get org.gnome.desktop.interface gtk-theme | tr -d \')
THEME_EXTENSION_DIR=~/.local/share/flatpak/extension/org.gtk.Gtk3theme.$THEME_NAME/$DEFAULT_ARCH/3.22

mkdir -p "$THEME_EXTENSION_DIR"

cp -r /usr/share/themes/"$THEME_NAME"/gtk-3.0/* "$THEME_EXTENSION_DIR"
This simply copies your theme into a private flatpak extension and everything should work fine as long as there aren’t weird symlinks in your theme. Replace /usr/share/themes with a different directory like ~/.themes if needed.

April 19, 2023

Building a GStreamer plugin in Rust with meson instead of cargo

Over the Easter holidays I spent some time on a little experiment: How hard is it to build a GStreamer plugin in Rust with meson instead of using the default Rust build system cargo?

meson is a community-developed, open-source, cross-platform (including Windows), multi-language build system that is supposed to be fast and easy to use. It’s nowadays used as a build system by a lot of components of the GNOME desktop, by GStreamer, systemd, Mesa, X.org, QEMU and many other pieces of Free Software. Mostly for building C or C++ code, but also for configuring and installing Python or JavaScript code, Vala, and other languages.

Wouldn’t it be nice if we could also build software in Rust with it, build it together with existing code in other languages and have a common build system for it all? What would be the advantages and disadvantages of that, and what’s the current status of Rust support in meson? How much effort is it to make use of all the existing 100k software packages (“crates”) that are available for Rust?

Especially as most of the projects mentioned before are looking into adopting Rust more or less seriously as a safer and more modern successor to C, these seem like useful questions to investigate. Anecdotally, I also heard that a maintainer of one of these projects said that being able to use the same build system as the rest of the codebase would be a requirement to even consider the language. Another project is starting to write some components in Rust and building them with meson, but without depending on any external Rust dependencies for now.

Another reason for looking into this was that there seems to be the opinion that you can’t really use any build system apart from cargo for building Rust code, and that using meson would be very hard to impossible and involve a lot of disadvantages. This has led to all GNOME applications written in Rust currently having a chimera of a build system using both meson and cargo, because neither of the two does everything these applications need. Such a setup is hard to maintain and debug, and probably almost nobody really understands it. cargo’s design does not make embedding into another build system easy, and both cargo and meson have very specific, and to some degree incompatible, opinions about how things have to be done. Let’s see if that’s actually necessary and what’s missing to move away from that. As Facebook is apparently using buck to build part of their Rust code, and Google bazel, this shouldn’t really be impossible.

As I’m a GStreamer developer, the maintainer of its Rust bindings and the one who started writing plugins for it in Rust, trying to build a GStreamer plugin in Rust with meson instead of cargo seemed like the obvious choice for this experiment.

However, everything here applies in the same way to building GTK applications with its Rust bindings or similarly to any of the software mentioned before for writing components of them in Rust.

EDIT: After publishing the first version I was told that meson actually supports a couple of things that I missed before. For example, running Rust tests is already supported just fine (but more verbose in the build definition than cargo), and meson can already install executables setuid root by itself. Also there is an equivalent to cargo init, meson init, which makes starting new projects a bit more convenient. Also I wrote that cargo clippy is not really supported yet. While true, you can get around that by telling meson (or any other build system) to use clippy-driver as compiler instead of using rustc. I’ve updated the text below accordingly.

Summary

The code for my experiment can be found here, but at the time of writing it needs some changes to meson that are not merged yet. A list of those changes with some explanation can be found further below. The git repository also includes a cargo build system that gives the same build results for comparison purposes.

I should also make clear that this is only an experiment at this point: while it works fine, it is more manual work than necessary, and if you depend on a lot of existing crates from crates.io then you probably want to wait a bit before considering meson. More on that later. However, if you don’t have to depend on a lot of crates, your codebase is relatively self-contained, and maybe it even has to be built together with C code, then meson is already a viable alternative and has some advantages to offer compared to cargo. But also some disadvantages.

Almost all of the manual work I did as part of this experiment can be automated, and a big part of that is not going to be a lot of work either. I didn’t do that here to get an idea of the problems that would actually be encountered in practice when implementing such an automated system. I’ll get to that in more detail at the very end.

In summary I would say that

  • meson is less convenient and less declarative than cargo, but in exchange more powerful and flexible
  • meson’s Rust support is not very mature yet and there’s very little tooling integration
  • cargo is great and easy to use if your project falls into the exact pattern it handles but easily becomes annoying for you and your users otherwise
  • the developer experience is much better with cargo currently but the build and deployment experience is better with meson

More on each of these items below.

Procedure

As a goal I wanted to build one of the parts of the gst-plugins-rs tutorial, specifically an identity-style element that simply passes through its input, and build that into a GStreamer plugin that can be loaded into any GStreamer process. For that it has to be built as a cdylib in Rust terms: a shared library that offers a C ABI. For this purpose, meson already has a big advantage over cargo in that it can actually install the build result in its correct place while cargo can only install executables right now. But more on that part later too.

The main task for this was translating all the Rust crate dependencies from cargo to meson, manually one by one. Overall 44 dependencies were needed, and I translated them into so-called meson wrap dependencies. The wrap dependency system of meson, similar to cargo, allows downloading the source code of dependencies from another location (in this case crates.io) and extending it with a patch, in this case to add a meson-based build system to each.

In practice this meant creating a lot of meson.build files based on the Cargo.toml of each crate. The following is the meson.build for the toml_edit crate.

project('toml_edit-rs', 'rust',
  version : '0.19.8',
  meson_version : '>= 1.0.0',
  default_options : ['buildtype=debugoptimized',
                     'rust_std=2021'
                    ]
)

rustc = meson.get_compiler('rust')

toml_datetime_dep = dependency('toml_datetime-rs', version : ['>= 0.6', '< 0.7'])
winnow_dep = dependency('winnow-rs', version : ['>= 0.4', '< 0.5'])
indexmap_dep = dependency('indexmap-rs', version : ['>= 1.9.1', '< 2.0'])

features = []
features += ['--cfg', 'feature="default"']

toml_edit = static_library('toml_edit', 'src/lib.rs',
  rust_args : features,
  rust_crate_type : 'rlib',
  dependencies : [toml_datetime_dep, winnow_dep, indexmap_dep],
  pic : true,
)

toml_edit_dep = declare_dependency(link_with : toml_edit)

and the corresponding wrap file

[wrap-file]
directory = toml_edit-0.19.8
source_url = https://crates.io/api/v1/crates/toml_edit/0.19.8/download
source_filename = toml_edit-0.19.8.tar.gz
source_hash = 239410c8609e8125456927e6707163a3b1fdb40561e4b803bc041f466ccfdc13
diff_files = toml_edit-rs.meson.patch

[provide]
toml_edit-rs = toml_edit_dep

As can be seen from the above, this could all be autogenerated from the corresponding Cargo.toml, and that’s the case for a lot of crates. There have also been plans and ideas in the meson community for quite a while to actually develop such a tool, but so far nothing has materialized. Maybe my experiment can provide some motivation to actually start that work.

For simplicity, when translating these by hand I didn’t consider including

  • any optional dependencies unless needed for my tasks
  • any tests, examples, executables as part of the crate
  • any cargo feature configuration beyond exactly what I needed

All of this can be easily done with meson though, and an automated tool for translating Cargo.toml into meson wraps should easily be able to handle that. For cargo features there are multiple ways of mapping them to meson, so some conventions have to be defined first of all. Similarly for naming Rust crates as meson dependencies it will be necessary to define some conventions to allow sharing between projects.

The hard part of translating some of the crates was the cargo build.rs build scripts. These build scripts allow running arbitrary Rust code as part of the build, which will also make automatic translation challenging. More on that later.

Once I had translated all the 44 dependencies, including the GStreamer Rust bindings, various procedural Rust macros and a lot of basic crates of the Rust ecosystem, I copied the plugin code into the new repository, changed some minor things and could then write a meson.build file for it.

project('gst-plugin-meson-test', 'rust',
  version : '0.0.1',
  meson_version : '>= 1.0.0',
  default_options : ['buildtype=debugoptimized',
                     'rust_std=2021'
                    ]
)

plugins_install_dir = '@0@/gstreamer-1.0'.format(get_option('libdir'))

rustc = meson.get_compiler('rust')

add_global_arguments(
  '-C', 'embed-bitcode=no',
  language: 'rust'
)

gst_dep = dependency('gstreamer-rs', version : ['>= 0.20', '< 0.21'])
once_cell_dep = dependency('once_cell-rs', version : ['>= 1.0', '< 2.0'])

gst_plugin_meson_test = dynamic_library('gstmesontest', 'src/lib.rs',
  rust_crate_type : 'cdylib',
  rust_dependency_map : {
    'gstreamer' : 'gst',
  },
  dependencies : [gst_dep, once_cell_dep],
  install : true,
  install_dir : plugins_install_dir,
)

To get everything building like this with the latest meson version (1.1.0), at this point some additional changes are needed: 1, 2, 3 and 4. Also, currently all of this only supports native compilation: cross-compilation of proc-macro crates in meson is currently broken, but should not be too hard to fix in the future.

A careful reader will also notice that currently all crates with a dash (-) in their name are named like that for the dependency name, but the library they’re building has the dash replaced by an underscore in its name. This is due to a rustc bug (or undocumented behaviour?), and meson will likely require this translation of forbidden characters in crate names for the foreseeable future.

With all that in place, building the project is a matter of running

$ meson builddir
$ ninja -C builddir

compared to a single command with cargo

$ cargo build

Note that the meson build will be slower because by default meson already builds with optimizations (-C opt-level=2) while cargo does not.
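If you want directly comparable timings, you can tell meson to match cargo’s unoptimized debug profile; a quick sketch:

$ meson setup builddir -Dbuildtype=debug   # "meson setup" is the explicit spelling of the bare "meson builddir" above
$ ninja -C builddir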

Comparison between cargo and meson

In the following sections I’m going to compare some aspects of cargo and meson, and give my general impression of both. To take the conclusion ahead: my perfect build system would include aspects of both cargo and meson, and currently both are lacking in their own ways. As with any other tool (or programming language, for that matter): if you don’t have anything to complain about, you don’t know it well enough yet or don’t know any of the alternatives.

To avoid people focusing only on the negative aspects, I’m also going to try to describe the shortcomings of one build system as a good feature of the other. This is not a competition; both meson and cargo can learn a lot from each other.

Build Times

Above I briefly mentioned build times. Because that’s something everybody is interested in, and a common complaint about Rust, let’s start with that topic even if it’s in my opinion the most boring one.

Overall you would expect both build systems to behave the same if they do the same work, as they are both basically just sophisticated Rust compiler process spawners. If you want to improve build times, your time is probably better spent on the Rust compiler itself and LLVM.

All times below are measured on my system very unscientifically with time. This is all only to give an idea of the general behaviour and to check if there are conceptual inefficiencies or problems. Also, when reproducing these results make sure to pass -Dbuildtype=debug to meson for comparable results between meson and cargo.

  • Running meson (build configuration): 1.4s (1.0s user, 0.4s system)
  • Running ninja (actual build): 10.4s (23.5s user, 3.2s system)
  • Running cargo: 11.0s (34.1s user, 6.6s system)

One thing that shows immediately is that both need approximately the same wall-clock time: the build alone takes slightly less time than cargo, the configure and build steps together slightly more. So far, about what you would expect. However, cargo uses almost 45% more CPU time. I didn’t investigate this in great detail, but the two main reasons for that are likely

  • cargo is building and running a build.rs build script in 23 of the 44 dependencies, which takes a while and also pulls in some more dependencies that are not needed for the meson build
  • meson is currently parallelizing less well compared to cargo when building Rust code, or otherwise the build step would likely be 10% or more faster

cargo build scripts

Interestingly, the main pain point when translating the crate build systems from cargo to meson also seems to be the main factor for cargo being more inefficient than it could be. This also seems to be a known fact in the wider Rust community by now.

But in addition to being inefficient and painful to translate (especially automatically), it is in my opinion also a maintenance nightmare and literally the worst part of cargo. There is no declarative description of what a build script is actually trying to do, it’s very easy to write build scripts that don’t work correctly in other environments, and two crates doing the same thing in a build script are generally going to behave differently in non-obvious ways.

For the crates I translated, the reasons why build scripts existed in 23 of the 44 crates all come down to features that cargo should really provide directly, and that meson already handles directly:

  • Checking which version of the Rust compiler is used and based on that enabling/disabling various features
  • Checking features or versions of the underlying platform or operating system
  • Checking for existence of native, external libraries or even building them
  • Code generation (not needed by these crates, but a common reason for build scripts elsewhere)

Especially the last two points are painful for build and deployment of Rust code, and that every crate has its own special way of solving it in a build script doesn’t make it better. And in the end both tasks are exactly what you have a build system for: building things, tracking dependencies between them and generating an ideal build plan or schedule.

The system-deps crate provides a way of expressing external dependencies declaratively as part of Cargo.toml, and a similar system built into cargo and integrated with the platform’s mechanism for finding libraries would be a huge improvement. Similar approaches for the other aspects would also be helpful, and not just for building crates with a different build system, but also for developers using cargo.

I’m sure that once cargo gained features for handling the four items above there will still be a need for custom build scripts in some situations, but these four items should cover more than 95% of the currently existing build scripts. It’s a mystery to me why there are no apparent efforts being made to improve cargo in this regard.

Good Parts of meson

Now let’s focus on some of the good parts of meson in comparison to cargo.

More flexible and expressive

Generally, meson has a much more flexible and expressive build definition language. It looks more like a scripting language than the toml files used by cargo, and as such appears more complex.

However thanks to this approach, almost anything for which cargo requires custom build scripts, with the disadvantages listed above, can be written directly as part of the meson build definitions. Expressing many of these things in toml for cargo would likely become convoluted and not very straightforward, as can already be seen nowadays with e.g. platform-specific dependencies.

meson provides built-in modules with support for e.g.

  • finding and checking versions of compilers of different languages and other build tools, e.g. for code generation, including the Rust bindgen tool
  • testing compilation of code snippets
  • finding external library dependencies and checking their versions
  • defining shared/static library, executable and code generation build targets together with how (and if) they should be installed later
  • installing (and generating) data files, including specific support for e.g. library metadata (pkg-config), documentation, gobject-introspection, …

As an escape hatch, whenever something can’t be expressed by just meson, it is also possible to run external configuration/build/install scripts that could be written as a Python/shell script, C or Rust executable, or anything else really. This is rarely needed though and the meson project seems to add support for anything that people actually need in practice.

Build configuration system

meson provides an extensible configuration mechanism for the build, which allows the user building the software to customize the build process and its results.

  1. The built-in options allow for configuring things like switching between debug and release builds, defining toolchains and defining where to install build results.

  2. The build options allow each project to extend the above with custom configuration, e.g. for enabling/disabling optional features of the project or generally selecting between different configurations. Apart from simple boolean flags this also allows for other types of configuration, including integers and strings.

cargo only provides one of the two in the form of fixed, non-extensible compiler profiles and configuration.
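From the builder’s perspective, both kinds of options are set with the same command-line mechanism; a sketch where the extra_checks option name is hypothetical:

$ meson setup builddir -Dbuildtype=release -Dprefix=/usr
$ meson configure builddir -Dextra_checks=true   # change an option in an existing build directory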

The second is not to be confused with cargo’s feature flags, which are more of a mechanism for the developer of the software to select the configuration of its dependencies. The meson build configuration, however, is for the builder of the software to select between different configurations. While for executables cargo feature flags are sometimes used in a similar way for boolean configuration, cargo does not provide anything for this in general.

As a workaround for this, a couple of Rust crates are using environment variables together with the env! macro for build configuration but this is fragile and not discoverable, and more of a hack than a real solution.
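The pattern typically looks like this; the variable name is hypothetical, and the crate would read it at compile time with the env! macro:

$ MYCRATE_BACKEND=vulkan cargo build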

Support for Rust and non-Rust code in the same project

meson supports mixing Rust and non-Rust code in the same project, and allows tracking dependencies between targets using different languages in the same way.

While it is not supported to mix Rust and e.g. C code in the same build target due to Rust’s compilation model, it is possible to e.g. build a static Rust library, a static C library and link both together into e.g. a D application or Python module. An example for this would be this Python module that combines C, C++, Rust and Fortran code.

Code generation can be handled in a similar way as in the end code generation is just another transformation from one format into another.

cargo doesn’t directly support anything but Rust code. As usual, build scripts provide a mechanism to get around this limitation. The cc crate for example is widely used to build C code and there are also crates for building meson or cmake based software as part of the build process.

All of this is completely opaque to cargo though and can’t be taken into account for defining an optimal build schedule, can’t be configured from the outside and regularly fails in non-standard build situations (e.g. cross-compilation).

Installation of files other than executables

meson allows every build result to be installed in a configurable location. This is especially useful for more complex applications that might have to provide various data files or come as an executable plus multiple libraries or plugins, or simply for projects that only provide a shared library. If any of the built-in installation mechanisms are not sufficient (e.g. the executable should get specific process capabilities set via setcap), meson also allows customizing the install process via scripts.

cargo only allows installing executables right now. There are cargo extensions that allow for more complex tasks, e.g. cargo xtask, but there is no standard mechanism. There once was an RFC to make cargo’s installation process extensible, but the way this was proposed would suffer from the same problems as the cargo build scripts.

External dependency and library support

In addition to mixing build targets with multiple languages in the same project, meson also has a mechanism to find external dependencies in different ways. If an external dependency is not found, it can be provided and built as part of the project via the wrap mechanism mentioned before. The latter is similar to how cargo handles dependencies, but the former is missing completely and currently implemented via build scripts instead, e.g. by using the pkg-config crate.

As Rust does not currently provide a stable ABI and provides no standard mechanism to locate library crates on the system, this mostly applies to library dependencies written in other languages. meson does support building Rust shared/static libraries and installing them too, but because of the lack of a stable ABI this has to be made use of very carefully.

On the other hand, Rust allows building shared/static libraries that provide the stable C ABI of the platform (cdylib, staticlib crate types). meson allows building these correctly too, and also offers mechanisms for installing them together with their (potentially autogenerated) header files and locating them again later in other projects via e.g. pkg-config.

For cargo this job can be taken care of by cargo-c, including actually building shared libraries correctly by setting e.g. the soname correctly and setting other kinds of versioning information.
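A sketch of that workflow; cargo cbuild and cargo cinstall are the commands cargo-c provides, and the prefix is just an example:

$ cargo install cargo-c
$ cargo cbuild --release
$ cargo cinstall --release --prefix=/usr/local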

Good Parts of cargo

After writing so much about meson and how great it is, let’s now look at some aspects of cargo that are better than what meson provides. Like I said before, both have their good sides.

Simpler and more declarative build definitions

The cargo manifest format is clearly a lot simpler and more declarative than the meson build definition format.

For a simple project it looks more like a project description than something written in a scripting language.

[package]
name = "hello_world"
version = "0.1.0"
edition = "2021"

[dependencies]
anyhow = "1.0"

As long as a project stays within the boundaries of what cargo makes easy to express, which should be the case for the majority of existing Rust projects, it is going to be simpler than meson. The various missing features in cargo that force the use of a build script currently prevent this for many crates, but that seems like something that could be improved easily.

meson on the other hand feels more like writing actual build scripts in some kind of scripting language, and information like “what dependencies does this have” are not as easily visible as from something like a Cargo.toml.

Tooling integration

cargo provides a lot of development tools that make development with it a very convenient and smooth experience. There are also dozens of cargo extension commands that provide additional features on top of cargo.

cargo init creates a new project, cargo add adds new dependencies, cargo check type-checks code without full compilation, cargo clippy runs a powerful linter, cargo doc builds the documentation (incl. dependencies), cargo bench and cargo test allow running tests and benchmarks, cargo show-asm shows the generated, annotated assembly for the code, cargo udeps finds unused dependencies, …

All of this makes development of Rust projects a smooth and well-integrated experience.

In addition, rust-analyzer provides a lot of IDE features for various editors and IDEs via the Language Server Protocol (LSP).

Right now, IDEs and editors are assuming Rust projects to make use of cargo and offer integration with its features.

On the other hand, when using meson almost none of this is currently provided, and development feels less well integrated. Right now the only features meson provides for making Rust development easier are the generation of a rust-project.json for use with rust-analyzer, the ability to run tests in a similar way to cargo, and of course actually building the code. Building documentation could easily be added to meson and is already supported for other languages; something like cargo add for wrap dependencies exists already, and adding crates.io support to it would be possible, but it’s going to take a while and a bit of effort to handle crates with cargo build scripts. Making use of all the cargo extension commands without actually using cargo seems unrealistic.
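For the pieces that do work today, the basic loop looks like this:

$ meson setup builddir
$ meson compile -C builddir
$ meson test -C builddir   # runs tests declared with test() in the build definition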

In the end, cargo is the default build system for Rust and everything currently assumes usage of cargo so using cargo offers the best developer experience.

Rust dependencies

As briefly mentioned above, cargo add makes it extremely easy to add new Rust dependencies to a project and build them as part of it. This helps a lot with code reuse and modularity. That an average Rust project has dozens of direct dependencies, and maybe a hundred or more indirect ones, shows this quite clearly, as does the encapsulation of very small tasks in separate crates instead of the huge multi-purpose libraries common in other languages.

cargo also directly handles updating of dependencies via cargo update, including making sure that only semver compatible versions are automatically updated and allows including multiple incompatible versions of dependencies in the same build if necessary.

In addition to just adding dependencies, cargo features allow for conditional compilation and for defining which parts and features of a dependency should be enabled or not.

meson has some explicit handling of dependencies, and the wrap system also allows building external dependencies as part of the project, but adding or updating dependencies is a more manual process unless they are in the meson wrapdb. For now it is completely manual with regard to Rust crates, though it is no different from adding dependencies in non-Rust languages.

There is also no direct equivalent of the cargo feature flags, which potentially seems like a useful addition to meson.

Next steps

Considering all of the above, there is not really a simple answer to which of the two choices is the best for your project. As of now I would personally always use cargo for Rust projects if I can, especially if they will have other Rust dependencies. It’s a lot more convenient to develop with cargo.

However, various features of meson might make it a better choice for some projects, maybe already now or at least in the future: for example, the ability to handle multiple languages, to handle dependencies and shared libraries correctly, or to install data files together with the application. Or, simply because the rest of the project already uses meson for compiling e.g. C code, meson might be the more natural choice for adding Rust code to the mix.

As outlined above, there are many areas where meson could be improved and where its Rust support is not very mature yet, and to make meson a more viable alternative to cargo these improvements will have to happen sooner or later. Similarly, there are various areas where cargo could be improved and learn from meson; such improvements, in addition to making cargo more flexible and easier to use, would also make it easier for other build systems to handle cargo-based crates.

From a meson point of view, I would consider the following the next steps.

Various bugs and missing features

rustc and cargo

On the Rust compiler side, there are currently two minor issues that would need looking into.

  • rustc#80792: Adding support for passing environment variables via the command-line instead of using actual environment variables. Environment variables are currently used for build configuration by many crates. Allowing them to be provided via the command-line would allow for cleaner and more reliable build rules without accidentally leaking actual environment variables into the build.
  • rustc#110460: Undocumented and maybe unintentional library filename requirement based on the crate name. This currently requires meson to disallow dashes in crate names and to have them explicitly replaced by an underscore, which is something cargo does implicitly. Implicit conversion inside meson would also be feasible but probably not desirable, because there would then be a mismatch between the name of the build target in the build definition and the actual name of the build target when building it. Similar to the mild confusion that some people ran into when noticing that a crate with a dash in the name can only be referenced with underscores from Rust code.

In addition it would be useful to look into moving various features from build scripts into cargo proper, as outlined above. This will probably need a lot of design and discussion effort, and will likely also take years after implementation until the important crates have moved to it, due to the very conservative Rust toolchain version requirement policies of several of these crates.

So on the cargo side that’s more something for the long run but also something that would greatly benefit users of cargo.

meson

On the meson side there are a couple of bugs of different severity that will have to be solved, and probably more that will show up once more people are starting to use meson to build Rust projects.

In addition to the ones I mentioned above that are already merged into the meson git repository, at the time of writing there were for example the following outstanding issues:

  • meson#11681: Add a feature to meson to allow renaming crates when using as a dependency. This is used throughout the Rust ecosystem for handling multiple versions of the same crate at once, or also simply for having a more convenient local name of a dependency.
  • meson#11695: Parallelize the Rust build better by already starting to build the next Rust targets when the metadata of the dependencies is available. This should bring build time improvements of 10% or more on machines with enough cores, and the lack of it is the main reason why the meson build in my experiment was “only” as fast as the cargo build and not faster.
  • meson#11702: Cross-compilation of proc-macro crates is currently using the wrong toolchain and simply doesn’t work.
  • meson#11694: Indirect dependencies of all dependencies are passed onwards on the later compiler invocations, which brings the risk of unnecessary name conflicts and simply causes more work for the compiler than necessary.
  • meson#10030: Add support for passing environment variables to the Rust compiler. As mentioned above, many crates are currently using this for build configuration so this would have to be supported by meson in one way or another.

Apart from the second one in this list, these should all be doable relatively quickly, and so far getting fixes and improvements required for Rust merged into meson has been a fast and pleasant experience. I didn’t encounter any unnecessary bikeshedding or stop energy.

Tooling for managing cargo dependencies

During my experiment I wrote all the meson wrap files manually. This does not really scale, is inconvenient, and also makes it harder to update dependencies later.

The goal here would be to provide a tool in the shape of cargo add and cargo update that can automatically add cargo-based Rust dependencies to a meson project. This has been discussed a lot in the past, and various people in the meson community have ideas and plans around it. meson already has something similar for the wrapdb, meson wrap install and meson wrap update, but the idea would be to have something equivalent (or integrated into that) which directly supports crates from crates.io, so Rust dependencies can be added to a meson project with as little effort as to a cargo project.
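For dependencies that are already in the wrapdb this workflow exists today; a sketch using zlib, a real wrapdb entry, purely as an example:

$ meson wrap install zlib
$ meson wrap update zlib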

Apart from the cargo build scripts this shouldn’t be a lot of effort and a project for an afternoon at most, so maybe I’ll give that a try one of these days if nobody beats me to it.

As part of such a tool, it will also be necessary to define conventions about naming, mapping of cargo features, versioning, etc. of Rust crates inside meson, and this should ideally be done early on to avoid unnecessary churn. The way I did it as part of my experiment has various drawbacks with regard to versioning and needs improvement.

Handling cargo build scripts is a bigger issue though. As my experiment showed, about half of the crates had build scripts. While all of them were more or less trivial, automatically translating this into meson build definitions seems unrealistic to me.

It might be possible to have meson use the cargo build scripts directly in one way or another, or they would have to be translated manually to meson build definitions. The latter would considerably improve build times, so it seems like the better approach, at least for common crates. And for those the meson build definitions could be stored in a central place, like the meson wrapdb, or maybe even be included in the crates on crates.io if their maintainers feel like dealing with two build systems.

Together with all this, some thought will also have to be put into how to locate such Rust dependencies, similar to how pkg-config allows locating shared libraries. For example, Linux distributions will want to package such dependencies and make sure that a project built for such a distribution makes use of the packaged dependencies instead of any other version, or worse, downloading some version from the Internet at build time. The way this is currently handled by cargo is also not optimal for Linux distributions and a couple of other build and deployment scenarios.

Because of the lack of a stable Rust ABI this would mean locating Rust source code.

Tooling integration

And last, as mentioned above there is basically no tooling integration right now apart from being able to build Rust code and using rust-analyzer. meson should at least support the most basic tasks that cargo supports, and that meson already supports for other languages: running tests and benchmarks, running linters and building documentation.

Once those basic tasks are done, it might be worth investigating other tooling integration like the various cargo extension commands offer or extending those commands to handle other build systems, e.g. via the rust-project.json that rust-analyzer uses for that purpose.

April 18, 2023

Status update, 18/04/2023

It’s been a long month, thankfully with a nice holiday in the middle. I am divided between lots of things at work, which makes it hard to focus on any one interesting thing.

The development of large language models is everywhere at the moment, I shared some thoughts on those on the Lines forum. I won’t go in depth here as we’re going to be discussing these things for the next 15 years anyway. Stay safe out there in the dark forest.

I’ve had a little time to poke at GNOME’s OpenQA tests. I made a simple helper tool named ssam_openqa which simplifies the task of running the OpenQA tests locally on your laptop. My current goal is to figure out how to split up the test suite into multiple pieces, as the current cycle time when developing tests is around 5 minutes which is too slow to be fun. And it has to be fun so you will want to contribute 🙂

Here’s some new music!


Also available on Bandcamp.

PSA: upgrade your LUKS key derivation function

Here's an article from a French anarchist describing how his (encrypted) laptop was seized after he was arrested, and material from the encrypted partition has since been entered as evidence against him. His encryption password was supposedly greater than 20 characters and included a mixture of cases, numbers, and punctuation, so in the absence of any sort of opsec failures this implies that even relatively complex passwords can now be brute forced, and we should be transitioning to even more secure passphrases.

Or does it? Let's go into what LUKS is doing in the first place. The actual data is typically encrypted with AES, an extremely popular and well-tested encryption algorithm. AES has no known major weaknesses and is not considered to be practically brute-forceable - at least, assuming you have a random key. Unfortunately it's not really practical to ask a user to type in 128 bits of binary every time they want to unlock their drive, so another approach has to be taken.

This is handled using something called a "key derivation function", or KDF. A KDF is a function that takes some input (in this case the user's password) and generates a key. As an extremely simple example, think of MD5 - it takes an input and generates a 128-bit output, so we could simply MD5 the user's password and use the output as an AES key. While this could technically be considered a KDF, it would be an extremely bad one! MD5s can be calculated extremely quickly, so someone attempting to brute-force a disk encryption key could simply generate the MD5 of every plausible password (probably on a lot of machines in parallel, likely using GPUs) and test each of them to see whether it decrypts the drive.

(things are actually slightly more complicated than this - your password is used to generate a key that is then used to encrypt and decrypt the actual encryption key. This is necessary in order to allow you to change your password without having to re-encrypt the entire drive - instead you simply re-encrypt the encryption key with the new password-derived key. This also allows you to have multiple passwords or unlock mechanisms per drive)

Good KDFs reduce this risk by being what's technically referred to as "expensive". Rather than performing one simple calculation to turn a password into a key, they perform a lot of calculations. The number of calculations performed is generally configurable, in order to let you trade off between the amount of security (the number of calculations you'll force an attacker to perform when attempting to generate a key from a potential password) and performance (the amount of time you're willing to wait for your laptop to generate the key after you type in your password so it can actually boot). But, obviously, this tradeoff changes over time - defaults that made sense 10 years ago are not necessarily good defaults now. If you set up your encrypted partition some time ago, the number of calculations required may no longer be considered up to scratch.
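If you're curious how these cost parameters play out on your own hardware, cryptsetup ships a benchmark mode that times PBKDF2 and the argon2 variants:

cryptsetup benchmark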

And, well, some of these assumptions are kind of bad in the first place! Just making things computationally expensive doesn't help a lot if your adversary has the ability to test a large number of passwords in parallel. GPUs are extremely good at performing the sort of calculations that KDFs generally use, so an attacker can "just" get a whole pile of GPUs and throw them at the problem. KDFs that are computationally expensive don't do a great deal to protect against this. However, there's another axis of expense that can be considered - memory. If the KDF algorithm requires a significant amount of RAM, the degree to which it can be performed in parallel on a GPU is massively reduced. A GeForce 4090 may have 16,384 execution units, but if each password attempt requires 1GB of RAM and the card only has 24GB on board, the attacker is restricted to running 24 attempts in parallel.

So, in these days of attackers with access to a pile of GPUs, a purely computationally expensive KDF is just not a good choice. And, unfortunately, the subject of this story was almost certainly using one of those. Ubuntu 18.04 used the LUKS1 header format, and the only KDF supported in this format is PBKDF2. This is not a memory expensive KDF, and so is vulnerable to GPU-based attacks. But even so, systems using the LUKS2 header format used to default to argon2i, which is memory strong, but not designed to be resistant to GPU attack (thanks to the comments pointing out my misunderstanding here). New versions default to argon2id, which is. You want to be using argon2id.

What makes this worse is that distributions generally don't update this in any way. If you installed your system and it gave you pbkdf2 as your KDF, you're probably still using pbkdf2 even if you've upgraded to a system that would use argon2id on a fresh install. Thankfully, this can all be fixed up in place. But note that if anything goes wrong here you could lose access to all your encrypted data, so before doing anything make sure it's all backed up (and figure out how to keep said backup secure so you don't just have your data seized that way).

First, make sure you're running as up-to-date a version of your distribution as possible. Having tools that support the LUKS2 format doesn't mean that your distribution has all of that integrated, and old distribution versions may allow you to update your LUKS setup without actually supporting booting from it. Also, if you're using an encrypted /boot, stop now - very recent versions of grub2 support LUKS2, but they don't support argon2id, and this will render your system unbootable.

Next, figure out which device under /dev corresponds to your encrypted partition. Run

lsblk

and look for entries that have a type of "crypt". The device above that in the tree is the actual encrypted device. Record that name, and run

sudo cryptsetup luksHeaderBackup /dev/whatever --header-backup-file /tmp/luksheader

and copy that to a USB stick or something. If something goes wrong here you'll be able to boot a live image and run

sudo cryptsetup luksHeaderRestore /dev/whatever --header-backup-file luksheader

to restore it.

(Edit to add: Once everything is working, delete this backup! It contains the old weak key, and someone with it can potentially use that to brute force your disk encryption key using the old KDF even if you've updated the on-disk KDF.)

Next, run

sudo cryptsetup luksDump /dev/whatever

and look for the Version: line. If it's version 1, you need to update the header to LUKS2. Run

sudo cryptsetup convert /dev/whatever --type luks2

and follow the prompts. Make sure your system still boots, and if not go back and restore the backup of your header. Assuming everything is ok at this point, run

sudo cryptsetup luksDump /dev/whatever

again and look for the PBKDF: line in each keyslot (pay attention only to the keyslots, ignore any references to pbkdf2 that come after the Digests: line). If the PBKDF is either "pbkdf2" or "argon2i" you should convert to argon2id. Run the following:

sudo cryptsetup luksConvertKey /dev/whatever --pbkdf argon2id

and follow the prompts. If you have multiple passwords associated with your drive you'll have multiple keyslots, and you'll need to repeat this for each password.
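If the defaults don't fit your hardware or threat model, cryptsetup also lets you tune the argon2id cost parameters during the conversion; a sketch with purely illustrative values:

# --pbkdf-memory is in KiB (1048576 = 1GiB), --iter-time in milliseconds
sudo cryptsetup luksConvertKey /dev/whatever --pbkdf argon2id --pbkdf-memory 1048576 --iter-time 2000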

Distributions! You should really be handling this sort of thing on upgrade. People who installed their systems with your encryption defaults several years ago are now much less secure than people who perform a fresh install today. Please please please do something about this.


April 16, 2023

PDF forms, the standard that seemingly isn't

Having gotten the basic graphical output of A4PDF working, I wanted to see if I could make PDF form generation work.

This was of course a terrible idea but sadly I lacked foresight.

After a lot of plumbing code it was time to start defining form widgets. I chose to start simple and create a form with a single togglable check button. This does not seem like an impossibly difficult problem and the official PDF specification even has a nice code sample for this:

The basic idea is simple. You define the "widget" and give it two "state objects" that contain PDF drawing operations. The idea is that the PDF renderer will draw one of the two on top of the basic PDF document depending on whether the checkbox is toggled or not. The specification's sample code sets things up so that the appearance of the checkbox is one of two different DingBat symbols. Their values are not shown in the specification, but presumably they are a checked box and an empty square.

I created a test PDF with LibreOffice's form designer and then set about trying to recreate it. LO's form generator uses OpenSymbol for the checked status of the checkbox and an empty appearance for the off state. A4PDF uses the builtin Helvetica "X" character. The actual files can be downloaded here.

What we have here is a failure to communicate

No matter how much I tried I could not make form generation actually work. The output was always broken in weird ways I could not explain. Unfortunately this part of the PDF spec is not very helpful, because it does not give out full examples, only snippets, and search engines are worthless at finding any technical content when using "PDF" as a search term. It may even be that information about this is not available in public web sites. Who knows?

Anyhow, when regular debugging does not work, it's time to approach things sideways. Let's start by opening the LibreOffice test document with Okular:

This might seem to be working just fine, but people with sharp eyes might notice a problem. That check mark is not from OpenSymbol. FWICT it is the standard Qt checkbox widget. Still, the checkbox works and its appearance is passable. But what happens if you increase the zoom level?

Oh dear. That "Yes" text is the PDF-internal label given to the "on" state. Why is it displayed? No idea. It's time to bring out the heavy guns and see how things work in The Gold Standard of PDF Rendering, Adobe Reader.

Nope, that's not the OpenSymbol checkmark either. Adobe Reader seems to be ignoring the spec and drawing its own checkmarks instead. After seeing this I knew I had to try this on every PDF renderer I could reasonably get my hands on. Here's the results:

  • Okular
    • LO: incorrect appearance, breaks when zooming
    • A4PDF: shows both the "correct" checkmark as well as the Qt widget on top of each other, takes a noticeable amount of time after clicking until the widget state is updated
  • Evince
    • LO: does not respond to clicks
    • A4PDF: works correctly
  • Adobe Reader win64
    • LO: incorrect appearance
    • A4PDF: incorrect appearance, does not always respond to button clicks
  • Firefox
    • LO: Incorrect appearance
    • A4PDF: Incorrect appearance
  • Chromium
    • LO: Incorrect appearance
    • A4PDF: works correctly
  • Apple Preview
    • LO: works correctly (though the offset is a bit wonky, probably an issue in the drawing commands themselves)
    • A4PDF: works correctly

The only viewer that seems to be working correctly in all cases is Apple Preview.

PDF has a ton of toggleable flags and the like to make things invisible when printing and so on. It is entirely possible that the PDF files are "incorrect" in some way. But still, either the behaviour should be the same on all viewers, or they should report format errors. No errors are reported, though, even by this online validator.

April 05, 2023

Identity v0.5 and Synchronized Scrolled Windows

My university studies and work revolve around image- and video-processing algorithms. I frequently need to compare similar but subtly different output videos: to see how various algorithms solving the same problem behave, or to see the progress as I’m tweaking my own algorithm.

In 2020, I made a GNOME app, Identity, to assist me. It plays multiple videos at once in sync, and lets you switch between them like tabs in a browser. This way you can easily examine the differences at any point.

Identity has seen a number of releases since then and grown a number of helpful features, like zooming or viewing media properties. And now, in v0.5, I have implemented a side-by-side comparison mode. All files are arranged in a row or a column, and their zoom and pan positions are synchronized. You can explore different parts of an image or a video and see how they look across all versions that you opened. This is quite a useful comparison mode, and also more obvious to first-time users.

Identity comparing an image with three upscaling methods in a column

Under the hood, every image sits inside a GtkScrolledWindow, the standard GTK 4 widget that provides scrolling/panning gestures for its child widget, and draws the scroll bars and the overshoot effect.¹ It’s easy to synchronize two or more of these scrolled windows together, but avoiding weird gesture interactions can be tricky. Let’s see how to get them to play along.

Synchronizing Positions #

Scrolled windows use GtkAdjustments to monitor the full size of their child widget and to control its current scroll position. Adjustments are objects with properties for the lower and upper bounds (in our case set to 0 and the full child size), the current value (which is the scroll position), and the step and page increments (which set how far arrow keys and PgUp/PgDown scroll the widget). There are two adjustments in every scrolled window: one for horizontal scrolling, and one for vertical scrolling, called hadjustment and vadjustment.

To synchronize multiple scrolled windows which show widgets of matching size, simply use the same two adjustments for all of them. Scrolling one widget will update the adjustments, causing all other widgets to also update their scroll position.

const shared_hadj = new Gtk.Adjustment();
const shared_vadj = new Gtk.Adjustment();

const scroll1 = new Gtk.ScrolledWindow({
    child: pictures[0],
    hadjustment: shared_hadj,
    vadjustment: shared_vadj,
});
const scroll2 = new Gtk.ScrolledWindow({
    // This pictures[1] widget has the same size as pictures[0].
    child: pictures[1],
    // Same adjustments as above!
    hadjustment: shared_hadj,
    vadjustment: shared_vadj,
});

You can run the full example with gjs -m simple.js:

GTK window with two synchronized scrolled windows.

Despite being a relatively simple and supported use-case, adjustment sharing actually makes conditions more favorable for an allocation loss bug that had plagued some of the more complex GTK 4 apps like Sysprof or GNOME Builder. When I implemented the initial version of side-by-side comparison in Identity, I started hitting the bug as well, very easily (panning a video while it was finishing and seeking back to the start was usually enough). So, I decided to investigate, and a few hours of rr and intense discussion in #gtk later, I managed to fix it! Of course, allocation machinery being very complex, this broke some things, but after a few follow-up fixes by the GTK maintainers, the bug seems to have been at last completely conquered. The fixes are included in GTK 4.10 and should make their way into GTK 4.8.4.

Anyhow, Identity can show and synchronize images of different sizes. Reusing the same adjustments would cause the upper boundaries to mismatch, and things to break. Instead, I keep track of my own, normalized adjustments, which always range from 0 to 1.² They are bound back and forth with the scrolled window adjustments, so that scrolling will cause an update to the normalized adjustments, and vice versa. In turn, the values of the normalized adjustments are bound together between all open images. This way, zooming into the center of one image will set the values to 0.5, which will scroll all other images into their centers, regardless of their current size.³

Finally, watch out for widgets which can change their size depending on the scroll position, like GtkListView with variably-sized items. Scrolling to a particular point may cause such a widget to update the upper boundary of the adjustment and recompute the scroll position relative to what it now believes to be its size. This may cause a cascading reaction with the synchronized widgets, and potentially an infinite loop.

Fixing Kinetic Scrolling #

Scrolled window implements kinetic deceleration for two-finger panning on a touchpad and one-finger panning on a touchscreen—if you swipe your fingers with some speed, the widget will keep scrolling for a bit, until it comes to a halt. At first it may seem that it works fine—you can try it in the simple example above—until you try to pan one widget, and then quickly pan the other widget, while the first one is still decelerating:

For this demonstration, I used the “Simulate Touchscreen” toggle in the Inspector

Something weird is happening: it’s like the widget doesn’t let you pan until the deceleration is over. The reason for this issue is that the pan gesture and the kinetic deceleration live in each scrolled window separately. So when you pan one scrolled window, it starts updating the (shared) adjustment value every frame, and if you try to pan another scrolled window in the meantime, the movement gets continuously overwritten by the first scrolled window.

The workaround is to stop kinetic deceleration on all other scrolled windows when starting the pan. It’s further complicated by the fact that the pan gestures themselves live inside the scrolled window, and you can’t mess with them. Thankfully, you can catch the two-finger touchpad gesture with a GtkEventControllerScroll and the one-finger touchscreen gesture with a GtkGestureDrag:

// Our scrolled windows, for stopping their kinetic scrolling.
const scrolledWindows = [];

function stopKineticScrollingExcluding(source) {
    for (const widget of scrolledWindows) {
        if (widget === source)
            continue;

        // There's no special function to stop kinetic scrolling,
        // but disabling and enabling it works fine.
        widget.set_kinetic_scrolling(false);
        widget.set_kinetic_scrolling(true);

        // Fix horizontal touchpad panning after resetting
        // kinetic scrolling.
        widget.queue_allocate();
    }
}

const shared_hadj = new Gtk.Adjustment();
const shared_vadj = new Gtk.Adjustment();

function createScrolledWindow() {
    // The scrollable widget.
    const picture = new Gtk.Picture({
        file: image,
        can_shrink: false,
    });

    const scrolledWindow = new Gtk.ScrolledWindow({
        child: picture,
        hadjustment: shared_hadj,
        vadjustment: shared_vadj,
    });
    scrolledWindows.push(scrolledWindow);

    // The scroll controller will catch touchpad pans.
    const scrollController = Gtk.EventControllerScroll.new(
        Gtk.EventControllerScrollFlags.BOTH_AXES,
    );
    scrollController.connect('scroll', (scrollController, _dx, _dy) => {
        const device = scrollController.get_current_event_device();
        if (device?.source === Gdk.InputSource.TOUCHPAD) {
            // A touchpad pan is about to start!
            // Let's stop the kinetic scrolling on other widgets.
            stopKineticScrollingExcluding(scrolledWindow);
        }

        // Let the default scrolling work.
        return false;
    });
    picture.add_controller(scrollController);

    // The drag gesture will catch touchscreen pans.
    const dragGesture = new Gtk.GestureDrag();
    dragGesture.connect('drag-begin', (dragGesture, _x, _y) => {
        const device = dragGesture.get_current_event_device();
        if (device?.source === Gdk.InputSource.TOUCHSCREEN) {
            // A touchscreen pan is about to start!
            // Let's stop the kinetic scrolling on other widgets.
            stopKineticScrollingExcluding(scrolledWindow);
        }

        // We don't want to handle the drag.
        dragGesture.set_state(Gtk.EventSequenceState.DENIED);
    });
    picture.add_controller(dragGesture);

    return scrolledWindow;
}

This gives us panning across all widgets with nice kinetic deceleration which doesn’t break. Try the full example with gjs -m reset-kinetic.js:⁴

Touchpad panning works as expected across all scrolled windows

There are two extra complications about this code, both related to touchscreen panning. First, we stop the kinetic scrolling on all scrolled windows excluding the one handling the new event. This is because for some reason resetting the kinetic scrolling like this in the middle of a touchscreen pan prevents it from working (touchpad pans keep working fine).

Second, we queue an allocation on the scrolled windows right after resetting the kinetic scrolling. For whatever reason, resetting the kinetic scrolling causes the scrolled window to stop handling horizontal touchscreen pans altogether (vertical and mixed pans keep working fine). I suspect it’s caused by some logic error related to check_attach_pan_gesture(). This function is called when toggling the kinetic scrolling, breaking the horizontal touchscreen pans. Thankfully, it’s also called at the end of allocation, where it fixes back the touchscreen pans. I haven’t investigated this bug further, but it would be nice to get it fixed.

And that’s it! The code we’ve added also comes in useful for implementing custom gestures like zoom or mouse pan. Just remember that when writing custom gestures, you might need to stop the kinetic scrolling on the current scrolled window too, not only on the linked ones.

Closing Thoughts #

When synchronizing scrolled windows, and just dealing with GTK gesture code in general, make sure to test with different input devices, as each has its own quirks. Be careful when scrollable widgets have different sizes, or can change their size depending on the scroll position.

At a higher level, I think it would be better if the kinetic deceleration lived somewhere around the GtkAdjustments themselves. This way it would also be shared between all synchronized scrolled windows, and the workarounds, along with their oddities, wouldn’t be necessary. Something to keep in mind for GTK 5 perhaps.

When discussing a draft of this post with GTK developers and contributors, another potential GTK 5 idea came up. Different scrollable widgets (GtkViewport, GtkListView, GtkTextView, WebKitGTK’s web view, libshumate’s map) have slightly different needs, and GtkScrollable with GtkScrolledWindow can’t offer them all a unified interface that would work without compromises or big technical hurdles. (The last two examples don’t implement the scrollable interface for these reasons.) So, maybe, instead of GtkScrolledWindow, there should be a collection of helpers, and scrollable widgets should show scrollbars and handle scrolling themselves.

With all that said, if you think Identity might be useful to you, download it from Flathub and give it a try! I’d love to hear your thoughts, ways to contact me are linked at the bottom of this page.

Comparing three videos side-by-side in Identity


  1. The subtle glow that shows up when you try to scroll past the end of a scrollable widget. ↩︎

  2. These normalized adjustments are also responsible for the behavior when resizing the Identity window with zoomed-in images: instead of always expanding to the bottom-left, the images expand around their current scroll position. This is because the normalized adjustments don’t change during resizing. So, for example, a value of 0.25 before and after resizing will keep the image scrolled to 25% of its size. ↩︎

  3. This is not the only way to share position between differently sized scrollable widgets, just one that makes sense for Identity’s comparison use-case. You could imagine some other use-case where it makes more sense to share the pixel position, rather than the normalized position. It can be implemented using the same idea of two extra adjustments. It’ll work fine as long as different widgets don’t try to overwrite the upper bound on the same adjustment with different values. ↩︎

  4. Unfortunately, “Simulate Touchscreen” won’t help you see this fix; you’ll need a real touchpad or touchscreen. At the moment, the toggle does not change the device types that the gesture code receives, so it doesn’t run our workaround code. To test Identity, I’ve been using the work-in-progress Mutter SDK branch which has a compositor-level touchscreen emulation. ↩︎

GTK 4.11.1

Here is the first GTK snapshot of the new development cycle. A lot of things fell into place recently, so it is worth taking some time to go through the details of what is new, and what you can expect to see in 4.12.

List View Improvements

The family of GtkListView, GtkColumnView and GtkGridView widgets was one of the big additions in GTK 4. They are meant to replace GtkTreeView, but up until now, this was clearly still a bit aspirational.

In GTK 4.10, we’ve finally taken the big step to port GtkFileChooser away from tree views—a sign that list views are ready for prime time. And the next GTK 4 release will bring a number of missing features:

  • Finally, a fix for the longstanding scrolling bug
  • Better keyboard navigation, with customizable tab behavior
  • Focus control
  • Programmatic scrolling
  • Sections, maybe

Some of these are already available in 4.11.1. We even managed to backport the scrolling fix to 4.10.1.
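As a small taste of the new APIs, here is roughly what programmatic scrolling looks like from GJS (a sketch based on the in-development 4.11 API, so names and details may still change; it assumes listView is an existing Gtk.ListView):

// Scroll item 42 into view, selecting and focusing it on the way.
listView.scroll_to(
    42,
    Gtk.ListScrollFlags.FOCUS | Gtk.ListScrollFlags.SELECT,
    null, // optional GtkScrollInfo
);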

Better Textures

Textures are used frequently in GTK’s GL renderer—for icons and images, for glyphs, and for intermediate offscreen rendering. Most of the time, we don’t have to think about them; they just work. But if the texture is the main content of your app, such as in an image viewer, you need a bit more control over it, and it is important that the corner cases work correctly.

In GTK 4.10, we introduced a GskTextureScale node, which gives applications control over the filtering that is applied when scaling a texture up or down. This lets apps request the use of mipmaps with GSK_SCALING_FILTER_TRILINEAR. GTK 4.12 will automatically use mipmaps when it is beneficial.
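From GJS, for example, appending such a node in a custom widget’s snapshot vfunc could look roughly like this (a sketch; it assumes this._texture already holds a GdkTexture, and that Gsk and Graphene are imported from gi):

vfunc_snapshot(snapshot) {
    const bounds = new Graphene.Rect();
    bounds.init(0, 0, this.get_width(), this.get_height());

    // Request trilinear filtering so GSK can use mipmaps when
    // scaling the texture down.
    const node = Gsk.TextureScaleNode.new(
        this._texture,
        bounds,
        Gsk.ScalingFilter.TRILINEAR,
    );
    snapshot.append_node(node);
}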

One corner case that we’ve recently explored is texture slicing. Whenever a texture is bigger than the GL stack supports, GSK will break it into smaller slices and use separate GL textures for each. Modern GPUs support enormous textures (on my system, the max. texture size is 16384), which means that the slicing support is rarely tested in practice and not well covered by our unit tests either.

We added support for artificially limiting the texture size (with the GSK_MAX_TEXTURE_SIZE environment variable), and promptly discovered that our texture slicing support needed some love. It will work much better in 4.12.

Fractional Scaling

It landed on April 1st, but it is not a joke.

We’ve added support for the experimental wp_fractional_scale_manager_v1 protocol to the Wayland backend, and use the wp_viewporter protocol to tell the compositor about the scaling that the buffer is using. It is nice that this was easy to fit into our rendering stack, but don’t expect miracles. It works well with the cairo renderer (as you can see in the video), but we still consider it experimental with the GL and Vulkan renderers.

To try fractional scaling with the GL renderer, set

GDK_DEBUG=gl-fractional

in the environment.

Summary

There are lots of new things to explore in GTK 4.11. Please try them and let us know what you think, in GitLab or on Discourse.

April 03, 2023

WebKitGTK accelerated compositing rendering

Initial accelerated compositing support

When accelerated compositing support was added to WebKitGTK, there was only X11. Our first approach was quite simple: we sent the web view widget’s XWindow ID to the web process to be used as the rendering target with GLX. This was very efficient, but we soon realized it broke the GTK rendering model, so it was not possible to use a web view inside a GtkOverlay, for example, to show status messages on top. The solution was to use a redirected Xcomposite window in the web process and use its ID as the render target with GLX. The pixmap ID of the redirected Xcomposite window was sent to the UI process to be painted in the web view widget using a Cairo Xlib surface. Since the rendering happens in the web process, this approach required using Xdamage to monitor when the redirected Xcomposite window was updated in order to schedule a web view redraw.

Wayland support

To support accelerated compositing under Wayland, we initially added a nested Wayland compositor running in the UI process. The web process connected to the nested Wayland compositor and created a surface to be used as the rendering target using EGL. The good thing about this approach compared to the X11 one is that we can create an EGLImage from Wayland buffers and use a GDK GL context to paint the contents in the web view. This is more efficient than X11 because we can use OpenGL in both the web and UI processes.
WPE, when using the fdo backend, uses the same approach of running a nested Wayland compositor, but in a more efficient way, using DMABUF instead of Wayland buffers when available. So, we decided to use libwpe in the GTK port only for rendering under Wayland, and to eventually remove our own Wayland compositor implementation.
Before the removal of the custom Wayland compositor we had all these possible combinations:

  • UI Process
    • X11: Cairo Xlib surface
    • Wayland: EGL
  • Web Process
    • X11: GLX using redirected Xwindow
    • Wayland (nested Wayland compositor): EGL using Wayland surface
    • Wayland (libwpe): EGL using libwpe to get the Wayland surface

To reduce the differences a bit, and to make it easier to support WebGL with ANGLE, we decided to change X11 to prefer EGL if possible, falling back to GLX only if EGL failed.

GTK4

GTK4 was released and we added support for it. The fact that GTK4 uses GL by default should make the rendering more efficient in accelerated compositing mode. This is definitely true under Wayland, because we are already using a GL context, so we just keep passing a texture to GTK to paint the contents in the web view. However, in the case of X11 we still have a Cairo Xlib surface that GTK paints into a Cairo image surface to be uploaded to the GPU. With GTK4 we now have two more combinations on the UI process side: X11 + GTK3, X11 + GTK4, Wayland + GTK3 and Wayland + GTK4.

Reducing all the combinations to (almost) one: DMABUF

All these combinations to support the different platforms made things quite difficult to maintain: every time we got a bug report about something not working in accelerated compositing mode, we had to figure out the combination actually used by the reporter. GTK3 or GTK4? X11 or Wayland? Using EGL or GLX? Custom Wayland compositor or libwpe? Which driver? Which version? Etc.

We are already using DMABUF in WebKit for different things like WebGL and media rendering, so we thought that we could also use it for sharing the rendered buffer between the web and UI processes. Not only would that be a more efficient solution, it would also drastically reduce the number of combinations to maintain. The web process always uses the surfaceless platform, so it doesn’t matter whether it’s under Wayland or X11. We create a surfaceless context as the render target and use the EGL and GBM APIs to export the contents as a DMABUF buffer. The UI process imports the DMABUF buffer, using EGL and GBM too, to be passed to GTK as a texture that is painted in the web view.

This theoretically reduces all the previous combinations to just one (note that we removed GLX support entirely, making EGL a requirement for accelerated compositing), but there’s a problem under X11: GTK3 doesn’t support EGL on X11, and GTK4 defaults to EGL but falls back to GLX if it doesn’t find an EGL config that perfectly matches the screen visual. On my system it never finds that EGL config because Mesa doesn’t expose any 32-bit depth config. So, in the case of GTK3 we have to manually download the buffer to the CPU and paint normally using Cairo, but in the case of GTK4 + GLX, GTK uploads the buffer again to be painted using GLX. I don’t think it’s possible to force GTK to use EGL from the API, but at least you can use GDK_DEBUG=gl-egl.

WebKitGTK 2.41.1

WebKitGTK 2.41.1 is the first unstable release of this cycle and already includes the DMABUF support, which is used by default. We encourage everybody to try it out and provide feedback or report any issues. Please export the contents of webkit://gpu and attach it to the bug report when reporting any problem related to graphics. To check whether the issue is a regression of the DMABUF implementation, you can set WEBKIT_DISABLE_DMABUF_RENDERER=1 to use the WPE renderer or X11 instead. This environment variable and the WPE renderer/X11 code will eventually be removed if DMABUF works fine.

WPE

If this approach works fine we plan to use something similar for the WPE port and get rid of the nested Wayland compositor there too.

April 02, 2023

Crosswords 0.3.8: Change Management

It’s time for another Crosswords release. This is a somewhat quieter release on the surface as it doesn’t have as many user-visible changes. But like the last release, a lot happened under the hood in preparation for the next phase.

This release marks a change in focus. I’ve shifted my work to the editor instead of the game. I hadn’t given the editor much attention over the past year and it’s overdue for updates. I have a lot of features planned for it; it’s time to make progress on them.

Crosswords Editor

The first change I made was to revamp the workflow for creating a new puzzle. The old editor would let you change the puzzle type while editing it — something that was technically neat but not actually useful to setters. I have different editing sections planned based on the puzzle type, which means restricting each window to one type. For example, editing an acrostic grid is totally different from editing a crossword grid.

The new greeter also cleans up a weird flow where the first tab would lock once you’d picked your initial values.

New greeter dialog for the Crossword Editor
New puzzle greeter

To do this, I added a greeter that lets you select the type of puzzle right from the outset. We also took advantage of the fact that we added a separate GType for each puzzle type. It’s very loosely inspired by the new project greeter from GNOME Builder.

The second thing I spent time on wasn’t actually a code change, but a design plan. An implementation challenge I’ve had is balancing letting people use all the crazy features that the ipuz spec allows, and adding guardrails so that people can write standard puzzles without thinking about those extra features. The problem with those features is that you can easily end up with a legal-but-weird puzzle file. As an example, imagine a crossword where the numbering of all the clues is out of order. That’s a legal .ipuz file, and possibly valid in fringe circumstances, but rarely what any puzzle designer actually wants.

There is a new design doc for how to handle intermediate states. The details are complicated, but the overall approach involves adding lint() and fixup() functions to the puzzle types. This will let us make the changes we want, but then let the editor get the puzzle back to a reasonable state.
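As a purely illustrative sketch (JavaScript-flavored pseudocode, not the actual libipuz API), a puzzle type could expose the two hooks like this, where cluesNumberedInOrder() and renumberClues() are hypothetical helpers:

const crosswordPuzzleType = {
    // lint() reports legal-but-weird states without touching the puzzle.
    lint(puzzle) {
        const issues = [];
        if (!cluesNumberedInOrder(puzzle))
            issues.push('clue numbering is out of order');
        return issues;
    },

    // fixup() normalizes the puzzle back to a reasonable state.
    fixup(puzzle) {
        renumberClues(puzzle);
        return puzzle;
    },
};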

Many thanks to Federico, who very kindly let me call him on the weekends to talk through the issues and iterate to a proposal. I’ve started updating libipuz to implement this new design.

Crosswords: Adaptive Layout

This is the third release in which I’ve blogged about the adaptive layout, and it’s the first time I feel good about the results. It has been incredibly challenging to get Crosswords to work well at a variety of sizes. This cycle, I introduced the concept of both a natural size and a minimum size to the game. This results in a mixture of user control and screen-driven sizing. Here is an example of a crossword scaling to the very small and to the very large.

I hope I can put this feature down for now!

Misc fixes and thanks

There were a number of other fixes in this release. Most excitingly, we have a number of new contributors too! Crosswords is a potential GNOME GSoC project, and some prospective students spent time learning the code base. Here are the improvements and credits.

Puzzle Set tagging dialog

  • First, we now render the puzzle set description labels to look more like tags. Thanks to Pratham for working on this.
  • Thanks to Tanmay for fixing a bug where horizontal clue enumerations weren’t rendering correctly. He also contributed fixes to the keyboard layout dialog and started a promising thumbnailer (targeting the next release).
  • Thanks to Philip for layout fixes, a new icon, and for having a preternatural ability to find bugs in my code as soon as I declare them “done.”
  • I fixed an issue where we didn’t lock down the puzzle correctly after winning the game. I’d been trying to track it down for six months.
  • Thanks, as always, to the translators.

Until next time!

April 01, 2023

Niepce March 2023 updates

This is the March 2023 update for Niepce. This is not an April Fools' joke, and this is not the year I can announce a release on April Fools' Day. Sorry about that.

Continuing with the renderer / previewer cache.

I had to move ncr1 to Rust. I didn't port all of it, but the widget and the main API are now in a Rust crate, npc-craw2, the fourth one in the workspace. The goal of this crate is to provide the interface to the rendering pipeline. Some of the work included moving away from Cairo surfaces and using GdkTextures instead. The main pipeline is still the original C++ code using GEGL; it's easier for me to bind the C++ code than to write wrappers for GEGL.

In the same way, I also ported most of the darkroom module to Rust. This module is the one that will allow image editing; currently it only handles displaying the images.

All of this was necessary to finish the renderer / previewer integration and make it work asynchronously: the image rendering happens in the background without freezing the UI. There are still some issues, but overall it works well.

Alternative rendering

The initial rendering code for camera raw files was written a long time ago using GEGL, with GEGL's built-in camera raw converter. It works, but the results are not satisfying. It's not GEGL's fault; camera raw processing is a complex matter that requires a lot of work, and GEGL here is just a set of operations to build image processing pipelines. It's also more complex now, as some cameras require lens correction. There are long-term plans for it, including using libopenraw, adding lensfun, etc., but this will come later.

I also have a few plans down the road, including compatibility with existing (open source) raw processing, namely the two most popular: RawTherapee and Darktable. The easiest way to have compatible rendering is to use their code.

RawTherapee

Here we go: I took RawTherapee, added it as a submodule, and built a Rust crate, rtengine, around its camera raw engine. I have written up more details on how I did this.

Its integration is light for now; the longer-term plan is to treat the .pp3 files as sidecars on import and use them, eventually as the default processor in that case. This is the compatibility I'm talking about. In the long run, editing parameters is in the cards; it's part of what I consider necessary for the initial release.

But what about Darktable? One thing at a time, but it's in the cards.

UI changes

Now, as a prelude to changing the rendering parameters (i.e. the adjustments you'll make to the pictures), I have added a simple UI to select ncr or rtengine, and this choice is saved. This is also useful to manually test the rendering cache. The information is displayed in the new "Process" section of the metadata pane.

Rendering cache

I now have a rendering cache that caches previews on disk by storing a PNG. This led to an interesting question about space vs. time for cache storage, and by that I mean the image format. PNG seems like a good compromise for speed, but I get 20–30 MB files, while JPEG would be more space efficient, but lossy3. This can be investigated later.

This has been more work than anticipated, but not taking shortcuts led to more infrastructure being built to provide the core features.

Importer

I moved on to the importer. That's the part that will be used to add images into the catalog. Currently it supports importing in place, and a very rough import from a camera or a flash card. This is very light in functionality.

One of the key features I want from the importer is being able to copy images from one place to another and sort them automatically. This is particularly useful when importing from a camera or memory card where the images are all together.

I wrote a shell script not too long ago to do the same with downloads of my phone pictures. The file IMG_0123.JPG, whose Exif property DateTimeOriginal is set to 1 April 2023, is copied to the directory 2023/20230401. This is the way I like it, but the importer will be more flexible. The longer term will possibly see a renaming feature.

To implement this image copying workflow, and to help with testing, I wrote a command line tool to run the copy. This involves recursively (as an option) walking through the directories, grouping files together (this is called a bundle), extracting the date from the metadata, and copying the files. One of the shortcomings of Rust's std::fs::copy is that it doesn't copy the timestamps, so I had to implement this. I also learned that the file creation date isn't something you can change.

Ever felt that you have two pieces of code you need to put together and they don't fit? That's where I'm at now. The copying code needs to be integrated into the UI, and it doesn't fit at all, nor does it fit with the existing importer. The whole thing is taking longer than I wished, but it'll get there as I redesign the import code.

Other

I also updated the Flatpak manifest so I can build a flatpak with the current code.

And last but not least, I have submitted a few PRs for RawTherapee, mostly for issues triggered by the address sanitizer: Issue #6708 - fix overlapping buffer strcpy, Issue #6721 - Fix memory leak in Crop.

Thank you for reading.

1

NCR stands for Niepce Camera Raw, i.e. the camera raw processing code. When I started it eons ago, I had great plans that didn't materialize, like building a processing pipeline based on GEGL, libopenraw, exempi, etc.

2

There is an unrelated ncr crate on crates.io, so I decided not to use that crate name, and I didn't want to use npc-ncr, even though the crate is private to the application and not intended to be published separately.

3

As I write this, it hits me that I could experiment with lossless JPEG compression. This is what is used by file formats like CR2 or DNG. And given that the PNG of a rendered CR2 is about the same size, it's possible that it is a viable idea.

March 30, 2023

Ensuring steady frame rates with GPU-intensive clients

On Wayland, a surface is the basic primitive used to build what users refer to as a “window”. Wayland clients define their contents by attaching buffers to surfaces. This turns the contents of the buffer into the current surface contents. Wayland clients are free to attach a new buffer to a surface anytime. When a Wayland compositor like Mutter starts working on a new output frame, it picks the latest available buffer for each visible surface. This is called “mailbox semantics” (the buffers are metaphorical letters falling into a mailbox; the visible “letter” is the last one on top).

Problem

With hardware accelerated drawing, a client normally attaches a new buffer to a surface right after it finished calling OpenGL/Vulkan/<insert your favourite drawing API> APIs to define the contents of the buffer. When the compositor processes the protocol requests attaching the buffer to the surface, the GPU generally hasn’t finished drawing to the buffer yet.

Since the contents of the compositor’s output frame depend on the contents of each visible surface, the former cannot complete before the GPU finishes drawing to each of the picked surface buffers (and subsequently to the compositor’s own output buffer, in the general case).

If the GPU does not finish drawing in time for the next display refresh cycle, the compositor’s output frame misses that cycle and is delayed by at least the duration of one refresh cycle. This can be noticeable as judder/stutter, because the compositor’s frame rate is reduced, and the contents of some frames are not consistent with the timing when they become visible.

The likelihood of that happening depends largely on the clients, mainly on how long it takes the GPU to draw their buffer contents and how much time lies between when a client starts drawing to its buffer and when the compositor starts working on its resulting output frame.

In summary, a Wayland compositor can miss a display refresh cycle because the GPU failed to finish drawing to a client buffer in time.

This diagram visualizes a normal and problematic case:

Left side: normal case, right side: problematic case

Solution

Basic idea

The basic idea is simple: the compositor considers a client buffer “available” per the mailbox semantics only once the GPU finishes drawing to it. Until then, it picks the previously available buffer.

Complications

Now if it was as simple as that might sound, there would be no need to write a >1000-word article about it. 🙂

The main thing which makes things more complicated is that, together with attaching a new buffer, various other surface states can be modified in the same commit. All state changes in the same commit must be applied atomically, i.e. the user must either see all or none of them (per Wayland’s “every frame is perfect” motto). For example, there are various states which affect how a Wayland surface is scaled for display. Attaching a new buffer and changing the scaling state in the same commit ensures that the surface always appears consistently. If the buffer size and scaling state were to change independently, the surface might intermittently appear in the wrong size.

As if that wasn’t complicated enough, Wayland has so-called synchronized sub-surfaces. State changes for a synchronized sub-surface are not applied immediately, but only the next time any state changes are applied for its parent surface. Conceptually, one can think of the committed sub-surface state becoming part of the parent surface’s state commit. Again, all state combined like this between sub-surfaces (which can be nested, i.e. a sub-surface can be the parent of another sub-surface) and their parents must be applied atomically, all or nothing, to ensure that sub-surfaces and their parents always appear consistently as a whole.

This means that the compositor cannot simply wait for the GPU to finish drawing to client buffers, while applying other corresponding surface state immediately. It needs to stage the committed state changes somehow, and actually apply them only once the GPU has finished drawing to all new buffers attached in the same combined state commit.

Enter transactions

The idea for “stage somehow” is to introduce the concept of a transaction, which combines a set of state changes for one or multiple (sub-)surfaces. When a client commits a set of state changes for a surface, they are inserted into an appropriate transaction; either a new one or an existing one, depending on circumstances.

When the committed state changes should get applied per Wayland protocol semantics, the transaction is committed and inserted into a queue of committed transactions. The queue is ordered such that for any given surface, state commits are applied in the same order as they were committed by the client. This ensures that the contents of a surface never appear to “move backwards” because one transaction affecting the surface managed to “overtake” another one.

A transaction is considered ready to be applied only once both of these conditions are true:

  1. It’s the oldest (closest to the queue head) transaction in the queue for all surfaces it carries state for.
  2. The GPU has finished drawing to all client buffers attached in the transaction.

Once both of these conditions are true, the transaction is applied atomically. From that point on, the compositor uses the state in the transaction for its output frames.
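To make the queue semantics concrete, here is a minimal sketch of the readiness check (JavaScript pseudocode, not Mutter’s actual C implementation; the surfaces, attachedBuffers, gpuDone and applyAtomically members are hypothetical):

// The oldest queued transaction carrying state for a given surface.
function oldestTransactionFor(queue, surface) {
    return queue.find(tx => tx.surfaces.includes(surface));
}

function applyReadyTransactions(queue) {
    for (const tx of [...queue]) {
        // Condition 1: it is the oldest queued transaction for all
        // surfaces it carries state for.
        const isOldest = tx.surfaces.every(
            surface => oldestTransactionFor(queue, surface) === tx);

        // Condition 2: the GPU has finished drawing to all client
        // buffers attached in the transaction.
        const buffersReady = tx.attachedBuffers.every(
            buffer => buffer.gpuDone);

        if (isOldest && buffersReady) {
            queue.splice(queue.indexOf(tx), 1);
            tx.applyAtomically();
        }
    }
}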

Results

I implemented the solution described above in Mutter merge request !1880, which was merged for the GNOME 44 release. While it went under the radar of news outlets, I hope that many of you will notice the benefits!

One situation where the benefits of transactions can be noticed is interactive OpenGL applications such as games with “vsync” disabled (e.g. for better input → output latency): you should be less likely to see stuttering due to Mutter missing a display refresh cycle, in particular in fullscreen and if Mutter can use direct scanout of client buffers.

If the GPU & drivers support true high priority EGL contexts which can preempt lower priority ones (as of this writing, this is true e.g. with “not too old” Intel GPUs), Mutter can now sustain full frame rate even if clients are GPU-bound to lower frame rates, as demonstrated in this video:

Even if the GPU & drivers do not support this, Mutter should now get bogged down less by such heavy clients, in particular when it comes to the mouse cursor.

It’s effective for X clients running via Xwayland as well, not only for native Wayland clients.

Long term, all major Wayland compositors will want to do something like this. gamescope already does.

Thanks

It took almost two years (on and off, not full-time) from having the initial idea, deciding to try implementing it myself, until finally getting it ready to be merged. I wasn’t very familiar with the Mutter code or Wayland protocol semantics when I started, so I couldn’t have done it without a lot of help from many Mutter and Wayland developers. I am deeply grateful to all of you.

Thanks to Jakub Steiner for the featured image and to Niels De Graef for the diagram of this post.

I would also like to thank Red Hat for giving me the opportunity to work on this, even though “Mutter developer” isn’t really a central part of my job description.

March 28, 2023

New gitlab.freedesktop.org spamfighting abilities

As of today, gitlab.freedesktop.org allows anyone with a GitLab Developer role or above to remove spam issues. If you are reading this article a while after it's published, it's best to refer to the damspam README for up-to-date details. I'm going to start with the TL;DR.

For Maintainers

Create a personal access token with API access and save the token value as $XDG_CONFIG_HOME/damspam/user.token. Then run the following commands with your project's full path (e.g. mesa/mesa, pipewire/wireplumber, xorg/lib/libX11):

$ pip install git+https://gitlab.freedesktop.org/freedesktop/damspam
$ damspam request-webhook foo/bar
# clean up, no longer needed.
$ pip uninstall damspam
$ rm $XDG_CONFIG_HOME/damspam/user.token
The damspam command will file an issue in the freedesktop/fdo-bots repository. This issue will be automatically processed by a bot and should be done by the time you finish the above commands; see this issue for an example. Note: the issue processing requires a git push to an internal repo - if you script this for multiple repos, please put a sleep(30) in to avoid conflicts.

Once the request has been processed (and again, this should be instant), any issue in your project that gets assigned the label Spam will be processed automatically by damspam. See the next section for details.

For Developers

Once the maintainer for your project has requested the webhook, simply assign the Spam label to any issue that is spam. The issue creator will be blocked (i.e. they cannot log in), and this issue and any other issue filed by the same user will be closed and made confidential (i.e. they are no longer visible to the public). In the future, one of the GitLab admins can remove that user completely, but meanwhile they and their spam are gone from the public eye, and they're blocked from producing more. This should happen within seconds of assigning the Spam label.

For GitLab Admins

Create a personal access token with API access for the @spambot user and save the token value as $XDG_CONFIG_HOME/damspam/spambot.token. This is so you can operate as spambot instead of your own user. Then run the following commands to remove all tagged spammers:

$ pip install git+https://gitlab.freedesktop.org/freedesktop/damspam
$ damspam purge-spammers
The last command will list any users that are flagged as spammers (together with an issue that should make it simple to check whether it is indeed spam) and, after interactive confirmation, purge them. At the time of writing, the output looks like this:
$ damspam purge-spammers
0: naughtyuser              : https://gitlab.freedesktop.org/somenamespace/project/-/issues/1234: [STREAMING@TV]!* LOOK AT ME
1: abcuseless               : https://gitlab.freedesktop.org/somenamespace/project/-/issues/4567: ((@))THIS STREAM IS IMPORTANT
2: anothergit               : https://gitlab.freedesktop.org/somenamespace/project/-/issues/8778: Buy something, really
3: whatawasteofalife        : https://gitlab.freedesktop.org/somenamespace/project/-/issues/9889: What a waste of oxygen I am
Purging a user means a full delete including all issues, MRs, etc. This is nonrecoverable!
Please select the users to purge:
[q]uit, purge [a]ll, or the index: 
     
Purging the spammers will hard-delete them and remove anything they ever did on gitlab. This is irreversible.

How it works

There are two components at play here: hookiedookie, a generic webhook dispatcher, and damspam, which handles the actual spam issues. Hookiedookie provides an HTTP server and "does things" with JSON data on request. What it does is relatively generic (see the Settings.yaml example file), but it's set up to be triggered by a GitLab webhook and thus receives this payload. For damspam, the rules we have for hookiedookie come down to something like this: if the URL is "webhooks/namespace/project", damspam is set up for this project, the payload is an issue event, and it has the "Spam" label in the issue labels, call out to damspam and pass the payload on. Other rules we currently use are automatic reload on push events, and the rule to trigger the webhook-request-processing bot mentioned above.
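Expressed as code, that rule is roughly the following (an illustrative JavaScript sketch, not hookiedookie's actual configuration format; the payload fields follow GitLab's issue webhook payload):

function shouldForwardToDamspam(url, payload, configuredProjects) {
    // "webhooks/namespace/project" -> "namespace/project"
    const match = url.match(/^webhooks\/(.+)$/);
    if (!match || !configuredProjects.has(match[1]))
        return false;

    // Only issue events carrying the "Spam" label are forwarded.
    return payload.object_kind === 'issue' &&
        (payload.labels ?? []).some(label => label.title === 'Spam');
}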

This is also the reason a maintainer has to request the webhook. When the request is processed, the spambot installs a webhook with a secret token (a UUID) in the project. That token will be sent as a header (a standard GitLab feature). The project/token pair is also added to hookiedookie, and any webhook data must contain the project name and matching token, otherwise it is discarded. Since the token is write-only, no one (not even the maintainers of the project) can see it.

damspam gets the payload forwarded to it but is otherwise unaware of how it is invoked. It checks the issue, fetches the data needed, does some safety checks, and if it determines that yes, this is spam, it closes the issue, makes it confidential, blocks the user, and then recurses into every issue this user ever filed. Not necessarily in that order. There are safety checks, so you don't have to worry about it suddenly blocking every project member.

Why?

For a while now, we've suffered from a deluge of spam (and worse) that makes it through the spam filters. GitLab has a Report Abuse feature for this, but it's... woefully incomplete. The UI guides users to do the right thing - as a reporter you can tick "the user is sending spam" and it automatically adds a link to the reported issue. But none of this useful data is visible to admins. Seriously, look at the official screenshots. There is no link to the issue; all you get is a username, the user that reported it, and the content of a textbox that almost never has any useful information. The link to the issue? Not there. The selection that the user is a spammer? Not there.

For an admin, this is frustrating at best. To verify that the user is indeed sending spam, you have to find the issue first. That, at best, requires several clicks and digging through the profile activities. At worst, you know the user is a spammer because you trust the reporter, but you just can't find the issue for whatever reason.

But even worse: reporting spam does nothing immediately. The spam stays up until an admin wakes up, reviews the abuse reports and removes that user. Meanwhile, the spammer can happily keep filing issues against the project. Overall, it is not a particularly great situation.

With hookiedookie and damspam, we're now better equipped to stand against the tide of spam. Anyone who can assign labels can help fight spam, and the effect is immediate. And it's - for our use cases - safe enough: if you trust someone to be a developer on your project, we can trust them not to willy-nilly remove issues by pretending they're spam. In fact, they probably could've deleted issues already anyway if they wanted to make them disappear.

Other instances

While we're definitely aiming at gitlab.freedesktop.org, there's nothing in particular that requires this instance. If you're the admin for a public gitlab instance feel free to talk to Benjamin Tissoires or me to check whether this could be useful for you too, and what changes would be necessary.

March 26, 2023

No-Added-Sugar Granola Recipe

Granola is tasty, crunchy, it can take many shapes and forms, and it’s great both for breakfast and as a snack. This recipe will help you make one that’s not too sweet and which requires few ingredients.

You can easily make your own variants, e.g. replacing almonds with hazelnuts, or replacing raisins with dried cranberries. You can also replace some of the seeds with 50 g of chocolate chips or chocolate shavings. Add the chocolate when the granola is cool and it will remain as-is, but mix it in right after taking the granola out of the oven and it will completely melt and disappear, giving the granola an even golden color and a light chocolate flavor. Add some fresh banana slices or some banana chips, pour some coconut milk, and you have a great chocolate, banana and coconut breakfast granola. If you make your own variant, let me know about it!

For approximately 500 g.

Ingredients

  • 300 g of oatmeal
  • 50 g of almonds
  • 50 g of squash seeds
  • 50 g of sunflower seeds
  • 100 g of raisins
  • 200 ml of water

All you need to make some no-added-sugar granola

Instructions

  • Soak the raisins in the water for 1 h.

The soaked raisins

  • Mix the oatmeal and the seeds in a large bowl.

The oatmeal and seeds mix

  • Just before the end of the soaking time, preheat the oven to 150 °C.
  • Mix the raisins with the oatmeal and the seeds, one tablespoon at a time, to ensure the water is spread evenly.

All the ingredients are now mixed

  • Spread the unbaked granola on a baking tray, making the layer uniform.

The granola in the baking tray, ready to be baked

  • Bake the granola for 1 hour, mixing it every 15 minutes to ensure it bakes evenly.

The granola after 1 hour in the oven

  • Let the granola cool down out of the oven.
  • Store it in jars.

Add some fresh fruits, pour in some plant milk, get some tea or coffee, and you’ve made a great breakfast

March 24, 2023

How to Propose Features to GNOME

Introduction

Recently, GNOME added an option to GNOME Settings to adjust pointer acceleration, a feature that the developers and designers were originally against. One person managed to convince them by giving one reason. Thanks to them, pointer acceleration options are now available in GNOME Settings!

Firstly, I’m going to summarize the relevant parts of the proposal and discussion behind the addition, and explain how it was accepted. Then, building on top of that, I’ll cover GNOME’s philosophy and the importance of taking it into consideration. And lastly, how to propose features to GNOME and what to avoid.

However, this article is not about whether GNOME is successful with their philosophy, or about the tone of developers and designers. Additionally, this isn’t about where to propose features, but rather how to formulate the proposal and what to consider.

Disclaimer: I am not speaking on behalf of GNOME.

Summary of the Proposal and Discussion

Felipe Borges, a maintainer of GNOME Settings, submitted a merge request to make pointer acceleration options discoverable in GNOME Settings. However, Georges Stavracas, a maintainer, and Allan Day, a designer, were strongly against this addition and expressed that it does not benefit the majority of users and can lead to confusion. At that time, this feature was leaning towards rejection.

I responded that this feature can hugely benefit gamers, as many of them are sensitive to pointer acceleration during gameplay. I asked them to reconsider their position, as there is a considerably large number of people who would appreciate it (albeit still a minority).

Georges responded that they were still against it. They argued that it isn’t relevant for most people, and that the target audience would be left confused. In my opinion, ignoring the unjust rant, their reasoning was completely justified and valid.1 In the end, they were unconvinced.

Later, John Smith commented that having the flat profile (a nondefault option on GNOME) as opposed to the default is desirable for people who suffer from hand tremors (shaking/fidgety hands) – John often needs to tweak pointer acceleration within GNOME Tweaks for their relative, who suffers from Parkinson’s. They also pointed out that they use GNOME Tweaks more often than GNOME Settings because of that feature.

Afterwards, the developers and designers were convinced, and they discussed with John Smith how to add the option and word it appropriately. The merge request was then accepted and merged, and the option was finally added to GNOME Settings in GNOME 44.

Importance of Taking GNOME’s Philosophy Into Consideration

So what happened? How come I couldn’t convince the developers and designers even though my statement was correct, while John Smith convinced them even though their statement was just as correct?

That is because John Smith took GNOME’s philosophy (especially the target audience) into consideration, whereas I didn’t. While both of our points were correct, mine was, to put it in the nicest way possible, irrelevant. This sole difference between our approaches led this merge request in opposite directions, as only John Smith’s approach contributed to GNOME’s philosophy, and thus literally made the developers and designers reconsider.

Understanding GNOME’s philosophy and taking it into consideration is, in my opinion, the most important factor when proposing anything to GNOME. GNOME takes its philosophy very seriously, so it is really important that proposals satisfy it. As shown above, this is a matter of acceptance or rejection.

What is GNOME’s Philosophy?

GNOME’s philosophy is sophisticated and there is a lot of room for misunderstanding. Keep in mind that I wrote an article about GNOME’s philosophy, “What is GNOME’s Philosophy?”, which explains it in depth.

To summarize the article: productivity is a key focus for GNOME. However, GNOME approaches it in a very specific manner. The target audiences are people with disabilities, the average computer and mobile user, as well as developers interested in GNOME. GNOME aims to create a coherent environment and experience, courtesy of the Human Interface Guidelines, which are heavily based on cognitive ergonomics, such as perception, memory, reasoning, and motor response. GNOME also encourages peacefulness and organization by discouraging some (sometimes conventional) features.

This means GNOME is fully open to feature proposals, as long as the feature complies with the philosophy.

How Should I Propose Features?

Now that we (hopefully) have a better understanding of GNOME’s philosophy and why it’s important to take it into consideration, let’s look at how I would recommend proposing features.

There are many important notes and recommendations I’d like to make whenever you propose anything (including features) to GNOME:

  • Correctness and relevance are different from one another. Being correct doesn’t necessarily mean that the proposal will contribute to GNOME’s philosophy. Relevance, however, ensures that the information aligns with GNOME’s philosophy, so it’s important that proposals are relevant, as shown with the merge request.
  • Using GNOME primarily doesn’t automatically make you its target audience. Please take the time to ask yourself if your proposal contributes to GNOME’s philosophy. If it does, explain how it benefits the target audience, not how it benefits you. The same goes for commenting on or replying to someone.
  • Follow the GNOME Code of Conduct. Being respectful to others is a really important step for a healthy relationship between you and GNOME. Of course, it doesn’t mean that every party will behave respectfully, including GNOME members.
  • This isn’t 100% guaranteed, and it may be really exhausting. Don’t rush yourself to reply as soon as possible. If you are tired or aren’t in the mood for replying, then comment another time. The last thing you’d want to do is offend a maintainer and have your proposal rejected, whether it is done intentionally or not.
  • Do not set any expectations of being treated well by members. Unfortunately, some members may behave poorly (like in that merge request) and even violate the Code of Conduct without dire consequences. Dealing with unjust situations can be difficult, even if you were respectful. If a member is behaving inappropriately, you can try to follow the Procedure For Reporting Code of Conduct Incidents. In any case, make sure not to come across as offensive or demanding.

Whenever you propose a feature to GNOME, ask yourself this question: “Does it comply with GNOME’s philosophy?”

If it does, then GNOME developers and designers will likely be interested in your idea. Of course, as explained above, proper communication is crucial, as well as wording it appropriately. Nevertheless, if the proposal fundamentally goes against the philosophy, then it will very likely be rejected.

Conclusion

One thing I’ve learned from experience is that GNOME mainly cares about proposals that serve the philosophy, as they take it very seriously.

Providing a good user experience is difficult, as preferences have a cost. It can also be difficult to know what features the target audience would want, especially when they are nontechnical users, as many of them may not know how and whom to contact. Having people propose features is a wonderful privilege for all of us, so it’s really important that we do it with care, put in the time and effort to word proposals with the philosophy in mind, and explain thoroughly how they contribute to it.

Hopefully, this helps you understand GNOME’s goals with the desktop and its philosophy. With a better understanding, you should be able to carefully propose and formulate the feature you want.


Edit 1: Add bullet point for members behaving inappropriately (credit to Maksym Hazevych)


Footnotes

  1. “Acceleration” is a term I always had a hard time understanding, so I can fully agree that the term is really confusing. I don’t remember learning it in school, and I suffer from comprehension difficulties as well, so reading definitions and watching videos haven’t really helped so far. The only hint I’ve gotten is from Mario Kart Wii, where I associate acceleration with “the vehicle goes from 0 to brrr”. 

March 23, 2023

libpeas-2

Now that GNOME 44 is out the door, I took some time to do a bunch of the refactoring I’ve wanted in libpeas for quite some time. For those not in the know, libpeas is the plugin engine behind applications like Gedit and Builder.

This does include an ABI break but libpeas-1.0 and libpeas-2 can be installed side-by-side.

In particular, I wanted to remove a bunch of deprecated API that is well over a decade old. It wasn’t used for very long and causes libpeas to unnecessarily link against gobject-introspection-1.0.

Additionally, there is no need for the libpeas-gtk library anymore. With GTK 4 came much more powerful list widgets. Combine that with radically different plugin UI designs, and the “one stop plugin configuration widget” in libpeas-gtk just isn’t cutting it anymore.

Now that there is just the single library, using subdirectories in includes does not make sense. Just #include <libpeas.h> now.

Beyond the cleanups, PeasEngine is now a GListModel containing PeasPluginInfo objects.

I also made PeasExtensionSet a GListModel, which can be convenient when you want to filter which extensions you care about using something like GtkFilterListModel.

And that is one of the significant reasons for the ABI break. Previously, PeasPluginInfo was a boxed type, incompatible with GListModel. It is now derived from GObject and thus provides properties for all the important bits, including PeasPluginInfo:loaded to denote whether the plugin is loaded.
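For example, from GJS the engine can now be consumed like any other list model (a sketch, assuming the libpeas-2 introspection data is installed):

import Peas from 'gi://Peas?version=2';

const engine = Peas.Engine.get_default();

// PeasEngine implements GListModel, so plugins can be enumerated
// like any other list model.
for (let i = 0; i < engine.get_n_items(); i++) {
    const info = engine.get_item(i);
    console.log(`${info.get_module_name()} loaded=${info.is_loaded()}`);
}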

A vestige of the old days is PeasExtension, which was really just an alias for GObject. It just isn’t needed anymore, and we use GObject directly in function prototypes.

PeasActivatable is also removed because creating interfaces is so easy these days with language bindings and/or G_DECLARE_INTERFACE() that it doesn’t make sense to have such an interface in-tree. Just create the interface you want rather than shoehorning this one in.

I’ve taken this opportunity to rename our development branch to main and you can get the old libpeas-1.0 ABI from the very fresh 1.36 branch.

March 22, 2023

Endless contributions to GNOME 44

The GNOME 44 release is rushing towards us like an irate pangolin! Here is a quick roundup of some of the Endless OS Foundation team’s contributions over its six-month development cycle.

Software

As in the previous cycle, our team has been a key contributor to GNOME Software 44. Based on a very unscientific analysis of the Git commit log, about 30% of non-merge commits and 75% of merge commits to GNOME Software during this cycle came from our team. Co-maintainer Philip Withnall has continued his work to refactor Software’s internal threading model to improve its reliability. He’s also contributed a number of fixes in both GNOME Software and Flatpak to fix issues related to app updates, such as not leaking large temporary directories when an update fails. Dan Nicholson fixed an issue in Flatpak which would cause Software to remove an app rather than updating it when its ID changes.

Georges Stavracas added some sysprof instrumentation which allowed him to quickly pinpoint the main cause of slow loading of category pages. To our collective surprise, the culprit was… loading remote icons for apps! Georges fixed this issue by downloading icons asynchronously. A side-by-side comparison is really quite striking:

As we came closer to the release of Endless OS 5, we realised we needed some improvements to the handling of OS updates in GNOME Software, such as showing a Learn More link for major upgrades, distinguishing between major upgrades and minor updates, and using the distro’s icon when showing a minor update. Like Endless OS, GNOME OS uses eos-updater, although these improvements will not kick in fully there right now, since it currently does not set any OS version metadata on its updates, or a logo in os-release.

The GNOME Software updates page, showing a minor update for Endless OS with the Endless logo beside it.

Of course, we’ve also contributed to the ongoing maintenance of Software, and other functional improvements such as displaying release urgency levels for firmware updates.

Looking ahead, Joana Filizola has spearheaded a series of user studies on topics like how navigation within Software works, discoverability of search, and the name ‘Software’ itself: we hope these will bear fruit in future GNOME cycles.

Shell

As well as ongoing maintenance of Shell and Mutter, Georges Stavracas contributed improvements to the quick settings pills, adding subtitles to improve the information density. This went hand-in-hand with work to improve GNOME’s handling of Flatpak apps that are running in the background (i.e. without a visible window). Previously this was rather crude: if a Flatpak app ran without a window for some period of time, you would get a decontextualized dialog box asking if you want to allow the app to keep running. Choosing the “wrong” option would kill the app and forbid it from running in the background in future – breaking core functionality for certain apps. In GNOME 44, background apps are instead listed within the quick settings popover, and those apps that do use the background portal API to ask nicely to run in the background are allowed to do so without user interaction.

We also supported the design team’s experiments around how window focus is communicated.

GLib

Philip Withnall has, as in many previous cycles, contributed tens of hours of ongoing maintenance to this library that underpins the entire desktop. This has included a number of GVariant security fixes (like this one), GApplication security fixes, GDBus refcounting fixes, and more. Philip also added g_free_sized() and g_aligned_free_sized(), mirroring similar functions in C23, so that applications can start using these without needing to check for (or wait for) C23 support in the toolchain.

Initial Setup

I spent somewhat fewer hours—but not zero!—on general maintenance of Initial Setup. Georges fixed a regression that meant that privacy policies could not be viewed from within Initial Setup; I fixed the display of a shortlist of keyboard layouts, and of non-ASCII characters in location names after switching locale; and Cassidy James Blaede refreshed the design of the password page to use Adwaita widgets & styling.

Password page of GNOME Initial Setup. The password fields have inline icons to edit the text and reveal the password.

…and more

Every quarter, the engineering teams at Endless OS Foundation have an “intermission week”, where the team sets aside our normal priorities to focus on addressing tech debt, wishlist items, innovative or experimental ideas, and learning. Some of the items above came out of the last couple of intermission weeks! On top of that, Philip has spent some time experimenting with APIs to allow apps’ state to be saved and restored; and João Paulo Rechi Vita explored making the GNOME Online Accounts daemon quit when idle, saving a small but non-zero amount of RAM. Neither of these are quite in a production-ready state, but as they say: there’s always another release!

Meanwhile, we’ve been working on extending the set of web apps offered in GNOME Software on Endless OS, using more expansive criteria than the list shipped by GNOME Software by default, and a different delivery mechanism for the catalogue. More on this in a future post!

March 21, 2023

WebKitGTK API for GTK 4 Is Now Stable

With the release of WebKitGTK 2.40.0, WebKitGTK now finally provides a stable API and ABI for GTK 4 applications. The following API versions are provided:

  • webkit2gtk-4.0: this API version uses GTK 3 and libsoup 2. It is obsolete and users should immediately port to webkit2gtk-4.1. To get this with WebKitGTK 2.40, build with -DPORT=GTK -DUSE_SOUP2=ON.
  • webkit2gtk-4.1: this API version uses GTK 3 and libsoup 3. It contains no other changes from webkit2gtk-4.0 besides the libsoup version. With WebKitGTK 2.40, this is the default API version that you get when you build with -DPORT=GTK. (In 2.42, this might require a different flag, e.g. -DUSE_GTK3=ON, which does not exist yet.)
  • webkitgtk-6.0: this API version uses GTK 4 and libsoup 3. To get this with WebKitGTK 2.40, build with -DPORT=GTK -DUSE_GTK4=ON. (In 2.42, this might become the default API version.)

WebKitGTK 2.38 had a different GTK 4 API version, webkit2gtk-5.0. This was an unstable/development API version and it is gone in 2.40, so applications using it will break. Fortunately, that should be very few applications. If your operating system ships GNOME 42, or any older version, or the new GNOME 44, then no applications use webkit2gtk-5.0 and you have no extra work to do. But for operating systems that ship GNOME 43, webkit2gtk-5.0 is used by gnome-builder, gnome-initial-setup, and evolution-data-server:

  • For evolution-data-server 3.46, use this patch which applies on evolution-data-server 3.46.4.
  • For gnome-initial-setup 43, use this patch which applies on gnome-initial-setup 43.2. (Update: for your convenience, this patch will be included in gnome-initial-setup 43.3.)
  • For gnome-builder 43, all required changes are present in version 43.7.

Remember, patching is only needed for GNOME 43. Other versions of GNOME will have no problems with WebKitGTK 2.40.

There is no proper online documentation yet, but in the meantime you can view the markdown source for the migration guide to help you with porting your applications. Although the API is now stable and close to feature parity with the GTK 3 version, there are some problems to be aware of.

Big thanks to everyone who helped make this possible.

March 20, 2023

Writing elsewhere; the pain of moving platforms

I’ve been doing a lot of writing elsewhere of late. Some links:

  • I’ve written a fair amount in the past year for the Tidelift blog, most recently on the EU’s Cyber Resiliency Act and what it might mean for open source.
  • I wrote last week at opensource.com; the latest in a now multi-year series on board candidates in elections for the Open Source Initiative.
  • I have a newsletter on the intersection of open and machine learning at openml.fyi. It is fun!
  • I’ve moved to the fediverse for most of my social media—I’m social.coop/@luis_in_brief (and you can subscribe to this blog via the fediverse at @lu.is/@admin).

I don’t love (mostly) leaving Twitter; as I’ve said a few times, the exposure to different people there helped make me a better person. But one of my primary political concerns is the rise of fascism in the US, and that absolutely includes Elon and the people who enable him. I can’t quit cold-turkey; unfortunately, too many things I care about (or need to follow for work) haven’t left. But I can at least sleep well.

March 17, 2023

Mushroom Gardiane Recipe

Gardiane is a traditional recipe from Camargue; this vegan variant replaces streaked bull meat with mushrooms. I tried it with seitan too, but it ends up a bit too soggy and mushy for my liking, especially after some time in the fridge. This vegan variant also has the advantage of requiring virtually no setup — in the original recipe you have to marinate the meat for 24–48 hours — and of taking less time to cook, as you only need 1–2 hours instead of 2–4 hours.

Regarding olives, the most traditional would be black Picholines, though I prefer to use black Lucques. Please, please, please do not use Greek-style olives: they contain way too much salt and would spoil the gardiane. I strongly recommend using olives with their stones, as the stones help keep them in good shape while everything cooks, and removing them with your mouth is part of the experience, like cherry stones in a clafoutis.

For 6–8 servings.

Ingredients

  • 2 kg of button mushrooms
  • 200 g of firm tofu
  • 200 g of smoked firm tofu
  • 500 g of black olives with their stones, preferably Picholines or Lucques
  • 2 large yellow onions
  • 75 cl of red wine, preferably Côtes du Rhône or Languedoc
  • 6 garlic cloves
  • ⅓ tsp of ground cloves
  • 2 tsp of thyme
  • 2 tsp of rosemary
  • 2 large bay leaves
  • a bit of ground black pepper
  • 1 tsp of salt
  • 3 tbsp of starch or flour
  • 500 g of long grain white rice, preferably Camargue
  • some olive oil
  • some water

That day I couldn’t find Picholines or Lucques, and I forgot to invite the tofu to the family photo

Instructions

  • Remove the stems from the mushrooms. Remove dirty bits from the stems.
  • Peel the mushroom caps; this will help the wine soak in.
  • Cut the caps into 3–4 cm wide pieces; in practice that means you leave the small ones whole, cut the medium-sized ones in two, and cut the larger ones in three or four.
  • Wash the caps and stems in a salad spinner.

The mushrooms, peeled, cut and washed

  • Cut the tofu into batonnets about 2 cm long and 5 mm thick.

The cut smoked tofu

  • Peel the onions, then cut them vertically into 8–10 wedges.
  • Peel the garlic cloves. Mince the garlic cloves.
  • If the olives come in water or brine, drain them.
  • Put a large pot on medium heat.
  • In the pot, put a bit of olive oil and lightly brown the onions with the garlic, the thyme and the rosemary. Reserve.

The browned onions

  • In the pot, put a bit of olive oil and lightly brown the tofu. Reserve.

The browned tofu

  • In the pot, put a bit of olive oil and lightly brown the mushrooms. Do it in several batches so they don’t cook too much in their own water; the goal is to get some brown bits.

A batch of browned mushrooms; these cooked in their own water a bit too much, but it’s fine as they still have some brown bits

  • Remove the pot from the heat, and put everything in it except the starch or flour, the black pepper, and the rice. This means you add the wine, the mushrooms, the onions, the tofu, the olives, the bay leaves, the ground cloves, and the salt.
  • Add water to the pot until everything is covered.

The pot is filled with all the ingredients, the gardiane is ready to be cooked

  • Cover the pot, put it on medium-low heat and let it cook for 2 hours.
  • While the gardiane cooks, prepare and cook the rice.
  • Remove the pot from the stove.
  • Put some sauce from the gardiane in a glass with the starch or flour. Mix them with a spoon and put the liquid back in the pot.
  • Add the black pepper to the pot and stir; the starch or flour will very lightly thicken the sauce.

A served plate

  • Serve the gardiane either alongside the rice or on top of it, as you prefer. Serve with another bottle of a similar wine.

This recipe is approved by Tobias Bernard

Libadwaita 1.3

Another cycle, another release. Let’s take a look at what’s new.

Banners

Screenshot of AdwBanner

AdwBanner is a brand new widget that replaces GtkInfoBar.

Jamie started implementing it before 1.2 was released, but we were already in the API freeze so it was delayed to this cycle instead.

While it looks more or less the same as GtkInfoBar, it’s not a direct replacement. AdwBanner has a title and optionally one button. That’s it. It does not have a close button, and it cannot have multiple buttons or arbitrary children. In exchange, it’s easier to use, behaves consistently, and has an adaptive layout:

Wide screenshot of AdwBanner. The button is on the right, the title is centered.

Medium-width screenshot of AdwBanner. The button is on the right, the title is left-aligned as it wouldn't fit centered.

Narrow screenshot of AdwBanner. The title is center-aligned, the button is centered below it.

Like GtkInfoBar, AdwBanner has a built-in revealer and can be shown and hidden with an animation.
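A minimal Vala sketch of that API (the title and button label here are made up; the properties and signal are from AdwBanner itself):

    var banner = new Adw.Banner ("Metered network: updates are paused") {
        button_label = "Resume"
    };

    // React to the banner's single action button
    banner.button_clicked.connect (() => {
        // handle the action here
        banner.revealed = false; // hide with the built-in animation
    });

    banner.revealed = true; // show with the built-in animation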

There are situations where it cannot be used, but in most of those cases GtkInfoBar was already the wrong choice and the UI should be redesigned. For example, Epiphany was using it for the “save password” prompt, and is using a popover for that now.

Tab Overview

A work-in-progress grid-based tab overview
A work-in-progress carousel-based tab overview

AdwTabOverview is a new tab overview widget for AdwTabView that finally makes it possible to use tabs on mobile devices without implementing a mobile switcher manually.

Back when I wrote HdyTabView, I mentioned a tab overview widget in the works, and even had demo screenshots. Of course, basically everything from that demo got rewritten since then, and the carousel for narrow mode got scrapped completely, but now we’re getting into The Ship of Theseus territory.

This required a pretty big rework of AdwTabView to allow tabs to have thumbnails when they are not visible, and in particular it does not use a GtkStack internally anymore.

By default the selected tab has a live thumbnail and other thumbnails are static, but apps can opt into using live thumbnails for specific pages. They can also control the thumbnail alignment in case the thumbnail gets clipped. Thumbnails themselves are currently not public, but it might be interesting to use them for e.g. tooltips at some point.
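Wiring this up might look roughly like the following sketch (`content` stands for whatever widget hierarchy contains the tab view, and the label page is just a placeholder):

    var view = new Adw.TabView ();

    // The overview wraps the app's regular content
    var overview = new Adw.TabOverview () {
        view = view,
        child = content
    };

    // Opt a specific page into a live thumbnail, and pin the top of
    // the thumbnail in case it gets clipped
    var page = view.append (new Gtk.Label ("Tab content"));
    page.live_thumbnail = true;
    page.thumbnail_yalign = 0.0f;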

Overview is not currently used very widely – it’s available in Console, and there is a merge request adding it to Epiphany, but I didn’t have the energy to finish the latter this cycle.

Tab Button

Screenshot of AdwTabButton. It's showing two tab buttons. One displays 3 open tabs, the other one 15 and has an attention indicator

AdwTabButton is much less interesting: it’s just a button that shows the number of open tabs in a given AdwTabView. It’s intended to be used as the button that opens the tab overview on mobile devices.

Unlike tab overview, this widget is more or less a direct port of what Epiphany has been using since 3.34. It does have one new feature though – it can display an indicator if a tab needs attention.
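Continuing the overview sketch above, the button just points at the tab view, and the overview can be opened from its clicked signal:

    var tab_button = new Adw.TabButton () {
        view = view
    };

    // Open the tab overview from the tab button on mobile
    tab_button.clicked.connect (() => {
        overview.open = true;
    });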

Accessibility

You might have noticed that widgets like AdwViewStack, AdwTabView or AdwEntryRow were not accessible in libadwaita 1.2.x. The reason for that is that GTK didn’t provide public API to make that possible. While GtkAccessible existed, it wasn’t possible to implement it outside GTK itself.

This cycle, Lukáš Tyrychtr has implemented the missing pieces, as well as fixed a few other issues. And so libadwaita widgets are now properly accessible.

Animation Additions

One of AdwAnimation’s features is that it automatically follows the system setting for disabling animations. While this is the expected behavior in most cases, there are a few where it gets in the way instead.

One of these cases is in apps where the animations are the app’s primary content, such as Elastic, and so I added a property that allows a specific animation to ignore the setting.

The animation in the demo now uses it too, for the same reason as Elastic, and in the future it will allow us to have working spinners even while animations are disabled system-wide.
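In code this is a single property flip (a sketch; `animation` stands for any AdwAnimation):

    // This animation is the app's primary content, so keep it running
    // even when animations are disabled system-wide
    animation.follow_enable_animations_setting = false;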

A screenshot of an animation graph from Elastic

Elastic also draws an animation graph before running the actual animation, and while for timed animations it’s relatively easy since easing functions are available through the public API, it wasn’t possible for spring animations. It is now, via calculate_value() and calculate_velocity().
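Sampling a spring curve without running the animation might look like this sketch (`widget` and the spring parameters are arbitrary placeholders):

    var target = new Adw.CallbackAnimationTarget ((value) => {
        // unused while we only sample the curve
    });

    var animation = new Adw.SpringAnimation (
        widget, 0, 1,
        new Adw.SpringParams (0.5, 1, 100),
        target
    );

    // Sample value and velocity at regular steps to draw a graph
    for (uint t = 0; t <= animation.estimated_duration; t += 16) {
        double value = animation.calculate_value (t);
        double velocity = animation.calculate_velocity (t);
        // plot (t, value, velocity) here
    }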

All of this uncovered a few bugs with spring animations – for example, the velocity was completely wrong for overdamped springs, and Manuel Genovés fixed them.

Unrelated to the above, a common complaint about AdwPropertyAnimationTarget was that it prints a critical if the object containing the property it’s animating is finalized before the target. While it was easy to avoid in C, it was nearly impossible from bindings. And so we don’t print that critical anymore.

Other Changes

  • Christopher Davis added a way to make AdwActionRow subtitles selectable (see the sketch after this list).
  • Matt Jakeman added title-lines and subtitle-lines properties to AdwExpanderRow, matching AdwActionRow.
  • AdwEntryRow now has a grab_focus_without_selecting() method, matching GtkEntry.
  • The API to set an icon on AdwActionRow and AdwExpanderRow is now deprecated, since it was mostly unused. Apps that need icons can add a GtkImage as a prefix widget instead.
  • AdwMessageDialog now has the async choose() method, matching the new GTK dialogs like GtkAlertDialog. The response signal is still there and is not deprecated, but in some cases the new method may be more convenient, particularly from bindings:

    [GtkCallback]
    private async void clicked_cb () {
        var dialog = new Adw.MessageDialog (
            this,
            "Replace File?",
            "A file named “example.png” already exists. Do you want to replace it?"
        );
    
        dialog.add_response ("cancel", "_Cancel");
        dialog.add_response ("replace", "_Replace");
    
        dialog.set_response_appearance (
            "replace",
            Adw.ResponseAppearance.DESTRUCTIVE
        );
    
        var response = yield dialog.choose (null);
    
        if (response == "replace") {
            // handle replacing
        }
    }
    
  • Corey Berla added missing drag-n-drop related API to AdwTabBar to make it work properly in Nautilus.
  • Since GTK now allows changing texture filtering, AdwAvatar properly scales custom images, so they don’t appear pixelated when downscaled or blurry when upscaled. This only works if the custom image is a GdkTexture – if your app is using a different GdkPaintable, you will need to do the equivalent change yourself.
  • Jason Francis implemented dark style and high contrast support when running on Windows.
  • Selected items in lists and grids now use the accent color instead of grey, same as Nautilus in 43. Sidebars and menus still look the same as before.
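A small sketch of a couple of the row additions above (the row titles and values are made up):

    // Selectable subtitle, useful for copying paths or ellipsized values
    var action_row = new Adw.ActionRow () {
        title = "Location",
        subtitle = "/home/user/Documents",
        subtitle_selectable = true
    };

    var entry_row = new Adw.EntryRow () {
        title = "Name"
    };
    entry_row.text = "example.png";

    // Focus the entry row without selecting its existing text
    entry_row.grab_focus_without_selecting ();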

New Dependencies

The accessibility additions, the scaled texture render nodes in AdwAvatar, the mask render nodes (which I didn’t mention above because they’re an internal change), and the deprecation fixes mean that libadwaita 1.3 requires GTK 4.10 instead of 4.6.


As always, thanks to all the contributors, and thanks to my employer, Purism, for letting me work on libadwaita and GTK to make this release happen.

Status update 17/03/2023

Hello from my parents’ place, sitting on the border of Wales & England, listening to this excellent Victor Rice album, thinking of this time last year when I actually got to watch him play at Freedom Sounds Festival, which was one of my first adventures of the post-lockdown 2020s.

I have many distractions at the moment, many of them work/life admin, but here are some of the more interesting ones:

  • Playing in a new band in Santiago – Killo Karallo – recording some initial music which is to come out next week
  • Preparing new Vladimir Chicken music, also cooked and ready for release in April
  • Figuring out how we can grow the GNOME OpenQA tests while keeping them fun to work with. Here’s an experimental commandline tool which might help with that.
  • Learning about marketing, analytics, and search engine optimization.
  • Trying out the new LLaMA language model and generally trying to keep up with the ongoing revolution in content generation technology.

Also I got to see real snow for the first time in a few years! Thanks Buxton!



March 15, 2023

Portfolio 0.9.15

After a long hiatus, a new release of Portfolio is out 📱🤓. This new release comes with important bug fixes, small-detail additions and a few visual improvements.

In terms of visuals, and by popular demand, the most notable change is the use of regular icons for the file browser view. It should now be easier to quickly tell what each file is about. Thanks to @AngelTomkins for the initial implementation, @Exalm for helping with the reviews, and to the GNOME design team for such lovely new icons.

Another addition is support for system-wide style management. This is especially useful now that desktops like GNOME provide quick toggle buttons to switch between dark and light modes. Thanks to @pabloyoyoista for the initial implementation.

One small-detail change to the properties view is the addition of file permissions. Plus, the properties view was broken down into three different sections to reduce the visual load, and labels can now be selected, which is useful for copying locations or ellipsized values.

Moving on to bug fixes, two important changes landed. The first one solves an issue which prevented opening files with special characters 🤦. Thanks to @jwaataja for detecting and fixing this issue. The second one solves an issue with Portfolio not properly detecting mount points under some specific conditions. Thanks to mo2mo for reaching out and sharing his system details, so I could figure this out.

Last but never least, many thanks to @carlosgonz0, @Vistaus, @rffontenelle, @AsciiWolf, and @eson57 for keeping translations up to date, and to @rene-coty for the new French translation.