Steven Deobald

@steven

2025-07-12 Foundation Update

Gah. Every week I’m like “I’ll do a short one this week” and then I… do not.

 

## New Treasurers

We recently announced our new treasurer, Deepa Venkatraman. We will also have a new vice-treasurer joining us in October.

This is really exciting. It’s important that Deepa and I can see with absolute clarity what is happening with the Foundation’s finances, and in turn present our understanding to the Board so they share that clarity. She and I also need to start drafting the annual budget soon, which itself must be built on clear financial reporting. Few people I know ask the kind of incisive questions Deepa asks and I’m really looking forward to tackling the following three issues with her:

  • solve our financial reporting problems:
    • cash flow as a “burndown chart” that most hackers will identify with
    • clearer accrual reporting so it’s obvious whether we’re growing or crashing
  • pass a budget on time that the Board really understands
  • help the Board pass safer policies

 

## postmarketOS

We are excited to announce that postmarketOS has joined the GNOME Advisory Board! This is particularly fun, because it breaks GNOME out of its safe shell. GNOME has had a complete desktop product for 15 years. Phones and tablets are the most common computers in the world today and the obvious next step for GNOME app developers. It’s a long hard road to win the mobile market, but we will. šŸ™‚

(I’m just going to keep saying that because I know some people think it’s extraordinarily silly… but I do mean it.)

 

## Sustain? Funding? Jobs?

We’ve started work this week on the other side of the coin for donate.gnome.org. We’re not entirely sure which subdomain it will live at yet, but the process of funding contributors needs its own home. This page will celebrate the existing grant and contract work going on in GNOME right now (such as Digital Wellbeing) but it will also act as the gateway where contributors can apply for travel grants, contracts, fellowships, and other jobs.

 

## PayPal

Thanks to Bart, donate.gnome.org now supports PayPal recurring donations, for folks who do not have credit cards.

We hear you: EUR presentment currency is a highly-requested feature and so are yearly donations. We’re still working away at this. šŸ™‚

 

## Hardware Pals

We’re making some steady progress toward relationships with Framework Computer and Slimbook where GNOME developers can help them ensure their hardware always works perfectly, out of the box. Great folks at both companies and I’m excited to see all the bugs get squashed. šŸ™‚

 

## Stuff I’m Dropping

Oh, friends. I should really be working on the Annual Report… but other junk keeps coming up! Same goes for my GUADEC talk. And the copy for jobs.gnome.org … argh. Sorry Sam! haha

Thanks to everyone who’s contributed your thoughts and ideas to the Successes for Annual Report 2025 issue. GNOME development is a firehose and you’re helping me drink it. More thoughts and ideas still welcome!

 

## It’s Not 1998

Emmanuele and I had a call this week. There was plenty of nuance and history behind that conversation that would be too difficult to repeat here. However, he and I have similar concerns surrounding communication, tone, tools, media, and moderation: we both want GNOME to be as welcoming a community as it is a computing and development platform. We also agreed the values which bind us as a community are those values directly linked to GNOME’s mission.

This is a significant challenge. Earth is a big place, with plenty of opinions, cultures, languages, and ideas. We are all trying our best to resolve the forces in tension. Carefully, thoughtfully.

We both had a laugh at the truism, “it’s not 1998.” There’s a lot that was fun and exciting and uplifting about the earlier internet… but there was also plenty of space for nastiness. Those of us old enough to remember it (read: me) occasionally make the mistake of speaking in the snarky, biting tones that were acceptable back then. As Official Old People, Emmanuele and I agreed we had to work even harder to set an example for the kind of dialogue we hope to see in the community.

Part of that effort is boosting other people’s work. You don’t have to go full shouty Twitter venture capitalist about it or anything… just remember how good it felt the first time someone congratulated you on some good work you did, and pass that along. A quick DM or email can go a long way to making someone’s week.

Thanks Emmanuele, Brage, Bart, Sid, Sri, Alice, Michael, and all the other mods for keeping our spaces safe and inviting. It’s thankless work most of the time but we’re always grateful.

 

## Office Hours

We tried out “office hours” today: one hour for Foundation Members to come and chat. Bring a tea or coffee, tell me about your favourite GUADEC, tell me what a bad job I’m doing, explain where the Foundation needs to spend money to make GNOME better, ask a question… anything. The URL is only published on private channels for, uh, obvious reasons. See you next week!

 

Donate to GNOME

Ignacy Kuchciński

@ignapk

Digital Wellbeing Contract

This month I have been accepted as a contractor to work on the Parental Controls frontend and integration as part of the Digital Wellbeing project. I'm very happy to take part in this cool endeavour, and very grateful to the GNOME Foundation for giving me this opportunity - special thanks to Steven Deobald and Allan Day for interviewing me and helping me connect with the team, despite our timezone differences :)

The idea is to redesign the Parental Controls app UI to bring it on par with modern GNOME apps, and to integrate parental controls into the GNOME Shell lock screen in collaboration with gnome-control-center. There are also new features to be added, such as Screen Time monitoring and limits, Bedtime Schedule, and Web Filtering support. The project has been going for quite some time, and there has been a lot of great work put into both the designs by Sam Hewitt and the backend by Philip Withnall, who's been really helpful teaching me about the project's code practices and reviewing my MRs. See the designs in the app-mockups ticket and the os-mockups ticket.

We started implementing the design mockup MVP for Parental Controls, which you can find in the app-mockup ticket. We're trying to meet the GNOME 49 release deadlines, but as always it's a goal rather than a certain milestone. So far we have finished the redesign of the current Parental Controls app without adding any new features: refreshing the UI of the unlock page, reworking the user selector to be a list rather than a carousel, and changing navigation to use pages. This will be followed by adding pages for Screen Time and Web Filtering.

Refreshed unlock page

Reworked user selector

Navigation using pages, Screen Time and Web Filtering to be added

I want to thank the team for helping me get on board and being generally just awesome to work with :) Until next update!

Jussi Pakkanen

@jpakkane

AI slop is the new smoking

 


This Week in GNOME

@thisweek

#208 Converting Colors

Update on what happened across the GNOME project in the week from July 04 to July 11.

GNOME Core Apps and Libraries

Calendar ↗

A simple calendar application.

FineFindus reports

GNOME Calendar now allows exporting events as .ics files, allowing them to be easily shared.

GNOME Development Tools

GNOME Builder ↗

IDE for writing GNOME-based software.

Nokse reports

This week GNOME Builder received some new features!

  • Inline git blame to see who last modified each line of code
  • Changes and diagnostics overview displayed directly in the scrollbar
  • Enhanced LSP markdown rendering with syntax highlighting

GNOME Circle Apps and Libraries

DĆ©jĆ  Dup Backups ↗

A simple backup tool.

Michael Terry says

DĆ©jĆ  Dup Backups 49.alpha is out for testing!

It features a UI refresh and file-manager-based restores (for Restic only).

Read the announcement for install instructions and more info.

Any feedback is appreciated!

Third Party Projects

Dev Toolbox ↗

Dev tools at your fingertips

Alessandro Iepure reports

When I first started Dev Toolbox, it was just a simple tool I built for myself, a weekend project born out of curiosity and the need for something useful. I never imagined anyone else would care about it, let alone use it regularly. I figured maybe a few developers here and there would find it helpful. But then people started using it. Reporting bugs. Translating it. Opening pull requests. Writing reviews. Sharing it with friends. And suddenly, it wasn’t just my toolbox anymore. Fast forward to today: over 50k downloads on Flathub and 300 stars on GitHub. I still can’t quite believe it.

To every contributor, translator, tester, reviewer, or curious user who gave it a shot: thank you. You turned a small idea into something real, something useful, and something I’m proud to keep building.

Enough feelings. Let’s talk about what’s new in v1.3.0!

  • New tool: Color Converter. Convert between HEX, RGB, HSL, and other formats. (Thanks @Flachz)
  • JWT tool improvements: you can now encode payloads and verify signatures. (Thanks @Flachz)
  • Chmod tool upgrade: added support for setuid, setgid, and the sticky bit. (Thanks @Flachz)
  • Improved search, inside and out:
    • The app now includes extra keywords and metadata, making it easier to discover in app stores and desktops
    • In-app search now matches tool keywords, not just their names. (Thanks @freeducks-debug)
  • Now a GNOME search provider: you can search and launch Dev Toolbox tools straight from the Overview
  • Updated translations: many new translatable strings were added this release. Thank you to all translators who chipped in.

GNOME Websites

Victoria šŸ³ļøā€āš§ļøšŸ³ļøā€šŸŒˆ she/her reports

On welcome.gnome.org, of all the listed teams, only the Translation and Documentation Teams linked to their wikis instead of their respective Welcome pages. But now this changes for the Translation Team! After several months of working on this, we finally have our own Welcome page. Now is the best time to make GNOME speak your language!

Miscellaneous

Arjan reports

PyGObject has had support for async functions since 3.50. Now async functions and methods are also discoverable in the GNOME Python API documentation.

GNOME Foundation

steven announces

The 2025-07-05 Foundation Update is out:

  • Grants and Fellowships Plan
  • Friends of GNOME, social media partners, shell notification
  • Annual Report… I haven’t done it yet
  • Fiscal Controls and Operational Resilience… yay?
  • Digital Wellbeing Frontend Kickoff
  • Office Hours
  • A Hacker in Need

https://blogs.gnome.org/steven/2025/07/05/2025-07-05-foundation-update/

Digital Wellbeing Project ↗

Ignacy Kuchciński (ignapk) reports

As part of the Digital Wellbeing project, sponsored by the GNOME Foundation, there is an initiative to redesign the Parental Controls to bring it on par with modern GNOME apps and implement new features such as Screen Time monitoring, Bedtime Schedule and Web Filtering. Recently the UI for the unlock page was refreshed, the user selector was reworked to be a list rather than a carousel, and navigation was changed to use pages. There’s more to come, see https://blogs.gnome.org/ignapk/2025/07/11/digital-wellbeing-contract/ for more information.

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

Kubernetes is not just for Black Friday

I self-host services mostly for myself. My threat model is particular: the highest threats I face are my own incompetence and hardware failures. To mitigate the weight of my incompetence, I relied on podman containers to minimize the amount of things I could misconfigure. I also wrote ansible playbooks to deploy the containers on my VPS, thus making it easy to redeploy them elsewhere if my VPS failed.

I've always ruled out Kubernetes as overly complex machinery designed for large organizations that face significant surges in traffic during specific events like Black Friday sales. I thought Kubernetes had too many moving parts and would work against my objectives.

I was wrong. Kubernetes is not just for large organizations with scalability needs I will never have. Kubernetes makes perfect sense for a homelabber who cares about having a simple, sturdy setup. It has fewer moving parts than my podman and ansible setup, uses more standard development and deployment practices, and allows me to rely on the cumulative expertise of thousands of experts.

I don't want to do things manually or alone

Self-hosting services is much more difficult than just putting them online. This is a hobby for me, something I do in my free time, so I need to spend as little time doing maintenance as possible. I also know I don't have peers to review my deployments. If I have to choose between using standardized methods that have been reviewed by others or doing things my own way, I will use the standardized method.

My main threats are:

I can and will make mistakes. I am an engineer, but my current job is not to maintain services online. In my homelab, I am also a team of one. This means I don't have colleagues to spot the mistakes I make.

[!info] This means I need to use battle-tested and standardized software and deployment methods.

I have limited time for it. I am not on call 24/7. I want to enjoy time off with my family. I have work to do. I can't afford to spend my life in front of a computer to figure out what's wrong.

[!info] This means I need to have a reliable deployment. I need to be notified when something goes in the wrong direction, and when something has gone completely wrong.

My hardware can fail, or be stolen. Having working backups is critical. But if my hardware failed, I would still need to restore backups somewhere.

[!info] This means I need to be able to rebuild my infrastructure quickly and reliably, and restore backups on it.

I was doing things too manually and alone

Since I wanted to get standardized software, containers seemed like a good idea. podman was particularly interesting to me because it can generate systemd services that will keep the containers up and running across restarts.

I could have deployed the containers manually on my VPS and generated the systemd services by invoking the CLI. But I would then risk making small tweaks on the spot, resulting in a deployment that is difficult to replicate elsewhere.

Instead, I wrote an ansible playbook based on the containers.podman collection and other ansible modules. This way, ansible deploys the right containers on my VPS, copies or updates the config files for my services, and I can easily replicate this elsewhere.

It has served me well and worked decently for years now, but I'm starting to see the limits of this approach. Indeed, in their introduction, the ansible maintainers state:

Ansible uses simple, human-readable scripts called playbooks to automate your tasks. You declare the desired state of a local or remote system in your playbook. Ansible ensures that the system remains in that state.

This is mostly true for ansible, but this is not really the case for the podman collection. In practice I still have to do manual steps in a specific order, like creating a pod first, then adding containers to the pod, then generating a systemd service for the pod, etc.

To give you a very concrete example, this is what the tasks/main.yaml of my Synapse (Matrix) server deployment role looks like.

- name: Create synapse pod
  containers.podman.podman_pod:
    name: pod-synapse
    publish:
      - "10.8.0.2:9000:9000"
    state: created

- name: Stop synapse pod
  containers.podman.podman_pod:
    name: pod-synapse
    publish:
      - "10.8.0.2:9000:9000"
    state: stopped

- name: Create synapse's postgresql
  containers.podman.podman_container:
    name: synapse-postgres
    image: docker.io/library/postgres:{{ synapse_container_pg_tag }}
    pod: pod-synapse
    volume:
      - synapse_pg_pdata:/var/lib/postgresql/data
      - synapse_backup:/tmp/backup
    env:
      {
        "POSTGRES_USER": "{{ synapse_pg_username }}",
        "POSTGRES_PASSWORD": "{{ synapse_pg_password }}",
        "POSTGRES_INITDB_ARGS": "--encoding=UTF-8 --lc-collate=C --lc-ctype=C",
      }

- name: Copy Postgres config
  ansible.builtin.copy:
    src: postgresql.conf
    dest: /var/lib/containers/storage/volumes/synapse_pg_pdata/_data/postgresql.conf
    mode: "600"

- name: Create synapse container and service
  containers.podman.podman_container:
    name: synapse
    image: docker.io/matrixdotorg/synapse:{{ synapse_container_tag }}
    pod: pod-synapse
    volume:
      - synapse_data:/data
      - synapse_backup:/tmp/backup
    labels:
      {
        "traefik.enable": "true",
        "traefik.http.routers.synapse.entrypoints": "websecure",
        "traefik.http.routers.synapse.rule": "Host(`matrix.{{ base_domain }}`)",
        "traefik.http.services.synapse.loadbalancer.server.port": "8008",
        "traefik.http.routers.synapse.tls": "true",
        "traefik.http.routers.synapse.tls.certresolver": "letls",
      }

- name: Copy Synapse's homeserver configuration file
  ansible.builtin.template:
    src: homeserver.yaml.j2
    dest: /var/lib/containers/storage/volumes/synapse_data/_data/homeserver.yaml
    mode: "600"

- name: Copy Synapse's logging configuration file
  ansible.builtin.template:
    src: log.config.j2
    dest: /var/lib/containers/storage/volumes/synapse_data/_data/{{ matrix_server_name }}.log.config
    mode: "600"

- name: Copy Synapse's signing key
  ansible.builtin.template:
    src: signing.key.j2
    dest: /var/lib/containers/storage/volumes/synapse_data/_data/{{ matrix_server_name }}.signing.key
    mode: "600"

- name: Generate the systemd unit for Synapse
  containers.podman.podman_pod:
    name: pod-synapse
    publish:
      - "10.8.0.2:9000:9000"
    generate_systemd:
      path: /etc/systemd/system
      restart_policy: always

- name: Enable synapse unit
  ansible.builtin.systemd:
    name: pod-pod-synapse.service
    enabled: true
    daemon_reload: true

- name: Make sure synapse is running
  ansible.builtin.systemd:
    name: pod-pod-synapse.service
    state: started
    daemon_reload: true

- name: Allow traffic in monitoring firewalld zone for synapse metrics
  ansible.posix.firewalld:
    zone: internal
    port: "9000/tcp"
    permanent: true
    state: enabled
  notify: firewalld reload

I'm certain I'm doing some things wrong and this file can be shortened and improved, but this is also my point: I'm writing a file specifically for my needs, that is not peer reviewed.

Upgrades are also not necessarily trivial. While in theory it's as simple as updating the image tag in my playbook variables, in practice things get more complex when some containers depend on others.

[!info] With Ansible, I must describe precisely the steps my server has to go through to deploy the new containers, how to check their health, and how to roll back if needed.

Finally, discoverability of services is not great. I used traefik as a reverse proxy, and gave it access to the docker socket so it could read the labels of my other containers (like in the labels section of the yaml file above, containing the domain to use for Synapse), figure out what domain names I used, and route traffic to the correct containers. I wish a similar mechanism existed for e.g. prometheus to find new resources and scrape their metrics automatically, but I didn't find one. Configuring Prometheus to scrape my pods was brittle and required a lot of manual work.

Working with Kubernetes and its community

What I need is a tool that lets me write "I want to deploy version X of Keycloak." I need it to be able to figure out by itself what version is currently running, what needs to be done to deploy version X, whether the new deployment goes well, and how to roll back automatically if it can't deploy the new version.

The good news is that this tool exists. It's called Kubernetes, and contrary to popular belief it's not just for large organizations that run services for millions of people and see surges in traffic for Black Friday sales. Kubernetes is software that runs on one or several servers, forming a cluster, and uses their resources to run the containers you asked it to.

Kubernetes gives me more standardized deployments

To deploy services on Kubernetes you have to describe what containers to use, how many of them will be deployed, how they're related to one another, etc. To describe these, you use yaml manifests that you apply to your cluster. Kubernetes takes care of the low-level implementation so you can describe what you want to run, how many resources you want to allocate to it, and how to expose it.

The Kubernetes docs give the following manifest for an example Deployment that will spin up 3 nginx containers (without exposing them outside of the cluster)

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

From my laptop, I can apply this file to my cluster by using kubectl, the CLI to control kubernetes clusters.

$ kubectl apply -f nginx-deployment.yaml

[!info] With Kubernetes, I can describe my infrastructure as yaml files. Kubernetes will read them, deploy the right containers, and monitor their health.

If there is already a well-established community around a project, chances are that some people already maintain Helm charts for it. Helm charts describe how the containers, volumes and all the other Kubernetes objects are related to one another. When a project is popular enough, people will write charts for it and publish them on https://artifacthub.io/.

Those charts are open source, and can be peer reviewed. They already describe all the containers, services, and other Kubernetes objects that need to be deployed to make a service run. To use them, I only have to define configuration variables, called Helm values, and I can get a service running in minutes.

To deploy a fully fledged Keycloak instance on my cluster, I need to override default parameters in a values.yaml file, like for example

ingress:
  enabled: true
  hostname: keycloak.ergaster.org
  tls: true

This is a short example. In practice, I need to override more values to fine tune my deployment and make it production ready. Once it's done I can deploy a configured Keycloak on my cluster by typing these commands on my laptop

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install my-keycloak -f values.yaml bitnami/keycloak --version 24.7.4

In practice I don't use helm on my laptop but I write yaml files in a git repository to describe what Helm charts I want to use on my cluster and how to configure them. Then, a software suite running on my Kubernetes cluster called Flux detects changes on the git repository and applies them to my cluster. It makes changes even easier to track and roll back. More on that in a further blog post.
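
To give a flavour of what that looks like, here is a hedged sketch of the kind of manifest such a git repository can hold, reusing the Keycloak example above. The API versions and the flux-system namespace are assumptions that vary with the Flux release and setup; treat this as illustrative rather than as my actual configuration.

# Sketch: a Flux HelmRepository + HelmRelease pair kept in the git repository.
# API versions differ between Flux releases; adjust to what your cluster runs.
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: bitnami
  namespace: flux-system
spec:
  interval: 1h
  url: https://charts.bitnami.com/bitnami
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: my-keycloak
  namespace: default
spec:
  interval: 10m
  chart:
    spec:
      chart: keycloak
      version: "24.7.4"
      sourceRef:
        kind: HelmRepository
        name: bitnami
        namespace: flux-system
  # Same overrides as the values.yaml example above
  values:
    ingress:
      enabled: true
      hostname: keycloak.ergaster.org
      tls: true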

Of course, it would be reckless to deploy charts you don't understand, because you wouldn't be able to chase problems down as they arise. But having community-maintained, peer-reviewed charts gives you a solid base to configure and deploy services. This minimizes the room for error.

[!info] Helm charts let me benefit from the expertise of thousands of experts. I need to understand what they did, but I have a solid foundation to build on

But while a Kubernetes cluster is easy to use, it can be difficult to set up. Or is it?

Kubernetes doesn't have to be complex

A fully fledged Kubernetes cluster is a complex beast. Kubernetes, often abbreviated k8s, was initially designed to run on several machines. When setting up a cluster, you have to choose between components you know nothing about. What do I want for the network of my cluster? I don't know, I don't even know how the cluster network is supposed to work, and I don't want to know! I want a Kubernetes cluster, not a Lego toolset!

[!info] A fully fledged cluster solves problems for large companies with public facing services, not problems for a home lab.

Quite a few cloud providers offer managed cluster options so you don't have to worry about it, but they are expensive for an individual. In particular, they charge fees depending on the amount of outgoing traffic (egress fees). Those are difficult to predict.

Fortunately, a team of brilliant people has created an opinionated bundle of software to deploy a Kubernetes cluster on a single server (though it can form a cluster with several nodes too). They cheekily call it k3s to advertise it as a small k8s.

[!info] For a self-hosting enthusiast who wants to run Kubernetes on a single server, k3s works like k8s, but installing and maintaining the cluster itself is much simpler.

Since I don't have High Availability needs and can afford to have my services go offline occasionally, k3s on a single node is more than enough to let me play with Kubernetes without the extra complexity.

Installing k3s can be as simple as running the one-liner they advertise on their website

$ curl -sfL https://get.k3s.io | sh -

I've found their k3s-ansible playbook more useful, since it does a few checks on the cluster host and copies the kubectl configuration files to your laptop automatically.

Discoverability on Kubernetes is fantastic

In my podman setup, I loved how traefik could read the labels of a container, figure out what domains I used, and where to route traffic for this domain.

Not only is the same thing true for Kubernetes, it goes further. cert-manager, the standard way to retrieve certs in a Kubernetes cluster, will read Ingress properties to figure out what domains I use and retrieve certificates for them. If you're not familiar with Kubernetes, an Ingress is a Kubernetes object telling your cluster "I want to expose those containers to the web, and this is how they can be reached."

Kubernetes has an Operator pattern. When I install a service on my Kubernetes cluster, it often comes with a specific Kubernetes object that Prometheus Operator can read to know how to scrape the service.

All of this happens by adding an annotation or two in yaml files. I don't have to fiddle with networks. I don't have to configure things manually. Kubernetes handles all that complexity for me.
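
As an illustration, an Ingress that gets its certificate from cert-manager looks roughly like the following. This is a hedged sketch: it assumes a ClusterIssuer named letsencrypt exists in the cluster, and the Service name and port for the Keycloak example are guesses, not my actual manifests.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: keycloak
  annotations:
    # assumption: a cert-manager ClusterIssuer named "letsencrypt" exists
    cert-manager.io/cluster-issuer: letsencrypt
spec:
  rules:
    - host: keycloak.ergaster.org
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-keycloak   # assumption: the Service created by the Helm chart
                port:
                  number: 80
  tls:
    - hosts:
        - keycloak.ergaster.org
      secretName: keycloak-tls      # cert-manager stores the issued certificate here

The ingress controller (k3s ships Traefik by default) routes traffic for that host, while cert-manager notices the annotation and the tls section, requests a certificate for keycloak.ergaster.org, and keeps the keycloak-tls Secret up to date.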

Conclusion

It was a surprise for me to realize that Kubernetes is not the complex beast I thought it was. Kubernetes internals can be difficult to grasp, and there is a steeper learning curve than with docker-compose. But it's well worth the effort.

I got into Kubernetes by deploying k3s on my Raspberry Pi 4. I then moved to a beefier mini pc, not because Kubernetes added too much overhead, but because the CPU of the Raspberry Pi 4 is too weak to handle my encrypted backups.

With Kubernetes and Helm, I have more standardized deployments. The open source services I deploy have been crafted and reviewed by a community of enthusiasts and professionals. Kubernetes handles a lot of the complexity for me, so I don't have to. My k3s cluster runs on a single node. My volumes live on my disk (via Rancher's local path provisioner). And I still don't do Black Friday sales!

Andy Wingo

@wingo

guile lab notebook: on the move!

Hey, a quick update, then a little story. The big news is that I got Guile wired to a moving garbage collector!

Specifically, this is the mostly-moving collector with conservative stack scanning. Most collections will mark objects in place. When the collector wants to compact, it will scan ambiguous roots in the beginning of the collection cycle, marking objects referenced by such roots in place. Then the collector will select some blocks for evacuation, and when visiting an object in those blocks, it will try to copy the object to one of the evacuation target blocks that are held in reserve. If the collector runs out of space in the evacuation reserve, it falls back to marking in place.

Given that the collector has to cope with failed evacuations, it is easy to give it the ability to pin any object in place. This proved useful when making the needed modifications to Guile: for example, when we copy a stack slice containing ambiguous references to a heap-allocated continuation, we eagerly traverse that stack to pin the referents of those ambiguous edges. Also, whenever the address of an object is taken and exposed to Scheme, we pin that object. This happens frequently for identity hashes (hashq).

Anyway, the bulk of the work here was a pile of refactors to Guile to allow a centralized scm_trace_object function to be written, exposing some object representation details to the internal object-tracing function definition while not exposing them to the user in the form of API or ABI.

bugs

I found quite a few bugs. Not many of them were in Whippet, but some were, and a few are still there; Guile exercises a GC more than my test workbench is able to. Today I’d like to write about a funny one that I haven’t fixed yet.

So, small objects in this garbage collector are managed by a Nofl space. During a collection, each pointer-containing reachable object is traced by a global user-supplied tracing procedure. That tracing procedure should call a collector-supplied inline function on each of the object’s fields. Obviously the procedure needs a way to distinguish between different kinds of objects, to trace them appropriately; in Guile, we use the low bits of the initial word of heap objects for this purpose.

Object marks are stored in a side table in associated 4-MB aligned slabs, with one mark byte per granule (16 bytes). 4 MB is 0x400000, so for an object at address A, its slab base is at A & ~0x3fffff, and the mark byte is offset by (A & 0x3fffff) >> 4. When the tracer sees an edge into a block scheduled for evacuation, it first checks the mark byte to see if it’s already marked in place; in that case there’s nothing to do. Otherwise it will try to evacuate the object, which proceeds as follows...

But before you read, consider that there are a number of threads which all try to make progress on the worklist of outstanding objects needing tracing (the grey objects). The mutator threads are paused; though we will probably add concurrent tracing at some point, we are unlikely to implement concurrent evacuation. But it could be that two GC threads try to process two different edges to the same evacuatable object at the same time, and we need to do so correctly!

With that caveat out of the way, the implementation is here. The user has to supply an annoyingly-large state machine to manage the storage for the forwarding word; Guile’s is here. Basically, a thread will try to claim the object by swapping in a busy value (-1) for the initial word. If the swap succeeds, it will allocate space for the object. If that allocation fails, it first marks the object in place, then restores the first word. Otherwise it installs a forwarding pointer in the first word of the object’s old location, which has a specific tag in its low 3 bits allowing forwarded objects to be distinguished from other kinds of object.

I don’t know how to prove this kind of operation correct, and probably I should learn how to do so. I think it’s right, though, in the sense that either the object gets marked in place or evacuated, all edges get updated to the tospace locations, and the thread that shades the object grey (and no other thread) will enqueue the object for further tracing (via its new location if it was evacuated).

But there is an invisible bug, and one that is the reason for me writing these words :) Whichever thread manages to shade the object from white to grey will enqueue it on its grey worklist. Let’s say the object is on a block to be evacuated, but evacuation fails, and the object gets marked in place. But concurrently, another thread goes to do the same; it turns out there is a timeline in which thread A has marked the object and published it to a worklist for tracing, but thread B has briefly swapped out the object’s first word with the busy value before realizing the object was marked. The object might then be traced with its initial word stompled, which is totally invalid.

What’s the fix? I do not know. Probably I need to manage the state machine within the side array of mark bytes, and not split between the two places (mark byte and in-object). Anyway, I thought that readers of this web log might enjoy a look in the window of this clown car.

next?

The obvious question is, how does it perform? Basically I don’t know yet; I haven’t done enough testing, and some of the heuristics need tweaking. As it is, it appears to be a net improvement over the non-moving configuration and a marginal improvement over BDW, though it currently has more variance. I am deliberately imprecise here because I have been more focused on correctness than performance; measuring properly takes time, and as you can see from the story above, there are still a couple of correctness issues. I will be sure to let folks know when I have something. Until then, happy hacking!

Nancy Nyambura

@nwnyambura

Outreachy Update: Two Weeks of Configs, Word Lists, and GResource Scripting

It has been a busy two weeks of learning as I continued to develop the GNOME Crosswords project. I have been mainly engaged in improving how word lists are managed and included using configuration files.

I started by writing documentation for how to add a new word list to the project using .conf files. The configuration files define properties like the display name, language, and origin of the word list, so that contributors can simply add new vocabulary datasets. Each word list can optionally pull in definitions from Wiktionary and parse them, converting them into resource files for use by the game.

In addition to this, I also scripted a program that takes config file contents and turns them into GResource XML files. This isn’t the bulk of the project, but a useful tool that automates part of the setup and ensures consistency between different word list entries. It takes in a .conf file and outputs a corresponding .gresource.xml.in file, mapping the necessary resources to suitable aliases. This was a good chance for me to learn more about Python’s argparse and configparser modules.
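
For the curious, a minimal sketch of that kind of converter could look like the following. The section and key names ("Data", "Alias") and the resource prefix are made-up assumptions for illustration, not the actual GNOME Crosswords conventions.

# Minimal sketch of a .conf -> .gresource.xml.in converter, in the spirit of
# the tool described above. Section/key names and the resource prefix are
# illustrative assumptions only.
import argparse
import configparser
import xml.etree.ElementTree as ET


def build_gresource_xml(conf_path, prefix):
    config = configparser.ConfigParser()
    config.read(conf_path)

    root = ET.Element("gresources")
    gresource = ET.SubElement(root, "gresource", prefix=prefix)

    # One <file> entry per word list described in the .conf file,
    # mapping each data file to a stable alias.
    for section in config.sections():
        data_file = config[section].get("Data", section + ".dic")
        alias = config[section].get("Alias", section.lower().replace(" ", "-"))
        entry = ET.SubElement(gresource, "file", alias=alias)
        entry.text = data_file

    return ET.ElementTree(root)


def main():
    parser = argparse.ArgumentParser(
        description="Generate a .gresource.xml.in from a word-list .conf file")
    parser.add_argument("conf", help="input .conf file")
    parser.add_argument("output", help="output .gresource.xml.in file")
    parser.add_argument("--prefix", default="/org/gnome/Crosswords/wordlists",
                        help="resource prefix (illustrative default)")
    args = parser.parse_args()

    tree = build_gresource_xml(args.conf, args.prefix)
    ET.indent(tree)  # pretty-print; available since Python 3.9
    tree.write(args.output, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    main()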

Beyond scripting, I’ve been in regular communication with my mentor, seeking feedback and guidance to improve both my technical and collaborative skills. One key takeaway has been the importance of sharing smaller, incremental commits rather than submitting a large block of work all at once, a practice that not only helps with clarity but also encourages consistent progress tracking. I was also advised to avoid relying on AI-generated code and instead focus on writing clear, simple, and understandable solutions, which I’ve consciously applied to both my code and documentation.

Next, I’ll be looking into how definitions are extracted and how importer modules work. Lots more to discover, especially about the innards of the Wiktionary extractor tool.

Looking forward to sharing more updates as I get deeper into the project.

Copyleft-next Relaunched!

I am excited that Richard Fontana and I have announced the relaunch of copyleft-next.

The copyleft-next project seeks to create a copyleft license for the next generation that is designed in public, by the community, using standard processes for FOSS development.

If this interests you, please join the mailing list and follow the project on the fediverse (on its Mastodon instance).

I also wanted to note that as part of this launch, I moved my personal fediverse presence from floss.social to bkuhn@copyleft.org.

Jussi Pakkanen

@jpakkane

Deoptimizing a red-black tree

An ordered map is typically slower than a hash map, but it is needed every now and then. Thus I implemented one in Pystd. This implementation does not use individually allocated nodes, but instead stores all data in a single contiguous array.

Implementing the basics was not particularly difficult. Debugging it to actually work took ages of staring at the debugger, drawing trees by hand on paper, printfing things out in Graphviz format and copypasting the output to a visualiser. But eventually I got it working. Performance measurements showed that my implementation is faster than std::map but slower than std::unordered_map.

So far so good.

The test application creates a tree with a million random integers. This means that the nodes are most likely in a random order in the backing store and searching through them causes a lot of cache misses. Having all nodes in an array means we can rearrange them for better memory access patterns.

I wrote some code to reorganize the nodes so that the root is at the first spot and the remaining nodes are stored layer by layer. In this way the query always processes memory "left to right" making things easier for the branch predictor.
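
For illustration, that relayout can be a straightforward breadth-first pass over the backing array. This is a hedged sketch under an assumed node layout and a -1 "no child" sentinel, not Pystd's actual code:

// Sketch: relayout an array-backed tree in breadth-first order, so the root
// lands at index 0 and each layer is stored contiguously.
#include <cstddef>
#include <cstdint>
#include <queue>
#include <vector>

struct Node {
    int key;
    int32_t left = -1;   // index into the backing array, -1 means "no child"
    int32_t right = -1;
};

std::vector<Node> relayout_breadth_first(const std::vector<Node>& nodes, int32_t root) {
    std::vector<int32_t> new_index(nodes.size(), -1);
    std::vector<int32_t> order;            // old indices in breadth-first order
    std::queue<int32_t> todo;
    if (root >= 0)
        todo.push(root);
    while (!todo.empty()) {
        const int32_t old = todo.front();
        todo.pop();
        new_index[old] = static_cast<int32_t>(order.size());
        order.push_back(old);
        if (nodes[old].left >= 0)
            todo.push(nodes[old].left);
        if (nodes[old].right >= 0)
            todo.push(nodes[old].right);
    }
    // Copy each node to its new slot and remap its child indices.
    std::vector<Node> out(order.size());
    for (std::size_t i = 0; i < order.size(); ++i) {
        Node n = nodes[order[i]];
        if (n.left >= 0)
            n.left = new_index[n.left];
        if (n.right >= 0)
            n.right = new_index[n.right];
        out[i] = n;
    }
    return out;
}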

Or that's the theory anyway. In practice this made things slower. And not even a bit slower, the "optimized version" was more than ten percent slower. Why? I don't know. Back to the drawing board.

Maybe interleaving both the left and right child node next to each other is the problem? That places two mutually exclusive pieces of data on the same cache line. An alternative would be to place the entire left subtree in one memory area and the right one in a separate one. Thinking about this for a while, this can be accomplished by storing the nodes in tree traversal order, i.e. in numerical order.

I did that. It was also slower than a random layout. Why? Again: no idea.

Time to focus on something else, I guess.

Steven Deobald

@steven

2025-07-05 Foundation Update

## The Cat’s Out Of The Bag

Since some of you are bound to see this Reddit comment, and my reply, it’s probably useful for me to address it in a more public forum, even if it violates my “No Promises” rule.

No, this wasn’t a shoot-from-the-hip reply. This has been the plan since I proposed a fundraising strategy to the Board. It is my intention to direct more of the Foundation’s resources toward GNOME development, once the Foundation’s basic expenses are taken care of. (Currently they are not.) The GNOME Foundation won’t stop running infrastructure, planning GUADEC, providing travel grants, or any of the other good things we do. But rather than the Foundation contributing to GNOME’s development exclusively through inbound/restricted grants, we will start to produce grants and fellowships ourselves.

This will take time and it will demand more of the GNOME project. The project needs clear governance and management or we won’t know where to spend money, even if we have it. The Foundation won’t become a kingmaker, nor will we run lotteries — it’s up to the project to make recommendations and help us guide the deployment of capital toward our mission.

 

## Friends of GNOME

So far, we have a cute little start to our fundraising campaign: I count 172 public Friends of GNOME over on https://donate.gnome.org/ … to everyone who contributes to GNOME and to everyone who donates to GNOME: thank you. Every contribution makes a huge difference and it’s been really heartwarming to see all this early support.

We’ve taken the first step out of our cozy f/oss spaces: Reddit. One user even set up a “show me your donation!” thread. It’s really cute. šŸ™‚ It’s hard to express just how important it is that we go out and meet our users for this exercise. We need them to know what an exciting time it is for GNOME: Windows 10 is dying, MacOS gets worse with every release, and they’re going to run GNOME on a phone soon. We also need them to know that GNOME needs their help.

Big thanks to Sri for pushing this and to him and Brage for moderating /r/gnome. It matters a lot to find a shared space with users and if, as a contributor, you’ve been feeling like you need a little boost lately, I encourage you to head over to those Reddit threads. People love what you build, and it shows.

 

## Friends of GNOME: Partners

The next big thing we need to do is to find partners who are willing to help us push a big message out across a lot of channels. We don’t even know who our users are, so it’s pretty hard to reach them. The more people see that GNOME needs their help, the more help we’ll get.

Everyone I know who runs GNOME (but doesn’t pay much attention to the project) said the same thing when I asked what they wanted in return for a donation: “Nothing really… I just need you to ask me. I didn’t know GNOME needed donations!”

If you know of someone with a large following or an organization with a lot of reach (or, heck, even a little reach), please email me and introduce me. I’m happy to get them involved to boost us.

 

## Friends of GNOME: Shell Notification

KDE, Thunderbird, and Blender have had runaway success with their small donation notification. I’m not sure whether we can do this for GNOME 49 or not, but I’d love to try. I’ve opened an issue here:

https://gitlab.gnome.org/Teams/Design/os-mockups/-/issues/274

We may not know who our users are. But our software knows who our users are. šŸ˜‰

 

## Annual Report

I should really get on this but it’s been a busy week with other things. Thanks everyone who’s contributed their thoughts to the “Successes for 2025” issue so far. If you don’t see your name and you still want to contribute something, please go ahead!

 

## Fiscal Controls

One of the aforementioned “other things” is Fiscal Controls.

This concept goes by many names. “Fiscal Controls”, “Internal Controls”, “Internal Policies and Procedures”, etc. But they all refer to the same thing: how to manage financial risk. We’re taking a three-pronged approach to start with:

  1. Reduce spend and tighten up policies. We have put the travel policy on pause (barring GUADEC, which was already approved) and we intend to tighten up all our policies.
  2. Clarity on capital shortages. We need to know exactly what our P&L looks like in any given month, and what our 3-month, 6-month, and annual projections look like based on yesterday’s weather. Our bookkeepers, Ops team, and new treasurers are helping with this.
  3. Clarity in reporting. A 501(c)(3) is … kind of a weird shape. Not everyone on the Board is familiar with running a business, and most are certainly not familiar with running a non-profit. So we need to make it painfully straightforward for everyone on the Board to understand the details of our financial position, without getting into the weeds: How much money are we responsible for, as a fiscal host? How much money is restricted? How much core money do we have? Accounting is more art than science, and the nuances of reporting accurately (but without forcing everyone to read a balance sheet) are a large part of why that’s the case. Again, we have a lot of help from our bookkeepers, Ops team, and new treasurers.

There’s a lot of work to do here and we’ll keep iterating, but these feel like strong starts.

 

## Organizational Resilience

The other aforementioned “other thing” is resilience. We have a few things happening here.

First, we need broader ownership, control, and access to bank accounts. This is, of course, related to, but different from, fiscal controls — our controls ensure no one person can sign themselves a cheque for $50,000. Multiple signatories ensure that such responsibility doesn’t rest with a single individual. Everyone at the GNOME Foundation has impeccable moral standing but people do die, and we need to add resilience to that inevitability. More realistically (and immediately), we will be audited soon and the auditors will not care how trustworthy we believe one another to be.

Second, we have our baseline processes: filing 990s, renewing our registration, renewing insurance, etc. All of these processes should be accessible to (and, preferably, executable by) multiple people.

Third, we’re finally starting to make good use of Vaultwarden. Thanks again, Bart, for setting this up for us.

Fourth, we need to ensure we have at least 3 administrators on each of our online accounts. Or, at worst, 2 administrators. Online accounts with an account owner should lean on an organizational account owner (not an individual) which multiple people control together. Thanks Rosanna for helping sort this out.

Last, we need at least 2 folks with root level access to all our self-hosted services. This is of course true in the most literal sense, but we also need our SREs to have accounts with each service.

 

## Digital Wellbeing Kickoff

I’m pleased to announce that the Digital Wellbeing contract has kicked off! The developer who was awarded the contract is Ignacy Kuchciński and he has begun working with Philip and Sam as of Tuesday.

 

## Office Hours

I had a couple pleasant conversations with hackers this week: Jordan Petridis and Sophie Harold. I asked Sophie what she thought about the idea of “office hours” as I feel like I’ve gotten increasingly disconnected from the community after my first few weeks. Her response was something to the effect of “you can only try.” šŸ™‚

So let’s do that. I’ll invite maintainers and if you’d like to join, please reach out to a maintainer to find out the BigBlueButton URL for next Friday.

 

## A Hacker In Need Of Help

We have a hacker in the southwest United States who is currently in an unsafe living situation. This person has given me permission to ask for help on their behalf. If you or someone you know could provide a safe temporary living situation within the continental United States, please get in touch with me. They just want to hack in peace.

 

Hans de Goede

@hansdg

Recovering a FP2 which gives "flash write failure" errors

This blog post describes my successful OS re-install on a Fairphone 2 which was giving "flash write failure" errors when flashing it with fastboot, with the flash_FP2_factory.sh script. I'm writing down my recovery steps for this in case they are useful for anyone else.

I believe that this is caused by the bootloader code which implements fastboot not having the ability to retry recoverable eMMC errors. It is still possible to write the eMMC from Linux which can retry these errors.

So we can recover by directly fastboot-ing a recovery.img and then flashing things over adb.

See step by step instructions... )


This Week in GNOME

@thisweek

#207 Replacing Shortcuts

Update on what happened across the GNOME project in the week from June 27 to July 04.

GNOME Core Apps and Libraries

Sophie šŸ³ļøā€šŸŒˆ šŸ³ļøā€āš§ļø (she/her) reports

The Release Team is happy to announce that Papers will be the default Document Viewer starting with GNOME 49. This comes after a Herculean effort of the Papers maintainers and contributors that started about four years ago. The inclusion into GNOME Core was lately blocked only by missing screen-reader support, which is now ready to be merged. Papers is a fork of Evince motivated by a faster pace of development.

Papers is not just a GTK 4 port but also brings new features like better document annotations and support for mobile form factors. It is currently maintained by Pablo Correa Gomez, Qiu Wenbo, Markus Göllnitz, and lbaudin.

Emmanuele Bassi reports

While GdkPixbuf, the elderly statesperson of image loading libraries in GNOME, is being phased out in favour of better alternatives, like Glycin, we are still hard at work to ensure it’s working well enough while applications and libraries are ported. Two weeks ago, GdkPixbuf acquired a safe, sandboxed image loader using Glycin; this week, this loader has been updated to be the default on Linux. The Glycin loader has also been updated to read SVG, and save image data including metadata. Additionally, GdkPixbuf has a new Android-native loader, using platform API; this allows loading icon assets when building GTK for Android. For more information, see the release notes for GdkPixbuf 2.43.3, the latest development snapshot.

Sophie šŸ³ļøā€šŸŒˆ šŸ³ļøā€āš§ļø (she/her) announces

The nightly GNOME Flatpak runtime and SDK org.gnome.Sdk//master are now based on the Freedesktop runtime and SDK 25.08beta. If you are using the nightly runtime in your Flatpak development manifest, you might have to adjust a few things:

  • If you are using the LLVM extension, the required sdk-extensions is now org.freedesktop.Sdk.Extension.llvm20. Don’t forget to also adjust the append-path. On your development system you will probably also have to run flatpak install org.freedesktop.Sdk.Extension.llvm20//25.08beta.
  • If you are using other SDK extensions, they might also require a newer version. They can be installed with commands like flatpak install org.freedesktop.Sdk.Extension.rust-stable//25.08beta.

Libadwaita ↗

Building blocks for modern GNOME apps using GTK4.

Alice (she/her) šŸ³ļøā€āš§ļøšŸ³ļøā€šŸŒˆ says

libadwaita finally has a replacement for the deprecated GtkShortcutsWindow - AdwShortcutsDialog. AdwShortcutLabel is available as a separate widget as well, replacing GtkShortcutLabel

Calendar ↗

A simple calendar application.

Hari Rana | TheEvilSkeleton (any/all) 🇮🇳 šŸ³ļøā€āš§ļø announces

Happy Disability Pride Month everybody :)

During the past few weeks, there’s been an overwhelming amount of progress with accessibility on GNOME Calendar:

  • Event widgets/popovers will convey to screen readers that they are toggle buttons. They will also convey of their states (whether they’re pressed or not) and that they have a popover. (See !587)
  • Calendar rows will convey to screen readers that they are check boxes, along with their states (whether they’re checked or not). Additionally, they will no longer require a second press of a tab to get to the next row; one tab will be sufficient. (See !588)
  • Month and year spin buttons are now capable of being interacted with using arrow up/down buttons. They will also convey to screen readers that they are spin buttons, along with their properties (current, minimum, and maximum values). The month spin button will also wrap, where going back a month from January will jump to December, and going to the next month from December will jump to January. (See !603)
  • Events in the agenda view will convey to screen readers of their respective titles and descriptions. (See !606)

All these improvements will be available in GNOME 49.

Accessibility on Calendar has progressed to the point where I believe it’s safe to say that, as of GNOME 49, Calendar will be usable exclusively with a keyboard, without significant usability friction!

There’s still a lot of work to be done in regards to screen readers, for example conveying time appropriately and event descriptions. But really, just 6 months ago, we went from having absolutely no idea where to even begin with accessibility in Calendar — which has been an ongoing issue for literally a decade — to having something workable exclusively with a keyboard and screen reader! :3

Huge thanks to Jeff Fortin for coordinating the accessibility initiative, especially with keeping the accessibility meta issue updated; Georges Stavracas for single-handedly maintaining GNOME Calendar and reviewing all my merge requests; and Lukáš Tyrychtr for sharing feedback in regards to usability.

All my work so far has been unpaid and voluntary; hundreds of hours were put into developing and testing all the accessibility-related merge requests. I would really appreciate if you could spare a little bit of money to support my work, thank you! 🩷

Glycin ↗

Sandboxed and extendable image loading and editing.

Sophie šŸ³ļøā€šŸŒˆ šŸ³ļøā€āš§ļø (she/her) reports

We recently switched our legacy image loading library GdkPixbuf over to using glycin internally, which is our new image loading library. Glycin is safer, faster, and supports more features. Something that we missed is how much software depends on the image saving capabilities of GdkPixbuf for different formats. But that’s why we are making such changes early in the cycle to find these issues.

Glycin now supports saving images for the AVIF, BMP, DDS, Farbfeld, GIF, HEIC, ICO, JPEG, OpenEXR, PNG, QOI, TGA, TIFF, and WebP image formats. JXL will hopefully follow. This means GdkPixbuf can also save the formats that it could save before. The changes are available as glycin 2.0.alpha.6 and gdk-pixbuf 2.43.3.

Third Party Projects

Alexander Vanhee says

Gradia has been updated with the ability to upload edited images to an online provider of choice. I made sure users are both well informed about these services and can freely choose without being forced to use any particular one. The data related to this feature can also be updated dynamically without requiring a new release, enabling us to quickly address any data quality issues and update the list of providers as needed, without relying on additional package maintainer intervention.

You can find the app on Flathub.

Bilal Elmoussaoui reports

I have released a MCP (Model Context Protocol) server implementation that allows LLMs to access and interact with your favourite desktop environment. The implementation is available at https://github.com/bilelmoussaoui/gnome-mcp-server and you can read a bit more about it in my recent blog post https://belmoussaoui.com/blog/21-mcp-server

Phosh ↗

A pure wayland shell for mobile devices.

Guido reports

Phosh 0.48.0 is out:

There’s a new lock screen plugin that shows all currently running media players (that support the MPRIS interface). You can thus switch between Podcasts, Shortwave and Gapless without having to unlock the phone.

We also updated phosh’s compositor phoc to wlroots 0.19.0, bringing all the goodies from this release. Phoc now also remembers the output scale in case the automatic scaling doesn’t match your expectations.

There’s more; see the full details here

That’s all for this week!

See you next week, and be sure to stop by #thisweek:gnome.org with updates on your own projects!

Richard Littauer

@rlittauer

A handful of EDs

I had the great privilege of going to UN Open Source Week at the UN, in New York City, last month. At one point, standing on the upper deck and looking out over the East River, I realized that there were more than a few former and current GNOME executive directors. So, we got a photo.

Six people in front of a river on a building

Stormy, Karen, Jeff, me, Steven, and Michael – not an ED, but the host of the event and the former board treasurer – all lined up.

Fun.

Edit: Apparently Jeff was not an ED, but a previous director. I wonder if there is a legacy note of all previous appointments…

Carlos Garnacho

@garnacho

Developing an application with TinySPARQL in 2025

Back a couple of months ago, I was given the opportunity to talk at LAS about search in GNOME, and the ideas floating around to improve it. Part of the talk was dedicated to touting the benefits of TinySPARQL as the base for filesystem search, and how in solving the crazy LocalSearch usecases we ended up with a very versatile tool for managing application data, either application-private or shared with other peers.

It was none other than our (then) future ED in a trench coat (I figure!) who forced my hand in the question round into teasing an application I had been playing with, to showcase how TinySPARQL should be used in modern applications. Now, after finally having spent some more time on it, I feel it’s up to a decent enough level of polish to introduce it more formally.

Behold Rissole

Picture of Rissole UI.

Rissole is a simple RSS feed reader, intended to let you read articles in a distraction-free way, and to keep them all for posterity. It also sports an extremely responsive full-text search over all those articles, even on huge data sets. It is built as a Flatpak; for now you can download it from CI to try it, while it makes its way to Flathub and GNOME Circle (?). Your contributions are welcome!

So, let’s break down how it works, and what TinySPARQL brings to the table.

Structuring the data

The first thing a database needs is a definition of how the data is structured. TinySPARQL is strongly based on RDF principles, and depends on RDF Schema for these data definitions. You have the internet at your fingertips to read more about these, but the gist is that it allows the declaration of data in an object-oriented manner, with classes and inheritance:

mfo:FeedMessage a rdfs:Class ;
    rdfs:subClassOf mfo:FeedElement .

mfo:Enclosure a rdfs:Class ;
    rdfs:subClassOf mfo:FeedElement .

One can declare properties on these classes:

mfo:downloadedTime a rdf:Property ;
    nrl:maxCardinality 1 ;
    rdfs:domain mfo:FeedMessage ;
    rdfs:range xsd:dateTime .

And make some of these properties point to other entities of specific (sub)types; this is the key that makes TinySPARQL a graph database:

mfo:enclosureList a rdf:Property ;
    rdfs:domain mfo:FeedMessage ;
    rdfs:range mfo:Enclosure .

In practical terms, a database needs some guidance on what data access patterns are most expected. Being an RSS reader, sorting things by date will be prominent, and we want full-text search on content. So we declare it on these properties:

nie:plainTextContent a rdf:Property ;
    nrl:maxCardinality 1 ;
    rdfs:domain nie:InformationElement ;
    rdfs:range xsd:string ;
    nrl:fulltextIndexed true .

nie:contentLastModified a rdf:Property ;
    nrl:maxCardinality 1 ;
    nrl:indexed true ;
    rdfs:subPropertyOf nie:informationElementDate ;
    rdfs:domain nie:InformationElement ;
    rdfs:range xsd:dateTime .

The full set of definitions declares what the database is permitted to contain, the class hierarchy and their properties, how resources of a specific class interrelate with other classes… In essence, how the information graph is allowed to grow. This is its ontology (semi-literally, its view of the world, whoooah duude). You can read in more detail how these declarations work in the TinySPARQL documentation.

This information is kept in files separate from the code, built into the application binary as a GResource, and used during initialization to create a database at a location under the application’s control:

    let mut store_path = glib::user_data_dir();
    store_path.push("rissole");
    store_path.push("db");

    obj.imp()
        .connection
        .set(tsparql::SparqlConnection::new(
            tsparql::SparqlConnectionFlags::NONE,
            Some(&gio::File::for_path(store_path)),
            Some(&gio::File::for_uri(
                "resource:///com/github/garnacho/Rissole/ontology",
            )),
            gio::Cancellable::NONE,
        )?)
        .unwrap();

So there’s a first advantage right here, compared to other libraries and approaches: the application only has to declare this ontology, with little (or no) further supporting code. Compare that to going through the design/normalization steps for your database design and having to CREATE TABLE your way to it with SQLite.

Handling structure updates

If you are developing an application that needs to store a non-trivial amount of data, it often comes as an afterthought how to deal with new data becoming necessary, stored data no longer being necessary, and other post-deployment data/schema migrations. Things rarely come out exactly right on the first try.

With few documented exceptions, TinySPARQL is able to handle these changes to the database structure by itself, applying the necessary changes to convert a pre-existing database into the new format declared by the application. This also happens at initialization time, from the application-provided ontology.
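For example, adding a brand-new property to the ontology files is all it takes; on the next startup, TinySPARQL alters the pre-existing database to match. (mfo:starred below is a made-up property, purely to illustrate the shape of such a change.)

mfo:starred a rdf:Property ;
    nrl:maxCardinality 1 ;
    rdfs:domain mfo:FeedMessage ;
    rdfs:range xsd:boolean .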

But of course, besides the data structure, there might also be data content that needs some kind of conversion or migration; this is where an application might still need some supporting code. Even then, SPARQL offers the necessary syntax to convert data, from small to big, from minor to radical changes. With the CONSTRUCT query form, you can generate any RDF graph from any other RDF graph.
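As a tiny, hypothetical sketch of such a migration (mfo:legacyDownloadedTime is invented here just to show the shape of a CONSTRUCT rewrite):

CONSTRUCT {
    ?msg mfo:downloadedTime ?time .
}
WHERE {
    ?msg a mfo:FeedMessage ;
        mfo:legacyDownloadedTime ?time .
}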

For Rissole, I’ve gone with a subset of the Nepomuk ontology, which contains much embedded knowledge about the best ways to lay out data in a graph database. As such I don’t expect major changes or gotchas in the data, but this remains a possibility for the future, e.g. if we were to move to another emerging ontology, or for any other less radical data migrations that might crop up.

So here’s the second advantage, compared to having to ALTER TABLE your way to new database schemas, handle data migration for each individual table, and ensure you will not paint yourself into a corner in the future.

Querying data

Now we have a database! We can write the queries that will feed the application UI. Of course, the language to write these in is SPARQL; there are plenty of resources about it on the internet, and TinySPARQL has its own tutorial in the documentation.

One feature that sets TinySPARQL apart from other SPARQL engines in terms of developer experience is its support for parameterized values in SPARQL queries. Through a little bit of non-standard syntax and the TrackerSparqlStatement API, you can compile SPARQL queries into reusable statements, which can be executed with different arguments and are compiled to an intermediate representation, resulting in faster execution when reused. Statements are also the way to go in terms of security, in order to avoid query injection. This is, for example, Rissole’s (simplified) search query:

SELECT
    ?urn
    ?title
{
    ?urn a mfo:FeedMessage ;
        nie:title ?title ;
        fts:match ~match .
}

This allows me to funnel GtkEntry content right into ~match without caring about character escaping or other validation. These queries may also be stored in a GResource, live as separate files in the project tree, and be loaded/compiled once early during application startup, so they are reusable during the rest of the application lifetime:

fn load_statement(&self, query: &str) -> tsparql::SparqlStatement {
    let base_path = "/com/github/garnacho/Rissole/queries/";

    let stmt = self
        .imp()
        .connection
        .get()
        .unwrap()
        .load_statement_from_gresource(&(base_path.to_owned() + query), gio::Cancellable::NONE)
        .unwrap()
        .expect(&format!("Failed to load {}", query));

    stmt
}

...

// Pre-loading a statement
obj.imp()
    .search_entries
    .set(obj.load_statement("search_entries.rq"))
    .unwrap();

...

// Running a search
pub fn search(&self, search_terms: &str) -> tsparql::SparqlCursor {
    let stmt = self.imp().search_entries.get().unwrap();

    stmt.bind_string("match", search_terms);
    stmt.execute(gio::Cancellable::NONE).unwrap()
}

This data is of course all introspectable with the gresource CLI tool, and I can run these queries from a file using the tinysparql query CLI command, either on the application database itself, or on a separate in-memory testing database created through e.g. tinysparql endpoint --ontology-path ./src/ontology --dbus-service=a.b.c.

Here’s the third advantage for application development. Queries are 100% separate from code, introspectable, and able to be run standalone for testing, while the code remains highly semantic.

Inserting and updating data

When inserting data, we have two major pieces of API to help with the task, each with their own strengths:

  • TrackerSparqlStatement also works for SPARQL update queries.
  • TrackerResource offers more of a builder API to generate RDF data.
These can be either executed standalone, or combined/accumulated in a TrackerBatch for transactional behavior. Batches also improve performance by clustering writes to the database, and database stability by making these changes either succeed or fail atomically (TinySPARQL is fully ACID).

    This interaction is the most application-dependent part (concretely, retrieving the data to insert into the database), but here are some links to Rissole code for reference: using TrackerResource to store RSS feed data, and using TrackerSparqlStatement to delete RSS feeds.
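    As a rough sketch of the batch approach (in the same fragmentary style as the snippets above; the method names assume the tsparql Rust bindings mirror the C TrackerResource/TrackerBatch API, so double-check against the linked Rissole code):

    // Describe a feed message as an RDF resource (property names come from the
    // ontology above; the exact binding signatures here are an assumption).
    let resource = tsparql::Resource::new(None);
    resource.set_uri("rdf:type", "mfo:FeedMessage");
    resource.set_string("nie:title", "An article title");
    resource.set_string("nie:plainTextContent", "Full article text");

    // Accumulate writes in a batch so they are applied atomically, using the
    // connection set up during initialization.
    let batch = connection.create_batch();
    batch.add_resource(None, &resource);
    batch.execute(gio::Cancellable::NONE).unwrap();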

    And here is the fourth advantage for your application: an async-friendly mechanism to efficiently manage large amounts of data, ready for use.

    Full-text search

    For some reason, there tends to be some magical thinking revolving around databases and how they make things fast. And the most damned pattern of all is typically at the heart of search UIs: substring matching. What feels wonderful during initial development on small datasets soon slows to a crawl on larger ones. See, an index is little more than a tree: you can look up exact items at relatively low big-O cost, look up by prefix at a slightly higher one, and for anything else (substring, suffix) there is nothing to do but a linear scan. Sure, the database engine will comply, however painstakingly.

    What makes full-text search fundamentally different? This is a specialized index that makes an effort to pre-tokenize the text, so that each parsed word and term is represented individually, and can be looked up independently (either prefix or exact matches). At the expense of a slightly higher insertion cost (i.e. the usually scarce operation), this provides response times measured in milliseconds when searching for terms (i.e. the usual operation) regardless of their position in the text, even on really large data sets. Of course this is a gross simplification (SQLite has extensive documentation about the details), but I hopefully shed enough light on why full-text search can make things fast in a way a traditional index cannot.

    I am largely parroting a SQLite feature here, and yes, this might also be available to you if you use SQLite directly, but the fact that I’ve already taught you in this post how to use it in TinySPARQL (declaring nrl:fulltextIndexed on the searchable properties, using fts:match in queries to match on them) again contrasts quite a bit with rolling your own database creation code. So here’s another advantage.

    Backups and other bulk operations

    Once you have got your data stored, is it enshrined? Are there forward plans to get the data back out of there? Is the backup strategy cp?

    TinySPARQL (and the SPARQL/RDF combo at its core) boldly says no. Data is fully introspectable, and the query language is powerful enough to extract even a full data dump in a single query, if you wish. This is for example available through the command line with tinysparql export and tinysparql import: the full database content can be serialized into any of the supported RDF formats, and can be either post-processed or snapshotted into other SPARQL databases from there.

    A “small” detail I have not mentioned so far is the (optional) network transparency of TinySPARQL, since for the most part it is irrelevant for use cases like Rissole. Coming from web standards, network awareness is of course a big component of SPARQL. In TinySPARQL, creating an endpoint to publicly access a database is an explicit choice made through API, and so it is possible to access other endpoints either from a dedicated connection or by extending your local queries. Why do I bring this up here? I talked at GUADEC 2023 about Emergence, a local-first oriented data synchronization mechanism between devices owned by the same user. Network transparency sits at the heart of this mechanism, which could make Rissole, or any other application that made use of it, able to synchronize data between devices.

    And this is the last advantage I’ll bring up today: a solid, standards-based forward plan for the stored data.

    Closing note

    If you develop an application that does need to store data, future you might appreciate some forward thinking on how to handle a lifetime’s worth of it. More artisan solutions like SQLite or file-based storage might set you up quickly for other, more fun development and thus be a temptation, but they will likely degrade rapidly in performance unless you know very well what you are doing, and will certainly increase your project’s technical debt over time.

    TinySPARQL wraps all major advantages of SQLite with a versatile data model and query language strongly based on open standards. The degree of separation between the data model and the code makes both neater and more easily testable. And it’s got forward plans in terms of future data changes, backups, and migrations.

    As everything is always subject to improvement, there are some things that would make for a better developer experience:

    • Query and schema definition files could be linted/validated as a preprocess step when embedding in a GResource, just as we validate GtkBuilder files
    • TinySPARQL’s builtin web IDE started during last year’s GSoC should move forward, so we have an alternative to the CLI
    • There could be graphical ways to visualize and edit these schemas
    • Similar thing, but to visually browse a database content
    • I would not dislike if some of these were implemented in/tied to GNOME Builder
    • It would be nice to have a more direct way to funnel the results of a SPARQL query into UI. Sadly, the GListModel interface API mandates random access and does not play nice with cursor-like APIs, as are common with databases. This at least rules out making TrackerSparqlCursor simply implement GListModel.
    • A more streamlined website to teach and showcase these benefits; currently tinysparql.org points to the developer documentation (extensive, on the other hand, but it does not make a great landing page).

    Even though the developer experience could be further buttered up, there’s a solid core that is already a leap ahead of other, more artisan solutions in a few areas. I would also like to point out that Rissole is not the first instance here: Polari and Health also use TinySPARQL databases this way, and are mostly up to date on these best practices. Rissole is just my shiny new excuse to talk about this in detail; other application developers might appreciate the resource, and I wish it to become one of many, so Emergence finally has a worthwhile purpose.

    Last but not least, I would like to thank Kristi, Anisa, and all organizers at LAS for a great conference.

    Alley Chaggar

    @AlleyChaggar

    Demystifying The Codegen Phase Part 2

    Intro

    Hello again, I’m here to update my findings and knowledge about Vala. In my last blog, I talked about the codegen phase; as intricate as it is, I’m finding some very helpful information that I want to share.

    Looking at The Outputted C Code

    While doing the JSON module, I’m constantly looking at C code. Back and forth, back and forth; having more than one monitor is very helpful in times like these.

    At the beginning of GSoC I didn’t know much C, and that has definitely changed. I’m still not fluent in it, but I can finally read the code and understand it without too much brain power. For the JsonModule I’m creating, I first looked at how users can currently (de)serialize JSON. I’ve been scouting json-glib examples since then, and for now, I will be using json-glib. In the future, however, I’ll look at other ways in which we can have JSON more streamlined in Vala, whether that means growing away from json-glib or not.

    Using the command ‘valac -C yourfilename.vala’, you’ll be able to see the C code that Valac generates. If you were to look into it, you’d see a bunch of temporary variables and C functions. It can be a little overwhelming to see all this if you don’t know C.

    When writing JSON normally, with minimal customization and without the JsonModule’s support, you would write it like this:

    Json.Node node = Json.gobject_serialize (person);
    Json.Generator gen = new Json.Generator ();
    gen.set_root(node);
    string result = gen.to_data (null);
    print ("%s\n", result); 
    

    This code shows one way to serialize a GObject class using json-glib.
    The code below is a snippet of the C code that Valac outputs for this example. Again, to be able to see this, you have to pass the -C flag when compiling your Vala code.

    static void
    _vala_main (void)
    {
    		Person* person = NULL;
    		Person* _tmp0_;
    		JsonNode* node = NULL;
    		JsonNode* _tmp1_;
    		JsonGenerator* gen = NULL;
    		JsonGenerator* _tmp2_;
    		gchar* _result_ = NULL;
    		gchar* _tmp3_;
    		_tmp0_ = person_new ();
    		person = _tmp0_;
    		person_set_name (person, "Alley");
    		person_set_age (person, 2);
    		_tmp1_ = json_gobject_serialize ((GObject*) person);
    		node = _tmp1_;
    		_tmp2_ = json_generator_new ();
    		gen = _tmp2_;
    		json_generator_set_root (gen, node);
    		_tmp3_ = json_generator_to_data (gen, NULL);
    		_result_ = _tmp3_;
    		g_print ("%s\n", _result_);
    		_g_free0 (_result_);
    		_g_object_unref0 (gen);
    		__vala_JsonNode_free0 (node);
    		_g_object_unref0 (person);
    }
    

    You can see many temporary variables denoted by names like _tmp*_, but you can also see the JsonNode being created, Json’s generator being created and its root being set, and you can even see json gobject serialize. All of this was in our Vala code, and now it’s all in the C code, with temporary variables holding the intermediate values so it compiles cleanly as C.

    The JsonModule

    If you recall, the codegen is where Vala code and the C code being generated meet. The steps I’m taking for the JsonModule are looking at the examples to (de)serialize, then looking at how the example compiled to C, since the whole purpose of my work is to write out how the C should look. I’m mainly going off of the C _vala_main function when determining which C code I should put into my module, but I’m also going off of the Vala code the user wrote.

    // serializing gobject classes
    	void generate_gclass_to_json (Class cl) {
    		cfile.add_include ("json-glib/json-glib.h");
    
    		var to_json_class = new CCodeFunction ("_json_%s_serialize_myclass".printf (get_ccode_lower_case_name (cl, null)), "void");
    		to_json_class.add_parameter (new CCodeParameter ("gobject", "GObject *"));
    		to_json_class.add_parameter (new CCodeParameter ("value", " GValue *"));
    		to_json_class.add_parameter (new CCodeParameter ("pspec", "GParamSpec *"));
    		
    		//...
    
    		var Json_gobject_serialize = new CCodeFunctionCall (new CCodeIdentifier ("json_gobject_serialize"));
    		Json_gobject_serialize.add_argument (new CCodeIdentifier ("gobject"));
    
    		// Json.Node node = Json.gobject_serialize (person); - vala code
    		var node_decl_right = new CCodeVariableDeclarator ("node", Json_gobject_serialize);
    		var node_decl_left = new CCodeDeclaration ("JsonNode *");
    		node_decl_left.add_declarator (node_decl_right);
    
    		// Json.Generator gen = new Json.Generator (); - vala code
    		var gen_decl_right = new CCodeVariableDeclarator ("generator", json_gen_new);
    		var gen_decl_left = new CCodeDeclaration ("JsonGenerator *");
    		gen_decl_left.add_declarator (gen_decl_right);
    
    		// gen.set_root(node); - vala code
    		var json_gen_set_root = new CCodeFunctionCall (new CCodeIdentifier ("json_generator_set_root"));
    		json_gen_set_root.add_argument (new CCodeIdentifier ("generator"));
    		json_gen_set_root.add_argument (new CCodeIdentifier ("node"));
    		//...
    

    The code snippet above is a work-in-progress method in the JsonModule that I created, called ‘generate_gclass_to_json’, to generate serialization for GObject classes. I’m creating a C code function and adding parameters to it. I’m also filling in the body with how the example code did the serializing in the first code snippet. Instead of the function calls being created in _vala_main (by the user), they’ll live in their own function that the module generates automatically.

    static void _json_%s_serialize_myclass (GObject *gobject, GValue *value, GParamSpec *pspec)
    {
    	JsonNode *node = json_gobject_serialize (gobject);
    	JsonGenerator *generator = json_generator_new ();
    	json_generator_set_root (generator, node);
    	//...
    }
    

    Comparing the original Vala code with the compiled C code, the output takes the shape of the Vala code, but it’s written in C.

    Dev Log June 2025

    May and June in one convenient location.

    libopenraw

    Released 0.4.0.alpha10

    After that, added Nikon Z5 II and P1100, Sony 6400A and RX100M7A, Panasonic S1II and S1IIE, DJI Mavic 3 Pro Cinema (support for Nikon and Sony mostly incomplete, so is Panasonic decompression), Fujifilm X-E5 and OM Systems OM-5 II.

    gnome-raw-thumbnailer

    Updated to the latest libopenraw.

    Released 48.0

    flathub-cli

    This is a project I started a while ago but put on the back burner due to scheduling conflicts. It's a command line tool to integrate all the tasks of maintaining flatpak packages for flathub. Some stuff isn't flathub specific though. I already have a bunch of scripts I use, and this is meant to be the next level. It also merges in my previous tool, flatpak-manifest-generator, an interactive tool to generate flatpak manifests.

    One thing I had left in progress, and have now finished at least the basics of, is the cleanup command to purge downloads. The rationale is that when you update a package manifest, you change the sources, but the old ones that have already been downloaded are still kept. The cleanup downloads command will find these unused sources and delete them for you. I really needed this.

    flathub-cli is written in Rust.

    AbiWord

    Fixing some annoying bugs (regressions) in master, and some memory leakage in both stable and master, a lot of it in the Gtk UI code. I also fixed a crash when editing lists in 3.0.6 that was due to some code not touched since 2004, and even then that part is probably even older. The short story is that updating a value in the StringMap<> updated the key, whose pointer ended up being held somewhere else. Yep, dangling pointer. The fix was to not update the key if it is the same.

    On master only, I also started fixing the antiquated C++ syntax. For some reason there was a lot of typedef enum and typedef struct in the C++, probably an artifact of the code's late-90s origin. At the same time I moved to #pragma once for the headers. Let the compiler handle it. I also fixed a crash when saving a document with revisions.

    The New Troll Diet

    I have been thinking a lot about online harassment in software communities lately.

    Harassment is nothing new in our spaces, and I even have a bunch of fun stories from trolls, past and new. However, all these stories have one thing in common: they are irrelevant to modern harassment and trolling. So I would like to humbly propose a new framing of this whole issue.

    Harassment In The Troll Feeding Days

    Perhaps the most jarring change in online culture has been in how harassment happens on the internet. Spending our formative years in forums, IRC, and mailing lists, we got used to the occasional troll that after a few annoying interactions would get blocked by an admin.

    Back then, a troll was limited to baiting for replies, and that power was easy to take away. Remember, removing a troll was as simple as blocking an email address or banning an IP on IRC.

    In short: Don't feed the troll and it will either get bored and go away, or be blocked by an admin. Right?

    Online Harassment Is a Different Game Now

    The days of starving trolls are over. Trolls now have metaphorical DoorDash, UberEats, and are even decent cooks themselves.

    It is now impossible to defend an online community by simply "blocking the bad apples". A determined troll now has access to its own audience, peers to amplify their message, and even attack tools that used to be exclusive to nation states.

    A DDoS attack can be implemented with a few dozen dollars and cost thousands to defend. Social media accounts can be bought by the hundreds. Doxxing is easy for motivated individuals. Harassment campaigns can be orchestrated in real-time to flood comment sections, media outlets, employer inboxes, and even deplatform creators.

    Deterrence used to work because the trolls would lose access to attention and relevance if banned. This is no longer the case. In fact, trolls now have a lot to gain by building an audience around being ostracized by their targets, portraying themselves as brave truth tellers that are censored by evil-doers.

    A strange game indeed, and not playing it doesn't work anymore.

    Rules Are No Longer Enough

    All of the above means that online communities can no longer point to the "No Trolls Allowed" sign and consider the job done; this "rules-based" framework is no longer a viable deterrent. A different approach is needed, one that is not naive to the ruses and concern trolling of contemporary harassment.

    A relevant example comes to mind. The popular "Nazi Bar" story as told by Michael Tager:

    "(...) Tager recounted visiting a "shitty crustpunk bar" where he saw a patron abruptly expelled: the bartender explained that the man was wearing "iron crosses and stuff", and that he feared such patrons would become regulars and start bringing friends if not promptly kicked out, which would lead him to realize "oh shit, this is a Nazi bar now" only after the unwanted patrons became too "entrenched" to kick out without trouble."

    (...) "(Internet slang) A space in which bigots or extremists have come to dominate due to a lack of moderation or by moderators wishing to remain neutral or avoid conflict." From Wiktionary

    The story is not about the necessity of having a better rulebook. No, the point is that, in some circumstances, moderation cannot afford to be naive and has to see through the ruse of bad actors appealing to tolerance or optics. Sometimes you have to loudly tell someone to fuck off, and kick them out.

    This might seem counterintuitive if you grew up in the "don't feed the troll" era. But trolls no longer need the attention of their victims to thrive. In fact, sometimes silence and retreat from conflict are even bigger rewards.

    The Trap Card of Behavioral Outbursts

    Because the rules-based framework considers any engagement a failure, it leads groups to avoid conflict at all cost, not realizing that they are already in conflict with their harassers. Taken to an extreme, any push-back against harassment is seen as bad as the harassment itself. This flawed reasoning might even lead to throwing others under the bus, or walking back statements of support, all done in the name of keeping the harassers seemingly silent.

    Unfortunately, conceding to trolls after receiving push-back is one of behavioral psychology's "trap cards". The concept is formally known as "Behavioral Outburst" and describes how a subject will intensify an unwanted behavior after receiving push-back. The classic example is a kid having a tantrum:

    A kid is at the store with their parent. The kid starts crying, asking for a new toy. The parent says no and warns the kid that they will go back home if they keep crying.

    The kid keeps crying and the parent decides to fulfill the warning to go back home.

    As a response to this consequence, the kid then has an outburst of the unwanted behavior: louder crying, screaming, throwing themselves to the floor.

    The parent gets overwhelmed and ends up buying a new toy for the kid.

    The above example is commonly used to demonstrate two concepts:

    1. When an unwanted behavior is met with resistance, it frequently leads to an outburst of that behavior to "defeat" such resistance
    2. If the outburst succeeds, then the outburst becomes the new baseline for responding to any resistance

    We should understand that applying consequences to a harasser (bans, warnings, condemnation) is likely to cause an outburst of the unwanted behavior. This is unavoidable. However, it is a fatal mistake to cede to a behavioral outburst. If consequences are taken back, then the outburst becomes the new default level of harassment.

    Even worse, an illusion of control is introduced: we harass, they fight back; we intensify the harassment a little bit, they concede.

    Why Speaking Up Is Important

    Communities are not corporations, and morale is not set by a rule-book or by mandate of leadership. Communities, especially the ones giving away tens of thousands of dollars in value to each other, are held together by mutual trust.

    One element of this mutual trust, maybe the most important one, is knowing that your colleagues have your back and will defend you from anyone unfairly coming after you. Just like a soccer team will swarm a rival to defend a teammate.

    Knowing that your team will loudly tell those coming after you to fuck off is not only good for morale, but also a necessary outlet and catharsis for a community. Silence only lets the most rancid vibes fester; it erodes trust and creates feelings of isolation in the targeted individuals.

    If solidarity and empathy are not demonstrated, is that any different from there being none?

    A New Framework: Never Cede To The Troll

    We need a new framework for how to defend against "trolls". The feeding metaphor ran its course many years ago. It is done and will not be coming back.

    New online risks demand that we adapt and become proactive in protecting our spaces. We have to loudly and proudly set the terms of what is permissible. Those holding social or institutional power in communities should be willing to drop a few loud fuck offs to anyone trying to work their way in by weaponizing optics, concern trolling, or the well known "tolerance paradox". Conceding through silence, or self-censorship, only emboldens those who benefit from attacking a community.

    It is time that we adopt a bolder framework where defending our spaces and standing our ground to protect each other is the bare minimum expected.

    Victor Ma

    @victorma

    Bugs, bugs, and more bugs!

    In the past two weeks, I worked on two things:

    • Squashing a rebus bug.
    • Combining the two suggested words lists into one.

    The rebus bug

    A rebus cell is a cell that contains more than one letter in it. These aren’t too common in crossword puzzles, but they do appear occasionally—and especially so in harder puzzles.

    A rebus cell

    Our word suggestions lists were not working for slots with rebus cells. More specifically, if the cursor was on a cell within (number of letters in the rebus - 1) cells to the right of a rebus cell, then an assertion would fail, and the word suggestions list would be empty. For example, with a three-letter rebus cell, the two cells immediately to its right triggered the failure.

    The cause of this bug is that our intersection code (which is what generates the suggested words) was not accounting for rebuses at all! The fix was to modify the intersection code to correctly count the additional letters that a rebus cell contains.

    Combine the suggested words lists

    The Crosswords editor shows the suggested words lists for both Across and Down at the same time. This is different from what most other crossword editors do, which is to have a single suggested words list that switches between Across and Down based on the cursor’s direction.

    I think having a single list is better, because it’s visually cleaner, and you don’t have to take a second to find the right list. It also so happens that we have a problem with our sidebar jumping, in large part because of the two suggested words lists.

    So, we decided that I should combine the two lists into one. To do this, I removed the second list widget and list model, and then I added some code to change the contents of the list model whenever the cursor direction changes.

    Suggested words list

    More bugs!

    I only started working on the rebus bug because I was working on the word suggestions bug. And I only started working on that bug because I discovered it while using the Editor. And it’s a similar story with the words lists unification task. I only started working on it because I noticed the sidebar jumping bug.

    Now, the plan was that after I fixed those two bugs, I would turn my attention to a bigger task: adding a step of lookahead to our fill algorithm. But alas, as I was fixing the two bugs, I noticed a few more bugs. But they shouldn’t take too long, and they ought to be fixed. So I’m going to do that first, and then transition to working on the fill lookahead task.

    Tobias Bernard

    @tbernard

    Aardvark: Summer 2025 Update

    It’s been a while, so here’s an update about Aardvark, our initiative to bring local-first collaboration to GNOME apps!

    A quick recap of what happened since my last update:

    • Since December, we had three more Aardvark-focused events in Berlin
    • We discussed peer-to-peer threat models and put together designs addressing some of the concerns that came out of those discussions
    • We switched from using Automerge to Loro as a CRDT library in the app, mainly because of better documentation and native support for undo/redo
    • As part of a p2panda NLnet grant, Julian Sparber has been building the Aardvark prototype out into a more fully-fledged app
    • We submitted and got approved for a new Prototypefund grant to further build on this work, which started a few weeks ago!
    • With the initiative becoming more concrete we retired the “Aardvark” codename, and gave the app a real GNOME-style app name: “Reflection”

    The Current State

    As of this week, the Reflection (formerly Aardvark) app already works for simple Hedgedoc-style use cases. It’s definitely still alpha-quality, but we already use it internally for our team meetings. If you’re feeling adventurous you can clone the repo and run it from Builder, it should mostly work :)

    Our current focus is on reliability for basic collaboration use cases, i.e. making sure we’re not losing people’s data, handling various networking edge cases smoothly, and so on. After that there are a few more missing UI features we want to add to make it comfortable to use as a Hedgedoc replacement (e.g. displaying other people’s cursors and undo/redo).

    At the same time, the p2panda team (Andreas, Sam, and glyph) are working on new features in p2panda to enable functionality we want to integrate later on, particularly end-to-end encryption and an authentication/permission system.

    Prototype Fund Roadmap

    We have two primary goals for the Prototype Fund project: We want to build an app that’s polished enough to use as a daily driver for meeting notes in the near-term future, but with an explicit focus on full-stack testing of p2panda in a real-world native desktop app. This is because our second goal is kickstarting a larger ecosystem of local-first GNOME apps. To help with this, the idea is for Reflection to also serve as an example of a GTK app with local-first collaboration that others can copy code and UI patterns from. We’re not sure yet how much these two goals (peer-to-peer example vs. daily driver notes app) will be in conflict, but we hope it won’t be too bad in practice. If in doubt we’ll probably be biased towards the former, because we see this app primarily as a step towards a larger ecosystem of local-first apps.

    To that end it’s very important to us to involve the wider community of GNOME app developers. We’re planning to write more regular blog posts about various aspects of our work, and of course we’re always available for questions if anyone wants to start playing with this in their own apps. We’re also planning to create GObject bindings so people can easily use p2panda from C, Python, Javascript, Vala, etc. rather than only from Rust.

    Designs for various states of the connection popover

    We aim to release a first basic version of the app to Flathub around August, and then we’ll spend the rest of the Prototype Fund period (until end of November) adding more advanced features, such as end-to-end encryption and permission management. Depending on how smoothly this goes, we’d also like to get into some fancier UI features (such as comments and suggested edits), but it’s hard to say at this point.

    If we’re approved for Prototype Fund’s Second Stage (to be announced in October), we’ll get to spend a few more months doing mostly non-technical tasks for the project, such as writing more developer documentation, and organizing a GTK+Local-First conference next spring.

    Meet us at GUADEC

    Most of the Reflection team (Julian Sparber, Andreas Dzialocha, and myself) are going to be at GUADEC in July, and we’ll have a dedicated Local-First BoF (ideally on Monday July 28th, but not confirmed yet). This will be a great opportunity for discussions towards a potential system sync service, to give feedback on APIs if you’ve already tried playing with them, or to tell us what you’d need to make your app collaborative!

    In the mean time, if you have questions or want to get involved, you can check out the code or find us on Matrix.

    Happy Hacking!

    Bilal Elmoussaoui

    @belmoussaoui

    Grant the AI octopus access to a portion of your desktop

    The usage of Large Language Models (LLMs) has become quite popular, especially with publicly and "freely" accessible tools like ChatGPT, Google Gemini, and other models. They're now even accessible from the CLI, which makes them a bit more interesting for the nerdier among us.

    One game-changer for LLMs is the development of the Model Context Protocol (MCP), which allows an external process to feed information (resources) to the model in real time. This could be your IDE, your browser, or even your desktop environment. It also enables the LLM to trigger predefined actions (tools) exposed by the MCP server. The protocol is basically JSON-RPC over socket communication, which makes it easy to implement in languages like Rust.
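    To give an idea of what actually goes over the wire, a tool invocation is just a JSON-RPC request. The message below is purely illustrative; the tool name and arguments are made up rather than taken from the server described in this post:

    {
      "jsonrpc": "2.0",
      "id": 7,
      "method": "tools/call",
      "params": {
        "name": "launch_application",
        "arguments": { "app_id": "org.gnome.TextEditor" }
      }
    }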

    So, what could possibly go wrong if you gave portions of your desktop to this ever-growing AI octopus?

    The implementation details

    Over the weekend, I decided not only to explore building an MCP server that integrates with the GNOME desktop environment, but also to use Anthropic’s Claude Code to help implement most of it.

    The joyful moments

    The first step was to figure out what would be simple yet meaningful to give the LLM access to, to see:

    • if it could recognize that an MCP server was feeding it live context, and
    • how well it could write code around that, lol.

    I started by exposing the list of installed applications on the system, along with the ability to launch them. That way, I could say something like: "Start my work environment", and it would automatically open my favorite text editor, terminal emulator, and web browser.

    Overall, the produced code was pretty okay; with some minor comments here and there, the model managed to fix its mistakes without any issues.

    Once most of the basic tools and resources were in place, the LLM also did some nice code cleanups by writing a small macro to simplify the process of creating new tools/resources without code duplication.

    The less joyful ones

    You know that exposing the list of installed applications on the system is not really the important piece of information the LLM would need to do anything meaningful. What about the list of your upcoming calendar events? Or tasks in your Todo list?

    If you’re not familiar with GNOME, the way to achieve this is by using Evolution Data Server’s DBus APIs, which allow access to information like calendar events, tasks, and contacts. For this task, the LLM kept hallucinating DBus interfaces, inventing methods, and insisted on implementing them despite me repeatedly telling it to stop — so I had to take over and do the implementation myself.

    My takeaway from this is that LLMs will always require human supervision to ensure what they do is actually what they were asked to do.

    Final product

    The experience allowed us (me and the LLM pet) to build a simple yet powerful tool that can give your LLM access to the following resources:

    • Applications list
    • Audio and media status (MPRIS)
    • Calendar events
    • System information
    • Todo list

    And we built the following tools:

    • Application launcher
    • Audio and media control (MPRIS)
    • Notifications, allowing sending a new notification
    • Opening a file
    • Quick settings, allowing the LLM to turn on/off things like dark style, Wi-Fi, and so on
    • Screenshot, useful for things like text recognition, for example, or even asking the LLM to judge your design skills
    • Wallpaper, allows the LLM to set you a new wallpaper, because why not!
    • Window management, allows listing, moving, and resizing windows using the unsafe GNOME Shell Eval API for now, until there is a better way to do it.

    One could add more tools, for example for creating new events or new tasks, but I will leave that as an exercise for new contributors.

    The tool is available on GitHub at https://github.com/bilelmoussaoui/gnome-mcp-server and is licensed under the MIT License.

    Caution

    Giving an external LLM access to real-time information about your computer has privacy and potentially security implications, so use with caution. The built tool allows disabling specific tools/resources via a configuration file; see https://github.com/bilelmoussaoui/gnome-mcp-server?tab=readme-ov-file#configuration

    Conclusion

    The experimentation was quite enriching as I learned how MCP can be integrated into an application/ecosystem and how well LLMs ingest those resources and make use of the exposed actions. Until further improvements are made, enjoy the little toy tool!

    dnf uninstall

    I am a long time user of the Fedora operating system. It’s very good quality, with a lot of funding from Red Hat (who use it to crowd-source testing for their commercial product Red Hat Enterprise Linux).

    On Fedora you use a command named dnf to install and remove packages. The absolute worst design decision of Fedora is this:

    • To install a package: dnf install
    • To uninstall a package: dnf remove

    If I had a dollar for every time I typed dnf uninstall foo and got an error then I’d be able to stage a lavish wedding in Venice by now.

    As a Nushell user, I finally spent 5 minutes to fix this forever by adding the following to my ~/.config/nushell/config.nu file:

    def "dnf uninstall" […packages: string] {
        dnf remove …$packages
    }
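    With that definition in place, the command my fingers keep typing finally does what I mean:

    dnf uninstall htop    # quietly runs dnf remove htop under the hood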
    

    (I also read online about a dnf alias command that might solve this, but it isn’t available for me for whatever reason).

    That’s all for today!

    Ahmed Fatthi

    @ausername1040

    GSoC 2025: June Progress Report

    June has been a month of deep technical work and architectural progress on my GSoC project with GNOME Papers. Here’s a summary of the key milestones, challenges, and decisions from the month.


    ļæ½ļø Architecture Overview

    To better illustrate the changes, here are diagrams of the current (unsandboxed) and the new (sandboxed) architectures for GNOME Papers:

    Current Architecture (Unsandboxed):

    Current unsandboxed architecture

    Target Architecture (Sandboxed):

    Target sandboxed architecture


    ļø Early June: Prototyping, Research & First Meeting

    Note: D-Bus is a system that lets different programs on your computer talk to each other, even if they are running in separate processes.
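    For instance, completely unrelated to Papers itself, this is what talking to another process over the session bus looks like from the command line (sending a desktop notification through the org.freedesktop.Notifications service):

    gdbus call --session \
      --dest org.freedesktop.Notifications \
      --object-path /org/freedesktop/Notifications \
      --method org.freedesktop.Notifications.Notify \
      "demo" 0 "" "Hello" "Sent over D-Bus" "[]" "{}" 5000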

    Friday links 27 June 2025

    Some links for technical articles on various topics I read.

    std::mem is... interesting - Explore some of the functionalities in the std::mem module of the Rust standard library.

    How much code does that proc macro generate? - Nicholas Nethercote tells us how you can answer this question with new tooling in the Rust toolchain.

    PNG is back - A quick overview of the PNG spec 3. Spoiler: Animated PNG, HDR support and Exif support.

    JavaScript broke the web (and called it progress) - How the JavaScript ecosystem is overcomplicated for no reason, only making the user-facing stuff worse.

    QtWayland 6.6 Brings Robustness Through Compositor Handoffs - Improvements in Wayland support in Qt 6.6, brought to kwin (the KDE compositor), fixing some stuff that Wayland should be better at than X11 but ended up being worse at.

    jemalloc Postmortem - jemalloc from its inception to the end.

    Arun Raghavan

    @arunsr

    The Unbearable Anger of Broken Audio

    It should be surprising to absolutely nobody that the Linux audio stack is often the subject of varying levels of negative feedback, ranging from drive-by meme snark to apoplectic rage[1].

    A lot of what computers are used for today involves audiovisual media in some form or the other, and having that not work can throw a wrench in just going about our day. So it is completely understandable for a person to get frustrated when audio on their device doesn’t work (or maybe worse, stops working for no perceivable reason).

    It is also then completely understandable for this person to turn up on Matrix/IRC/Gitlab and make their displeasure known to us in the PipeWire (and previously PulseAudio) community. After all, we’re the maintainers of the part of the audio stack most visible to you.

    To add to this, we have two and a half decades’ worth of history in building the modern Linux desktop audio stack, which means there are historical artifacts in the stack (OSS -> ALSA -> ESD/aRTs -> PulseAudio/JACK -> PipeWire). And a lot of historical animus that apparently still needs venting.

    In large centralised organisations, there is a support function whose (thankless) job it is to absorb some of that impact before passing it on to the people who are responsible for fixing the problem. In the F/OSS community, sometimes we’re lucky to have folks who step up to help users and triage issues. Usually though, it’s just maintainers managing this.

    This has a number of … interesting … impacts for those of us who work in the space. For me this includes:

    1. Developing thick skin
    2. Trying to maintain equanimity while being screamed at
    3. Knowing to step away from the keyboard when that doesn’t work
    4. Repeated reminders that things do work for millions of users every day

    So while the causes for the animosity are often sympathetic, this is not a recipe for a healthy community. I try to be judicious while invoking the fd.o Code of Conduct, but thick skin or not, abusive behaviour only results in a toxic community, so there are limits to that.

    While I paint a picture of doom and gloom, most recent user feedback and issue reporting in the PipeWire community has been refreshingly positive. Even the trigger for this post is an issue from an extremely belligerent user (who I do sympathise with), who was quickly supplanted by someone else who has been extremely courteous in the face of what is definitely a frustrating experience.

    So if I had to ask something of you, dear reader – the next time you’re angry with the maintainers of some free software you depend on, please get some of the venting out of your system in private (tell your friends how terrible we are, or go for a walk maybe), so we can have a reasonable conversation and make things better.

    Thank you for reading!


    1. I’m not linking to examples, because that’s not the point of this post. ↩

    Michael Meeks

    @michael

    2025-06-25 Wednesday

    • Catch up with H. with some great degree news, poke at M's data-sets briefly, sync with Dave, Pedro & Asja. Lunch.
    • Published the next strip around the excitement of setting up your own non-profit structure:
      The Open Road to Freedom - strip#23 - A solid foundation
    • Partner sales call.

    Michael Meeks

    @michael

    2025-06-24 Tuesday

    • Tech planning call, sync with Laser, Stephan, catch up with Andras, partner call in the evening. Out for a walk with J. on the race-course in the sun. Catch up with M. now returned home.

    Why is my Raspberry Pi 4 too slow as a server?

    I self-host services on a beefy server in a datacenter. Every night, Kopia performs a backup of my volumes and sends the result to a s3 bucket in Scaleway's Parisian datacenter.

    The VPS is expensive, and I want to move my services to a Raspberry Pi at home. Before actually moving the services I wanted to see how the Raspberry Pi would handle them with real life data. To do so, I downloaded kopia on the Raspberry Pi, connected it to my s3 bucket in Scaleway's datacenter, and attempted to restore the data from a snapshot of a 2.8GB volume.

    thib@tinykube:~ $ kopia restore k1669883ce6d009e53352fddeb004a73a
    Restoring to local filesystem (/tmp/snapshot-mount/k1669883ce6d009e53352fddeb004a73a) with parallelism=8...
    Processed 395567 (3.6 KB) of 401786 (284.4 MB) 13.2 B/s (0.0%) remaining 6000h36m1s.
    

    A restore speed in bytes per second? It would take 6000 hours, that is 250 days, to transfer 2.8GB from an s3 bucket to the Raspberry Pi in my living room? Put differently, it means I can't restore backups to my Raspberry Pi, making it unfit for production as a homelab server in its current state.

    Let's try to understand what happens, and if I can do anything about it.

    The set-up

    Let's list all the ingredients we have:

    • A beefy VPS (16 vCPU, 48 GB of RAM, 1 TB SSD) in a German datacenter
    • A Raspberry Pi 4 (8 GB of RAM) in my living room, booting from an encrypted drive to avoid data leaks in case of burglary. That NVMe disk is connected to the Raspberry Pi via a USB 3 enclosure.
    • An s3 bucket that the VPS pushes to, and that the Raspberry Pi pulls from
    • A fiber Internet connection for the Raspberry Pi to download data

    Where the problem can come from

    Two computers and a cloud s3 bucket look like a fairly simple setup, but plenty of things can fail or be slow already! Let's list them and check whether the problem could come from each.

    Network could be slow

    I have a fiber plan, but maybe my ISP lied to me, or maybe I'm using a poor quality ethernet cable to connect my Raspberry Pi to my router. Let's do a simple test by installing Ookla's speedtest CLI on the Pi.

    I can list the nearest servers

    thib@tinykube:~ $ speedtest -L
    Closest servers:
    
        ID  Name                           Location             Country
    ==============================================================================
     67843  Syxpi                          Les Mureaux          France
     67628  LaNetCie                       Paris                France
     63829  EUTELSAT COMMUNICATIONS SA     Paris                France
     62493  ORANGE FRANCE                  Paris                France
     61933  Scaleway                       Paris                France
     27961  KEYYO                          Paris                France
     24130  Sewan                          Paris                France
     28308  Axione                         Paris                France
     52534  Virtual Technologies and Solutions Paris                France
     62035  moji                           Paris                France
     41840  Telerys Communication          Paris                France
    

    Happy surprise, Scaleway, my s3 bucket provider, is among the test servers! Let's give it a go

    thib@tinykube:~ $ speedtest -s 61933
    [...]
       Speedtest by Ookla
    
          Server: Scaleway - Paris (id: 61933)
             ISP: Free SAS
    Idle Latency:    12.51 ms   (jitter: 0.47ms, low: 12.09ms, high: 12.82ms)
        Download:   932.47 Mbps (data used: 947.9 MB)                                                   
                     34.24 ms   (jitter: 4.57ms, low: 12.09ms, high: 286.97ms)
          Upload:   907.77 Mbps (data used: 869.0 MB)                                                   
                     25.42 ms   (jitter: 1.85ms, low: 12.33ms, high: 40.68ms)
     Packet Loss:     0.0%
    

    With a download speed of 900 Mb/s ≈ 112 MB/s between Scaleway and my Raspberry Pi, it looks like the network is not the core issue.

    The s3 provider could have an incident

    I could test that the network itself is not to blame, but I don't know exactly what is being downloaded and from what server. Maybe Scaleway's s3 platform itself has an issue and is slow?

    Let's use aws-cli to just pull the data from the bucket without performing any kind of operation on it. Scaleway provides detailed instructions about how to use aws-cli with their services. After following them, I can download a copy of my s3 bucket onto the encrypted disk attached to my Raspberry Pi with

    thib@tinykube:~ $ aws s3 sync s3://ergaster-backup/ /tmp/s3 \
        --endpoint-url https://s3.fr-par.scw.cloud 
    

    It downloads at a speed of 1 to 2 MB/s. Very far from what I would expect. It could be tempting to stop here and think Scaleway is unjustly throttling my specific bucket. But more things could actually be happening.

    Like most providers, Scaleway has egress fees. In other words, they bill customers who pull data out of their s3 buckets. It means that if I'm going to do extensive testing, I will end up with a significant bill. I've let the sync command finish overnight so I could have a local copy of my bucket on my Raspberry Pi's encrypted disk.

    After it's done, I can disconnect kopia from my s3 bucket with

    thib@tinykube:~ $ kopia repository disconnect
    

    And I can connect it to the local copy of my bucket with

    thib@tinykube:~ $ kopia repository connect filesystem \
        --path=/tmp/s3
    

    Attempting to restore a snapshot gives me the same terrible speed as earlier. Something is up with the restore operation specifically. Let's try to understand what happens.

    Kopia could be slow to extract data

    Kopia performs incremental, encrypted, compressed backups to a repository. There's a lot of information packed in this single sentence, so let's break it down.

    How kopia does backups

    When performing a first snapshot of a directory, Kopia doesn't just upload files as it finds them. Instead it splits the files into small chunks, all of the same size on average. It computes a hash for each of them that serves as a unique identifier. It writes in an index table which chunk (identified by its hash) belongs to which file in which snapshot. And finally, it compresses, encrypts, and uploads the chunks to the repository.

    When performing a second snapshot, instead of just uploading all the files again, kopia performs the same file splitting operation. It hashes each chunk again and looks up in the index table whether the hash is already present. If that's the case, it means the corresponding chunk has already been backed up and doesn't need to be re-uploaded. If not, it writes the hash to the table, compresses and encrypts the new chunk, and sends it to the repository.

    Splitting the files and computing a hash for the chunks allows kopia to only send the data that has changed, even in large files, instead of uploading whole directories.
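    To make the idea concrete, here is a toy sketch of that chunk/hash/index dance (plain Rust, fixed-size chunks instead of a real content-defined splitter, and nothing like kopia's actual code):

    use std::collections::HashMap;
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    /// Pretend "upload": only chunks whose hash is not yet in the index get stored.
    fn backup(data: &[u8], index: &mut HashMap<u64, Vec<u8>>) -> usize {
        let mut uploaded = 0;
        for chunk in data.chunks(4 * 1024 * 1024) {
            let mut hasher = DefaultHasher::new();
            chunk.hash(&mut hasher);
            let id = hasher.finish();
            // Unseen hash: this is where the real tool compresses, encrypts and uploads.
            index.entry(id).or_insert_with(|| {
                uploaded += 1;
                chunk.to_vec()
            });
        }
        uploaded
    }

    fn main() {
        let mut index = HashMap::new();
        let mut data = vec![0u8; 8 * 1024 * 1024];
        data[6 * 1024 * 1024] = 1; // make the second chunk differ from the first
        println!("first snapshot uploaded {} chunks", backup(&data, &mut index));
        // An identical second snapshot finds every hash in the index and uploads nothing.
        println!("second snapshot uploaded {} chunks", backup(&data, &mut index));
    }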

    The algorithm to split the files in small chunks is called a splitter. The algorithm to compute a hash for each chunk is called... a hash.

    Kopia supports several splitters, several hash algorithms, several encryption algorithms, and several compression algorithms. Different processors have different optimizations and will perform more or less well with each, which is why kopia lets you pick between them.

    The splitter, hash and encryption algorithms are defined per repository, when the repository is created. These algorithms cannot be changed after the repository has been created. After connecting a repository, the splitter and hash can be determined with

    thib@tinykube:~ $ kopia repository status
    Config file:         /home/thib/.config/kopia/repository.config
    
    Description:         Repository in Filesystem: /tmp/kopia
    Hostname:            tinykube
    Username:            thib
    Read-only:           false
    Format blob cache:   15m0s
    
    Storage type:        filesystem
    Storage capacity:    1 TB
    Storage available:   687.5 GB
    Storage config:      {
                           "path": "/tmp/kopia",
                           "fileMode": 384,
                           "dirMode": 448,
                           "dirShards": null
                         }
    
    Unique ID:           e1cf6b0c746b932a0d9b7398744968a14456073c857e7c2f2ca12b3ea036d33e
    Hash:                BLAKE2B-256-128
    Encryption:          AES256-GCM-HMAC-SHA256
    Splitter:            DYNAMIC-4M-BUZHASH
    Format version:      2
    Content compression: true
    Password changes:    true
    Max pack length:     21 MB
    Index Format:        v2
    
    Epoch Manager:       enabled
    Current Epoch: 465
    
    Epoch refresh frequency: 20m0s
    Epoch advance on:        20 blobs or 10.5 MB, minimum 24h0m0s
    Epoch cleanup margin:    4h0m0s
    Epoch checkpoint every:  7 epochs
    

    The compression algorithm is defined by a kopia policy. By default kopia doesn't apply any compression.
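Compression is enabled per source directory with a policy. As a minimal sketch (the path is a placeholder, and zstd is only one of the algorithms kopia supports):

    kopia policy set /home/thib/data --compression=zstd

Compression only applies to data written after the policy is set; existing chunks stay as they are.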

    How kopia restores data

    When kopia is instructed to restore data from a snapshot, it looks up the index table to figure out what chunks it must retrieve. It decrypts them, then decompresses them if they were compressed, and appends the relevant chunks together to reconstruct the files.

    Kopia doesn't rely on the splitter and hash algorithms when performing a restore, but it relies on the encryption and compression ones.

    Figuring out the theoretical speed

Kopia has built-in benchmarks to help you figure out which hash and encryption algorithms are best for your machine. I'm trying to understand why the restore operation is slow, so I only need to know what to expect from the encryption algorithms.

    thib@tinykube:~ $ kopia benchmark encryption
    Benchmarking encryption 'AES256-GCM-HMAC-SHA256'... (1000 x 1048576 bytes, parallelism 1)
    Benchmarking encryption 'CHACHA20-POLY1305-HMAC-SHA256'... (1000 x 1048576 bytes, parallelism 1)
         Encryption                     Throughput
    -----------------------------------------------------------------
      0. CHACHA20-POLY1305-HMAC-SHA256  173.3 MB / second
      1. AES256-GCM-HMAC-SHA256         27.6 MB / second
    -----------------------------------------------------------------
    Fastest option for this machine is: --encryption=CHACHA20-POLY1305-HMAC-SHA256
    

The Raspberry Pi is notorious for not being excellent with encryption algorithms. The kopia repository was created from my VPS, a machine that does much better with AES. Running the same benchmark on my VPS gives very different results.

    [thib@ergaster ~]$ kopia benchmark encryption
    Benchmarking encryption 'AES256-GCM-HMAC-SHA256'... (1000 x 1048576 bytes, parallelism 1)
    Benchmarking encryption 'CHACHA20-POLY1305-HMAC-SHA256'... (1000 x 1048576 bytes, parallelism 1)
         Encryption                     Throughput
    -----------------------------------------------------------------
      0. AES256-GCM-HMAC-SHA256         2.1 GB / second
      1. CHACHA20-POLY1305-HMAC-SHA256  699.1 MB / second
    -----------------------------------------------------------------
    Fastest option for this machine is: --encryption=AES256-GCM-HMAC-SHA256
    

Given that the repository I'm trying to restore from does not use compression and that it uses the AES256 encryption algorithm, I should expect a restore speed of 27.6 MB/s on the Raspberry Pi. So why is the restore so slow? Let's keep chasing the performance bottleneck.

    The disk could be slow

    The hardware

The Raspberry Pi is a brave little machine, but it was obviously not designed as a home lab server. The SD cards it usually boots from are notorious for being fragile and not supporting I/O intensive operations.

A common solution is to make the Raspberry Pi boot from an SSD. But to connect this kind of disk to the Raspberry Pi 4, you need a USB enclosure. I bought a Kingston SNV3S/1000G NVMe drive. It supposedly can read and write at 6 GB/s and 5 GB/s respectively. I put that drive in an ICY BOX IB-1817M-C31 enclosure, with a maximum theoretical speed of 1000 MB/s.

    According to this thread on the Raspberry Pi forums, the USB controller of the Pi has a bandwidth of 4Gb/s ā‰ˆ 512 MB/s (and not 4 GB/s as I initially wrote. Thanks baobun on hackernews for pointing out my mistake!) shared across all 4 ports. Since I only plug my disk there, it should get all the bandwidth.

So the limiting factor is the USB controller of the Raspberry Pi, which should still give me about 512 MB/s (and not the enclosure with its generous 1000 MB/s, as I initially concluded), although baobun on hackernews also pointed out that the USB controller on the Pi might share a bus with the network card.
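To double-check what the enclosure actually negotiated, lsusb -t prints the USB topology along with the per-device link speed (5000M meaning a 5 Gb/s USB 3.0 link, 480M meaning it fell back to USB 2.0):

    lsusb -t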

Let's see how close to reality that is.

    Disk sequential read speed

    First, let's try with a gentle sequential read test to see how well it performs in ideal conditions.

    thib@tinykube:~ $ fio --name TEST --eta-newline=5s --filename=temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.33
    Starting 1 process
    Jobs: 1 (f=1): [R(1)][11.5%][r=144MiB/s][r=144 IOPS][eta 00m:54s]
    Jobs: 1 (f=1): [R(1)][19.7%][r=127MiB/s][r=126 IOPS][eta 00m:49s] 
    Jobs: 1 (f=1): [R(1)][27.9%][r=151MiB/s][r=151 IOPS][eta 00m:44s] 
    Jobs: 1 (f=1): [R(1)][36.1%][r=100MiB/s][r=100 IOPS][eta 00m:39s] 
    Jobs: 1 (f=1): [R(1)][44.3%][r=111MiB/s][r=111 IOPS][eta 00m:34s] 
    Jobs: 1 (f=1): [R(1)][53.3%][r=106MiB/s][r=105 IOPS][eta 00m:28s] 
    Jobs: 1 (f=1): [R(1)][61.7%][r=87.1MiB/s][r=87 IOPS][eta 00m:23s] 
    Jobs: 1 (f=1): [R(1)][70.0%][r=99.9MiB/s][r=99 IOPS][eta 00m:18s] 
    Jobs: 1 (f=1): [R(1)][78.3%][r=121MiB/s][r=121 IOPS][eta 00m:13s] 
    Jobs: 1 (f=1): [R(1)][86.7%][r=96.0MiB/s][r=96 IOPS][eta 00m:08s] 
    Jobs: 1 (f=1): [R(1)][95.0%][r=67.1MiB/s][r=67 IOPS][eta 00m:03s] 
    Jobs: 1 (f=1): [R(1)][65.6%][r=60.8MiB/s][r=60 IOPS][eta 00m:32s] 
    TEST: (groupid=0, jobs=1): err= 0: pid=3666160: Thu Jun 12 20:14:33 2025
      read: IOPS=111, BW=112MiB/s (117MB/s)(6739MiB/60411msec)
        slat (usec): min=133, max=41797, avg=3396.01, stdev=3580.27
        clat (msec): min=12, max=1061, avg=281.85, stdev=140.49
         lat (msec): min=14, max=1065, avg=285.25, stdev=140.68
        clat percentiles (msec):
         |  1.00th=[   41],  5.00th=[   86], 10.00th=[  130], 20.00th=[  171],
         | 30.00th=[  218], 40.00th=[  245], 50.00th=[  271], 60.00th=[  296],
         | 70.00th=[  317], 80.00th=[  355], 90.00th=[  435], 95.00th=[  550],
         | 99.00th=[  793], 99.50th=[  835], 99.90th=[  969], 99.95th=[ 1020],
         | 99.99th=[ 1062]
       bw (  KiB/s): min=44521, max=253445, per=99.92%, avg=114140.83, stdev=31674.32, samples=120
       iops        : min=   43, max=  247, avg=111.18, stdev=30.90, samples=120
      lat (msec)   : 20=0.07%, 50=1.69%, 100=4.94%, 250=35.79%, 500=50.73%
      lat (msec)   : 750=5.24%, 1000=1.45%, 2000=0.07%
      cpu          : usr=0.66%, sys=21.39%, ctx=7650, majf=0, minf=8218
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.5%, 16=0.9%, 32=98.2%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=99.9%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=6739,0,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
       READ: bw=112MiB/s (117MB/s), 112MiB/s-112MiB/s (117MB/s-117MB/s), io=6739MiB (7066MB), run=60411-60411msec
    
    Disk stats (read/write):
        dm-0: ios=53805/810, merge=0/0, ticks=14465332/135892, in_queue=14601224, util=100.00%, aggrios=13485/943, aggrmerge=40434/93, aggrticks=84110/2140, aggrin_queue=86349, aggrutil=36.32%
      sda: ios=13485/943, merge=40434/93, ticks=84110/2140, in_queue=86349, util=36.32%
    

So I can read from my disk at 117 MB/s. We're far from the theoretical 1000 MB/s. One thing is interesting here: the read performance seems to decrease over time. Running the same test again with htop to monitor what happens, I see something even more surprising. Not only does the speed remain lower, but all four CPUs are pegged.

So when performing a disk read test, the CPU is at maximum capacity, with a wait metric of about 0%. So the CPU is not waiting for the disk. Why would my CPU go crazy just reading from disk? Oh. Oh no. The Raspberry Pi performs poorly with encryption, and I am reading from an encrypted drive. This is why even with this simple read test my CPU is the bottleneck.
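If, like mine, the disk is encrypted with LUKS/dm-crypt, cryptsetup has a built-in benchmark that shows how fast this CPU can run the relevant ciphers, independently of any disk I/O:

    cryptsetup benchmark

On a Pi 4 the AES numbers are low because its cores lack the ARMv8 cryptography extensions, which lines up with the pegged CPUs I'm seeing.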

    Disk random read/write speed

    Let's run the test that this wiki describes as "will show the absolute worst I/O performance you can expect."

    thib@tinykube:~ $ fio --name TEST --eta-newline=5s --filename=temp.file --rw=randrw --size=2g --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=32 --runtime=60 --group_reporting 
    [...]
    Run status group 0 (all jobs):
       READ: bw=6167KiB/s (6315kB/s), 6167KiB/s-6167KiB/s (6315kB/s-6315kB/s), io=361MiB (379MB), run=60010-60010msec
      WRITE: bw=6167KiB/s (6315kB/s), 6167KiB/s-6167KiB/s (6315kB/s-6315kB/s), io=361MiB (379MB), run=60010-60010msec
    
    Disk stats (read/write):
        dm-0: ios=92343/185391, merge=0/0, ticks=90656/620960, in_queue=711616, util=95.25%, aggrios=92527/182570, aggrmerge=0/3625, aggrticks=65580/207873, aggrin_queue=319891, aggrutil=55.65%
      sda: ios=92527/182570, merge=0/3625, ticks=65580/207873, in_queue=319891, util=55.65%
    

    In the worst conditions, I can expect a read and write speed of 6 MB/s each.

The situation must be even worse when trying to restore my backups with kopia: I read an encrypted repository from an encrypted disk and write data to the same encrypted disk. Let's open htop and perform a kopia restore to confirm that the CPU is the blocker, and that I'm not waiting for my disk.

    htop seems to confirm that intuition: it looks like the bottleneck when trying to restore a kopia backup on my Raspberry Pi is its CPU.

    Let's test with an unencrypted disk to see if that hypothesis holds. I should expect higher restore speeds because the CPU will not be busy decrypting/encrypting data to disk, but it will still be busy decrypting data from the kopia repository.

    Testing it all

I've flashed a clean Raspberry Pi OS Lite image onto an SD card and booted from it. Using fdisk and mkfs.ext4 I can format the encrypted drive the Raspberry Pi was previously booting from into a clean, unencrypted drive.
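Concretely, the formatting boils down to something like this (the disk shows up as /dev/sda on my Pi; double-check with lsblk before wiping anything):

    sudo fdisk /dev/sda        # recreate a single Linux partition (n, defaults, w)
    sudo mkfs.ext4 /dev/sda1   # plain ext4, no encryption this time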

    I then create a mount point for the disk, mount it, and change the ownership to my user thib.

    thib@tinykube:~ $ sudo mkdir /mnt/icy
    thib@tinykube:~ $ sudo mount /dev/sda1 /mnt/icy
    thib@tinykube:~ $ sudo chown -R thib:thib /mnt/icy
    

I can now perform my tests, not forgetting to change the --filename parameter to /mnt/icy/temp.file so the benchmark is performed on the disk and not on the SD card.

    Unencrypted disk performance

    Sequential read speed

    I can then run the sequential read test from the mounted disk

    thib@tinykube:~ $ fio --name TEST --eta-newline=5s --filename=/mnt/icy/temp.file --rw=read --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=read, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.33
    Starting 1 process
    TEST: Laying out IO file (1 file / 2048MiB)
    Jobs: 1 (f=1): [R(1)][19.4%][r=333MiB/s][r=333 IOPS][eta 00m:29s]
    Jobs: 1 (f=1): [R(1)][36.4%][r=333MiB/s][r=332 IOPS][eta 00m:21s] 
    Jobs: 1 (f=1): [R(1)][53.1%][r=333MiB/s][r=332 IOPS][eta 00m:15s] 
    Jobs: 1 (f=1): [R(1)][68.8%][r=333MiB/s][r=332 IOPS][eta 00m:10s] 
    Jobs: 1 (f=1): [R(1)][87.1%][r=332MiB/s][r=332 IOPS][eta 00m:04s] 
    Jobs: 1 (f=1): [R(1)][100.0%][r=334MiB/s][r=333 IOPS][eta 00m:00s]
    TEST: (groupid=0, jobs=1): err= 0: pid=14807: Sun Jun 15 11:58:14 2025
      read: IOPS=333, BW=333MiB/s (349MB/s)(10.0GiB/30733msec)
        slat (usec): min=83, max=56105, avg=2967.97, stdev=10294.97
        clat (msec): min=28, max=144, avg=92.78, stdev=16.27
         lat (msec): min=30, max=180, avg=95.75, stdev=18.44
        clat percentiles (msec):
         |  1.00th=[   71],  5.00th=[   78], 10.00th=[   80], 20.00th=[   83],
         | 30.00th=[   86], 40.00th=[   88], 50.00th=[   88], 60.00th=[   90],
         | 70.00th=[   93], 80.00th=[   97], 90.00th=[  126], 95.00th=[  131],
         | 99.00th=[  140], 99.50th=[  142], 99.90th=[  144], 99.95th=[  144],
         | 99.99th=[  144]
       bw (  KiB/s): min=321536, max=363816, per=99.96%, avg=341063.31, stdev=14666.91, samples=61
       iops        : min=  314, max=  355, avg=333.02, stdev=14.31, samples=61
      lat (msec)   : 50=0.61%, 100=83.42%, 250=15.98%
      cpu          : usr=0.31%, sys=18.80%, ctx=1173, majf=0, minf=8218
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=10240,0,0,0 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
       READ: bw=333MiB/s (349MB/s), 333MiB/s-333MiB/s (349MB/s-349MB/s), io=10.0GiB (10.7GB), run=30733-30733msec
    
    Disk stats (read/write):
      sda: ios=20359/2, merge=0/1, ticks=1622783/170, in_queue=1622998, util=82.13%
    

I can read from that disk at a speed of about 350 MB/s. Looking at htop while the read test is being performed paints a very different picture compared to when the drive was encrypted.

I can see that the CPU is not very busy, and the wait time is well beyond 10%. Unsurprisingly this time, when testing the maximum read capacity of the disk, the bottleneck is the disk.

    Sequential write speed

    thib@tinykube:~ $ fio --name TEST --eta-newline=5s --filename=/mnt/icy/temp.file --rw=write --size=2g --io_size=10g --blocksize=1024k --ioengine=libaio --fsync=10000 --iodepth=32 --direct=1 --numjobs=1 --runtime=60 --group_reporting
    TEST: (g=0): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=32
    fio-3.33
    Starting 1 process
    TEST: Laying out IO file (1 file / 2048MiB)
    Jobs: 1 (f=1): [W(1)][12.5%][w=319MiB/s][w=318 IOPS][eta 00m:49s]
    Jobs: 1 (f=1): [W(1)][28.6%][w=319MiB/s][w=318 IOPS][eta 00m:30s] 
    Jobs: 1 (f=1): [W(1)][44.7%][w=319MiB/s][w=318 IOPS][eta 00m:21s] 
    Jobs: 1 (f=1): [W(1)][59.5%][w=319MiB/s][w=318 IOPS][eta 00m:15s] 
    Jobs: 1 (f=1): [W(1)][75.0%][w=318MiB/s][w=318 IOPS][eta 00m:09s] 
    Jobs: 1 (f=1): [W(1)][91.4%][w=320MiB/s][w=319 IOPS][eta 00m:03s] 
    Jobs: 1 (f=1): [W(1)][100.0%][w=312MiB/s][w=311 IOPS][eta 00m:00s]
    TEST: (groupid=0, jobs=1): err= 0: pid=15551: Sun Jun 15 12:19:37 2025
      write: IOPS=300, BW=300MiB/s (315MB/s)(10.0GiB/34116msec); 0 zone resets
        slat (usec): min=156, max=1970.0k, avg=3244.94, stdev=19525.85
        clat (msec): min=18, max=2063, avg=102.64, stdev=103.41
         lat (msec): min=19, max=2066, avg=105.89, stdev=105.10
        clat percentiles (msec):
         |  1.00th=[   36],  5.00th=[   96], 10.00th=[   97], 20.00th=[   97],
         | 30.00th=[   97], 40.00th=[   97], 50.00th=[   97], 60.00th=[   97],
         | 70.00th=[   97], 80.00th=[   97], 90.00th=[  101], 95.00th=[  101],
         | 99.00th=[  169], 99.50th=[  182], 99.90th=[ 2039], 99.95th=[ 2056],
         | 99.99th=[ 2056]
       bw (  KiB/s): min= 6144, max=329728, per=100.00%, avg=321631.80, stdev=39791.66, samples=65
       iops        : min=    6, max=  322, avg=314.08, stdev=38.86, samples=65
      lat (msec)   : 20=0.05%, 50=1.89%, 100=88.33%, 250=9.44%, 2000=0.04%
      lat (msec)   : >=2000=0.24%
      fsync/fdatasync/sync_file_range:
        sync (nsec): min=189719k, max=189719k, avg=189718833.00, stdev= 0.00
        sync percentiles (msec):
         |  1.00th=[  190],  5.00th=[  190], 10.00th=[  190], 20.00th=[  190],
         | 30.00th=[  190], 40.00th=[  190], 50.00th=[  190], 60.00th=[  190],
         | 70.00th=[  190], 80.00th=[  190], 90.00th=[  190], 95.00th=[  190],
         | 99.00th=[  190], 99.50th=[  190], 99.90th=[  190], 99.95th=[  190],
         | 99.99th=[  190]
      cpu          : usr=7.25%, sys=11.37%, ctx=22027, majf=0, minf=26
      IO depths    : 1=0.1%, 2=0.1%, 4=0.2%, 8=0.4%, 16=0.8%, 32=98.5%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.1%, 64=0.0%, >=64=0.0%
         issued rwts: total=0,10240,0,1 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=32
    
    Run status group 0 (all jobs):
      WRITE: bw=300MiB/s (315MB/s), 300MiB/s-300MiB/s (315MB/s-315MB/s), io=10.0GiB (10.7GB), run=34116-34116msec
    
    Disk stats (read/write):
      sda: ios=0/20481, merge=0/47, ticks=0/1934829, in_queue=1935035, util=88.80%
    

    I now know I can write at about 300 MB/s on that unencrypted disk. Looking at htop while the test was running, I also know that the disk is the bottleneck and not the CPU.

    Random read/write speed

    Let's run the "worst performance test" again from the unencrypted disk.

    thib@tinykube:~ $ fio --name TEST --eta-newline=5s --filename=/mnt/icy/temp.file --rw=randrw --size=2g --io_size=10g --blocksize=4k --ioengine=libaio --fsync=1 --iodepth=1 --direct=1 --numjobs=32 --runtime=60 --group_reporting 
    TEST: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
    ...
    fio-3.33
    Starting 32 processes
    TEST: Laying out IO file (1 file / 2048MiB)
    Jobs: 32 (f=32): [m(32)][13.1%][r=10.5MiB/s,w=10.8MiB/s][r=2677,w=2773 IOPS][eta 00m:53s]
    Jobs: 32 (f=32): [m(32)][23.0%][r=11.0MiB/s,w=11.0MiB/s][r=2826,w=2819 IOPS][eta 00m:47s] 
    Jobs: 32 (f=32): [m(32)][32.8%][r=10.9MiB/s,w=11.5MiB/s][r=2780,w=2937 IOPS][eta 00m:41s] 
    Jobs: 32 (f=32): [m(32)][42.6%][r=10.8MiB/s,w=11.0MiB/s][r=2775,w=2826 IOPS][eta 00m:35s] 
    Jobs: 32 (f=32): [m(32)][52.5%][r=10.9MiB/s,w=11.3MiB/s][r=2787,w=2886 IOPS][eta 00m:29s] 
    Jobs: 32 (f=32): [m(32)][62.3%][r=11.3MiB/s,w=11.6MiB/s][r=2901,w=2967 IOPS][eta 00m:23s] 
    Jobs: 32 (f=32): [m(32)][72.1%][r=11.4MiB/s,w=11.5MiB/s][r=2908,w=2942 IOPS][eta 00m:17s] 
    Jobs: 32 (f=32): [m(32)][82.0%][r=11.6MiB/s,w=11.7MiB/s][r=2960,w=3004 IOPS][eta 00m:11s] 
    Jobs: 32 (f=32): [m(32)][91.8%][r=11.0MiB/s,w=11.2MiB/s][r=2815,w=2861 IOPS][eta 00m:05s] 
    Jobs: 32 (f=32): [m(32)][100.0%][r=11.0MiB/s,w=10.5MiB/s][r=2809,w=2700 IOPS][eta 00m:00s]
    TEST: (groupid=0, jobs=32): err= 0: pid=14830: Sun Jun 15 12:05:54 2025
      read: IOPS=2797, BW=10.9MiB/s (11.5MB/s)(656MiB/60004msec)
        slat (usec): min=14, max=1824, avg=88.06, stdev=104.92
        clat (usec): min=2, max=7373, avg=939.12, stdev=375.40
         lat (usec): min=130, max=7479, avg=1027.18, stdev=360.39
        clat percentiles (usec):
         |  1.00th=[    6],  5.00th=[  180], 10.00th=[  285], 20.00th=[  644],
         | 30.00th=[  889], 40.00th=[  971], 50.00th=[ 1037], 60.00th=[ 1090],
         | 70.00th=[ 1156], 80.00th=[ 1221], 90.00th=[ 1319], 95.00th=[ 1385],
         | 99.00th=[ 1532], 99.50th=[ 1614], 99.90th=[ 1811], 99.95th=[ 1926],
         | 99.99th=[ 6587]
       bw (  KiB/s): min= 8062, max=14560, per=100.00%, avg=11198.39, stdev=39.34, samples=3808
       iops        : min= 2009, max= 3640, avg=2793.55, stdev= 9.87, samples=3808
      write: IOPS=2806, BW=11.0MiB/s (11.5MB/s)(658MiB/60004msec); 0 zone resets
        slat (usec): min=15, max=2183, avg=92.95, stdev=108.34
        clat (usec): min=2, max=7118, avg=850.19, stdev=310.22
         lat (usec): min=110, max=8127, avg=943.13, stdev=312.58
        clat percentiles (usec):
         |  1.00th=[    6],  5.00th=[  174], 10.00th=[  302], 20.00th=[  668],
         | 30.00th=[  832], 40.00th=[  889], 50.00th=[  938], 60.00th=[  988],
         | 70.00th=[ 1020], 80.00th=[ 1057], 90.00th=[ 1123], 95.00th=[ 1172],
         | 99.00th=[ 1401], 99.50th=[ 1532], 99.90th=[ 1745], 99.95th=[ 1844],
         | 99.99th=[ 2147]
       bw (  KiB/s): min= 8052, max=14548, per=100.00%, avg=11234.02, stdev=40.18, samples=3808
       iops        : min= 2004, max= 3634, avg=2802.45, stdev=10.08, samples=3808
      lat (usec)   : 4=0.26%, 10=1.50%, 20=0.08%, 50=0.14%, 100=0.42%
      lat (usec)   : 250=5.89%, 500=7.98%, 750=6.66%, 1000=31.11%
      lat (msec)   : 2=45.93%, 4=0.01%, 10=0.01%
      fsync/fdatasync/sync_file_range:
        sync (usec): min=1323, max=17158, avg=5610.76, stdev=1148.23
        sync percentiles (usec):
         |  1.00th=[ 3195],  5.00th=[ 4228], 10.00th=[ 4490], 20.00th=[ 4686],
         | 30.00th=[ 4883], 40.00th=[ 5080], 50.00th=[ 5342], 60.00th=[ 5604],
         | 70.00th=[ 6128], 80.00th=[ 6718], 90.00th=[ 7177], 95.00th=[ 7570],
         | 99.00th=[ 8717], 99.50th=[ 9241], 99.90th=[ 9896], 99.95th=[10552],
         | 99.99th=[15401]
      cpu          : usr=0.51%, sys=2.25%, ctx=1006148, majf=0, minf=977
      IO depths    : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
         submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
         issued rwts: total=167837,168384,0,336200 short=0,0,0,0 dropped=0,0,0,0
         latency   : target=0, window=0, percentile=100.00%, depth=1
    
    Run status group 0 (all jobs):
       READ: bw=10.9MiB/s (11.5MB/s), 10.9MiB/s-10.9MiB/s (11.5MB/s-11.5MB/s), io=656MiB (687MB), run=60004-60004msec
      WRITE: bw=11.0MiB/s (11.5MB/s), 11.0MiB/s-11.0MiB/s (11.5MB/s-11.5MB/s), io=658MiB (690MB), run=60004-60004msec
    
    Disk stats (read/write):
      sda: ios=167422/311772, merge=0/14760, ticks=153900/409024, in_queue=615762, util=81.63%
    

    The read and write performance is much worse than I expected, only a few MB/s above the same test on the encrypted drive. But here again, htop tells us that the disk is the bottleneck, and not the CPU.

    Copying the bucket

    I now know that my disk can read or write at a maximum speed of about 300 MB/s. Let's sync the repository again from Scaleway s3.

    thib@tinykube:~ $ aws s3 sync s3://ergaster-backup/ /mnt/icy/s3 \
        --endpoint-url https://s3.fr-par.scw.cloud 
    

    The aws CLI reports download speeds between 45 and 65 MB/s, much higher than the initial tests! Having a look at htop while the sync happens, I can see that the CPUs are not at full capacity, and that the i/o wait time is at 0%.

The metric that has gone up, though, is si, which stands for softirqs. This paper and this StackOverflow answer explain what softirqs are. I understand the si metric from (h)top as "time the CPU spends making the system's devices work." In this case, I believe this is time the CPU spends helping the network chip. If I'm wrong and you have a better explanation, please reach out at comments@ergaster.org!
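If you want to watch this yourself, the per-CPU softirq counters are exposed in /proc/softirqs; seeing the NET_RX row climb while the sync runs supports the "helping the network chip" interpretation:

    watch -d -n 1 cat /proc/softirqs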

    Testing kopia's performance

    Now for the final tests, let's first try to perform a restore from the AES-encrypted repository directly from the s3 bucket. Then, let's change the encryption algorithm of the repository and perform a restore.

    Restoring from the s3 bucket

After connecting kopia to the repository in my s3 bucket, I perform a tentative restore and...

    thib@tinykube:/mnt/icy $ kopia restore k5a270ab7f4acf72d4c3830a58edd7106
    Restoring to local filesystem (/mnt/icy/k5a270ab7f4acf72d4c3830a58edd7106) with parallelism=8...
    Processed 94929 (102.9 GB) of 118004 (233.3 GB) 19.8 MB/s (44.1%) remaining 1h49m45s.        
    

    I'm reaching much higher speeds, closer to the theoretical 27.6 MB/s I got in my encryption benchmark! Looking at htop, I can see that the CPU remains the bottleneck when restoring. Those are decent speeds for a small and power efficient device like the Raspberry Pi, but this is not enough for me to use it in production.

The CPU is the limiting factor, and the Pi is busy exclusively doing a restore. If it were serving services in addition to that, the performance of both the restore and the services would degrade. We should be able to achieve better results by changing the encryption algorithm of the repository.

    Re-encrypting the repository

Since the encryption algorithm can only be set when the repository is created, I need to create a new repository with the Chacha algorithm and ask kopia to decrypt the current repository, encrypted with AES, and re-encrypt it using Chacha.

    The Pi performs so poorly with AES that it would take days to do so. I can do this operation on my beefy VPS and then transfer the repository data onto my Pi.

    So on my VPS, I then connect to the s3 repo, passing it an option to dump the config in a special place

    [thib@ergaster ~]$ 
    Enter password to open repository: 
    
    Connected to repository.
    
    NOTICE: Kopia will check for updates on GitHub every 7 days, starting 24 hours after first use.
    To disable this behavior, set environment variable KOPIA_CHECK_FOR_UPDATES=false
    Alternatively you can remove the file "/home/thib/old.config.update-info.json".
    
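For reference, connecting to the old repository with a dedicated config file boils down to something like this (the bucket, credentials, and endpoint are the same as earlier in this post, and kopia's global --config-file flag is what writes the configuration to /home/thib/old.config; treat this as a sketch rather than the exact command I typed):

    kopia repository connect s3 \
        --config-file=/home/thib/old.config \
        --bucket=ergaster-backup \
        --access-key=REDACTED \
        --secret-access-key=REDACTED \
        --endpoint="s3.fr-par.scw.cloud"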

    I then create a filesystem repo on my VPS, with the new encryption algorithm that is faster on the Pi

    [thib@ergaster ~]$ kopia repo create filesystem \
        --block-hash=BLAKE2B-256-128 \
        --encryption=CHACHA20-POLY1305-HMAC-SHA256 \
        --path=/home/thib/kopia_chacha
    Enter password to create new repository: 
    Re-enter password for verification: 
    Initializing repository with:
      block hash:          BLAKE2B-256-128
      encryption:          CHACHA20-POLY1305-HMAC-SHA256
      key derivation:      scrypt-65536-8-1
      splitter:            DYNAMIC-4M-BUZHASH
    Connected to repository.
    

    And I can finally launch the migration to retrieve data from the s3 provider and migrate it locally.

    [thib@ergaster ~]$ kopia snapshot migrate \
        --all \
        --source-config=/home/thib/old.config \
        --parallel 16
    

    I check that the repository is using the right encryption with

    [thib@ergaster ~]$ kopia repo status
    Config file:         /home/thib/.config/kopia/repository.config
    
    Description:         Repository in Filesystem: /home/thib/kopia_chacha
    Hostname:            ergaster
    Username:            thib
    Read-only:           false
    Format blob cache:   15m0s
    
    Storage type:        filesystem
    Storage capacity:    1.3 TB
    Storage available:   646.3 GB
    Storage config:      {
                           "path": "/home/thib/kopia_chacha",
                           "fileMode": 384,
                           "dirMode": 448,
                           "dirShards": null
                         }
    
    Unique ID:           eaa6041f654c5e926aa65442b5e80f6e8cf35c1db93573b596babf7cff8641d5
    Hash:                BLAKE2B-256-128
    Encryption:          AES256-GCM-HMAC-SHA256
    Splitter:            DYNAMIC-4M-BUZHASH
    Format version:      3
    Content compression: true
    Password changes:    true
    Max pack length:     21 MB
    Index Format:        v2
    
    Epoch Manager:       enabled
    Current Epoch: 0
    
    Epoch refresh frequency: 20m0s
    Epoch advance on:        20 blobs or 10.5 MB, minimum 24h0m0s
    Epoch cleanup margin:    4h0m0s
    Epoch checkpoint every:  7 epochs
    

I could scp that repository to my Raspberry Pi, but I want to evaluate the restore performance in the same conditions as before, so I create a new s3 bucket and sync the Chacha-encrypted repository to it. My repository weighs about 200 GB. Pushing it to a new bucket and pulling it from the Pi will only cost me a handful of euros.

    [thib@ergaster ~]$ kopia repository sync-to s3 \
        --bucket=chacha \
        --access-key=REDACTED \
        --secret-access-key=REDACTED \
        --endpoint="s3.fr-par.scw.cloud" \
        --parallel 16
    

    After it's done, I can connect to that new bucket from the Raspberry Pi, disconnect kopia from the former AES-encrypted repo and connect to the new Chacha-encrypted repo

    thib@tinykube:~ $ kopia repo disconnect
    thib@tinykube:~ $ kopia repository connect s3 \
        --bucket=chacha \
        --access-key=REDACTED \
        --secret-access-key=REDACTED \
        --endpoint="s3.fr-par.scw.cloud"
    Enter password to open repository: 
    
    Connected to repository.
    

    I can then check that the repo indeed uses the Chacha encryption algorithm

    thib@tinykube:~ $ kopia repo status
    Config file:         /home/thib/.config/kopia/repository.config
    
    Description:         Repository in S3: s3.fr-par.scw.cloud chacha
    Hostname:            tinykube
    Username:            thib
    Read-only:           false
    Format blob cache:   15m0s
    
    Storage type:        s3
    Storage capacity:    unbounded
    Storage config:      {
                           "bucket": "chacha",
                           "endpoint": "s3.fr-par.scw.cloud",
                           "accessKeyID": "SCWW3H0VJTP98ZJXJJ8V",
                           "secretAccessKey": "************************************",
                           "sessionToken": "",
                           "roleARN": "",
                           "sessionName": "",
                           "duration": "0s",
                           "roleEndpoint": "",
                           "roleRegion": ""
                         }
    
    Unique ID:           632d3c3999fa2ca3b1e7e79b9ebb5b498ef25438b732762589537020977dc35c
    Hash:                BLAKE2B-256-128
    Encryption:          CHACHA20-POLY1305-HMAC-SHA256
    Splitter:            DYNAMIC-4M-BUZHASH
    Format version:      3
    Content compression: true
    Password changes:    true
    Max pack length:     21 MB
    Index Format:        v2
    
    Epoch Manager:       enabled
    Current Epoch: 0
    
    Epoch refresh frequency: 20m0s
    Epoch advance on:        20 blobs or 10.5 MB, minimum 24h0m0s
    Epoch cleanup margin:    4h0m0s
    Epoch checkpoint every:  7 epochs
    

    I can now do a test restore

    thib@tinykube:~ $ kopia restore k6303a292f182dcabab119b4d0e13b7d1 /mnt/icy/nextcloud-chacha
    Restoring to local filesystem (/mnt/icy/nextcloud-chacha) with parallelism=8...
    Processed 82254 (67 GB) of 118011 (233.4 GB) 23.1 MB/s (28.7%) remaining 1h59m50s
    

After a minute or two of restoring at 1.5 MB/s with CPUs mostly idle, the Pi starts restoring faster and faster. The restore speed displayed by kopia very slowly rises up to 23.1 MB/s. I expected it to reach at least 70 or 80 MB/s.

The CPU doesn't look like it's going at full capacity. While the wait time regularly remained below 10%, I could see bumps where the wa metric went above 80% for some of the CPUs, and sometimes all of them at the same time.

With the Chacha encryption algorithm, it looks like the bottleneck is not the CPU anymore but the disk. Unfortunately, I can only attach an NVMe drive via a USB enclosure on my Raspberry Pi 4, so I won't be able to remove that bottleneck.

    Conclusion

    It was a fun journey figuring out why my Raspberry Pi 4 was too slow to restore data backed up from my VPS. I now know the value of htop when chasing bottlenecks. I also understand better how Kopia works and the importance of using encryption and hash algorithms that work well on the machine that will perform the backups and restore.

When doing a restore, the Raspberry Pi had to pull the repository data from Scaleway, decrypt the chunks from the repository, and encrypt data to write it on disk. The CPU of the Raspberry Pi is not optimized for encryption and favors power efficiency over computing power. It was completely saturated by the decryption and encryption work it had to do.

My only regret here is that I couldn't test a Chacha-encrypted kopia repository on an encrypted disk, since my Raspberry Pi refused to boot from the encrypted drive shortly after I tested the random read/write speed. I went from a restore speed measured in bytes per second to one measured in dozens of megabytes per second. But even without the disk encryption overhead, the Pi is too slow at restoring backups for me to use it in production.

    Since I intend to run quite a few services on my server (k3s, Flux, Prometheus, kube-state-metrics, Grafana, velero, and a flurry of actual user-facing services) I need a much beefier machine. I purchased a Minisforum UM880 Plus to host it all, and now I know the importance of configuring velero and how it uses kopia for maximum efficiency on my machine.

    A massive thank you to my friends and colleagues, Olivier Reivilibre, Ben Banfield-Zanin, and Guillaume Villemont for their suggestions when chasing the bottleneck.

    Why is there no consistent single signon API flow?

Single signon is a pretty vital part of modern enterprise security. You have users who need access to a bewildering array of services, and you want to be able to avoid the fallout of one of those services being compromised and your users having to change their passwords everywhere (because they're clearly going to be using the same password everywhere), or you want to be able to enforce some reasonable MFA policy without needing to configure it in 300 different places, or you want to be able to disable all user access in one place when someone leaves the company, or, well, all of the above. There's any number of providers for this, ranging from it being integrated with a more general app service platform (eg, Microsoft or Google) to a third party vendor (Okta, Ping, any number of bizarre companies). And, in general, they'll offer a straightforward mechanism to either issue OIDC tokens or manage SAML login flows, requiring users to present whatever set of authentication mechanisms you've configured.

    This is largely optimised for web authentication, which doesn't seem like a huge deal - if I'm logging into Workday then being bounced to another site for auth seems entirely reasonable. The problem is when you're trying to gate access to a non-web app, at which point consistency in login flow is usually achieved by spawning a browser and somehow managing submitting the result back to the remote server. And this makes some degree of sense - browsers are where webauthn token support tends to live, and it also ensures the user always has the same experience.

But it works poorly for CLI-based setups. There's basically two options - you can use the device code authorisation flow, where you perform authentication on what is nominally a separate machine to the one requesting it (but in this case is actually the same) and as a result end up with a straightforward mechanism to have your users socially engineered into giving Johnny Badman a valid auth token despite webauthn nominally being unphishable (as described years ago), or you reduce that risk somewhat by spawning a local server and POSTing the token back to it - which works locally but doesn't work well if you're dealing with trying to auth on a remote device. The user experience for both scenarios sucks, and it reduces a bunch of the worthwhile security properties that modern MFA supposedly gives us.

    There's a third approach, which is in some ways the obviously good approach and in other ways is obviously a screaming nightmare. All the browser is doing is sending a bunch of requests to a remote service and handling the response locally. Why don't we just do the same? Okta, for instance, has an API for auth. We just need to submit the username and password to that and see what answer comes back. This is great until you enable any kind of MFA, at which point the additional authz step is something that's only supported via the browser. And basically everyone else is the same.
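For the username/password step this really is trivial; a sketch against Okta's classic authn endpoint (the org URL and credentials are obviously placeholders, and the response shape varies between providers and org configurations) looks something like:

    curl -s https://example.okta.com/api/v1/authn \
      -H "Content-Type: application/json" \
      -d '{"username": "user@example.com", "password": "hunter2"}'

The response includes a status such as MFA_REQUIRED plus a list of enrolled factors - and the factor verification step is exactly where the "browser only" behaviour kicks in.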

    Of course, when we say "That's only supported via the browser", the browser is still just running some code of some form and we can figure out what it's doing and do the same. Which is how you end up scraping constants out of Javascript embedded in the API response in order to submit that data back in the appropriate way. This is all possible but it's incredibly annoying and fragile - the contract with the identity provider is that a browser is pointed at a URL, not that any of the internal implementation remains consistent.

    I've done this. I've implemented code to scrape an identity provider's auth responses to extract the webauthn challenges and feed those to a local security token without using a browser. I've also written support for forwarding those challenges over the SSH agent protocol to make this work with remote systems that aren't running a GUI. This week I'm working on doing the same again, because every identity provider does all of this differently.

    There's no fundamental reason all of this needs to be custom. It could be a straightforward "POST username and password, receive list of UUIDs describing MFA mechanisms, define how those MFA mechanisms work". That even gives space for custom auth factors (I'm looking at you, Okta Fastpass). But instead I'm left scraping JSON blobs out of Javascript and hoping nobody renames a field, even though I only care about extremely standard MFA mechanisms that shouldn't differ across different identity providers.

    Someone, please, write a spec for this. Please don't make it be me.


    Nirbheek Chauhan

    @nirbheek

    A strange game. The only winning move is not to play.

That's a reference to the 1983 film “WarGames”. A film that has had incredible influence on not just the social milieu, but also cyber security and defence. It has a lot of lessons that need re-learning every couple of generations, and I think the time for that has come again.

Human beings are very interesting creatures. Tribalism and warfare are wired into our minds in such a visceral way that we lose the ability to think more than one or two steps ahead when we're trying to defend our tribe in anger.

    Most people get that this is what makes warfare conducted with nuclear weapons particularly dangerous, but I think not enough words have been written about how this same tendency also makes warfare conducted with Social Media dangerous.

    You cannot win a war on Social Media. You can only mire yourself in it more and more deeply, harming yourself, the people around you, and the potential of what you could've been doing instead of fighting that war. The more you throw yourself in it, the more catharsis you will feel, followed by more attacks, more retaliation, and more catharsis.

    A Just War is addictive, and a Just War without loss of life is the most addictive of all.

    The only winning move is not to play.

    The Internet in general and Social Media in particular are very good at bringing close to you all kinds of strange and messed-up people. For a project like GNOME, it is almost unavoidable that the project and hence the people in it will encounter such people. Many of these people live for hate, and wish to see the GNOME project fail.

    Engaging them and hence spending your energy on them is the easiest way to help them achieve their goals. You cannot bully them off the internet. Your angry-posting and epic blogs slamming them into the ground aren't going to make them stop. The best outcome is that they get bored and go annoy someone else.

    The only winning move is not to play.

    When dealing with abusive ex-partners or ex-family members, a critical piece of advice is given to victims: all they want is a reaction. Everything they're doing is in pursuit of control, and once you leave them, the only control they have left is over your emotional state.

    When they throw a stone at you, don't lob it back at them. Catch it and drop it on the ground, as if it doesn't matter. In the beginning, they will intensify their attack, saying increasingly mean and cutting things in an attempt to evoke a response. You have to not care. Eventually they will get bored and leave you alone.

    This is really REALLY hard to do, because the other person knows all your trigger points. They know you inside out. But it's the only way out.

    The only winning move is not to play.

    Wars that cannot be won, should not be fought. Simply because war has costs, and for the people working on GNOME, the cost is time and energy that they could've spent on creating the future that they want to see.

    In my 20s and early 30s I made this same youthful mistake, and what got me out of it was my drive to always distil decisions through two questions: What is my core purpose? Is this helping me achieve my purpose?

This is such a powerful guiding and enabling force that I would urge all readers to imbue themselves with it. It will change your life.

    Jiri Eischmann

    @jeischma

    Linux Desktop Migration Tool 1.5

    After almost a year I made another release of the Linux Desktop Migration Tool. In this release I focused on the network settings migration, specifically NetworkManager because it’s what virtually all desktop distributions use.

    The result isn’t a lot of added code, but it certainly took some time to experiment with how NetworkManager behaves. It doesn’t officially support network settings migration, but it’s possible with small limitations. I’ve tested it with all kinds of network connections (wired, Wi-Fi, VPNs…) and it worked for me very well, but I’m pretty sure there are scenarios that may not work with the way I implemented the migration. I’m interested in learning about them. What is currently not fully handled are scenarios where the network connection requires a certificate. It’s either located in ~/.pki and thus already handled by the migration tool, or you have to migrate it manually.

The Linux Desktop Migration Tool now covers everything I originally planned to cover and the number of choices has grown quite a lot. So I'll focus on dialogs and UX in general instead of adding new features. I'll also look at optimizations. E.g. migrating files using rsync takes a lot of time if you have a lot of small files in your home. It can certainly be sped up.

    Hans de Goede

    @hansdg

    Is Copilot useful for kernel patch review?

Patch review is an important and useful part of the kernel development process, but it is also a time-consuming part. To see if I could save some human reviewer time I've been pushing kernel patch-series to a branch on github, creating a pull-request for the branch and then assigning it to Copilot for review. The idea being that I would fix any issues Copilot catches before posting the series upstream, saving a human reviewer from having to catch the issues.

I've done this for 5 patch-series: one, two, three, four, five, totalling 53 patches. Click a number to see the pull-request and Copilot's reviews.

Unfortunately the results are not great. On 53 patches Copilot had 4 low-confidence comments, which were not useful, and 3 normal comments. 2 of the normal comments were on the power-supply fwnode series: one was about spelling degrees Celcius as degrees Celsius instead, which is the single valid remark. The other remark was about re-assigning a variable without freeing it first, but Copilot missed that the re-assignment was to another variable since this happened in a different scope. The third normal comment (here) was about as useless as they can come.

    To be fair these were all patch-series written by me and then already self-reviewed and deemed ready for upstream posting before I asked Copilot to review them.

As another experiment I did one final pull-request with a couple of WIP patches to add USBIO support from Intel. Copilot generated 3 normal comments here, all 3 of which are valid, and one of them catches a real bug. Still, given the WIP state of this case and the fact that my own review has found a whole lot more than just this, including the need for a bunch of refactoring, the results of this Copilot review are also disappointing IMHO.

Copilot also automatically generates summaries of the changes in the pull-requests. At a first look these seem useful for e.g. a cover-letter for a patch-set, but they are often full of half-truths, so at a minimum they need some very careful editing / correcting before they can be used.

    My personal conclusion is that running patch-sets through Copilot before posting them on the list is not worth the effort.


    Casilda 0.9.0 Development Release!

    Native rendering Release!

I am pleased to announce a new development release of Casilda, a simple Wayland compositor widget for Gtk 4 which can be used to embed other processes' windows in your Gtk 4 application.

The main feature of this release is dmabuf support, which allows clients to use hardware accelerated libraries for their rendering, brought to you by Val Packett!

    You can see all her cool work here.

This allowed me to stop relying on the wlroots scene compositor and render client windows directly in the widget snapshot method, which not only is faster but also integrates better with Gtk: the background is not handled by wlroots anymore and can be set with CSS like for any other widget. This is why I decided to deprecate the bg-color property.

    Other improvements include transient window support and better initial window placement.

    Release Notes

      • Fix rendering glitch on resize
      • Do not use wlr scene layout
      • Render windows and popups directly in snapshot()
      • Position windows on center of widget
      • Position transient windows on center of parent
      • Fix unmaximize
      • Add dmabuf support (Val Packett)
      • Added vapi generation (PaladinDev)
      • Add library soname (Benson Muite)

    Fixed Issues

      • “Resource leak causing crash with dmabuf”
  • “Unmaximize not working properly”
      • “Add dmabuff support” (Val Packett)
      • “Bad performance”
      • “Add a soname to shared library” (Benson Muite)

    Where to get it?

    Source code lives on GNOME gitlab here

    git clone https://gitlab.gnome.org/jpu/casilda.git

    Matrix channel

Have any questions? Come chat with us at #cambalache:gnome.org

    Mastodon

Follow me on Mastodon @xjuan to get news related to Casilda and Cambalache development.

    Happy coding!

    My a11y journey

    23 years ago I was in a bad place. I'd quit my first attempt at a PhD for various reasons that were, with hindsight, bad, and I was suddenly entirely aimless. I lucked into picking up a sysadmin role back at TCM where I'd spent a summer a year before, but that's not really what I wanted in my life. And then Hanna mentioned that her PhD supervisor was looking for someone familiar with Linux to work on making Dasher, one of the group's research projects, more usable on Linux. I jumped.

The timing was fortuitous. Sun were pumping money and developer effort into accessibility support, and the Inference Group had just received a grant from the Gatsby Foundation that involved working with the ACE Centre to provide additional accessibility support. And I was suddenly hacking on code that was largely ignored by most developers, supporting use cases that were irrelevant to most developers. Being in a relatively green field space sounds refreshing, until you realise that you're catering to actual humans who are potentially going to rely on your software to be able to communicate. That's somewhat focusing.

This was, uh, something of an on the job learning experience. I had to catch up with a lot of new technologies very quickly, but that wasn't the hard bit - what was difficult was realising I had to cater to people who were dealing with use cases that I had no experience of whatsoever. Dasher was extended to allow text entry into applications without needing to cut and paste. We added support for introspection of the current application's UI so menus could be exposed via the Dasher interface, allowing people to fly through menu hierarchies and pop open file dialogs. Text-to-speech was incorporated so people could rapidly enter sentences and have them spoken out loud.

    But what sticks with me isn't the tech, or even the opportunities it gave me to meet other people working on the Linux desktop and forge friendships that still exist. It was the cases where I had the opportunity to work with people who could use Dasher as a tool to increase their ability to communicate with the outside world, whose lives were transformed for the better because of what we'd produced. Watching someone use your code and realising that you could write a three line patch that had a significant impact on the speed they could talk to other people is an incomparable experience. It's been decades and in many ways that was the most impact I've ever had as a developer.

    I left after a year to work on fruitflies and get my PhD, and my career since then hasn't involved a lot of accessibility work. But it's stuck with me - every improvement in that space is something that has a direct impact on the quality of life of more people than you expect, but is also something that goes almost unrecognised. The people working on accessibility are heroes. They're making all the technology everyone else produces available to people who would otherwise be blocked from it. They deserve recognition, and they deserve a lot more support than they have.

    But when we deal with technology, we deal with transitions. A lot of the Linux accessibility support depended on X11 behaviour that is now widely regarded as a set of misfeatures. It's not actually good to be able to inject arbitrary input into an arbitrary window, and it's not good to be able to arbitrarily scrape out its contents. X11 never had a model to permit this for accessibility tooling while blocking it for other code. Wayland does, but suffers from the surrounding infrastructure not being well developed yet. We're seeing that happen now, though - Gnome has been performing a great deal of work in this respect, and KDE is picking that up as well. There isn't a full correspondence between X11-based Linux accessibility support and Wayland, but for many users the Wayland accessibility infrastructure is already better than with X11.

    That's going to continue improving, and it'll improve faster with broader support. We've somehow ended up with the bizarre politicisation of Wayland as being some sort of woke thing while X11 represents the Roman Empire or some such bullshit, but the reality is that there is no story for improving accessibility support under X11 and sticking to X11 is going to end up reducing the accessibility of a platform.

    When you read anything about Linux accessibility, ask yourself whether you're reading something written by either a user of the accessibility features, or a developer of them. If they're neither, ask yourself why they actually care and what they're doing to make the future better.


    libinput and tablet tool eraser buttons

    This is, to some degree, a followup to this 2014 post. The TLDR of that is that, many a moon ago, the corporate overlords at Microsoft that decide all PC hardware behaviour decreed that the best way to handle an eraser emulation on a stylus is by having a button that is hardcoded in the firmware to, upon press, send a proximity out event for the pen followed by a proximity in event for the eraser tool. Upon release, they dogma'd, said eraser button shall virtually move the eraser out of proximity followed by the pen coming back into proximity. Or, in other words, the pen simulates being inverted to use the eraser, at the push of a button. Truly the future, back in the happy times of the mid 20-teens.

    In a world where you don't want to update your software for a new hardware feature, this of course makes perfect sense. In a world where you write software to handle such hardware features, significantly less so.

Anyway, it is now 11 years later, the happy 2010s are over, and Benjamin and I have fixed this very issue in a few udev-hid-bpf programs, but I wanted something that's a) more generic and b) configurable by the user[1]. Somehow I am still convinced that disabling the eraser button at the udev-hid-bpf level will make users that use said button angry and, dear $deity, we can't have angry users, can we? So many angry people out there anyway, let's not add to that.

To get there, libinput's guts had to be changed. Previously libinput would read the kernel events, update the tablet state struct and then generate events based on various state changes. This of course works great when you e.g. get a button toggle, it doesn't work quite as great when your state change was one or two event frames ago (because prox-out of one tool, prox-in of another tool are at least 2 events). Extracting that older state change was like swapping the type of meatballs from an ikea meal after it's been served - doable in theory, but very messy.

Long story short, libinput now has an internal plugin system that can modify the evdev event stream as it comes in. It works like a pipeline, the events are passed from the kernel to the first plugin, modified, passed to the next plugin, etc. Eventually the last plugin is our actual tablet backend which will update tablet state, generate libinput events, and generally be grateful about having fewer quirks to worry about. With this architecture we can hold back the proximity events and filter them (if the eraser comes into proximity) or replay them (if the eraser does not come into proximity). The tablet backend is none the wiser, it either sees proximity events when those are valid or it sees a button event (depending on configuration).

    This architecture approach is so successful that I have now switched a bunch of other internal features over to use that internal infrastructure (proximity timers, button debouncing, etc.). And of course it laid the ground work for the (presumably highly) anticipated Lua plugin support. Either way, happy times. For a bit. Because for those not needing the eraser feature, we've just increased your available tool button count by 100%[2] - now there's a headline for tech journalists that just blindly copy claims from blog posts.

    [1] Since this is a bit wordy, the libinput API call is just libinput_tablet_tool_config_eraser_button_set_button()
    [2] A very small number of styli have two buttons and an eraser button so those only get what, 50% increase? Anyway, that would make for a less clickbaity headline so let's handwave those away.

    Marcus Lundblad

    @mlundblad

    Midsommer Maps

     As tradition has it, it's about time for the (Northern Hemisphere) summer update on the happenings around Maps!

    About dialog for GNOME Maps 49.alpha development 


    Bug Fixes 

Since the GNOME 48 release in March, there have been some bug fixes, such as correctly handling daylight savings time in public transit itineraries retrieved from Transitous. Also, James Westman fixed a regression where the search result popover wasn't showing on small screen devices (phones) because of sizing issues.

     

    More Clickable Stuff

More things can now be directly selected in the map view by clicking/tapping on their symbols, like roads and house numbers (and then, like any other POI, they can be marked as favorites).
     
    Showing place information for the AVUS motorway in Berlin

     And related to traffic and driving, exit numbers are now shown for highway junctions (exits) when available.
     
Showing information for a highway exit in a driving-on-the-right locality

Showing information for a highway exit in a driving-on-the-left locality

Note how the direction the arrow is pointing depends on the side of the road vehicle traffic drives on in the country/territory of the place…
Also the icon for the “Directions” button now shows a mirrored “turn off left” icon for places in drives-on-the-left countries, as an additional attention to detail.
     

    Furigana Names in Japanese

Since some time (around when we re-designed the place information “bubbles”) we show the native name for a place under the name translated into the user's locale (when they are different).
There is an established OpenStreetMap tag for phonetic names in Japanese (using Hiragana), name:ja-Hira, akin to Furigana (https://en.wikipedia.org/wiki/Furigana), used to aid with the pronunciation of place names. I had been thinking that it might be a good idea to show this, when available, as the dimmed supplemental text in the cases where the displayed name and native name are identical, e.g. when the user's locale is Japanese and looking at Japanese names. For other locales in these cases the displayed name would typically be the Romaji name with the full Japanese (Kanji) name displayed under it as the native name.
So, I took the opportunity to discuss this with my colleague Daniel Markstedt, who speaks fluent Japanese and has lived many years in Japan. As he liked the idea, and a demo of it, I decided to go ahead with this!
     
    Showing a place in Japanese with supplemental Hiragana name

     

    Configurable Measurement Systems

    Since more or less the start of time, Maps has shown distances in feet and miles when using a United States locale (or more precisely, when measurements use such a locale: LC_MEASUREMENT when speaking about the environment variables), and standard metric measurements for other locales.
    Despite this we have several times received bug reports about Maps not using the correct units. The issue here is that many users tend to prefer to have their computers speaking American English.
    So, I finally caved in and added an option to override the system default.
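    The decision logic boils down to something like the following sketch. This is Python for illustration only (GNOME Maps itself is written in JavaScript), and the override values are hypothetical, not the actual setting names:

    import os


    def use_imperial_units(override=None):
        """override is a hypothetical user setting: None (follow the system), 'metric' or 'imperial'."""
        if override in ("metric", "imperial"):
            return override == "imperial"
        # Follow the usual POSIX precedence: LC_ALL overrides LC_MEASUREMENT,
        # which in turn overrides LANG.
        measurement = (os.environ.get("LC_ALL")
                       or os.environ.get("LC_MEASUREMENT")
                       or os.environ.get("LANG", ""))
        # A real implementation would also handle the few other imperial locales.
        return measurement.startswith("en_US")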
     
    Hamburger menu

     
    Hamburger menu showing measurement unit selection

    Station Symbols

    One feature I had been wanting to implement since we moved to vector tiles and integrated the customized highway shields from OpenStreetMap Americana is showing localized symbols for e.g. metro stations, such as the classic “roundel” symbol used in London, or the “T” in Stockholm.
     
    After adding the network:wikidata tag to the pre-generated vector tiles this has been possible to implement. We chose to rely on the Wikidata tag instead of the network name/abbreviation, as this is more stable and names could risk colliding with unrelated networks that have the same (short) name.
     
    U-Bahn station in Hamburg

    Metro stations in Copenhagen

    Subway stations in Boston

    S-Bahn station in Berlin  

     
    This requires the stations to be tagged consistently to work out. I did some mass tagging of metro stations in Stockholm, Oslo, and Copenhagen. Other than that I mainly chose places where there's at least partial coverage already.
     
    If you'd like to contribute and update a network with the network Wikidata tag, I have prepared some quick steps for doing such an edit with the JOSM OpenStreetMap desktop editor.
     
    Download a set of objects to update using an Overpass query; as an example, selecting the stations of the Washington DC metro:
     
    [out:xml][timeout:90][bbox:{{bbox}}];
    (
      nwr["network"="Washington Metro"]["railway"="station"];
    );
    (._;>;);
    out meta;

     

    JOSM Overpass download query editor  

     Select the region to download from

    Select region in JOSM

     

    Select to only show the data layer (not showing the background map) to make it easier to see the raw data.

    Toggle data layers in JOSM

     Select the nodes.

    Show raw data points in JOSM

     

    Edit the field in the tag edit panel to update the value for all selected objects

    Showing tags for selected objects

    Note that this sample assumed the relevant station nodes were already tagged with network names (the network tag). Other queries to limit the selection might be needed.

    It could also be a good idea to reach out to local OSM communities before making bulk edits like this (e.g. if there is no such tagging at all in a specific region) to make sure it would be aligned with expectations and such.

    It will then also potentially take a while before it gets included in our monthly vector tile update.

    When this has been done, and given that a suitable icon is available (e.g. as public domain or under a free license on Wikimedia Commons), it could be bundled in data/icons/stations and a definition added to the data mapping in src/mapStyle/stations.js.

     

    And More…

    One feature that has been long wanted is the ability to download maps for offline usage. Lately this is precisely what James Westman has been working on.

    It's still an early draft, so we'll see when it is ready, but it already looks pretty promising.

     

    Showing the new Preferences option  

      



    Preference dialog with downloads

    Selecting region to download

     
    Entering a name for a downloaded region

      

    Dialog showing downloaded areas

        

     

    And that's it for now! 

     
     

    Alley Chaggar

    @AlleyChaggar

    Demystifying The Codegen Phase Part 1

    Intro

    I want to start off by saying I’m really glad that my last blog was helpful to many wanting to understand Vala’s compiler. I hope this blog will be just as informative and helpful. I want to talk a little about the basics of the compiler again, but this time catering to the codegen phase: the phase that I’m actually working on, but which has the least information in the Vala Docs.

    In my last blog, I briefly mentioned the directories codegen and ccode being part of the codegen phase. This blog will go more into depth about it. The codegen phase takes the AST and outputs the C code tree (ccode* objects), from which the C code can be generated more easily; that C code is then usually compiled by GCC or another C compiler you have installed. When dealing with this phase, it’s really beneficial to know and understand at least a little bit of C.

    ccode Directory

    • Many of the files in the ccode directory are derived from the class CCodeNode, in valaccodenode.vala.
    • The files in this directory represent C constructs. For example, the valaccodefunction.vala file represents a C code function. Regular C functions have function names, parameters, return types, and bodies that add logic. Essentially, what this class does is provide the building blocks for building a function in C.

         //...
        	writer.write_string (return_type);
            if (is_declaration) {
                writer.write_string (" ");
            } else {
                writer.write_newline ();
            }
            writer.write_string (name);
            writer.write_string (" (");
            int param_pos_begin = (is_declaration ? return_type.char_count () + 1 : 0 ) + name.char_count () + 2;
      
            bool has_args = (CCodeModifiers.PRINTF in modifiers || CCodeModifiers.SCANF in modifiers);
       //...
      

    This code snippet is part of the ccodefunction file, and what it’s doing is overriding the ‘write’ function that originally comes from ccodenode. It’s actually writing out the C function.

    codegen Directory

    • The files in this directory are higher-level components responsible for taking the compiler’s internal representation, such as the AST, and transforming it into the C code model (ccode objects).
    • Going back to the example of the ccodefunction, codegen will take a function node from the abstract syntax tree (AST) and create a new ccodefunction object. It then fills this object with information like the return type, function name, parameters, and body, which are all derived from the AST. Then CCodeFunction.write() (the code above) will generate and write out the C function.

      //...
      private void add_get_property_function (Class cl) {
      		var get_prop = new CCodeFunction ("_vala_%s_get_property".printf (get_ccode_lower_case_name (cl, null)), "void");
      		get_prop.modifiers = CCodeModifiers.STATIC;
      		get_prop.add_parameter (new CCodeParameter ("object", "GObject *"));
      		get_prop.add_parameter (new CCodeParameter ("property_id", "guint"));
      		get_prop.add_parameter (new CCodeParameter ("value", "GValue *"));
      		get_prop.add_parameter (new CCodeParameter ("pspec", "GParamSpec *"));
        
      		push_function (get_prop);
      //...
      

    This code snippet is from valagobjectmodule.vala; it’s calling CCodeFunction (again from valaccodefunction.vala) and adding the parameters, which calls valaccodeparameter.vala. What this would output is something that looks like this in C:

        void _vala_get_property (GObject *object, guint property_id, GValue *value, GParamSpec *pspec) {
           //... 
        }
    

    Why do all this?

    Now you might ask why? Why separate codegen and ccode?

    • We split things into codegen and ccode to keep the compiler organized, readable, and maintainable. It prevents us from having to constantly write C code representations from scratch.
    • It also reinforces the idea of polymorphism and the ability of objects to behave differently depending on their subclass.
    • And it lets us do hidden generation by adding new helper functions, temporary variables, or inlined optimizations after the AST and before the C code output.

    Jsonmodule

    I’m happy to say that I am making a lot of progress with the JSON module I mentioned in my last blog. The JSON module follows other modules in the codegen very closely, specifically the gtk module and the gobject module. It will be calling ccode functions to make ccode objects and creating helper methods so that the user doesn’t need to manually override certain JSON methods.

    Jamie Gravendeel

    @monster

    UI-First Search With List Models

    You can find the repository with the code here.

    When managing large amounts of data, manual widget creation finds its limits. Not only because managing both data and UI separately is tedious, but also because performance will be a real concern.

    Luckily, there are two solutions for this in GTK:

    1. Gtk.ListView using a factory: more performant, since it reuses widgets when the list gets long
    2. Gtk.ListBox‘s bind_model(): less performant, but can use boxed list styling (a minimal sketch of this option follows below)
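    Since the rest of this post uses option 1, here is a minimal, self-contained sketch of option 2 for comparison. It is an illustration only (hypothetical app id and data, plain Gtk.StringObject items instead of the Pet class defined later):

    import gi

    gi.require_version("Gtk", "4.0")
    gi.require_version("Adw", "1")
    from gi.repository import Adw, Gio, Gtk


    def create_row(item):
        # Called once per model item; returns the widget that represents it.
        return Gtk.Label(label=item.get_string(), halign=Gtk.Align.START)


    def on_activate(app):
        store = Gio.ListStore(item_type=Gtk.StringObject)
        for name in ("Herman", "Saartje", "Sofie"):
            store.append(Gtk.StringObject.new(name))

        # bind_model() builds one row per item and keeps the rows in sync
        # with the model; "boxed-list" is the libadwaita boxed list style.
        list_box = Gtk.ListBox(css_classes=["boxed-list"], valign=Gtk.Align.START)
        list_box.bind_model(store, create_row)

        Adw.ApplicationWindow(application=app, content=list_box).present()


    app = Adw.Application(application_id="org.example.BindModel")
    app.connect("activate", on_activate)
    app.run(None)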

    This blog post provides an example of a Gtk.ListView containing my pets, which is sorted, can be searched, and is primarily made in Blueprint.

    The app starts with a plain window:

    from gi.repository import Adw, Gtk
    
    
    @Gtk.Template.from_resource("/org/example/Pets/window.ui")
    class Window(Adw.ApplicationWindow):
        """The main window."""
    
        __gtype_name__ = "Window"
    
    using Gtk 4.0;
    using Adw 1;
    
    template $Window: Adw.ApplicationWindow {
      title: _("Pets");
      default-width: 450;
      default-height: 450;
    
      content: Adw.ToolbarView {
        [top]
        Adw.HeaderBar {}
      }
    }
    

    Data Object

    The Gtk.ListView needs a data object to work with, which in this example is a pet with a name and species.

    This requires a GObject.Object called Pet with those properties, and a GObject.GEnum called Species:

    from gi.repository import Adw, GObject, Gtk
    
    
    class Species(GObject.GEnum):
        """The species of an animal."""
    
        NONE = 0
        CAT = 1
        DOG = 2
    
    […]
    
    class Pet(GObject.Object):
        """Data for a pet."""
    
        __gtype_name__ = "Pet"
    
        name = GObject.Property(type=str)
        species = GObject.Property(type=Species, default=Species.NONE)
    

    List View

    Now that there’s a data object to work with, the app needs a Gtk.ListView with a factory and model.

    To start with, there’s a Gtk.ListView wrapped in a Gtk.ScrolledWindow to make it scrollable, using the .navigation-sidebar style class for padding:

    content: Adw.ToolbarView {
      […]
    
      content: ScrolledWindow {
        child: ListView {
          styles [
            "navigation-sidebar",
          ]
        };
      };
    };
    

    Factory

    The factory builds a Gtk.ListItem for each object in the model, and utilizes bindings to show the data in the Gtk.ListItem:

    content: ListView {
      […]
    
      factory: BuilderListItemFactory {
        template ListItem {
          child: Label {
            halign: start;
            label: bind template.item as <$Pet>.name;
          };
        }
      };
    };

    Model

    Models can be modified by nesting them. The data itself can be in any Gio.ListModel; in this case a Gio.ListStore works well.

    The Gtk.ListView expects a Gtk.SelectionModel because that’s how it manages its selection, so the Gio.ListStore is wrapped in a Gtk.NoSelection:

    using Gtk 4.0;
    using Adw 1;
    using Gio 2.0;
    
    […]
    
    content: ListView {
      […]
    
      model: NoSelection {
        model: Gio.ListStore {
          item-type: typeof<$Pet>;
    
          $Pet {
            name: "Herman";
            species: cat;
          }
    
          $Pet {
            name: "Saartje";
            species: dog;
          }
    
          $Pet {
            name: "Sofie";
            species: dog;
          }
    
          $Pet {
            name: "Rex";
            species: dog;
          }
    
          $Pet {
            name: "Lady";
            species: dog;
          }
    
          $Pet {
            name: "Lieke";
            species: dog;
          }
    
          $Pet {
            name: "Grumpy";
            species: cat;
          }
        };
      };
    };
    

    Sorting

    To make the list easy to scan, the pets should be sorted by both name and species.

    To implement this, the Gio.ListStore has to be wrapped in a Gtk.SortListModel which has a Gtk.MultiSorter with two sorters, a Gtk.NumericSorter and a Gtk.StringSorter.

    Both of these need an expression: the property that needs to be compared.

    The Gtk.NumericSorter expects an integer, not a Species, so the app needs a helper method to convert it:

    class Window(Adw.ApplicationWindow):
        […]
    
        @Gtk.Template.Callback()
        def _species_to_int(self, _obj: Any, species: Species) -> int:
            return int(species)
    
    model: NoSelection {
      model: SortListModel {
        sorter: MultiSorter {
          NumericSorter {
            expression: expr $_species_to_int(item as <$Pet>.species) as <int>;
          }
    
          StringSorter {
            expression: expr item as <$Pet>.name;
          }
        };
    
        model: Gio.ListStore { […] };
      };
    };
    

    To learn more about closures, such as the one used in the Gtk.NumericSorter, consider reading my previous blog post.

    Search

    To look up pets even faster, the user should be able to search for them by both their name and species.

    Filtering

    First, the Gtk.ListView‘s model needs the logic to filter the list by name or species.

    This can be done with a Gtk.FilterListModel which has a Gtk.AnyFilter with two Gtk.StringFilters.

    One of the Gtk.StringFilters expects a string, not a Species, so the app needs another helper method to convert it:

    class Window(Adw.ApplicationWindow):
        […]
    
        @Gtk.Template.Callback()
        def _species_to_string(self, _obj: Any, species: Species) -> str:
            return species.value_nick
    
    model: NoSelection {
      model: FilterListModel {
        filter: AnyFilter {
          StringFilter {
            expression: expr item as <$Pet>.name;
          }
    
          StringFilter {
            expression: expr $_species_to_string(item as <$Pet>.species) as <string>;
          }
        };
    
        model: SortListModel { […] };
      };
    };
    

    Entry

    To actually search with the filters, the app needs a Gtk.SearchBar with a Gtk.SearchEntry.

    The Gtk.SearchEntry‘s text property needs to be bound to the Gtk.StringFilters’ search properties to filter the list on demand.

    To be able to start searching by typing from anywhere in the window, the Gtk.SearchEntry‘s key-capture-widget has to be set to the window, in this case the template itself:

    content: Adw.ToolbarView {
      […]
    
      [top]
      SearchBar {
        key-capture-widget: template;
    
        child: SearchEntry search_entry {
          hexpand: true;
          placeholder-text: _("Search pets");
        };
      }
    
      content: ScrolledWindow {
        child: ListView {
          […]
    
          model: NoSelection {
            model: FilterListModel {
              filter: AnyFilter {
                StringFilter {
                  search: bind search_entry.text;
                  […]
                }
    
                StringFilter {
                  search: bind search_entry.text;
                  […]
                }
              };
    
              model: SortListModel { […] };
            };
          };
        };
      };
    };
    

    Toggle Button

    The Gtk.SearchBar should also be toggleable with a Gtk.ToggleButton.

    To do so, the Gtk.SearchEntry‘s search-mode-enabled property should be bidirectionally bound to the Gtk.ToggleButton‘s active property:

    content: Adw.ToolbarView {
      [top]
      Adw.HeaderBar {
        [start]
        ToggleButton search_button {
          icon-name: "edit-find-symbolic";
          tooltip-text: _("Search");
        }
      }
    
      [top]
      SearchBar {
        search-mode-enabled: bind search_button.active bidirectional;
        […]
      }
    
      […]
    };
    

    The search_button should also be toggleable with a shortcut, which can be added with a Gtk.ShortcutController:

    [start]
    ToggleButton search_button {
      […]
    
      ShortcutController {
        scope: managed;
    
        Shortcut {
          trigger: "<Control>f";
          action: "activate";
        }
      }
    }
    

    Empty State

    Last but not least, the view should fall back to an Adw.StatusPage if there are no search results.

    This can be done with a closure for the visible-child-name property in an Adw.ViewStack or Gtk.Stack. I generally prefer an Adw.ViewStack due to its animation curve.

    The closure takes the number of items in the Gtk.NoSelection as input, and returns the correct Adw.ViewStackPage name:

    class Window(Adw.ApplicationWindow):
        […]
    
        @Gtk.Template.Callback()
        def _get_visible_child_name(self, _obj: Any, items: int) -> str:
            return "content" if items else "empty"
    
    content: Adw.ToolbarView {
      […]
    
      content: Adw.ViewStack {
        visible-child-name: bind $_get_visible_child_name(selection_model.n-items) as <string>;
        enable-transitions: true;
    
        Adw.ViewStackPage {
          name: "content";
    
          child: ScrolledWindow {
            child: ListView {
              […]
    
              model: NoSelection selection_model { […] };
            };
          };
        }
    
        Adw.ViewStackPage {
          name: "empty";
    
          child: Adw.StatusPage {
            icon-name: "edit-find-symbolic";
            title: _("No Results Found");
            description: _("Try a different search");
          };
        }
      };
    };
    

    End Result

    from typing import Any
    
    from gi.repository import Adw, GObject, Gtk
    
    
    class Species(GObject.GEnum):
        """The species of an animal."""
    
        NONE = 0
        CAT = 1
        DOG = 2
    
    
    @Gtk.Template.from_resource("/org/example/Pets/window.ui")
    class Window(Adw.ApplicationWindow):
        """The main window."""
    
        __gtype_name__ = "Window"
    
        @Gtk.Template.Callback()
        def _get_visible_child_name(self, _obj: Any, items: int) -> str:
            return "content" if items else "empty"
    
        @Gtk.Template.Callback()
        def _species_to_string(self, _obj: Any, species: Species) -> str:
            return species.value_nick
    
        @Gtk.Template.Callback()
        def _species_to_int(self, _obj: Any, species: Species) -> int:
            return int(species)
    
    
    class Pet(GObject.Object):
        """Data about a pet."""
    
        __gtype_name__ = "Pet"
    
        name = GObject.Property(type=str)
        species = GObject.Property(type=Species, default=Species.NONE)
    
    using Gtk 4.0;
    using Adw 1;
    using Gio 2.0;
    
    template $Window: Adw.ApplicationWindow {
      title: _("Pets");
      default-width: 450;
      default-height: 450;
    
      content: Adw.ToolbarView {
        [top]
        Adw.HeaderBar {
          [start]
          ToggleButton search_button {
            icon-name: "edit-find-symbolic";
            tooltip-text: _("Search");
    
            ShortcutController {
              scope: managed;
    
              Shortcut {
                trigger: "<Control>f";
                action: "activate";
              }
            }
          }
        }
    
        [top]
        SearchBar {
          key-capture-widget: template;
          search-mode-enabled: bind search_button.active bidirectional;
    
          child: SearchEntry search_entry {
            hexpand: true;
            placeholder-text: _("Search pets");
          };
        }
    
        content: Adw.ViewStack {
          visible-child-name: bind $_get_visible_child_name(selection_model.n-items) as <string>;
          enable-transitions: true;
    
          Adw.ViewStackPage {
            name: "content";
    
            child: ScrolledWindow {
              child: ListView {
                styles [
                  "navigation-sidebar",
                ]
    
                factory: BuilderListItemFactory {
                  template ListItem {
                    child: Label {
                      halign: start;
                      label: bind template.item as <$Pet>.name;
                    };
                  }
                };
    
                model: NoSelection selection_model {
                  model: FilterListModel {
                    filter: AnyFilter {
                      StringFilter {
                        expression: expr item as <$Pet>.name;
                        search: bind search_entry.text;
                      }
    
                      StringFilter {
                        expression: expr $_species_to_string(item as <$Pet>.species) as <string>;
                        search: bind search_entry.text;
                      }
                    };
    
                    model: SortListModel {
                      sorter: MultiSorter {
                        NumericSorter {
                          expression: expr $_species_to_int(item as <$Pet>.species) as <int>;
                        }
    
                        StringSorter {
                          expression: expr item as <$Pet>.name;
                        }
                      };
    
                      model: Gio.ListStore {
                        item-type: typeof<$Pet>;
    
                        $Pet {
                          name: "Herman";
                          species: cat;
                        }
    
                        $Pet {
                          name: "Saartje";
                          species: dog;
                        }
    
                        $Pet {
                          name: "Sofie";
                          species: dog;
                        }
    
                        $Pet {
                          name: "Rex";
                          species: dog;
                        }
    
                        $Pet {
                          name: "Lady";
                          species: dog;
                        }
    
                        $Pet {
                          name: "Lieke";
                          species: dog;
                        }
    
                        $Pet {
                          name: "Grumpy";
                          species: cat;
                        }
                      };
                    };
                  };
                };
              };
            };
          }
    
          Adw.ViewStackPage {
            name: "empty";
    
            child: Adw.StatusPage {
              icon-name: "edit-find-symbolic";
              title: _("No Results Found");
              description: _("Try a different search");
            };
          }
        };
      };
    }
    

    List models are pretty complicated, but I hope that this example provides a good idea of what’s possible from Blueprint, and is a good stepping stone to learn more.

    Thanks for reading!

    PS: a shout out to Markus for guessing what I’d write about next ;)

    Hari Rana

    @theevilskeleton

    It’s True, ā€œWeā€ Don’t Care About Accessibility on Linux

    Introduction

    What do concern trolls and privileged people without visible or invisible disabilities who share or make content about accessibility on Linux being trash without contributing anything to projects have in common? They don’t actually really care about the group they’re defending; they just exploit these victims’ unfortunate situation to fuel hate against groups and projects actually trying to make the world a better place.

    I never thought I’d be this upset, to the point that I’d be writing an article about something this sensitive with a clickbait-y title. It’s simultaneously demotivating, unproductive, and infuriating. I’m here writing this post fully knowing that I could have been working on accessibility in GNOME, but really, I’m so tired of having my mood ruined because of privileged people spending at most 5 minutes to write erroneous posts and then pretending to be oblivious when confronted, while it takes us 5 months of unpaid work to get a quarter of recognition, let alone acknowledgment, without accounting for the time “wasted” addressing these accusations. This is far from the first time, and it will certainly not be the last.

    I’m Not Angry

    I’m not mad. I’m absolutely furious and disappointed in the Linux Desktop community for staying quiet when it comes to celebrating advances in accessibility, while proceeding to share content and cheer for random privileged people from big-name websites or social media who have literally put a negative amount of effort into advancing accessibility on Linux. I’m explicitly saying a negative amount because they actually make it significantly more stressful for us.

    None of this is fair. If you’re the kind of person who stays quiet when we celebrate huge accessibility milestones, yet shares (or even makes) content that trash talks the people directly or indirectly contributing to the fucking software you use for free, you are the reason why accessibility on Linux is shit.

    No one in their right mind wants to volunteer in a toxic environment where their efforts are hardly recognized by the public and they are blamed for “not doing enough”, especially when they are expected to take in all kinds of harassment, nonconstructive criticism, and slander for a salary of 0$.

    There’s only one thing I am shamefully confident about: I am not okay in the head. I shouldn’t be working on accessibility anymore. The recognition-to-smearing ratio is unbearably low and arguably unhealthy, but leaving people in unfortunate situations behind is also not in accordance with my values.

    I’ve been putting so much effort, quite literally hundreds of hours, into:

    1. thinking of ways to come up with inclusive designs and experiences;
    2. imagining how I’d use something if I had a certain disability or condition;
    3. asking for advice and feedback from people with disabilities;
    4. not getting paid from any company or organization; and
    5. making sure that all the accessibility-related work is in the public, and stays in the public.

    Number 5 is especially important to me. I personally go as far as to refuse to contribute to projects under a permissive license, and/or that utilize a contributor license agreement, and/or that utilize anything riskily similar to these two, because I am of the opinion that no amount of code for accessibility should either be put under a paywall or be obscured and proprietary.

    Permissive licenses make it painlessly easy for abusers to fork a project, build an ecosystem on top of it which may include accessibility-related improvements, and slap a price tag on it, all without publishing any of these additions/changes. Corporations have been doing that for decades, and they’ll keep doing it until there’s heavy push back. The only time I would contribute to a project under a permissive license is when the tool is the accessibility infrastructure itself. Contributor license agreements are significantly worse in that regard, so I prefer to avoid them completely.

    The Truth Nobody Is Telling You

    KDE hired a legally blind contractor to work on accessibility throughout the KDE ecosystem, including complying with the EU Directive to allow selling hardware with Plasma.

    GNOME’s new executive director, Steven Deobald, is partially blind.

    The GNOME Foundation has been investing a lot of money to improve accessibility on Linux, for example funding Newton, a Wayland accessibility project, and AccessKit integration into GNOME technologies. Around 250,000€ (1/4) of the STF budget was spent solely on accessibility. And get this: literally everybody managing these contracts and the communication with funders is a volunteer; they’re ensuring people with disabilities earn a living, but aren’t receiving anything in return. These are the real heroes who deserve endless praise.

    The Culprits

    Do you want to know who we should be blaming? Profiteers who are profiting from the community’s effort while investing very little to nothing into accessibility.

    This includes a significant portion of the companies sponsoring GNOME and even companies that employ developers to work on GNOME. These companies are the ones making hundreds of millions, if not billions, in net profit indirectly from GNOME (and other free and open-source projects), and investing little to nothing into them. However, the worst offenders are the companies actively using GNOME without ever donating anything to fund the projects.

    Some companies actually do put in an effort, like Red Hat and Igalia. Red Hat employs people with disabilities to work on accessibility in GNOME, one of whom I actually rely on when making accessibility-related contributions in GNOME. Igalia funds Orca, the screen reader that is part of GNOME, which is something the Linux community should be thankful for. However, companies have historically invested what’s necessary to comply with governments’ accessibility requirements, and then never invest in it again.

    The privileged people who keep sharing and making content about accessibility on Linux being bad, without contributing anything to it, are, in my opinion, significantly worse than the companies profiting off of GNOME. Companies simply stay quiet, but these privileged people add an additional burden to contributors by either trash talking or sharing trash talkers. Once again, no volunteer deserves to be in the position of being shamed and ridiculed for “not doing enough”, since no one is entitled to their free time but themselves.

    My Work Is Free but the Worth Is Not

    Earlier in this article, I mentioned, and I quote: “I’ve been putting so much effort, quite literally hundreds of hours […]”. Let’s put an emphasis on “hundreds”. Here’s a list of most of the accessibility-related merge requests that have been incorporated into GNOME:

    GNOME Calendar’s !559 addresses an issue where event widgets were unable to be focused and activated by the keyboard. That issue had been present since the very beginning of GNOME Calendar’s existence, to be specific: for more than a decade. This alone was a two-week effort. Despite it being less than 100 lines of code, nobody truly knew what to do to have them working properly before. This was followed up by !576, which made the event buttons usable in the month view with a keyboard, and then !587, which properly conveys the states of the widgets. Both combined are another two-week effort.

    Then, at the time of writing this article, !564 adds 640 lines of code, which is something I’ve been volunteering on for more than a month, excluding the time before I opened the merge request.

    Let’s do a little bit of math together with ‘only’ !559, !576, and !587. Just as a reminder: these three merge requests are a four-week effort in total, on which I volunteered full-time—8 hours a day, or 160 hours a month. I compiled a small table that illustrates its worth:

    | Country | Average Wage for Professionals Working on Digital Accessibility (WebAIM) | Total in Local Currency (160 hours) | Exchange Rate | Total (CAD) |
    | --- | --- | --- | --- | --- |
    | Canada | 58.71$ CAD/hour | 9,393.60$ CAD | N/A | 9,393.60$ |
    | United Kingdom | 48.20£ GBP/hour | 7,712£ GBP | 1.8502 | 14,268.74$ |
    | United States of America | 73.08$ USD/hour | 11,692.80$ USD | 1.3603 | 15,905.72$ |

    To summarize the table: those three merge requests that I worked on for free were worth 9,393.60$ CAD (6,921.36$ USD) in total at a minimum.

    Just a reminder:

    • these merge requests exclude the time spent to review the submitted code;
    • these merge requests exclude the time I spent testing the code;
    • these merge requests exclude the time we spent coordinating these milestones;
    • these calculations exclude the 30+ merge requests submitted to GNOME; and
    • these calculations exclude the merge requests I submitted to third-party GNOME-adjacent apps.

    Now just imagine how I feel when I’m told I’m “not doing enough”, either directly or indirectly, by privileged people who don’t rely on any of these accessibility features. Whenever anybody says we’re “not doing enough”, I feel very much included, and I will absolutely take it personally.

    It All Trickles Down to “GNOME Bad”

    I fully expect everything I say in this article to be dismissed or taken out of context on the basis of ad hominem, simply because I’m a GNOME Foundation member / regular GNOME contributor. Either that, or to be subject to whataboutism because another GNOME contributor made a comment that had nothing to do with mine but ‘is somewhat related to this topic and therefore should be pointed out just because it was maybe-probably-possibly-perhaps ableist’. I can’t speak for other regular contributors, but I presume that they don’t feel comfortable talking about this because they dared to be a GNOME contributor. At least, that’s how I felt for the longest time.

    Any content related to accessibility that doesn’t dunk on GNOME doesn’t see as much engagement, activity, and reaction as content that actively attacks GNOME, regardless of whether the criticism is fair. Many of these people don’t even use these accessibility features; they’re just looking for every opportunity to say “GNOME bad” and will 🪄 magically 🪄 start caring about accessibility.

    Regular GNOME contributors like myself don’t always feel comfortable defending ourselves because dismissing GNOME developers just for being GNOME developers is apparently a trend…

    Final Word

    Dear people with disabilities,

    I won’t insist that we’re either your allies or your enemies—I have no right to claim that whatsoever.

    I wasn’t looking for recognition. I wasn’t looking for acknowledgment since the very beginning either. I thought I would be perfectly capable of quietly improving accessibility in GNOME, but because of the overall community’s persistence in smearing developers’ efforts without actually tackling the underlying issues within the stack, I think I’m justified in at least demanding acknowledgment from the wider community.

    I highly doubt it will happen anyway, because the Linux community feeds off of drama and trash talking instead of being productive, without realizing that it demotivates active contributors while pushing away potential contributors. And worst of all: people with disabilities are the ones affected the most, because they are misled into thinking that we don’t care.

    It’s so unfair and infuriating that all the work I do and share online gains very little activity compared to random posts and articles from privileged people without disabilities that rant about the Linux desktop’s accessibility being trash. It doesn’t help that I become severely anxious when sharing accessibility-related work, trying to avoid any sign of virtue signalling. The last thing I want is to (unintentionally) give any impression of pretending to care about accessibility.

    I beg you, please keep writing banger posts like fireborn’s I Want to Love Linux. It Doesn’t Love Me Back series and their interluding post. We need more people with disabilities to keep reminding developers that you exist and your conditions and disabilities are a spectrum, and not absolute.

    We simultaneously need more interest from people with disabilities to contribute to free and open-source software, and the wider community to be significantly more intolerant of bullies who profit from smearing and demotivating people who are actively trying.

    We should take inspiration from “Accessibility on Linux sucks, but GNOME and KDE are making progress” by OSNews. They acknowledge that accessibility on Linux is suboptimal while recognizing the efforts of GNOME and KDE. As a community, we should promote progress more often.

    Jamie Gravendeel

    @monster

    Data Driven UI With Closures

    It’s highly recommended to read my previous blog post first to understand some of the topics discussed here.

    UI can be hard to keep track of when changed imperatively; preferably, it just follows the code’s state. Closures provide an intuitive way to do so by having data as input, and the desired value as output. They couple data with UI, but decouple the specific piece of UI that’s changed, making closures very modular. The example in this post uses Python and Blueprint.

    Technicalities

    First, it’s good to be familiar with the technical details behind closures. To quote from Blueprint’s documentation:

    Expressions are only reevaluated when their inputs change. Because Blueprint doesn’t manage a closure’s application code, it can’t tell what changes might affect the result. Therefore, closures must be pure, or deterministic. They may only calculate the result based on their immediate inputs, not properties of their inputs or outside variables.

    To elaborate, expressions know when their inputs have changed due to the inputs being GObject properties, which emit the “notify” signal when modified.
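    As a quick illustration (plain PyGObject, separate from the example below), subscribing to that signal by hand looks like this:

    from gi.repository import GObject


    class Demo(GObject.Object):
        loading = GObject.Property(type=bool, default=True)


    demo = Demo()
    demo.connect(
        "notify::loading",
        lambda obj, _pspec: print("loading changed to", obj.props.loading),
    )
    demo.props.loading = False  # prints: loading changed to False

    A bound expression with a property like this as its input is essentially doing that subscription for you behind the scenes.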

    Another thing to note is where casting is necessary. To again quote Blueprint’s documentation:

    Blueprint doesn’t know the closure’s return type, so closure expressions must be cast to the correct return type using a cast expression.

    Just like Blueprint doesn’t know about the return type, it also doesn’t know the type of ambiguous properties. To provide an example:

    Button simple_button {
      label: _("Click");
    }
    
    Button complex_button {
      child: Adw.ButtonContent {
        label: _("Click");
      };
    }

    Getting the label of simple_button in a lookup does not require a cast, since label is a known property of Gtk.Button with a known type:

    simple_button.label

    While getting the label of complex_button does require a cast, since child is of type Gtk.Widget, which does not have the label property:

    complex_button.child as <Adw.ButtonContent>.label

    Example

    To set the stage, there’s a window with a Gtk.Stack which has two Gtk.StackPages, one for the content and one for the loading view:

    from gi.repository import Adw, Gtk
    
    
    @Gtk.Template.from_resource("/org/example/App/window.ui")
    class Window(Adw.ApplicationWindow):
        """The main window."""
    
        __gtype_name__ = "Window"
    using Gtk 4.0;
    using Adw 1;
    
    template $Window: Adw.ApplicationWindow {
      title: _("Demo");
    
      content: Adw.ToolbarView {
        [top]
        Adw.HeaderBar {}
    
        content: Stack {
          StackPage {
            name: "content";
    
            child: Label {
              label: _("Meow World!");
            };
          }
    
          StackPage {
            name: "loading";
    
            child: Adw.Spinner {};
          }
        };
      };
    }

    Switching Views Conventionally

    One way to manage the views would be to rely on signals to communicate when another view should be shown:

    from typing import Any
    
    from gi.repository import Adw, GObject, Gtk
    
    
    @Gtk.Template.from_resource("/org/example/App/window.ui")
    class Window(Adw.ApplicationWindow):
        """The main window."""
    
        __gtype_name__ = "Window"
    
        stack: Gtk.Stack = Gtk.Template.Child()
    
        loading_finished = GObject.Signal()
    
        @Gtk.Template.Callback()
        def _show_content(self, *_args: Any) -> None:
            self.stack.set_visible_child_name("content")

    A reference to the stack has been added, as well as a signal to communicate when loading has finished, and a callback to run when that signal is emitted.

    using Gtk 4.0;
    using Adw 1;
    
    template $Window: Adw.ApplicationWindow {
      title: _("Demo");
      loading-finished => $_show_content();
    
      content: Adw.ToolbarView {
        [top]
        Adw.HeaderBar {}
    
        content: Stack stack {
          StackPage {
            name: "content";
    
            child: Label {
              label: _("Meow World!");
            };
          }
    
          StackPage {
            name: "loading";
    
            child: Adw.Spinner {};
          }
        };
      };
    }

    A signal handler has been added, as well as a name for the Gtk.Stack.

    Only a couple of changes had to be made to switch the view when loading has finished, but all of them are sub-optimal:

    1. A reference in the code to the stack would be nice to avoid
    2. Imperatively changing the view makes following state harder
    3. This approach doesn’t scale well when the data can be reloaded; it would require another signal to be added

    Switching Views With a Closure

    To use a closure, the class needs data as input and a method to return the desired value:

    from typing import Any
    
    from gi.repository import Adw, GObject, Gtk
    
    
    @Gtk.Template.from_resource("/org/example/App/window.ui")
    class Window(Adw.ApplicationWindow):
        """The main window."""
    
        __gtype_name__ = "Window"
    
        loading = GObject.Property(type=bool, default=True)
    
        @Gtk.Template.Callback()
        def _get_visible_child_name(self, _obj: Any, loading: bool) -> str:
            return "loading" if loading else "content"

    The signal has been replaced with the loading property, and the template callback has been replaced by a method that returns a view name depending on the value of that property. _obj here is the template class, which is unused.

    using Gtk 4.0;
    using Adw 1;
    
    template $Window: Adw.ApplicationWindow {
      title: _("Demo");
    
      content: Adw.ToolbarView {
        [top]
        Adw.HeaderBar {}
    
        content: Stack {
          visible-child-name: bind $_get_visible_child_name(template.loading) as <string>;
    
          StackPage {
            name: "content";
    
            child: Label {
              label: _("Meow World!");
            };
          }
    
          StackPage {
            name: "loading";
    
            child: Adw.Spinner {};
          }
        };
      };
    }

    In Blueprint, the signal handler has been removed, as well as the unnecessary name for the Gtk.Stack. The visible-child-name property is now bound to a closure, which takes in the loading property referenced with template.loading.

    This fixed the issues mentioned before:

    1. No reference in code is required
    2. State is bound to a single property
    3. If the data reloads, the view will also adapt

    Closing Thoughts

    Views are just one UI element that can be managed with closures, but there are plenty of other elements that should adapt to data; think of icons, tooltips, visibility, etc. Whenever you’re writing a widget with moving parts and data, think about how the two can be linked; your future self will thank you!

    Victor Ma

    @victorma

    A strange bug

    In the last two weeks, I’ve been trying to fix a strange bug that causes the word suggestions list to have the wrong order sometimes.

    For example, suppose you have an empty 3x3 grid. Now suppose that you move your cursor to each of the cells of the 1-Across slot (labelled α, β, and γ).

    +---+---+---+
    | α | β | γ |
    +---+---+---+
    |   |   |   |
    +---+---+---+
    |   |   |   |
    +---+---+---+

    You should expect the word suggestions list for 1-Across to stay the same, regardless of which cell your cursor is on. After all, all three cells have the same information: that the 1-Across slot is empty, and the intersecting vertical slot of whatever cell we’re on (1-Down, 2-Down, or 3-Down) is also empty.

    There are no restrictions whatsoever, so all three cells should show the same word suggestion list: one that includes every three-letter word.

    But that’s not what actually happens. In reality, the word suggestions list changes quite dramatically. The order of the list definitely changes. And it looks like there may even be words in one list that don’t appear in another. What’s going on here?

    Understanding the code

    My first step was to understand how the code for the word suggestions list works. I took notes along the way, in order to solidify my understanding. I especially found it useful to create diagrams for the word list resource (a pre-compiled resource that the code uses):

    Word list resource diagram

    By the end of the first week, I had a good idea of how the word-suggestions-list code works. The next step was to figure out the cause of the bug and how to fix it.

    Investigating the bug

    After doing some testing, I realized that the seemingly random orderings of the lists are not so random after all! The lists are actually all in alphabetical order—but based on the letter that corresponds to the cell, not necessarily the first letter.

    What I mean is this:

    • The word suggestions list for cell α is sorted alphabetically by the first letter of the words. (This is normal alphabetical order.) For example:
      ALE, AXE, BAY, BOA, CAB
      
    • The word suggestions list for cell β is sorted alphabetically by the second letter of the words. For example:
      CAB, BAY, ALE, BOA, AXE
      
    • The word suggestions list for cell γ is sorted alphabetically by the third letter of the words. For example:
      BOA, CAB, ALE, AXE, BAY
      

    Fixing the bug

    The cause of the bug is quite simple: The function that generates the word suggestions list does not sort the list before it returns it. So the order of the list is whatever order the function added the words in. And because of how our implementation works, that order happens to be alphabetical, based on the letter that corresponds to the cell.
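    Here’s a tiny Python illustration of that effect (GNOME Crosswords itself is written in C; this just mimics the ordering, and tie-breaking may differ from the real list):

    words = ["ALE", "AXE", "BAY", "BOA", "CAB"]

    # The list ends up in whatever order the words were added, which in practice
    # is alphabetical by the letter at the cell's position within the slot:
    for cell_index in range(3):
        print(sorted(words, key=lambda w: w[cell_index]))
    # ['ALE', 'AXE', 'BAY', 'BOA', 'CAB']  <- cell α (ordinary alphabetical order)
    # ['BAY', 'CAB', 'ALE', 'BOA', 'AXE']  <- cell β (by the second letter)
    # ['BOA', 'CAB', 'ALE', 'AXE', 'BAY']  <- cell γ (by the third letter)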

    The fix for the bug is also quite simple—at least theoretically. All we need to do is sort the list before we return it. But in reality, this fix runs into some other problems that need to be addressed. Those problems are what I’m going to work on this week.

    Status update, 15/06/2025

    This month I created a personal data map where I tried to list all my important digital identities.

    (It’s actually now a spreadsheet, which I’ll show you later. I didn’t want to start the blog post with something as dry as a screenshot of a spreadsheet.)

    Anyway, I made my personal data map for several reasons.

    The first reason was to stay safe from cybercrime. In a world of increasing global unfairness and inequality, of course crime and scams are increasing too. Schools don’t teach how digital tech actually works, so it’s a great time to be a cyber criminal. Imagine being a house burglar in a town where nobody knows how doors work.

    Lucky for me, I’m a professional door guy. So I don’t worry too much beyond having a really really good email password (it has numbers and letters). But its useful to double check if I have my credit card details on a site where the password is still “sam2003”.

    The second reason is to help me migrate to services based in Europe. Democracy over here is what it is, there are good days and bad days, but unlike the USA we have at least more options than a repressive death cult and a fundraising business. (Shout to @angusm@mastodon.social for that one). You can’t completely own your digital identity and your data, but you can at least try to keep it close to home.

    The third reason was to see who has the power to influence my online behaviour.

    This was an insight from reading the book Technofeudalism. I’ve always been uneasy about websites tracking everything I do. Most of us are, to the point that we have made myths like “your phone microphone is always listening so Instagram can target adverts”. (As McSweeney’s Internet Tendency confirms, it’s not! It’s just tracking everything you type, every app you use, every website you visit, and everywhere you go in the physical world).

    I used to struggle to explain why all that tracking feels bad. Technofeudalism frames a concept of cloud capital, saying this is now more powerful than other kinds of capital because cloud capitalists can do something Henry Ford, Walt Disney and The Monopoly Guy can only dream of: mine their data stockpile to produce precisely targeted recommendations, search bubbles and adverts which can influence your behaviour before you’ve even noticed.

    This might sound paranoid when you first hear it, but consider how social media platforms reward you for expressing anger and outrage. Remember the first time you saw a post on Twitter from a stranger that you disagreed with? And your witty takedown attracted likes and praise? This stuff can be habit-forming.

    In the 20th century, ad agencies changed people’s buying patterns and political views using billboards, TV channel and newspapers. But all that is like a primitive blunderbuss compared to recommendation algorithms, feedback loops and targeted ads on social media and video apps.

    I lived through the days when web search for "Who won the last election" would just return you 10 pages that included the word "election". (If you're nostalgic for those days… you'll be happy to know that GNOME's desktop search engine still works like that today! :-) I can spot when apps try to 'nudge' me with dark patterns. But kids aren't born with that skill, and they aren't necessarily going to understand the nature of Tech Billionaire power unless we help them to see it. We need a framework to think critically and discuss the power that Meta, Amazon and Microsoft have over everyone's lives. Schools don't teach how digital tech actually works, but maybe a "personal data map" can be a useful teaching tool?

    By the way, here’s what my cobbled-together “Personal data map” looks like, taking into account security, what data is stored and who controls it. (With some fake data… I don’t want this blog post to be a “How to steal my identity” guide.)

    | Name | Risks | Sensitivity rating | Ethical rating | Location | Controller | First factor | Second factor | Credentials cached? | Data stored |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Bank account | Financial loss | 10 | 2 | Europe | Bank | Fingerprint | None | On phone | Money, transactions |
    | Instagram | Identity theft | 5 | -10 | USA | Meta | Password | Email | On phone | Posts, likes, replies, friends, views, time spent, locations, searches. |
    | Google Mail (sam@gmail.com) | Reset passwords | 9 | -5 | USA | Google | Password | None | Yes – cookies | Conversations, secrets |
    | Github | Impersonation | 3 | 3 | USA | Microsoft | Password | OTP | Yes – cookies | Credit card, projects, searches. |

    How is it going migrating off USA based cloud services?

    “The internet was always a project of US power”, says Paris Marx, a keynote at PublicSpaces conference, which I never heard of before.

    Closing my Amazon account took an unnecessary amount of steps, and it was sad to say goodbye to the list of 12 different addresses I called home at various times since 2006, but I don't miss it; I've been avoiding Amazon for years anyway. When I need English-language books, I get them from an Irish online bookstore named Kenny's. (Ireland, cleverly, did not leave the EU, so they can still ship books to Spain without incurring import taxes.)

    Dropbox took a while because I had years of important stuff in there. I actually don’t think they’re too bad of a company, and it was certainly quick to delete my account. (And my data… right? You guys did delete all my data?).

    I was using Dropbox to sync notes with the Joplin notes app, and switched to the paid Joplin Cloud option, which seems a nice way to support a useful open source project.

    I still needed a way to store sensitive data, and realized I have access to Protondrive. I can’t recommend that as a service because the parent company Proton AG don’t seem so serious about Linux support, but I got it to work thanks to some heroes who added a protondrive backend to rclone.

    Instead of using Google cloud services to share photos, and to avoid anything so primitive as an actual cable, I learned that KDE Connect can transfer files from my Android phone over to my laptop really neatly. KDE Connect is really good. On the desktop I use GSConnect, which integrates with GNOME Shell really well. I think I've not been so impressed by a volunteer-driven open source project in years. Thanks to everyone who worked on these great apps!

    I also migrated my VPS from a US-based host, Tornado VPS, to one in Europe. Tornado VPS (formerly prgmr.com) are a great company, but storing data in the USA doesn't seem like the way forwards.

    That’s about it so far. Feels a bit better.

    What’s next?

    I’m not sure whats next!

    I can’t leave Github and Gitlab.com, but my days of “Write some interesting new code and push it straight to Github” are long gone. I didn’t sign up to train somebody else’s LLM for free, and neither should you. (I’m still interested in sharing interesting code with nice people, of course, but let’s not make it so easy for Corporate America to take our stuff without credit or compensation. Bring back the “sneakernet“!)

    Leaving Meta platforms and dropping YouTube doesn’t feel directly useful. It’s like individually renouncing debit cards, or air travel: a lot of inconvenience for you, but the business owners don’t even notice. The important thing is to use the alternatives more. Hence why I still write a blog in 2025 and mostly read RSS feeds and the Fediverse. Gigs where I live are mostly only promoted on Instagram, but I’m sure that’s temporary.

    In the first quarter of 2025, rich people put more money into AI startups than everything else put together (see: Pivot to AI). Investors love a good bubble, but there’s also an element of power here.

    If programmers only know how to write code using Copilot, then whoever controls Microsoft has the power to decide what code we can and can't write. (Currently this seems limited to not using the word 'gender', but I can imagine a future where it catches you reverse-engineering proprietary software, jailbreaking locked-down devices, or trying to write a new BitTorrent client.)

    If everyone gets their facts from ChatGPT, then whoever controls OpenAI has the power to tweak everyone’s facts, an ability that is currently limited only to presidents of major world superpowers. If we let ourselves avoid critical thinking and rely on ChatGPT to generate answers to hard questions instead, which teachers say is very much exactly what’s happening in schools now… then what?

    Toluwaleke Ogundipe

    @toluwalekeog

    Hello GNOME and GSoC!

    I am delighted to announce that I am contributing to GNOME Crosswords as part of the Google Summer of Code 2025 program. My project primarily aims to add printing support to Crosswords, with some additional stretch goals. I am being mentored by Jonathan Blandford, Federico Mena Quintero, and Tanmay Patil.

    The Days Ahead

    During my internship, I will be refactoring the puzzle rendering code to support existing and printable use cases, adding clues to rendered puzzles, and integrating a print dialog into the game and editor with crossword-specific options. Additionally, I should implement an ipuz2pdf utility to render puzzles in the IPUZ format to PDF documents.

    Beyond the internship, I am glad to be a member of the GNOME community and look forward to so much more. In the coming weeks, I will be sharing updates about my GSoC project and other contributions to GNOME. If you are interested in my journey with GNOME and/or how I got into GSoC, I implore you to watch out for a much longer post coming soon.

    Appreciation

    Many thanks to Hans Petter Jansson, Federico Mena Quintero and Jonathan Blandford, who have all played major roles in my journey with GNOME and GSoC. 🙏❤

    Taking out the trash, or just sweeping it under the rug? A story of leftovers after removing files

    There are many things that we take for granted in this world, and one of them is undoubtedly the ability to clean up your files - imagine a world where you can't just throw away all those disk-space-hungry things you no longer find useful. Though that might sound impossible, it turns out some people have encountered a particularly interesting bug that resulted in Nautilus silently sweeping the Trash under the rug instead of emptying it. Since I was blessed to run into that issue myself, I decided to fix it and shed some light on the fun.

    Trash after emptying in Nautilus, are the files really gone?


    It all started with a 2009 Ubuntu Launchpad ticket, reported against Nautilus. The user found 70 GB worth of files in the ~/.local/share/Trash/expunged directory using a disk analyzer, even though they had emptied the Trash with the graphical interface. They did realize the offending files belonged to another user; however, they couldn't reproduce the issue easily at first. After all, when you try to move to trash a file or a directory not belonging to you, you are usually correctly informed that you don't have the necessary permissions, and perhaps even offered the option to permanently delete them instead. So what was so special about this case?

    First, let's get a better view of when we can and can't permanently delete files, which is what happens at the end of a successful trash-emptying operation. We'll focus only on the owners of the relevant files, since other factors, such as file read/write/execute permissions, can be adjusted freely by their owners, and that's what trash implementations will do for you. Here are the cases where you CAN delete files:

    - when a file is in a directory owned by you, you can always delete it
    - when a directory is in a directory owned by you and it's owned by you, you can obviously delete it
    - when a directory is in a directory owned by you but you don't own it, and it's empty, you can surprisingly delete it as well

    So to summarize, no matter who owns a file or a directory, if it's in a directory owned by you, you can get rid of it. There is one exception to this - the directory must be empty, otherwise you will not be able to remove either it or the files it contains. Which takes us to an analogous list of cases where you CANNOT delete files:

    - when a directory is in a directory owned by you but you don't own it, and it's not empty, you can't delete it.
    - when a file is in a directory NOT owned by you, you can't delete it
    - when a directory is in a directory NOT owned by you, you can't delete it either

    In contrast with removing files in a directory you own, when you are not the owner of the parent directory, you cannot delete any of the child files and directories, without exceptions. This is actually the reason for the one case where you can't remove something from a directory you own - to remove a non-empty directory, you first need to recursively delete all of the files and directories it contains, and you can't do that if the directory is not owned by you.

    Now let's look inside the trash can, or rather at how it functions. The reason for separating the permanent-deletion and trashing operations is obvious: users are expected to change their mind and be able to get their files back on a whim, so there needs to be a middle step. That's where the Trash specification comes in, providing a common way in which all "trash can" implementations should store, list, and restore trashed files, even across different filesystems - the Nautilus Trash feature is one of the possible implementations. Trashing actually means moving files to the $XDG_DATA_HOME/Trash/files directory and setting up some metadata to track their original location, so they can be restored if needed (a sketch of the resulting on-disk layout follows the lists below). Only when the user empties the trash are the files actually deleted. Since it's all about moving files, specifically outside their previous parent directory (i.e. to the Trash), let's look at the cases where you CAN move files:

    - when a file is in a directory owned by you, you can move it
    - when a directory is in a directory owned by you and you own it, you can obviously move it

    We can see that the only exception when moving files out of a directory you own is when the directory you're moving doesn't belong to you, in which case you will be correctly informed that you don't have permissions. In the remaining cases, users are able to move files and therefore trash them. Now, what about the cases where you CANNOT move files?

    - when a directory is in a directory owned by you but you don't own it, you can't move it
    - when a file is in a directory NOT owned by you, you can't move it either
    - when a directory is in a directory NOT owned by you, you still can't move it

    In those cases Nautilus will either not expose the ability to trash files or will tell the user about the error, and the system works well - even if moving them were possible, permanently deleting files in a directory not owned by you is not supported anyway.
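
    As a side note, here is roughly what the on-disk layout described by the Trash specification looks like after trashing a single file. The paths and date below are made up for illustration; the spec has the exact details:

    ~/.local/share/Trash/files/report.txt           <- the trashed file itself
    ~/.local/share/Trash/info/report.txt.trashinfo  <- metadata used to restore it

    # Contents of report.txt.trashinfo
    [Trash Info]
    Path=/home/user/Documents/report.txt
    DeletionDate=2025-06-10T12:34:56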

    So, where's the catch? What are we missing? We've got two different operations, moving (trashing) and deleting, that can succeed or fail under different circumstances. We need to find a situation where moving is possible but deleting is not, and such an overlap exists, obtained by chaining the following two rules:

    - when a directory A is in a directory owned by you and it's owned by you, you can obviously move it
    - when a directory B is in a directory A owned by you but you don't own it, and it's not empty, you can't delete it.

    So a simple way to reproduce the bug was found, precisely:

    mkdir -p test/root
    touch test/root/file
    sudo chown root:root test/root

    Afterwards, trashing and emptying in Nautilus or with the gio trash command will result in the files not being deleted, and instead left in ~/.local/share/Trash/expunged, which is used by gvfsd-trash as an intermediary during the emptying operation. The situations where this can happen are very rare, but they do exist - personally, I encountered it when manually cleaning container files created by podman in ~/.local/share/containers, which arguably I shouldn't be doing in the first place, and should rather leave up to podman itself. Nevertheless, it's still possible from the user's perspective, and should be handled and prevented correctly. That's exactly what was done: a ticket was submitted and moved to the appropriate place, which turned out to be GLib itself, and I submitted an MR that was merged - now both Nautilus and gio trash will recursively check for this case and prevent you from doing it. You can expect the fix in the next GLib release, 2.85.1.

    On an ending note, I want to thank GLib maintainer Philip Withnall, who walked me through the required changes and reviewed them, and to ask you one thing: is your ~/.local/share/Trash/expunged really empty? :)

    Lennart Poettering

    @mezcalero

    ASG! 2025 CfP Closes Tomorrow!

    The All Systems Go! 2025 Call for Participation Closes Tomorrow!

    The Call for Participation (CFP) for All Systems Go! 2025 will close tomorrow, on the 13th of June! We’d like to invite you to submit your proposals for consideration to the CFP submission site quickly!

    Andy Wingo

    @wingo

    whippet in guile hacklog: evacuation

    Good evening, hackfolk. A quick note this evening to record a waypoint in my efforts to improve Guile’s memory manager.

    So, I got Guile running on top of the Whippet API. This API can be implemented by a number of concrete garbage collector implementations. The implementation backed by the Boehm collector is fine, as expected. The implementation that uses the bump-pointer-allocation-into-holes strategy is less good. The minor reason is heap sizing heuristics; I still get it wrong about when to grow the heap and when not to do so. But the major reason is that non-moving Immix collectors appear to have pathological fragmentation characteristics.

    Fragmentation, for our purposes, is memory under the control of the GC which was free after the previous collection, but which the current cycle failed to use for allocation. I have the feeling that for the non-moving Immix-family collector implementations, fragmentation is much higher than for size-segregated freelist-based mark-sweep collectors. For an allocation of, say, 1024 bytes, the collector might have to scan over many smaller holes until it finds a hole that is big enough. This wastes free memory. Fragmented memory is not gone—it is still available for allocation!—but it won’t be allocatable until after the current cycle, when we visit all holes again. In Immix, fragmentation wastes allocatable memory during a cycle, hastening collection and causing more frequent whole-heap traversals.
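
    To make that concrete, here is a conceptual sketch of bump-pointer allocation into holes - my own illustration, not Whippet’s actual code: once a hole is too small for the current request, it is skipped, and whatever free space it still held becomes fragmentation until the next collection.

    #include <stddef.h>
    #include <stdint.h>

    struct hole { uint8_t *cursor; uint8_t *end; };

    struct allocator {
      struct hole *holes;    /* holes discovered by the previous collection */
      size_t n_holes;
      size_t current;        /* hole we are currently bumping a pointer into */
    };

    /* Conceptual sketch only. A skipped hole is not revisited this cycle,
     * so its leftover bytes stay unusable until the next whole-heap
     * traversal finds them again. */
    static uint8_t *
    alloc_into_holes (struct allocator *a, size_t size)
    {
      while (a->current < a->n_holes)
        {
          struct hole *h = &a->holes[a->current];
          if ((size_t) (h->end - h->cursor) >= size)
            {
              uint8_t *ret = h->cursor;
              h->cursor += size;
              return ret;
            }
          a->current++;   /* too small: abandon the rest of this hole */
        }
      return NULL;        /* out of holes: time to collect */
    }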

    The value proposition of Immix is that if there is too much fragmentation, you can just go into evacuating mode, and probably improve things. I still buy it. However I don’t think that non-moving Immix is a winner. I still need to do more science to know for sure. I need to fix Guile to support the stack-conservative, heap-precise version of the Immix-family collector which will allow for evacuation.

    So that’s where I’m at: a load of gnarly Guile refactors to allow for precise tracing of the heap. I probably have another couple weeks left until I can run some tests. Fingers crossed; we’ll see!

    Alireza Shabani

    @Revisto

    Why GNOME’s Translation Platform Is Called “Damned Lies”

    Damned Lies is the name of GNOME’s web application for managing localization (l10n) across its projects. But why is it named like this?

    Damned Lies about GNOME

    Screenshot of Gnome Damned Lies from Google search with the title: Damned Lies about GNOME

    On the About page of GNOME’s localization site, the only explanation given for the name Damned Lies is a link to a Wikipedia article called “Lies, damned lies, and statistics”.

    “Damned Lies” comes from the saying “Lies, damned lies, and statistics”, which is a 19th-century phrase used to describe the persuasive power of statistics to bolster weak arguments, as described on Wikipedia. One of its earliest known uses appeared in an 1891 letter to the National Observer, which categorised lies into three types:

    “Sir, —It has been wittily remarked that there are three kinds of falsehood: the first is a ‘fib,’ the second is a downright lie, and the third and most aggravated is statistics. It is on statistics and on the absence of statistics that the advocate of national pensions relies …”

    To find out more, I asked in GNOME’s i18n Matrix room, and Alexandre Franke helped a lot. He said:

    Stats are indeed lies, in many ways.
    Like if GNOME 48 gets 100% translated in your language on Damned Lies, it doesn’t mean the version of GNOME 48 you have installed on your system is 100% translated, because the former is a real time stat for the branch and the latter is a snapshot (tarball) at a specific time.
    So 48.1 gets released while the translation is at 99%, and then the translators complete the work, but you won’t get the missing translations until 48.2 gets released.
    Works the other way around: the translation is at 100% at the time of the release, but then there’s a freeze exception and the stats go 99% while the released version is at 100%.
    Or you are looking at an old version of GNOME for which there won’t be any new release, which wasn’t fully translated by the time of the latest release, but then a translator decided that they wanted to see 100% because the incomplete translation was not looking as nice as they’d like, and you end up with Damned Lies telling you that version of GNOME was fully translated when it never was and never will be.
    All that to say that translators need to learn to work smart, at the right time, on the right modules, and not focus on the stats.

    So there you have it: Damned Lies is a name that reminds us that numbers and statistics can be misleading, even on GNOME’s l10n web application.

    Varun R Mallya

    @varunrmallya

    The Design of Sysprof-eBPF

    Sysprof

    This is a tool that is used to profile applications on Linux. It tracks function calls and other events in the system to provide a detailed view of what is happening in the system. It is a powerful tool that can help developers optimize their applications and understand performance issues. Visit Sysprof for more information.

    sysprof-ebpf

    This is a project I am working on as part of GSoC 2025 mentored by Christian Hergert. The goal is to create a new backend for Sysprof that uses eBPF to collect profiling data. This will mostly serve as groundwork for the coming eBPF capabilities that will be added to Sysprof. This will hopefully also serve as the design documentation for anyone reading the code for Sysprof-eBPF in the future.

    Testing

    If you want to test out the current state of the code, you can do so by following these steps:

    1. Clone the repo and fetch my branch.
    2. Run the following script in the root of the project:
      #!/bin/bash
      set -euo pipefail
      GREEN="\033[0;32m"
      BLUE="\033[0;34m"
      RESET="\033[0m"
      
      prefix() {
          local tag="$1"
          while IFS= read -r line; do
              printf "%b[%s]%b %s\n" "$BLUE" "$tag" "$RESET" "$line"
          done
      }
      
      trap 'sudo pkill -f sysprofd; sudo pkill -f sysprof; exit 0' SIGINT SIGTERM
      
      meson setup build --reconfigure || true
      ninja -C build || exit 1
      sudo ninja -C build install || exit 1
      sudo systemctl restart polkit || exit 1
      
      # Run sysprofd and sysprof as root
      echo -e "${GREEN}Launching sysprofd and sysprof in parallel as root...${RESET}"
      
      sudo stdbuf -oL ./build/src/sysprofd/sysprofd 2>&1 | prefix "sysprofd" &
      sudo stdbuf -oL sysprof 2>&1 | prefix "sysprof" &
      
      wait
      

    Capabilities of Sysprof-eBPF

    sysprof-ebpf will be a subprocess created by sysprofd when the user selects the eBPF backend in the UI. I will be adding an options menu to the UI to choose which tracers to activate after I am done with the initial implementation. You can find my current dirty code here. As of writing this blog, the MR has the following capabilities:

    • A tiny toggle on the UI: a toggle on the UI that turns the eBPF backend on and off. This simple toggle starts or stops the sysprof-ebpf subprocess.
    • Full eBPF compilation pipeline: this is the core of the sysprof-ebpf project. It compiles eBPF programs from C code to BPF bytecode, loads them into the kernel, and attaches them to the appropriate tracepoints. This is done using the libbpf library, which provides a high-level API for working with eBPF programs. All of this happens at compile time, which means that the user does not need to have a compiler installed to run the eBPF backend. This will soon be made modular so that more eBPF programs can be added in the future. (A rough sketch of the open/load/attach flow follows the capabilities list below.)


    • cpu-stats tracer: tracks CPU usage of the whole system by reading the exit state of a struct after the kernel function that services /proc/stat requests has executed. I am working on ways to make this process less arbitrary by triggering it manually using BPF timers. In its current state it just prints the info to the console, but I will soon be adding the ability to store it directly in the syscap file.
    • sysprofd: my little program can now talk to sysprofd and get the file descriptor to write the data to. I also accept an event-fd in this program that allows the UI to stop this subprocess from running. I currently face a limitation here: I have no way of choosing which tracers to activate. I am working on getting tracer selection working by adding an options field to SysprofProxiedInstrument.
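
    To give a feel for that pipeline, here is a rough sketch of the open/load/attach flow using plain libbpf calls. The object and program names are placeholders, and this is not the actual sysprof-ebpf code (it assumes libbpf 1.x error conventions):

    #include <bpf/libbpf.h>

    int
    main (void)
    {
      struct bpf_object *obj;
      struct bpf_program *prog;
      struct bpf_link *link;

      /* Open a pre-compiled BPF object file (placeholder name). */
      obj = bpf_object__open_file ("cpu_stats.bpf.o", NULL);
      if (!obj)
        return 1;

      /* Verify the bytecode and load programs and maps into the kernel. */
      if (bpf_object__load (obj) != 0)
        goto fail;

      /* Find a program by name (placeholder) and attach it to whatever
       * tracepoint/kprobe its SEC() annotation declares. */
      prog = bpf_object__find_program_by_name (obj, "handle_cpu_stats");
      if (!prog)
        goto fail;

      link = bpf_program__attach (prog);
      if (!link)
        goto fail;

      /* ... poll ring buffers here and hand the samples over to sysprofd ... */

      bpf_link__destroy (link);
      bpf_object__close (obj);
      return 0;

    fail:
      bpf_object__close (obj);
      return 1;
    }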

    Follow up stuff

    • Adding a way to write to the syscap file: This will include adding a way to write the data collected by the tracers to the syscap file. I have already figured out how to do it, but it’ll require a bit of refactoring which I will be doing soon.
    • Adding more tracers: I will be adding more tracers to the sysprof-ebpf project. This will include tracers for memory usage, disk usage, and network usage. I will also be adding support for custom eBPF programs that can be written by the user if possible.
    • Adding UI: This will include adding options to choose which tracers to activate, and displaying the data collected by the tracers in a more readable format.

    Structure of sysprof-ebpf

    I planned on making this a single-threaded process initially, but it dawned on me that not all ring buffers will update at the same time and this would certainly block I/O during polling, so I figured I’ll just put each tracer in its own DexFuture to do the capture asynchronously. This has not been implemented as of writing this blog, though.


    The eBPF programs will follow this block diagram in general. I haven’t made the config hashmap part of it yet, and I think I’ll only add it if it’s required in the future. None of the currently planned features require this config map, but it will certainly be useful if I ever need to make the program cross-platform or cross-kernel. This will be one of the last things I will be implementing in the project.

    Conclusion

    I hope to make this a valuable addition to Sysprof. I will be writing more blogs as I make progress on the project. If you have any questions or suggestions, feel free to reach out to me on GitLab or Twitter. Also, I’d absolutely LOVE suggestions on how to improve the design of this project. I am still learning and I am open to any suggestions that can make this project better.

    Adrian Vovk

    @adrianvovk

    Introducing stronger dependencies on systemd

    Doesn’t GNOME already depend on systemd?

    Kinda… GNOME doesn’t have a formal, well-defined policy in place about systemd. The rule of thumb is that GNOME doesn’t strictly depend on systemd for critical desktop functionality, but individual features may break without it.

    GNOME does strongly depend on logind, systemd’s session and seat management service. GNOME first introduced support for logind in 2011, then in 2015 ConsoleKit support was removed and logind became a requirement. However, logind can exist in isolation from systemd: the modern elogind service does just that, and even back in 2015 there were alternatives available. Some distributors chose to patch ConsoleKit support back into GNOME. This way, GNOME can run in environments without systemd, including the BSDs.

    While GNOME can run with other init systems, most upstream GNOME developers are not testing GNOME in these situations. Our automated testing infrastructure (i.e. GNOME OS) doesn’t test any non-systemd codepaths. And many modules that have non-systemd codepaths do so with the expectation that someone else will maintain them and fix them when they break.

    What’s changing?

    GNOME is about to gain a few strong dependencies on systemd, and this will make running GNOME harder in environments that don’t have systemd available.

    Let’s start with the easier of the changes. GDM is gaining a dependency on systemd’s userdb infrastructure. GNOME and systemd do not support running more than one graphical session under the same user account, but GDM supports multi-seat configurations and Remote Login with RDP. This means that GDM may try to display multiple login screens at once, and thus multiple graphical sessions at once. At the moment, GDM relies on legacy behaviors and straight-up hacks to get this working, but this solution is incompatible with the modern dbus-broker and so we’re looking to clean this up. To that end, GDM now leverages systemd-userdb to dynamically allocate user accounts, and then runs each login screen as a unique user.

    In the future, we plan to further depend on userdb by dropping the AccountsService daemon, which was designed to be a stop-gap measure for the lack of a rich user database. 15 years later, this “temporary” solution is still in use. Now that systemd’s userdb enables rich user records, we can start work on replacing AccountsService.

    Next, the bigger change. Since GNOME 3.34, gnome-session uses the systemd user instance to start and manage the various GNOME session services. When systemd is unavailable, gnome-session falls back to a builtin service manager. This builtin service manager uses .desktop files to start up the various GNOME session services, and then monitors them for failure. This code was initially implemented for GNOME 2.24, and is starting to show its age. It has received very minimal attention in the 17 years since it was first written. Really, there’s no reason to keep maintaining a bespoke and somewhat primitive service manager when we have systemd at our disposal. The only reason this code hasn’t completely bit rotted is the fact that GDM’s aforementioned hacks break systemd and so we rely on the builtin service manager to launch the login screen.

    Well, that has now changed. The hacks in GDM are gone, and the login screen’s session is managed by systemd. This means that the builtin service manager will now be completely unused and untested. Moreover: we’d like to implement a session save/restore feature, but the builtin service manager interferes with that. For this reason, the code is being removed.

    So what should distros without systemd do?

    First, consider using GNOME with systemd. You’d be running in a configuration supported, endorsed, and understood by upstream. Failing that, though, you’ll need to implement replacements for more systemd components, similarly to what you have done with elogind and eudev.

    To help you out, I’ve put a temporary alternate code path into GDM that makes it possible to run GDM without an implementation of userdb. When compiled against elogind, instead of trying to allocate dynamic users, GDM will look up and use the gdm-greeter user for the first login screen it spawns, gdm-greeter-2 for the second, and gdm-greeter-N for the Nth. GDM will have similar behavior with the gnome-initial-setup[-N] users. You can statically allocate as many of these users as necessary, and GDM will work with them for now. It’s quite likely that this will be necessary for GNOME 49.

    Next: you’ll need to deal with the removal of gnome-session’s builtin service manager. If you don’t have a service manager running in the user session, you’ll need to get one. Just like system services, GNOME session services now install systemd unit files, and you’ll have to replace these unit files with your own service manager’s definitions. Then you’ll need to replace the “session leader” process: this is the main gnome-session binary that’s launched by GDM to kick off session startup. The upstream session leader just talks to systemd over D-Bus to upload its environment variables and then start a unit, so you’ll need to replace that with something that communicates with your service manager instead. Finally, you’ll probably need to replace “gnome-session-ctl”, which is a tiny helper binary that’s used to coordinate between the session leader, the main D-Bus service, and systemd. It is also quite likely that this will be needed for GNOME 49.
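
    For illustration, a session service’s user unit looks roughly like the snippet below; the unit name, binary path, and target names here are placeholders rather than the exact ones GNOME ships:

    # ~/.config/systemd/user/org-example-session-service.service  (hypothetical)
    [Unit]
    Description=Example GNOME session service
    PartOf=graphical-session.target

    [Service]
    Type=dbus
    BusName=org.example.SessionService
    ExecStart=/usr/libexec/example-session-service
    Restart=on-failure

    [Install]
    WantedBy=gnome-session.target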

    Finally: You should implement the necessary infrastructure for the userdb Varlink API to function. Once AccountsService is dropped and GNOME starts to depend more on userdb, the alternate code path will be removed from GDM. This will happen in some future GNOME release (50 or later). By then, you’ll need at the very least:

    • An implementation of systemd-userdbd’s io.systemd.Multiplexer
    • If you have NSS, a bridge that exposes NSS-defined users through the userdb API.
    • A bridge that exposes userdb-defined users through your libc’s native user lookup APIs (such as getpwent).
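
    For a sense of what such a bridge has to produce: a userdb record is a JSON object roughly along these lines (heavily trimmed, with hypothetical values; the authoritative format is systemd’s JSON User Records specification):

    {
      "userName": "gdm-greeter",
      "uid": 64010,
      "gid": 64010,
      "realName": "GNOME Display Manager greeter",
      "homeDirectory": "/run/gdm/greeter",
      "shell": "/usr/sbin/nologin",
      "disposition": "system"
    }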

    Apologies for the short timeline, but this blog post could only be published after I knew exactly how I’m splitting up gnome-session into separate launcher and main D-Bus service processes. Keep in mind that GNOME 48 will continue to receive security and bug fixes until GNOME 50 is released. Thus, if you cannot address these changes in time, you have the option of holding back the GNOME version. If you can’t do that, you might be able to get GNOME 49 running with gnome-session 48, though this is a configuration that won’t be tested or supported upstream, so your mileage may vary (much like running GNOME on other init systems). Still, patching that scenario to work may buy you more time to upgrade to gnome-session 49.

    And that should be all for now!

    Log Detective: Google Summer of Code 2025

    I'm glad to say that I'll participate again in GSoC, as a mentor. This year we will try to improve the RPM packaging workflow using AI, as part of the openSUSE project.

    So this summer I'll be mentoring an intern who will research how to integrate Log Detective with openSUSE tooling, improving the packager workflow for maintaining RPM packages.

    Log Detective

    Log Detective is an initiative created by the Fedora project, with the following goal:

    "Train an AI model to understand RPM build logs and explain the failure in simple words, with recommendations how to fix it. You won't need to open the logs at all."

    As a project promoted by Fedora, it's highly integrated with the build tools around that distribution and RPM packages. But RPM packages are used in a lot of different distributions, so this "expert" LLM will be helpful for everyone doing RPM - and everyone doing RPM should contribute to it.

    This is open source, so if, at openSUSE, we want to have something similar to improve OBS, we don't need to reimplement it; we can collaborate. And that's the idea of this GSoC project.

    We want to use Log Detective, but also contribute failures from openSUSE to improve the training and the AI. This should benefit openSUSE, but it will also benefit Fedora and all other RPM-based distributions.

    The intern

    The selected intern is Aazam Thakur. He studies at the University of Mumbai, India. He has experience using SUSE, having worked on SLES 15.6 for RPM packaging during a previous summer mentorship at the Open Mainframe Project.

    I'm sure that he will be able to achieve great things during these three months. The project looks very promising, and it's one of the areas where AI and LLMs will shine, because digging into logs is always difficult, and if we train an LLM with a lot of data it can be really useful for categorizing failures and giving a short description of what's happening.

    Tanmay Patil

    @txnmxy

    Acrostic Generator for GNOME Crossword Editor

    The experimental Acrostic Generator has finally landed inside the Crossword editor and is currently tagged as BETA.
    I’d classify this as one of the trickiest and most interesting projects I’ve worked on.
    Here’s how an acrostic puzzle loaded inside the Crossword editor looks:

    In my previous blog post (published about a year ago), I explained one part of the generator. Since then, there have been many improvements.
    I won’t go into detail about what an acrostic puzzle is, as I’ve covered that in multiple previous posts already.
    If you’re unfamiliar, please check out my earlier post for a brief idea.

    Coming to the Acrostic Generator, I’ll begin with an illustration showing the input and the corresponding output it generates. After that, I’ll walk through the implementation and the challenges I faced.

    Let’s take the quote: “CATS ALWAYS TAKE NAPS” whose author is a “CAT”.

    Here’s what the Acrostic Generator essentially does

    It generates answers like “CATSPAW”, “ALASKAN” and “TYES” which, as you can probably guess from the color coding, are made up of letters from the original quote.

    Core Components

    Before explaining how the Acrostic generator works, I want to briefly explain some of the key components involved.
    1. Word list
    The word list is an important part of Crosswords. It provides APIs to efficiently search for words. Refer to the documentation to understand how it works.
    2. IpuzCharset
    The performance of the Acrostic Generator heavily depends on IpuzCharset, which is essentially a HashMap that stores characters and their frequencies.
    We perform numerous ipuz_charset_add_text and ipuz_charset_remove_text operations on the QUOTE charset. I'd especially like to highlight ipuz_charset_remove_text, which used to be computationally very slow. Last year, the charset was rewritten in Rust by Federico. Compared to the earlier implementation in C using a GTree, the Rust version turned out to be considerably faster.
    Here’s Federico’s blog post on rustifying libipuz’s charset.

    Why is ipuz_charset_remove_text latency so important? Let's consider the following example:

    QUOTE: "CARNEGIE VISITED PRINCETON AND TOLD WILSON WHAT HIS YOUNG MEN NEEDED WAS NOT A LAW SCHOOL BUT A LAKE TO ROW ON IN ADDITION TO BEING A SPORT THAT BUILT CHARACTER AND WOULD LET THE UNDERGRADUATES RELAX ROWING WOULD KEEP THEM FROM PLAYING FOOTBALL A ROUGHNECK SPORT CARNEGIE DETESTED"
    SOURCE: "DAVID HALBERSTAM THE AMATEURS"

    In this case, the maximum number of ipuz_charset_remove_text operations required in the worst case would be:

    73205424239083486088110552395002236620343529838736721637033364389888000000

    …which is a lot.

    Terminology

    I’d also like you to take note of a few things.
    1. Answers and Clues refer to the same thing: they are the solutions generated by the Acrostic Generator. I’ll be using the two terms interchangeably throughout.
    2. We’ve set two constants in the engine: MIN_WORD_SIZE = 3 and MAX_WORD_SIZE = 20. These make sure the answers are neither too short nor too long, and help stop the engine from running indefinitely.
    3. Leading characters here are all the characters of the source. Each one is the first letter of the corresponding answer.

    Setting up things

    Before running the engine, we need to set up some data structures to store the results.

    typedef struct {
      /* Representing an answer */
      gunichar leading_char;
      const gchar *letters;
      guint word_length;

      /* Searching for the answer */
      gchar *filter;
      WordList *word_list;
      GArray *rand_offset;
    } ClueEntry;

    We use a ClueEntry structure to store the answer for each clue. It holds the leading character (from the source), the letters of the answer, the word length, and some additional word list information.
    Oh wait, why do we need the word length if we are already storing the letters of the answer?
    Let’s backtrack. Initially, I wrote the following brute-force recursive algorithm:

    void
    acrostic_generator_helper (AcrosticGenerator *self,
                               gchar              nth_source_char)
    {
      // Iterate from min_word_size to max_word_size for every answer
      for (word_length = min_word_size; word_length <= max_word_size; word_length++)
        {
          // Get the list of words starting with `nth_source_char`
          // and with length equal to word_length
          word_list = get_word_list (starting_letter = nth_source_char, word_length);

          // Iterate through the word list
          for (guint i = 0; i < word_list_get_n_items (word_list); i++)
            {
              word = word_list[i];

              // Check if the word is present in the quote charset
              if (ipuz_charset_remove_text (quote_charset, word))
                {
                  // If present, move on to the next source char
                  acrostic_generator_helper (self, nth_source_char + 1);
                }
            }
        }
    }

    The problem with this approach is that it is too slow. We were iterating from MIN_WORD_SIZE to MAX_WORD_SIZE and trying to find a solution for every possible size. Yes, this would work and eventually we’d find a solution, but it would take a lot of time. Also, many of the answers for the initial source characters would end up having length equal to MIN_WORD_SIZE.
    To quantify this: compared to the latest approach (which I’ll discuss shortly), we would be performing roughly 20 times the current number (7.3 × 10⁷³) of ipuz_charset_remove_text operations.

    To fix this, we added randomness by calculating and assigning random lengths to the clue answers before running the engine.
    To generate these random lengths, we break a number equal to the length of the quote string into n parts (where n is the number of source characters), each part having a random value. For the earlier example, the 18 letters of “CATS ALWAYS TAKE NAPS” are split into 3 parts such as 7 + 7 + 4, matching CATSPAW, ALASKAN and TYES.

    static gboolean
    generate_random_lengths (GArray *clues,
                             guint   number,
                             guint   min_word_size,
                             guint   max_word_size)
    {
      if ((clues->len * max_word_size) < number)
        return FALSE;

      guint sum = 0;

      for (guint i = 0; i < clues->len; i++)
        {
          ClueEntry *clue_entry;
          guint len;
          guint max_len = MAX (min_word_size,
                               MIN (max_word_size, number - sum));

          len = rand () % (max_len - min_word_size + 1) + min_word_size;
          sum += len;

          clue_entry = &(g_array_index (clues, ClueEntry, i));
          clue_entry->word_length = len;
        }

      return sum == number;
    }

    I have been continuously researching ways to generate random lengths that help the generator find answers as quickly as possible.
    What I concluded is that the Acrostic Generator performs best when the word lengths follow a right-skewed distribution.

    static void
    fill_clue_entries (GArray           *clues,
                       ClueScore        *candidates,
                       WordListResource *resource)
    {
      for (guint i = 0; i < clues->len; i++)
        {
          ClueEntry *clue_entry;

          clue_entry = &(g_array_index (clues, ClueEntry, i));

          // Generate a filter to get words whose starting letter is the nth char of the source string
          // For e.g. char = D, answer_len = 5
          // filter = "D????"
          clue_entry->filter = generate_individual_filter (clue_entry->leading_char,
                                                           clue_entry->word_length);

          // Load all words whose starting letter equals the nth char in the source string
          clue_entry->word_list = word_list_new ();
          word_list_set_resource (clue_entry->word_list, resource);
          word_list_set_filter (clue_entry->word_list, clue_entry->filter, WORD_LIST_MATCH);

          candidates[i].index = i;
          candidates[i].score = clue_entry->word_length;

          // Randomise the word list, which is sorted by default
          clue_entry->rand_offset = generate_random_lookup (word_list_get_n_items (clue_entry->word_list));
        }
    }

    Now that we have random lengths, we fill up the ClueEntry data structure.
    Here, we generate individual filters for each clue, which are used to set the filter on each word list. For instance, the filters for the example illustrated above are C??????, A?????? and T???.
    We also maintain a separate word list for each clue entry. Note that we do not store the huge word list individually for every clue. Instead, each word list object refers to the same memory-mapped word list resource.
    Additionally, each clue entry contains a random offsets array, which stores a randomized order of indices. We use this to traverse the filtered word list in a random order. This randomness helps fix the problem where many answers for the initial source characters would otherwise end up with length equal to MIN_WORD_SIZE.
    The advantage of pre-calculating all of this before running the engine is that the main engine loop only performs the heavy operations: ipuz_charset_remove_text and ipuz_charset_add_text.

    static gboolean
    acrostic_generator_helper (AcrosticGenerator  *self,
                               GArray             *clues,
                               guint               index,
                               IpuzCharsetBuilder *remaining_letters,
                               ClueScore          *candidates)
    {
      ClueEntry *clue_entry;

      if (index == clues->len)
        return TRUE;

      clue_entry = &(g_array_index (clues, ClueEntry, candidates[index].index));

      for (guint i = 0; i < word_list_get_n_items (clue_entry->word_list); i++)
        {
          const gchar *word;

          g_atomic_int_inc (self->count);

          // traverse based on random indices
          word = word_list_get_word (clue_entry->word_list,
                                     g_array_index (clue_entry->rand_offset, gushort, i));

          clue_entry->letters = word;

          if (ipuz_charset_builder_remove_text (remaining_letters, word + 1))
            {
              if (!add_or_skip_word (self, word) &&
                  acrostic_generator_helper (self, clues, index + 1, remaining_letters, candidates))
                return TRUE;

              clean_up_word (self, word);
              ipuz_charset_builder_add_text (remaining_letters, word + 1);
              clue_entry->letters = NULL;
            }
        }

      clue_entry->letters = NULL;

      return FALSE;
    }

    The approach is quite simple. As you can see in the code above, we perform ipuz_charset_remove_text many times, so it was crucial to make that operation efficient.
    When all the characters in the charset have been used/removed and the index becomes equal to the number of clues, it means we have found a solution. At this point, we return, store the answers in an array, and continue our search for new answers until we receive a stop signal.
    We also maintain a skip list that is updated whenever we find a clue answer and is cleaned up during backtracking. This makes sure there are no duplicate answers in the answers list.

    Performance Improvements

    I compared the performance of the acrostic generator using the current Rust charset implementation against the previous C GTree implementation. I have used the following quote and source strings with the same RNG seed for both implementations:

    QUOTE: "To be yourself in a world that is constantly trying to make you something else is the greatest accomplishment."
    SOURCE: "TBYIWTCTMYSEGA"
    Results:
    +-----------------+-------------------+
    | Implementation  | Time taken (secs) |
    +-----------------+-------------------+
    | C GTree         | 74.39             |
    | Rust HashMap    | 17.85             |
    +-----------------+-------------------+

    The Rust HashMap implementation is nearly 4 times faster than the original C GTree version for the same random seed and traversal order.

    I have also been testing the generator to find small performance improvements. Here are some of them:

    1. When searching, looking for answers to clues with longer word lengths first helps find solutions faster.
    2. We switched to using nohash_hasher for the hashmap because we are essentially storing {char: frequency} pairs. Trace reports showed that significant time and resources were spent computing hashes using Rust’s default SipHash implementation, which was unnecessary. MR
    3. Inside ipuz_charset_remove_text, instead of cloning the original data, we use a rollback mechanism that tracks all modifications and rolls back in case of failure (a conceptual sketch of this idea follows the list). MR
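
    As a conceptual illustration of that rollback idea (written in C for readability - libipuz’s real implementation is in Rust and works on Unicode characters and their counts): remove characters one by one, and if any character is missing, re-add what was already removed and report failure.

    #include <glib.h>

    /* Conceptual sketch only, not libipuz code: assumes uppercase A-Z input. */
    static gboolean
    remove_text_with_rollback (guint counts[26], const gchar *word)
    {
      for (const gchar *p = word; *p != '\0'; p++)
        {
          guint idx = (guint) (*p - 'A');

          if (counts[idx] == 0)
            {
              /* Roll back the characters we already removed. */
              for (const gchar *q = word; q != p; q++)
                counts[(guint) (*q - 'A')]++;
              return FALSE;
            }

          counts[idx]--;
        }

      return TRUE;
    }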

    I also remember running the generator on some quote and source input back in the early days. It ran continuously for four hours and still couldn’t find a single solution. We even overflowed the gint counter that tracks the number of words tried. Now, the same generator can return 10 solutions in under 10 seconds. We’ve come a long way! 😀

    Crossword Editor

    Now that I’ve covered the engine, I’ll talk about the UI part.
    We started off by sketching potential designs on paper. @jrb came up with a good design and we decided to move forward with it, making a few tweaks to it.

    First, we needed to display a list of the generated answers.

    For this, I implemented my own list model where each item stores a string for the answer and a boolean indicating whether the user wants to apply that answer.
    To allow the user to run and stop the generator and then apply answers, we reused the compact version of the original autofill component used in normal crosswords. The answer list gets updated whenever the slider is moved.

    We have tried to reuse as much code as possible for acrostics, keeping most of the code common between acrostics and normal crosswords.
    Here’s a quick demo of the acrostic editor in action:

    We also maintain a cute little histogram on the right side of the bottom panel to summarize clue lengths.

    You can also try out the Acrostic Generator using our CLI app, which I originally wrote to quickly test the engine. To use the binary, you’ll need to build Crosswords Editor locally. Example usage:

    $ ./_build/src/acrostic-generator -q "For most of history, Anonymous was a woman. I would venture to guess that Anon, who wrote so many poems without signing them, was often a woman. And it is for this reason that I would implore women to write all the more" -s "Virginia wolf"
    Starting acrostic generator. Press Ctrl+C to cancel.
    [ VASOTOMY ] [ IMFROMMISSOURI ] [ ROMANIANMONETARYUNIT ] [ GREATFEATSOFSTRENGTH ] [ ITHOUGHTWEHADADEAL ] [ NEWSSHOW ] [ INSTITUTION ] [ AWAYWITHWORDS ] [ WOOLSORTERSPNEUMONIA ] [ ONEWOMANSHOWS ] [ LOWMANONTHETOTEMPOLE ] [ FLOWOUT ]
    [ VALOROUSNESS ] [ IMMUNOSUPPRESSOR ] [ RIGHTEOUSINDIGNATION ] [ GATEWAYTOTHEWEST ] [ IWANTYOUTOWANTME ] [ NEWTONSLAWOFMOTION ] [ IMTOOOLDFORTHISSHIT ] [ ANYONEWHOHADAHEART ] [ WOWMOMENT ] [ OMERS ] [ LAWUNTOHIMSELF ] [ FORMATWAR ]

    Plans for the future

    To begin with, we’d really like to improve the overall design of the Acrostic Editor and make it more user friendly. Let us know if you have any design ideas - we’d love to hear your suggestions!
    I’ve also been thinking about different algorithms for generating answers in the Acrostic Generator. One idea is to use a divide-and-conquer approach, where we recursively split the quote until we find a set of sub-quotes that satisfy all the constraints on the answers.

    To conclude, here’s an acrostic for you all to solve, created using the Acrostic Editor! You can load the file in Crosswords and start playing.

    Thanks for reading!