November 16, 2018

The performance impact of zeroing raw memory

When you create a new variable (in C, C++ and other languages) or allocate a block of memory, the value is undefined; that is, it is whatever bit pattern happened to be in the raw memory location at the time. This is faster than initialising all memory (which languages such as Java do), but it is also unsafe and can lead to bugs, such as use-after-free issues.

There have been several attempts to change this behaviour and require that compilers initialize all memory to a known value, usually zero. This is always rejected with a statement like "that would cause a performance degradation of unknown size" and the issue is dropped. This is not very scientific, so let's see if we can get at least some sort of measurement of it.

The method

The overhead of uninitialized variables is actually fairly difficult to measure. Compilers don't provide a flag to initialize all variables to zero, so measuring this would require compiler hacking, which is a ton of work. An alternative would be to write a clang-tidy plugin and add a default initialization to zero for all variables that don't have an initialization clause already. This is also fairly involved, so let's not do that.

The impact on dynamic memory allocations turns out to be fairly straightforward to measure. All we need to do is build a shared library with custom overrides for malloc, free and memalign, and LD_PRELOAD it into any process we want to measure. The sample code can be found in this GitHub repo.
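
The core of such an interposer looks roughly like the sketch below. This is a simplified illustration of the technique, not the code from that repo; the real library also overrides memalign, and a production interposer has to guard against dlsym() itself allocating memory.

/* zeromalloc.c -- build with: gcc -shared -fPIC -o libzeromalloc.so zeromalloc.c -ldl */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <malloc.h>
#include <stddef.h>
#include <string.h>

static void *(*real_malloc)(size_t) = NULL;
static void (*real_free)(void *) = NULL;

void *malloc(size_t size)
{
    if (!real_malloc)
        real_malloc = (void *(*)(size_t)) dlsym(RTLD_NEXT, "malloc");
    void *p = real_malloc(size);
    if (p)
        memset(p, 0, size);                      /* zero on allocation */
    return p;
}

void free(void *p)
{
    if (!real_free)
        real_free = (void (*)(void *)) dlsym(RTLD_NEXT, "free");
    if (p)
        memset(p, 0, malloc_usable_size(p));     /* zero on free, to catch use-after-free */
    real_free(p);
}

Measuring a program is then just a matter of running it with LD_PRELOAD pointing at the library, for example LD_PRELOAD=./libzeromalloc.so ./your_program.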

Measurements

We did two measurements. The first one was running Python's pystone benchmark. There was no noticeable difference between zero initialization and no initialization.

The second measurement consisted of compiling a simple C++ iostream helloworld application with optimizations enabled. The results for this experiment were a lot more interesting. Zeroing all memory on malloc made the program 2% slower. Zeroing the memory on both allocation and free (to catch use-after-free bugs) made the program 3.6% slower.

A memory zeroing implementation inside malloc itself would probably have a smaller overhead, because there are cases where you don't need to explicitly overwrite the memory; for example, when the allocation is done behind the scenes via mmap/munmap, the kernel already hands out the pages zeroed.

November 15, 2018

The devil makes work for idle processes

TLDR: in Endless OS, we switched the IO scheduler from CFQ to BFQ, and set the IO priority of the threads doing Flatpak downloads, installs and upgrades to “idle”; this makes the interactive performance of the system while doing Flatpak operations indistinguishable from when the system is idle.

At Endless, we’ve been vaguely aware for a while that trying to use your computer while installing or updating apps is a bit painful, particularly on spinning-disk systems, because of the sheer volume of IO performed by the installation/update process. This was never particularly high priority, since app installations are user-initiated, and until recently, so were app updates.

But, we found that users often never updated their installed apps, so earlier this year, in Endless OS 3.3.10, we introduced automatic background app updates to help users take advantage of “new features and bug fixes” (in the generic sense you so often see in iOS/Android app release notes). This fixed the problem of users getting “stuck” on old app versions, but made the previous problem worse: now, your computer becomes essentially unusable at arbitrary times when app updates happen. It was particularly bad when users unboxed a system with an older version of Endless OS (and hence a hundred or so older apps) pre-installed, received an automatic OS update, then rebooted into a system that’s unusable until all those apps have been updated.

At first, I looked for logic errors in (our versions of) GNOME Software and Flatpak that might cause unnecessary IO during app updates, without success. We concluded that heavy IO load when updating a large app or runtime is largely unavoidable,1 so I switched to looking at whether we could mitigate this by tweaking the IO scheduler.

The BFQ IO scheduler is supposed to automatically prioritize interactive workloads over bulk workload, which is pretty much exactly what we’re trying to do. The specific example its developers give is watching a video, without hiccups, while copying a huge file in the background. I spent some time with the BFQ developers’ own suite of benchmarks on two test systems: a Lenovo Yoga 900 (with an Intel i5-6200U @ 2.30GHz and a consumer-grade M.2 SSD) and an Endless Mission One (an older system with a Celeron CPU and a laptop-class spinning disk). Neither JP nor I were able to reproduce any interesting results for the dropped-frames benchmark: with either BFQ or CFQ (the previous default IO scheduler), the Yoga essentially never dropped frames, whereas the IO workloads immediately rendered the Mission totally unusable. I had rather more success with a benchmark which measures the time to launch LibreOffice:

  • On the Yoga, when the system was idle, the mean launch time went from 2.838s under CFQ to 2.98s under BFQ (a slight regression), but with heavy background IO, the mean launch time went from 16s with CFQ (standard deviation 0.11) to 3s with BFQ (standard deviation 0.51).
  • On the Mission, with modest background IO, the mean launch time was 108 seconds under BFQ, which sounds awful; but under CFQ, I gave up waiting for LibreOffice to start after 8 minutes!

Emboldened by these results, I went on to look at how the same “time to launch LibreOffice” benchmark fared when the background IO load is “installing and uninstalling a Lollipop Flatpak bundle in a loop”. I also looked at using ionice -c3 to set the IO priority of the install/uninstall loop to idle, which does what its name suggests: BFQ essentially will never serve IO at the idle priority if there is IO pending at any higher priority. You can see some raw data or look at some extended discussion copied from our internal issue tracker to a Flatpak pull request, but I suggest just looking at this chart:

What does it all mean?

  • The coloured bars represent median launch time in seconds for LibreOffice, across 15/30 trials for Yoga/Mission respectively.
  • The black whiskers show the minimum and maximum launch times observed. I know this should have been a box-and-whiskers or violin plot, but I realised too late that multitime does not give enough information to draw those.
  • “unloaded” refers to the performance when the system is otherwise idle.
  • “shell-loop” refers to running while true; do flatpak install -y /home/wjt/Downloads/org.gnome.Lollypop.flatpak; flatpak uninstall -y org.gnome.Lollypop/x86_64/stable; done; “long-lived” refers to performing the same operations with the Flatpak API in a long-lived process. I tried this because I understood that BFQ gives new processes a slight performance boost, but on a real system the GNOME Software and Flatpak system helper processes are long-lived. As you can see, the behaviour under BFQ is actually the other way around in the worst case, and identical for CFQ and in the median case.
  • The “ionice-” prefix means the Flatpak operation was run under ionice -c3.
  • Switching from CFQ to BFQ makes the worst case a little worse at the default IO priority, but the median case much better.
  • Setting the IO priority of the Flatpak process(es) to idle erases that worst-case regression under BFQ, and dramatically improves the median case under CFQ.
  • In combination, the time to launch LibreOffice while performing Flatpak operations in the background on the Mission went from 24 seconds to 12 seconds by switching to BFQ & setting the IO priority to idle.

So, by switching to BFQ and setting IO priorities appropriately, the system’s interactive performance while performing background updates is now essentially indistinguishable from when the system is idle. To implement this in practice, Rob McQueen wrote some patches to set the IO priority of the Flatpak system helper and GNOME Software’s worker threads to idle (both changes are upstream) and changed Endless OS’s default IO scheduler to BFQ where available. As Matthias put it on #flatpak when shown this chart and that first link: “not bad for a 1-line change”.

Of course, this means apps take a bit longer to install, even on a mostly-idle system. No, I don’t have numbers on how big the impact is: this work happened months ago and it’s taken me this long to write it up because I couldn’t find the time to collect more data. But my colleague Umang is working on eliminating up to half of the disk IO performed during Flatpak installations so that should more than make up for it!

  1. modulo Umang’s work mentioned in the coda

On the track for 3.32

It happens sneakily, but there is more going on on the Tracker front than the occasional fallout. Yesterday 2.2.0-alpha1 was released, containing some notable changes.

On and off during the last year, I’ve been working on a massive rework of the SPARQL parser. The old parser was fairly solid, but hard to extend for some of the syntax in the SPARQL 1.1 spec. After multiple attempts and failures at implementing property paths, I convinced myself this was the way forward.

The main difference is that the previous parser was more of a serializer to SQL; only minimal state was preserved across the operation. The new parser constructs an expression tree, so that nodes may be shuffled or reevaluated. This allows some sweet things:

  • Property paths are a nice way to write more idiomatic SPARQL, and most property path operators are within reach now. There’s currently support for sequence paths:

    # Get all files in my homedir
    SELECT ?elem {
      ?elem nfo:belongsToContainer/nie:url 'file:///home/carlos'
    }
    


    And inverse paths:

    # Get all files in my homedir by inverting
    # the child to container relation
    SELECT ?elem {
      ?homedir nie:url 'file:///home/carlos' ;
               ^nfo:belongsToContainer ?elem
    }
    

    There are harder ones like + and * that will require recursive selects, and there’s the negation (!) operator which is not possible to implement yet.

  • We now have prepared statements! A TrackerSparqlStatement object was introduced, capable of holding a query with parameters which can be set/replaced prior to execution.

    conn = tracker_sparql_connection_get (NULL, NULL);
    stmt = tracker_sparql_connection_query_statement (conn,
                                                      "SELECT ?u { ?u fts:match ~term }",
                                                      NULL, NULL);
    
    tracker_sparql_statement_bind_string (stmt, "term", search_term);
    cursor = tracker_sparql_statement_execute (stmt, NULL, NULL);
    

    This is long-sought protection against injection. The object is cacheable and can service multiple cursors asynchronously, so it will also be an improvement for frequent queries.

  • More concise SQL is generated at places, which brings slight improvements on SQLite query planning.

This also got the ideas churning towards future plans, the trend being a generic triple store that is as SPARQL 1.1 capable as possible. There are also some ideas about better data isolation for Flatpak and sandboxes in general (seeing that the currently supported approach didn’t catch on). Those will eventually happen in this or the following cycles, but I’ll reserve that for another blog post.

An eye was kept on memory usage too (mostly unrealized ideas from the performance hackfest earlier this year), tracker-store has been made to automatically shut down when unneeded (ideally most of the time, since it just takes care of updates and the unruly apps that use the bus connection), and tracker-miner-fs took over the functionality of tracker-miner-apps. That’s two fewer processes in your default session.

In general, we’re on the way to an exciting release, and there’s more to come!

November 14, 2018

Counting Code in GNOME Settings

I've been spending a bit of time recently working on GNOME Settings. One part of this has been bringing some of the older panel code up to modern standards, one of which is making use of GtkBuilder templates.

I wondered if any of these changes would show in the stats, so I wrote a program to analyse each branch in the git repository and break down the code between C and GtkBuilder. The results were graphed in Google Sheets:



This is just the user accounts panel, which shows some of the reduction in C code and increase in GtkBuilder data:



Here's the breakdown of which panels make up the codebase:



I don't think this draws any major conclusions, but is still interesting to see. Of note:
  • Some of the changes made in 3.28 did reduce the total amount of code! But it was quickly gobbled up by the new Thunderbolt panel.
  • Network and Printers are the dominant panels - look at all that code!
  • I ignored empty lines in the files in case differing coding styles would make some panels look bigger or smaller. It didn't seem to make a significant difference.
  • You can see a reduction in C code looking at individual panels that have been updated, but overall it gets lost in the total amount of code.
I'll have another look in a few cycles when more changes have landed (I'm working on a new sound panel at the moment).

New plymouth theme for flickerfree boot

As discussed in my previous blog post one of my TODO list items for plymouth is creating a new plymouth theme.

Since the transition to plymouth is not entirely smooth, plymouth by default will wait 5 seconds (counted from starting the kernel) before showing itself, so that on systems which boot in under 5 seconds it never shows. As can be seen in this video, this leads to a very non-smooth experience when the boot takes, say, 7 seconds: plymouth then shows only briefly, leading to a kind of "flash" effect.

Another problem with the 5 second wait is that, now that we do not show GRUB, the user is looking at the firmware's bootsplash not only for the often long firmware initialization time, but also for the 5 seconds plymouth waits on top of that, making it look as if nothing is happening.

To fix this I've been working on a new plymouth theme which draws a spinner over the firmware boot splash, eliminating the ugly transition from the firmware boot splash to plymouth. This also allows removing the show-delay, so that we provide feedback that something is happening as soon as plymouth starts.

Firmware being firmware, getting this done right was somewhat harder than I expected, but I have a first "draft" of a new theme doing this now. I've created some videos showing 2 different systems booting the new theme. Note that the videos with disk encryption were paused while I entered my passphrase, so there is a bit of a jump in them because of this.

I've built a test version of plymouth for Fedora 29. To give this a try, download all the rpm files from here except the .src.rpm and -devel files, and then, from a directory with all those files in it, run:

sudo rpm -Uvh plymouth*.rpm

Since plymouth is part of your initrd, you also need to regenerate your initrd:

sudo dracut -f /boot/initramfs-$(uname -r).img $(uname -r)

This regenerates the initrd for the kernel you are currently running, so if you've installed a kernel update and have not rebooted since then you may not get the new theme when rebooting. In this case rerun the dracut command after rebooting.

Note if you've previously followed my instructions to test flickerfree boot, then you need to remove "plymouth.splash_delay=20" from your kernel commandline, since we now no longer want to have a splash-delay.

Now reboot and you should get the new spinner on firmware-boot-splash theme, with Fedora branding.

If you give this a try and the new theme somehow does not look correct, please mail me at hdegoede@redhat.com. If you mail me about the theme not displaying correctly, please attach the /run/plymouth.log file which this test-build generates, and a video of how the theme misbehaves would be great too.

I still need to discuss the idea of using a new theme incorporating the firmware boot splash with the GNOME design team so this is all subject to change.

Degree final work about ISO/IEC 29110

Cover of «Creation of artifacts for adoption of ISO/IEC 29110 standards» blueprint

I would really like to write more in this blog. There are things I haven’t talked enough about, like the SuperSEC and GUADEC conferences, some announcements for 2019, and some activities in Wikipedia (especially in the Wikiproyecto-Almería and my first steps in the amazing world of SPARQL), which are less important but which I really enjoy.

But now I want to keep a record of significant advances in the university degree I’m finishing these months. I decided to finish a pending course, with special interest in the required degree final work, to work on things I’ve been interested in since 2003 but never had the opportunity to focus on deeply enough to study, learn and write some useful (I hope) tools. And it’s being fun :-)

29110 Galore at http://29110.olea.org

So now I can say the project blueprint has been approved by the university. It’s named «Creation of artifacts for adoption of ISO/IEC 29110 standards» (document in Spanish, sorry) and the goal is to produce a set of open source artifacts for the adoption of the 29110 family of standards, which focus on a light software engineering methodology suitable for adoption by very small entities (VSEs). At the moment my main target is to work on «Part 5-4: Agile software development guidelines», currently under development by WG24, using the EPF Composer tool.

As a working tool I’m making a (half-baked and maybe temporary) website to keep a record of related materials at http://29110.olea.org.

Hope to announce related news in the next weeks.

GSoC Mentors Summit 2018

Sorry for the long wait! In this post I’ll be talking about my visit to the GSoC Mentors Summit 2018.

I represented GNOME, sadly alone because the other selected mentor didn’t get the US visa in time. This was my first trip out of India and I couldn’t plan it properly1, so I went there for just the two conference days.

Meets

The [un]conference was a great place to learn about other open source organizations. I was able to meet a lot of people, and I remember talking to folks from HPX, scilab, WorldBrain2, Xwiki, coala, OSGeo3, Java PathFinder, PollyLabs4, KDE, fossasia, Oppia, MuseScore, CiviCRM, and meeting at least two other Abhinavs at the summit.

GSoC mentors from over 40 countries!

Sessions

A wide range of topics were covered in the sessions - funding, licensing, documentation, and in general how to improve open source software. Some sessions were more GSoC-oriented, like how to retain mentors, select a good project, communicate better, and improve the program. I would like to highlight a few sessions that piqued my interest:

  • Google Season of Docs

    GSoD is a new program (still being brainstormed) in which technical writers would be able to participate and work on documentation. I believe this would be a great initiative and can help newcomers ramp up much faster!

  • Open Source Personal Assistants

    Mario from fossasia held this session demonstrating the SUSI AI. It was really refreshing to see an open source personal assistant working so well. I wonder why linux distros do not ship with a working personal assistant while Windows and macOS are continuously deepening their AI integration to aid users.

    I was able to find a SUSI AI client for linux (written in Gtk). We can definitely look into making its integration easier into GNOME - say how about asking the user during the first run if they would like to enable a personal assistant?

  • GSoC Feedback

    This was my favourite session. The notes for this session can be found here. I could relate the most to the following points given by other mentors:

    • Mentors have difficulties seeing feedback from evaluations
    • Google should consider raising the lowest bar on the stipend

Lightning Talks

In my opinion, lightning talks are the quickest way to spark curiosity. I got to know about many projects that I wouldn’t have otherwise. It was amazing to see the influence of open source on health, robotics, humanities, and science!

Most presentations talked about their students’ summer work, but I especially loved the one by openSUSE which was about encouraging those students that had their proposals rejected. The slides for all these talks can be found here. Unfortunately, I did not present as I was too scared :(

Chocolates

The chocolate table was a delight! It was the first and definitely the last time in my life that I had a chocolate made of 100% cocoa. There was one chocolate that smelled like bubblegum, another one that had spices in it, and one that made popping sounds when eaten! Open source doesn’t only bring diversity in people but in chocolates as well :P

Thanks

GSoC Mentor Summit 2018 was truly an amazing experience! I am grateful to GNOME for selecting me and to Google for running the wonderful GSoC program. I would also like to give special thanks to Alexander Mikhaylenko for helping out in mentoring even though he didn’t sign up for it.


  1. read poor work-life balance due to my day job 

  2. Oliver - thanks for the seaweed 

  3. Jeff - I watched the movie you suggested 

  4. these folks do magic with LLVM 

November 13, 2018

Adding an optional install duration to LVFS firmware

We’ve just added an optional feature to fwupd and the LVFS that some people might find useful: The firmware update process can now tell the user how long in seconds the update is going to take.

This means that users can know that a dock update might take 5 minutes, and so they start the update process before they go to lunch. A UEFI update will require multiple reboots and will take 45 minutes to apply, and so the user will only apply the update at the end of the day rather than losing access to their computer for nearly an hour.

If you want to use this feature there are currently three ways to assign the duration to the update:

  • Changing the value on the LVFS admin console — the component update panel now has an extra input field to enter the duration in seconds
  • Adding a new attribute to the <release> element, for instance:
    <release version="3.0.2" date="2018-11-09" install_duration="120">
    
  • Adding a ‘quirk’ to fwupd, for instance:
    [DeviceInstanceId=USB\VID_1234&PID_5678]
    InstallDuration = 40
    
For updates requiring a reboot the install duration should include the time to POST the system both before and after the update has run, but it can be approximate. Only users running very new versions of fwupd and gnome-software will be shown the install duration, and older versions will be unchanged as the new property will just be ignored. It’s therefore safe to include in all versions of firmware without adding a dependency on a specific fwupd version.

Jesień Linuksowa 2018

Last weekend I participated in the conference Jesień Linuksowa 2018 in Ustroń, which is located in Poland near the Czech and Slovak borders. It was my first time in a country with so many tragic historical experiences.

On the other hand, I was impressed by the community members and the organization of the event. We celebrated another edition of Linux Autumn in the hotel Gwarek, and my post-event wrap-up covers seven basic points:

Organizers

This time I was accompanied by my friend Ana Garcia, who is a student at the University of Edinburgh, and the members of the organization were supportive and kind to us the whole time. We felt a warm welcome from the moment we arrived at midnight in the middle of the fog. They helped us with the talks and workshops we offered related to parallelization.

We met new Linux friends: Dominik, Rafal, Filip, Mateusz Kita, and Matej from Red Hat.

Topics

Interesting topics were presented, including Ansible, catching bugs, packaging for Fedora and the innovations in systemd. Pictured here are the talks by Marcin, Matej, and Zbigniew:

My participation

I gave an introductory talk on parallel computers and a workshop on using OpenMP scheduling directives such as static, dynamic and guided. Thanks to Ana for being my co-speaker this time. My talk was set for the morning around 10 am and we did the workshop after lunch; we are glad that all the participants successfully completed the experience.
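
For context, here is a small C sketch (not taken from the workshop materials) of the three OpenMP scheduling clauses mentioned above:

#include <omp.h>
#include <stdio.h>

int main(void)
{
    const int N = 16;

    /* static: iterations are split into equal chunks up front */
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < N; i++)
        printf("static  iteration %2d on thread %d\n", i, omp_get_thread_num());

    /* dynamic: threads grab chunks as they finish, good for uneven workloads */
    #pragma omp parallel for schedule(dynamic, 2)
    for (int i = 0; i < N; i++)
        printf("dynamic iteration %2d on thread %d\n", i, omp_get_thread_num());

    /* guided: chunk sizes start large and shrink as work runs out */
    #pragma omp parallel for schedule(guided)
    for (int i = 0; i < N; i++)
        printf("guided  iteration %2d on thread %d\n", i, omp_get_thread_num());

    return 0;
}

Compile with gcc -fopenmp; the three loops produce the same output lines but distribute the iterations across threads differently.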

Fedora 29 party

We celebrated with a delicious Fedora 29 cake! The new features in Fedora 29 were explained, as well as how to create a FAS account and how the badge system works. I am glad that the Fedora Ambassadors from Peru and Poland, as well as Matej from Red Hat Brno, got to meet.

Food

The organizers arranged comfort food more than three times a day; we had extra cookies and water between the main meals, plus the cake of course! We tasted delicious Polish food!

Special Thanks

To Filip Kłębczyk, not only for being the general organizer but also for giving me one of the best gifts ever, and to Rafał Lużyński for the invitation and the tour around the city.

The uniqueness 

The retro gaming area was a geeky-genius-fantastic idea 🙂 There was a unique collection of games from the '80s and we were able to enjoy that treasure.

November 12, 2018

2018-11-12 Monday

  • Mail; admin; lunch, status report, sync. with Andras.

Why the Linux console has sixteen colors (SeaGL)

At the 2018 Seattle GNU/Linux Conference after-party, I gave a lightning talk about why the Linux console has only sixteen colors. Lightning talks are short, fun topics. I enjoyed giving the lightning talk, and the audience seemed into it, too. So I thought I'd share my lightning talk here. These are my slides in PNG format, with notes added:
Also, my entire presentation is under the CC-BY:
When you bring up a terminal window, or boot Linux into plain text mode, maybe you've wondered why the Linux console only has sixteen colors. No matter how awesome your graphics card, you only get these sixteen colors for text:
You can have eight background colors, and sixteen foreground colors. But why is that?
Remember that Linux is a PC operating system, so you have to go back to the early days of the IBM PC, although the rules are the same for any Unix terminal.

The origins go back to CGA, the Color/Graphics Adapter from the earlier PC-compatible computers. This was a step up from the plain monochrome displays; as the name implies, monochrome could only display black or white. CGA could display a limited range of colors.

CGA supported mixing red (R), green (G) and blue (B) colors. In its simplest form, RGB is either "on" or "off." In this case, you can mix the RGB colors in 2×2×2=8 ways. So RGB=100 is Red, and RGB=010 is Green, and RGB=001 is Blue. And you can mix colors, like RGB=011 is cyan. This simple table shows the binary and decimal representations of RGB:
To double the number of colors, CGA added an extra bit called the "intensifier" bit. With the intensifier bit set, the red, green and blue colors would be set to their maximum values. Without the intensifier bit, each RGB value would be set to a "midrange" intensity. Let's represent that intensifier bit as an extra 1 or 0 in the binary color representation, as iRGB:
That means 0100 gives "red" and 1100 (with intensifier bit set) results in "bright red." Also, 0010 is "green" and 1010 is "bright green." And 0000 is "black," but 1000 is "bright black."

Oh wait, there's a problem. "Black" and "bright black" are the same color, because there's no RGB value to intensify.

But we can solve that! CGA actually implemented a modified iRGB definition, using two intermediate values, at about one-third and two-thirds intensity. Most "normal" mode (0–7) colors used values at the two-thirds intensity. Translating from "normal" mode to "bright" mode, convert zero values to the one-third intensity, and two-thirds values to full intensity.

With that, you would like to represent all the colors in the rainbow: red, orange, yellow, green, blue, indigo, and violet. You can sort of fake indigo and violet with the different "blue" shades.

Oops, we don't have orange! But we can fix that by assigning 0110 "yellow" a one-third green value, which turns the color into orange, although most people saw it as brown.
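
To make the mapping concrete, here is a rough C sketch of the iRGB rules described so far; the exact 0x55 (one-third) and 0xAA (two-thirds) channel values are the usual CGA convention and are my assumption, not something quoted from the slides:

#include <stdio.h>

struct rgb { unsigned char r, g, b; };

static struct rgb irgb_to_rgb(unsigned irgb)
{
    struct rgb c;
    /* normal mode: each set bit gives a two-thirds intensity channel */
    c.r = (irgb & 0x4) ? 0xAA : 0x00;
    c.g = (irgb & 0x2) ? 0xAA : 0x00;
    c.b = (irgb & 0x1) ? 0xAA : 0x00;
    if (irgb & 0x8) {   /* intensifier bit: zero -> one-third, two-thirds -> full */
        c.r += 0x55;
        c.g += 0x55;
        c.b += 0x55;
    }
    if (irgb == 0x6)    /* the special case: yellow gets one-third green, i.e. brown */
        c.g = 0x55;
    return c;
}

int main(void)
{
    for (unsigned i = 0; i < 16; i++) {
        struct rgb c = irgb_to_rgb(i);
        printf("%2u: #%02X%02X%02X\n", i, c.r, c.g, c.b);
    }
    return 0;
}

Running it prints #000000 for black, #555555 for "bright black" (gray), #AA5500 for the brown special case, and so on up to #FFFFFF for bright white.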

Here's another iteration of the color table, using 0x0 to 0xF for the color range, with 0x5 and 0xA as the one-third and two-thirds intensities, respectively:
And that's how the Linux console got sixteen text colors! That's also why you'll often see "brown" labeled "yellow" in some references, because it started out as plain "yellow" before the intensifier bit. Similarly, you may also see "gray" represented as "bright black," because "gray" is really "black" with the intensifier bit set.

So let's look at the bit patterns. You have four bits for the foreground color, 0000 black to 1111 bright white:
And you have three bits for the background color, from 000 black to 111 white:
But why not four bits for the background color? That's because the final bit is reserved for a special attribute. With this attribute set, your text could blink on and off. The "Blink" bit was encoded at the end of the foreground and background bit-pattern:
That's a full byte! And that's why the Linux console has only sixteen colors; the Linux console inherits text mode colors from CGA, which encodes colors a full byte at a time.

It turns out the rules are the same for other Unix terminals, which also used eight bits to represent colors. But on other terminals, 0110 really was yellow, not orange or brown.

Usability Testing in Open Source Software (SeaGL)

I recently attended the 2018 Seattle GNU/Linux Conference, where I gave a presentation about usability testing in open source software. I promised to share my presentation deck. Here are my slides in PNG format, with notes added:
Also, my entire presentation is under the CC-BY:
I've been involved in Free/open source software since 1993, but recently I developed an interest in usability testing in open source software. During a usability testing class in my Master's program in Scientific and Technical Communication (MS) I studied the usability of GNOME and Firefox. Later, I did a deeper examination of the usability of open source software, focusing on GNOME, as part of my Master's capstone. (“Usability Themes in Open Source Software,” 2014.)

Since then, I've joined the GNOME Design Team where I help with usability testing.

I also (sometimes) teach usability at the University of Minnesota. (CSCI 4609 Processes, Programming, and Languages: Usability of Open Source Software.)
I’ve worked with others on usability testing since then. I have mentored in Outreachy, formerly the Outreach Program for Women. Sanskriti, Gina, Renata, Ciarrai, Diana were all interns in usability testing. Allan and Jakub from the GNOME Design Team co-mentored as advisers.
What do we mean when we talk about “usability”? You can find some formal definitions of usability that talk about the Learnability, Efficiency, Memorability, Errors, and Satisfaction. But I find it helps to have a “walking around” definition of usability.

A great way to summarize usability is to remember that real people are busy people, and they just need to get their stuff done. So a program will have good usability if real people can do real tasks in a realistic amount of time.

User eXperience (UX) is technically not the same as usability. Where usability is about real people doing real tasks in a reasonable amount of time, UX is more about the emotional connection or emotional response the user has when using the software.

You can test usability in different ways. I find the formal usability test and prototype test work well. You can also indirectly examine usability, such as using an expert to do a heuristic evaluation, or using questionnaires. But really, nothing can replace watching a real person trying to use your software; you will learn a lot just by observing others.
People think it's hard to do usability testing, but it's actually easy to do a usability test on your own. You don’t need a fancy usability lab or any professional experience. You just need to want to make your program easier for other people to use.

If you’re starting from scratch, you really have three steps to do a formal usability test:

1. Consider who are your users. Write this down as a short paragraph for each kind of user for your software. Make it a realistic fiction. These are your Personas. With personas, you can make design decisions that always benefit the user. “If we change __ then that will make it easier for users like Jane.” “If we add __ then that will help people like Steve.”

2. For each persona, write a brief statement about why that user might use the software to do their tasks. There are different ways that a user might use the software, but just jot down one way. This is a Use Scenario. With scenarios, you can better understand the circumstances when people use the software.

3. Now take a step back and think about the personas and scenarios. Write down some realistic tasks that real people would do with the software. Make each one stand on its own. These are scenario tasks, and they make up your actual usability test. Where you should write personas and scenarios in third-person (“__ does this..”) you should write scenario tasks in second-person (“you do this..”) Each scenario task should set up a brief context, then ask the tester to do something specific. For example:

You don’t have your glasses with you, so it’s hard to see the text on the screen. Make the text bigger so you can read it more easily.

The challenge in scenario tasks is not to accidentally give hints for what the tester should do. Avoid using the same words and phrases from menus. Don’t be too exact about what the tester should do - instead, describe the goal, and let the tester find their own path. Remember that there may be more than one way to do something.

The key in doing a usability test is to make it iterative. Do a usability test, analyze your results, then make changes to the design based on what you learned in the test. Then do another test. But how many testers do you need?
You don’t need many testers to do a usability test if you do it iteratively. Doing a usability test with five testers is enough to learn about the usability problems and make tweaks to the interface. At five testers, you’ve uncovered more than 80% of usability problems, assuming most testers can uncover about 31% of issues (typical).

But you may need more testers for other kinds of usability tests. “Only five” works well for traditional/formal usability tests. For a prototype test, you might need more testers.

But five is enough for most tests.
If every tester can uncover about 31% of usability problems, then note what happens when you have one, five, and ten testers in a usability test. You can cover 31% with one tester. With more testers, you have overlap in some areas, but you cover more ground with each tester. At five testers, that’s pretty good coverage. At ten testers, you don’t have considerably better coverage, just more overlap.

I made this sample graphic to demonstrate. The single red square covers 31% of the grey square's area (in the same way a tester can usually uncover about 31% of the usability problems, if you've designed your test well). Compare five and ten testers. You don't get significantly more coverage at ten testers than at five testers. You get some extra coverage, and more overlap, but that's a lot of extra effort for not a lot of extra value. Five is really all you need.
Let me show you a usability test that I did. Actually, I did two of them. This was part of my work on my Master’s degree. My capstone was Usability Themes in Open Source Software. Hall, James. (2014). Usability Themes in Open Source Software. University of Minnesota.

I wrote up the results for each test as separate articles for Linux Journal: “The Usability of GNOME” (December, 2014) and “It’s about the user: Usability in open source software” (December, 2013).
I like to show results in a “heat map.” A heat map is just a convenient way to show test results. Scenario tasks are in rows and each tester is a separate column.

For each cell (a tester doing a task) I use a color to show how easy or how difficult that task was for the tester. I use this scale:

Green if the tester easily completed the task. For example, if the tester seemed to know exactly what to do, what menu item to activate or which icon to click, you would code the task in green for that tester.

Yellow if the tester experienced some (but not too much) difficulty in the task.

Orange if the tester had some trouble in the task. For example, if the tester had to poke around the menus for a while to find the right option, or had to hunt through toolbars and selection lists to locate the appropriate icon, you would code the task in orange for that tester.

Red if the tester experienced severe difficulty in completing the task.

Black if the tester was unable to figure out how to complete the task, and gave up.

There are some “hot” rows here, which show tasks that were difficult for testers: setting the font and colors in gedit, and setting a bookmark in Nautilus. Also searching for a file in Nautilus was a bit challenging, too. So my test recommended that the GNOME Design Team focus on these four to make them easier to do.
This next one is the heat map from my capstone project.

Note that I tried to do a lot here. You need to be realistic in your time. Try for about an hour (that’s what I did) but make sure your testers have enough time. The gray “o” in each cell is where we didn’t have enough time to do that task.

You can see some “hot rows” here too: setting the font in gedit, and renaming a folder in Nautilus. And changing all instances of some words in gedit, and installing a program in Software, and maybe creating two notes in Notes.
Most of the interns did a traditional usability test. So that’s what Sanskriti did here:
Sanskriti did a usability test that was similar to mine, so we could measure changes. She had a slightly different color map here, using two tones for green. But you can see a few hot rows: changing the default colors in gedit, adding photos to an album in Photos, and setting a photo as a desktop wallpaper from Photos. Also some warm rows in creating notes in Notes, and creating a new album in Photos.
Gina was from my second cycle in Outreachy, and she did another traditional usability test:
You can see some hot rows in Gina's test: bookmarking a location in Nautilus, adding a special character (checkmark) using Characters and Evince, and saving the location (bookmark) in Evince. Also some warm rows: changing years in Calendar, and saving changes in Evince. Maybe searching for a file in Nautilus.

Gina did such great work that we co-authored an article in FOSS Force: "A Usability Study of GNOME" (March, 2016).
In the next cycle of Outreachy, we had three interns: Renata, Ciarrai and Diana. Renata did a traditional usability test:
In Renata’s heat map, you can see some hot rows: creating an album in Photos, adding a new calendar in Calendar, and connecting to an online account in Calendar. And maybe deleting a photo in Photos and setting a photo as a wallpaper image in Photos. Some issues in searching for a date in Calendar, and creating an event in Calendar.

See also our article in Linux Voice Magazine: "GNOME Usability Testing" (November, 2016, Issue 32).
Ciarrai did a prototype test for a future design change to the GNOME Settings application:
In the future Settings, the Design Team thought they’d have a list of categories down the side. Clicking on a category shows you the settings for that category. Here’s a mock-up for Wi-Fi in the new Settings. You can see the list of other categories down the left side:
Remember the “only five” slide from a while back? That’s only for traditional/formal usability tests. For a prototype test, we didn’t think five was enough, so Ciarrai did ten testers.

For Ciarrai’s heat map, we used slightly different colors because the tester wasn’t actually using the software. They were pointing to a paper printout. Here, green indicates the tester knew exactly what to point to, and red indicates they pointed to the wrong one. Or for some tasks that had sub-panels, orange indicates they got to the first panel, and failed to get to the second setting.

You can see some hot rows, indicating where people didn’t know what category would have the Settings option they were looking for: Monitor colors, and Screen lock time. Also Time zone, Default email client, and maybe Bluetooth and Mute notifications.
Other open source projects have adopted the same usability test methods to examine usability. Debian did a usability test of GNOME. Here’s their test: (*original)
They had more general “goals” for testers, called “missions.” Similar to scenario tasks, the missions had a more broad goal that provided some flexibility for the tester. But not very different from scenario tasks.

You can see some hot rows here: temporary files and change default video program in Settings, and installing/removing packages in Package Management. Also some issues in creating a bookmark in Nautilus, and adding/removing other clocks in Settings.
If you want more information, please visit my blog or email me.
I hope this helps you to do usability testing on your own programs. Usability is not hard! Anyone can do it!

More fun with libxmlb

A few days ago I cut the 0.1.4 release of libxmlb, which is significant because it includes the last three features I needed in gnome-software to achieve the same search results as appstream-glib.

The first is something most users of database libraries will be familiar with: Bound variables. The idea is you prepare a query which is parsed into opcodes, and then at a later time you assign one of the ? opcode values to an actual integer or string. This is much faster as you do not have to re-parse the predicate, and also means you avoid failing in incomprehensible ways if the user searches for nonsense like ]@attr. Borrowing from SQL, the syntax should be familiar:

g_autoptr(XbQuery) query = xb_query_new (silo, "components/component/id[text()=?]/..", &error);
xb_query_bind_str (query, 0, "gimp.desktop", &error);

The second feature makes the caller jump through some hoops, but hoops that make things faster: Indexed queries. As it might be apparent to some, libxmlb stores all the text in a big deduplicated string table after the tree structure is defined. That means if you do <component component="component">component</component> then we only store just one string! When we actually set up an object to check a specific node for a predicate (for instance, text()='fubar') we actually do strcmp("fubar", "component") internally, which in most cases is very fast…

Unless you do it 10 million times…

Using indexed strings tells the XbMachine processing the predicate to first check if fubar exists in the string table, and if it doesn’t, the predicate can’t possibly match and is skipped. If it does exist, we know the integer position in the string table, and so when we compare the strings we can just check two uint32_t’s which is quite a lot faster, especially on ARM for some reason. In the case of fwupd, it is searching for a specific GUID when returning hardware results. Using an indexed query takes the per-device query time from 3.17ms to about 0.33ms – which if you have a large number of connected updatable devices makes a big difference to the user experience. As using the indexed queries can have a negative impact and requires extra code it is probably only useful in a handful of cases. In case you do need this feature, this is the code you would use:

xb_silo_query_build_index (silo, "component/id", NULL, &error); // the cdata
xb_silo_query_build_index (silo, "component", "type", &error); // the @type attr
g_autoptr(XbNode) n = xb_silo_query_first (silo, "component/id[text()=$'test.firmware']", &error);

The indexing being denoted by $'' rather than the normal pair of single quotes. If there is something more standard to denote this kind of thing, please let me know and I’ll switch to that instead.

The third feature is stemming, which means you can search for “gaming mouse” and still get results that mention games, game and Gaming. This is also how you can search for words like Kongreßstraße which matches kongressstrasse. In an ideal world stemming would be computationally free, but if we are comparing millions of records each call to libstemmer sure adds up. Adding the stem() XPath operator took a few minutes, but making it usable took up a whole weekend.

The query we wanted to run would be of the form id[text()~=stem('?')], but the stem() would be called millions of times on the very same string for each comparison. To fix this, and to make other XPath operators faster, I implemented an opcode rewriting optimisation pass in the XbMachine parser. This means if you call lower-case(text())==lower-case('GIMP.DESKTOP') we only call the UTF-8 strlower function N+1 times, rather than 2N times. For lower-case() the performance increase is slight, but for stem() it actually makes the feature usable in gnome-software. The opcode rewriting optimisation pass is kinda dumb in how it works (“let’s try all combinations!”), but works with all of the registered methods, and makes all existing queries faster for almost free.

One common question I’ve had is if libxmlb is supposed to obsolete appstream-glib, and the answer is “it depends”. If you’re creating or building AppStream metadata, or performing any AppStream-specific validation then stick to the appstream-glib or appstream-builder libraries. If you just want to read AppStream metadata you can use either, but if you can stomach a binary blob of rewritten metadata stored somewhere, libxmlb is going to be a couple of orders of magnitude faster and use a ton less memory.

If you’re thinking of using libxmlb in your project send me an email and I’m happy to add more documentation where required. At the moment libxmlb does everything I need for fwupd and gnome-software and so apart from bugfixes I think it’s basically “done”, which should make my manager somewhat happier. Comments welcome.

The GNOME (and WebKitGTK+) Networking Stack

WebKit currently has four network backends:

  • CoreFoundation (used by macOS and iOS, and thus Safari)
  • CFNet (used by iTunes on Windows… I think only iTunes?)
  • cURL (used by most Windows applications, also PlayStation)
  • libsoup (used by WebKitGTK+ and WPE WebKit)

One guess which of those we’re going to be talking about in this post. Yeah, of course, libsoup! If you’re not familiar with libsoup, it’s the GNOME HTTP library. Why is it called libsoup? Because before it was an HTTP library, it was a SOAP library. And apparently somebody thought that when Mexican people say “soap,” it often sounds like “soup,” and also thought that this was somehow both funny and a good basis for naming a software library. You can’t make this stuff up.

Anyway, libsoup is built on top of GIO’s sockets APIs. Did you know that GIO has Object wrappers for BSD sockets? Well it does. If you fancy lower-level APIs, create a GSocket and have a field day with it. Want something a bit more convenient? Use GSocketClient to create a GSocketConnection connected to a GNetworkAddress. Pretty straightforward. Everything parallels normal BSD sockets, but the API is nice and modern and GObject, and that’s really all there is to know about it. So when you point WebKitGTK+ at an HTTP address, libsoup is using those APIs behind the scenes to handle connection establishment. (We’re glossing over details like “actually implementing HTTP” here. Trust me, libsoup does that too.)
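
If you have not seen these APIs before, here is a minimal sketch (mine, not from libsoup or WebKit) of what a GSocketClient connection looks like; the host name and the toy request are just placeholders:

#include <gio/gio.h>

int main (void)
{
  g_autoptr(GError) error = NULL;
  g_autoptr(GSocketClient) client = g_socket_client_new ();

  /* Ask GIO to wrap the connection in TLS behind the scenes; on desktop
   * Linux this is what pulls in the glib-networking extension point
   * discussed below. */
  g_socket_client_set_tls (client, TRUE);

  g_autoptr(GSocketConnection) conn =
      g_socket_client_connect_to_host (client, "example.org", 443, NULL, &error);
  if (conn == NULL)
    {
      g_printerr ("Connection failed: %s\n", error->message);
      return 1;
    }

  /* GSocketConnection is a GIOStream, so you just read and write to it
   * without caring whether TLS is underneath. */
  GOutputStream *out = g_io_stream_get_output_stream (G_IO_STREAM (conn));
  g_output_stream_write_all (out, "HEAD / HTTP/1.0\r\n\r\n", 19,
                             NULL, NULL, &error);
  return 0;
}

The g_socket_client_set_tls() call is the hook that ties into the TLS discussion below: flip it on and the same code hands you a TLS-wrapped stream, provided a TLS backend is installed.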

Things get more fun when you want to load an HTTPS address, since we have to add TLS to the picture, and we can’t have TLS code in GIO or GLib due to this little thing called “copyright law.” See, there are basically three major libraries used to implement TLS on Linux, and they all have problems:

  • OpenSSL is by far the most popular, but it’s, hm, shall we say technically non-spectacular. There are forks, but the forks have problems too (ask me about BoringSSL!), so forget about them. The copyright problem here is that the OpenSSL license is incompatible with the GPL. (Boring details: Red Hat waves away this problem by declaring OpenSSL a system library qualifying for the GPL’s system library exception. Debian has declared the opposite, so Red Hat’s choice doesn’t gain you anything if you care about Debian users. The OpenSSL developers are trying to relicense to the Apache license to fix this, but this process is taking forever, and the Apache license is still incompatible with GPLv2, so this would make it impossible to use GPLv2+ software except under the terms of GPLv3+. Yada yada details.) So if you are writing a library that needs to be used by GPL applications, like say GLib or libsoup or WebKit, then it would behoove you to not use OpenSSL.
  • GnuTLS is my favorite from a technical standpoint. Its license is LGPLv2+, which is unproblematic everywhere, but some of its dependencies are licensed LGPLv3+, and that’s uncomfortable for many embedded systems vendors, since LGPLv3+ contains some provisions that make it difficult to deny you your freedom to modify the LGPLv3+ software. So if you rely on embedded systems vendors to fund the development of your library, like say libsoup or WebKit, then you’re really going to want to avoid GnuTLS.
  • NSS is used by Firefox. I don’t know as much about it, because it’s not as popular. I get the impression that it’s more designed for the needs of Firefox than as a Linux system library, but it’s available, and it works, and it has no license problems.

So naturally GLib uses NSS to avoid the license issues of OpenSSL and GnuTLS, right?

Haha no, it uses a dynamically-loadable extension point system to allow you to pick your choice of OpenSSL or GnuTLS! (Support for NSS was started but never finished.) This is OK because embedded systems vendors don’t use GPL applications and have no problems with OpenSSL, while desktop Linux users don’t produce tivoized embedded systems and have no problems with LGPLv3. So if you’re using desktop Linux and point WebKitGTK+ at an HTTPS address, then GLib is going to load a GIO extension point called glib-networking, which implements all of GIO’s TLS APIs — notably GTlsConnection and GTlsCertificate — using GnuTLS. But if you’re building an embedded system, you simply don’t build or install glib-networking, and instead build a different GIO extension point called glib-openssl, and libsoup will create GTlsConnection and GTlsCertificate objects based on OpenSSL instead. Nice! And if you’re Centricular and you’re building GStreamer for Windows, you can use yet another GIO extension point, glib-schannel, for your native Windows TLS goodness, all hidden behind GTlsConnection so that GStreamer (or whatever application you’re writing) doesn’t have to know about SChannel or OpenSSL or GnuTLS or any of that sad complexity.

Now you know why the TLS extension point system exists in GIO. Software licenses! And you should not be surprised to learn that direct use of any of these crypto libraries is banned in libsoup and WebKit: we have to cater to both embedded system developers and to GPL-licensed applications. All TLS library use is hidden behind the GTlsConnection API, which is really quite nice to use because it inherits from GIOStream. You ask for a TLS connection, have it handed to you, and then read and write to it without having to deal with any of the crypto details.

As a recap, the layering here is: WebKit -> libsoup -> GIO (GLib) -> glib-networking (or glib-openssl or glib-schannel).

So when Epiphany fails to load a webpage, and you’re looking at a TLS-related error, glib-networking is probably to blame. If it’s an HTTP-related error, the fault most likely lies in libsoup. Same for any other GNOME applications that are having connectivity troubles: they all use the same network stack. And there you have it!

P.S. The glib-openssl maintainers are helping merge glib-openssl into glib-networking, such that glib-networking will offer a choice of GnuTLS or OpenSSL and obsoleting glib-openssl. This is still a work in progress. glib-schannel will be next!

P.P.S. libcurl also gives you multiple choices of TLS backend, but makes you choose which at build time, whereas with GIO extension points it’s actually possible to choose at runtime from the selection of installed extension points. The libcurl approach is fine in theory, but creates some weird problems, e.g. different backends with different bugs are used on different distributions. On Fedora, it used to use NSS, but now uses OpenSSL, which is fine for Fedora, but would be a license problem elsewhere. Debian actually builds several different backends and gives you a choice, unlike everywhere else. I digress.

November 11, 2018

Compile any C++ program 10× faster with this one weird trick!

tl/dr: Is it unity builds? Yes.

I would like to know more!

At work I have to compile a large code base from scratch fairly often. One of its components is a 3D graphics library. It takes around 2 minutes 15 seconds to compile using an 8 core i7. After a while I got bored with this and converted the system to use a unity build. In all simplicity, what that means is that if you have a target consisting of files foo.cpp, bar.cpp, baz.cpp and so on, you create a cpp file with the following contents:

#include<foo.cpp>
#include<bar.cpp>
#include<baz.cpp>

Then you tell the build system to build that file instead of the individual ones. With this method the compile time dropped to 1m 50s, which does not seem like that much of a gain, but the compilation used only one CPU core. The remaining 7 are free for other work. If the project had 8 targets of roughly the same size, building them one after the other would take 18 minutes. With unity builds they would take the exact same 1m 50s assuming perfect parallelisation, which happens fairly often in practice.

Wait, what? How is this even?

The main reason that C++ compiles slowly has to do with headers. Merely including a few headers from the standard library brings in tens or hundreds of thousands of lines of code that must be parsed, verified, converted to an AST and code-generated in every translation unit. This is extremely wasteful, especially given that most of that work is not used but is instead thrown away.

With a unity build, every #include is processed only once, regardless of how many times it is used in the component source files.

Basically this amounts to a caching problem, which is one of the two really hard problems in computer science in addition to naming things and off by one errors.

Why is this not used by everybody then?

There are several downsides and problems. You can't take any old codebase and compile it as a unity build. The first blocker is that things inside source files leak into other ones, since they are all textually included one after the other. For example, if you have two files and each of them declares a static function with the same name, it will lead to name clashes and a compilation failure, as in the sketch below. Similarly, things like using namespace std declarations leak from one file to another, causing havoc.
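
A contrived sketch of that first problem (the file names are invented here):

/* foo.cpp */
static int helper(void) { return 1; }  /* private to foo.cpp in a normal build */

/* bar.cpp */
static int helper(void) { return 2; }  /* fine on its own... */

/* unity.cpp */
#include "foo.cpp"
#include "bar.cpp"   /* error: redefinition of 'helper' */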

But perhaps the biggest problem is that every recompilation takes the same time. An incremental rebuild where one file has changed takes a few seconds or so, whereas a unity build takes the full 1m 50s every time. This is a major roadblock to iterative development and the main reason unity builds are not widely used.

A possible workflow with Meson

For simplicity let's assume that we have a project that builds and works with unity builds. Meson has an automatic unity build file generator that can be enabled by setting the value of the unity build option.

This solves the basic build problem but not the incremental one. However usually you'd develop only one target (be it a library, executable or module) and want to build only that one incrementally and everything else as a unity build. This can be done by editing the build definition of the target in question and adding an override option:

executable(..., override_options : ['unity=false'])

Once you are done you can remove the override from the build file to return everything back to normal.

How does this tie in with C++ modules?

Directly? Not in any way really. However one of the stated advantages of modules has always been faster build times. There are a few module implementations but there is very little public data on how they behave with real world codebases. During a CppCon presentation on modules Google's Chandler Carruth mentioned that in Google's code base modules resulted in 30% build time reduction.

It was not mentioned whether Google uses unity builds internally but they almost certainly don't (based on things such as this bug report on Bazel). If we assume that theirs is the fastest existing "classical" C++ build mechanism, which it probably is, the conclusion is that it is an order of magnitude slower than a unity build on the same source files. A similar performance gap would probably not be tolerated in any other part of the C++ ecosystem.

The shoemaker's children go barefoot.

2018-11-11 Sunday

  • All Saints, played bass; pizza lunch, dropped H. and M. off to the Remembrance parade & service with Scouts. Watched Man in the High Castle, and a large nativity silhouette with J. Worked on mending J's old Galaxy S3 for M. in the evening - struggled to find compatible PIT files that would not be rejected; grim. Rather unclear why Android's Download mode doesn't let you read the flash contents smoothly.

November 10, 2018

Purism Fractal sponsorship

I’m happy to announce that Purism agreed to sponsor my work on Fractal for the next couple of weeks. I will polish the room history and drastically improve the UX/UI around scrolling, loading messages etc. which will make Fractal feel much nicer. As part of this I will also clean up and refactor the current code. On my agenda is the following:

Smooth history loading

Loading old messages in the history is currently a bit jarring, because the scroll position isn’t preserved when new messages come in. I’d like to address this by loading messages outside the viewport, making it so that the user isn’t even aware that more messages are being loaded most of the time. This is a crucial part of why modern messaging apps feel so nice.

Faster message rendering

There are some inefficiencies in how messages are currently rendered, which make showing messages not as smooth as it could be. Fixing this could improve the experience of sending/receiving messages significantly.

 “New messages” behavior

  • Re-add a “New messages” divider, since it was lost as part of the big history refactor I recently completed
  • Scroll to last seen message when opening app instead of most recent message
  • Fix bugs in current behavior and make sure the divider always shows up

Add day label

Add a label with the day/date at the beginning of every new day, like other messaging apps do.

Developer Center Initiative – Meeting Summary 10th November 2018

The Developer Center Initiative is an attempt to reboot the developer center  based on a new modern platform. For more information, see my previous  blog posts.

It’s been two months since the last Developer Center meeting. In light of that I called in for a meeting last week to get a status of things. We discussed three items:

  • Changing the current state of the developer center.
  • The need for a physical meetup to reinforce spirits and get bulk work started.
  • Pending bugs and feature merges in hotdoc relevant for the developer center.

Developer Center State

Thibault currently holds a branch for gnome-devel-docs. The branch contains the old GNOME Developer docs ported to markdown. To ensure that no duplicate work happens between gnome-devel-docs master and the branch, the next step is to announce to the relevant mailing lists that further contributions to the developer docs should happen in the gnome-devel-docs branch. Even better would be to have the branch pushed to master. The markdown port is not synchronized in any way with the mallard docs in master, so any changes to the mallard docs would require re-synchronization, and that’s why editing the ported markdown docs in the branch is a no-go for now.

Pushing the branch does imply that we initially lose translations though, and most changes made to gnome-devel-docs seem to be translations these days, with a few exceptions (mostly grammar corrections). Thibault and Mathieu expressed interest in supporting translated docs in the future, but it is a substantial amount of work and low on the todo list.

We agreed that I should try to get in touch by e-mail to the relevant mailing lists (including translations) and to individuals who contributed to gnome-devel-docs recently to hear their opinion before we proceed.

Hotdoc status

On the hotdoc side, Mathieu and Thibault explained that they have pending work from the GStreamer docs waiting to land and be included in a new release. There is also an ongoing feature request to support flexboxes through markdown syntax, which would be nice to have if we want to align Thibault’s branch more closely with Allan’s mockup.

With help from Thibault and Mathieu I managed to get a local instance of hotdoc running. A few bugs were fixed along the way and I plan to write a blog post with a getting-started guide for running Thibault’s branch on your own computer, which hopefully in turn can be used to improve the existing documentation on hotdoc.

Hackfest plans

For the past two months activity in this initiative has been running low, which signals to me that we need to meet physically again. The meeting attendance this time was around 3-4 people. I am going to FOSDEM 2019 and if anyone else interested in the GNOME developer center’s future is attending too, I’ll be happy to meet up during the conference for a chat. If enough people show interest, we could also extend the conference by a day, sit down and have a look at the state of things then.

Otherwise, if someone out there could help provide venue/office space sometime in spring, I can try to gather the group of people who have shown interest so far and get a proper hackfest going.

November 09, 2018

Talking at PETCon2018 in Hamburg, Germany and OpenPGP Email Summit in Brussels, Belgium

Just like last year, I managed to be invited to the Privacy Enhancing Technologies Conference to talk about GNOME. First, Simone Fischer-Huebner from Karlstad University talked about her projects, which are on the edge of security, cryptography, and usability, which I find a fascinating area to be in. She presented outcomes of her Prismacloud project, which also involves fancy youtube videos…

I got to talk about how I believe GNOME is in a good position to make a safe and secure operating system. I presented some case studies and reported on the challenges that I see. For example, Simone mentioned in her talk that certain users don’t trust a piece of software if it is too simple. Security stuff must be hard, right?! So how do you measure the success of your security solution? Obviously you can test with users, but certain things are just very hard to get users for. For example, testing GNOME Keysign requires a user who not only has a set-up MUA but also a configured GnuPG. This is not easy to come by. The discussions were fruitful and I got sent a few references that might be useful in determining a way forward.

OpenPGP Email Summit

I also attended the OpenPGP Email Summit in Brussels a few weeks ago. It’s been a tiny event graciously hosted by a local company. Others have written reports, too, which are highly interesting to read.

It’s been an intense weekend with lots of chatting, thinking, and discussing. The sessions were organised in a bar-camp style manner. That is, someone proposed a topic to discuss and the interested parties then came together. My interest was in visual security indication, as triggered by this story. Unfortunately, I was lured away by another interesting session about keyservers and GDPR compliance which ran in parallel.

For the plenary session, Holger Krekel reported on the current state of Delta.Chat. If you haven’t tried it yet, give it a go. It’s trying to provide an instant messaging interface with an email transport. I’ve used this for a while now and my experience is mixed. I still get the occasional email I cannot decrypt, and interop with my other MUA listening on the very same mailbox is hit and miss. Sometimes, the other MUA snatches the email before Delta.chat sees it, I think. Otherwise, I like the idea very much. Oh, and of course, it implements Autocrypt, so your clients automatically encrypt the messages.

Continuing the previous talk, Azul went on to talk about countermitm, an attempt to overcome Autocrypt 1.0‘s weaknesses. This is important work, because without a vision of how to go from Autocrypt Level 1 to Level 2, you may very well question its usefulness. As of now, emails are encrypted along their way (well, assuming MTA-STS) and if you care about not storing plain text messages in your mailbox, you could encrypt them already now. Defending against active attackers is hard, so having some sort of a plan is great. Anyway, countermitm defines “verified groups”, which involves a protocol to be run via email. I think I’ve mentioned earlier that I still find it a bit sad that we don’t have the necessary interfaces to run protocols over email. Outlook, I think, can do simple stuff like voting for one of many options or retracting an email. I would want my key exchange to be automated further, i.e. when GNOME Keysign sends the encrypted signature, I would want the recipient to decrypt it and send it back.

Phil Zimmermann, the father of PGP, mentioned a few issues he sees with the spec, although he also said that it’s been a while since he was deeply into this matter. He wanted the spec to be more modern and to push more aggressively for today’s cryptography rather than the crypto of the past. And in fact, he wants the crypto of tomorrow. Now. He said that we know that big agencies are storing messages today for later analysis. And we currently have no good way of having what people call “perfect forward secrecy”, so a future key compromise makes the messages of today readable. He wants post quantum crypto to defeat the prying eyes. I wonder whether anybody has implemented pq-schemes for GnuPG, or any other OpenPGP implementation, yet.

My takeaways are: The keyserver network needs a replacement. Currently, it is used for initial key discovery, key updates, and revocations. I think we can solve some of these problems better if we separate them. For example, revocations are pretty much a fire and forget thing whereas other key updates are not necessarily interesting in twenty years from now. Many approaches for making initial key discovery work have been proposed. WKD, Autocrypt, DANE, Keybase, etc. Eventually one of these approaches wins the race. If not, we can still resort back to a (plain) list of Email addresses and their key ids. That’s as good or bad as the current situation. For updates, the situation is maybe not as bad. But we might still want to investigate how to prevent equivocation.

Another big thing was deprecating cruft in the spec to move a bit faster in terms of cryptography and to allow implementers to get a compliant program running (more) quickly. Smaller topics were the use of PQ-safe algorithms and exploiting backwards-incompatible changes to the spec, i.e. v5 keys with full fingerprints. Interestingly enough, a trimmed down spec had already been developed here.

November 08, 2018

GNOME ED update – October

I’m currently writing this from sunny (but very cold!) Colorado Springs, and tomorrow I’m off to SeaGL. We’ll have a booth there, so come and say hi to me and Rosanna if you’re in Seattle! More on that in the next report, however :)

General

As per usual, our main focus has been on the hiring of new staff members for the Foundation. We’ve completed a few second interviews and a couple of first interviews. We’re aiming to start making offers around the end of November. If you have put in an application, and haven’t heard back in a while, please don’t worry! It’s simply due to a large number of people who’ve applied and the very manual way we’ve had to process these. Everyone should hear back.

We’ve also had some interesting times with our banking. The short version is, we’ve moved banks to another provider. This has taken quite a bit of work, but hopefully, this should be settling down now.

As mentioned in issue #43, we have an employee handbook. However, it’s not public and hasn’t been updated. We’ve now managed to find a service that will do some of this for us, so we don’t need to create a whole load of text.

Finally, some minor items: Three trademark agreements were granted/modified, one GDPR request is being considered (removal of email from list archives), the GNOME namespace on handshake.org has been requested (which involved a number of calls with our trademark lawyer), a Dun and Bradstreet number (D-U-N-S) has been requested so we can then request a free Apple code signing certificate and the EU events box should now have a laptop.

Conferences

SustainOSS Summit

This event was quite interesting and was held in London at the end of October. An estimated 150 people attended, all to talk about the sustainability of open source software and how it can be improved (sustainability here should be read in all its forms: financial, newcomer experience, maintainer burnout, etc.).

Freenode#Live

Held in Bristol at the start of November, Freenode#Live was once again an impressive event, and I presented my “Why Free Software on the desktop matters” talk. The most useful aspect of the event was meeting up with key people in the FOSS community, and with Advisory Board members.

Finally, as we were leaving the venue, one of the venue staff members came up to say hello. He’s a professional graphic illustrator/designer and, although he had never heard about free software before, was so impressed by the conference and our passion that he’s volunteering to help with design work.

November 07, 2018

Mesa Update Breaks WebKitGTK+ in Fedora 29

If you’re using Fedora and discovered that WebKitGTK+ is displaying blank pages, the cause is a bad mesa update, mesa-18.2.3-1.fc29. This in turn was caused by a GCC bug that resulted in miscompilation of mesa.

To avoid this bug, downgrade to mesa-18.2.2-1.fc29:

$ sudo dnf downgrade mesa*

You can also update to mesa-18.2.4-2.fc29, but this build has not yet reached updates-testing, let alone stable, so downgrading is easier for now. Another workaround is to run your application with accelerated compositing mode disabled, to avoid OpenGL usage:

$ WEBKIT_DISABLE_COMPOSITING_MODE=1 epiphany

On the bright side of things, from all the bug reports I’ve received over the past two days I’ve discovered that lots of people use Epiphany and notice when it’s broken. That’s nice!

Huge thanks to Dave Airlie for quickly preparing the fixed mesa update, and to Jakub Jelinek for handling the same for GCC.

November 06, 2018

Speed up your GitLab CI

GNOME GitLab has AWS runners, but they are used only when pushing code into a GNOME upstream repository, not when you push into your personal fork. For personal forks there is only one (AFAIK) shared runner and you could be waiting for hours before it picks up your job.

But did you know you can register your own PC, or a spare laptop collecting dust in a drawer, to get instant continuous integration (CI) going? It’s really easy to set up!

1. Install docker

apt install docker.io

2. Install gitlab-runner

Follow the instructions here:
https://gitlab.com/gitlab-org/gitlab-runner/blob/master/docs/install/linux-repository.md#installing-the-runner

(Note: The Ubuntu 18.04 package doesn’t seem to work.)

3. Install & start the GitLab runner service

sudo gitlab-runner install
sudo gitlab-runner start

4. Find the registration token

Go to your gitlab project page, settings -> CI/CD -> expand “runners”

5. Register your runner

sudo gitlab-runner register --non-interactive --url https://gitlab.gnome.org --executor docker --docker-image fedora:27 --registration-token **

You can repeat step 5 with the registration token of each of your personal forks in the same GitLab instance. To make this easier, here’s a snippet I wrote in my ~/.bashrc to register my “builder.local” machine on a new project. Use it as gitlab-register <host> <token>, or just gitlab-register <token> to default to the GNOME instance.

function gitlab-register {
  host=$1
  token=$2

  case "$host" in
    gnome)
      host=https://gitlab.gnome.org
      ;;
    fdo)
      host=https://gitlab.freedesktop.org
      ;;
    collabora)
      host=https://gitlab.collabora.com
      ;;
    *)
      host=https://gitlab.gnome.org
      token=$1
  esac

  cmd="sudo gitlab-runner register --non-interactive --url $host --executor docker --docker-image fedora:27 --registration-token $token"

  # By default the registration command is run on the remote builder.local
  # machine over ssh; uncomment the next line instead to run it locally.
  #$cmd

  ssh builder.local -t "$cmd"
}
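
For example, registering a runner for one of your projects on the GNOME GitLab instance would then look something like this (YOUR_TOKEN is a placeholder for the registration token from step 4):

gitlab-register gnome YOUR_TOKEN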

Not only will you now get faster CI, but you’ll also reduce the queue on the shared runner for others!

Birds in flight


If you follow Planet GNOME, you’ll know about Jim Hall’s fantastic usability testing work. For years Jim has spearheaded usability testing on GNOME, both by running tests himself and mentoring usability testing internships offered through Outreachy.

This Autumn, Jim will once again be mentoring usability testing internships. However, this time round, we’re planning on running the internships a bit differently.

In previous rounds of usability testing, the tests have typically been performed on released software: that is, apps and features that are already in the hands of users. This is great and has flagged up issues that we’ve gone on to fix, but it has some drawbacks.

Most obviously, it means that users are exposed to the software before usability testing takes place. However, it also means that it can take a long time for test results to be acted on: active development of the software in question might have been paused by the time the tests are conducted, and it can sometimes take a while until a developer is able to correct any usability issues that have been identified.

Therefore, for this round of the Outreachy internships, we are only going to test UX changes that are actively being worked on. Instead of testing finished features, the tests will be on two things:

  1. Mockups or prototypes of changes that we hope to implement soon (this can include static mockups and paper or software prototypes)
  2. Features or UI changes that are being actively worked on, but haven’t been released to users yet

One goal is to increase the number of cycles of data-driven iteration that our UX work goes through. Ideally there should be multiple rounds of testing and design changes prior to coding even taking place! This will reduce the amount of UI changes that have to be made, and in turn reduce the amount of work for our developers.

Organising the tests in this way, we’re drawing on ideas from agile and lean. The plan is to have a predefined schedule of tests. When test day rolls around, we’ll figure out what we want to test. This will bring a routine to our practice and ensure that we keep the exercise light and iterative.

There’s lots of in-progress UX work in GNOME right now, all of which would benefit from testing. This includes the new menu arrangements that are replacing app menus, new sound settings, new design patterns for lists, new application permission settings, the new lock screen design, and more.

One thing I’d actually love to see is design initiatives rejected outright because of testing feedback.

The region and language settings are being updated right now. We can test this!

This approach to testing is an experiment and we’ll have to see how well it works in practice. However, if it does go well, I’m hopeful that we can incorporate it into our design and development practice more generally.

Jim and the rest of the design team will be looking for help from the rest of the GNOME community as we approach the test days. If anyone wants to help make prototypes or make sure that development branches can be easily run by our interns, your help would be extremely welcome. Likewise, we’d love to hear from anyone who has development work that they would like to have tested.

Taking Out the Garbage

From the title, you might think this post is about household chores. Instead, I’m happy to announce that we may have a path to solving GJS’s “Tardy Sweep Problem”.

For more information about the problem, read The Infamous GNOME Shell Memory Leak by Georges Stavracas. This is going to be a more technical post than my previous post on the topic, which was more about the social effects of writing blog posts about memory leaks. So first I’ll recap what the problem is.

Garbage, garbage, everywhere

At the root of the GNOME desktop is an object-oriented technology called GObject. GObjects are reference counted, but not garbage collected. As long as their reference count is nonzero, they are “alive”, and when their reference count drops to zero, they are deleted from memory.

GObject reference counting

Graphical user interfaces (such as a large part of GNOME Shell) typically involve lots of GObjects which all increase each other’s reference count. A diagram for a simple GUI window made with GTK might look like this:

A typical GUI would involve many more objects than this, but this is just for illustrating the problem.

Here, each box is an object in the C program.

Note that these references are all non-directional, meaning that they aren’t really implemented as arrows. In reality it looks more like a list of numbers; Window (1); Box (1); etc. Each object “knows” that it has one reference, but it knows nothing about which other objects own those references. This will become important later.

When the app closes, it drops its reference to the window. The window’s reference count becomes zero, so it is erased. As part of that, it drops all the references it owns, so the reference count of the upper box becomes zero as well, and so on down the tree. Everything is erased and all the memory is reclaimed. This all happens immediately. So far, so good.

Javascript objects

To write the same GUI in a Javascript program, we want each GObject in the underlying C code to have a corresponding Javascript object so that we can interact with the GUI from our Javascript code.

Javascript objects are garbage collected, and the garbage collector in the SpiderMonkey JS engine is a “tracing” garbage collector, meaning that on every garbage collection pass it starts out with objects in a “root set” that it knows are not garbage. It “traces” each of those objects, asking it which other objects it refers to, and keeps tracing each new object until it hits a dead end. Any objects that weren’t traced are considered garbage, and are deleted. (For more information, the Wikipedia article is informative: https://en.wikipedia.org/wiki/Tracing_garbage_collection)

We need to integrate the JS objects and their garbage collection scheme with the GObjects and their reference counting scheme. That looks like this:

The associations between the Javascript objects (“JS”) and the GObjects are bidirectional. That means, the JS object owns a reference to the GObject, meaning the reference count of every GObject in this diagram is 2. The GObject also “roots” the JS object (marks it as unable to be garbage collected) because the JS object may have some state set on it (for example, by writing button._alreadyClicked = false; in JS) that should not be lost while the object is still alive.

The JS objects can also refer to each other. For example, see the rightmost arrow from the window’s JS object to the button’s JS object. The JS code that created this GUI probably contained something like win._button = button;. These references are directional, because the JS engine needs to know which objects refer to which other objects, in order to implement the garbage collector.

Speaking of the garbage collector! The JS objects, unlike the GObjects, are cleaned up by garbage collection. So as long as a JS object is not “rooted” and no other JS object refers to it, the garbage collector will clean it up. None of the JS objects in the above graph can be garbage collected, because they are all rooted by the GObjects.

Toggle references and tardy sweeps

Two objects (G and JS) keeping each other alive equals a reference cycle, you might think. That’s right; as I described it above, neither object could ever get deleted, so that’s a memory leak right there. We prevent this with a feature called toggle references: when a GObject’s reference count drops to 1 we assume that the owner of the one remaining reference is the JS object, and so the GObject drops its reference to the JS object (“toggles down“). The JS object is then eligible for garbage collection if no other JS object refers to it.

(If this doesn’t make much sense, don’t worry. Toggle references are among the most difficult to comprehend code in the GJS codebase. It took me about two years after I became the maintainer of GJS to fully understand them. I hope that writing about them will demystify them for others a bit.)

When we close the window of this GUI, here is approximately what happens. The app drops its references to the GObjects and JS objects that comprise the window. The window’s reference count drops to 1, so it toggles down, dropping one direction of the association between GObject and JS object.

Unlike the GObject-only case where everything was destroyed immediately, that’s all that can happen for now! Everything remains in place until the next garbage collection, because at the top of the object tree is the window’s JS object. It is eligible to be collected because it’s not rooted and no other JS object refers to it.

Normally the JS garbage collector can collect a whole tree of objects at once. That’s why the JS engine needs to have all the information about the directionality of the references.

However, it won’t do that for this tree. The JS garbage collector doesn’t know about the GObjects. So unfortunately, it takes several passes of the garbage collector to get everything. After one garbage collection only the window is gone, and the situation looks like this:

Now, the outermost box’s JS object has nothing referring to it, so it will be collected on the next pass of the garbage collector:

And then it takes one more pass for the last objects to be collected:

The objects were not leaked, as such, but it took four garbage collection passes to get all of them. The problem we previously had, that Georges blogged about, was that the garbage collector didn’t realize that this was happening. In normal use of a Javascript engine, there are no GObjects that behave differently, so trees of objects don’t deconstruct layer by layer like this. So, there might be hours or days in between garbage collector passes, making it seem like that memory was leaked. (And often, other trees would build up in the intervening time between passes.)

Avoiding toggle references

To mitigate the problem Georges implemented two optimizations. First, the “avoid toggle references” patch, which was actually written by Giovanni Campagna several years ago but never finished, made it so that objects don’t start out using the toggle reference system. Instead, only the JS objects hold references to the GObjects. The JS object can get garbage collected whenever nothing else refers to it, and it will drop its reference to the GObject.

A problem then occurs when that wasn’t the last reference to the GObject, i.e. it’s being kept alive by some C code somewhere, and the GObject resurfaces in JS, for example by being returned by a C function. In this case we recreate the JS object, assuming that it will be identical to the one that was already garbage collected. The only case where that assumption doesn’t hold is when the JS code sets some state on one of the JS objects. For example, you execute something like myButton._tag = 'foo';. If myButton gets deleted and recreated, it won’t have a _tag property. So in the case where any custom state is set on a JS object, we switch it over to the toggle reference system once again.

In theory this should help, because toggle references cause the tardy sweep problem, so if fewer objects use toggle references, there should be fewer objects collected tardily. However, this didn’t solve the problem, because especially in GNOME Shell, most JS objects have some state on them. And, sadly, it made the toggle reference code even more complicated than it already was.

The Big Hammer

The second optimization Georges implemented was the affectionately nicknamed “Big Hammer”. It checks if any GObjects toggled down during a garbage collector pass, and if so, restarts the garbage collector a few seconds later. This made CPU performance worse, but would at least make sure that all unused objects were deleted from memory within a reasonable time frame (under a minute, rather than a day.)

Combined with some other memory optimizations, this made GNOME 3.30 quite a lot less memory hungry than its predecessors.

An afternoon at Mozilla

Earlier this year, I had been talking on IRC to Ted Campbell and Steve Fink on the SpiderMonkey team at Mozilla for a while, about various ins and outs of being an external (i.e. not Firefox) user of SpiderMonkey’s JS engine API. In early September I found myself in Toronto, where Ted Campbell is based, and I paid a visit to the Mozilla office one afternoon.

I had lunch with Ted and Kannan Vijayan of the SpiderMonkey team where we discussed the current status of external SpiderMonkey API users. Afterwards, we made the plans which eventually became this GitHub repository of examples and best practices for using the SpiderMonkey JS engine outside of Firefox. We have both documentation and code examples there, and more on the way. This is still in progress, but it should be the beginning of a good resource for embedding the JS engine, and the end of all those out-of-date pages on MDN!

I also learned some good practices that I can put to use in GJS. For example, we should avoid using JS::PersistentRooted except as a last resort, because it roots objects by putting them in a giant linked list, which is then traced during garbage collection. It’s often possible to store the objects more efficiently than that, and trace them from some other object, or the context.

Ending the tardy sweeps

In the second half of the afternoon we talked about some of the problems that I had with SpiderMonkey that were specific to GJS. Of course, the tardy sweep problem was one of them.

For advice on that, Ted introduced me to Nika Layzell, an engineer on the Gecko team. We looked at the XPCOM cycle collector and I was surprised to learn that Gecko uses a scheme similar to toggle references for some things. However, rather than GJS sticking with toggle references, she suggested a solution that had the advantage of being much simpler.

In “Avoiding toggle references” above, I mentioned that the only thing standing in the way of removing toggle references, is custom state on the JS objects. If there is custom state, the objects can’t be destroyed and recreated as needed. In Gecko, custom state properties on DOM objects are called “expandos” or “expando properties” and are troublesome in a similar way that they are in GJS’s toggle references.

Nika’s solution is to separate the JS object from the expandos, putting the expandos on a separate JS object which has a different lifetime from the JS object that represents the GObject in the JS code. We can then make the outer JS objects into JS Proxies so that when you get or set an expando property on the JS object, it delegates transparently to the expando object.

Kind of like this:

In the “before” diagram, there is a reference cycle which we have to solve with toggle references, and in the “after” diagram, there is no reference cycle, so everything can simply be taken care of by the garbage collector.

In cases where an object doesn’t have any expando properties set on it, we don’t even need to have an expando object at all. It can be created on demand, just like the JS object. It’s also important to note that the expando objects can never be accessed directly from JS code; the GObject is the sole conduit by which they can be accessed.
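
To make the idea a little more concrete, here is a rough JavaScript-level sketch of how such a proxy could delegate expando properties to a lazily created expando object. This is purely illustrative: the real machinery lives in GJS’s C++ internals, and all the names below are made up.

// Illustrative sketch only: the real implementation is in GJS's C++ code,
// and wrapGObject, someGObject etc. are made-up names.
function wrapGObject(gobject) {
    let expando = null;  // created on demand, only when custom state is set

    return new Proxy(gobject, {
        get(target, prop) {
            // Real GObject properties and methods win...
            if (prop in target)
                return target[prop];
            // ...anything else is looked up on the expando object.
            return expando !== null ? expando[prop] : undefined;
        },
        set(target, prop, value) {
            if (prop in target) {
                target[prop] = value;
            } else {
                if (expando === null)
                    expando = {};  // separate object, with its own lifetime
                expando[prop] = value;
            }
            return true;
        },
    });
}

// Usage sketch: the custom state ends up on the expando object,
// not on the wrapper that represents the GObject.
// let button = wrapGObject(someGObject);
// button._alreadyClicked = false;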

Recasting our GUI from the beginning of the post with a tree of GUI elements where the top-level window has an expando property pointing to the bottom-level button, and where the window was just closed, gives us this:

Most of these GObjects don’t even need to have expando objects, or JS objects!

At first glance this might seem to be garbage-collectable all at once, but we have to remember that GObjects aren’t integrated with the garbage collector, because they can’t be traced, they can only have their reference counts decremented. And the JS engine doesn’t allow you to make new garbage in the middle of a garbage collector sweep. So a naive implementation would have to collect this in two passes, leaving the window’s expando object and the button for the second pass:

This would require an extra garbage collector pass for every expando property that referred to another GObject via its JS object. Still a lot better than the previous situation, but it would be nice if we could collect the whole thing at once.

We can’t walk the whole tree of GObjects in the garbage collector’s marking phase; remember, GObject references are nondirectional, so there’s no generic way to ask a GObject which other GObjects it references. What we can do is partially integrate with the marking phase so that when a GObject has only one reference left, we make it so that the JS object traces the expando object directly, instead of the GObject rooting the expando object. Think of it as a “toggle reference lite”. This would solve the above case, but there are still some more corner cases that would require more than one garbage collection pass. I’m still thinking about how best to solve this.

What’s next

All together, this should make the horrible toggle reference code in GJS a lot simpler, and improve performance as well.

I started writing the code for this last weekend. If you would like to help, please get in touch with me. You can help by writing code, figuring out the corner cases, or testing the code by running GNOME Shell with the branch of GJS where this is being implemented. Follow along at issue #217.

Additionally, since I am in Toronto again, I’ll be visiting the Mozilla office again this week, and hopefully more good things will come out of that!

Acknowledgements

Thanks to Ted Campbell, Nika Layzell, and Kannan Vijayan of Mozilla for making me feel welcome at Mozilla Toronto, and taking some time out of their workday to talk to me; and thanks to my employer Endless for letting me take some time out of my workday to go there.

Thank you to Ted Campbell and Georges Stavracas for reading and commenting on a draft version of this post.

The diagrams in this post were made with svgbob, a nifty tool; hat tip to Federico Mena Quintero.

November 03, 2018

GNOME Translation Editor 3.30.0

I'm pleased to announce the new GNOME Translation Editor release. This is the new release of the well known Gtranslator. I talked about the Gtranslator resurrection some time ago and this is the result:

This new release isn't yet in flathub, but I'm working on it so we'll have a flatpak version really soon. In the meantime you can test it using the GNOME nightly flatpak repo.

New release 3.30.0

This release doesn't add new functionality. The main change is in the code, and in the interface.

We've removed the toolbar and moved the most useful buttons to the headerbar. We've also removed the statusbar and replaced it with a new widget that shows the document translation progress.

The plugin system and the dockable window system have been removed to simplify the code and make the project more maintainable. The only plugin that is maintained for now is the translation memory, which is now integrated. I'm planning to migrate other useful plugins, but that's for the future.

Other minor changes are in the message table: we've removed some columns and now show only two, the original message and the translated one, and we use colors and text styles to show fuzzy and untranslated status.

The main work is a full code modernization: we now use Meson to build and we have flatpak integration, which simplifies development because gtranslator now works by default in GNOME Builder without the need to install development dependencies.

There are other minor changes, like the new look when you open the app without any file:

Or the new language selector that autofills all the profile fields using the language:

And of course we've tried to fix the most important bugs:

New name and new Icon

Following modern GNOME app naming, we've renamed the app from Gtranslator to GNOME Translation Editor. Internally we'll continue with the gtranslator name, so the app is gtranslator, but for the end user the name will be Translation Editor.

And following the App Icon Redesign Initiative, we have a new icon that follows the new HIG.

Thanks

I'm not doing this alone. I became the gtranslator maintainer because of Daniel Mustieles' push to have a modern tool for GNOME translators, built with GNOME technology and fully functional.

The GNOME Translation Editor is a project done by the GNOME community; there are other people helping with code, documentation, testing, design ideas and much more, and any help is always welcome. If you're interested, don't hesitate to come to the GNOME GitLab and collaborate on this great project.

And maybe it's a bit late, but I've published a project on outreachy.org, so maybe someone can work on this as an intern for three months. I'll try to get more people involved here through Outreachy and maybe GSoC, so if you're a student, now is the right time to start contributing so you can be selected for next year's internship programs.

November 02, 2018

Running EPF Composer in Fedora Linux, v3

Well, I finally succeeded with a native installation of the EPF (Eclipse Process Framework) Composer on my Linux system, thanks to Bruce MacIsaac and the development team’s help. I’m happy. This is not trivial, since EPFC is a 32-bit application running on a modern 64-bit Linux system.

My working configuration:

In my system obviously I can install all rpm packages using DNF. For different distros look for the equivalent packages.

Maybe I’m missing some minor dependency, I didn’t check with a clean installation.

Download EPFC and xulrunner and extract each one in the path of your choice. I’m using xulrunner-10.0.2.en-US.linux-i686/ as the directory name to be more meaningful.

The contents of epf.ini file:

-data
@user.home/EPF/workspace.152
-vmargs
-Xms64m
-Xmx512m
-Dorg.eclipse.swt.browser.XULRunnerPath=/PATHTOXULRUNNER/xulrunner-10.0.2.en-US.linux-i686/

I had to write the full system path for the -Dorg.eclipse.swt.browser.XULRunnerPath property to get Eclipse to recognize it.

And to run EPF Composer:

cd $EPF_APP_DIR
$ epf -vm  /usr/lib/jvm/java-1.8.0-oracle-1.8.0.181/jre/bin/java  

If you want to do any non-trivial work with Composer on Linux you’ll need xulrunner, since it’s used extensively for editing content.

Native Linux EPF Composer screenshot

I had success running the Windows EPF version using Wine and I can do some work with it, but at some point the program becomes unstable and needs to be restarted. Another very interesting advantage of running natively is that I can use the GTK+ file chooser, which is a lot better than the simpler native Java one.

I plan to practice a lot of modeling with EPF Composer in the coming weeks. Hopefully I’ll share some new artifacts authored by me.

November 01, 2018

GNOME Internships interns have been selected

Hello all,

GNOME Internships projects and interns have been selected!

We had strong applicants and quite a large number of applications. If you were not selected, don’t be discouraged; it wasn’t an easy choice.

This round we have to congratulate Ludovico de Nittis, who will work on the “USB Protection” project with his mentor Tobias Mueller. Congrats Ludovico!

The project goal is to increase the robustness against attacks via malicious USB devices. Certainly a challenging goal! You can read extensive information in the project wiki linked above, it’s definitely quite interesting.

This round of internships is funded by the privacy and security campaign we ran at GNOME a few years ago, and we are happy we can put those funds to good use to improve the security and privacy of GNOME users and donors.

Deep thanks to all the donors! Without you this initiative wouldn’t have been possible. Also thanks to the mentors and applicants.

More on the admin side, it has been a little bumpy since it’s the first time we’re doing these internships. And well, this last week I have been a bit worried by other stuff you already probably know about :)

If you have any question or feedback, don’t hesitate to contact me on IRC as csoriano or send an email to internships-admin@gnome.org

October 31, 2018

Update from the PipeWire hackfest

As the third and final day of the PipeWire hackfest draws to a close, I thought I’d summarise some of my thoughts on the goings-on and the future.

Thanks

Before I get into the details, I want to send out a big thank you to:

  • Christian Schaller for all the hard work of organising the event and Wim Taymans for the work on PipeWire so far (and in the future)
  • The GNOME Foundation, for sponsoring the event as a whole
  • Qualcomm, who are funding my presence at the event
  • Collabora, for sponsoring dinner on Monday
  • Everybody who attended and participated, for their time and thoughtful comments

Background

For those of you who are not familiar with it, PipeWire (previously Pinos, previously PulseVideo) was Wim’s effort at providing secure, multi-program access to video devices (like webcams, or the desktop for screen capture). As he went down that rabbit hole, he wrote SPA, a lightweight general-purpose framework for representing a streaming graph, and this led to the idea of expanding the project to include support for low latency audio.

The Linux userspace audio story has, for the longest time, consisted of two top-level components: PulseAudio which handles consumer audio (power efficiency, wide range of arbitrary hardware), and JACK which deals with pro audio (low latency, high performance). Consolidating this into a good out-of-the-box experience for all use-cases has been a long-standing goal for myself and others in the community that I have spoken to.

An Opportunity

From a PulseAudio perspective, it has been hard to achieve the 1-to-few millisecond latency numbers that would be absolutely necessary for professional audio use-cases. A lot of work has gone into improving this situation, most recently with David Henningsson’s shared-ringbuffer channels that made client/server communication more efficient.

At the same time, application sandboxing frameworks such as Flatpak have added security requirements that were not accounted for when PulseAudio was written. Examples include choosing which devices an application has access to (or can even know of), or which applications can act as control entities (set routing etc., enable/disable devices). Some work has gone into this — Ahmed Darwish did some key work to get memfd support in PulseAudio, and Wim has prototyped an access-control mechanism module to enable a Flatpak portal for sound.

All this said, there are still fundamental limitations in architectural decisions in PulseAudio that would require significant plumbing to address. With Wim’s work on PipeWire and his extensive background with GStreamer and PulseAudio itself, I think we have an opportunity to revisit some of those decisions with the benefit of a decade’s worth of learning deploying PulseAudio in various domains starting from desktops/laptops to phones, cars, robots, home audio, telephony systems and a lot more.

Key Ideas

There are some core ideas of PipeWire that I am quite excited about.

The first of these is the graph. Like JACK, the entities that participate in the data flow are represented by PipeWire as nodes in a graph, and routing between nodes is very flexible — you can route applications to playback devices and capture devices to applications, but you can also route applications to other applications, and this is notionally the same thing.

The second idea is a bit more radical — PipeWire itself only “runs” the graph. The actual connections between nodes are created and managed by a “session manager”. This allows us to completely separate the data flow from policy, which means we could write completely separate policy for desktop use cases vs. specific embedded use cases. I’m particularly excited to see this be scriptable in a higher-level language, which is something Bastien has already started work on!

A powerful idea in PulseAudio was rewinding — the ability to send out huge buffers to the device, but the flexibility to rewind that data when things changed (a new stream got added, or the stream moved, or the volume changed). While this is great for power saving, it is a significant amount of complexity in the code. In addition, with some filters in the data path, rewinding can break the algorithm by introducing non-linearity. PipeWire doesn’t support rewinds, and we will need to find a good way to manage latencies to account for low power use cases. One example is that we could have the session manager bump up the device latency when we know latency doesn’t matter (Android does this when the screen is off).

There are a bunch of other things that are in the process of being fleshed out, like being able to represent the hardware as a graph as well, to have a clearer idea of what is going on within a node. More updates as these things are more concrete.

The Way Forward

There is a good summary by Christian about our discussion about what is missing and how we can go about trying to make a smooth transition for PulseAudio users. There is, of course, a lot to do, and my ideal outcome is that we one day flip a switch and nobody knows that we have done so.

In practice, we’ll need to figure out how to make this transition seamless for most people, while folks with custom setups will need to be given a long runway and clear documentation to know what to do. It’s way too early to talk about this in more specifics, however.

Configuration

One key thing that PulseAudio does right (I know there are people who disagree!) is having a custom configuration that automagically works on a lot of Intel HDA-based systems. We’ve been wondering how to deal with this in PipeWire, and the path we think makes sense is to transition to ALSA UCM configuration. This is not as flexible as we need it to be, but I’d like to extend it for that purpose if possible. This would ideally also help consolidate the various methods of configuration being used by the various Linux userspaces.

To that end, I’ve started trying to get a UCM setup on my desktop that PulseAudio can use, and be functionally equivalent to what we do with our existing configuration. There are missing bits and bobs, and I’m currently focusing on the ones related to hardware volume control. I’ll write about this in the future as the effort expands out to other hardware.

Onwards and upwards

The transition to PipeWire is unlikely to be quick or completely painless or free of contention. For those who are worried about the future, know that any switch is still a long way away. In the meantime, however, constructive feedback and comments are welcome.

Pipewire Hackfest 2018

Good morning from Edinburgh, where the breakfast contains haggis, and the charity shops have some interesting finds.

My main goal in attending this hackfest was to discuss Pipewire integration in the desktop, and how it will eventually replace PulseAudio as the audio daemon.

The main problem GNOME has had over the years with PulseAudio relates mostly to how PulseAudio was a black box when it came to its routing policy. What happens when you plug an HDMI cable into your laptop? Or turn on your Bluetooth headset? I've heard the stories of folks with highly mobile workstations having to constantly visit the Sound settings panel.

PulseAudio has policy scattered in a number of places (do a "git grep routing" inside the sources to see that): some are in the device manager, then modules themselves can set priorities for their outputs and inputs. But there's nothing to take all the information in, and take a decision based on the hardware that's plugged in, and the applications currently in use.

For Pipewire, the policy decisions would be split off from the main daemon. Pipewire, as it gains PulseAudio compatibility layers, will grow a default/example policy engine that will try to replicate PulseAudio's behaviour. At the very least, that will mean that Pipewire won't regress compared to PulseAudio, and might even be able to take better decisions in the short term.

For GNOME, we still wanted to take control of that part of the experience, and make our own policy decisions. It's very possible that this engine will end up being featureful and generic enough that it will be used by more than just GNOME, or even become the default Pipewire one, but it's far too early to make that particular decision.

In the meanwhile, we didn't want the GNOME policies to be written in C, which is difficult to experiment with for power users and for edge use cases. We could have started writing a configuration language, but it would have been too specific, and there are plenty of embeddable languages around. It was also a good opportunity for me to finally write the helper library I've been meaning to write for years, based on my favourite embedded language, Lua.

So I'm introducing Anatole. The goal of the project is to make it trivial to write chunks of programs in Lua, while the core of your project is written in C (we might even be able to embed it in Python or Javascript, once introspection support is added).

It's still in the very early days, and unusable for anything as of yet, but progress should be pretty swift. The code is mostly based on Victor Toso's incredible "Lua factory" plugin in Grilo. (I'm hoping that, once finished, I won't have to remember on which end of the stack I need to push stuff for Lua to do something with it ;)

October 30, 2018

First responder: fire alarm

Within the Netherlands, each company is required by law to have first responders. These handle various situations until the professionals arrive. It’s usually one of a (possible) fire, a medical situation, or an evacuation. Normally I’d post this on Google+ but as that’s going away I’ll put it on this blog. I prefer writing it down so that later on I can still see the details.

While having lunch I noticed a notification about the fire department going to my (huge) office building. As this was during lunch time and there might be way less first responders available I headed back. The message says “Handmelder” which is Dutch for those manually operated red boxes. As these are manual the fire department assumes someone verified that there’s a fire.

Screenshot of a P2000 app

According to Google Maps the fire department is a 7 minute drive away, so a maximum of 5 minutes for them. I saw them using a road which was partly closed for construction. A bit strange as they’ve been informed & there are signs. Anyway, I went inside. At this point I don’t know much. You hear people asking what is going on. I can guess but better not to assume too much. I do see that one of the fire doors closed itself. I notice the lack of an evacuation alarm though the pager indicated it’s either cellar, parking level or ground floor.

To help out I need a bright vest and a walkie talkie. The bright vest is very important a) for the fire department to recognize whom to approach and b) to force people to listen to me. I have a walkie talkie at my desk, but that’s useless: the elevators won’t work, I have no idea what is going on and it would take forever using the stairs. To start I take a bright jacket and ask security for a spare walkie talkie. It takes a bit until I notice the walkie talkie battery is dead. I want to find another, but there are 2 unneeded people in the security room standing in the way, so the quickest option is to ask again. Normally security has a hell of a time during these things so I really don’t want to distract them. Fortunately there’s another working walkie talkie.

I announce myself and ask for instructions. I get told to check parking level 4, east side. Normally I easily know east vs west, but at the moment not so much; I’m more focused on how to approach safely but quickly. Last week security mentioned that the new fire detection cables have east and west mixed up. It seems easier and safer to check the entire parking level.

On parking level 4 I initially see nothing strange, plus I’m the only one there. This is strange as I was late to arrive at the incident. I missed all of the previous conversations. It’s a waste of time to ask about this so I skip it. I don’t see any fire at all, though there is a hell of a noise. I first check if it’s one of the cars (super easy). Nothing. There are also various doors for building-related things. Normally I’d have keys for those but alas, not now. I relay that the first impression is no fire. I get told to check everything, as per the fire department’s request. I don’t get why they don’t come up, but it’s pointless to wonder. It takes me a few minutes to check the various doors. I check for fire indicator lights (fire behind a door) as well as doing a door check (heat, smoke). Nothing to be seen. Meanwhile a car enters the parking level. That should not be possible and usually cannot be done (parking gates close). Something to tell security. I communicate that nothing was found except a really loud running airco.

As of a month ago the building has a fire detection cable on all parking levels. It is very sensitive to temperature changes. It’s also installed above the parking places close to two airco outlets. They thought ahead of the potential problem and they said they addressed it (made it less sensitive). My guess is that it’s still too sensitive.

One incident leads to loads of questions. Some are answered during the incident, some just after, some take a while. There were enough learnings in this one.

PipeWire Hackfest

So we kicked off the PipeWire hackfest in Edinburgh yesterday. We have 15 people attending, including Arun Raghavan, Tanu Kaskinen and Colin Guthrie from PulseAudio, PipeWire creator Wim Taymans, Bastien Nocera and Jan Grulich representing GNOME and KDE, Mark Brown from the ALSA kernel team, Olivier Crête, George Kiagiadakis and Nicolas Dufresne representing embedded usecases for PipeWire, and finally Thierry Bultel representing automotive.

The event kicked off with Wim Taymans presenting on the current state of PipeWire and outlining the remaining issues and current thoughts on how to resolve them. Most of the first day was spent on a roundtable discussion about what are and should be the goals of PipeWire and what potential tradeoffs there would be going forward. PipeWire is probably a bit closer to Jack than PulseAudio in design, so quite a bit of the discussion went into how that would affect the PulseAudio usecases and what is planned to ensure PipeWire works very well for consumer audio usecases.

Personally I ended up spending quite some time just testing and running various Jack apps to see what works already and what doesn’t. I was positively surprised by how many Jack apps I was able to make work (i.e. output audio) using PipeWire instead of Jack, but of course we still have some gaps to cover before PipeWire is ready as a drop-in Jack replacement; for instance, the Jack session management protocol needs to be implemented first.

The second day we outlined the areas that need work before we are ready to replace PulseAudio and came up with the following list:

  • Mixers – This is basically dealing with hardware mixers. Arun and Wim started looking at a design for this during the hackfest.
  • PulseAudio services – This is all the things in PulseAudio that are not very suitable for putting inside PipeWire. The idea is instead to put them in a separate daemon. This includes things like network streaming, RAOP, DBus APIs and so on.
  • Policy/Session handling – We plan to move policy and session handling out of PulseAudio to make it easier for different usecases to set their own policies. PipeWire will still provide some default setup, but the idea here is to have a separate daemon(s) to provide this. Bastien Nocera started prototyping a setup where he could create policy and session handling using Lua scripting.
  • Filters
  • Bluetooth – Ensuring we have great bluetooth support with PipeWire. We would want to move Bluetooth handling to its own daemon, and not have it inside like in PulseAudio to allow for more flexibility with various embedded bluetooth stacks for instance. This could also mean looking at the Linux Bluetooth stack more widely as things are not ideal atm, especially from a security viewpoint.
  • Device reservation – We expect to replace Jack and PulseAudio in steps, starting with PulseAudio. So dealing well with hardware reservation is important to allow people to for instance keep running Jack alongside PipeWire until we are ready for full replacement.
  • Stream Monitoring – Important feature from Jack and PulseAudio that still needs implementing to allow monitoring of audio devices and streams.
  • Latency handling – Improving the ways we can deal with hardware latency in, for instance, consumer devices such as TVs.

It is still a bit hard to have a clear timeline for when we will be ready to drop in PipeWire support to replace PulseAudio and then Jack, but we feel the Wayland migration was a good example to follow where we held off doing the switch until we felt comfortable the move would be transparent to most users. There will of course always be corner cases and bugs, but we hope that in general people agree that the Wayland transition was done in a responsible manner and thus could be a good example to follow for us here.

We would like to offer big thanks to the GNOME Foundation for sponsoring travel for some of the community attendees and to Collabora for sponsoring dinner for all attendees the first night.

If you want to take a look at PipeWire, Wim updated the wiki page with PipeWire build instructions to be up-to-date. The hackfest attendees tested them out so we are sure they work; just be aware that you want the ‘Work’ branch and not the Master branch, as that is the one where all the audio work is happening. The Master branch is the video-focused branch we use in Fedora for desktop remoting support in browsers and VNC under Wayland.

Speaking at FIfFKon 18 in Berlin, Germany

I was invited to be a panellist at this year’s FIfFKon in Berlin, Germany. While I said hi to the people at All Systems Go!, my main objective in Berlin was to attend the annual conference of the FIfF, the association for people in computing caring about peace and social responsibility.

The most interesting talk for me was held by Rainer Mühlhoff on the incapacitation of the user. The claim, very broadly speaking, is that providing a usable interface prevents your users from learning how to operate the machine properly. Or in other words: making an interface for dumb people will attract dumb people and not make them smarter. Of course, he was more elaborate than that.

He presented Android P, which nudges the user into a certain behaviour. In Android, you get to see for how long you have used an app, and it encourages you to stop. Likewise, Google nudges you into providing your phone number for account recovery. The design of that dialogue makes it hard to hit the button to proceed without providing the number. Those nudges do not prevent a choice from being made, they just make it more likely that the user makes one particular choice. The techniques are borrowed from public policy making and commercial settings. So the users are made into an instrument themselves rather than being a sovereign entity.

Half way through his talk he made a bit of a switch to “sealed interfaces” and presented the user interface of a vacuum cleaner. In the beginning, the nozzle had a “bristly” or “flat” setting, depending on whether you wanted to use it on a carpet or a flat surface. Nowadays, the pictogram does not show the nozzle any more, but rather the surface you want to operate on. Similarly, microwave ovens do not show the two levers for wattage and time any more, but rather full recipes like pizza, curry, or fish.
The user is prevented from understanding the device in its mechanical details and from using it as an instrument based on what it does. Instead the interaction is centred on the end purpose rather than on using the device as a tool to achieve this end. The commercialisation of products numbs people down in their thinking. We are going from “Don’t make me think” to “Can you do the thinking for me”, as, he said, we can see with the newer Android interfaces which try to know in advance what you intend to do.

Eventually, you adapt the technology to the human rather than adapting the human to the technology. And while this is correct, he says, and it has gotten us very far, it is wrong from a social theory point of view. Mainly because it suggests that it’s a one-way process whereas it really is an interdependency. Because the interaction with technology forms habits and coins how the user experiences the machine. Imagine, he said, to get a 2018 smartphone in 1995. Back in the day, you probably could not have made sense out of it. The industrial user experience design is a product of numbing users down.

A highly interesting talk that got me thinking a little about whether we ought to teach users the inner workings of software systems.

The panel I was invited for had the topic “More privacy for smart phones – will the GDPR get us a new break through?” and we were discussing with a corporate representative and other people working in data protection. I was there in my capacity as a Free Software representative and as someone who was working on privacy enhancing technologies. I used my opportunities to praise Free Software and to claim that many of the problems we were discussing would not exist if we consistently used Free Software. The audience was quite engaged and asked a lot of questions. Including the ever popular point of *having* to use WhatsApp, Signal, or any of those proprietary products because of the network effect, and they demanded more regulation. I cautioned against that call for various reasons and mentioned that the freedom to choose the software to run has not yet been fully exploited. Afterwards, some projects presented themselves. It was an interesting mix of academic and actual project work. The list is on the conference page.

October 26, 2018

Announcing the Fractal Hackfest in Seville

It’s been an exciting year for Fractal, the GNOME Matrix client. Since our last hackfest in May, we’ve decided to split the application, refactored large parts of the backend, implemented new features such as the media viewer, made the message history adaptive, and laid the groundwork for end-to-end encryption.

Now that we have most of the foundations in place that will enable our long-term goals (such as adaptive layout, E2E, and the app split), we’re getting together again to push these initiatives forward. This is why we’re having another hackfest on December 11-14 in Seville, Spain.

The main focus of the hackfest will be finalizing the backend refactor and tying up various loose ends related to this, so we can start working on E2E and the app split. The other area we want to focus on is improving Fractal as a tool for GNOME developers and as an IRC replacement. In particular, we’re interested in providing a smooth, integrated GNOME Newcomer experience, because finding the right rooms to join is currently a big pain point for new contributors.

See you in Seville!

October 25, 2018

The History of GNOME

I’ve done a thing which may be of interest if you’re following the GNOME community.

As I said on Twitter, I have spare time, and I like boring people to death by talking about things that matter to me a lot; one of the things that matter to me is GNOME and its community—and especially its history.

Of course, I had to go and make it about liminal spaces and magic rituals, because that’s what makes it fun. This, though, is a magic ritual. I’m holding a seance, and I’m calling forth the past of the GNOME project for the people that live down its light-cone.

GNOME has the luxury of having a lot of people that stuck around—some even from the early days when there was no GNOME; there are also other people, though, some of them born after Miguel’s announcement, that are now starting to contribute to GNOME. I guess that means that it’s time to look back a bit, and give some more context to the history of the project.

I hope I won’t bore you that much with this; I hope that people will learn something new, or re-discover something that was forgotten. In general, I do hope people will have fun with it.

October 24, 2018

Thunderbolt ports & bolt update

TL;DR: Not every USB-C port is a Thunderbolt 3 port. Watch out for the logos!

One thing that I have learned in conversations about Thunderbolt is that it is important to distinguish between Thunderbolt the I/O technology and the physical connector that is used by it, because this seems to be a constant source of confusion. Over the course of its history, Thunderbolt used different types of connectors; versions 1 and 2 used Mini DisplayPort. USB, as the older and more ubiquitous I/O tech, also uses a myriad of different connectors ([{mini, micro}-]{A, B, AB}, SuperSpeed, ...).


USB-C connector on a Thunderbolt 3 cable. NB: The flash logo

In order to simplify this and make things generally better, a new universal connector was designed: the USB Type-C connector. A universal connector for a universal bus, ha! But besides USB, this connector is also used by Thunderbolt 3. It has some nice properties (it is symmetrical!) and is quite versatile, e.g. it can be used to deliver power to peripherals via USB-PD.
On the technical side, this is possible because USB-C supports different Alternate Modes: e.g. DisplayPort, HDMI and Thunderbolt 3, which itself can then also carry DisplayPort. Now the important bit: all of the alternate modes (and USB-PD) are optional. It depends on what kind of controller the USB-C port is connected to. Ergo, you cannot tell what data can be transported by looking at the port alone: maybe DisplayPort, or it might even support Thunderbolt 3. The crucial bit of information is conveyed via the little logo that is printed next to the port. Since physically different pins are used for different modes, the logos are also important on USB-C cables. Logo usage is regulated by the USB logo usage guidelines.

Example logo: Super Speed USB 10 Gbps USB Type-C™ Charging Trident Logo + DisplayPort Logo (p. 55)

What this all boils down to is the tl;dr from the top of the article: not every USB-C port is also a Thunderbolt 3 port. If you want to connect a TB3 device to a computer, make sure all involved ports (and cables) have the right logo:

Thunderbolt 3 logo

Some people who own a T480s learned this the hard way when they were wondering why Thunderbolt was not working for them when plugging things into the leftmost USB-C port. I had a few reports of bolt not working, followed by noticeable embarrassment, although it is arguably the design of the port itself which is at fault.

T480s: plain USB 3 port with USB-PD support (left in red) and Thunderbolt 3 port (right in green)

NB: Since there are also different types of Thunderbolt 3 cables (active, passive), you might get different maximum speeds (20Gbit/s vs 40Gbit/s) depending on the cable you use. For the few different cables that I have seen so far, they could not be distinguished based on their appearance alone.

bolt 0.5 "You've got the power"

In related news: bolt 0.5 is out (since about a month now) and will be shipped with Fedora 29. Have a look at the release notes for a complete list of changes, but the most important one I want to highlight here is the new force power D-Bus API. What is it and why do we need it?

The Thunderbolt controller can be in two different modes: one in which it is constantly powered (native enumeration mode) and one in which it is controlled by the BIOS. In the latter mode, if nothing is plugged into the Thunderbolt port, the controller is completely powered down and it looks as if there is no Thunderbolt hardware present at all. This is great because it saves battery, but there are two problems: 1) boltd wants to know what security level the Thunderbolt controller is in, and more importantly 2) the firmware update daemon (fwupd) wants to know the firmware version of the Thunderbolt controller, so that it can check if there are updates available (and if so, show them in GNOME Software). Luckily, newer kernel versions have (on supported platforms) a sysfs interface that can be used to "force-power" the Thunderbolt controller.

Both boltd and fwupd have support for that, which is great, but it is also the root of a race: the force-power interface is not reference counted and is also write-only (you cannot ask for the current status). Now if boltd force-powers the controller, uevents will be generated which, in turn, will be processed by fwupd, and it will try to read the firmware version. If, in the meantime, boltd is done with its thing and powers the controller down again but fwupd is not yet done reading the firmware, then that read will fail. Or the other way around: fwupd powers the controller, boltd gets started due to the uevents, but meanwhile fwupd powers the controller down again, and boltd might e.g. hang reading the boot-acl.

boltctl power --help for more information

The solution to the problem that Mika, Mario and I came up with was to have only boltd talk to the "force-power" interface and provide a D-Bus API for clients (fwupd) to use. Internally, boltd keeps track of open force-power requests, and only when there is none left does the controller get powered down. If you have bolt 0.5 installed, you can query the force-power status via boltctl power -q or actually request force-powering with boltctl power.
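
As an illustration of the reference-counting idea, here is a rough sketch in C. This is not boltd's actual code, and the sysfs path is a placeholder for whatever attribute the platform provides:

/* Rough sketch of reference-counted force-power handling; not boltd's code. */
#include <stdio.h>

#define FORCE_POWER_PATH "/sys/PLACEHOLDER/force_power"   /* placeholder path */

static unsigned int force_power_refs;

/* Write "1" or "0" to the (write-only) force-power attribute. */
static int
force_power_write (int on)
{
  FILE *f = fopen (FORCE_POWER_PATH, "w");
  if (f == NULL)
    return -1;
  fputs (on ? "1" : "0", f);
  fclose (f);
  return 0;
}

/* A client (e.g. fwupd) requests force-power via the D-Bus API. */
int
force_power_acquire (void)
{
  if (force_power_refs++ == 0)
    return force_power_write (1);   /* first request: actually power the controller */
  return 0;                         /* already powered: just count the request */
}

/* A client releases its request (or vanishes from the bus). */
int
force_power_release (void)
{
  if (force_power_refs > 0 && --force_power_refs == 0)
    return force_power_write (0);   /* last request gone: power the controller down */
  return 0;
}

Because only boltd ever touches the sysfs attribute, and the controller is powered down only when the last request is released, the race described above goes away.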

What's next?

I am currently working on boot acl support (#77), which will be the main feature of bolt 0.6. It is still work in progress, but it is very close to being finished and could use some testing. The code is in the bootacl branch and there is a merge request !119 for potential feedback.

October 23, 2018

Fedora Toolbox ready for testing!

As many of you know, we kicked off an ambitious effort to revamp the Linux desktop when we launched Fedora Workstation 4 years ago. We wanted to remove many of the barriers to adoption of Linux as a desktop and make it a better operating system for all, especially for developers.
To that effect we have been pushing a wide range of initiatives over the last 4 years, ranging from providing a better input stack through libinput, a better display system through Wayland, a better audio and video subsystem through PipeWire, a better way of doing application packaging and dependency handling through Flatpak, a better application installation story through GNOME Software, actual firmware handling for Linux through the Linux Vendor Firmware Service, better manageability through Fleet Commander, and Project Silverblue for reliable OS updates. We have also put a lot of effort into improving general hardware handling, be that work on glvnd and friends for dealing with the NVidia driver, the Bolt project for handling Thunderbolt devices better, HiDPI support in the desktop, better touch support in the desktop, improved laptop battery life, and ongoing work to improve the state of fingerprint readers under Linux and to provide a flicker-free boot experience.

One thing that was clear to us, though, was that as we were making all these changes to improve the ease of use and reliability of Linux as a desktop operating system, we couldn’t make life worse for developers. Developers are the lifeblood of Fedora and Linux, and thus we have had Debarshi Ray working on a project we call Fedora Toolbox. Fedora Toolbox creates a seamless experience for developers who use an immutable OS like Silverblue, yet want to be able to install the wonderful world of software libraries and tools that makes Linux so powerful for developers. Fedora Toolbox is now ready for early adopters to start testing, so I recommend jumping over to Debarshi’s blog to read up on Fedora Toolbox.

October 22, 2018

Fedora Toolbox — Hacking on Fedora Silverblue


Fedora Silverblue is a modern and graphical operating system targeted at laptops, tablets and desktop computers. It is the next-generation Fedora Workstation that promises painless upgrades, clear separation between the OS and applications, and secure and cross-platform applications. The basic operating system is an immutable OSTree image, and all the applications are Flatpaks.

It’s great!

However, if you are a hacker and decide to set up a development environment, you immediately run into the immutable OS image and the absence of dnf. You can’t install your favourite tools, editors and SDKs the way you’d normally do on Fedora Workstation. You can either unlock your immutable OS image to install RPMs through rpm-ostree and give up the benefit of painless upgrades; or create a Docker container to get an RPM-based toolbox but be prepared to mess around with root permissions and having to figure out why your SSH agent or display server isn’t working.

Enter Fedora Toolbox.

It makes it trivial to get a mutable development environment on Silverblue:

[rishi@bollard ~]$ fedora-toolbox create
[rishi@bollard ~]$ fedora-toolbox enter
🔹[rishi@toolbox ~]$
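
Once inside the toolbox you are back in a regular, mutable RPM-based environment, so you can install your favourite tools with dnf the way you’d normally do on Fedora Workstation; for example (an illustrative command, not from the original post):

🔹[rishi@toolbox ~]$ sudo dnf install gcc gdb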

It uses OCI containers underneath, but takes away the cognitive overhead of thinking about containers by providing a seamless integration with the host environment. It uses rootless podman and buildah, so there’s no root in the picture either.


If you are going to try it out, make sure that you have the runc-1.0.0-56.dev.git78ef28e package in your Silverblue image. There’s also an ongoing review to get fedora-toolbox added to Fedora. If you don’t feel comfortable mucking around with rpm-ostree on the command-line, then fear not. Very soon all the necessary pieces will be part of the OS image, making it that much easier to start hacking on your Silverblue.

October 21, 2018

Glade Support for Builder

One of the things we’ve wanted in Builder for a while is a designer. We’ve had various prototypes in the past to see how things would have worked out, and mostly just punted on the idea because it seemed like Glade served users better than we would be able to directly.

Last week, Juan Pablo, Matthias Clasen and I met up in San Francisco to see what we could do in the short term. We discussed a couple of options that we have going forward.

  • Integrate glade 3 into Builder using libgladeui.
  • Integrate glade 3 using the external Glade application and use D-Bus to inter-operate.

Like all projects, we have some constraints.

  • Gtk 4 is in progress, and our hope is that most new application development moves towards that because the benefits are outstanding. That means the value of a Gtk 3 designer is depreciating.
  • Gtk 4 changes many fundamental designs behind the scenes. While much effort has been put into reducing the friction in porting applications, porting a UI designer is no trivial task, as such tools necessarily reach into library internals. It is likely Gtk 4 will require creating a new designer from the ground up. Doing this as part of Gtk itself is probably worthwhile.
  • We want the designer to know about all of your .ui files so that it is easier to see widgets created using composition.
  • Allow generating signal callbacks into your existing code-base in a variety of languages.

With that in mind, I want to get the maximal benefit with the least amount of time to ship. I made a new plugin for Builder last week that gets us moving in that direction. It still needs more work to integrate with signal editing, templates, and other more advanced Glade features.

Hopefully that happens soon because I know we’ve all been waiting for it. Get it now with the Builder Nightly flatpak.

flatpak install --from https://git.gnome.org/browse/gnome-apps-nightly/plain/gnome-builder.flatpakref

Screenshot of Builder with Glade Integration

Vala state: October 2018

Maintainability

I think maintainability could be improved by crediting contributions in the commit history, not just the ones coming from the current maintainer. Actually, there are quite a lot of commits from authors outside the current ones that do not show up in the history. Hopefully, with GNOME’s new GitLab instance, this will start to reflect the correct situation.

Behind the scenes, Vala has to improve its code base to adapt to new requirements, like developing a decent Vala Language Server and having more IDEs support Vala. At least for me, even GEdit is productive enough to produce software in Vala, because of the language itself: writing a class or an interface, and implementing interfaces, is 10 times faster in Vala than in C.

Vala has received lots of improvements over the last development cycles, like a new POSIX profile, ABI stability, improvements to C warnings and many others, to be reported in a different article.

Look at Vala’s repository history and you will see more “feature” commits than “bindings” ones, contrary to the situation reported by Emmanuel. It would be a good idea to produce a graph of this, but the recent improvements speak for themselves: the situation has improved in recent release cycles.

Let’s look at the repository’s chart. It reports 2000 commits in the last 3 months, 1.1 on average per day, from 101 contributions as of October 19, 2018. I’m at 10 commits over the last year, so I’m far from being a core contributor, but I pushed for ABI stability to become a reality. My main contributions are communicating Vala’s advances and status.

Vala as a language

Vala has its roots as a GObject-oriented language. Currently it works for plain C too, because it can produce C code that does not depend on GLib/GObject.

Programming a C code base using object-oriented programming is easy in Vala, whether the generated C consists of GObject classes or not. You can produce compact classes, as described in Vala’s Manual:

Compact classes, so called because they use less memory per instance, are the least featured of all class types. They are not registered with the GType system and do not support reference counting, virtual methods, or private fields. They do support unmanaged properties. Such classes are very fast to instantiate but not massively useful except when dealing with existing libraries. They are declared using the Compact attribute on the class

In C, a compact class is a simple struct. Its fields can have accessors, so your users just call a get/set method to access them, and additional code can run when they do. The following code:

[Compact]
class Caller {
  public string _name;
  public string name {
    get {
      return _name;
    }
    set {
      _name = value;
    }
  }
  public void call (string number) {}
}

Is translated to:

#include <glib.h>
#include <glib-object.h>
#include <stdlib.h>
#include <string.h>

typedef struct _Caller Caller;
#define _g_free0(var) (var = (g_free (var), NULL))

struct _Caller {
	gchar* _name;
};



void caller_free (Caller * self);
static void caller_instance_init (Caller * self);
void caller_call (Caller* self,
                  const gchar* number);
Caller* caller_new (void);
const gchar* caller_get_name (Caller* self);
void caller_set_name (Caller* self,
                      const gchar* value);


void
caller_call (Caller* self,
             const gchar* number)
{
	g_return_if_fail (self != NULL);
	g_return_if_fail (number != NULL);
}


Caller*
caller_new (void)
{
	Caller* self;
	self = g_slice_new0 (Caller);
	caller_instance_init (self);
	return self;
}


const gchar*
caller_get_name (Caller* self)
{
	const gchar* result;
	const gchar* _tmp0_;
	g_return_val_if_fail (self != NULL, NULL);
	_tmp0_ = self->_name;
	result = _tmp0_;
	return result;
}


void
caller_set_name (Caller* self,
                 const gchar* value)
{
	gchar* _tmp0_;
	g_return_if_fail (self != NULL);
	_tmp0_ = g_strdup (value);
	_g_free0 (self->_name);
	self->_name = _tmp0_;
}


static void
caller_instance_init (Caller * self)
{
}


void
caller_free (Caller * self)
{
	_g_free0 (self->_name);
	g_slice_free (Caller, self);
}

Can you spot possible improvements in the generated C code?

As you can see, Vala generates code to create your struct, to access your fields while running custom code, and to initialize and free your struct. Your C users get a clean and easy-to-use API for your struct, as you would expect.
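
For example, a C consumer of the generated code above could use it like this (a minimal illustration of my own, using only the functions declared in the generated code; the header name and the values are made up):

#include <stdio.h>
#include "caller.h"   /* hypothetical header, e.g. generated with valac -H caller.h */

int
main (void)
{
  Caller *caller = caller_new ();

  caller_set_name (caller, "Alice");                  /* runs the generated setter */
  printf ("calling %s\n", caller_get_name (caller));  /* runs the generated getter */
  caller_call (caller, "555-0100");

  caller_free (caller);   /* frees the _name field and the struct itself */
  return 0;
}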

If you use Vala to access your C struct, also written in Vala, it will be created and freed correctly and automatically.

So,

Less code, more features.

Vala Pushing up Complex projects

GObject/GInterface are a powerful tool, but they are really hard to use in C if you have to write and implement a complex hierarchy of classes and interfaces. W3C DOM4 is a clear example: sure, it is possible to implement it in C, but it is a hundred times easier to implement and maintain in Vala.

GXml and GSVG are examples of how complex a hierarchy can be, and of how one can be implemented in a short time.

GXml’s GomElement hierarchy shows all the interfaces it has to implement to be a DOM4 Element.

Have you ever tried to write lots of small, feature-specific interfaces in C and implement all of them in a single GObject class?

The W3C SVG 1.1 specification has lots of interfaces to write and implement in order to get a fully compliant object class.

Implementing GSVG in Vala made SVG 1.1 possible in a short time; and it doesn’t stop there, because maintaining this complex hierarchy, like turning a final class into a derivable one, or moving methods from a final class to a parent class, including its interface implementations, is a hundred times easier to do in Vala than in plain C/GObject.

Vala makes implementing complex interface hierarchies easy.

While the GSVG implementation is not complete, Vala is the right tool to use for this kind of project.

Vala libraries are Introspectable and usable in other languages

Yes, C GObject classes are introspectable, but lots of annotations are required to produce GIR/typelib files usable from Python or JavaScript.

Vala code produces introspection annotations as you write. Producing the GIR XML file format is just a matter of using the --gir switch. Errors and warnings in Vala are focused on keeping your classes consistently introspectable; for example, you can’t return null from a method declared as returning a non-null value.
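
For illustration, generating the C code, a header, a VAPI and the GIR for a small library could look something like this (file and library names are made up; check valac --help for the exact options in your Vala version):

valac -C --library=caller --gir=Caller-1.0.gir -H caller.h caller.vala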

October 19, 2018

Vala Scripting?

Vala Language Server

I’m working on a library called GNOME Vala Language Server (GVls) as a proof of concept for a server that will provide autocompletion, syntax highlighting and that kind of stuff, but I found something interesting by accident.

I’ve added an interface called Client; that may not be its final name, but it allows you to locate a symbol in an already parsed file, along with some goodness from other interfaces and implementations that I’ll talk about in another article.

The Accident

While trying to create an in-line parser to detect whether the current word is an object or any other kind of symbol, so that in the near future it can propose a list of properties and methods or a list of a method’s parameters, I found by accident that Client can now parse in-line Vala code: you push text to this Client object, and a Server (the object that parses and handles the Symbol collection) parses it and just adds new Symbols to its collection, without any effort or parse errors for incomplete code!

So now if you push the following text:

class Push {\n

the Server now finds a Push symbol representing a new Vala class. Inserting a method:

public void callme () {

you can find in the Server a new symbol for the Push.callme method. No errors at all, even though the code is not complete yet.

Vala Scripting?

Well, not for now. But with this work the doors are opening. This includes Genie, because it has a more script-friendly syntax than Vala. Anyway, let’s leave that to time, and to some others’ help of course.