August 21, 2019

low-memory-monitor: new project announcement

I'll soon be flying to Greece for GUADEC but wanted to mention one of the things I worked on the past couple of weeks: the low-memory-monitor project is off the ground, though not production-ready.

low-memory-monitor, as its name implies, monitors the amount of free physical memory on the system and sends signals to interested user-space applications, usually session managers or sandboxing helpers, when that memory runs low. This makes it possible for applications to shrink their memory footprints before it's too late, either to keep the system usable or to avoid taking a performance hit.

It's similar to Android's lowmemorykiller daemon, Facebook's oomd, and Endless' psi-monitor, amongst others.

Finally, a GLib helper and a Flatpak portal are planned to make it easier for applications to use, with an API similar to iOS's or Android's.
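
Until those helpers exist, here is a rough GLib sketch of what consuming such a signal directly over D-Bus could look like. The bus name, object path, interface and signal name below are placeholders rather than a finalised API, since the project's interface is still settling; the point is only the general shape (subscribe to a bus signal, shrink caches when it fires):

#include <gio/gio.h>

/* Placeholder callback: called when the monitor signals low memory. */
static void
on_low_memory (GDBusConnection *connection,
               const char      *sender_name,
               const char      *object_path,
               const char      *interface_name,
               const char      *signal_name,
               GVariant        *parameters,
               gpointer         user_data)
{
  /* Drop caches, close unused documents, trim memory pools, etc. */
  g_message ("Low memory warning received, shrinking footprint");
}

int
main (void)
{
  GMainLoop *loop = g_main_loop_new (NULL, FALSE);
  GDBusConnection *bus = g_bus_get_sync (G_BUS_TYPE_SYSTEM, NULL, NULL);

  if (bus == NULL)
    return 1;

  /* "org.example.LowMemoryMonitor" etc. are made-up names for this sketch. */
  g_dbus_connection_signal_subscribe (bus,
                                      "org.example.LowMemoryMonitor",
                                      "org.example.LowMemoryMonitor",
                                      "LowMemoryWarning",
                                      "/org/example/LowMemoryMonitor",
                                      NULL,
                                      G_DBUS_SIGNAL_FLAGS_NONE,
                                      on_low_memory, NULL, NULL);

  g_main_loop_run (loop);
  return 0;
}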

Combined with work in Fedora to use zswap and remove the use of disk-backed swap, this should make most workstation uses more responsive and enjoyable.

August 20, 2019

GSoC 2019 Final submission

Since my last blog post, the main merge request of my GSoC project has landed, and I have since followed up with bugfixes and a couple of enhancements to the savestates manager.

I will use this blog post for the final submission as part of my participation in the program. As such this post will be a bit longer than the others but I’ll still keep it visual and populate it with gifs and screenshots 🙂

Overview of the savestates manager

My main merge request

Merge request (MR) !278 is the largest and most important one that I have submitted for this project. The changes it introduces can be summarized as follows:

  • Automatic migration from the old directory layout
  • Backend changes to use the new directory layout and savestates
  • An initial implementation of the UI to manipulate savestates. This UI was capable of saving, loading and deleting savestates.

The most important change was the refactoring of the RetroRunner class, which manages the emulator core used to run games (the core is represented as an instance of a Retro.Core object). Each RetroRunner instance manages a Retro.Core instance for one game, and it now also manages the savestates for that game.

Secondary MR 1) Add the savestates renaming feature

!301 introduces a new enhancement to the savestates manager: it allows the user to rename savestates.

The renaming popover in action

Secondary MR 2) Add shortcuts for using savestates

!305 along with !317 introduce quality of life shortcuts for the savestates manager 🙂

The former MR implements the shortcuts themselves, while the latter (not yet merged) MR lists them in the Shortcuts window.

The shortcuts window with the new savestates shortcuts listed

Secondary MR 3) Flash effect when creating savestates

!310 introduces a pretty flash effect which is played every time a new savestate is created. This effect was inspired by the one used by GNOME Screenshot, and we attempted to mimic its implementation. 🙂

I was very surprised to find out that the flash is actually not implemented by GNOME Screenshot but rather by GNOME Shell. This feature was very interesting to work on, as I had to check the source code of GNOME Shell to find out which interpolation function and duration it used for the animation. Then, using that information, I had to write code that animated the opacity property of the background widget in order to achieve the same effect. We concluded that the initial version of the flash was too bright and might distract the user from the game, so the Games flash has a reduced opacity compared to the GNOME Shell one and therefore appears a bit dimmer.

A demonstration of the flash animation inspired by GNOME Screenshot 🙂
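
For anyone curious about the general approach, here is a sketch of such a flash in plain GTK. This is not Games' actual Vala implementation; the duration, peak opacity and easing curve are arbitrary stand-ins:

#include <gtk/gtk.h>

#define FLASH_DURATION_US (150 * 1000) /* arbitrary duration, in microseconds */

/* Fade a white overlay widget's opacity from a dimmed peak back to 0,
 * using an ease-out curve, roughly like the Shell-style flash. */
static gboolean
flash_tick (GtkWidget *flash, GdkFrameClock *clock, gpointer user_data)
{
  gint64 start = *(gint64 *) user_data;
  gint64 now = gdk_frame_clock_get_frame_time (clock);
  double t = CLAMP ((now - start) / (double) FLASH_DURATION_US, 0.0, 1.0);
  double eased = 1.0 - (1.0 - t) * (1.0 - t); /* ease-out quad */

  gtk_widget_set_opacity (flash, 0.5 * (1.0 - eased)); /* dimmed peak of 0.5 */

  if (t >= 1.0)
    {
      gtk_widget_hide (flash);
      return G_SOURCE_REMOVE;
    }
  return G_SOURCE_CONTINUE;
}

/* Assumes 'flash' is a realized widget stacked over the game view. */
static void
start_flash (GtkWidget *flash)
{
  gint64 *start = g_new (gint64, 1);

  *start = gdk_frame_clock_get_frame_time (gtk_widget_get_frame_clock (flash));
  gtk_widget_set_opacity (flash, 0.5);
  gtk_widget_show (flash);
  gtk_widget_add_tick_callback (flash, flash_tick, start, g_free);
}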

Secondary MR 4) Use a nicer date format for savestates

As I’m writing this post, !319 still needs a bit of work, but it aims to show the savestates dates similarly to how Nautilus shows the Modified date for files. Note that in the screenshot below the dates are of course fabricated (since there were no savestates back in 2018) in order to simply illustrate the various available formats.

Nicer date formats similar to Nautilus’ Modified column

Along with these MRs, there were also several others with bugfixes and small tweaks:

  • !294 Fixed an important bug in which automatic savestates were not pruned properly, and also changed the way new savestate names are generated so that it works similarly to how new file names are generated by the file manager (e.g. if we have savestates “New savestate 1”, “A”, “B” and we create a new savestate, it is called “New savestate 2” rather than “New savestate 4”)
  • !295 Replace the Cancel button with a Back button
  • !296 Remove left and right borders from the savestates menu
  • !297 Fix a UI bug in which the user could apply Load/Delete operations even when there was no savestate selected
  • !298 Fix a UI bug in which the top header bar would cover the savestates menu in fullscreen
  • !299 A quickfix which renamed a method
  • !300 Before this MR the game would unpause before the savestates menu finished its animation, which would confuse the user
  • !303 Fix a small UI bug in which the new savestate row (the one denoted by the + sign) could be selected, which is to be avoided
  • !304 Fix a small issue in which savestates would get saved with a wrong thumbnail
  • !306 These changes are a fix for bugs discovered in !294
  • !307 The Nintendo DS core caused crashes because it wasn’t properly de-initialized
  • !311 When renaming a savestate we iterate through the others to check if there exists one with the same name. This fix ensures that we don’t take the new savestate row into consideration when doing that
  • !312 Some games don’t support savestates (e.g. Steam games) and we must ignore the savestates shortcuts with these games
  • !313 UI fix for the edge case in which a savestate name spans multiple lines
  • !314 If we click the Rename button and the selected row is not visible on the screen then we scroll to it
  • !315 Allow creating multiple savestates per second. Creating savestates rapidly caused runtime errors because the new directories would have the same name based on time.
  • !316 Avoid runtime errors when spamming the Delete button

Possible future improvements to the project

  • Gamepad navigation – Allow the player to use the savestates manager with a gamepad. The difficulty of this task comes from the fact that the number of available buttons that games don’t use is limited.
  • Undo loading – This is a feature common to many emulators and it helps the users in case they accidentally loaded a savestate and lost the progress of their current session.

Review | GSoC 2019

I've been working on a GStreamer-based project under the GNOME Foundation. GStreamer is a pipeline-based multimedia framework that links together a wide variety of media processing systems to complete complex workflows. The framework is based on plugins that provide various codecs and other functionality, and the plugins can be linked and arranged in a pipeline. Most of the plugins are written in C, and the developers are now working to convert them to Rust, which is more robust and easier to maintain. My task was to be a part of this conversion and to help fix issues related to it.

Planned Tasks

  1. Implement a Hyper-based HTTP source around async IO and make it feature-complete (explained in the section below).
    GitLab issue: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/issues/31
  2. Write a tutorial for the gst-plugin-tutorial crate (explained in the section below).
    GitLab issue: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/issues/44
  3. Finish the FLV demuxer. The conversion of the FLV demuxer to Rust is not yet complete; my task was to finish it with feature parity with the C FLV demuxer.
    GitLab issue: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/issues/16

Completed Tasks

We changed the initial plan because the tasks took more time than expected. During the planning phase we did not have a complete picture of the work involved, and along the way we realised the tasks were bigger than anticipated. The status of each task is summarized below.

  1. Implement a Hyper-based HTTP source around async IO and make it feature-complete: partially done.
  2. Tutorial for the gst-plugin-tutorial crate, which has 4 features (Rgb2gray, Sinesrc, Identity, Progressbin): partially done (Rgb2gray and Sinesrc completed).
  3. Finish the FLV demuxer: not done.
  4. Extra task, adding build.rs: fully done.

Breakdown of the completed tasks

Task 1

Adding a build.rs that provides the version number, release date and other metadata to the GStreamer plugin. The idea is to automatically infer useful values for the plugin during the build instead of hard-coding them. I used the git2 crate for this purpose.
Issue: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/issues/5
Merge request: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/merge_requests/91

Task 2

Moving the tutorials to Markdown. Sebastian Dröge has written a series of blog posts on GStreamer plugin development in Rust. My task was to convert them to Markdown and put them in the gst-plugin-tutorial crate, on which the tutorial is based. This crate contains a plugin with 4 features (Rgb2gray, Sinesrc, Identity and Progressbin), but I have only completed the following:
  1. Rgb2gray
  2. Sinesrc
There are blog posts for Rgb2gray and Sinesrc, but the code snippets and some of the explanations there needed updating. I had to update the code snippets to match the latest version of the gst-plugin-tutorial crate, and then bring the descriptions in line with the updated snippets. This task helped me understand how GStreamer plugins work and how to develop them in Rust, and it also let me familiarize myself with Rust, which was a new language to me.
Issue: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/issues/44
Merge request: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/merge_requests/99

Task 3

Making the Rust HTTP source plugin asynchronous. I made the plugin's requests and data reception asynchronous and cancellable. reqwesthttpsrc, the Rust HTTP source, is a plugin which reads data from a remote location specified by a URI; the supported protocols are HTTP and HTTPS. souphttpsrc is the C version of the HTTP source plugin.
Issue: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/issues/31
Merge request: https://gitlab.freedesktop.org/gstreamer/gst-plugins-rs/merge_requests/127

Task 4

Writing tests for the Rust HTTP source reqwesthttpsrc. Although the plugin code was partially built, no tests had been written for any of its features. My mentor, Sebastian Dröge, helped me with the first test and the class it needed. Then we came up with a few test cases and wrote tests for them.

We wrote tests for the following cases (each with its own merge request or commit):

  1. Behavior when the HTTP server returns a Not Found (404) error to the plugin
  2. Seeking to a position after the element has reached the READY state
  3. Seeking after a buffer was already received

More importantly, I found a few bugs in the plugin while writing the tests and corrected them:

  1. HTTP error responses were not handled correctly before. I was able to find that error and correct it.

Task 5

The Rust HTTP source was not feature-complete; it had only a few of the features of souphttpsrc, the C version of the HTTP source. One of my goals was to implement the missing properties in reqwesthttpsrc. Below are the properties I implemented.

  1. is-live: a Boolean property. If it's set to true, the plugin acts as a live source. Live sources are sources that discard data when paused, such as audio or video capture devices.
  2. timeout: sets a timeout on the request. By default the timeout is 15 seconds.
  3. user-agent: User-Agent is an HTTP request header field used to identify where a request originates from. This property sets the User-Agent header to a custom string; the default value is "GStreamer reqwesthttpsrc".
  4. user-id: user id for authenticating against the HTTP location URI (used together with user-pw).
  5. user-pw: password for authenticating against the HTTP location URI.
  6. automatic-redirect: when the status code is 3xx, follow the redirect link. Still not completed (2019.08.20).

There are a few more properties and features, such as context sharing, still to be implemented in the Rust HTTP source in order to make the plugin feature-complete. The missing properties are:
  1. compress
  2. cookies
  3. extra-headers
  4. iradio-mode
  5. keep-alive
  6. method
  7. proxy
  8. proxy-id
  9. proxy-pw
  10. retries
  11. ssl-ca-file
  12. ssl-strict
  13. ssl-use-system-ca-file
Once these properties and the context sharing feature are complete, reqwesthttpsrc (the Rust version) can be used instead of souphttpsrc (the C version).
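
As a quick illustration of how an application would use the element, here is a minimal sketch that creates a reqwesthttpsrc instance and configures some of the properties discussed above. The "location" property name is an assumption carried over from souphttpsrc; exact names and types should be checked with gst-inspect-1.0:

#include <gst/gst.h>

int
main (int argc, char **argv)
{
  gst_init (&argc, &argv);

  /* Returns NULL if the Rust plugin is not installed. */
  GstElement *src = gst_element_factory_make ("reqwesthttpsrc", "source");
  if (src == NULL)
    return 1;

  /* Configure some of the properties implemented during the project. */
  g_object_set (src,
                "location", "https://example.com/stream",
                "user-agent", "my-app/1.0",
                "timeout", 30,
                "is-live", FALSE,
                NULL);

  /* ... link into a pipeline as usual ... */

  gst_object_unref (src);
  return 0;
}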

August 15, 2019

Musings on the Microsoft Component Firmware Update (CFU) Protocol

CFU is a new specification from Microsoft designed to be the one true protocol for updating hardware. No vendor seems to be shipping hardware supporting CFU (yet?), although I’ve had two peripheral vendors ask my opinion which is why I’m posting here.

CFU has a bizarre pre-download phase before sending the firmware to the microcontroller, so the uC can check whether the firmware is required and compatible. CFU also requires devices to be able to transfer the entire new firmware while in runtime mode. The pre-download “offer” allows the uC to check any sub-components attached (e.g. other devices attached to the SoC) and forces it to do dependency resolution in case sub-components have to be updated in a specific order.

Pushing the dep resolution down to the uC means the uC has to do all the version comparisons and also know all the logic with regard to protocol incompatibilities. You could be in a position where the uC firmware needs to be updated so that it “knows” about the new protocol restrictions, which are needed to update the uC and the things attached in the right order in a subsequent update. If we always update the uC to the latest, the probably-factory-default running version doesn’t know about the new restrictions.

The other issue with this is that the peripheral is unaware of the other devices in the system, so it couldn't, for instance, install a new firmware version only for newer builds of Windows. Something that we support in fwupd is being able to restrict the peripheral device firmware to a specific SMBIOS CHID or a system firmware vendor, which lets vendors solve the “same hardware in different chassis, with custom firmware” problem. I don’t see how that could be possible using CFU unless I misunderstand the new .inf features. All the dependency resolution should be in the metadata layer (e.g. in the .inf file) rather than being pushed down to the hardware running the old firmware.

What is possibly the biggest failure I see is the doubling of flash storage required to do a runtime transfer, the extra power budget of being woken up to process the “offer” and enough bulk power to stay alive if “unplugged” during an A/B swap. Realistically it’s an extra few dollars for an ARM uC to act as a CFU “bridge” for legacy silicon and IP, which I can’t see as appealing to an ODM given they make other strange choices just to save a few cents on a BOM. I suppose the CFU “bridge” could also do firmware signing/encryption, but then you still have a physical trace on the PCB with easy-to-read/write unsigned firmware. CFU could have defined a standardized way to encrypt and sign firmware, but they kinda handwave it away, letting the vendors do what they think is best, and we all know how that plays out.

CFU downloads in runtime mode, but from experience, most devices can transfer a few hundred Kb in less than ~200ms. Erasing flash is typically the slowest operation, at under 2s, with writing next at ~1s, both done in the bootloader phase. I’ve not seen a single device that can do a flash-addr-swap to be able to do the A/B solution they’ve optimized for, with the exception of enterprise UEFI firmware, which CFU can’t update anyway.

By far the longest process in the whole update step is the USB re-enumeration (up to twice), which we have to allow 5s (!!!) for in fwupd due to slow hubs and other buggy hardware. So, CFU doubles the flash size requirement for millions of devices to save ~5 seconds on a procedure which might be done once or twice in the device's lifetime. It’s also not the transfer that’s the limitation, even over Bluetooth: if the dep resolution is “higher up” you only need to send the firmware to the device when it needs an update, rather than every time you scan the device.

I’m similarly unimpressed with the no-user-interaction idea where firmware updates just happen in the background, as the user really needs to know when the device is going to disappear and re-appear for 5 seconds (even CFU has to re-enumerate…) — imagine it happening during a presentation or as the machine is about to have the lid shut to go into S3.

So, tl;dr: not a fan, but we could support it in fwupd if required.

Another layer

Five years (and change) ago I was looking at the data types and API that were needed to write a 3D-capable scene graph API; I was also learning about SIMD instructions and compiler builtins on IA and ARM, as well as a bunch of math I didn’t really study in my brushes with formal higher education. The result was a small library called Graphene.

Over the years I added more API, moved the build system from Autotools over to Meson, and wrote a whole separate library for its test suite.

In the meantime, GStreamer started using Graphene in its GL element; GTK 3.9x is very much using Graphene internally and exposing it as public API; Mutter developers are working on reimplementing the various math types in their copies of Cogl and Clutter using Graphene; and Alex wrote an entire 3D engine using it.

Not bad for a side project.

Of course, now I’ll have to start maintaining Graphene like a proper grownup, which also means reporting its changes, bug fixes, and features when I’m at the end of a development cycle.

While the 1.8 development cycle consisted mostly of bug fixes with no new API, there have been a few major internal changes during the development cycle towards 1.10:

  • I rewrote the Euler angles conversion to and from quaternions and matrices; the original implementation I cribbed from here and there was not really adequate, and broke pretty horribly when you tried to roundtrip from Euler angles to a transformation matrix and back. This also affected the conversion between Euler angles and quaternions. The new implementation is more correct, and as a side effect it now includes not just the Tait–Bryan angles, but also the classic Euler angles. All possible orders are available in both the intrinsic and extrinsic axes variants.
  • We’re dealing with floating point comparison and with infinities a bit better, now; this is usually necessary because the various vector implementations may have different behaviour, depending on the toolchain in use. A shout out goes to Intel, who bothered to add an instruction to check for infinities only in AVX 512, making it pointless for me, and causing a lot more grief than necessary.
  • The ARM NEON implementation of graphene_simd4f_t has been fixed and tested on actual ARM devices (an old Odroid I had lying around for ARMv7 and a Raspberry Pi3 for Aarch64); this means that the “this is experimental” compiler warning has been removed. I still need to run the CI on an ARM builder, but at least I can check if I’m doing something dumb, now.
  • As mentioned in the blog posts above, the whole test suite has been rewritten using µTest, which dropped a dependency on GLib; you still need GLib to get the integration with GObject, but if you’re not using that, Graphene should now be easier to build and test.

On the API side:

  • there are a bunch of new functions for graphene_rect_t, courtesy of Georges Basile Stavracas Neto and Marco Trevisan
  • thanks to Marco, the graphene_rect_round() function has been deprecated in favour of the more reliable graphene_rect_round_extents()
  • graphene_quaternion_t gained new operators, like add(), multiply() and scale()
  • thanks to Alex Larsson, graphene_plane_t can now be transformed using a matrix
  • I added equality and near-equality operators for graphene_matrix_t, and a getter function to retrieve the translation components of a transformation matrix
  • I added interpolation functions for the 2, 3, and 4-sized vectors
  • I’m working on exposing the matrix decomposition code for Gthree, but that requires some untangling of messy code, so it’ll be in the next development snapshot.

On the documentation side:

  • I’ve reworked the contribution guide, and added a code of conduct to the project; doesn’t matter how many times you say “patches welcome” if you also aren’t clear on how those patches should be written, submitted, and reviewed, and if you aren’t clear on what constitutes acceptable behaviour when it comes to interactions between contributors and the maintainer
  • this landed at the tail end of 1.8, but I’ve hopefully clearly documented the conventions of the matrix/matrix and matrix/vector operations, to the point that people can use the Graphene API without necessarily having to read the code to understand how to use it

This concludes the changes that will appear with the next 1.10 stable release, which will be available by the time GNOME 3.34 is out. For the time being, you can check out the latest development snapshot available on GitHub.

I don’t have many plans for the future, to be quite honest; I’ll keep an eye out for what GTK and Gthree need, and I expect that once Mutter starts using Graphene I’ll start receiving bug reports.

One thing I did try was moving to a “static” API reference using Markdeep, just like I did for µTest, to drop yet another dependency; sadly, since we need to use gtk-doc annotations for the GObject introspection data generation, we’re going to depend on gtk-doc for a little while longer.

Of course, if you are using Graphene and you find some missing functionality, feel free to open an issue, or a merge request.

August 12, 2019

Flathub, brought to you by…

Over the past 2 years Flathub has evolved from a wild idea at a hackfest to a community of app developers and publishers making over 600 apps available to end-users on dozens of Linux-based OSes. We couldn’t have gotten anything off the ground without the support of the 20 or so generous souls who backed our initial fundraising, and to make the service a reality since then we’ve relied on the contributions of dozens of individuals and organisations such as Codethink, Endless, GNOME, KDE and Red Hat. But for our day-to-day operations, we depend on the continuous support and generosity of a few companies who provide the services and resources that Flathub uses 24/7 to build and deliver all of these apps. This post is about saying thank you to those companies!

Running the infrastructure

Mythic Beasts Logo

Mythic Beasts is a UK-based “no-nonsense” hosting provider who provide managed and un-managed co-location, dedicated servers, VPS and shared hosting. They are also conveniently based in Cambridge where I live, and very nice people to have a coffee or beer with, particularly if you enjoy talking about IPv6 and how many web services you can run on a rack full of Raspberry Pis. The “heart” of Flathub is a physical machine donated by them which originally ran everything in separate VMs – buildbot, frontend, repo master – and they have subsequently increased their donation with several VMs hosted elsewhere within their network. We also benefit from huge amounts of free bandwidth, backup/storage, monitoring, management and their expertise and advice at scaling up the service.

Starting with everything running on one box in 2017 we quickly ran into scaling bottlenecks as traffic started to pick up. With Mythic’s advice and a healthy donation of 100s of GB / month more of bandwidth, we set up two caching frontend servers running in virtual machines in two different London data centres to cache the commonly-accessed objects, shift the load away from the master server, and take advantage of the physical redundancy offered by the Mythic network.

As load increased and we brought a CDN online to bring the content closer to the user, we also moved the Buildbot (and its associated Postgres database) to a VM hosted at Mythic in order to offload as much IO bandwidth as possible from the repo server, to keep up sustained HTTP throughput during update operations. This helped significantly, but we are in discussions with them about a yet larger box with a mixture of disks and SSDs to handle the concurrent read and write load that we need.

Even after all of these changes, we keep the repo master on one, big, physical machine with directly attached storage because repo update and delta computations are hugely IO intensive operations, and our OSTree repos contain over 9 million inodes which get accessed randomly during this process. We also have a physical HSM (a YubiKey) which stores the GPG repo signing key for Flathub, and it’s really hard to plug a USB key into a cloud instance, and know where it is and that it’s physically secure.

Building the apps

Our first build workers were under Alex’s desk, in Christian’s garage, and a VM donated by Scaleway for our first year. We still have several ARM workers donated by Codethink, but at the start of 2018 it became pretty clear within a few months that we were not going to keep up with the growing pace of builds without some more serious iron behind the Buildbot. We also wanted to be able to offer PR and test builds, beta builds, etc — all of which multiplies the workload significantly.

Packet Logo

Thanks to an introduction by the most excellent Jorge Castro and the approval and support of the Linux Foundation’s CNCF Infrastructure Lab, we were able to get access to an “all expenses paid” account at Packet. Packet is a “bare metal” cloud provider — like AWS except you get entire boxes and dedicated switch ports etc to yourself – at a handful of main datacenters around the world with a full range of server, storage and networking equipment, and a larger number of edge facilities for distribution/processing closer to the users. They have an API and a magical provisioning system which means that at the click of a button or one method call you can bring up all manner of machines, configure networking and storage, etc. Packet is clearly a service built by engineers for engineers – they are smart, easy to get hold of on e-mail and chat, share their roadmap publicly and set priorities based on user feedback.

We currently have 4 Huge Boxes (2 Intel, 2 ARM) from Packet which do the majority of the heavy lifting when it comes to building everything that is uploaded, and also use a few other machines there for auxiliary tasks such as caching source downloads and receiving our streamed logs from the CDN. We also used their flexibility to temporarily set up a whole separate test infrastructure (a repo, buildbot, worker and frontend on one box) while we were prototyping recent changes to the Buildbot.

A special thanks to Ed Vielmetti at Packet who has patiently supported our requests for lots of 32-bit compatible ARM machines, and for his support of other Linux desktop projects such as GNOME and the Freedesktop SDK who also benefit hugely from Packet’s resources for build and CI.

Delivering the data

Even with two redundant / load-balancing front end servers and huge amounts of bandwidth, OSTree repos have so many files that if those servers are too far away from the end users, the latency and round trips cause a serious problem with throughput. In the end you can’t distribute something like Flathub from a single physical location – you need to get closer to the users. Fortunately the OSTree repo format is very efficient to distribute via a CDN, as almost all files in the repository are immutable.

Fastly Logo

After a very speedy response to a plea for help on Twitter, Fastly – one of the world’s leading CDNs – generously agreed to donate free use of their CDN service to support Flathub. All traffic to the dl.flathub.org domain is served through the CDN, and automatically gets cached at dozens of points of presence around the world. Their service is frankly really really cool – the configuration and stats are really powerful, unlike any other CDN service I’ve used. Our configuration allows us to collect custom logs which we use to generate our Flathub stats, and to define edge logic in Varnish’s VCL which we use to allow larger files to stream to the end user while they are still being downloaded by the edge node, improving throughput. We also use their API to purge the summary file from their caches worldwide each time the repository updates, so that it can stay cached for longer between updates.

To get some feeling for how well this works, here are some statistics: the Flathub main repo is 929 GB, of which 73 GB are static deltas and 1.9 GB are screenshots. It contains 7280 refs for 640 apps (plus runtimes and extensions) over 4 architectures. Fastly is serving the dl.flathub.org domain fully cached, with a cache hit rate of ~98.7%. Averaging 9.8 million hits and 464 GB downloaded per hour, Flathub uses between 1-2 Gbps sustained bandwidth depending on the time of day. Here are some nice graphs produced by the Fastly management UI (the numbers are per-hour over the last month):

Graph showing the requests per hour over the past month, split by hits and misses.
Graph showing the data transferred per hour over the past month.

To buy the scale of services and support that Flathub receives from our commercial sponsors would cost tens if not hundreds of thousands of dollars a month. Flathub could not exist without Mythic Beasts, Packet and Fastly’s support of the free and open source Linux desktop. Thank you!

August 11, 2019

West Coast Hackfest – Summary

Sorry this was supposed to have gone out some weeks ago and I lazed it up. Blame it on my general resistance to blogging. :-)

This year, I helped organize West Coast Hackfest with my stalwart partner and friend Teresa Hill in Portland – with assistance from Kristi Progi. Big thanks to them for helping to make this a success!

Primarily the engagement hackfest was focused on the website content. The website is showing its age and needs both a content update and a facelift. Given our general focus on engagement, we want to re-envision the website to drive that engagement as a medium for volunteer capture, identity, and fundraising.

The three days of the engagement hackfest were spent going through each of the various pages and pointing out issues with the content and what should be fixed. Fixing them is a little bit problematic, as the content is generally not available in WordPress but embedded in the theme, which few people have access to. Another focus will be opening up that content and finding alternatives to create content without having to touch the theme at all.

Our observations going through them are as follows:

  • Our website doesn't actually identify what we are as a project and what we work on. (e.g. the word desktop doesn't show up anywhere on our website)
  • There is no emotional connection for newcomers who want to know what GNOME is, what our values are
  • We have old photos from 6-7 years ago that need to be updated.
  • The messaging that we have developed within the engagement team is not reflected on the website and should be updated accordingly
  • We have items on our technologies that are no longer maintained like Telepathy
  • We have new items on our technology page that need to be added
  • We have outdated links to social media (e.g. G+ should no longer exist)

Our tour of the website has shown how out of date it is, and it is clear that it is not part of the engagement process. One of the things we will talk about at GUADEC is managing content and visuals on the website as part of the engagement team activity. We have an opportunity to really find new ways to connect with our users, volunteers, and donors and reach out to potential new folks through the philanthropy and activism in Free Software that we do.

I would like to thank the GNOME Foundation for providing the resources and infrastructure to have us all here.

The plan for West Coast Hackfest is to continue to expand its participation in the U.S. As a U.S.-based non-profit, we have a responsibility to expand our mission in the United States as part of our Foundation activities. While we have been quite modest this year, we hope to grow even larger next year, making this another vehicle like GUADEC: a meeting place for users, maintainers, designers, documenters and everyone else.

If you are interested in hosting West Coast Hackfest – (we’ll call it something else – suggestions?) then please get in touch with Kristi Progi and myself. We would love to hear from you!

Towards a happy ending

Since my last blog post I have kept on working on my GSoC project while also making preparations for attending the GUADEC 2019 conference.

The main merge request of my project has been merged into the master branch of Games, and I have since followed up with a couple more bugfixes for bugs that my mentor noticed after the merge.

A very mean bug 😦

A very important bug/regression that I’m glad we were able to fix is the Nintendo DS crash that was caused by introducing the savestates manager. We couldn’t have included this feature in a release as long as this bug was present: it was a major regression, since the user wasn’t even able to start a Nintendo DS game at all.

The bug was caused by the fact that we used to re-instantiate the emulator core every time we started the game or loaded a new savestate. This didn’t cause any problems with other cores, but it seems that the Nintendo DS core didn’t like it.

The lucky fix \o/

Initially I had no idea how to approach this issue, and I also doubted that re-instantiating was the cause of the problem, because it worked perfectly fine with the other cores. Nevertheless, I decided to try reusing a single core instance, and I was very surprised, happy and relieved to see that it actually fixed the problem completely: we are now able to play Nintendo DS games with savestates. Now, when loading a savestate, the core is stopped, its state is changed and then the core is resumed.

A NintendoDS game being played with savestates 🙂

Last effort

There is still one week left of GSoC, and during this time, with one final effort, I will attempt to implement savestate renaming before submitting the final version of my project.

My next blog post will be after the GUADEC 2019 Conference at which I will have a lightning talk about my project and hopefully get to meet some of the people reading this post 🙂

On responsible vulnerability disclosure

Recently KDE had an unfortunate event. Someone found a vulnerability in the code that processes .desktop and .directory files, through which an attacker could create a malicious file that causes shell command execution (analysis). They went for immediate, full disclosure, where KDE didn't even get a chance to fix the bug before it was published.

There are many protocols for disclosing vulnerabilities in a coordinated, responsible fashion, but the gist of them is this:

  1. Someone finds a vulnerability in some software through studying some code, or some other mechanism.

  2. They report the vulnerability to the software's author through some private channel. For free software in particular, researchers can use Openwall's recommended process for researchers, which includes notifying the author/maintainer and distros and security groups. Free software projects can follow a well-established process.

  3. The author and reporter agree on a deadline for releasing a public report of the vulnerability, or in semi-automated systems like Google Zero, a deadline is automatically established.

  4. The author works on fixing the vulnerability.

  5. The deadline is reached; the patch has been publicly released, the appropriate people have been notified, systems have been patched. If there is no patch, the author and reporter can agree on postponing the date, or the reporter can publish the vulnerability report, thus creating public pressure for a fix.

The steps above gloss over many practicalities and issues from the real world, but the idea is basically this: the author or maintainer of the software is given a chance to fix a security bug before information on the vulnerability is released to the hostile world. The idea is to keep harm from being done by not publishing unpatched vulnerabilities until there is a fix for them (... or until the deadline expires).

What happened instead

Around the beginning of July, the reporter posts about looking for bugs in KDE.

On July 30, he posts a video with the proof of concept.

On August 3, he makes a Twitter poll about what to do with the vulnerability.

On August 4, he publishes the vulnerability.

KDE is left with having to patch this in emergency mode. On August 7, KDE releases a security advisory in perfect form:

  • Description of exactly what causes the vulnerability.

  • Description of how it was solved.

  • Instructions on what to do for users of various versions of KDE libraries.

  • Links to easy-to-cherry-pick patches for distro vendors.

Now, distro vendors are, in turn, in emergency mode, as they must apply the patch, run it through QA, release their own advisories, etc.

What if this had been done with coordinated disclosure?

The bug would have been fixed, probably in the same way, but it would not be in emergency mode. KDE's advisory contains this:

Thanks to Dominik Penner for finding and documenting this issue (we wish however that he would have contacted us before making the issue public) and to David Faure for the fix.

This is an extremely gracious way of thanking the reporter.

I am not an infosec person...

... but some behaviors in the infosec sphere are deeply uncomfortable to me. I don't like it when security "research" is hard to tell from vandalism. "Excuse me, you left your car door unlocked" vs. "Hey everyone, this car is unlocked, have at it".

I don't know the details of the discourse in the infosec sphere around full disclosure against irresponsible vendors of proprietary software or services. However, KDE is free software! There is no need to be an asshole to them.

August 09, 2019

Networks Of Trust: Dismantling And Preventing Harassment

Purism’s David Seaward recently posted an article titled Curbing Harassment with User Empowerment. In it, they posit that “user empowerment” is the best way to handle harassment. Yet, many of their suggestions do nothing to prevent or stop harassment. Instead they only provide ways to allow a user to plug their ears as it occurs.

Trusting The Operator

David Seaward writes with the assumption that the operator is always untrustworthy. But, what if the operator was someone you knew? Someone you could reach out to if there were any issues, who could reach out to other operators? This is the case on the Fediverse, where Purism’s Librem Social operates. Within this system of federated networks, each node is run by a person or group of people. These people receive reports in various forms. In order to continue to be trusted, moderators of servers are expected to handle reports of spam, hate speech, or other instances of negative interactions from other services. Since the network is distributed, this tends to be sustainable.

In practice, this means that as a moderator my users can send me things they’re concerned by, and I can send messages to the moderators of other servers if something on their server concerns me or one of my users. If the operator of the other node breaches trust (e.g. not responding, expressing support for bad actors) then I can choose to defederate from them. If I as a user find that my admin does not take action, I can move to a node that will take action. The end result is that there are multiple layers of trust:

  • I can trust my admins to take action
  • My admins can trust other admins to take action

This creates a system where, without lock-in, admins are incentivized to respond to things in good faith and in the best interests of their users.

User Empowerment And Active Admins

The system of trust above does not conflict with Purism’s goal of user empowerment. In fact, these two systems need to work together. Providing users tools to avoid harassment works in the short term, but admins need to take action to prevent harassment in the long term. There’s a very popular saying: with great power comes great responsibility. When you are an admin, you have both the power and responsibility to prevent harassment.

To continue using the fediverse for this discussion, there are two ways harassment occurs in a federated system:

  1. A user on a remote instance harasses people
  2. A user on the local instance harasses people

When harassment occurs, it comes in various forms like harassing speech, avoiding blocks, or sealioning. In all cases and forms, the local admin is expected to listen to reports and handle them accordingly. For local users, this can mean a stern warning or a ban. For remote users, the response could range from contacting the remote admin to blocking that instance. Some fediverse software also supports blocking individual remote accounts. Each action helps prevent the harasser from further harming people on your instance or other instances.

Crowdsourcing Does Not Solve Harassment

One solution David proposes in the article is crowdsourced tagging. Earlier in the article he mentions that operators can be untrustworthy, but trusting everyone to tag things does not solve this. In fact, this can contribute to dogpiling and censorship. Let’s use an example to illustrate the issue. A trans woman posts about her experience with transphobia, and how transphobic people have harmed her. Her harassers can see this post, and tag it with “#hatespeech”. They tell their friends to do it too, or use bots. This now means anyone who filters “#hatespeech” would have her post hidden – even people that would have supported her. Apply this for other things and crowdsourced tagging can easily become a powerful tool to censor the speech of marginalized people.

Overall, I’d say Purism needs to take a step back and review their stance to moderation and anti-harassment. It would do them well if they also took a minute to have conversations with the experts they cite.

C++ modules with a batch compiler feasibility study

In our previous blog post an approach for building C++ modules using a batch compiler model was proposed. The post raised a lot of discussion on Reddit. Most of it was of the quality one would expect, but some interesting questions were raised.

How much work for compiler developers?

A fair issue that was raised is that the batch approach means adding new functionality to the compiler. This is true, but not particularly interesting as a standalone statement. The real question is how much more work it is, especially compared to the work needed to support other module implementation methods.

In the batch mode there are two main pieces of work. The first is starting compiler tasks, detecting when they freeze due to missing modules and resuming them once the modules they require have been built. The second one is calculating the minimal set of files to recompile when doing incremental builds.

The former is the trickier one because no-one has implemented it yet. The latter is a well known subject; build systems like Make and Ninja have done it for 40+ years. To test the former I wrote a simple feasibility study in Python. What it does is generate 100 source files containing modules that call each other and then compile them all in the manner a batch compiler would. There is no need to scan the contents of the files; the system will automatically detect the correct build order or error out if it can not be done.

Note that this is just a feasibility study experiment. There are a lot of caveats. Please go through the readme before commenting. The issue you want to raise may already be addressed there. Especially note that it only works with Visual Studio.

The code that is responsible for running the compile is roughly 60 lines of Python. It is conceptually very simple. A real implementation would need to deal with threading, locking and possibly IPC, which would take a fair bit of work.
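
For flavour, the core of that scheduling loop (not the author's Python script, just a rough reimplementation of the same idea) can be sketched as follows. The simplification here is that each task's imports are listed up front, whereas in the real design a compiler job discovers them itself, pauses, and is resumed later:

#include <cstdio>
#include <set>
#include <string>
#include <vector>

// One compilation task: the modules its source imports and the module
// (if any) it provides once compiled.
struct Task {
    std::string source;
    std::set<std::string> imports;
    std::string provides;
    bool done = false;
};

// Keep running whatever can make progress; a task becomes runnable once
// every module interface it imports has been produced.
bool compile_all(std::vector<Task> &tasks) {
    std::set<std::string> built;
    bool progress = true;
    while (progress) {
        progress = false;
        for (Task &t : tasks) {
            if (t.done)
                continue;
            bool ready = true;
            for (const std::string &m : t.imports)
                if (!built.count(m)) { ready = false; break; }
            if (!ready)
                continue; // stays "frozen" until its imports exist
            std::printf("compiling %s\n", t.source.c_str());
            if (!t.provides.empty())
                built.insert(t.provides);
            t.done = true;
            progress = true;
        }
    }
    for (const Task &t : tasks)
        if (!t.done)
            return false; // unresolvable: cycle or missing module
    return true;
}

int main() {
    std::vector<Task> tasks = {
        {"b.cpp", {"A"}, "B"},
        {"a.cpp", {}, "A"},
        {"main.cpp", {"A", "B"}, ""},
    };
    return compile_all(tasks) ? 0 : 1;
}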

The script does not support incremental builds. I'd estimate that getting a minimal version going would take something like 100-200 lines of Python.

I don't have any estimates on how much work this would mean on a real compiler code base.

The difference to scanning

A point raised in the Reddit discussion is that there is an alternative approach that uses richer communication between the build system and the compiler. If you go deeper you may find that the approaches are not that different after all. It's more of a question of how the pie is sliced. The scanning approach looks roughly like this:


In this case the build system and compiler need to talk to each other, somehow, via some sort of a protocol. This protocol must be implemented by all compilers and all build tools, and they must all interoperate. It must also remain binary stable and all that. There is a proposal for this protocol. The specification is already fairly long and complicated, especially considering that it supports versioning, so future versions may be different in arbitrary ways. This has an ongoing maintenance cost of unknown size. This also complicates distributed builds, because it is common for the build system and the compiler to reside on different machines or even data centres, so setting up an IPC channel between them may take a fair bit of work.

The architectural diagram of the batch compiler model looks like this:


Here the pathway for communicating module information is a compiler implementation detail. Every compiler can choose to implement it in the way that fits their internal compiler implementation the best. They can change its implementation whenever they wish because it is never exposed outside the compiler. The only public facing API that needs to be kept stable is the compiler command line, which all compilers already provide. This also permits them to ship a module implementation faster, since there is no need to fully specify the protocol and do interoperability tests. The downside is that the compiler needs to determine which sources to recompile when doing incremental builds.

Who will eventually make the decision?

I don't actually know. But probably it will come down to what the toolchain providers (GCC, Clang, Visual Studio) are willing to commit to.

It is my estimate (which is purely a guess, because I don't have first hand knowledge) that the batch system would take less effort to implement and would present a smaller ongoing maintenance burden. But compiler developers are the people who would actually know.

Gaming with GThree

The last couple of weeks I’ve been on holiday, and I spent some of that time hacking on gthree. Gthree is a port of three.js, and a good way to get some testing of it is to port a three.js app. Benjamin pointed out HexGL, a WebGL racing game similar to F-Zero.

This game uses a bunch of cool features like shaders, effects, sprites, particles, etc, so it was a good target. I had to add a bunch of features to gthree and fix some bugs, but it’s now at a state where it looks pretty cool as a demo. However, it needs more work to be playable as a game.

Check out this screenshot:

Or this (lower resolution) video:

If you’re interested in playing with it, the code is on GitHub. It needs the latest git versions of graphene and gthree to build.

I hope to have a playable version of this for GUADEC. See you there!

Ubuntu 18.04.3 LTS is out, including GNOME stable updates and Livepatch desktop integration

Ubuntu 18.04.3 LTS has just been released. As usual with LTS point releases, the main changes are a refreshed hardware enablement stack (newer versions of the kernel, xorg & drivers) and a number of bug and security fixes.

For the Desktop, newer stable versions of GNOME components have been included as well as a new feature: Livepatch desktop integration.

For those who aren’t familiar, Livepatch is a service which applies critical kernel patches without rebooting. The service is available as part of an Ubuntu Advantage subscription, but it is also made available for free to Ubuntu users (up to 3 machines). Fixes are downloaded and applied to your machine automatically to help reduce downtime and keep your Ubuntu LTS systems secure and compliant. Livepatch is available for your servers and your desktops.

Andrea Azzarone worked on desktop integration for the service and his work finally landed in the 18.04 LTS.

To enable Livepatch you just need an Ubuntu One account. The setup is part of the first login or can be done later from the corresponding software-properties tab.

Here is a simple walkthrough showing the steps and the result:

The wizard displayed during the first login includes a Livepatch step that will help you get signed in to Ubuntu One and enable Livepatch:

Clicking the ‘Set Up’ button invites you to enter your Ubuntu One information (or to create an account) and that’s all that is needed.

The new desktop integration includes an indicator showing the current status and notifications telling when fixes have been applied.

You can also get more details on the corresponding CVEs from the Livepatch configuration UI.

You can always hide the indicator using the toggle if you prefer to keep your top panel clean and simple.

Enjoy the increased security in between reboots!

August 08, 2019

Age rating data for Flathub apps

OARS (Open Age Ratings Service) defines a scheme to include content rating information in apps’ AppData/AppStream file. GNOME Software and similar tools use this metadata to show age ratings for applications. In Endless OS, we also support restricting which applications a given user can install based on this data – see this page, and the reports it links to, for a bit more information about this feature and its future.

Screenshot: “Age Rating: 7. The application was rated this way because it features: Characters in aggressive conflict easily distinguishable from reality”

Every new app on Flathub should include OARS metadata, but there are many existing apps which don’t have this data, so it’s not (yet) enforced at build time. Edit: Bartłomiej tells me that it has been enforced at build time for a little over a month. (See App Requirements and AppData Guidelines on the Flathub wiki for some more information on what’s required and recommended; this branch of appstream-glib is modified to enforce Flathub’s policies.) My colleague Andre Magalhaes crunched the data and opened a tracker task for Flathub apps without OARS metadata.[1] This information is much more useful if it can be relied upon to be present.

If you’re familiar with an app on this list, generating the OARS data is a simple process: open this generator in your browser, answer some questions about the app, and receive some XML. The next step is to put that OARS data into the AppData.[2] Take a look at the app’s Flathub repo and check whether it has an .appdata.xml file.

If it doesn’t, then the app’s AppData must be maintained upstream. Great! Find the upstream repository for the project, and send a merge request there. (Here’s one I sent earlier, for D-Feet.) You can either add the same patch to the Flathub packaging (as I did for D-Feet) or wait for a new upstream release and then update the Flathub packaging to that version (as I also did for D-Feet, a couple of days later).

If the appdata is maintained in the Flathub repo, make the relevant changes there directly. (Here’s a PR I opened for Tux, of Math Command while writing this post.) Ideally, the appdata would make its way upstream, but there are a fair few apps on Flathub which do not have active upstreams.
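
In either case, the change itself is just dropping the generated content_rating block into the app’s appdata, inside the component element. The ids and values below are made up for illustration; yours will depend on how you answered the generator’s questions:

<content_rating type="oars-1.1">
  <content_attribute id="violence-cartoon">mild</content_attribute>
  <content_attribute id="social-chat">intense</content_attribute>
</content_rating>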

You might well find that the appdata requirements have become more strict about things other than OARS since the app was last updated, and these will have to be fixed too. In both the example cases above, I had to add release information, which has become mandatory for Flathub apps since these were last updated.

  1. We have a similar list for our in-house apps.
  2. If you’re the upstream maintainer for the app, you probably already know how to do all this, and can stop reading here!

libfprint 1.0 (and fprintd 0.9.0)

After more than a year of work libfprint 1.0 has just been released!

It contains a lot of bug fixes for a number of different drivers, which makes it a good update for any stable or unstable release of your OS.

There was a small ABI break between versions 0.8.1 and 0.8.2, which means that any dependency (really just fprintd) will need to be recompiled. That works out well, seeing as we also have a new fprintd release, which fixes a number of bugs of its own.

Benjamin Berg will take over maintenance and development of libfprint with the goal of having a version 2 in the coming months that supports more types of fingerprint readers that cannot be supported with the current API.

From my side, the next step will be some much needed modernisation for fprintd, both in terms of code as well as in the way it interacts with users.

August 07, 2019

Pango 1.44 wrap-up

In my last post discussing changes in Pango 1.44, I asked for feedback. We’ve received some; thanks to everybody who reported issues!

We tried to address some of the fallout in several follow-up releases. I’ll do a 1.44.4 release with the last round of fixes before too long.

Here is a summary.

Bitmap fonts

As expected, not supporting Type 1 and BDF fonts anymore is an unwelcome change for people whose favorite fonts are in these formats.

Clearly, a robust conversion script would be a very good thing to have; people have had mixed success with fontforge-based scripts (see this issue). I hope that we can get some help from the font packager community with this.

One follow-up fix that we did here is to make sure that Pango’s font enumeration code does not return fonts in formats that we don’t support. This makes font fallback work to replace bitmap fonts, and helps to avoid ‘black box’ output.

Subpixel positioning

Font rendering is a sensitive topic; every change here is likely to upset some people (in particular those with carefully tuned font setups).

We did not help things by enabling subpixel positioning unconditionally in Pango, when it is only supported in cairo master. When used with the released cairo, this leads to unpleasantly uneven glyph placement. Even with cairo master, some cairo backends have not been updated to support subpixel positioning (e.g. win32, xcb).

To address this problem, subpixel positioning is now optional, and off by default. Use

pango_context_set_round_glyph_positions (context, FALSE)

to turn it on.

Even without subpixel positioning, there are still small differences in glyph positioning between Pango 1.43 and 1.44. These are caused by differences in glyph extent calculations between cairo and harfbuzz; see this issue for the ongoing discussion.

API changes

I was a bit overzealous in my attempt to reduce our dependency on freetype when I changed the return type of pango_fc_font_lock_face() to gpointer. This is a harmless change for the C API, but it broke some users of Pango in C++. The next release will have the old return type back.

Line spacing

Another new feature that turned out to be better off being off by default is the new line spacing. In the initial 1.44 release, it was on by default, causing line spacing UIs (e.g. in the GIMP) to stop working, which is not acceptable. It is now off by default. Call

pango_layout_set_line_spacing (layout, factor)

to enable it.

Hyphenation

We’ve received one bug report pointing out that hyphens could be confusing in some contexts, for example when breaking filenames. As a consequence, there is now a text attribute to suppress the insertion of hyphens.
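
Assuming the attribute in question is the insert-hyphens one added in 1.44, suppressing hyphen insertion for an existing PangoLayout looks roughly like this (treat the exact constructor name as something to double-check against the 1.44 documentation):

PangoAttrList *attrs = pango_attr_list_new ();

/* Ask Pango not to insert hyphens at line breaks in this layout. */
pango_attr_list_insert (attrs, pango_attr_insert_hyphens_new (FALSE));
pango_layout_set_attributes (layout, attrs);
pango_attr_list_unref (attrs);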

Miscellaneous bugs

Naturally, some bugs crept in; there were some crash fixes, and some hyphens got inserted in the wrong place (such as: hyphens after hyphens, or hyphens after spaces). These were easy.

One bug that took me a while to track down was lines growing higher when they are ellipsized, causing misrendering. It turned out to be a mixup with text attributes that led us to pick the wrong font for the ellipsis character. This will be fixed in the next release.

Building C++ modules, take N+1

Modules were voted into C++20 some time ago. They are meant to be a replacement for #include statements to increase build speeds and to also isolate translation units so that, for example, macros defined in one file do not affect the contents of another file. There are three major compilers and each of them has its own prototype implementation available (GCC documentation, Clang documentation, VS documentation).

As you would expect, all of these implementations are wildly different and, in the grand C++ tradition, byzantinely complicated. None of them has a really good solution, either, to the biggest problem of C++ modules, namely that of dependency tracking. A slightly simplified but mostly accurate description of the problem goes like this:

Instead of header files, all source code is written in one file. It contains export statements that describe what functions can be called from the outside. An analogy would be that functions declared as exported would be in a public header file and everything else would be internal and declared in an internal header file (or would be declared static or similar). The module source can not be included directly, instead when you compile the source code the compiler will output an object file and also a module interface file. The latter is just some sort of a binary data file describing the module's interface. An import statement works by finding this file and reading it in.

If you have file A that defines a module and file B that uses it, you need to first fully compile file A and only after the module interface file has been created can you compile file B. Traditionally C and C++ files can be compiled in parallel because everything needed to compile each file is already in the header files. With modules this is no longer the case. If you have ever compiled Fortran and this seems familiar, it's because it is basically the exact same architecture.

Herein lies the problem

The big, big problem is how to determine the order in which to build the sources. Just looking at the files is not enough; you seemingly need to know something about their contents. At least the following approaches have been toyed with:
  1. Writing the dependencies between files manually in Makefiles. Yes. Really. This has actually been put forth as a serious proposal.
  2. First scan the contents of every file, determine the interdependencies, write them out to a separate dependency file and then run the actual build based on that. This requires parsing the source files twice and it has to be done by the compiler rather than a regex because you can define modules via macros (at least in VS currently).
  3. When the compiler finds a module import it can not resolve, it asks the build system via IPC to generate one. Somehow.
  4. Build an IPC mechanism between the different compiler instances so they can talk to each other to determine where the modules are. This should also work between compilers that are in different data centers when doing distributed builds.
Some of these approaches are better than others but all of them fail completely when source code generators enter the picture, especially if you want to build the generator executable during the build (which is fairly common). Scanning all file contents at the start of the build is not possible in this case, because some of the source code does not yet exist. It only comes into existence as build steps are executed. This is hideously complicated to support in a build system.

Is there a better solution?

There may well be, though I'd like to emphasize that none of the following has actually been tested and that I'm not a compiler developer. The approach itself does require some non-trivial amount of work on the compiler, but it should be less than writing a full blown IPC mechanism and distributed dataflow among the different parts of the system.

At the core of the proposed approach is the realization that not all module dependencies between files are the same. They can be split into two different types. This is demonstrated in the following diagram that has two targets: a library and an executable that uses it.


As you can see the dependencies within each target can get fairly complicated. The dependencies between targets can be just as complicated, but they have been left out of the picture to keep it simple. Note that there are no dependency cycles anywhere in the graph (this is mandated by the module specification FWICT). This gives us two different kinds of module dependencies: between-targets module dependencies and within-targets module dependencies.

The first one of these is actually fairly simple to solve. If you complete all compilations (but not the linking step) of the dependency library before starting any compilations in the executable, then all library module files that the executable could possibly need are guaranteed to exist. This is easy to implement with e.g. a Ninja pseudotarget.

The second case is the difficult one and leads to all the scanning problems and such discussed above. The proposed solution is to slightly change the way the compiler is invoked. Rather than starting one process per input file, we do something like the following:

g++ <other args> --outdir=somedir [all source files of this target]

What this means conceptually is that the compiler needs to take all the input files and compile each of them. Thus file1.cpp should end up as somedir/file1.o and so on. In addition it must deal with this target's internal module interrelations transparently behind the scenes. When run again it must detect which output files are up to date and rebuild only the outdated ones.

One possible implementation is that the compiler may launch one thread per input file (but no more than there are CPUs available). Each compilation proceeds as usual but when it encounters a module import that it can not find, it halts and waits on, say, a condition variable. Whenever a compilation job finishes writing a module, it will signal all the other tasks that a new module is available. Eventually either all jobs finish or every remaining task is deadlocked because they need a module that can't be found anywhere.
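
To make that coordination idea a bit more concrete, here is a rough sketch of the bookkeeping such a compiler could keep internally. This is purely illustrative (plain C with pthreads, not code from any real compiler): workers publish module interfaces as they produce them, block when they hit an unresolved import, and the build fails if every remaining worker ends up waiting.

#include <pthread.h>
#include <stdbool.h>
#include <string.h>

#define MAX_MODULES 256

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    const char     *ready[MAX_MODULES]; /* module interfaces written so far */
    int             n_ready;
    int             runnable;           /* workers not blocked and not finished */
} ModuleRegistry;

void registry_init (ModuleRegistry *r, int n_workers)
{
    pthread_mutex_init (&r->lock, NULL);
    pthread_cond_init (&r->cond, NULL);
    r->n_ready = 0;
    r->runnable = n_workers;
}

static bool contains (ModuleRegistry *r, const char *name)
{
    for (int i = 0; i < r->n_ready; i++)
        if (strcmp (r->ready[i], name) == 0)
            return true;
    return false;
}

/* A worker calls this when it hits an import it cannot resolve yet.
 * Returns false if every worker is blocked, i.e. the import is unresolvable. */
bool registry_wait_for (ModuleRegistry *r, const char *name)
{
    bool ok = true;
    pthread_mutex_lock (&r->lock);
    r->runnable--;
    if (r->runnable == 0)
        pthread_cond_broadcast (&r->cond); /* let other waiters detect the deadlock */
    while (!contains (r, name)) {
        if (r->runnable == 0) { ok = false; break; }
        pthread_cond_wait (&r->cond, &r->lock);
    }
    if (ok)
        r->runnable++;
    pthread_mutex_unlock (&r->lock);
    return ok;
}

/* A worker calls this right after writing a module interface file
 * (bounds checking omitted for brevity). */
void registry_publish (ModuleRegistry *r, const char *name)
{
    pthread_mutex_lock (&r->lock);
    r->ready[r->n_ready++] = name;
    pthread_cond_broadcast (&r->cond);     /* wake everyone stuck on imports */
    pthread_mutex_unlock (&r->lock);
}

/* A worker calls this when its translation unit is fully compiled. */
void registry_finish (ModuleRegistry *r)
{
    pthread_mutex_lock (&r->lock);
    r->runnable--;
    if (r->runnable == 0)
        pthread_cond_broadcast (&r->cond);
    pthread_mutex_unlock (&r->lock);
}

A nice property of this scheme is that the deadlock case doubles as the error report: if it is ever hit, some import genuinely cannot be satisfied within this target.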

This approach is similar to the IPC mechanism described on GCC's documentation but it is never exposed to any third party program. It is fully an internal implementation detail of the compiler and as such there are no security risks or stability requirements for the protocol.

With this approach we can handle both internal and external module dependencies reliably. There is no need to scan the sources twice or write complicated protocols between the compiler and the build system. This even works for generated sources without any extra work, which no other proposed approach seems to be able to do.

As an added bonus the resulting command line API is so simple it can even be driven with plain Make.

Extra bits

This approach also permits one to do ZapCC style caching. Since compiler arguments for all sources within one target must be the same under this scheme (which is a good thing to do in general), imports and includes can be potentially shared between different compiler tasks. Even further, suppose you have a type that is used in most sources like std::vector<std::string>.  Normally the instantiated and compiled code would need to be written in every object file for the linker to eventually deduplicate. In this case, since we know that all outputs will go to the same target it is enough to write the code out in one object file. This can lead to major build footprint reductions. It should also reduce the linker's memory usage since there is a lot less debug info to manage. In most large projects linking, rather than compiling, is the slow and resource intensive step so making it faster is beneficial.

The module discussions have been going around in circles about either "not wanting to make the compiler a build system" or "not wanting to make the build system into a compiler". This approach does neither. The compiler is still a compiler, it just works with slightly bigger work chunks at a time. It does need to track staleness of output objects, though, which it did not need to do before.

There needs to be some sort of a load balancing system so that you don't accidentally spawn N compile jobs each of which spawns N internal work threads.

If you have a project that consists of only one executable with a gazillion files and you want to use a distributed build server then this approach is not the greatest. The solution to this is obvious: split your project into independent logical chunks. It's good design and you should do it regardless.

The biggest downside of this approach that I could come up with was that CCache will probably no longer work without a lot of effort. But if modules make compilation 5-10× faster (which is the commonly given estimate; there are no public independent measurements yet) then it could be worth it.

Making Rust HTTP source Feature equivalent - Part 1 | GSoC 2019

souphttpsrc is the C HTTP source plugin of GStreamer. Making reqwesthttpsrc feature equivalent to souphttpsrc is a very important part of the conversion. Although the Rust HTTP source works well, it is not widely used yet because it is not equivalent to the C HTTP source.

For now, only one property from the C HTTP source is implemented apart from the ones that come from the base class: 'location'. It sets the URL to read. For example:

gst-launch-1.0 reqwesthttpsrc location=https://www.google.com ! fakesink dump=true

I introduced two more properties to the Rust HTTP source. Let's all give a warm welcome to 'is-live' and 'user-agent'. I had to go through the C code of the plugin to understand those properties and see where they are used in the plugin.

'is-live' was an easy point to start with because the implementation was straightforward. It is a Boolean property. If it's set to true, the plugin acts as a live source. What is a live source? I was wondering about that for some time myself. Live sources are sources that discard data when paused, such as audio or video capture devices. A typical live source also produces data at a fixed rate and thus provides a clock to publish this rate. A live source does not produce data in the PAUSED state.

Example usage using gst-launch

> gst-launch-1.0 reqwesthttpsrc location=https://www.google.com is-live=true ! fakesink dump=true

'user-agent' is the next property I added to reqwesthttpsrc. User-Agent is an HTTP request header field that identifies where the request originates from. This property sets the User-Agent header to a custom string; the default value is "GStreamer reqwesthttpsrc".

Example usage

> gst-launch-1.0 reqwesthttpsrc location=https://www.google.com user-agent="Hi, I am new here" ! fakesink dump=true

We can capture the request and check the user-agent header, which is going to be "Hi, I am new here". There are a few more properties still missing in the Rust HTTP source; the list below shows them.


  1. automatic-redirect: when the status code is 3xx, follow the redirect link. Go to this link for implementing this property using reqwest.
  2. compress: allow compressed content encodings. Go to this link for implementing this property using reqwest.
  3. cookies: HTTP request cookies.
  4. extra-headers: extra headers to append to the HTTP request.
  5. iradio-mode: enable internet radio mode (ask the server to send shoutcast/icecast metadata interleaved with the actual stream data).
  6. keep-alive: use HTTP persistent connections.
  7. method: use GET or HEAD requests.
  8. proxy, proxy-id, proxy-pw: HTTP proxy server related properties.
  9. retries: number of retries before giving up.
  10. ssl-ca-file, ssl-strict, ssl-use-system-ca-file: SSL related properties.
  11. timeout: seconds to timeout a blocking I/O. Go to this link for implementing this property using reqwest.
  12. user-id: HTTP location URI user id for authentication. Go to this link for implementing this property using reqwest.
  13. user-pw: HTTP location URI user password for authentication.

Other than these properties, there are some more features that should be implemented in the Rust HTTP source, for example the HTTP context sharing feature. Stay tuned for more posts :)

August 06, 2019

Please welcome: NetworkManager 1.20

Another three months have passed since NetworkManager’s 1.18, and 1.20 is now available. What follows is a quick overview of what’s new.

We’re dropping some old cruft

Yes, the line diff compared to the previous major release, NetworkManager 1.18, is negative!

The libnm-glib library, deprecated in favor of libnm since NetworkManager 1.0 release almost five years ago, was dropped. At this point it’s almost certain to have no users.

If you’re developing a program that has anything to do with network configuration, libnm is the way to go. You can also use it from languages other than C via GObject introspection — just check out our examples.
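
As a quick taste, here is a minimal, self-contained C example that connects to the daemon and prints its version (the file name is made up; build it with something like cc nm-version.c $(pkg-config --cflags --libs libnm)):

#include <NetworkManager.h>

int main (void)
{
    GError *error = NULL;
    NMClient *client = nm_client_new (NULL, &error);

    if (client == NULL) {
        g_printerr ("Could not talk to NetworkManager: %s\n", error->message);
        g_error_free (error);
        return 1;
    }

    /* NMClient exposes devices, connections and more; the version is the
     * simplest thing to query. */
    g_print ("NetworkManager version: %s\n", nm_client_get_version (client));

    g_object_unref (client);
    return 0;
}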

Also gone is the settings plugin for use with iBFT. For those who don’t know: iBFT is the way for the boot firmware to pass the network configuration it has used to the operating system. It was rather unlike the other settings plugins — its role was to create a single virtual connection profile so that NetworkManager wouldn’t tear down the network configuration applied by the early boot firmware. This doesn’t mean we no longer support network-booted installations. Quite the opposite: since the last release we support configuring the network in early boot with NetworkManager, and we preserve configuration done outside NetworkManager without the need for a placeholder connection profile.

Vendor-supplied profiles and scripts

Distributions can now ship connection profiles and dispatcher scripts in /usr/lib/NetworkManager. This is convenient for stateless systems that ship a read-only system image or are reset to a “factory” state by wiping /etc clean. Note that users can still modify and delete the vendor-supplied connection profiles via the D-Bus API. In that case, NetworkManager overlays the read-only files with profiles in /etc or /run.

Refer to file-hierarchy(7) manual page to learn more about the directory layout of a typical Linux installation.

Improved D-Bus API

The Settings.AddConnection2() call was added as a future-proof version of the AddConnection() and AddConnectionUnsaved() calls, which turned out to lack extensibility. In addition to the original functionality, it allows suppressing the autoconnect feature when adding a connection. This is similar to the already existing Settings.Connection.Update2(), which supports a similar flag. Settings.Connection.Update2() also got a new “no-reapply” flag to suppress immediately applying changes to “connection.zone” and “connection.metered”.

Using the older API? Worry not, we’re not removing it. We’re committed to maintain a stable API and not ever break things deliberately.

More small things

There have been over seven hundred commits since the last stable release. That means this article cannot possibly be comprehensive, but here are some of the most interesting small bits:

The daemon restart got more robust. In particular, in-memory connections are saved in /run, making it possible for them to survive restarts.

Wi-Fi Mesh networks are supported, if you are lucky enough to have hardware that supports it.

The internal DHCP client is now used by default. It’s smaller and faster than dhclient, which was used previously. NetworkManager supports multiple different DHCP clients, and users and distributors can change the default.
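
For instance, keeping dhclient should be a one-line change in NetworkManager.conf (a sketch; see the NetworkManager.conf manual page for the authoritative syntax):

# /etc/NetworkManager/NetworkManager.conf
[main]
dhcp=dhclient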

For those who wish to learn more, there’s always a more exhaustive NEWS file.

Thanks

We’d like to thank everyone who made NetworkManager 1.20 possible. In particular Andy Kling, who contributed the Wi-Fi Mesh support and Tom Gundersen who improved the internal DHCP client. Thomas Haller reviewed and improved the article.

August 05, 2019

App Grid in GNOME Shell

GNOME Shell is the cornerstone of the GNOME experience. It is the part of the system where the vast majority of user interactions takes place. Windows are managed by it. Launching and closing applications as well. Workspaces, running commands, seeing the status of your system — GNOME Shell covers pretty much everything.

One interesting aspect of GNOME Shell is how it deals with launching applications. Currently, it organizes application launchers in three different places:

  • The Dash, which contains both favorite and running applications
  • The “Frequent” tab, which shows a small selection of the most used applications
  • The “All” tab, which shows all applications (that may be inside folders)

Let’s focus on the “All” tab for now.

The “All” tab

The “All” tab contains a list of all installed applications on your system. These applications may be inside folders. GNOME has a default set of folders and applications that aims to minimally organize it, and avoid displaying applications that may be too specialized for regular usage.

What many do not know is that the folders are customizable. Unfortunately, GNOME Shell does not provide any means within it to manage these folders. So far, the only (officially supported) way to do that is through GNOME Software!

Furthermore, it is a pity that only the smallest of the application launchers — the Dash — is customizable. There is only so much that can be put in there!

New Ideas

During the London UX Hackfest, in 2017, GNOME designers and developers had many interesting ideas about different ways to organize GNOME Shell’s UI elements. Letting designers create freely, without having to consider toolkit limitations or time constraints, can produce wonderful results!

It is interesting to notice that many of these ideas floated around the concept of a user-customizable application grid.

In fact, such an application grid exists in Endless OS (which itself is loosely inspired by how smartphones do it), and our user research has shown that it improves discoverability. New users presented with Endless OS can easily and quickly navigate through the OS.

Managing icons in GNOME Shell

For the past few weeks, I’ve been working on a way to manage application icons and folders in GNOME Shell itself.

The implementation is different from what we have at Endless, and fits GNOME’s current UI paradigm better. Let’s check it out!

Naturally, it’s possible to create folders from GNOME Shell:

Notice how the folder name was automatically picked from the shared categories between the applications.

Folders are automatically deleted when emptied:

In addition to that, custom positions are also supported:

And of course drag between pages:

Keep in mind that none of this is merged yet. It’s still under design and code review, and testing!

OMG what have you done i hate it lolol

Fear not! This first iteration of the feature is implemented in such a way that if you don’t customize the icon grid, it will continue to work exactly as it used to. That way, we can still deliver a consistent experience and reduce friction as we pursue further changes in GNOME Shell.

The Roadmap

We are trying to land this before the feature freeze in 3.34, in which case you’ll get it in the next GNOME release. In case that doesn’t happen, it will appear early in the 3.36 development cycle. Either way, discussion and refinement will continue, so between 3.34 and 3.36 what you see in the videos above might change significantly.

The design team is already considering how we might fully use the potential of the customizable icon grid feature in a future iteration of GNOME Shell, and I’m looking forward to seeing what comes out of that discussion!

Fortunately, thanks to my employer Endless, I was able to work full time on these changes, and will be able to work on future changes in GNOME Shell as well.

This was all made possible by Endless.

Review of the Igalia Multimedia team Activities (2019/H1)

This blog post takes a look back at the various Multimedia-related tasks the Igalia Multimedia team was involved in during the first half of 2019.

GStreamer Editing Services

Thibault added support for the OpenTimelineIO open format for editorial timeline information. Having many editorial timeline information formats supported by OpenTimelineIO reduces vendor lock-in in the tools used by video artists in post-production studios. For instance a movie project edited in Final Cut Pro can now be easily reimported in the Pitivi free and open-source video editor. This accomplishment was made possible by implementing an OpenTimelineIO GES adapter and formatter, both upstreamed respectively in OpenTimelineIO and GES by Thibault.

Another important feature for non-linear video editors is nested timeline support. It allows teams to decouple big editing projects into smaller chunks that can later be assembled for the final product. Another use case is pre-filling the timeline with boilerplate scenes, so that an initial version of the movie can be assembled before all teams involved in the project have provided the final content. To support this, Thibault implemented a GES demuxer which transparently enables playback support for GES files (through file://path/to/file.xges URIs) in any GStreamer-based media player.
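
In practice this means a GES project file can be handed to a playbin-based player like any other URI. Here is a minimal C sketch (the .xges path is made up, and it assumes the GES plugin providing the demuxer is installed):

#include <gst/gst.h>

int main (int argc, char *argv[])
{
    gst_init (&argc, &argv);

    GstElement *player = gst_element_factory_make ("playbin", NULL);
    g_object_set (player, "uri", "file:///home/user/project.xges", NULL);
    gst_element_set_state (player, GST_STATE_PLAYING);

    /* Block until the project finishes playing or an error occurs. */
    GstBus *bus = gst_element_get_bus (player);
    GstMessage *msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
                                                  GST_MESSAGE_EOS | GST_MESSAGE_ERROR);
    if (msg != NULL)
        gst_message_unref (msg);

    gst_element_set_state (player, GST_STATE_NULL);
    gst_object_unref (bus);
    gst_object_unref (player);
    return 0;
}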

As if this wasn’t impressive enough yet, Thibault greatly improved the GES unit tests, fixing a lot of memory leaks and race conditions and generally improving the reliability of the test suite. This is very important because the GitLab continuous integration now executes the test harness for every submitted merge request.

For more information about this, the curious readers can dive into Thibault’s blog post. Thibault was invited to talk about these on-going efforts at SIGGRAPH during the OpenTimelineIO BOF.

Finally, Thibault is mentoring Swayamjeet Swain as part of the GSoC program; the project is about adding nested timeline support to Pitivi.

GStreamer VA-API

Víctor performed a good number of code reviews for GStreamer-VAAPI contributors. He has also started investigating GStreamer-VAAPI bugs specific to the AMDGPU Gallium driver, in order to improve the support of AMD hardware in multimedia applications.

As part of the on-going GStreamer community efforts to improve continuous integration (CI) in the project, we purchased Intel and AMD powered devices aimed to run validation tests. Running CI on real end-user hardware will help ensure regressions remain under control.

Servo and GStreamer-rs

As part of our on-going involvement in the Servo Mozilla project, Víctor has enabled zero-copy video rendering support in the GStreamer-based Servo-media crate, along with bug fixes in Glutin and finally in Servo itself.

A prior requirement for this major milestone was to enable gst-gl in the GStreamer Rust bindings and after around 20 patches were merged in the repository, we are pleased to announce Linux and Android platforms are supported. Windows and macOS platforms will also be supported, soon.

The following screencast shows how hardware-accelerated video rendering performs on Víctor’s laptop:

WebKit’s Media Source Extensions

As part of our on-going collaboration with various device manufacturers, Alicia and Enrique validated the Youtube TV MSE 2019 test suite in WPEWebKit-based products.

Alicia developed a new GstValidate plugin to improve GStreamer pipelines testing, called validateflow. Make sure to read her blog post about it!

In her quest to further improve MSE support, especially seek support, Alicia rewrote the GStreamer source element we use in WebKit for MSE playback. The code review is on-going in Bugzilla and on track for inclusion in WPEWebKit and WebKitGTK 2.26. This new design of the MSE source element requires the playbin3 GStreamer element and at least GStreamer 1.16. Another feature we plan to work on in the near future is multi-track support; stay tuned!

WebKit WebRTC

We announced LibWebRTC support for WPEWebKit and WebKitGTK one year ago. Since then, Thibault has been implementing new features and fixing bugs in the backend. Bridging between the WebAudio and WebRTC backend was implemented, allowing tight integration of WebAudio and WebRTC web apps. More WebKit WebRTC layout tests were unskipped in the buildbots, allowing better tracking of regressions.

Thibault also fixed various performance issues on Raspberry Pi platforms during apprtc video-calls. As part of this effort he upstreamed a GstUVCH264DeviceProvider in GStreamer, allowing applications to use already-encoded H264 streams from webcams that provide it, thus removing the need for applications to encode raw video streams.

Additionally, Thibault upstreamed a device provider for ALSA in GStreamer, allowing applications to probe for Microphones and speakers supported through the ALSA kernel driver.

Finally, Thibault tweaked the GStreamer encoders used by the WebKit LibWebRTC backend, in order to match the behavior of the Apple implementation and also fixing a few performance and rendering issues on the way.

WebKit GStreamer Multimedia maintenance

The whole team is always keeping an eye on WebKit’s Bugzilla, watching out for multimedia-related bugs reported by community members. Charlie recently fixed a few annoying volume bugs along with an issue related to YouTube.

I rewrote the WebKitWebSrc GStreamer source element we use in WebKit to download HTTP(S) media resources and feed the data to the playback pipeline. This new element is now based on the GStreamer pushsrc base class, instead of appsrc. A few seek-related issues were fixed on the way but unfortunately some regressions also slipped in; those should all be fixed by now and shipped in WPEWebKit/WebKitGTK 2.26.

An initial version of the MediaCapabilities backend was also upstreamed in WebKit. It allows web-apps (such as Youtube TV) to probe the user-agent for media decoding and encoding capabilities. The GStreamer backend relies on the GStreamer plugin registry to provide accurate information about the supported codecs and containers. H264 AVC1 profile and level information are also probed.

WPEQt

Well, this isn’t directly related with multimedia, but I finally announced the initial release of WPEQt. Any feedback is welcome. QtWebKit has been phased out, and if anyone out there relies on it for web-apps embedded in native QML apps, now is the time to try WPEQt!

It’s 2019, and I’m starting an email mailing list.

As I’m resurfacing, looking at my writing backlog and seeing over a dozen blog posts to finish and publish in the coming months, I’m thinking that now would be a good time to offer readers a way to be notified of new publications without having to manually check my website all the time or to use specialized tools. So, I’m starting a notification mailing list (a.k.a. “newsletter”).

What kind of topics will be covered?

In the past, my blog has mostly been about technology (particularly Free and Open-Source software) and random discoveries in life. Here are some examples of previous blog posts:

In the future, I will likely continue to cover technology-related subjects, but also hope to write more often on findings and insights from “down to earth” businesses I’ve worked with, so that you can see more than just a single industry.

Therefore, my publications will be about:

  • business (management, growth, entrepreneurship, market positioning, public relations, branding, etc.);
  • society (sustainability, social psychology, design, public causes, etc. Not politics.);
  • technology;
  • life & productivity improvement (“lifehacking”).

If you want to subscribe right away and don’t care to read about the whole “why” context, here’s a form for this 👉

Otherwise, keep reading below 👇; the rest of this blog post explains the logic behind all this (why a newsletter in this day and age), and answers various questions you might have.

“Why go through all that effort, Jeff?”

The idea here is to provide more convenience for some of my readers. It took me a long time to decide to offer this, as I’ll actually be spending more effort (and even money) managing this, going the extra mile to provide relevant information, and sometimes providing information that is not even on the blog.

Why bother with this? For a myriad of reasons:

  • It allows keeping in touch with my readership in a more intimate manner
  • It allows providing digests and reminders/retrospectives, from where people can choose to read more, effectively allowing “asynchronous” reading. If I were to do blog retrospectives on the blog, I think that might dilute the contents and get boring pretty fast.
  • It gives me an idea of how many people are super interested in what I’m publishing (which can be quite motivating)
  • It lets me cover all my publishing channels at once: the blog, my YouTube channel, etc.
  • It gives people the opportunity to react and interact with me more directly (not everybody wants to post a public comment, and my blog automatically disables commenting on older posts to prevent spam).

“But… Why email?!”

I realize it might look a bit surprising to start a newsletter in 2019—instead of ten years ago—but it probably is more relevant now than ever and, with experience, I am going to do a much better job at it than I would’ve a decade ago.

In over 15 years of blogging, I’ve seen technologies and social networks come and go. One thing hasn’t changed in that timeframe, however: email.

Email certainly has its faults but is the most pervasive and enduring distributed communication system out there, built on open standards. Pretty much everyone uses it. We were using email in the previous millennium, we’re using it today, and I suspect we’ll keep using it for a good long while.

“Don’t we have RSS/Atom feeds already?”

While I’m a big fan of syndication feeds (RSS/Atom) and using a feed reader myself (Liferea), these tools and technologies had their popularity peak around a decade ago and have remained a niche, used mostly by journalists and computer geeks.

  • Nobody around me in the “real world” uses them, and most people struggle to understand the concept and its benefits.
  • Even most geeks are unaware of feed syndication. Before I fully grasped what the deal with RSS was, I spent some years creating a GNOME desktop application to watch web pages for me. Ridiculous, I know!
  • And even then, many people prefer not having to use a dedicated application for this.

So, while I’m always going to keep the feeds available on my blog, I realize that most people prefer subscribing via email.

“What about social media?”

Social media creates public buzz, but doesn’t have the same usefulness and staying power.

  • As a true asynchronous medium, email provides convenience and flexibility for the reader compared to the evanescent nature of The Vortex. An email is personal, private, can be filed and consumed later, easily retrieved, unlike the messy firehose that is social media.
  • Social media is evanescent, both in content and in platforms:
    • Social networks are firehoses; they tend to be noisy (because they want to lure you into the Vortex), cluttered, chaotic, un-ordered. They are also typically proprietary and centralized in the hands of big corporations that mine your data & sociopsychological profile to sell it to the highest bidder.
    • There is no guarantee that any given social network is going to last more than a couple years (remember Google+? Or how Facebook used to be cool among youngsters until parents and aunts joined and now people are taking refuge in Twitter/Snapchat/Instagram/whatever?).
  • FLOSS & decentralized social networks? That doesn’t help reach normal people; those platforms barely attract 0.0125% of the population.
  • Instant messaging and chatrooms? Same issues. Besides, there are too many damned messaging systems to count these days (IRC, Signal, FB Messenger, WhatsApp, Telegram, Discord, Snapchat, Slack, Matrix/Riot/Fractal, oh my… stop this nonsense Larry, why don’t you just give me a call?), to the point where some are just leaving all that behind to fallback on email.

Like my blog and website, my mailing list will still be useful and available to me as the years pass. You can’t say that “with certainty” of any of the current social platforms out there.

What’s the catch? 🤔

There is no catch.

  • You get a summary of my contents delivered to your mailbox every now and then, to be read when convenient, without lifting a finger.
  • I probably get more people to enjoy my publications, and that makes me happy. Sure, it’s more work for me, but hey, that’s life (you can send me a fat paycheck to reward me, if you want 😎)

This mailing list is private and owned by me as an individual, and I am not selling your info to anybody. See also my super amazing privacy policy if you care.

I won’t email too often (maybe once a month or per quarter, I suspect), because I’ve got a million things on my plate already. We’ll see how it goes. Subscribing is voluntary, and you can unsubscribe anytime if you find me annoying (hopefully not).

Questions? Comments/feedback? Suggestions? Feel free to comment on this blog post or… send me an email 😉

The post It’s 2019, and I’m starting an email mailing list. appeared first on The Open Sourcerer.

GSoC: Things I've been doing and what I learned until now

The second month of Google Summer of Code passed quickly. Last weeks I’ve been working on my markers code. My early implementation, while functional, needed a lot of cleaning, refactoring and refining to fit into Pitivi. Mathieu Duponchell and Alexandru Băluț have been guiding me through this process.

In GES I expanded GESMarkerList with new signals, wrote new tests and replaced some unusual structures with ones more common in GES.

In Pitivi I added a new module with the markers logic, ‘markers.py’. Roughly speaking, we now have the class MarkersBox, which is a GTK.EventBox containing a GESMarkerList and a GTK.Layout on which markers are placed. The class Marker is also a GTK.EventBox, so we have a widget for every GESMarker, which allows moving, removing and selecting markers. The class MarkerPopover provides a popover menu to edit the metadata of every marker. I also implemented undo and redo actions.

The process of rewriting a lot of my previous code has been hard and challenging. I knew that my original code wasn’t clear or optimized, but I wasn’t sure exactly how to improve it. It meant learning and applying some concepts that weren’t clear to me. While it was hard work, it felt like a rewarding and fundamental learning experience.

For future GSoC students I would like to recap some of the things and concepts I have learned or worked on with Pitivi. Maybe this would be useful for someone:

  • MVC:

Pitivi follows the Model-View-Controller pattern. In my early code I was binding together the View and Controller parts. For example, removing a marker: first I called the method to remove a marker, and after that I had a line of code to redraw the removed marker. This would make it impossible to use other controllers.

To change this I had to implement new signals in GESMarkerList, for removed and moved markers. Then in Pitivi I wrote handlers for these signals which update the View. So, every time the Model is updated with a ‘remove marker’, it emits a signal and the handler updates the View.

  • UNDO/REDO ACTIONS:

Undo/redo actions are alternative controllers, and alternative controllers need MVC. Without the signals I wrote previously, my undo/redo actions couldn’t work. The class MarkerObserver monitors what happens with markers. It has handlers for every GESMarkerList signal. With every signal (add, remove and move), a new action is created and added to an action log. Every action has a method to implement undo and redo.

One complicated thing was that when a marker is removed and the user wants to undo that action, a new marker has to be created. This brings a new problem: if the user then wants to undo earlier actions, the reference to the original marker gets lost, and previous actions in the stack don’t know about the last marker created to substitute the original one.

Luckily, Aleb guided me to the UndoableAutomaticObjectAction class, which takes care of these situations, but it took me a while to discover how it works.

  • GTK:

Pitivi uses GTK, a toolkit for creating graphical user interfaces. And not everything is evident from the GTK documentation.

To display markers we chose to use CSS styles. I wanted the markers’ appearance to change while they are hovered or selected, but it was hard to make that work with EventBoxes. While it works just fine with GTK.Button, it took me time to discover why it wasn’t working in our case. After some talk in the GTK channel we discovered that we needed to manually update the state flags on our widget:

def do_enter_notify_event(self, unused_event):
    self.set_state_flags(Gtk.StateFlags.PRELIGHT, clear=False)

def do_leave_notify_event(self, unused_event):
    self.unset_state_flags(Gtk.StateFlags.PRELIGHT)

Slightly out of context, but this code is what made our CSS work.

  • TIME AND COMMUNICATION:

These things took me more time than I would like. Being a beginner programmer isn’t easy; there are multiple factors that can block you. It’s hard to know whether the thing blocking you is trivial or hard. Also, the desire to resolve all the problems on my own often resulted in a slower and less efficient pace. In my case, all my problems appeared in a new light when I talked with my mentors.

  • THE DEVIL IS IN THE DETAILS:

When I read quickly through the code, I get only a loose idea of what it does. Stack Overflow gives me the false impression that I understand everything in a hurry. Most of the time these approaches lead me to get stuck.

If I read carefully, line by line, taking my time to understand, to explain things to myself, to ask questions, everything starts to make sense and flow. The same problem appears when I try to go for a big task without breaking it into subtasks. And I noticed that when I have a good understanding of the little parts, it is easier to refine the code later.

Chafa 1.2.0: Faster than ever, now with 75% more grit

For all you terminal graphics connoisseurs out there (there must be dozens of us!), I released Chafa 1.2.0 this weekend. Thanks to embedded copies of some parallel image scaling code and the quite excellent libnsgif, it’s faster and better in every way. What’s more, there are exciting new dithering knobs to further mangle refine your beautiful pictures. You can see what this stuff looks like in the gallery.

Included is also a Python program by Mo Zhou that uses k-means clustering to produce optimal glyph sets from training data. Neat!

Thanks to all the packagers, unsung heroes of the F/OSS world. Shoutouts go to Michael Vetter (openSUSE) and Guy Fleury Iteriteka (Guix) who got in touch with package info and installation instructions.

The full release notes are on GitHub.

What’s next

I’ve been asked about sixel support and some kind of interactive mode. I think both are in the cards… In the meantime, here’s a butterfly¹.

A very chafa butterfly

¹ Original via… Twitter? Tumblr? Imgur? Gfycat? I honestly can’t remember.

August 02, 2019

Portlandia

Petr and I had an hour to kill on the final evening of the West Coast Hackfest. He’d heard about a local statue that needed to be seen (and photographed to prove we were there). Some research showed that it was a few blocks from the hotel, just over 200 m — the downtown blocks are 61 m on a side. In fact, we must have passed it every morning on the way to the bus stop.

We arrived at the site to find The Portland Building undergoing substantial renovations, the reason we hadn’t paid it much attention. The sidewalk was covered by scaffolding and the view was barred by construction hoarding. Some informative signs attached to the hoarding showed the Portlandia story: photographs of the statue on a barge on the river, its original unveiling, and an architect’s rendering of the finished renovation. However, a peek inside yielded no clue as to where the statue might have stood before the renovations began. The Wikipedia entry later told us that the statue’s location was visible as a protective wrap partway up the front of the building: we’d passed it every morning without noticing.

Ice cream in the Square

Aftermath of Scooperfest.

We went looking for another photogenic landmark, and wound up in the Pioneer Courthouse Square as Scooperfest was winding down. Later on our way for supper we passed an interesting column at the Friday evening workspace, Powell’s City of Books.

Powell's column

Column at alternate Powell’s entrance.

Welcome to the Inclusion and Diversity Team at GNOME!

A photo of spherical paper lanterns in a variety of colors, against a dark blue night sky. In the background is a building, lit in rainbow colors.

Introduction

The Inclusion and Diversity team at GNOME was created to encourage and empower staff and volunteers, and to create an environment within GNOME where people from all backgrounds can thrive.

We welcome and encourage participation by everyone. To us, it doesn’t matter how you identify yourself or how others perceive you: we welcome you.

A sign that reads: We welcome all races and ethnicities, all religions, all countries of origin, all gender identities, all sexual orientations, all abilities and disabilities, all spoken languages, all ages, everyone. We stand here with you. You are safe here.

Goals

Our main focus is to create an inclusive and diverse community. This means that we want to actively cultivate diversity in all forms, and to create ways to make people feel welcome and able to fully participate in GNOME.

To achieve that effectively, we promote diversity and inclusion throughout and beyond GNOME, educate ourselves and the GNOME community on creating welcoming and inclusive environments, organize events that are safe and welcoming to all, and offer internships and outreach programs to promote diversity and inclusion at GNOME.

We just started the team this year, and have so far focused on making this year’s GUADEC a more inclusive event. As a small part of that, we will be holding workshops on things like imposter syndrome and unconscious bias. We welcome ideas for future conferences and GNOME events!

How To Join

We welcome everyone who wishes to contribute to this mission! It will be a great pleasure for us to have you working with us for the cause. We currently meet every Wednesday on UberConference at 16 UTC. It would be great to see you there. For more info please visit the wiki.

 

All images are public domain. See: https://unsplash.com/photos/1R2sGnkcECA and https://unsplash.com/photos/1R2sGnkcECA.

Text by the GNOME Engagement team.

August 01, 2019

Sysprof Updates

I just uploaded the sysprof-3.33.4 tarball as we progress towards 3.34. This alpha release has some interesting new features that some of you may find useful as you continue your quests to improve the performance of your system by improving the software running upon it.

An image of Sysprof with various performance graphs

For a while, I’ve been wondering about various ways to move GtkTextView forward in GTK 4. It’s of particular interest to me because I spent some time in the GTK 3 days making it faster for smooth scrolling. However, the designs that were employed there work better on the traditional Xorg setup than they do on GTK 3’s Wayland backend. Now that GTK 4 can have a proper GL pipeline, there is a lot of room for improvement.

Thanks to the West Coast Hackfest, I had a chance to sit down with Matthias and work through that design. GtkLabel was already using some accelerated text rendering so we started by making that work for GtkTextView. Then we extended the GSK PangoRenderer to handle the rest of the needs of GtkTextView and Matthias re-implemented some features to avoid cairo fallbacks.

After the hackfest I also found time to implement layout caching of PangoLayout. It helps reduce some of the CPU overhead in calculating line layouts.

As we start using the GPU more it becomes increasingly important to keep the CPU usage low. If we don’t it’s very likely to raise overall energy usage. To help keep us honest, I’ve added some RAPL energy statistics to Sysprof.

July 31, 2019

Sprint 4: tons of code reviews, improved web calendar discoverer

The Sprint series comes out every 3 weeks or so. Focus will be on the apps I maintain (Calendar, To Do, and Settings), but it may also include other applications that I contribute to.

GNOME Calendar: a new web calendar discoverer & optimizations

After a fairly big push to reimplement the web calendar discoverer code, it landed in Calendar! The new code is a threaded implementation of a web discoverer where we first ping the server to see if the passed URL is an actual file; otherwise, we perform a full CalDAV discovery on the URL.

Credentials are handled automatically — if the server rejects either the file or CalDAV checks due to permission, the user is asked about it.

In addition to that, the Year view is now much more optimized, and we avoid a large amount of D-Bus traffic by caching the events that appear in the sidebar.

GNOME To Do: minor cleanups

Not a lot to report on To Do. The week was dedicated to fixing a few crashers and warnings. Mostly boring stuff.

GNOME Settings: lots of code reviews, adaptive improvements

The focus of the Settings week was to get the merge request backlog under control. I do not personally enjoy seeing more than 50 open merge requests against it, so I dedicated the entire week to reviews.

Most of the merge requests were polish for Purism’s push towards an adaptive Settings — we should see many improvements on that front.

July 30, 2019

GXml and on-the-fly-post-parsing technique

I think this is new, so I’ll describe a new technique used in GXml to parse a large set of nodes in an XML document.

Parsing Large XML documents

If you have a large XML document, with a root that has a number of child nodes, the standard technique is to read all of them, including the children’s children, to create the XML tree. This process can take a while.

New on-the-fly parser

GXml now has a new custom parser called StreamReader, used to read the root element and its children, but without any attributes and without any children’s children; the attributes and the children’s children are stored in a string on the fly, so the document can be read almost at the same time it arrives from the IO stream, for the root and for each child. This improves the loading time of large XML documents by up to 400% compared to the technique already present in GXml.

On-the-fly-post-parsing

With this on-the-fly-post-parsing technique, you can’t access the children’s children or the root’s attributes immediately after the first read; you have to parse them from a temporary location in the GXml.Element class, using the new GXml.Element.parse_buffer() method. This method uses the standard technique, already present in GXml, to parse the root’s properties and the children’s children. When GXml.Element.parse_buffer() is called on the root, all children’s children are parsed recursively, but you can also choose to parse just one of the root’s children, which is really convenient when you need just one child node of a large XML document.

Multi-threading parsing

Currently GXml.Element.parse_buffer_async(), when called on the root element, uses GLib.ThreadPool to parse each child in its own thread, using as many threads as are usable on your system (minus one). The expected behavior is a parsing boost over the standard technique used in GXml: Xml.TextReader from the veteran libxml2 library running on a single thread. Currently, though, parsing time is only on par with the standard technique when GXml.Element.parse_buffer_async() is called on the document’s root; this may be a limitation of libxml2, because we have a lot of Xml.TextReader instances running at the same time parsing the elements’ children, or a limitation of GLib.ThreadPool. Maybe the solution is just a step away.

Bug bounties and NDAs are an option, not the standard

Zoom had a vulnerability that allowed users on MacOS to be connected to a video conference with their webcam active simply by visiting an appropriately crafted page. Zoom's response has largely been to argue that:

a) There's a setting you can toggle to disable the webcam being on by default, so this isn't a big deal,
b) When Safari added a security feature requiring that users explicitly agree to launch Zoom, this created a poor user experience and so they were justified in working around this (and so introducing the vulnerability), and,
c) The submitter asked whether Zoom would pay them for disclosing the bug, and when Zoom said they'd only do so if the submitter signed an NDA, they declined.

(a) and (b) are clearly ludicrous arguments, but (c) is the interesting one. Zoom go on to mention that they disagreed with the severity of the issue, and in the end decided not to change how their software worked. If the submitter had agreed to the terms of the NDA, then Zoom's decision that this was a low severity issue would have led to them being given a small amount of money and never being allowed to talk about the vulnerability. Since Zoom apparently have no intention of fixing it, we'd presumably never have heard about it. Users would have been less informed, and the world would have been a less secure place.

The point of bug bounties is to provide people with an additional incentive to disclose security issues to companies. But what incentive are they offering? Well, that depends on who you are. For many people, the amount of money offered by bug bounty programs is meaningful, and agreeing to sign an NDA is worth it. For others, the ability to publicly talk about the issue is worth more than whatever the bounty may award - being able to give a presentation on the vulnerability at a high profile conference may be enough to get you a significantly better paying job. Others may be unwilling to sign an NDA on principle, refusing to trust that the company will ever disclose the issue or fix the vulnerability. And finally there are people who can't sign such an NDA - they may have discovered the issue on work time, and employer policies may prohibit them doing so.

Zoom are correct that it's not unusual for bug bounty programs to require NDAs. But when they talk about this being an industry standard, they come awfully close to suggesting that the submitter did something unusual or unreasonable in rejecting their bounty terms. When someone lets you know about a vulnerability, they're giving you an opportunity to have the issue fixed before the public knows about it. They've done something they didn't need to do - they could have just publicly disclosed it immediately, causing significant damage to your reputation and potentially putting your customers at risk. They could potentially have sold the information to a third party. But they didn't - they came to you first. If you want to offer them money in order to encourage them (and others) to do the same in future, then that's great. If you want to tie strings to that money, that's a choice you can make - but there's no reason for them to agree to those strings, and if they choose not to then you don't get to complain about that afterwards. And if they make it clear at the time of submission that they intend to publicly disclose the issue after 90 days, then they're acting in accordance with widely accepted norms. If you're not able to fix an issue within 90 days, that's very much your problem.

If your bug bounty requires people sign an NDA, you should think about why. If it's so you can control disclosure and delay things beyond 90 days (and potentially never disclose at all), look at whether the amount of money you're offering for that is anywhere near commensurate with the value the submitter could otherwise gain from the information and compare that to the reputational damage you'll take from people deciding that it's not worth it and just disclosing unilaterally. And, seriously, never ask for an NDA before you're committing to a specific $ amount - it's never reasonable to ask that someone sign away their rights without knowing exactly what they're getting in return.

tl;dr - a bug bounty should only be one component of your vulnerability reporting process. You need to be prepared for people to decline any restrictions you wish to place on them, and you need to be prepared for them to disclose on the date they initially proposed. If they give you 90 days, that's entirely within industry norms. Remember that a bargain is being struck here - you offering money isn't being generous, it's you attempting to provide an incentive for people to help you improve your security. If you're asking people to give up more than you're offering in return, don't be surprised if they say no.

July 29, 2019

g_assert_finalize_object() in GLib 2.61.2

One more API in this mini-series! g_assert_finalize_object(), which is available in GLib 2.61.2, which was released today.

This one’s useful when writing tests (and only when writing tests). It’s been put together by Simon McVittie to implement the common pattern needed in tests, where you want to unref a GObject and assert that you just dropped the final reference to the object — i.e., check that no references to the object have been leaked in the test.

Use it in place of g_object_unref(). If G_DISABLE_ASSERT is defined, it will actually just be a call to g_object_unref().

Here’s an example usage of it, straight out of the GLib unit test for it:

static void
test_assert_finalize_object (void)
{
  GObject *obj = g_object_new (G_TYPE_OBJECT, NULL);

  /* do some things with the obj here */

  g_assert_finalize_object (obj);
}

Finally TagEditor!

After a lot of Merge Requests related with MBIDS and AcoustID, finally I started working on acoustid plugin.

Previously the logic was to return the recording with the most sources. Now we need to return multiple results: we need to retrieve the first release belonging to each release group of each recording that matched the given chromaprint.

Each chromaprint can get matched with multiple recordings, and each recording can belong to many different release groups. Each release group can have many releases, but releases within a release group only differ in track_count, dates or release country, so this difference between releases is not useful for a single song. Hence we can pick any release, so I went ahead with the first one.
The doubts continued: after implementing the browse operation, we realized a query operation would make more sense, because a browse operation needs a top-level container that can be browsed.

I also started on the first version of the tag editor. The UI is not polished yet, but I could browse through various suggestions.

Welcome to AcoustID

So in the last month, I’ve been working on MusicBrainz IDs retrieval and mappings. After that I spent some time trying to figure out how to start working on metadata retrieval. There are so many pieces involved; after finally figuring out a rough plan, I started with the first piece of the puzzle, i.e. AcoustID.
AcoustID is a unique identifier which points to a specific entry in the AcoustID database. There are mainly two ways of accessing the AcoustID database that are relevant to us: through a fingerprint or through an identifier. The former is more common, considering not all song files will have an AcoustID stored. One good thing about my project is that once we retrieve the chromaprint fingerprint, AcoustID or MBIDs, we make sure to write them back to the files, so that the same calculations don’t have to be repeated.
Similar to how I extracted MBIDs from files based on the mappings given by MusicBrainz, I had to extract the AcoustID using the mp3, vorbis and gstreamer extractors. Once again, GStreamer didn’t have support for GST tags for AcoustID, so I had to update my old MR on GStreamer. Finally, after a few days, that MR got merged! It is quite an achievement!
I also discussed with Toso and Jean whether a new Grilo property is required for the chromaprint fingerprint. They suggested that it’s better to keep the chromaprint fingerprint within the chromaprint plugin only; we can access it indirectly anywhere anyway, so there’s no need for it to be in core.
Other than this, I also started working on the browse operation to be able to return multiple results. I am not yet sure whether a browse or a query operation makes more sense. Once I have a first draft of the browse operation, I will also start working on the Tag Editor.

Survival Analysis for Deep Learning

Most machine learning algorithms have been developed to perform classification or regression. However, in clinical research we often want to estimate the time to an event, such as death or recurrence of cancer, which leads to a special type of learning task that is distinct from classification and regression. This task is termed survival analysis, but is also referred to as time-to-event analysis or reliability analysis. Many machine learning algorithms have been adapted to perform survival analysis: Support Vector Machines, Random Forest, or Boosting. It has only been recently that survival analysis entered the era of deep learning, which is the focus of this post.

You will learn how to train a convolutional neural network to predict time to a (generated) event from MNIST images, using a loss function specific to survival analysis. The first part will cover some basic terms and quantities used in survival analysis (feel free to skip this part if you are already familiar). In the second part, we will generate synthetic survival data from MNIST images and visualize it. In the third part, we will briefly revisit the most popular survival model of them all and learn how it can be used as a loss function for training a neural network. Finally, we put all the pieces together and train a convolutional neural network on MNIST and predict survival functions on the test data.
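
For reference, assuming the post follows the usual formulation, the survival-specific loss is the negative partial log-likelihood of Cox's proportional hazards model, with the network's output taking the place of the linear predictor:

\mathrm{loss}(\theta) = -\sum_{i :\, \delta_i = 1} \Bigl[ \hat{f}_\theta(x_i) - \log \sum_{j :\, t_j \ge t_i} \exp\bigl( \hat{f}_\theta(x_j) \bigr) \Bigr]

Here t_i is the observed time of the i-th sample, \delta_i equals 1 if the event was observed and 0 if the sample is censored, \hat{f}_\theta(x_i) is the risk score predicted by the network, and the inner sum runs over the risk set, i.e. all samples still event-free at time t_i.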

July 27, 2019

More text rendering updates

There is a Pango 1.44 release now. It contains all the changes I outlined recently. We also managed to sneak in a few features and fixes for longstanding bugs. That is the topic of this post.

Line breaking

One area for improvements in this release is line breaking.

Hyphenation

We don’t have TeX-style automatic hyphenation yet (although it may happen eventually). But at least, Pango inserts hyphens now when it breaks a line in the middle of a word (for example, at a soft hyphen character).

Example with soft hyphens

This is something I have wanted to do for a very long time, so I am quite happy that switching to harfbuzz for shaping on all platforms has finally enabled us to do this without too much effort.

Better line breaks

Pango follows Unicode UAX14 and UAX29 for finding word boundaries and line break opportunities.  The algorithm described in there is language-independent, but allows for language-specific tweaks. The Unicode standard calls this tailoring.

While Pango has had implementations for both the language-independent and -dependent parts before, we didn’t have them clearly separated in the API, until now.

In 1.44, we introduce a new pango_tailor_break() function which applies language-specific tweaks to a segment of text that has a uniform language. It is meant to be called after pango_default_break().

Line break control

Since my focus was on line-breaking already, I’ve added support for a text attribute to control line breaking. You can now say:

Don't break <span allow_breaks="false">here!</span>

in Pango markup, and Pango will obey.

In the hyphenation example above, the words showing possible hyphenation points (like im‧peachment) are marked up in this way.
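
For completeness, here is a minimal, hypothetical C program (not taken from the release notes) that feeds such markup to a layout with pango_layout_set_markup() and renders it at a narrow wrap width, so the effect of the attribute is visible:

#include <pango/pangocairo.h>

int
main (void)
{
  cairo_surface_t *surface =
    cairo_image_surface_create (CAIRO_FORMAT_ARGB32, 120, 120);
  cairo_t *cr = cairo_create (surface);
  PangoLayout *layout = pango_cairo_create_layout (cr);

  /* Forbid line breaks inside the marked span, even though the
   * wrap width is narrow. */
  pango_layout_set_markup (layout,
                           "Don't break <span allow_breaks=\"false\">here!</span>",
                           -1);
  pango_layout_set_width (layout, 80 * PANGO_SCALE);

  pango_cairo_show_layout (cr, layout);
  cairo_surface_write_to_png (surface, "allow-breaks.png");

  g_object_unref (layout);
  cairo_destroy (cr);
  cairo_surface_destroy (surface);

  return 0;
}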

Placement

Another area with significant changes is placement, both of lines and of individual glyphs.

Line height

Up to now, Pango has been placing the lines of a paragraph directly below each other, possibly with a fixed amount of spacing between them. While this works ok most of the time, a more typographically correct way to go about this is to control the baseline-to-baseline distance between lines.

Fonts contain a recommended value for this distance, so the first step was to make this value available with a new pango_font_metrics_get_height() API.

To make use of it, we added a new parameter to PangoLayout that tells it to place lines according to baseline-to-baseline distance. Once we had this, it was very easy to turn the parameter into a floating point number and allow things like double-spaced lines, by saying

pango_layout_set_line_spacing (layout, 2.0)
Line spacing 1, 1.5, and 2

You can still use the old way of spacing if you set line-spacing to 0.
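
To spell out the relationship between the two APIs, here is a small illustrative helper (assuming a layout created elsewhere); while the factor is non-zero, it takes precedence over the fixed spacing:

#include <pango/pango.h>

static void
set_double_spacing (PangoLayout *layout)
{
  /* Old way: a fixed amount of extra space between lines, in Pango units. */
  pango_layout_set_spacing (layout, 4 * PANGO_SCALE);

  /* New in 1.44: place lines at a multiple of the font's
   * baseline-to-baseline distance.  Setting the factor back to 0
   * restores the old, fixed-spacing behaviour. */
  pango_layout_set_line_spacing (layout, 2.0);
}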

Subpixel positions

Pango no longer rounds glyph positions and font metrics to integral pixel numbers. This lets consumers of the formatted glyphs (basically, implementations of PangoRenderer) decide for themselves if they want to place glyphs at subpixel positions or pixel-aligned.

Non-integral extents

The cairo renderer in libpangocairo will do subpixel positioning, but you need cairo master for best results. GTK master will soon have the necessary changes to take advantage of it for its GL and Vulkan renderers too.

This is likely one of the more controversial changes in this release—any change to font rendering causes strong reactions. One of the reasons for doing the release now is that it gives us enough time to make sure it works ok for all users of Pango before going out in the next round of upstream and distro releases in the fall.

Visualization

Finally, I spent some time implementing some long-requested features around missing glyphs and their rendering as hex boxes. These are also known as tofu (which is the origin of the name for the Noto fonts – ‘no tofu’).

Invisible space

Some fonts don’t have a glyph for the space character – after all, there is nothing to draw. In the past, Pango would sometimes draw a hex box in this case. This is entirely unnecessary – we can just leave a gap of the right size and pretend that nothing happened.  Pango 1.44 will do just that: no more hex boxes for space.

Visible space

On the other hand, sometimes you do want to see where spaces and other whitespace characters, such as tabs, are. We’ve added an attribute that lets you request visible rendering of whitespace:

<span show="spaces">Some space here</span>
Visible space

This is implemented in the cairo backend, so you will need to use pangocairo to see it.

Special characters

In the same vein, sometimes it is helpful to see special characters such as left-to-right controls in the output.  Unicode calls these characters default-ignorable.

The show attribute also lets you make default-ignorables visible:

<span show="ignorables">Hidden treasures</span>

Visible default-ignorable characters

As you can see, we use nicknames for ignorables.

Font information

Pango has been shipping a simple tool called pango-list for a while. It produces a list of all the fonts Pango can find.  This can be very helpful in tracking down changes between systems that are caused by differences in the available fonts.

In 1.44, pango-list can optionally show font metrics and variation axes as well. This may be a little obscure, but it has helped me fix the CI tests for Pango.

Summary

This release contains a significant amount of change; I’ve closed a good number of ‘teenage’ bugs while working on it. Please let us know if you see problems or unexpected changes with it!

scikit-survival 0.9 released

This release of scikit-survival adds support for scikit-learn 0.21 and pandas 0.24, among a couple of other smaller fixes. Please see the release notes for a full list of changes. If you are using scikit-survival in your research, you can now cite it using a Digital Object Identifier (DOI).

July 26, 2019

desktop-file-utils 0.24 released

One thing one can do in this amazing summer heat is cut the 0.24 release of desktop-file-utils. It’s rather a small thing, but since the last few releases have been happening at roughly three-year intervals I felt it merited a quick post.

Changes since 0.23

desktop-file-validate

  • Allow desktop file spec version 1.2 (Severin Glöckner).
  • Add Budgie, Deepin, Enlightenment and Pantheon to list of registered desktop environments (fdo#10, fdo#11, fdo#16, oldfdo#97385) (Ikey Doherty, sensor.wen, Joonas Niilola, David Faure).

update-desktop-database

  • Sort output lines internally to conserve reproducibility (fdo#12) (Chris Lamb).
  • Use pledge(2) on OpenBSD to limit capabilities (fdo#13) (Jasper Lievisse Adriaanse).

Common

  • Fix missing ; when appending to a list not ending with one (oldfdo#97388) (Pascal Terjan).
  • Add font as valid media type (Matthias Clasen).
  • Fix broken emacs blocking compile (fdo#15) (Hans Petter Jansson, reported by John).

About

desktop-file-utils contains command line utilities for working with desktop entries:

  • desktop-file-validate: Validates a desktop file according to the desktop entry specification.
  • desktop-file-install: Installs a desktop file to the applications directory after applying optional modifications.
  • update-desktop-database: Updates the database containing a cache of MIME types handled by desktop files.

Thanks to everyone who contributed to this release! And an extra big thanks to Daniel Stone for his patient freedesktop.org support.

July 25, 2019

Documentation at the West Coast Hackfest

The West Coast Hackfest was a terrific experience. The venue, the Urban Office coworking space, was ideal. Sharing the space, and the energy, with the Engagement and GTK teams was inspirational. Thanks in particular to Britt for his thoughtful presentation on growing the team.

the workspace

The workspace at Urban Office.

Thursday, the first day, we had a brainstorming session. We triaged and then started attacking the GitLab issues for gnome-user-docs. Over the hackfest, we reduced 28 outstanding issues to 12.5. This entailed 33 commits and 105+ user help pages modified (in addition to a few pages in the Sys Admin Guide, and the wiki).

My part consisted of Bluetooth and Wacom pages, touchscreen gestures (still in progress, the 0.5 of an issue), general Settings updates, and some of the terminology fixes.

Nifty Control Panel feature — like others, the Wacom panel is hidden if no device is connected. This would seem to defeat the help instructions. However, when you search and select the panel from the activities overview, Settings opens to the Wacom panel and its hidden message of No stylus found/No tablet detected.

working hard

Hard at work on docs.

Friday evening we worked late in the coffee shop of Powell’s City of Books, with (a hint of) free wifi, easily accessible hot chocolate and cookies, and acres of reference material.

On Saturday, we discussed the logistics of replacing library-web with Pintail for User Docs, and Petr and Jim started implementing it.

the nonconformists

Pioneer Courthouse Square.

Saturday evening was the fun all-team event. We experienced the Portland Night Market, a combination craft fair and rib fest in the space between the off-ramps in the Industrial District.

Cascade on Belmont

Three-team event on Saturday night (photo courtesy of Christian Hergert).

Portland is a good place for a hackfest. The transit system is excellent, and there is a nicely photogenic mountain just over 80 km away. Thank you to the organizers for a tremendous event, and thanks to the GNOME Foundation for sponsoring my travel and accommodation.

Mount Hood

Mount Hood looming over the Skyline. Unlike the postcard version, you have to zoom in.

July 24, 2019

Constructors

Have you ever had these annoyances with GObject-style constructors?

  • From a constructor, calling a method on a partially-constructed object is dangerous.

  • A constructor needs to set up "not quite initialized" values in the instance struct until a construct-time property is set.

  • You actually need to override GObjectClass::constructed (or was it ::constructor?) to take care of construct-only properties which need to be considered together, not individually (see the sketch after this list).

  • Constructors can't report an error, unless you implement GInitable, which is not in gobject, but in gio instead. (Also, why does that force the constructor to take a GCancellable...?)

  • You need more than one constructor, but that needs to be done with helper functions.
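
For the ::constructed point above, the usual shape of that workaround looks roughly like the following hypothetical example (MyWidget and its two construct-only properties are made up for illustration). The override runs only after all construct-time properties have been set, so they can finally be validated together:

#include <glib-object.h>

#define MY_TYPE_WIDGET (my_widget_get_type ())
G_DECLARE_FINAL_TYPE (MyWidget, my_widget, MY, WIDGET, GObject)

struct _MyWidget
{
  GObject parent_instance;
  int min_value;   /* construct-only */
  int max_value;   /* construct-only */
};

G_DEFINE_TYPE (MyWidget, my_widget, G_TYPE_OBJECT)

static void
my_widget_constructed (GObject *object)
{
  MyWidget *self = MY_WIDGET (object);

  /* Chain up first. */
  G_OBJECT_CLASS (my_widget_parent_class)->constructed (object);

  /* Both construct-only properties have been set by now, so they can
   * finally be checked against each other -- something the individual
   * property setters cannot do. */
  if (self->min_value > self->max_value)
    g_warning ("min-value must not be greater than max-value");
}

static void
my_widget_class_init (MyWidgetClass *klass)
{
  GObjectClass *object_class = G_OBJECT_CLASS (klass);

  object_class->constructed = my_widget_constructed;
  /* ... g_object_class_install_property() calls for the two
   * construct-only properties, plus set_property/get_property,
   * omitted for brevity ... */
}

static void
my_widget_init (MyWidget *self)
{
}

Even then, the failure can only be reported as a warning or deferred to a later call, which is exactly the GInitable point from the list.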

This article, Perils of Constructors, explains all of these problems very well. It is not centered on GObject, but rather on constructors in object-oriented languages in general.

(Spoiler: Rust does not have constructors or partially-initialized structs, so these problems don't really exist there.)

(Addendum: that makes it somewhat awkward to convert GObject code in C to Rust, but librsvg was able to solve it nicely with <buzzword>the typestate pattern</buzzword>.)

 

GSoC 2019: New Learnings

1. Virtual methods can be overridden in derived classes, while abstract methods must be overridden (a rough C/GObject sketch of the distinction follows this list).

2. A class that has at least one abstract method must be an abstract class.

3. We can't create instances of abstract classes.

4. GObject subclasses are any classes derived directly or indirectly from GLib.Object. They support a lot of features like signals, managed properties, interfaces and complex construction methods (reference).
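
The list above is phrased in Vala terms (GLib.Object). As a rough C/GObject analogue (a hypothetical MyShape type, not code from the project), a virtual method gets a default implementation in class_init, while an abstract method is left NULL and the type is registered as abstract, so it cannot be instantiated and subclasses are expected to provide the implementation:

#include <glib-object.h>

#define MY_TYPE_SHAPE (my_shape_get_type ())
G_DECLARE_DERIVABLE_TYPE (MyShape, my_shape, MY, SHAPE, GObject)

struct _MyShapeClass
{
  GObjectClass parent_class;

  void   (*draw) (MyShape *self);   /* virtual: has a default implementation */
  double (*area) (MyShape *self);   /* abstract: subclasses must override */
};

G_DEFINE_ABSTRACT_TYPE (MyShape, my_shape, G_TYPE_OBJECT)

static void
my_shape_real_draw (MyShape *self)
{
  g_print ("drawing a generic shape\n");
}

static void
my_shape_class_init (MyShapeClass *klass)
{
  klass->draw = my_shape_real_draw;  /* can be overridden, but need not be */
  klass->area = NULL;                /* must be provided by subclasses */
}

static void
my_shape_init (MyShape *self)
{
}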


About git
  • Soft reset: this removes the commits but keeps the changes made to the files, whereas a hard reset discards those changes as well.
  • When we create a merge request comparing two branches, the commits included in the MR get updated every time we update either branch. However, if we update the target branch (the one we compared our new branch against), its new commits will not show up in the MR.