GNOME.ORG

24 hours a day, 7 days a week, 365 days per year...

February 26, 2017

How to get started in open source software

A friend pointed me to the Open Source Guides website, a collection of resources for individuals, communities, and companies who want to learn how to run and contribute to an open source project. I thought it was very interesting for new contributors, so I thought I'd share it here.

The website provides lots of information about starting or joining an open source project. There's lots to read, but I hope that doesn't seem like too much for people interested in getting started in open source software. Open Source Guides has several sections for new developers:
  1. How to Contribute to Open Source
  2. Starting an Open Source Pro ject
  3. Finding Users For Your Project
  4. Building Welcoming Communities
  5. Best Practices for Maintainers
  6. Leadership and Governance
  7. Getting Paid for Open Source Work
  8. Your Code of Conduct
  9. Open Source Metrics
  10. The Legal Side of Open Source
I'm not connected with Open Source Guides, but I think it's a great idea!

Of course, there are other ways to learn about open source software, how to get involved and how to start your own project. Eric Raymond's essay series The Cathedral and the Bazaar are the typical go-to examples about open source software. Along similar lines, Opensource.com has a very brief primer on open source contributor guidelines. And there are countless other articles and books I could mention here, including a few articles written by me.

But I'm interested in the open source software community, and anything that helps new folks get started in open source software will help the community grow. So if you're interested in getting involved in open source software, I encourage you to read the Open Source Guides.

FEDORA and GNOME at UNSAAC

Today I did a talk to introduce students of UNSAAC to the Fedora and GNOME world as it was announced by the GDG Cusco group. We started at 8:30 am and it was a free event:

cusco

I started by arranging a table with the stickers and other merchandisings to give them as a prize to the attendances in the room, and of course the balloons of the projects as follow:

fedora02

As it is usual, my first question is how many people in the audience use Linux and this time only one guy use Ubuntu and the rest of people used Windows.

Then, I did a review of the whole Linux history to depict the Fedora and GNOME projects into the scene. I pointed out also the GSoC and Outreachy program for students, as well as indicating the relationship between Fedora and Red Hat and the benefits of them.

fedora04As in my lecturing classes I promote the participation of the students by responding the questions during the presentation, besides they sometimes were wrong.

fedora03One thing that almost all the participants were worried was about the privacy and patent of their new ideas, they expressed that might somebody else could stealing them. Does the Free Software do something to protect them?

fedora09I must say that I was impressed by the presentation of local projects to improve their local community (the government sector specifically)  by using Django and Angular. We have share a coffee break and thanks to Damian Nohales for his support during the entire event.

fedora05Some of them willing to come to Lima next Saturday to attend the Linux Playa event 🙂 These are the photos of the group of this great experience:

fedora01It is obvious to say that Cusco is an amazing cosmopolitan city to eat a delicious “cuy”:

fedora07Damian from GNOME Argentina and I did a voluntary job to promote Fedora & GNOME. I must thank again to Mitzuko, leader of GDG in Cuzco for arranging the meetup, and another special thanks to the Fedora project to support my flight ticket Lima-Cuzco-Lima.


Filed under: FEDORA, GNOME Tagged: Cusco, fedora, FEDORA 25, GDG, GDG Cusco, GNOME, Google Developer Group, Julita Inca, Julita Inca Chiroque, meetup Cusco

February 24, 2017

Top open source projects

TechRadar recently posted an article about "The best open source software 2017" where they list a few of their favorite open source software projects. It's really hard for an open source software project to become popular if it has poor usability—so I thought I'd add a few quick comments of my own about each.

Here you go:
The best open source office software: LibreOffice
LibreOffice hasn't changed its user interface very substantially for a very long time. In the recent LibreOffice 5.3 release, they introduced a new interface option, which they call the MUFFIN (My User Friendly Flexible INterface).

The new interface has several modes, including Single Toolbar, Sidebar, and Notebookbar. The last mode, Notebookbar, is interesting. This is very similar in concept to the Microsoft Office Ribbon. People who come from an Office background and are used to how Ribbon behaves, and how it changes based on what you are working on, should like the Notebookbar setting.

To comment on the new interface: I think this is an interesting and welcome direction for LibreOffice. I don't think the current user interface is bad, but I think the proposed changes are a positive step forward. The new MUFFIN interface is flexible and supports users they way they want to use LibreOffice. I think it will appeal to current and new users, and "lower the bar" for users to come to LibreOffice from Microsoft Office.
The best open source photo editor: GIMP
I use GIMP at home for a lot of projects. Most often, I use GIMP to create and edit images for my websites, including the FreeDOS Project website. Although we've recently turned to SVG where possible on the FreeDOS website, for years all our graphics were made in the GIMP.

A few years ago, I asked readers to suggest programs that have good usability (I also solicited feedback through colleagues via their blogs and their readers). Many people talked about GIMP, the open source graphics program (very similar to Photoshop). There were some strong statements on either side: About half said it had good usability, and about half said it had bad usability.

In following up, it seemed that two types of users thought GIMP had poor usability: Those who used Photoshop a lot, such as professional graphics editors or photographers. Those who never used Photoshop, and only tried GIMP because they needed a graphics program.

So GIMP is an interesting case. It's an example of mimicking another program perhaps too well, but (necessarily) not perfectly. GIMP has good usability if you have used Photoshop occasionally, but not if you are an expert in Photoshop, and not if you are a complete Photoshop novice.
The best open source media player: VLC
I haven't used VLC, part of the VideoLAN project, in a few years. I just don't watch movies on my computer. But looking at the screenshots I see today, I can see VLC has made major strides in ease of use.

The menus seem obvious, and the buttons are plain and simple. There isn't much decoration to the application (it doesn't need it) yet it seems polished. Good job!
The best open source video editor: Shotcut
This is a new project for me. I have recorded a few YouTube videos for my private channel, but they're all very simple: just me doing a demo of something (usually related to FreeDOS, such as how to install FreeDOS.) Because my videos aren't very complicated, I just use the YouTube editor to "trim" the start and end of my videos.

Shotcut seems quite complicated to me, at first glance. Even TechRadar seems to agree, commenting "It might look a little stark at first, but add some of the optional toolbars and you'll soon have its most powerful and useful features your your fingertips."

I'm probably not the right audience for Shotcut. Video is just not my interest area. And it's okay for a project to target a particular audience, if they are well suited to that audience.
The best open source audio editor: Audacity
I used Audacity many years ago, probably when it was still a young project. But even then, I remember Audacity as being fairly straightforward to use. For someone (like me) who occasionally wanted to edit a sound file, Audacity was easy to learn on my own. And the next time I used Audacity, perhaps weeks later, I quickly remembered the path to the features I needed.

Those two features (Learnability and Memorability) are two important features of good usability. We learn about this topic when I teach my online class about usability. The five key characteristics of Usability are: Learnability, Efficiency, Memorability, Error Rates, and Satisfaction. Although that last one is getting close to "User eXperience" ("UX") which is not the same as Usability.
The best open source web browser: Firefox
Firefox is an old web browser, but still feels fresh. I use Firefox on an almost daily basis (when I don't use Firefox, I'm usually in Google Chrome.)

I did usability testing on Firefox a few years ago, and found it does well in several areas:

Familiarity: Firefox tries to blend well with other applications on the same operating system. If you're using Linux, Firefox looks like other Linux applications. When you're on a Mac, Firefox looks like a Mac application. This is important, because UI lessons that you learn in one application will carry over to Firefox on the same platform.

Consistency: Features within Firefox are accessed in a similar way and perform in a similar way, so you aren't left feeling like the program is a mash of different coders.

Obviousness: When an action produces an obvious result, or clearly indicated success, users feel comfortable because they understand what the program is doing. They can see the result of their actions.
The best open source email client: Thunderbird
Maybe I shouldn't say this, but I haven't used a desktop email client in several years. I now use Gmail exclusively.

However, the last desktop email client I used was definitely Thunderbird. And I remember it being very nice. Sometimes I explored other desktop email programs like GNOME Evolution or Balsa, but I always came back to Thunderbird.

Like Firefox, Thunderbird integrated well into whatever system you use. Its features are self-consistent, and actions produce obvious results. This shouldn't be surprising, though. Thunderbird is originally a Mozilla project.
The best open source password manager: KeePass
Passwords are the keys to our digital lives. So much of what we do today is done online, via a multitude of accounts. With all those accounts, it can be tempting to re-use passwords across websites, but that's really bad for security; if a hacker gets your password on one website, they can use it to get your private information from other websites. To practice good security, you should use a different password on every website you use. And for that, you need a way to store and manage those passwords.

KeePass is an outstanding password manager. There are several password managers to choose from, but KeePass has been around a long time and is really solid. With KeePass, it's really easy to create new entries in the database, group similar entries together (email, shopping, social, etc.) and assign icons to them. And a key feature is generating random passwords. KeePass lets you create passwords of different lengths and complexity, and provides a helpful visual color guide (red, yellow, green) to suggest how "secure" your password is likely to be.

Fri 2017/Feb/24

  • Griping about parsers and shitty specifications

    The last time, I wrote about converting librsvg's tree of SVG element nodes from C to Rust — basically, the N-ary tree that makes up librsvg's view of an SVG file's structure. Over the past week I've been porting the code that actually implements specific shapes. I ran into the problem of needing to port the little parser that librsvg uses for SVG's list-of-points data type; this is what SVG uses for the points in a polygon or a polyline. In this post, I want to tell you my story of writing Rust parsers, and I also want to vent a bit.

    My history of parsing in Rust

    I've been using hand-written parsers in librsvg, basically learning how to write parsers in Rust as I go. In a rough timeline, this is what I've done:

    1. First was the big-ass parser for Bézier path data, an awkward recursive-descent monster that looks very much like what I would have written in C, and which is in dire need of refactoring (the tests are awesome, though, if you (hint, hint) want to lend a hand).

    2. Then was the simple parser for RsvgLength values, in which the most expedient thing, if not the prettiest, was to reimplement strtod() in Rust. You know how C and/or glib have a wealth of functions to write half-assed parsers quickly? Yeah, that's what I went for here. However, with this I learned that strtod()'s behavior of returning a pointer to the thing-after-the-number can be readily implemented with Rust's string slices.

    3. Then was the little parser for preserveAspectRatio values, which actually uses Rust idioms like using a split_whitespace() iterator and returing a custom error inside a Result.

    4. Lastly, I've implemented a parser for SVG's list-of-points data type. While shapes like rect and circle simply use RsvgLength values which can have units, like percentages or points:

      <rect x="5pt" y="6pt" width="7.5 pt" height="80%/">
      
      <circle cx="25.0" cy="30.0" r="10.0"/>
      		

      In contrast, polygon and polyline elements let you specify lists of points using a simple but quirky grammar that does not use units; the numbers are relative to the object's current coordinate system:

      <polygon points="100,60 120,140 140,60 160,140 180,60 180,100 100,100"/>
      		

    Quirky grammars and shitty specs

    The first inconsistency in the above is that some object types let you specify their positions and sizes with actual units (points, centimeters, ems), while others don't — for these last ones, the coordinates you specify are relative to the object's current transformation. This is not a big deal; I suspect it is either an oversight in the spec, or they didn't want to think of a grammar that would accomodate units like that.

    The second inconsistency is in SVG's quirky grammars. These are equivalent point lists:

    points="100,-60 120,-140 140,60 160,140"
    
    points="100 -60 120 -140 140 60 160 140"
    
    points="100-60,120-140,140,60 160,140"
    
    points="100-60,120, -140,140,60 160,140"
    	    

    Within an (x, y) coordinate pair, the space is optional if the y coordinate is negative. Or you can separate the x and y with a space, with an optional comma.

    But between coordinate pairs, the separator is not optional. You must have any of whitespace or a comma, or both. Even if an x coordinate is negative, it must be separated from its preceding y coordinate.

    But this is not true for path specifications! The following are paths that start with a Move_to comamnd and follow with three Line_to commands, and they are equivalent:

    M 1 2 L -3 -4 5 -6 7 -8
    
    M1,2 L-3 -4, 5,-6,7 -8
    
    M1,2L-3-4, 5-6,7-8
    
    M1 2-3,-4,5-6 7,-8
    
    M1,2-3-4 5-6 7-8
    	    

    Inside (x, y) pairs, you can have whitespace with an optional comma, or nothing if the y is negative. Between coordinate pairs you can have whitespace with an optional comma, or nothing at all! As you wish! And also, since all subpaths start with a single move_to anyway, you can compress a "M point L point point" to just "M point point point" — the line_to is implied.

    I think this is a misguided attempt by the SVG spec to allow compressibility of the path data. But they might have gotten more interesting gains, oh, I don't know, by not using XML and just nesting things with parentheses.

    Borat and XML big     data

    Also, while the grammar for the point lists in polygon and polyline clearly implies that you must have an even number of floats in your list (for the x/y coordinate pairs), the description of the polygon element says that if you have an odd number of coordinates, the element is "in error" and should be treated the same as an erroneous path description. But the implementation notes for paths say that you should "render a path element up to (but not including) the path command containing the first error in the path data specification", i.e. just ignore the orphan coordinate and somehow indicate an error.

    So which one is it? Should we a) count the floats and ignore the odd one out? Or b) should we follow the grammar and ditch the whole string if it isn't matched by the grammar? Or c) put special handling in the grammar for the last odd coordinate?

    I mean, it's no big deal to actually handle this in the code, but it's annoying to have a spec that can't make up its mind.

    The expediency of half-assed parsers

    How would you parse a string that has an array of floating-point numbers separated by whitespace? In Python you could split() the string and then apply float() to each item, and catch an exception if the value does not parse to float. Or something like that.

    The C code in librsvg has a rather horrible rsvg_css_parse_number_list() function, complete with a "/* TODO: some error checking */" comment, that is implemented in terms of an equally horrible rsvg_css_parse_list() function. This last one is the one that handles the "optional whitespace and comma" shenanigans... with strtok_r(). Between both functions there are like a million temporary copies of the whole string and the little substrings. Or something like that.

    Neither version is fully conformant to the quirks in the SVG spec. I'm replacing all that shit with "proper" parsers written in Rust. And what could be more proper than to actually use a parser-generating library instead of writing them from scratch?

    The score is Nom: 2 - Federico: 1

    This is the second time I have tried to use nom, a library for parser combinators. I like the idea of parser combinators instead of writing a grammar and then generating a parser out of that: supposedly your specification of a parser combinator closely resembles your grammar, but the whole thing is written and embedded in the same language as the rest of your code.

    There is a good number of parsers already written with nom, so clearly it works.

    But nom has been... hard. A few months ago I tried rewriting the path data parser for librsvg, from C to Rust with nom, and it was a total failure. I couldn't even parse a floating-point number out of a string. I ditched it after a week of frustration, and wrote my own parser by hand.

    Last week I tried nom again. Sebastian Dröge was very kind to jumpstart me with some nom-based code to parse floating-point numbers. I then discovered that the latest nom actually has that out of the box, but it is buggy, and after some head-scratching I was able to propose a fix. Apparently my fix won't work if nom is being used for a streaming parser, but I haven't gotten around to learning those yet.

    I've found nom code to be very hard to write. Nom works by composing macros which then call recognizer functions and other macros. To be honest, I find the compiler errors for that kind of macros quite intractable. And I'm left with not knowing if the problem is in my use of the macros, or in the macro definitions themselves.

    After much, much trial-and-error, and with the vital help of Sebastian Köln from the #nom IRC channel, I have a nom function that can match SVG's comma-whitespace construct and two others that can match coordinate pairs and lists of points for the polygon and polyline elements. This last one is not even complete: it still doesn't handle a list of points that has whitespace around the list itself (SVG allows that, as in "   1 2, 3 4   ").

    While I can start using those little nom-based parsers to actually build implementations of librsvg's objects, I don't want to battle nom anymore. The documentation is very sketchy. The error messages from macros don't make sense to me. I really like the way nom wants to handle things, with an IResult type that holds the parsing state, but it's just too hard for me to even get things to compile with it.

    So what are you gonna use?

    I don't know yet. There are other parser libraries for Rust; pom, lalrpop, and pest among them. I hope they are easier to try out than nom. If not, "when in doubt, use brute force" may apply well here — hand-written parsers in Rust are as cumbersome as in any other language, but at least they are memory-safe, and writing tests for them is a joy in Rust.

encyclopedia snabb and the case of the foreign drivers

Peoples of the blogosphere, welcome back to the solipsism! Happy 2017 and all that. Today's missive is about Snabb (formerly Snabb Switch), a high-speed networking project we've been working on at work for some years now.

What's Snabb all about you say? Good question and I have a nice answer for you in video and third-party textual form! This year I managed to make it to linux.conf.au in lovely Tasmania. Tasmania is amazing, with wild wombats and pademelons and devils and wallabies and all kinds of things, and they let me talk about Snabb.

You can check that video on the youtube if the link above doesn't work; slides here.

Jonathan Corbet from LWN wrote up the talk in an article here, which besides being flattering is a real windfall as I don't have to write it up myself :)

In that talk I mentioned that Snabb uses its own drivers. We were recently approached by a customer with a simple and honest question: does this really make sense? Is it really a win? Why wouldn't we just use the work that the NIC vendors have already put into their drivers for the Data Plane Development Kit (DPDK)? After all, part of the attraction of a switch to open source is that you will be able to take advantage of the work that others have produced.

Our answer is that while it is indeed possible to use drivers from DPDK, there are costs and benefits on both sides and we think that when we weigh it all up, it makes both technical and economic sense for Snabb to have its own driver implementations. It might sound counterintuitive on the face of things, so I wrote this long article to discuss some perhaps under-appreciated points about the tradeoff.

Technically speaking there are generally two ways you can imagine incorporating DPDK drivers into Snabb:

  1. Bundle a snapshot of the DPDK into Snabb itself.

  2. Somehow make it so that Snabb could (perhaps optionally) compile against a built DPDK SDK.

As part of a software-producing organization that ships solutions based on Snabb, I need to be able to ship a "known thing" to customers. When we ship the lwAFTR, we ship it in source and in binary form. For both of those deliverables, we need to know exactly what code we are shipping. We achieve that by having a minimal set of dependencies in Snabb -- only LuaJIT and three Lua libraries (DynASM, ljsyscall, and pflua) -- and we include those dependencies directly in the source tree. This requirement of ours rules out (2), so the option under consideration is only (1): importing the DPDK (or some part of it) directly into Snabb.

So let's start by looking at Snabb and the DPDK from the top down, comparing some metrics, seeing how we could make this combination.

snabbdpdk
Code lines 61K 583K
Contributors (all-time) 60 370
Contributors (since Jan 2016) 32 240
Non-merge commits (since Jan 2016) 1.4K 3.2K

These numbers aren't directly comparable, of course; in Snabb our unit of code change is the merge rather than the commit, and in Snabb we include a number of production-ready applications like the lwAFTR and the NFV, but they are fine enough numbers to start with. What seems clear is that the DPDK project is significantly larger than Snabb, so adding it to Snabb would fundamentally change the nature of the Snabb project.

So depending on the DPDK makes it so that suddenly Snabb jumps from being a project that compiles in a minute to being a much more heavy-weight thing. That could be OK if the benefits were high enough and if there weren't other costs, but there are indeed other costs to including the DPDK:

  • Data-plane control. Right now when I ship a product, I can be responsible for the whole data plane: everything that happens on the CPU when packets are being processed. This includes the driver, naturally; it's part of Snabb and if I need to change it or if I need to understand it in some deep way, I can do that. But if I switch to third-party drivers, this is now out of my domain; there's a wall between me and something that running on my CPU. And if there is a performance problem, I now have someone to blame that's not myself! From the customer perspective this is terrible, as you want the responsibility for software to rest in one entity.

  • Impedance-matching development costs. Snabb is written in Lua; the DPDK is written in C. I will have to build a bridge, and keep it up to date as both Snabb and the DPDK evolve. This impedance-matching layer is also another source of bugs; either we make a local impedance matcher in C or we bind everything using LuaJIT's FFI. In the former case, it's a lot of duplicate code, and in the latter we lose compile-time type checking, which is a no-go given that the DPDK can and does change API and ABI.

  • Communication costs. The DPDK development list had 3K messages in January. Keeping up with DPDK development would become necessary, as the DPDK is now in your dataplane, but it costs significant amounts of time.

  • Costs relating to mismatched goals. Snabb tries to win development and run-time speed by searching for simple solutions. The DPDK tries to be a showcase for NIC features from vendors, placing less of a priority on simplicity. This is a very real cost in the form of the way network packets are represented in the DPDK, with support for such features as scatter/gather and indirect buffers. In Snabb we were able to do away with this complexity by having simple linear buffers, and our speed did not suffer; adding the DPDK again would either force us to marshal and unmarshal these buffers into and out of the DPDK's format, or otherwise to reintroduce this particular complexity into Snabb.

  • Abstraction costs. A network function written against the DPDK typically uses at least three abstraction layers: the "EAL" environment abstraction layer, the "PMD" poll-mode driver layer, and often an internal hardware abstraction layer from the network card vendor. (And some of those abstraction layers are actually external dependencies of the DPDK, as with Mellanox's ConnectX-4 drivers!) Any discrepancy between the goals and/or implementation of these layers and the goals of a Snabb network function is a cost in developer time and in run-time. Note that those low-level HAL facilities aren't considered acceptable in upstream Linux kernels, for all of these reasons!

  • Stay-on-the-train costs. The DPDK is big and sometimes its abstractions change. As a minor player just riding the DPDK train, we would have to invest a continuous amount of effort into just staying aboard.

  • Fork costs. The Snabb project has a number of contributors but is really run by Luke Gorrie. Because Snabb is so small and understandable, if Luke decided to stop working on Snabb or take it in a radically different direction, I would feel comfortable continuing to maintain (a fork of) Snabb for as long as is necessary. If the DPDK changed goals for whatever reason, I don't think I would want to continue to maintain a stale fork.

  • Overkill costs. Drivers written against the DPDK have many considerations that simply aren't relevant in a Snabb world: kernel drivers (KNI), special NIC features that we don't use in Snabb (RDMA, offload), non-x86 architectures with different barrier semantics, threads, complicated buffer layouts (chained and indirect), interaction with specific kernel modules (uio-pci-generic / igb-uio / ...), and so on. We don't need all of that, but we would have to bring it along for the ride, and any changes we might want to make would have to take these use cases into account so that other users won't get mad.

So there are lots of costs if we were to try to hop on the DPDK train. But what about the benefits? The goal of relying on the DPDK would be that we "automatically" get drivers, and ultimately that a network function would be driver-agnostic. But this is not necessarily the case. Each driver has its own set of quirks and tuning parameters; in order for a software development team to be able to support a new platform, the team would need to validate the platform, discover the right tuning parameters, and modify the software to configure the platform for good performance. Sadly this is not a trivial amount of work.

Furthermore, using a different vendor's driver isn't always easy. Consider Mellanox's DPDK ConnectX-4 / ConnectX-5 support: the "Quick Start" guide has you first install MLNX_OFED in order to build the DPDK drivers. What is this thing exactly? You go to download the tarball and it's 55 megabytes. What's in it? 30 other tarballs! If you build it somehow from source instead of using the vendor binaries, then what do you get? All that code, running as root, with kernel modules, and implementing systemd/sysvinit services!!! And this is just step one!!!! Worse yet, this enormous amount of code powering a DPDK driver is mostly driver-specific; what we hear from colleagues whose organizations decided to bet on the DPDK is that you don't get to amortize much knowledge or validation when you switch between an Intel and a Mellanox card.

In the end when we ship a solution, it's going to be tested against a specific NIC or set of NICs. Each NIC will add to the validation effort. So if we were to rely on the DPDK's drivers, we would have payed all the costs but we wouldn't save very much in the end.

There is another way. Instead of relying on so much third-party code that it is impossible for any one person to grasp the entirety of a network function, much less be responsible for it, we can build systems small enough to understand. In Snabb we just read the data sheet and write a driver. (Of course we also benefit by looking at DPDK and other open source drivers as well to see how they structure things.) By only including what is needed, Snabb drivers are typically only a thousand or two thousand lines of Lua. With a driver of that size, it's possible for even a small ISV or in-house developer to "own" the entire data plane of whatever network function you need.

Of course Snabb drivers have costs too. What are they? Are customers going to be stuck forever paying for drivers for every new card that comes out? It's a very good question and one that I know is in the minds of many.

Obviously I don't have the whole answer, as my role in this market is a software developer, not an end user. But having talked with other people in the Snabb community, I see it like this: Snabb is still in relatively early days. What we need are about three good drivers. One of them should be for a standard workhorse commodity 10Gbps NIC, which we have in the Intel 82599 driver. That chipset has been out for a while so we probably need to update it to the current commodities being sold. Additionally we need a couple cards that are going to compete in the 100Gbps space. We have the Mellanox ConnectX-4 and presumably ConnectX-5 drivers on the way, but there's room for another one. We've found that it's hard to actually get good performance out of 100Gbps cards, so this is a space in which NIC vendors can differentiate their offerings.

We budget somewhere between 3 and 9 months of developer time to create a completely new Snabb driver. Of course it usually takes less time to develop Snabb support for a NIC that is only incrementally different from others in the same family that already have drivers.

We see this driver development work to be similar to the work needed to validate a new NIC for a network function, with the additional advantage that it gives us up-front knowledge instead of the best-effort testing later in the game that we would get with the DPDK. When you add all the additional costs of riding the DPDK train, we expect that the cost of Snabb-native drivers competes favorably against the cost of relying on third-party DPDK drivers.

In the beginning it's natural that early adopters of Snabb make investments in this base set of Snabb network drivers, as they would to validate a network function on a new platform. However over time as Snabb applications start to be deployed over more ports in the field, network vendors will also see that it's in their interests to have solid Snabb drivers, just as they now see with the Linux kernel and with the DPDK, and given that the investment is relatively low compared to their already existing efforts in Linux and the DPDK, it is quite feasible that we will see the NIC vendors of the world start to value Snabb for the performance that it can squeeze out of their cards.

So in summary, in Snabb we are convinced that writing minimal drivers that are adapted to our needs is an overall win compared to relying on third-party code. It lets us ship solutions that we can feel responsible for: both for their operational characteristics as well as their maintainability over time. Still, we are happy to learn and share with our colleagues all across the open source high-performance networking space, from the DPDK to VPP and beyond.

LinuXatUSIL – Previas 2 for #LinuxPlaya

Yesterday, we did a workshop to start in GTK and Python in charge of Damian Nohales. We first check the libraries go and sqlite since the original idea was to do a simple form.

usil01

To draw widgets that are already designed he made us install Glade and he also made us installed Atom to review code. This is how Glade looks like:

glade

Damian from GNOME Argentina explained us some code based on this tutorial and the widgets in Glade were presented. These are some students impressions:

Luis Zacarias

https://sites.google.com/site/luiszacariasquispeojeda1997/python-sqlite-con-fedora-25

Azul Huaylla

https://post852.wordpress.com/2017/02/24/linux-at-usil-azul/

Lizeth Lucar

https://inglizbethlucar.wixsite.com/lizbeth

Handle students at the same time can be harsh at some point but Damian did it well 😀

These are some pictures of this previous event to the #LinuxPlaya

linuxatusil

 


Filed under: FEDORA, GNOME, τεχνολογια :: Technology Tagged: fedora, GNOME, Julita Inca, Julita Inca Chiroque, Linux Playa, LinuXatUSIL, Previas Linux Playa, USIL

February 23, 2017

Machine Learning Speech Recognition

Keeping up my yearly blogging cadence, it’s about time I wrote to let people know what I’ve been up to for the last year or so at Mozilla. People keeping up would have heard of the sad news regarding the Connected Devices team here. While I’m sad for my colleagues and quite disappointed in how this transition period has been handled as a whole, thankfully this hasn’t adversely affected the Vaani project. We recently moved to the Emerging Technologies team and have refocused on the technical side of things, a side that I think most would agree is far more interesting, and also far more suited to Mozilla and our core competence.

Project DeepSpeech

So, out with Project Vaani, and in with Project DeepSpeech (name will likely change…) – Project DeepSpeech is a machine learning speech-to-text engine based on the Baidu Deep Speech research paper. We use a particular layer configuration and initial parameters to train a neural network to translate from processed audio data to English text. You can see roughly how we’re progressing with that here. We’re aiming for a 10% Word Error Rate (WER) on English speech at the moment.

You may ask, why bother? Google and others provide state-of-the-art speech-to-text in multiple languages, and in many cases you can use it for free. There are multiple problems with existing solutions, however. First and foremost, most are not open-source/free software (at least none that could rival the error rate of Google). Secondly, you cannot use these solutions offline. Third, you cannot use these solutions for free in a commercial product. The reason a viable free software alternative hasn’t arisen is mostly down to the cost and restrictions around training data. This makes the project a great fit for Mozilla as not only can we use some of our resources to overcome those costs, but we can also use the power of our community and our expertise in open source to provide access to training data that can be used openly. We’re tackling this issue from multiple sides, some of which you should start hearing about Real Soon Now™.

The whole team has made contributions to the main code. In particular, I’ve been concentrating on exporting our models and writing clients so that the trained model can be used in a generic fashion. This lets us test and demo the project more easily, and also provides a lower barrier for entry for people that want to try out the project and perhaps make contributions. One of the great advantages of using TensorFlow is how relatively easy it makes it to both understand and change the make-up of the network. On the other hand, one of the great disadvantages of TensorFlow is that it’s an absolute beast to build and integrates very poorly with other open-source software projects. I’ve been trying to overcome this by writing straight-forward documentation, and hopefully in the future we’ll be able to distribute binaries and trained models for multiple platforms.

Getting Involved

We’re still at a fairly early stage at the moment, which means there are many ways to get involved if you feel so inclined. The first thing to do, in any case, is to just check out the project and get it working. There are instructions provided in READMEs to get it going, and fairly extensive instructions on the TensorFlow site on installing TensorFlow. It can take a while to install all the dependencies correctly, but at least you only have to do it once! Once you have it installed, there are a number of scripts for training different models. You’ll need a powerful GPU(s) with CUDA support (think GTX 1080 or Titan X), a lot of disk space and a lot of time to train with the larger datasets. You can, however, limit the number of samples, or use the single-sample dataset (LDC93S1) to test simple code changes or behaviour.

One of the fairly intractable problems about machine learning speech recognition (and machine learning in general) is that you need lots of CPU/GPU time to do training. This becomes a problem when there are so many initial variables to tweak that can have dramatic effects on the outcome. If you have the resources, this is an area that you can very easily help with. What kind of results do you get when you tweak dropout slightly? Or layer sizes? Or distributions? What about when you add or remove layers? We have fairly powerful hardware at our disposal, and we still don’t have conclusive results about the affects of many of the initial variables. Any testing is appreciated! The Deep Speech 2 paper is a great place to start for ideas if you’re already experienced in this field. Note that we already have a work-in-progress branch implementing some of these ideas.

Let’s say you don’t have those resources (and very few do), what else can you do? Well, you can still test changes on the LDC93S1 dataset, which consists of a single sample. You won’t be able to effectively tweak initial parameters (as unsurprisingly, a dataset of a single sample does not represent the behaviour of a dataset with many thousands of samples), but you will be able to test optimisations. For example, we’re experimenting with model quantisation, which will likely be one of multiple optimisations necessary to make trained models usable on mobile platforms. It doesn’t particularly matter how effective the model is, as long as it produces consistent results before and after quantisation. Any optimisation that can be made to reduce the size or the processor requirement of training and using the model is very valuable. Even small optimisations can save lots of time when you start talking about days worth of training.

Our clients are also in a fairly early state, and this is another place where contribution doesn’t require expensive hardware. We have two clients at the moment. One written in Python that takes advantage of TensorFlow serving, and a second that uses TensorFlow’s native C++ API. This second client is the beginnings of what we hope to be able to run on embedded hardware, but it’s very early days right now.

And Finally

Imagine a future where state-of-the-art speech-to-text is available, for free (in cost and liberty), on even low-powered devices. It’s already looking like speech is going to be the next frontier of human-computer interaction, and currently it’s a space completely tied up by entities like Google, Amazon, Microsoft and IBM. Putting this power into everyone’s hands could be hugely transformative, and it’s great to be working towards this goal, even in a relatively modest capacity. This is the vision, and I look forward to helping make it a reality.

The new Online Accounts & Printer panels (and other related news!)

Greetings, GNOMErs!

If you’re watching closely the GNOME Control Center iterations, you probably noticed it already has a bunch of new panels: Keyboard, Mouse & Touchpad, and other panels like Sharing, Privacy and Search that don’t need to be ported.

I’m pleased to announce that the Online Accounts panel joined this family.

Check this out:

Captura de tela de 2017-02-23 10-27-19.pngThe new Online Accounts panel

This panel was easier to be ported since the codebase was already somewhat recent. It now features a single-column layout, which fits perfectly the next shell of GNOME Control Center.

Editing and creating accounts are now handled in a separate dialog:

captura-de-tela-de-2017-02-23-10-27-43Editing an account
captura-de-tela-de-2017-02-23-10-27-51Adding a new online account

But that’s not all!

The new Printers panel

Thanks to the always awsome Felipe Borges, we have a brand new, super shiny Printers panel. Most importantly, it has colored ink indicators! Dude, what else could you ask for?

I’ll let Felipe himself write this one in details, but why not tease the audience? 🙂

Captura de tela de 2017-02-23 10-45-52.pngSuper beautiful Printers panel (uh… no printers here)

Lets shout a big thanks to this great GNOME contributor!

m(_ _)m

What’s left?

In order to permanently switch to the new shell (the one in the screenshots), there are three major panels to be reworked:

  • Sounds: required to switch to the new shell, the Sounds panel will probably be very time consuming.
  • Networks: another required and hard panel. This panel will be split into Wi-Fi, Mobile Broadband, and Networks, and it’ll probably require a lot of work too.
  • Details: each tab in the Details panel will be split into a separate panel. It’ll be mostly manual work, and it’s easy (anyone interested in picking this one up? 🙂 )

This work can be tracked here. Excited? Join us in creating the next generation Settings application – contributions are so appreciated!

February 22, 2017

Slow Braised Carnitas

Update 23 Feb 2017 Ok, I may not have invented this recipe myself. The text of the DMCA takedown notice I received this morning has replaced the recipe below. I call upon Mr. Love to repost the recipe, as I am making carnitas this weekend.

My name is ROBERT LOVE and I am the NATURAL COPYRIGHT HOLDER OF THE CONTENT FORMERLY AT BLOG.RLOVE.ORG. A website that you and/or your company hosts is infringing on at least one copyright owned by myself.

An article was copied onto your servers without permission. The original article, to which we own the exclusive copyrights, can be found at:

https://web.archive.org/web/20160318211638/http://blog.rlove.org/2013/11/slow-braised-carnitas.html

The unauthorized and infringing copy can be found at:

https://joeshaw.org/slow-braised-carnitas/

This letter is official notification under Section 512(c) of the Digital Millennium Copyright Act (”DMCA”), and I seek the removal of the aforementioned infringing material from your servers. I request that you immediately notify the infringer of this notice and inform them of their duty to remove the infringing material immediately, and notify them to cease any further posting of infringing material to your server in the future.

Please also be advised that law requires you, as a service provider, to remove or disable access to the infringing materials upon receiving this notice. Under US law a service provider, such as yourself, enjoys immunity from a copyright lawsuit provided that you act with deliberate speed to investigate and rectify ongoing copyright infringement. If service providers do not investigate and remove or disable the infringing material this immunity is lost. Therefore, in order for you to remain immune from a copyright infringement action you will need to investigate and ultimately remove or otherwise disable the infringing material from your servers with all due speed should the direct infringer, your client, not comply immediately.

I am providing this notice in good faith and with the reasonable belief that rights my company owns are being infringed. Under penalty of perjury I certify that the information contained in the notification is both true and accurate, and I have the authority to act on behalf of the owner of the copyright(s) involved.

Should you wish to discuss this with me please contact me directly.

2017-02-22 Wednesday.

  • Excited to be demoing a new Ubuntu / Snap based solution alongside Ubuntu and Nextcloud at MWC Barcelona.
  • Also pleased to see AMD release Ryzen Making 16 threads accessible to everyone - helping your LibreOffice compiles (and load times of course too); good stuff.

February 21, 2017

2017-02-21 Tuesday.

  • Mail, financials with Tracie, call with Eloy; then Ash & Kendy. Built ESC stats. Out to cubs in the evening with M. - ran a code-breaking table with several fun encoding schemes.

Nautilus 3.24 – The changes

Another release of GNOME and Nautilus is coming, 3.24. As we have been doing for every release I will explain the changes that have an impact on the users and explain the reasoning behind them, and also a brief look into what we worked on this release.

User changes

Accessing root files – The admin backend

Since Nautilus was created, if a user wanted to open a folder where the user didn’t have permissions, for example a system folder where only root has access, it was required to start Nautilus with sudo.

However running UI apps under root is strongly discouraged, and to be honest, quite inconvenient. Running any UI app with sudo is actually not even supported in Wayland by design due to the security issues that that conveys.

So now we put a gvfs backend (kudos to Cosimo Cecchi) and we added support in Nautilus to open folders and files with no permission to access then conveniently into Nautilus itself. Now if you try to open a folder that requires special permissions it will ask for the root password with a dialog and once provided it will allow the user to navigate them without needing to restart the application or opening a new instance. The password is saved with a timeout so you can use access the file for some time, as is common with PolKit.

Also this allows to use Nautilus to access these files when using Wayland.

Here’s a demo how it works:

Operations button

Two releases ago we introduced a new way to handle operations in Nautilus, using an integrated popover rather than a separated window. We also added a button that shows a pie chart with the completion progress of the operations.

One issue we faced is that this button was not visible enough when an operation started, making the user think nothing was happening and the operation didn’t actually started.

For that we added an animation that makes the button shine, but seems that was not enough and after going back and forward through some solutions we end up showing the operations popover when and operations started, which is the last resort we could go with.

This is by far the least ideal solution, since it kinda disturbs the user and makes them dismiss the popover in every window (because where do you show it? only in the source of the operation? only in the destination? only where the user is looking at? everywhere?)

So we created a new animation that tries to catch more attention, and we will like to receive some feedback on it. Here is a demo:

And so far we didn’t add any user change more, so it should be a good release for those that like stability.

Nautilus desktop works on Wayland sessions

Thanks to Florian Mullner, he came up with a simple but working solution. Now you can use the desktop when in a Wayland session. Keep in mind this requires XWayland, and it’s backported to 3.22 too.

What we have work on

Flow box view

As you may know, the views in Nautilus are quite old and on the technical side they lack the most basic modern stuff, due to it’s custom implementation. For more than 2 years we have been working on allow a new view based on GtkFlowBox to be created.

Issues with current implementation are:

  • It’s not using gtk+ widgets or containers. This is by far the most important issue:
    • Any change or new feature  requires a whole “widget alike” object creation.
    • We cannot use any of the gtk+ widgets technology, including the gtk+ inspector or CSS
    • We cannot really create new visualisations like a placeholder for when copying files with some progress information or showing tags for files or any smart feature you could think
    • We have to maintain a big amount of code that Nautilus shouldn’t. This is widget work from gtk+
  • Spacing between items is not dynamic
  • Selection of items modify the visual of the picture, videos, icons with a blue overlay
  • We cannot have several zooms without making the view unusable.
  • The view and items “jumps” while it’s loading the files inside, making effectively useless to use Nautilus while the directory is being loaded.

This work it’s without hesitation the hardest and longest that Nautilus have faced for years. It required the desktop to be split from Nautilus, it requires modification in the most basic level of the code, porting to new technologies, etc.

Unfortunately, even if we managed to make all of this work, due to its performance issues we cannot switch to GtkFlowBox yet, and more work is required both in Nautilus and gtk+.

However I’m happy to announce that on the Hackfest we did last Novermber we decided to push forward this as a optional view to be able to iterate on its implementation and open the door for contributions from contributors and GSoC projects on it. This work is an experiment in several ways, not only in the technical part but also in the views itself and the code design. We will iterate on the feedback and code to make it shine before making it the default view. The same will apply to the list view once the icon view is ready.

This experimental view is now optionally checked as a gsetting called “use-experimental-views”. It’s not yet in the user preferences because it’s not in a usable state and it’s not intended to be used by nobody other than developers contributing to Nautilus.

Probably most of you already saw how this looks like on some of my posts in social medias almost a year ago. The state now is much better, here it is:

What did not make it into this release

You can take a look at the roadmap of Nautilus to see that most of the items I planned for this release were not implemented. Instead we implemented two different ones, the admin support and the new icon view.

This is expected, we decided a year ago to do a “tick-tack” release pace. Than means, one release with a lot of new changes and one with stabilisation and improvements in mind, and we kept that promise. We needed to stabilise and improve the new features, like the batch renaming and the decompression support, that we added last release.

So actually there are the same items that couldn’t be included and were mentioned in the previous release blog post.

The reasons this time are different. For the path bar, it was mostly because of the work in gtk4. We don’t want to have different path bars in gtk file chooser and Nautilus. So I would say we need to port Nautilus to gtk4 before committing the new path bar to gtk+. And that requires more work, and to gtk4 to stabilise a little.

For the action bar, was mostly because we need more design time on it. The difficulty in this project is not technical, but design wise.

You can look more details and bug fixes in the NEWS file.

Hope you enjoy the new release! As always, feel free to give any feedback or ask any questions in the comment section of this blog, be constructive please 🙂


GNOME hackaton in Brno

Last week, we had a presentation on Google Summer of Code and Outreachy at Brno University of Technology. Around 80 students attended which was a pretty good success considering it was not part of any course. It was a surprise for the uni people as well because the room they booked was only for 60 ppl.

The main reason why we did the presentation is that there have been very few students in Brno who participated in such programs. And the open source community is pretty big at local universities due to the presence of Red Hat. When we asked students who had heard of Google Summer of Code or Outreachy before only two raised their hands. That was even fever then we expected.

Shortly before the presentation, we discovered that the money reward for successfully finishing Google Summer of Code was not the same globally any more. And for the Czech Republic, it’s now $3600 instead of $5500. So considerably less, but still fairly attractive to local students.

As a follow-up to this presentation, we organized a GNOME hackaton in the Red Hat lab at BUT. Carlos Soriano was in charge of it with me, Felipe Borges, and Debarshi Ray helping him. Carlos prepared images for VirtualBox and KVM with a prepared development environment every student was supposed to download. People had to work in a virtual machine, but they didn’t have to spend time configuring and compiling everything and it assured that everyone had the same environment.

Around 12 students showed up which I think was a good turnout. 3 of them were women which is definitely higher % than the average at the uni. First Carlos told them to read the GNOME Newcomers guide and pick an app they’d like to contribute to. Then he created a dummy bug and showed students the whole process of fixing it from searching the code to the patch review. Then they were supposed to find some easy bug in the app of their choice and fix it.

Almost all students picked apps written in C, which is not so surprising because that’s the language they learn primarily at the university. Only one picked GNOME Music written in Python. The hackaton lasted for 5 hours and all students were busy for the whole time and almost everyone submitted some fix in the end.

Carlos is planning to do a follow-up with those who want to continue, probably before our (ir)regular Linux Desktop Meetup next week. Let’s see if some of them will make it to Google Summer of Code or Outreachy and even become long-term contributors to GNOME later on. It was the first time we actually made students to dip their fingers into the code. At all events before we had presentations on how they can contribute and pointed them to the docs to study at home, but the response was minimal. Maybe such a hackaton where you help students in person to make the first steps is the right approach to break through the barrier.

I’m pretty sure Carlos will also blog about his findings and it will be much more insightful since he spent a lot of time preparing the hackaton and was the one who talked to the students the most.

img_-dhuozjCarlos showing students how to fix a bug in GNOME

 


Valgrind Integration in Builder

I wanted a bit of a distraction this evening, so I put together a quick valgrind plugin. Just select “Run with Valgrind” from the drop down menu and it will make sure your program is run using the venerable valgrind.

If valgrind isn’t available in your runtime (host, jhbuild, flatpak, etc) you wont see the menu item. But thankfully the org.gnome.Sdk runtime we use to build GNOME apps has it, so for many of you it should Just Work™.

February 20, 2017

Data-driven testing with fbp-spec

Automated testing is a key part of software development toolkit and practice. fbp-spec is a testing framework especially designed for Flow-based Programming(FBP)/dataflow programming, which can be used with any FBP runtime.

For imperative or object-oriented code good frameworks are commonly available. For JavaScript, using for example Mocha with Chai assertions is pretty nice. These existing tools can be used with FBP also, as the runtime is implemented in a standard programming language. In fact we have used JavaScript (or CoffeeScript) with Mocha+Chai extensively to test NoFlo components and graphs. However the experience is less nice than it could be:

  • A high degree of amount of setup code is needed to test FBP components
  • Mental gymnastics when developing in FBP/dataflow, but testing with imperative code
  • The most critical aspects like inputs and expectations can drown in setup code
  • No integration with FBP tooling like the Flowhub visual programming IDE

A simple FBP specification

In FBP, code exists as a set of black-box components. Each component defines a set of inports which it receives data on, and a set of outports where data is emitted. Internal state, if any, should be observable through the data sent on outports.

A trivial FBP component

So the behavior of a component is defined by how data sent to input ports causes data to be emitted on output ports.
An fbp-spec is a set of testcases: Examples of input data along with the corresponding output data that is expected. It is stored as a machine-readable datastructure. To also make it nice to read/write also for humans, YAML is used as the preferred format.

topic: myproject/ToBoolean
cases:
-
  name: 'sending a boolean'
  assertion: 'should repeat the same'
  inputs:
    in: true
  expect:
    out:
      equals: true
- 
  name: 'sending a string'
  assertion: 'should convert to boolean'
  inputs: { in: 'true' }
  expect: { out: { equals: true } }

This kind of data-driven declaration of test-cases has the advantage that it is easy to see which things are covered – and which things are not. What about numbers? What about falsy cases? What about less obvious situations like passing { x: 3.0, y: 5.0 }?
And it would be similarly easy to add these cases in. Since unit-testing is example-based, it is critical to cover a diverse set of examples if one is to hope to catch the majority of bugs.

equals here is an assertion function. A limited set of functions are supported, including above/below, contains, and so on. And if the data output is a compound object, and possibly not all parts of the data are relevant to check, one can use a JSONPath to extract the relevant bits to run the assertion function against. There can also be multiple assertions against a single output.

topic: myproject/Parse
cases:
-
  name: 'sending a boolean'
  assertion: 'should repeat the same'
  inputs:
    in: '{ "json": { "number": 4.0, "boolean": true } }'
  expect:
    out:
    - { path: $.json.number,  equals: 4.0 }
    - { path: $.json.boolean, type: boolean }

Stateful components

A FBP component should, when possible, be state-free and not care about message ordering. However it is completely legal, and often useful to have stateful components. To test such a component one can specify a sequence of multiple input packets, and a corresponding expected output sequence.

topic: myproject/Toggle
cases:
-
  name: 'sending two packets'
  assertion: 'should first go high, then low'
  inputs:
  - { in: 0 }
  - { in: 0 }
  expect:
  -
    out: { equals: true }
    inverted: { equals: false }
  -
    out: { equals: false }
    inverted: { equals: true }

This still assumes that the component sends one set of packet out per input packet in. And that we can express our verification with the limited set of assertion operators. What if we need to test more complex message sending patterns, like a component which drops every second packet (like a decimation filter)? Or what if we’d like to verify the side-effects of a component?

Fixtures using FBP graphs

The format of fbp-spec is deliberately simple, designed to support the most common axes-of-freedom in tests as declarative data. For complex assertions, complex input data generation, or component setup, one should use a FBP graph as a fixture.

For instance if we wanted to test an image processing operation, we may have reference out and input files stored on disk. We can read these files with a set of components. And another component can calculate the similarity between the processed out, as a number that we can assert against in our testcases. The fixture graph could look like this:

Example fixture for testing image processing operation, as a FBP graph.

This can be stored using the .FBP DSL into the fbp-spec YAML file:

topic: my/Component
fixture:
 type: 'fbp'
 data: |
  INPORT=readimage.IN:INPUT
  INPORT=testee.PARAM:PARAM
  INPORT=reference.IN:REFERENCE
  OUTPORT=compare.OUT:SIMILARITY

  readimage(test/ReatImage) OUT -> IN testee(my/Component)
  testee OUT -> ACTUAL compare(test/CompareImage)
  reference(test/ReadImage) OUT -> REFERENCE compare
cases:
-
  name: 'testing complex data with custom components fixture'
  assertion: 'should pass'
  inputs:
    input: someimage
    param: 100
    reference: someimage-100-result
  expect:
    similarity:
      above: 0.99

Since FBP is a general purpose programming system, you can do arbitrarily complex things in such a fixture graph.

Flowhub integration

The Flowhub IDE is a client-side browser application. For it to actually cause changes in a live program, it communicate using the FBP runtime protocol to the FBP runtime, typically over WebSocket. This standardized protocol is what makes it possible to program such diverse targets, from webservers in Node.js, to image processing in C, sound processing with SuperCollider, bare-metal microcontrollers and distributed systems. And since fbp-spec uses the same protocol to drive tests, we can edit & run tests directly from Flowhub.

This gives Flowhub a basic level of integrated testing support. This is useful right now, and unlocks a number of future features.

On-device testing with MicroFlo

When programming microcontrollers, automated testing is still not as widely used as in web programming, at least outside very advanced or safety-critical industries. I believe this is largely because the tooling is far from as good. Which is why I’m pretty excited about fbp-spec for MicroFlo, since it makes it exactly as easy to write tests that run on microcontrollers as for any other FBP runtime.

Testing microcontroller code using fbp-spec

To summarize, with fbp-spec 0.2 there is an easy way to test FBP components, for any runtime which supports the FBP runtime protocol (and thus anything Flowhub supports). Check the documentation for how to get started.

Flattr this!

On Problems with Vala

If you’re going to be writing a new application based on GNOME technologies and targeting the GNOME ecosystem, then you should seriously consider writing it in the Vala programming language.

That’s a pretty controversial statement! Emmanuele just told us that Vala is dying and that you should find an alternative. So, if I’m recommending that you start writing new applications in Vala, clearly I disagree with him at least somewhat. Even so, I actually think pretty much all of Emmanuele’s points are correct. Vala really is dying! The status quo really is pretty bad. Using a dying programming language to write your application is rarely a good idea. You should think twice before doing so.

Still, I wouldn’t be so quick to write off Vala. For one thing, it’s a pleasure to use. The design of the language is very good. The integration with GObject and the GNOME ecosystem, from GObject signals and properties to native support for D-Bus and composite GTK+ widget templates, is second to none, and will probably never be surpassed by another language. It’s hard to understate how good the syntax of the language is, and how tailored it is for GNOME programming. People like Vala for good reasons.

Emmanuele says that it’s time to look at alternatives to Vala, but the alternatives we have to Vala right now have big problems too. If I were to start writing a new GNOME app today, Vala is still the language I would use. So now I have to try to convince you of that! First, let’s look at current problems with Vala in more detail. Then, let’s look into the alternatives we have available.

The problems with Vala are real and very serious, so I can only give it that qualified recommendation. The Vala community is slowly dying, and I would not recommend starting a big, complex application in Vala today, given the risk that the compiler might be completely unmaintained in a few years. But most GNOME applications are fairly small — only a few, like Builder or Evolution or Epiphany, are big and complex — and I think most will probably do well enough in the long run with Vala even if the Vala compiler stops improving.

Problems with Vala

Yeah, I’m afraid this is not going to be a short post. Let’s take this in two: first, common complaints that I don’t think are actually serious problems, and second, the actual serious problems.

Minor Problems with Vala

Let me start off by pointing out a couple things that I don’t consider to be serious problems with Vala: bindings issues and tooling issues.

People often complain that there are bugs with the bindings. And there are! Debugging bindings bugs is not fun at all. But I get the impression that bindings complaints are generally about the state of the bindings five years ago. The bindings situation is a lot better now than it was then, and it is constantly improving. Vala bindings actually are well-maintained (thanks mostly to Rico Tzschichholz); it’s only the compiler itself that is having maintenance problems. And most of the bindings problems are caused by introspection issues in the upstream libraries. That means that if you’re hitting a bindings problem in Vala, it’s probably a problem in every other language you might want to use as well… except C and C++, of course. And bindings issues are actually arguably far easier to debug in Vala than they would be in Python or JavaScript, since you can look for errors in the generated C code. Fixing bindings is generally easy, and you can work around the problems using a drop-in vapi file if you can’t wait to get the fix upstreamed. Adding new bindings is work if the library is not introspectable, but much easier than it is in other languages. No doubt programming would be nicer if bindings were not necessary, but unless you want to write everything in C or C++, bindings are a fact of life that won’t go away, and Vala’s are pretty darn good.

As far as tooling: it’s true that the Vala ecosystem does not have great tools, but I don’t think this is really a horrible problem. The most common complaint I see is that debugging requires looking at generated C code, and there’s no special Vala debugger. Now, in the case of crashes, it usually does indeed require looking at the generated code. But, in my experience, crashes are much, much rarer in applications that are written in Vala, so we’re going to be spending a lot less time in the debugger than we would when working on C applications anyway. And debugging the generated code with gdb isn’t so horrible. It’s hardly a great experience, but you get used to it. Be sure that you’re using vala -g to emit Vala line numbers into the generated code, otherwise you’re just making your life unnecessarily difficult. At any rate, gdb plus line numbers is the way to go here. Vala debugging is never going to be as simple as C or C++ debugging, but you’ll have to do less of it than you would in C or C++, and that’s a reasonable trade-off.

Another problem with Vala is that it suffers from the same safety issues as C and C++. You will make mistakes, and your mistakes will allow remote attackers to take control of your users’ computers. Vala doesn’t do anything to avoid buffer overflows, for instance. That’s pretty bad. But you will at least make fewer mistakes than you would in C or C++. For instance, I believe the language makes refcounting errors an order of magnitude less likely, drastically reducing the number of use-after-free vulnerabilities in your code. Compared to Rust or Python or JavaScript, this is not very good at all, but compared to C or C++, it’s excellent.

Major Problems with Vala

I see two serious problems with Vala. The first is that the compiler has bugs, and debugging compiler bugs is very unpleasant. The second is that the compiler is not well-maintained. Like Emmanuele says, the Vala community is dying, or, if you want to be generous, at least not in a very healthy state. So when you report compiler bugs, probably nobody is going to fix those bugs. This can be very frustrating.

Vala Bugs

I can confidently say that Vala has more bugs than any other programming language you might be considering using for GNOME development. It’s sad, but true. Most of the bugs are just small annoyances; for instance, bugs in which the Vala compiler emits C code that does not actually compile. These are usually easy to work around, but that can be pretty annoying. Other bugs are more serious. For instance, see signal handler spuriously runs when signal is emitted by object not connected to once every 98 emissions (which was fixed a few years ago, but a good example of how Vala bugs can cause runtime problems) or Incorrect choice of signal marshaller causes crash when promoting a pawn in GNOME Chess when built with Fedora or Debian hardening flags (still broken).

Of course, all bugs are fragile if there is an active community of developers fixing them. But, as Emmanuele has already pointed out, that is not going so well.

Vala Maintainership and Community

Vala’s greatest strength — its focus on GNOME — is also its greatest weakness. Vala is not very interesting to anyone outside the GNOME and GTK+ development communities. Accordingly, the community of Vala developers and maintainers is several orders of magnitude smaller than other programming language communities.

Relative to the fairly small size of the GNOME ecosystem, there are actually a very large number of Vala applications in existence. (All of elementary’s applications use Vala, for example.) So there is a relatively large number of Vala application maintainers with a stake in the success of the Vala project. But they’re mostly focused on developing their applications, not Vala itself. A programming language is probably not the greatest tool for any job if it requires that you participate in maintaining the compiler, after all. And the barrier for entry to Vala compiler development is high. For starters, compilers are difficult and complicated; working on a compiler is far more difficult than working on desktop applications. Moreover, of the people who are motivated to contribute to the compiler and submit a patch, most probably get discouraged pretty quickly, because most patches posted on Bugzilla do not get reviewed. There are currently 179 unreviewed patches in Vala’s request queue. The oldest patch there is 2,556 days old, so we know that it’s been seven years since anyone has cared for the outstanding patches. Any of those discouraged contributors might have eventually turned into Vala maintainers if only their patches were reviewed. Of course, most would not have, but if only one or two of the people who submitted patches was an active Vala maintainer today, the project would be in a significantly better state. And I see patches there from a large number of different developers.

But who can review the patches? Vala needs more maintainers. Rico is taking good care of the bindings and appears to be committing patches to the compiler as well, but he’s just one person and can’t do it alone. Vala stakeholders need to increase investment in the compiler. But this is a familiar problem: the majority of our modules need more maintainers. Maintainers do not grow on trees. Ideally a company will step in to support Vala compiler development, but few companies seem to have taken an interest in Vala, so this doesn’t seem likely. This is unfortunate.

I frankly expect that Emmanuele’s prediction will prove true, and that the Vala situation will only get worse in the next five years. It’s more likely than not. But I’m not very confident in that guess! Several people have contributed significant patches to the Vala compiler recently. (Carlos Garnacho, you have earned much beer.) The future is still uncertain. I very much hope that my pessimistic expectation is proven wrong, and that the maintainership situation will improve soon.

But while the Vala compiler may stagnate, it’s probably not going to get worse. I think it’s good enough for writing GNOME applications today, and I expect it will still be good enough in five years.

Alternatives to Vala

So Vala is not in great state. What else can we use to write GNOME applications? The only serious programming languages in the GNOME ecosystem are C, C++, Vala, Python (using PyGObject), and JavaScript (using gjs). No, I did not miss any options. If your favorite language isn’t listed there, it’s either because (a) it doesn’t have decent GObject bindings, or (b) the language is not popular at all. To my knowledge, all GNOME software is written in one of those five languages, except for a couple old applications that use C#. And the state of C# in GNOME makes Vala look like an active vibrant language. If you want to start writing a GTK+ 2 app in 2017, go ahead and use C#. The rest of us will restrict our search to C, C++, Vala, Python, and JavaScript.

(Tangent time! Rust is trendy, but I’m told it needs more help to improve the GObject bindings before we start using it in applications. I’m hoping that it will emerge as the superior option in the not so distant future, but it’s definitely not ready for use in GNOME yet. It has to have better GObject integration. It has to have some degree of ABI stability, even if it’s limited. Dynamic linking has to be the default. It’s not going to be successful in the GNOME community otherwise. You should join the Rust folks and help out!)

Let’s start with C. C is undoubtedly the most popular language used in GNOME programming, but it would be flatly irresponsible to choose it for writing new applications. I enjoy writing C, but like everyone else, I make mistakes, and I think it would be desirable if my programming mistakes did not allow attackers to execute arbitrary code on your computer. It’s also extremely verbose, requiring far more lines of code to do simple things than the other programming languages that we’re considering do. C is not a reasonable option for new applications in 2017, even if it is the language you are most familiar with. I wouldn’t go so far as to say that our existing applications need to be rewritten in a safer language, because rewriting applications is hard and our developer community is small, but I certainly would not want to start writing any new applications in C. We need a C migration plan.

Modern C++ is a bit safer and much more pleasant to use than C, but that’s really not saying all that much. Footguns abound. You have to know all sorts of arcane rules to use it properly. The barrier for entry to new contributors is much higher than it is with C. Developers still make lots of mistakes, and those mistakes still allow remote attackers to take control of your users’ computers. So C++ is not a good choice for new applications either.

Python… OK, I suppose Python is pretty good, if you’re willing to give up compiler errors and static typing. I prefer to use a compiled language for writing serious software, because I make a lot of mistakes, and I’d rather the compiler catch those mistakes when possible than find out about them at runtime. So I would still prefer Vala. But if you prefer scripting languages, then Python is just fine, and doesn’t suffer from any of the disadvantages of Vala, and you should use it for your new app. Some developers have mentioned that there are some gotchas and interoperability issues with moving between Python APIs and GNOME APIs, but no programming environment is ever going to be perfect. PyGObject is good enough, and I’m pretty sure we’re going to be using it for a long time.

The last option is JavaScript. With all due respect to the gjs folks — and Philip Chimento in particular, who has been working hard at Endless to improve the JavaScript experience for GNOME developers — there’s no way to change the reality that JavaScript is a terrible language. It has close to zero redeeming features, and many confusing ones. You use it in web browsers because you have to, but for a desktop application, I have no clue why you would choose to use this over Python. We have to maintain gjs forever (for some value of “forever”) because GNOME Shell uses it, and it’s also being used by a couple apps like GNOME Weather and GNOME Documents. But it should be your last choice for a desktop application. Do not use JavaScript for new projects.

Another disadvantage of using JavaScript is that there is a huge barrier to entry for newcomers. But wait, lots of web developers are familiar with JavaScript; wasn’t the whole point of using it to lower the barrier of entry to newcomers? Well look how well that worked out for us! We have approximately zero new developers flocking to work on our JavaScript applications. The only documentation currently available online is over three years old, covers only a subset of the introspectable libraries that you want to use, and is frankly pretty bad. Unless opening gir files in a text editor and reading internal gjs unit tests to figure out how to call functions sounds like a good newcomer experience to you, then we need to steer far clear of JavaScript. The documentation situation is a fixable problem — Philip has much improved documentation that’s just waiting for hosting to materialize — but there’s no momentum to fix it right now, and the defects of the language can’t ever be fixed.

So all of the alternatives to Vala have big problems too, except maybe for Python, which is not a compiled language, which many of us would consider a serious disadvantage in itself. If you don’t want to use Vala, you have to pick one of the alternatives. So which will it be? I have no doubt that many or even most of our community places different weight on the various advantages and disadvantages of the languages. I actually expect mine is a minority opinion. But at the very least, I think I’ve shown why Vala still seems attractive to many developers.

(Note that the above analysis does not apply to libraries. You cannot write a system library in Python or JavaScript. You can do so with Vala or C++, but it requires special care. GNOME platform libraries must have a C API in order to be introspectable and useful.)

Conclusion

If you ignore its bugs and its maintainership status, Vala is by far the best language for writing GNOME applications. But those are pretty big things to ignore. I’d still use it anyway. It’s hard to understate how pleasant it is to develop with. The most frequent complaints I see are about problems that I don’t actually consider very serious. I don’t know. I also don’t know what the language of GNOME’s future is, but I do know that we need to stop writing new applications in C, and until GObject integration for Rust is ready, Vala still seems like our best shot at that.

Who Maintains That Stuff?

If you use GNOME or Ubuntu, then GNOME Disks is probably what you rely on if you ever need to do any disk management operations, so it’s a relatively important piece of software for GNOME and Ubuntu users. Now if you’re a command line geek, you might handle disk management via command line, and that’s fine, but most users don’t know how to do that. Or if you’re living in the past like Ubuntu and not yet using Wayland, you might prefer GParted (which does not work under Wayland because it requires root permissions, while we intentionally will not allow applications to run as root in Wayland). But for anyone else, you’re probably using GNOME Disks. So it would be good for it to work reliably, and for it to be relatively free of bugs.

I regularly receive new bug reports against GNOME Disks. Sometimes they’re not very well-constructed or based on some misunderstanding of how partitioning works, in which case I’ll close them, but most of them are good and valid. So who fixes bug reports against GNOME Disks? The answer is: nobody! Unless it’s really, really easy — in which case I might allocate five minutes for it — nobody is going to fix the bug that you reported. What a shame!

Who is the maintainer? In this case, it’s me, but I don’t actually know much anything about the application and certainly don’t have time to fix things; I just check Bugzilla to see if anybody has posted a patch, so that contributors’ patches (which are rare) don’t get totally neglected, and make new releases every once in a while, and only because I didn’t want to see such a critical piece of software go completely unmaintained.

If you’re a software developer with an interest in both GNOME and disk management, GNOME Disks would be a great place to help out. A great place to start would be to search through GNOME Bugzilla for issues to work on, and submit patches for them.

Of course, Disks is far from the only unmaintained or undermaintained software in GNOME. Last year, Sébastien set up a wiki page to track unmaintained and undermaintained apps. It has had some success: in that time, GNOME Calculator, Shotwell, Gtranslator, and Geary have all found maintainers and been removed from the list of unmaintained modules. (Geary is still listed as undermaintained, and no doubt it would be nice to have more Geary maintainers, but the current maintainer seems to be quite active, so I would hesitate to list it as undermaintained. Epiphany would love to have a second maintainer as well. No doubt most GNOME apps would.)

But we still have a few apps that are listed as unmaintained:

  • Bijiben (GNOME Notes)
  • Empathy
  • GNOME Disks

No doubt there are more GNOME modules that should be listed. If you know of some, please add them or leave a comment here.

Help would be very much welcome with any of these. In particular, Empathy and Bijiben are both slated to be removed from Fedora beginning with Fedora 27 due to their unacceptable dependencies on an old, insecure version of WebKitGTK+ that is about to be removed from the distribution. Most of the work to port these applications to modern WebKitGTK+ is already done (and, in the case of Empathy, I’ve already committed the port to git), but an active maintainer is required to finish the job and get things to a releasable state. Last I checked, Bijiben also still needed to be ported to GTK+ 3.20. If nobody is interested in helping out, these apps are going to disappear sooner rather than later.

Disks, fortunately, is not going to disappear anytime soon. But the bugs aren’t going to fix themselves.

P.S. This blog is not the right place to complain about no longer allowing applications to run as root. Such applications can and should use Polkit to move privileged operations out of the GUI and into a helper process. This should have been done roughly a decade ago. Such applications might themselves be unmaintained or undermaintained; can you help them out?

February 19, 2017

Rusty Builder

Thanks to Georg Vienna, Builder can now manage your Rust installations using RustUp!

February 18, 2017

Fri 2017/Feb/17

  • How librsvg exports reference-counted objects from Rust to C

    Librsvg maintains a tree of RsvgNode objects; each of these corresponds to an SVG element. An RsvgNode is a node in an n-ary tree; for example, a node for an SVG "group" can have any number of children that represent various shapes. The toplevel element is the root of the tree, and it is the "svg" XML element at the beginning of an SVG file.

    Last December I started to sketch out the Rust code to replace the C implementation of RsvgNode. Today I was able to push a working version to the librsvg repository. This is a major milestone for myself, and this post is a description of that journey.

    Nodes in librsvg in C

    Librsvg used to have a very simple scheme for memory management and its representation of SVG objects. There was a basic RsvgNode structure:

    typedef enum {
        RSVG_NODE_TYPE_INVALID,
        RSVG_NODE_TYPE_CHARS,
        RSVG_NODE_TYPE_CIRCLE,
        RSVG_NODE_TYPE_CLIP_PATH,
        /* ... a bunch of other node types */
    } RsvgNodeType;
    	      
    typedef struct {
        RsvgState    *state;
        RsvgNode     *parent;
        GPtrArray    *children;
        RsvgNodeType  type;
    
        void (*free)     (RsvgNode *self);
        void (*draw)     (RsvgNode *self, RsvgDrawingCtx *ctx, int dominate);
        void (*set_atts) (RsvgNode *self, RsvgHandle *handle, RsvgPropertyBag *pbag);
    } RsvgNode;
    	    

    This is a no-frills base struct for SVG objects; it just has the node's parent, its children, its type, the CSS state for the node, and a virtual function table with just three methods. In typical C fashion for derived objects, each concrete object type is declared similar to the following one:

    typedef struct {
        RsvgNode super;
        RsvgLength cx, cy, r;
    } RsvgNodeCircle;
    	    

    The user-facing object in librsvg is an RsvgHandle: that is what you get out of the API when you load an SVG file. Internally, the RsvgHandle has a tree of RsvgNode objects — actually, a tree of concrete implementations like the RsvgNodeCircle above or others like RsvgNodeGroup (for groups of objects) or RsvgNodePath (for Bézier paths).

    Also, the RsvgHandle has an all_nodes array, which is a big list of all the RsvgNode objects that it is handling, regardless of their position in the tree. It also has a hash table that maps string IDs to nodes, for when the XML elements in the SVG have an "id" attribute to name them. At various times, the RsvgHandle or the drawing-time machinery may have extra references to nodes within the tree.

    Memory management is simple. Nodes get allocated at loading time, and they never get freed or moved around until the RsvgHandle is destroyed. To free the nodes, the RsvgHandle code just goes through its all_nodes array and calls the node->free() method on each of them. Any references to the nodes that remain in other places will dangle, but since everything is being freed anyway, things are fine. Before the RsvgHandle is freed, the code can copy pointers around with impunity, as it knows that the all_nodes array basically stores the "master" pointers that will need to be freed in the end.

    But Rust doesn't work that way

    Not so, indeed! C lets you copy pointers around all you wish; it lets you modify all the data at any time; and forces you to do all the memory-management bookkeeping yourself. Rust has simple and strict rules for data access, with profound implications. You may want to read up on ownership (where variable bindings own the value they refer to, there is one and only one binding to a value at any time, and values are deleted when their variable bindings go out of scope), and on references and borrowing (references can't outlive their parent scope; and you can either have one or more immutable references to a resource, or exactly one mutable reference, but not both at the same time). Together, these rules avoid dangling pointers, data races, and other common problems from C.

    So while the C version of librsvg had a sea of carefully-managed shared pointers, the Rust version needs something different.

    And it all started with how to represent the tree of nodes.

    How to represent a tree

    Let's narrow down our view of RsvgNode from C above:

    typedef struct {
        ...
        RsvgNode  *parent;
        GPtrArray *children;
        ...
    } RsvgNode;
    	    

    A node has pointers to all its children, and each child has a pointer back to the parent node. This creates a bunch of circular references! We would need a real garbage collector to deal with this, or an ad-hoc manual memory management scheme like librsvg's and its all_nodes array.

    Rust is not garbage-collected and it doesn't let you have shared pointers easily or without unsafe code. Instead, we'll use reference counting. To avoid circular references, which a reference-counting scheme cannot handle, we use strong references from parents to children, and weak references from the children to point back to parent nodes.

    In Rust, you can add reference-counting to your type Foo by using Rc<Foo> (if you need atomic reference-counting for multi-threading, you can use an Arc<Foo>). An Rc represents a strong reference count; conversely, a Weak<Foo> is a weak reference. So, the Rust Node looks like this:

    pub struct Node {
        ...
        parent:    Option<Weak<Node>>,
        children:  RefCell<Vec<Rc<Node>>>,
        ...
    }
    	    

    Let's unpack that bit by bit.

    "parent: Option<Weak<Node>>". The Weak<Node> is a weak reference to the parent Node, since a strong reference would create a circular refcount, which is wrong. Also, not all nodes have a parent (i.e. the root node doesn't have a parent), so put the Weak reference inside an Option. In C you would put a NULL pointer in the parent field; Rust doesn't have null references, and instead represents lack-of-something via an Option set to None.

    "children: RefCell<Vec<Rc<Node>>>". The Vec<Rc<Node>>> is an array (vector) of strong references to child nodes. Since we want to be able to add children to that array while the rest of the Node structure remains immutable, we wrap the array in a RefCell. This is an object that can hand out a mutable reference to the vector, but only if there is no other mutable reference at the same time (so two places of the code don't have different views of what the vector contains). You may want to read up on interior mutability.

    Strong Rc references and Weak refs behave as expected. If you have an Rc<Foo>, you can ask it to downgrade() to a Weak reference. And if you have a Weak<Foo>, you can ask it to upgrade() to a strong Rc, but since this may fail if the Foo has already been freed, that upgrade() returns an Option<Rc<Foo>> — if it is None, then the Foo was freed and you don't get a strong Rc; if it is Some(x), then x is an Rc<Foo>, which is your new strong reference.

    Handing out Rust reference-counted objects to C

    In the post about Rust constructors exposed to C, we talked about how a Box is Rust's primitive to put objects in the heap. You can then ask the Box for the pointer to its heap object, and hand out that pointer to C.

    If we want to hand out an Rc to the C code, we therefore need to put our Rc in the heap by boxing it. And going back, we can unbox an Rc and let it fall out of scope in order to free the memory from that box and decrease the reference count on the underlying object.

    First we will define a type alias, so we can write RsvgNode instead of Rc<Node> and make function prototypes closer to the ones in the C code:

    pub type RsvgNode = Rc<Node>;
    	    

    Then, a convenience function to box a refcounted Node and extract a pointer to the Rc, which is now in the heap:

    pub fn box_node (node: RsvgNode) -> *mut RsvgNode {
        Box::into_raw (Box::new (node))
    }
    	    

    Now we can use that function to implement ref():

    #[no_mangle]
    pub extern fn rsvg_node_ref (raw_node: *mut RsvgNode) -> *mut RsvgNode {
        assert! (!raw_node.is_null ());
        let node: &RsvgNode = unsafe { & *raw_node };
    
        box_node (node.clone ())
    }
    	    

    Here, the node.clone () is what increases the reference count. Since that gives us a new Rc, we want to box it again and hand out a new pointer to the C code.

    You may want to read that twice: when we increment the refcount, the C code gets a new pointer! This is like creating a hardlink to a Unix file — it has two different names that point to the same inode. Similarly, our boxed, cloned Rc will have a different heap address than the original one, but both will refer to the same Node in the end.

    This is the implementation for unref():

    #[no_mangle]
    pub extern fn rsvg_node_unref (raw_node: *mut RsvgNode) -> *mut RsvgNode {
        if !raw_node.is_null () {
            let _ = unsafe { Box::from_raw (raw_node) };
        }
    
        ptr::null_mut () // so the caller can do "node = rsvg_node_unref (node);" and lose access to the node
    }
    	    

    This is very similar to the destructor from a few blog posts ago. Since the Box owns the Rc it contains, letting the Box go out of scope frees it, which in turn decreases the refcount in the Rc. However, note that this rsvg_node_unref(), intended to be called from C, always returns a NULL pointer. Together, both functions are to be used like this:

    RsvgNode *node = ...; /* acquire a node from Rust */
    
    RsvgNode *extra_ref = rsvg_node_ref (node);
    
    /* ... operate on the extra ref; now discard it ... */
    
    extra_ref = rsvg_node_unref (extra_ref);
    
    /* Here extra_ref == NULL and therefore you can't use it anymore! */
    	    

    This is a bit different from g_object_ref(), which returns the same pointer value as what you feed it. Also, the pointer that you would pass to g_object_unref() remains usable if you didn't take away the last reference... although of course, using it directly after unreffing it is perilous as hell and probably a bug.

    In these functions that you call from C but are implemented in Rust, ref() gives you a different pointer than what you feed it, and unref() gives you back NULL, so you can't use that pointer anymore.

    To ensure that I actually used the values as intended and didn't fuck up the remaining C code, I marked the function prototypes with the G_GNUC_WARN_UNUSED_RESULT attribute. This way gcc will complain if I just call rsvg_node_ref() or rsvg_node_unref() without actually using the return value:

    RsvgNode *rsvg_node_ref (RsvgNode *node) G_GNUC_WARN_UNUSED_RESULT;
    
    RsvgNode *rsvg_node_unref (RsvgNode *node) G_GNUC_WARN_UNUSED_RESULT;
    	    

    And this actually saved my butt in three places in the code when I was converting it to reference counting. Twice when I forgot to just use the return values as intended; once when the old code was such that trivially adding refcounting made it use a pointer after unreffing it. Make the compiler watch your back, kids!

    Testing

    One of the things that makes me giddy with joy is how easy it is to write unit tests in Rust. I can write a test for the refcounting machinery above directly in my node.rs file, without needing to use C.

    This is the test for ref and unref:

    #[test]
    fn node_refs_and_unrefs () {
        let node = Rc::new (Node::new (...));
    
        let mut ref1 = box_node (node);                            // "hand out a pointer to C"
    
        let new_node: &mut RsvgNode = unsafe { &mut *ref1 };       // "bring back a pointer from C"
        let weak = Rc::downgrade (new_node);                       // take a weak ref so we can know when the node is freed
    
        let mut ref2 = unsafe { rsvg_node_ref (new_node) };        // first extra reference
        assert! (weak.upgrade ().is_some ());                      // "you still there?"
    
        ref2 = unsafe { rsvg_node_unref (ref2) };                  // drop the extra reference
        assert! (weak.upgrade ().is_some ());                      // "you still have life left in you, right?"
    
        ref1 = unsafe { rsvg_node_unref (ref1) };                  // drop the last reference
        assert! (weak.upgrade ().is_none ());                      // "you are dead, aren't you?
    }
    	    

    And this is the test for two refcounts indeed pointing to the same Node:

    #[test]
    fn reffed_node_is_same_as_original_node () {
        let node = Rc::new (Node::new (...));
    
        let mut ref1 = box_node (node);                         // "hand out a pointer to C"
    
        let mut ref2 = unsafe { rsvg_node_ref (ref1) };         // "C takes an extra reference and gets a new pointer"
    
        unsafe { assert! (rsvg_node_is_same (ref1, ref2)); }    // but they refer to the same thing, correct?
    
        ref1 = rsvg_node_unref (ref1);
        ref2 = rsvg_node_unref (ref2);
    }
    	    

    Hold on! Where did that rsvg_node_is_same() come from? Since calling rsvg_node_ref() now gives a different pointer to the original ref, we can no longer just do "some_node == tree_root" to check for equality and implement a special case. We need to do "rsvg_node_is_same (some_node, tree_root)" instead. I'll just point you to the source for this function.

February 16, 2017

Announcing scikit-survival – a Python library for survival analysis build on top of scikit-learn

I've meant to do this release for quite a while now and last week I finally had some time to package everything and update the dependencies. scikit-survival contains the majority of code I developed during my Ph.D.

About Survival Analysis

Survival analysis – also referred to as reliability analysis in engineering – refers to type of problem in statistics where the objective is to establish a connections between a set of measurements (often called features or covariates) and the time to an event. The name survival analysis originates from clinical research: in many clinical studies, one is interested in predicting the time to death, i.e., survival. Broadly speaking, survival analysis is a type of regression problem (one wants to predict a continuous value), but with a twist. Consider a clinical study, which investigates coronary heart disease and has been carried out over a 1 year period as in the figure below.

Patient A was lost to follow-up after three months with no recorded cardiovascular event, patient B experienced an event four and a half months after enrollment, patient D withdrew from the study two months after enrollment, and patient E did not experience any event before the study ended. Consequently, the exact time of a cardiovascular event could only be recorded for patients B and C; their records are uncensored. For the remaining patients it is unknown whether they did or did not experience an event after termination of the study. The only valid information that is available for patients A, D, and E is that they were event-free up to their last follow-up. Therefore, their records are censored.

Formally, each patient record consists of a set of covariates $x \in \mathbb{R}^d$ , and the time $t > 0$ when an event occurred or the time $c > 0$ of censoring. Since censoring and experiencing and event are mutually exclusive, it is common to define an event indicator $\delta \in \{0; 1\}$ and the observable survival time $y > 0$. The observable time $y$ of a right censored sample is defined as
\[ y = \min(t, c) =
\begin{cases}
t & \text{if } \delta = 1 , \\
c & \text{if } \delta = 0 ,
\end{cases}
\]

What is scikit-survival?

Recently, many methods from machine learning have been adapted for these kind of problems: random forest, gradient boosting, and support vector machine, many of which are only available for R, but not Python. Some of the traditional models are part of lifelines or statsmodels, but none of those libraries plays nice with scikit-learn, which is the quasi-standard machine learning framework for Python.

This is exactly where scikit-survival comes in. Models implemented in scikit-survival follow the scikit-learn interfaces. Thus, it is possible to use PCA from scikit-learn for dimensionality reduction and feed the low-dimensional representation to a survival model from scikit-survival, or cross-validate a survival model using the classes from scikit-learn. You can see an example of the latter in this notebook.

Download and Install

The source code is available at GitHub and can be installed via Anaconda (currently only for Linux) or pip.

conda install -c sebp scikit-survival

pip install scikit-survival

The API documentation is available here and scikit-survival ships with a couple of sample datasets from the medical domain to get you started.

Using IPython for parallel computing on an MPI cluster using SLURM

I assume that you already familiar with using IPython.parallel, otherwise have a look at the documentation. Just one point of caution, if you move code that you want to run in parallel to a package or module, you might stumble upon the problem that Python cannot find the function you imported. The problem is that IPython.parallel only looks in the global namespace. The solution is using @interactive from IPython.parallel as described in this Stack Overflow post.

Although, IPython.parallel comes with built-in support for distributing work using MPI, it is going to create MPI tasks itself. However, usually compute clusters come with a job scheduler like SLURM that manages all resources. Consequently, the scheduler determines how many resources you have access to. In the case of SLURM, you have to define the number of tasks you want to process in parallel and the maximum time our job will require when adding your job to the queue.

#!/bin/bash
#SBATCH -J ipython-parallel-test
#SBATCH --ntasks=112
#SBATCH --time=00:10:00

The above script gives the job a name, requests resources to 112 tasks and sets the maximum required time to 10 minutes.

Normally, you would use ipcluster start -n 112, but since we are not allowed to create MPI tasks ourselves, we have to start the individual pieces manually via ipcontroller and ipengine. The controller provides a single point of contact the engines connect to and the engines take commands and execute them.

profile=job_${SLURM_JOB_ID}_$(hostname)
 
echo "Creating profile ${profile}"
ipython profile create ${profile}
 
echo "Launching controller"
ipcontroller --ip="*" --profile=${profile} --log-to-file &
sleep 10
 
echo "Launching engines"
srun ipengine --profile=${profile} --location=$(hostname) --log-to-file &
sleep 45

First of all, we create a new IPython profile, which will contain log files and temporary files that are necessary to establish the communication between controller and engines. To avoid clashes with other jobs, the name of the profile contains the job's ID and the hostname of the machine it is executed on. This will create a folder named profile_job_XYZ_hostname in the ~/.ipython folder.

Next, we start the controller and instruct it to listen on all available interfaces, use the newly create profile and write output to log files residing in the profile's directory. Note that this command is executed only on a single node, thus we only have a single controller per job.

Finally, we can create the engines, one one for each task, and instruct them to connect to the correct controller. Explicitly specifying the location of the controller is necessary if engines are spread across multiple physical machines and machines have multiple Ethernet interfaces. Removing this option, engines running on other machines are likely to fail connecting to the controller because they might look for the controller at the wrong location (usually localhost). You can easily find out whether this is the case by looking at the ipengine log files in the ~/.ipython/profile_job_XYZ_hostname/log directory.

Finally, we can start our Python script that uses IPython.parallel to distribute work across multiple nodes.

echo "Launching job"
python ipython-parallel-test.py --profile ${profile}

To make things more interesting, I created an actual script that approximates the number Pi in parallel.

import argparse
from IPython.parallel import Client
import numpy as np
import os
 
 
def compute_pi(n_samples):
        s = 0
        for i in range(n_samples):
                x = random()
                y = random()
                if x * x + y * y <= 1:
                        s += 1
        return 4. * s / n_samples
 
 
def main(profile):
        rc = Client(profile=profile)
        views = rc[:]
        with views.sync_imports():
                from random import random
        
        results = views.apply_sync(compute_pi, int(1e9))
        my_pi = np.sum(results) / len(results)
        filename = "result-job-{0}.txt".format(os.environ["SLURM_JOB_ID"])
        with open(filename, "w") as fp:
                fp.write("%.20f\n" % my_pi)
 
 
if __name__ == '__main__':
        parser = argparse.ArgumentParser()
        parser.add_argument("-p", "--profile", required=True,
                help="Name of IPython profile to use")
 
        args = parser.parse_args()
 
        main(args.profile)

I ran this on the Linux cluster of the Leibniz Supercomputing Centre. Using 112 tasks, it took a little more than 8 minutes, and the result I got was 3.14159637160714266813. You can download the SLURM script and Python code from above here.

February 15, 2017

Transit routing has landed!

So, at FOSDEM a bit over a week ago, me, Jonas Danielsson, Mattias Bengtsson, and Andreas Nilsson talked about plans for landing the transit routing feature and we started doing some patch reviewing over some beers on Saturday evening.

Thanks to a lot of awesome reviewing work done by Jonas, this work has finally landed in master!

As always when merging such a long-living branch where I have been dwelling for about a year or so almost feels like saying farewell to an old friend.

Though, until we have sorted out some infrastructure for running OpenTripPlanner, you would have to use the debug environment variable OTP_BASE_URL (or modify the service file) as I have described in earlier blog posts.

Another feature that I intend to squeeze in before we release 3.24 is the ability to reverse routes. This is something that came up during Andreas' user interviews, one user had a use-case of going into town to meet up with friends and finding a suitable route home later in the evening. I realized our current UI is a bit awkward for this (even though you can drag and drop the entered places/addresses).

So with this new feature you get a handy button to reverse what you just searched for:


Searching for a route.


After clicking the reverse button (the up/down arrows)

Looking at one of trips

So, until next time, map along! :-)

Nightly and Wayland Builds of Firefox for Flatpak

When I announced Firefox Developer Edition for Flatpak over a month ago, I also promised that we would not stop there and bring more options in the future. Now I can proudly announce that we provide two more variants of Firefox – Firefox Nightly and Firefox Nightly for Wayland.

With Nightly, you can closely follow the development of Firefox. Due to Flatpak you can easily install it and keep getting daily updates via our flatpak repo.

As a bonus, we’re also bringing a Firefox build that runs natively on Wayland. We used to provide a Copr repository, but with Flatpak it’s open to users of other distros, too. When running this version, keep in mind it’s still WIP and very experimental. Firefox seems to run just fine on Wayland at first glance, but there is still some basic functionality missing (copy-paste for example) and it’s not so stable either (it crashed the whole Wayland session for me once). But once it’s done, it will be a big improvement in security for Firefox on Linux because Wayland isolates the application on the display server level. Together with other pieces of Flatpak sandboxing, it will provide a full sandbox for the browser in the future.

When adding more Firefox variants to the repo, we first considered using branches, but you have to switch between them manually to start different variants of Firefox which we didn’t find very user friendly. In the end, we’re using one branch and multiple applications with different names in it. This way, you can install and run multiple variants in parallel.

snimek-z-2017-02-15-13-13-22

You can find the instructions to install Firefox for Flatpak on the repository webpage. We’re also constantly improving how Firefox runs in Flatpak. If you have any Flatpak-specific problems with Firefox, please report it to our bug tracker on Github. If you hit problems that are not Flatpak-specific, please report them directly to Mozilla.

And again kudos to Jan Hořák from our team who made all this happen!


Dark Windows for Dark Firefox

I recently set the Compact Dark theme as my default in Firefox. Since we don’t yet have Linux client-side window decorations yet (when is that happening??), it looks kind of bad in GNOME. The window decorator shows up as a light band in a sea of darkness:

Firefox with light window decorator

It just looks bad. You know? I looked for an addon that would change the decorator to the dark-themed one, but I couldn’t find any. I ended up adapting the gtk-dark-theme Atom addon to a Firefox one. It was pretty easy. I did it over a remarkable infant sleep session on a Saturday morning. Here is the result:

Firefox with dark window decorator

You can grab the yet-to-be-reviewed addon here.


February 14, 2017

Vala 1.0?

Yes is time to consider a Vala 1.0 release. Vala 0.34 code generator and bindings support LTS versions of GTK+ 3.22 and GLib 2.50. Next stable version of GTK+ will be 4.0 and GLib 2.x, but they have to traverse through 3.9x versions and any GLib 2.x on the way. Reaching that point we can consider Vala 2.0 release.

Traditionally a new Vala version is released along with GLib and GTK+, with bindings updates, from them and other projects. Some times, is necessary to change some Vala syntax or implementations, like threats, in order to be in sync with current GLib/GObject/GIO features. Because this, may we can consider to increase Vala API each stable version of GTK+.

In order to reach Vala 1.0, may we should identify stoppers bugs before to go. Some of them can be related to GNOME Builder, but may is not a real problem is you don’t use it for Vala development. I really would like to file, find and send patches for some of them, producing warnings at C compilation, for a better experience.

A road map could be stated at Vala’s Wiki, in order to track that stopper bugs/features to fix/implement before 1.0 release. This will mark a good stand point for any developer using Vala.

Debian is around corner. It supply Vala 0.34, then it should be considered, and backport fixes, as a LTS version.

Vala is not a Programming Language

Vala provides you a way to write C/GObject/GInterface code using a different syntax. Vala doesn’t require to develop a “core library” in order to provide its features. Its “compiler” is not a compiler, is a C code generator.

Vala can’t be compared with Rust, Go, Python, Java or C#, all of them provide their own “core library” in order to provide most of their features, allows you to create modules (like a library) to extend the language for their users consume. Their core generally is written in C, for very basic features, but almost in the language itself.

Vala is different in many different ways to other real programming languages. Vala can’t exists without GLib and GObject. It just wraps that libraries to provide its Object Oriented syntax. It doesn’t provides any feature by it self, just a syntax to use the ones already present in above libraries, a part to provide some convenient methods to reduce coding boilerplate.

GLib, GIO and GObject are C libraries. They are the basis for GNOME, GNOME Shell and any other application using GNOME technologies. GTK+ and any library or application using it, relays on above libraries, they can’t exists without them. VMWare and LibreOffice consume GTK+ and so GLib/GObject.

Vala provides mechanism to create bindings. They are conventions to help Vala’s code generator to produce correct C code to consume any other, apart of GLib/GObject, libraries. It can do better job if the library use GObject.

Vala is not a Language: This is a conclusion after the On Vala article, comparing Vala’s repository’s activity with other real languages.

May we need to add to Vala’s activity, all commits and developers on GLib, GObject, GTK+ and GObject Introspection, because they are, indirectly, supporters of Vala’s features and usefulness; may be we need to consider all other libraries’ activity, like libsoup, libgda, and any other one supporting GObject Introspection, because they can be consumed by Vala; this is specially true if we consider their support to generate, automatically, Vala bindings.

Considering all of these, if Vala was a real language: it can be considered to be one with a very reach, secure and stable “core”, with years of development behind; with a reach set of other libraries for your use, as a developer, to choose from. But again: Vala is not a Programming Language.

Vala is multiplatform as long as its Code Generator and their libraries it depends on, or can bind, are present in your target platform: Windows and OS X, are just examples.

May is time to update Vala’s README and vala.doap at git.gnome.org, in order to avoid missunderstandings.

Something completely different

I will take part in a little experiment with a couple of friends, trying this newfangled technology thingy with games and stuff. So tonight, Feb 14th, 2017, 20:15 CET, we will try to release our inner Bernhard Langer and amaze you with our virtual golfing skills!

Come and watch LIVE, in color and whatnot at https://twitch.tv/phako78 (that’s with a camera pointing at me) or https://twitch.tv/fallenbeck (that’s with a camera pointing at someone else) – Unfortunately, except for some swearing, it’ll be German language only.

The Dystopia of Minority Report Needs Proprietary Software

I encourage all of you to either listen to or read the transcript of Terry Gross' Fresh Air interview with Joseph Turow about his discussion of his book “The Aisles Have Eyes: How Retailers Track Your Shopping, Strip Your Privacy, And Define Your Power”.

Now, most of you who read my blog know the difference between proprietary and Free Software, and the difference between a network service and software that runs on your own device. I want all of you have a good understanding of that to do a simple thought experiment:

How many of the horrible things that Turow talks about can happen if there is no proprietary software on your IoT or mobile devices?

AFAICT, other than the facial recognition in the store itself that he talked about in Russia, everything he talks about would be mitigated or eliminated completely as a thread if users could modify the software on their devices.

Yes, universal software freedom will not solve all the worlds' problems. But it does solve a lot of them, at least with regard to the bad things the powerful want to do to us via technology.

(BTW, the blog title is a reference to Philip K. Dick's Minority Report, which includes a scene about systems reading people's eyes to target-market to them. It's not the main theme of that particular book, though… Dick was always going off on tangents in his books.)

February 13, 2017

Looking into what Rust can do that other languages can't ... or can they

A recent blog post What Rust Can Do That Other Languages Can't the following statement is presented:

struct X {
  y: Y
}
impl X {
  fn y(&self) -> &Y { &self.y }
}

This defines an aggregate type containing a field y of type Y directly (not as a separate heap object). Then we define a getter method that returns the field by reference. Crucially, the Rust compiler will verify that all callers of the getter prevent the returned reference from outliving the X object. It compiles down to what you'd expect a C compiler to produce (pointer addition) with no space or time overhead.

Let's look into this a bit more deeply. But, since this is The Internet, first a pre-emptive statement.

Things not claimed in this blog post

To make things easier for forum discussers who have strong opinions but have hard time reading entire articles, here is a list of things that are not claimed. I repeat: not claimed.
  • that [other language] is as safe as Rust (it is most likely not)
  • that you should use [other language] instead of Rust
  • that you should use Rust instead of [other language]
  • that [other language] is safe out of the box (it probably is not)
  • security feature X of language Y will prevent all bugs of type Z (it does not)

To business

Plain C code corresponding to the setup above is roughly the following:

struct foo {
  int x;
};

struct bar {
  struct foo f;
};

A simple function that demonstrates the bug is the following:

int main(int argc, char **argv) {
  struct foo *f;
  struct bar *expiring = malloc(sizeof(struct bar));
  f = &expiring->f;
  free(expiring);
  do_something(f);
  return 0;
}

This uses an obviously invalid pointer in a function call. Neither GCC nor Clang warns about this when compiling. Scan-build, however, does detect the bug:

expired.c:23:3: warning: Use of memory after it is freed
  do_something(f);
  ^~~~~~~~~~~~~~~~~
1 warning generated.
scan-build: 1 bug found.

Address sanitizer also detects the invalid access operation inside do_something. Things get more interesting if we change the function so that expiring is on the stack instead of the heap.

int main(int argc, char **argv) {
  struct foo *b;
  {
    struct bar expiring;
    b = &expiring.f;
  }
  do_something(b);
  return 0;
}

This one is not detected at all. Neither GCC, Clang, Scan-build, Asan, CppCheck nor Valgrind see anything wrong with this. Yet this seems so simple and obvious that it should be detectable automatically.

Recently this has become possible with Address sanitizer. It grew a new argument -fsanitize-address-use-after-scope, which checks for exactly this and finds the bug effortlessly:

==10717==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffe5d0c1920 at pc 0x00000050767f bp 0x7ffe5d0c18a0 sp 0x7ffe5d0c1898

This requires, of course, a run of the program that executes this code path. It would be nice to have this in a static analyzer so it could be found in even more cases (there is already a bug report for this for scan-build).

In conclusion

The security features of Rust are great. However many of them (not all, but many) that were unique to Rust are now possible in other languages as well. If you are managing a project in C or C++ and you are not using asan + scan-build + other error checkers on your code, you should use them all and fix the issues detected. Do it even if you intend to rewrite your code in Rust, Go or any other language. Do it now! There is no excuse.

Supporting Conservancy Makes a Difference

There are a lot of problems in our society, and particularly in the USA, right now, and plenty of charities who need our support. The reason I continue to focus my work on software freedom is simply because there are so few focused on the moral and ethical issues of computing. Open Source has reached its pinnacle as an industry fad, and with it, a watered-down message: “having some of the source code for some of your systems some of the time is so great, why would you need anything more?”. Universal software freedom is however further from reality than it was even a few years ago. At least a few of us, in my view, must focus on that cause.

I did not post many blog posts about this in 2016. There was a reason for that — more than any other year, work demands at Conservancy have been constant and unrelenting. I enjoy my work, so I don't mind, but blogging becomes low priority when there is a constant backlog of urgent work to support Conservancy's mission and our member projects. It's not just Conservancy's mission, of course, it's my personal one as well.

For our 2016 fundraiser, I wrote last year a blog post entitled “Do You Like What I Do For a Living?”. Last year, so many of you responded, that it not only made it possible for me to continue that work for one more year, but we were able to add our colleague Brett Smith to our staff, which brought Conservancy to four full-time staff for the first time. We added a few member projects (and are moving that queue to add more in 2017), and sure enough — the new work plus the backlog of work waiting for another staffer filled Brett's queue just like my, Karen's and Tony's was already filled.

The challenge now is sustaining this staffing level. Many of you came to our aid last year because we were on the brink of needing to reduce our efforts (and staffing) at Conservancy. Thanks to your overwhelming response, we not only endured, but we were able to add one additional person. As expected, though, needs of our projects increased throughout the year, and we again — all four of us full-time staff — must work to our limits to meet the needs of our projects.

Charitable donations are a voluntary activity, and as such they have a special place in our society and culture. I've talked a lot about how Conservancy's Supporters give us a mandate to carry out our work. Those of you that chose to renew your Supporter donations or become new Supporters enable us to focus our full-time efforts on the work of Conservancy.

On the signup and renewal page, you can read about some of our accomplishments in the last year (including my recent keynote at FOSDEM, an excerpt of which is included here). Our work does not follow fads, and it's not particularly glamorous, so only dedicated Supporters like you understand its value. We don't expect to get large grants to meet the unique needs of each of our member projects, and we certainly don't expect large companies to provide very much funding unless we cede control of the organization to their requests (as trade associations do). Even our most popular program, Outreachy, is attacked by a small group of people who don't want to see the status quo of privileged male domination of Open Source and Free Software disrupted.

Supporter contributions are what make Conservancy possible. A year ago, you helped us build Conservancy as a donor-funded organization and stabilize our funding base. I now must ask that you make an annual commitment to renewal — either by renewing your contribution now or becoming a monthly supporter, or, if you're just learning about my work at Conservancy from this blog post, reading up on us and becoming a new Supporter.

Years ago, when I was still only a part-time volunteer at Conservancy, someone who disliked our work told me that I had “invented a job of running Conservancy”. He meant it as an insult, but I take it as a compliment with pride. In fact, between me and my colleague (and our Executive Director) Karen Sandler, we've “invented” a total of four full-time jobs and one part-time one to advance software freedom. You helped us do that with your donations. If you donate again today, your donation will be matched to make the funds go further.

Many have told me this year that they are driven to give to other excellent charities that fight racism, work for civil and immigration rights, and other causes that seem particularly urgent right now. As long as there is racism, sexism, murder, starvation, and governmental oppression in the world, I cannot argue that software freedom should be made a priority above all of those issues. However, even if everyone in our society focused on a single, solitary cause that we agreed was the top priority, it's unlikely we could make quicker progress. Meanwhile, if we all single-mindedly ignore less urgent issues, they will, in time, become so urgent they'll be insurmountable by the time we focus on them.

Industrialized nations have moved almost fully to computer automation for most every daily task. If you question this fact, try to do your job for a day without using any software at all, or anyone using software on your behalf, and you'll probably find it impossible. Then, try to do your job using only Free Software for a day, and you'll find, as I have, that tasks that should take only a few minutes take hours when you avoid proprietary software, and some are just impossible. There are very few organizations that are considering the long-term implications of this slowly growing problem and making plans to build the foundations of a society that doesn't have that problem. Conservancy is one of those few, so I hope you'll realize that long-term value of our lifelong work to defend and expand software freedom and donate.

On Vala

It seems I raised a bit of a stink on Twitter last week:

Of course, and with reason, I’ve been called out on this by various people. Luckily, it was on Twitter, so we haven’t seen articles on Slashdot and Phoronix and LWN with headlines like “GNOME developer says Vala is dead and will be removed from all servers for all eternity and you all suck”. At least, I’ve only seen a bunch of comments on Reddit about this, but nobody cares about that particular cesspool of humanity.

Sadly, 140 characters do not leave any room for nuance, so maybe I should probably clarify what I wrote on a venue with no character limit.

First of all, I’d like to apologise to people that felt I was attacking them or their technical choices: it was not my intention, but see above, re: character count. I may have only about 1000 followers on Twitter, but it seems that the network effect is still a bit greater than that, so I should be careful when wording opinions. I’d like to point out that it’s my private Twitter account, and you can only get to what it says if you follow me, or if you follow people who follow me and decide to retweet what I write.

My PSA was intended as a reflection on the state of Vala, and its impact on the GNOME ecosystem in terms of newcomers, from the perspective of a person that used Vala for his own personal projects; recommended Vala to newcomers; and has to deal with the various build issues that arise in GNOME because something broke in Vala or in projects using Vala. If you’re using Vala outside of GNOME, you have two options: either ignore all I’m saying, as it does not really apply to your case; or do a bit of soul searching, and see if what I wrote does indeed apply to you.

First of all, I’d like to qualify my assertion that Vala is a “dead language”. Of course people see activity in the Git repository, see the recent commits and think “the project is still alive”. Recent commits do not tell a complete story.

Let’s look at the project history for the past 10 cycles (roughly 2.5 years). These are the commits for every cycle, broken up in two values: one for the full repository, the other one for the whole repository except the vapi directory, which contains the VAPI files for language bindings:

Commits

Aside from the latest cycle, Vala has seen very little activity; the project itself, if we exclude binding updates, has seen less than 100 commits for every cycle — some times even far less. The latest cycle is a bit of an outlier, but we can notice a pattern of very little work for two/three cycles, followed by a spike. If we look at the currently in progress cycle, we can already see that the number of commits has decreased back to 55/42, as of this morning.

Commits

Number of commits is just a metric, though; more important is the number of contributors. After all, small, incremental changes may be a good thing in a language — though, spoiler alert: they are usually an indication of a series of larger issues, and we’ll come to that point later.

These are the number of developers over the same range of cycles, again split between committers to the full repository and to the full repository minus the vapi directory:

Developers

As you can see, the number of authors of changes is mostly stable, but still low. If we have few people that actively commit to the repository it means we have few people that can review a patch. It means patches linger longer and longer, while reviewers go through their queues; it means that contributors get discouraged; and, since nobody is paid to work full time on Vala, it means that any interruption caused by paid jobs will be a bottleneck on the project itself.

These concerns are not unique of a programming language: they exist for every volunteer-driven free and open source project. Programming languages, though, like core libraries, are problematic because any bottleneck causes ripple effects. You can take any stalled project you depend on, and vendor it into your own, but if that happens to the programming language you’re using, then you’re pretty much screwed.

For these reasons, we should also look at how well-distributed is the workload in Vala, i.e. which percentage of the work is done by the authors of those commits; the results are not encouraging. Over that range of cycles, Only two developers routinely crossed the 5% of commits:

  • Rico Tzschichholz
  • Jürg Billeter

And Rico has been the only one to consistently author >50% of the commits. This means there’s only one person dealing with the project on a day to day basis.

As the maintainer of a project who basically had to do all the work, I cannot even begin to tell you how soul-crushing that can become. You get burned out, and you feel responsible for everyone using your code, and then you get burned out some more. I honestly don’t want Rico to burn out, and you shouldn’t, either.

So, let’s go into unfair territory. These are the commits for Rust — the compiler and standard library:

Rust

These are the commits for Go — the compiler and base library:

Go

These are the commits for Vala — both compiler and bindings:

Vala

These are the number of commits over the past year. Both languages are younger than Vala, have more tools than Vala, and are more used than Vala. Of course, it’s completely unfair to compare them, but those numbers should give you a sense of scale, of what is the current high bar for a successful programming language these days. Vala is a niche language, after all; it’s heavily piggy-backing on the GNOME community because it transpiles to C and needs a standard library and an ecosystem like the one GNOME provides. I never expected Vala to rise to the level of mindshare that Go and Rust currently occupy.

Nevertheless, we need to draw some conclusions about the current state of Vala — starting from this thread, perhaps, as it best encapsulates the issues the project is facing.

Vala, as a project, is limping along. There aren’t enough developers to actively effect change on the project; there aren’t enough developers to work on ancillary tooling — like build system integration, debugging and profiling tools, documentation. Saying that “Vala compiles to C so you can use tools meant for C” is comically missing the point, and it’s effectively like saying that “C compiles to binary code, so you can disassemble a program if you want to debug it”. Being able to inspect the language using tools native to the language is a powerful thing; if you have to do the name mangling in your head in order to set a breakpoint in GDB you are elevating the barrier of contributions way above the head of many newcomers.

Being able to effect change means also being able to introduce change effectively and without fear. This means things like continuous integration and a full test suite heavily geared towards regression testing. The test suite in Vala is made of 210 units, for a total of 5000 lines of code; the code base of Vala (vala AST, codegen, C code emitter, and the compiler) is nearly 75 thousand lines of code. There is no continuous integration, outside of the one that GNOME Continuous performs when building Vala, or the one GNOME developers perform when using jhbuild. Regressions are found after days or weeks, because developers of projects using Vala update their compiler and suddenly their projects cease to build.

I don’t want to minimise the enormous amount of work that every Vala contributor brought to the project; they are heroes, all of them, and they deserve as much credit and praise as we can give. The idea of a project-oriented, community-oriented programming language has been vindicated many times over, in the past 5 years.

If I scared you, or incensed you, then you can still blame me, and my lack of tact. You can still call me an asshole, and you can think that I’m completely uncool. What I do hope, though, is that this blog post pushes you into action. Either to contribute to Vala, or to re-new your commitment to it, so that we can look at my words in 5 years and say “boy, was Emmanuele wrong”; or to look at alternatives, and explore new venues in order to make GNOME (and the larger free software ecosystem) better.

Outreachy (GNOME)-W7~W10

The work I did in this month is not so much, because of the Chinese New Year and the Lantern Festival. Spring Festival is the most important holiday for Chinese people. We Chinese believe that the Spring Festival is the right moment to review the year past and make plans for the year ahead.

In this period, I supplemented some new terms of GNOME 3.22 while the committer was submitting. I think maybe the Chinese (China) Translation Team need a extra committer, the work is too heavy to only one people who has another full-time job, so that it always piles up too many “translated” to review.

On the other hand, I’m translating the terms in GNOME 3.24. I have finished the apps of Accessibility and Utils, which are all committed in 3.22. It is estimated that it will reduce much workload of 3.24 if the 3.22 is committed fully, and I try to finish it in the next period.


This work by Mandy Wang is licensed under a Creative Commons Attribution-ShareAlike 4.0 International


February 12, 2017

Google Code In at coala

Dear people!

coala participated in Google Code In thanks to FOSSASIA this year.

We have always been active in engaging newcomers and teaching people about Open Source. It is only natural that we think and work towards helping pupils all over the world take this step and learn about contributing to open source. (If you are a teacher and reading this, reach out to us on coala.io/chat – we’re very interested in working with you and are also starting an initiative in germany to connect to schools.)

So let’s get some data: we had 37 successfully completed tasks. Our mentors wrote an impressive amount of 26 GCI tasks – some of which are multiple paged step by step guides that are still used for non GCI purposes.

Our star contributor Kaisar Arkhan is a GCI winner and Ridhwanul Haque made it as a finalist. We are proud of you! Kaisar Arkhan is actively helping us with our infrastructure to get status.coala.io greener every day!

An unimaginable huge part of the credit here goes to John Mark Vandenberg who mainly administered GCI for us and mentored a huge number of students by himself and helped us writing up the best possible tasks we could have. We are very thankful that we could build on his experience with the program and that we had his valuable input at every stage. Backstage, we had Mario Behling and Hong Phuc Dang from FOSSASIA working tirelessly so we could make this happen.

If you meet any of those – consider inviting them for a cup of coffee and thank them for what they are doing for our community, for FOSSASIA and for the Open Source education.

New Users Panel

The GNOME Control Center redesign goes on. This release we are happy to announce the new Users Panel design. As you can see in the preview video below, we are moving away from a two column panel into a single page concept. These changes make the panel way clearer specially with the new shell.

In terms of user interface/experience, the Carousel is the main star of the new Users panel. It presents the system users, alphabetically sorted, three at the time. Its pagination allows browsing through the user list. The arrow indicates the selected item.

The Users panel now joins the Keyboard, Printers, and Mouse, as the new Settings experience. We plan to switch to the new Shell in the very next cycle. Stay tuned!

 

Recipes by mail

Since I last wrote about GNOME recipes, we’ve mainly focused on completing our feature set for 3.24.

Todays 0.12.0 release brings us very close to covering all the goals we’ve set ourselves when we started working on recipes: We have a  fullscreen cooking mode

and recipes and shopping lists can be shared by email

Since the Share button has replaced the Export button, contributing your recipes is now easier too: Just choose the “Contribute” option and send the recipe to the new recipes-list@gnome.org mailing list.

While working on this, it suddenly dawned on my why I may have seen some recipe contributions in bugzilla that where missing attachments: bugzilla has a limit for the sizes of attachments it allows, and recipes with photos may hit this limit.

So, if you’ve tried to contributed a recipe via bugzilla, and ran into this problem, please send your recipe to recipes-list@gnome.org instead.

February 11, 2017

Rust 🖤 GNOME Hackfest – Mexico City, 2017

1487048523_e1b5bcdbe1_b_dGlorieta del Ángel de la Independencia, Credits: laap mx@flicker cc-by-nc-nd

As you know a bunch of GNOMies have been in touch with Rustaceans to improve the Rust and GNOME binding integration. To accelerate the effort, we have decided to organize a hackfest soon-ish. After some digging and given that the great Federico has been involved in the effort we’ve figured that it was rather poetic to do it in Mexico City, birthplace of GNOME! Check out the wiki page for more details.

We will be meeting at the Red Hat office from the 29th to the 31st of March to try to advance the state of GObject bindings and try to make Rust emit GObject code as well. Get in touch with me or Federico if you want to attend!

 

Epoxy

Epoxy is a small library that GTK+, and other projects, use in order to access the OpenGL API in somewhat sane fashion, hiding all the awful bits of craziness that actually need to happen because apparently somebody dosed the water supply at SGI with large quantities of LSD in the mid-‘90s, or something.

As an added advantage, Epoxy is also portable on different platforms, which is a plus for GTK+.

Since I’ve started using Meson for my personal (and some work-related) projects as well, I’ve been on the lookout for adding Meson build rules to other free and open source software projects, in order to improve both their build time and portability, and to improve Meson itself.

As a small, portable project, Epoxy sounded like a good candidate for the port of its build system from autotools to Meson.

To the Bat Build Machine!

tl;dr

Since you may be interested just in the numbers, building Epoxy with Meson on my Kaby Lake four Core i7 and NMVe SSD takes about 45% less time than building it with autotools.

A fairly good fraction of the autotools time is spent going through the autogen and configure phases, because they both aren’t parallelised, and create a ton of shell invocations.

Conversely, Meson’s configuration phase is incredibly fast; the whole Meson build of Epoxy fits in the same time the autogen.sh and configure scripts complete their run.

Administrivia

Epoxy is a simple library, which means it does not need a hugely complicated build system set up; it does have some interesting deviations, though, which made the porting an interesting challenge.

For instance, on Linux and similar operating systems Epoxy uses pkg-config to find things like the EGL availability and the X11 headers and libraries; on Windows, though, it relies on finding the opengl32 shared or static library object itself. This means that we get something straightforward in the former case, like:

# Optional dependencies
gl_dep = dependency('gl', required: false)
egl_dep = dependency('egl', required: false)

and something slightly less straightforward in the latter case:

if host_system == 'windows'
  # Required dependencies on Windows
  opengl32_dep = cc.find_library('opengl32', required: true)
  gdi32_dep = cc.find_library('gdi32', required: true)
endif

And, still, this is miles better than what you have to deal with when using autotools.

Let’s take a messy thing in autotools, like checking whether or not the compiler supports a set of arguments; usually, this involves some m4 macro that’s either part of autoconf-archive or some additional repository, like the xorg macros. Meson handles this in a much better way, out of the box:

# Use different flags depending on the compiler
if cc.get_id() == 'msvc'
  test_cflags = [
    '-W3',
    ...,
  ]
elif cc.get_id() == 'gcc'
  test_cflags = [
    '-Wpointer-arith',
    ...,
  ]
else
  test_cflags = [ ]
endif

common_cflags = []
foreach cflag: test_cflags
  if cc.has_argument(cflag)
    common_cflags += [ cflag ]
  endif
endforeach

In terms of speed, the configuration step could be made even faster by parallelising the compiler argument checks; right now, Meson has to do them all in a series, but nothing except some additional parsing effort would prevent Meson from running the whole set of checks in parallel, and gather the results at the end.

Generating code

In order to use the GL entry points without linking against libGL or libGLES* Epoxy takes the XML description of the API from the Khronos repository and generates the code that ends up being compiled by using a Python script to parse the XML and generating header and source files.

Additionally, and unlike most libraries in the G* stack, Epoxy stores its public headers inside a separate directory from its sources:

libepoxy
├── cross
├── doc
├── include
│   └── epoxy
├── registry
├── src
└── test

The autotools build has the src/gen_dispatch.py script create both the source and the header file for each XML at the same time using a rule processed when recursing inside the src directory, and proceeds to put the generated header under $(top_builddir)/include/epoxy, and the generated source under $(top_builddir)/src. Each code generation rule in the Makefile manually creates the include/epoxy directory under the build root to make up for parallel dispatch of each rule.

Meson makes is harder to do this kind of spooky-action-at-a-distance build, so we need to generate the headers in one pass, and the source in another. This is a bit of a let down, to be honest, and yet a build that invokes the generator script twice for each API description file is still faster under Ninja than a build with the single invocation under Make.

There are sill issues in this step that are being addressed by the Meson developers; for instance, right now we have to use a custom target for each generated header and source separately instead of declaring a generator and calling it multiple times. Hopefully, this will be fixed fairly soon.

Documentation

Epoxy has a very small footprint, in terms of API, but it still benefits from having some documentation on its use. I decided to generate the API reference using Doxygen, as it’s not a G* library and does not need the additional features of gtk-doc. Sadly, Doxygen’s default style is absolutely terrible; it would be great if somebody could fix it to make it look half as good as the look gtk-doc gets out of the box.

Cross-compilation and native builds

Now we get into “interesting” territory.

Epoxy is portable; it works on Linux and *BSD systems; on macOS; and on Windows. Epoxy also works on both Intel Architecture and on ARM.

Making it run on Unix-like systems is not at all complicated. When it comes to Windows, though, things get weird fast.

Meson uses cross files to determine the environment and toolchain of the host machine, i.e. the machine where the result of the build will eventually run. These are simple text files with key/value pairs that you can either keep in a separate repository, in case you want to share among projects; or you can keep them in your own project’s repository, especially if you want to easily set up continuous integration of cross-compilation builds.

Each toolchain has its own; for instance, this is the description of a cross compilation done on Fedora with MingW:

[binaries]
c = '/usr/bin/x86_64-w64-mingw32-gcc'
cpp = '/usr/bin/x86_64-w64-mingw32-cpp'
ar = '/usr/bin/x86_64-w64-mingw32-ar'
strip = '/usr/bin/x86_64-w64-mingw32-strip'
pkgconfig = '/usr/bin/x86_64-w64-mingw32-pkg-config'
exe_wrapper = 'wine'

This section tells Meson where the binaries of the MingW toolchain are; the exe_wrapper key is useful to run the tests under Wine, in this case.

The cross file also has an additional section for things like special compiler and linker flags:

[properties]
root = '/usr/x86_64-w64-mingw32/sys-root/mingw'
c_args = [ '-pipe', '-Wp,-D_FORTIFY_SOURCE=2', '-fexceptions', '--param=ssp-buffer-size=4', '-I/usr/x86_64-w64-mingw32/sys-root/mingw/include' ]
c_link_args = [ '-L/usr/x86_64-w64-mingw32/sys-root/mingw/lib' ]

These values are taken from the equivalent bits that Fedora provides in their MingW RPMs.

Luckily, the tool that generates the headers and source files is written in Python, so we don’t need an additional layer of complexity, with a tool built and run on a different platform and architecture in order to generate files to be built and run on a different platform.

Continuous Integration

Of course, any decent process of porting, these days, should deal with continuous integration. CI gives us confidence as to whether or not any change whatsoever we make actually works — and not just on our own computer, and our own environment.

Since Epoxy is hosted on GitHub, the quickest way to deal with continuous integration is to use TravisCI, for Linux and macOS; and Appveyor for Windows.

The requirements for Meson are just Python3 and Ninja; Epoxy also requires Python 2.7, for the dispatch generation script, and the shared libraries for GL and the native API needed to create a GL context (GLX, EGL, or WGL); it also optionally needs the X11 libraries and headers and Xvfb for running the test suite.

Since Travis offers an older version of Ubuntu LTS as its base system, we cannot build Epoxy with Meson; additionally, running the test suite is a crapshoot because the Mesa version if hopelessly out of date and will either cause most of the tests to be skipped or, worse, make them segfault. To sidestep this particular issue, I’ve prepared a Docker image with its own harness, and I use it as the containerised environment for Travis.

On Appveyor, thanks to the contribution of Thomas Marrinan we just need to download Python3, Python2, and Ninja, and build everything inside its own root; as an added bonus, Appveyor allows us to take the build artefacts when building from a tag, and shoving them into a zip file that gets deployed to the release page on GitHub.

Conclusion

Most of this work has been done off and on over a couple of months; the rough Meson build conversion was done last December, with the cross-compilation and native builds taking up the last bit of work.

Since Eric does not have any more spare time to devote to Epoxy, he was kind enough to give me access to the original repository, and I’ve tried to reduce the amount of open pull requests and issues there.

I’ve also released version 1.4.0 and I plan to do a 1.4.1 release soon-ish, now that I’m positive Epoxy works on Windows.

I’d like to thank:

  • Eric Anholt, for writing Epoxy and helping out when I needed a hand with it
  • Jussi Pakkanen and Nirbheek Chauhan, for writing Meson and for helping me out with my dumb questions on #mesonbuild
  • Thomas Marrinan, for working on the Appveyor integration and testing Epoxy builds on Windows
  • Yaron Cohen-Tal, for maintaining Epoxy in the interim

February 10, 2017

Live programming IoT systems with MsgFlo+Flowhub

Last weekend at FOSDEM I presented in the Internet of Things (IoT) devroom,
showing how one can use MsgFlo with Flowhub to visually live-program devices that talk MQTT.

If the video does not work, try the alternatives here. See also full presentation notes, incl example code.

Background

Since announcing MsgFlo in 2015, it has mostly been used to build scalable backend systems (“cloud”), using AMQP and RabbitMQ. At The Grid we’ve been processing hundred thousands of jobs each week, so that usecase is pretty well tested by now.

However, MsgFlo was designed from the beginning to support multiple messaging systems (including MQTT), as well as other kinds of distributed systems – like a networks of embedded devices working together (one aspect of “IoT”). And in MsgFlo 0.10 this is starting to work pretty nicely.

Visual system architecture

Typical MQTT devices have the topic names hidden in code. Any documentation is typically kept in sync (or not…) by hand.
MsgFlo lets you represent your devices and services as FBP/dataflow “components”, and a system as a connected graph of component instances. Each device periodically sends a discovery message to the broker. This message describing the role name, as well as what ports exists (including the MQTT topic name). This leads to a system architecture which can be visualized easily:

Imaginary solution to a typically Norwegian problem: Avoiding your waterpipes freezing in the winter.

Rewire live system

In most MQTT devices, output is sent directly to the input of another device, by using the same MQTT topic name. This hardcodes the system functionality, reducing encapsulation and reusability.
MsgFlo each device *should* send output and receive inports on topic namespaced to the device.
Connections between devices are handled on the layer above, by the broker/router binding different topics together. With Flowhub, one can change these connections while the system is running.

Change program values on the fly

Changing a parameter or configuration of an embedded device usually requires changing the code and flashing it. This means recompiling and usually being connected to the device over USB. This makes the iteration cycle pretty slow and tedious.
In MsgFlo, devices can (and should!) expose their parameters on MQTT and declare them as inports.
Then they can be changed in Flowhub, the device instantly reflecting the new values.

Great for exploratory coding; quickly trying out different values to find the right one.
Examples are when tweaking animations or game mechanics, as it is near impossible to know up front what feels right.

Add component as adapters

MsgFlo encourages devices to be fairly stupid, focused on a single generally-useful task like providing sensor data, or a way to cause actions in the real world. This lets us define “applications” without touching the individual devices, and adapt the behavior of the system over time.

Imagine we have a device which periodically sends current temperature, as a floating-point number in Celcius unit. And a display device which can display text (for instance a small OLED). To show current temperature, we could wire them directly:

Our display would show something like “22.3333333”. Not very friendly, how does one know what this number means?

Better add a component to do some formatting.

Adding a Python component

Component formatting incoming temperature number to a friendly text string

And then insert it before the display. This will create a new process, and route the data through it.

Our display would now show “Temperature: 22.3 C”

Over time maybe the system grows further

Added another sensor, output now like “Inside 22.2 C Outside: -5.5 C”.

Getting started with MsgFlo

If you have existing “things” that support MQTT, you can start using MsgFlo by either:
1) Modifying the code to also send the discovery message.
2) Use the msgflo-foreign-participant tool to provide discovery without code changes

If you have new things, using one of the MsgFlo libraries is a quick way to support MQTT and MsgFlo. Right now there are libraries for Python, C++11, Node.js, NoFlo and Arduino.

Flattr this!

Working on an Android tablet, 2017 edition

Back in 2013 I was working exclusively on an Android tablet. Then with the NoFlo Kickstarter I needed a device with a desktop browser. What followed were brief periods working on a Chromebook, on a 12” MacBook, and even an iPad Pro.

But from April 2016 onwards I’ve been again working with an Android device. Some people have asked me about my setup, and so here is an update.

Information technology

Why work on a tablet?

When I started on this path in 2013, using a tablet for “real work” was considered crazy. While every story on tablet productivity still brings out the people claiming it is not a real computer for real work, using tablets for real work is becoming more and more common.

A big contributor to this has been the plethora of work-oriented tablets and convertibles released since then. Microsoft’s popular Surface Pro line brought the PC to tablet form factor, and Apple’s iPad Pro devices gave the iPad a keyboard.

Here are couple of great posts talking about how it feels to work on an iPad:

With all the activity going on, one could claim using a tablet for work has been normalized. But why work on a tablet instead of a “real computer”? Here are some reasons, at least for me:

Free of legacy cruft

Desktop operating systems have become clunky. Window management. File management. Multiple ways to discover, install, and uninstall applications. Broken notification mechanisms.

With a tablet you can bypass pretty much all of that, and jump into a simpler, cleaner interface designed for the modern connected world.

I think this is also the reason driving some developers back to Linux and tiling window managers — cutting manual tweaking and staying focused.

Amazing endurance

Admittedly, laptop battery life has increased a lot since 2013. But with some manufacturers using this an excuse to ship thinner devices, tablets still win the endurance game.

With my current work tablet, I’m customarily getting 12 or more hours of usage. This means I can power through the typical long days of a startup founder without having to plug in. And when traveling, I really don’t have to care where power sockets are located on trains, airplanes, and conference centers.

Low power usage also means that I can really get a lot of more runtime by utilizing the mobile battery pack I originally bought to use with my phone. While I’ve never actually had to try this, back-of-the-envelope math claims I should be able to get a full workweek from the combo without plugging in.

Work and play

The other aspect of using a tablet is that it becomes a very nice content consumption device after I’m done working. Simply disconnect the keyboard and lean back, and the same device you used for writing software becomes a great e-reader, video player, or a gaming machine.

Livestreaming a SpaceX launch

This combined with the battery life has meant that I’ve actually stopped carrying a Kindle with me. While an e-ink screen is still nicer to read, not needing an extra device has its benefits, especially for a frequent one-bag traveller.

The setup

I’m writing this on a Pixel C, a 10.2” Android tablet made by Google. I got the device last spring when there were developer discounts available at ramp-up to the Android 7 release, and have been using it full-time since.

Software

My Android homescreen

Surprisingly little has changed in my software use since 2013 — I still spend the most of the time writing software in either Flowhub or terminal. Here are the apps I use on daily basis:

Looking back to the situation in early 2013, the biggest change is that Slack has pretty much killed work email.

Termux is a new app that has done a lot to improve the local development situation. By starting the app you get a very nice Linux chroot environment where a lot of software is only a quick apt install away.

Since much of my non-Flowhub work is done in tmux and vim, I get the exactly same working environment on both local chroot and cloud machines by simply installing my dotfiles on each of them.

Keyboard

Laptop tablet

When I’m on the road I’m using the Pixel C keyboard. This doubles as a screen protector, and provides a reasonable laptop-like typing environment. It attaches to the tablet with very strong magnets and allows a good amount of flexibility on the screen angles.

However, when stationary, no laptop keyboard compares to a real mechanical keyboard. When I’m in the office I use a Filco MiniLa Air, a bluetooth keyboard with quiet-ish Cherry MX brown switches.

Desktop tablet

This tenkeyless (60%) keyboard is extremely comfortable to type on. However, the sturdy metal case means that it is a little too big and heavy to carry on a daily basis.

In practice I’ve only taken with me when there has been a longer trip where I know that I’ll be doing a lot of typing. To solve this, I’m actually looking to build a more compact custom mechanical keyboard so I could always have it with me.

Comparison with iOS

So, why work on Android instead of getting an iPad Pro? I’ve actually worked on both, and here are my reasons:

  • Communications between apps: while iOS has extensions now, the ability to send data from an app to another is still a hit-or-miss. Android had intents from day one, meaning pretty much any app can talk to any other app
  • Standard charging: all of my other devices charge with the same USB-C chargers and cables. iPads still use the proprietary Lightnight plug, requiring custom dongles for everything
  • Standard accessories: this boils down to USB-C just like charging. With Android I can plug in a network adapter or even a mouse, and it’ll just work
  • Ecosystem lock-in: we’re moving to a world where everything — from household electronics to cars — is either locked to the Apple ecosystem or following standards. I don’t want to be locked to a single vendor for everything digital
  • Browser choice: with iOS you only get one web renderer, the rather dated Safari. On Android I can choose between Chrome, Firefox, or any other browser that has been ported to the platform

Of course, iOS has its own benefits. Apple has a stronger stance on privacy than Google. And there is more well-made tablet software available for iPads than Android. But when almost everything I use is available on the web, this doesn’t matter that much.

The future

Hacking on the c-base patio

As a software developer working on Android tablets, the weakest point of the platform is still that there are no browser developer tools available. This was a problem in 2013, and it is still a problem now.

From my conversations with some Chrome developers, it seems Google has very little interest in addressing this. However, there is a bright spot: the new breed of convertible Chromebooks being released now. And they run Android apps:

Chrome OS is another clean, legacy free, modern computing interface. With these new devices you get the combination of a full desktop browser and the ability to run all Android tablet software.

The Samsung Chromebook Pro/Plus mentioned above is definitely interesting. A high-res 12” screen and a digital pen which I see as something very promising for visual programming purposes.

However, given that I already have a great mechanical keyboard, I’d love a device that shipped without an attached keyboard. We’ll see what kind of devices get out later this year.

Accelerated compositing in WebKitGTK+ 2.14.4

WebKitGTK+ 2.14 release was very exciting for us, it finally introduced the threaded compositor to drastically improve the accelerated compositing performance. However, the threaded compositor imposed the accelerated compositing to be always enabled, even for non-accelerated contents. Unfortunately, this caused different kind of problems to several people, and proved that we are not ready to render everything with OpenGL yet. The most relevant problems reported were:

  • Memory usage increase: OpenGL contexts use a lot of memory, and we have the compositor in the web process, so we have at least one OpenGL context in every web process. The threaded compositor uses the coordinated graphics model, that also requires more memory than the simple mode we previously use. People who use a lot of tabs in epiphany quickly noticed that the amount of memory required was a lot more.
  • Startup and resize slowness: The threaded compositor makes everything smooth and performs quite well, except at startup or when the view is resized. At startup we need to create the OpenGL context, which is also quite slow by itself, but also need to create the compositing thread, so things are expected to be slower. Resizing the viewport is the only threaded compositor task that needs to be done synchronously, to ensure that everything is in sync, the web view in the UI process, the OpenGL viewport and the backing store surface. This means we need to wait until the threaded compositor has updated to the new size.
  • Rendering issues: some people reported rendering artifacts or even nothing rendered at all. In most of the cases they were not issues in WebKit itself, but in the graphic driver or library. It’s quite diffilcult for a general purpose web engine to support and deal with all possible GPUs, drivers and libraries. Chromium has a huge list of hardware exceptions to disable some OpenGL extensions or even hardware acceleration entirely.

Because of these issues people started to use different workarounds. Some people, and even applications like evolution, started to use WEBKIT_DISABLE_COMPOSITING_MODE environment variable, that was never meant for users, but for developers. Other people just started to build their own WebKitGTK+ with the threaded compositor disabled. We didn’t remove the build option because we anticipated some people using old hardware might have problems. However, it’s a code path that is not tested at all and will be removed for sure for 2.18.

All these issues are not really specific to the threaded compositor, but to the fact that it forced the accelerated compositing mode to be always enabled, using OpenGL unconditionally. It looked like a good idea, entering/leaving accelerated compositing mode was a source of bugs in the past, and all other WebKit ports have accelerated compositing mode forced too. Other ports use UI side compositing though, or target a very specific hardware, so the memory problems and the driver issues are not a problem for them. The imposition to force the accelerated compositing mode came from the switch to using coordinated graphics, because as I said other ports using coordinated graphics have accelerated compositing mode always enabled, so they didn’t care about the case of it being disabled.

There are a lot of long-term things we can to to improve all the issues, like moving the compositor to the UI (or a dedicated GPU) process to have a single GL context, implement tab suspension, etc. but we really wanted to fix or at least improve the situation for 2.14 users. Switching back to use accelerated compositing mode on demand is something that we could do in the stable branch and it would improve the things, at least comparable to what we had before 2.14, but with the threaded compositor. Making it happen was a matter of fixing a lot bugs, and the result is this 2.14.4 release. Of course, this will be the default in 2.16 too, where we have also added API to set a hardware acceleration policy.

We recommend all 2.14 users to upgrade to 2.14.4 and stop using the WEBKIT_DISABLE_COMPOSITING_MODE environment variable or building with the threaded compositor disabled. The new API in 2.16 will allow to set a policy for every web view, so if you still need to disable or force hardware acceleration, please use the API instead of WEBKIT_DISABLE_COMPOSITING_MODE and WEBKIT_FORCE_COMPOSITING_MODE.

We really hope this new release and the upcoming 2.16 will work much better for everybody.

Feeds