24 hours a day, 7 days a week, 365 days per year...

December 20, 2014

Star Duel

I just had my second puzzle published at GM Puzzles. Take a look.

This puzzle is a Star Battle, which is an elegant and easy-to-learn puzzle type. All you have to do is place stars into the grid so that exactly one star (or more in larger puzzles) appears in each row, column, and bold region. Also, no two stars can be placed in adjacent cells, not even diagonally.
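To make the rules concrete, here is a minimal sketch of a solution checker. The function name, the grid encoding (a square list of region labels), and the tiny 4x4 example are all assumptions made for illustration; they are not taken from the published puzzle.

```python
from itertools import combinations


def is_valid_star_battle(regions, stars, stars_per_unit=1):
    """Check a candidate Star Battle solution.

    regions: square grid (list of rows) of region labels.
    stars: set of (row, col) cells that contain a star.
    stars_per_unit: stars required in every row, column, and region.
    """
    n = len(regions)
    row_counts = [0] * n
    col_counts = [0] * n
    region_counts = {label: 0 for row in regions for label in row}
    for r, c in stars:
        row_counts[r] += 1
        col_counts[c] += 1
        region_counts[regions[r][c]] += 1
    counts_ok = all(
        count == stars_per_unit
        for count in row_counts + col_counts + list(region_counts.values())
    )
    # No two stars may touch, not even diagonally.
    no_touching = all(
        max(abs(r1 - r2), abs(c1 - c2)) > 1
        for (r1, c1), (r2, c2) in combinations(stars, 2)
    )
    return counts_ok and no_touching


# Hypothetical 4x4 grid with four square regions, one star per unit.
EXAMPLE_REGIONS = [
    ["A", "A", "B", "B"],
    ["A", "A", "B", "B"],
    ["C", "C", "D", "D"],
    ["C", "C", "D", "D"],
]
EXAMPLE_SOLUTION = {(0, 1), (1, 3), (2, 0), (3, 2)}
```

For a puzzle like the 15x15 one, the same check would be called with stars_per_unit=3.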

For the puzzle I published today, I took a couple of steps beyond the basic Star Battle formula. First, I made the grid quite large (15x15 with 3 stars per row/column/region). Then I also provided two grids. Each grid has the same solution of stars, so you work back and forth between the grids to solve them both simultaneously. Finally, instead of dividing the regions into arbitrary shapes, I tried to make some recognizable pictures with the grids. If you click through above, hopefully you'll be able to recognize the shapes I was going for. This puzzle is the first in a series where I chose the name of the puzzle type as the theme for the puzzle. So, I drew a star battle for a Star Battle puzzle. And with dual grids, I named this battle a Star Duel.

Finally, I should point out that where my previous puzzle was published on a Monday, today's puzzle is published on a Saturday. The GM Puzzles website publishes puzzles that get increasingly difficult throughout the week. So if you've never attempted a star-battle puzzle before, I don't actually recommend you start with my puzzle from today.

Instead, you might start with this simpler star battle puzzle that I wrote as part of a Christmas puzzle hunt for my boys. It's a very gentle introduction to the puzzle type. If you try it, you can ignore the second grid with the C, H, A, and S regions. That was part of a metapuzzle included in the puzzle hunt.

Happy puzzling!

Christmas Code 2014

My boys just finished this year's Christmas puzzle hunt (known in our family as "The Code"). Read below for a summary of how things went, or check out the actual puzzles if you want to solve them yourself or just wonder at the amount of work my poor boys have to go through before they get any Christmas presents.

I'm really pleased with how things went this year. I scaled the difficulty back compared to last year, and that was good for everyone's sanity. It's easy to see the difference in just the numbers. Last year's hunt had 24 puzzles, 5 metapuzzles, and 1 final metametapuzzle. This year there were only 9 puzzles and 2 small metapuzzles. I was nervous the boys might think I did too little, but I think I actually got the difficulty just right this year.

They did solve it a few days before Christmas this year, but that's a good victory for them. Last year, there was a tough Battleships metapuzzle and the deciphering of a Playfair cipher that all came together only late at night on Christmas Eve.

Of course, the most important feedback comes from the boys themselves. There was a priceless moment last night as, one by one, the boys in turn had the final flash of insight needed to solve the last clue. There's nothing better for a puzzle designer than getting to watch that "ah-ha" moment hit with its explosion of laughter. And that laughter is always mixed: appreciation for the disguise of the clue and a bit of disgust at not being able to see through it all earlier. And last night, it was a perfect cascade of these moments from one boy to the next. After the first one finally got it, just knowing it was possible was all the others needed to reach the same insight themselves. It was perfect.

This morning the boys reported that this year's code was the "best one yet". I think I've heard that three years in a row now. That's what gives me the motivation to do all this again. I guess I had better start planning now if I want to improve again next year!

Installing Mapnik and Tilemill from source on Fedora

I’ve had my fair share of frustrations with jhbuild, so I find that I am surprisingly patient and observant, and my source-building muscles seem well-oiled. Today I installed Mapnik and Tilemill after running into a few troubles.

Here’s the brief for anyone else with a Fedora machine, who will hopefully struggle less with these instructions. The instructions are mirrored from here but written for the Fedora user. Mostly, I have replaced all Debian packages with the corresponding RPMs from Fedora to save you a lot of searching, and marked fixes for some specific issues that I encountered. Note that some of the packages I have included might well be unnecessary, but I hope you will excuse that as one of my newbie mistakes.

1) Install development headers for boost dependencies.

sudo yum install make automake gcc gcc-c++ kernel-devel \
bzip2-devel.x86_64 python-devel.x86_64 libicu-devel.x86_64

2) Install Boost from source

tar xjvf boost_1_54_0.tar.bz2
cd boost_1_54_0
./bootstrap.sh
./b2 stage toolset=gcc --with-thread --with-filesystem --with-python \
--with-regex -sHAVE_ICU=1 -sICU_PATH=/usr/ --with-program_options \
--with-system link=shared
sudo ./b2 install toolset=gcc --with-thread --with-filesystem \
--with-python --with-regex -sHAVE_ICU=1 -sICU_PATH=/usr/ \
--with-program_options --with-system link=shared -d0
sudo ldconfig && cd ../

3) Now install the remaining Mapnik dependencies like icu, proj4, libpng, libjpeg, libtiff, libxml2, libltdl, and freetype.

sudo yum install libxml2-devel.x86_64 \
freetype-devel-2.4.11-7.fc19.x86_64 libjpeg-turbo-devel.x86_64 \
libpng-devel.x86_64  proj-devel.x86_64 libtiff-devel.x86_64 \
cairo-devel-1.12.14-2.fc19.x86_64 cairo-dock-python3.x86_64 \
libgda-devel.x86_64 postgresql-server.x86_64 \
postgresql-contrib.x86_64 libsqlite3x-devel.x86_64 \
libpqxx-devel.x86_64 gdal-python.x86_64 gdal-devel.x86_64 \
gdal-libs.x86_64 proj-epsg.x86_64 proj-static.x86_64 proj-nad.x86_64

4) Now download the latest Mapnik release and build it from scratch:

tar xf mapnik-v2.2.0.tar.bz2
cd mapnik-v2.2.0

NOTE: Add osm to the list of INPUT_PLUGINS in the first line, so that yours reads as follows:

INPUT_PLUGINS = 'csv,gdal,geojson,ogr,osm,postgis,raster,shape,sqlite'
BOOST_INCLUDES = '/usr/local/include'
BOOST_LIBS = '/usr/local/lib'
BINDINGS = 'python'

Then build and install:

./configure && make && sudo make install

5) Run the tests:

make test && cd ../

Possible Issues

5.1 – fixed by installing proj-epsg.x86_64

sudo yum install proj-epsg.x86_64

5.2 – fixed by including ‘osm’ in the list of INPUT_PLUGINS as mentioned in STEP 4 or if you don’t want that, replace tests/python_tests/ with the file from here

This ensures that tests are only run for plugins in the INPUT_PLUGINS list and ignores all the tests for osm. If you want those tests to work, you must include osm in the list of INPUT_PLUGINS and run this part of STEP 4 again:

./configure && make && sudo make install

6) I already had Node.js installed, which I simply picked up from the main Node.js site and built from source in the normal way. The way recommended on the Mapbox website should also work:

tar xf node-v${NODE_VERSION}.tar.gz
cd node-v${NODE_VERSION}
./configure && make && sudo make install
cd ../

7) For the desktop windowing UI for Tilemill, I simply installed libwebkit2gtk.x86_64 because I have a GNOME desktop. Pick an equivalent here.

sudo yum install libwebkit2gtk.x86_64

8) Install Google Protobuf library

sudo yum install protobuf-compiler.x86_64 protobuf-lite-devel.x86_64

9) Install Tilemill

git clone
cd tilemill
npm install

Use g_set_object() to simplify (and safetyify) GObject property setters

tl;dr: Use g_set_object() to conveniently update owned object pointers in property setters (and elsewhere); see the bottom of the post for an example.

A little while ago, GLib gained a useful function called g_clear_object(), which clears an owned object pointer (a pointer which owns a reference to a GObject). GLib also just gained a function called g_set_object(), which works similarly but can either clear an object pointer or update it to point to a new object.

Why is this valuable? It saves a few lines of code each time an object pointer is updated, which isn’t much in itself. However, it also gets the order of reference counting operations right, and getting that order wrong is a common mistake in property setters. Instead of:

/* This is bad code. */
if (object_ptr != NULL)
    g_object_unref (object_ptr);
if (new_object != NULL)
    g_object_ref (new_object);
object_ptr = new_object;

you should always do:

/* This is better code. */
if (new_object != NULL)
    g_object_ref (new_object);
if (object_ptr != NULL)
    g_object_unref (object_ptr);
object_ptr = new_object;

because otherwise, if (new_object == object_ptr) (or if the objects have some other ownership relationship) and the object only has one reference left, object_ptr will end up pointing to a finalised GObject (and g_object_ref() will be called on a finalised GObject too, which it really won’t like).
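The ordering problem can be reproduced with a toy model. The sketch below is not GLib: Obj, ref(), unref(), set_bad() and set_good() are hypothetical stand-ins written in Python purely to show what happens when the new and old pointers refer to the same object with a single reference left.

```python
class Obj:
    """Toy stand-in for a reference-counted object (not the real GObject)."""

    def __init__(self):
        self.refs = 1
        self.finalised = False


def ref(obj):
    if obj.finalised:
        raise RuntimeError("ref on a finalised object")
    obj.refs += 1


def unref(obj):
    obj.refs -= 1
    if obj.refs == 0:
        obj.finalised = True


def set_bad(holder, new_obj):
    # Wrong order: dropping the old reference first can finalise the
    # object before the new reference is taken.
    if holder["ptr"] is not None:
        unref(holder["ptr"])
    if new_obj is not None:
        ref(new_obj)
    holder["ptr"] = new_obj


def set_good(holder, new_obj):
    # Right order: take the new reference first, then drop the old one.
    if new_obj is not None:
        ref(new_obj)
    if holder["ptr"] is not None:
        unref(holder["ptr"])
    holder["ptr"] = new_obj


holder = {"ptr": Obj()}
set_good(holder, holder["ptr"])  # same object: count goes 1 -> 2 -> 1
```

Calling set_bad() in the same situation drops the count to zero first, so the toy ref() raises; in real code the equivalent is a use of a finalised object.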

So how does g_set_object() help? We can now do:

g_set_object (&object_ptr, new_object);

which takes care of all the reference counting, and allows new_object to be NULL. &object_ptr must not be NULL. If you’re worried about performance, never fear. g_set_object() is a static inline function, so shouldn’t adversely affect your code size.

Even better, the return value of g_set_object() indicates whether the value changed, so we can conveniently base GObject::notify emissions off it:

/* This is how all GObject property setters should look in future. */
if (g_set_object (&priv->object_ptr, new_object))
    g_object_notify (G_OBJECT (self), "object-ptr");

Hopefully this will make property setters (and other object updates) simpler in new code.

Porting Yelp to WebKit2Gtk+, three months as an igalian

It has been a while since I wrote a post here, and many things have happened in my life. I have finished my second GSoC, I had an internship with Igalia, and I’m participating in the organization of a music festival!

But I’m here to talk about my Igalia internship. My task during this internship was to port a GNOME app from WebKitGtk+ to WebKit2Gtk+ and write a tutorial on how to do the port. The chosen app was Yelp. During this year's GUADEC, people talked about the need to port this app, and both my mentors and I thought it would be a really interesting app to port.

For those who don’t know what Yelp is, Yelp is the GNOME help viewer. Its implementation includes a WebView that shows the help document. The idea is to transform the XML documents to HTML using XSLT and then show the result. I have submitted the port patches to Bugzilla and they are waiting to be reviewed. The most difficult part of the port was dealing with the multiprocess architecture of WebKit2: we have to deal with the DOM in a different process from the WebView process. This makes certain apps more difficult to port, but it can be solved with different techniques.

Apart from porting Yelp, I submitted a patch to WebKit2Gtk+. It exposes part of the API so that you can check whether the clicked element corresponds to a selected area or not. Working with the WebKit code has been a really challenging and cool experience that I want to repeat. When I have time, I would like to finish my work on this patch, which should help improve plugin support in WebKit.

The tutorial I wrote to help people with this kind of port can be found here. In addition, after I finished my Yelp port I also worked on porting Sushi and Bijiben. The Sushi port is finished (it was a really trivial port), and the Bijiben port was cancelled because the app will probably be refactored to use a TextView instead of a WebView.

To finish, I have to thank Igalia for giving me this huge opportunity to work at such an awesome company. I especially want to thank my two mentors, Carlos Lopez and Carlos García, for the help they gave me during this time. During the Brno GUADEC somebody told me that Igalia was a one-of-a-kind company, and I totally agree: the atmosphere is great and their way of working is unique. I’m really proud to have been an igalian for these three months. Thank you Igalia!!

December 19, 2014

OpenHardware Random Number Generator

Before I spend another night reading datasheets: would anyone be interested in an OpenHardware random number generator in a full-size SD card format? The idea being you insert the RNG into the SD slot of your laptop, leave it there, and the kernel module just slurps trusted entropy when required.

Why SD? It’s a port that a lot of laptops already have empty, and on server class hardware you can just install a PCIe addon card. I think I can build such a thing for less than $50, but at the moment I’m just waiting for parts for a prototype so that’s really just a finger-in-the-air estimate. Are there enough free software people who care about entropy-related stuff?

December 18, 2014

Builder Progress Update

It's been a while since I've had the pleasure of updating everyone. Those that follow my twitter @hergertme get the daily updates and screenshots as they happen.

This is going to be a sort of long post, but it's filled with screenshots. I hope you enjoy keeping track of Builder as much as I enjoy creating it.


The editor has gone through a major facelift since you last saw it. I knew early on that tabs were not going to be a good interaction for our desired use case. But I had to start somewhere. Using Builder to build Builder was an important goal of mine.

No More Tabs

In the screenshot below, you'll see that we are no longer using tabs for managing documents.

Like a web browser, it's common for some of us to have ten or twenty documents open at any given time. Navigating this with tabs is simply more effort than it's worth. You can't see the tab label and jumping to them with an <Alt>Number accelerator doesn't work when you see documents twelve through twenty in the tab strip.

Keep in mind that in the future, we will add some accelerators to help move between frequently edited documents.

Document Switching

We like to borrow from other editors where it makes sense. That's why we have a decent VIM mode for the editor. This time, however, we decided to add a feature that is a bit closer to emacs.

Behind the scenes is the document manager. It is responsible for tracking all of the open buffers in Builder.

We've tied that into the menu button above the editor so you can switch between open buffers. Buffers that have been modified are denoted with a dot after the document title.

Split Views

Another feature, grown out of the document manager, is the ability to have split views. We abstracted creating a view for the document so that we can have multiple views into the same buffer. We still need to do some work here so that the insert mark is not lost between focus changes. It's all doable, we just need to spend time to get the details perfect.

Fear not, you can do vertical splits too. I tried having the titlebar duplicated for vertical splits, but it felt rather jarring and clunky. Therefore, we limit vertical splits to a single buffer.

Editor Setting Overrides

Occasionally you might find yourself needing to change the editor settings for a specific file. You don't want to change the defaults for the language (in Preferences) because that is slow and annoying. So this week I introduced the editor tweak popover. You can use it to change a few of those common settings without leaving your document.

Editor Markdown

The new split and abstracted document design works well with markdown. In fact, I'm writing this document using the live preview. We are still using the markdown parser from GNOME Software, so it is a bit limited in what it supports. It would be nice if someone provided patches to use CommonMark.

Editor HTML

Just like markdown, you have live preview with HTML. If you do web stuff and you want to use Builder for such work, best to tell me what you want.

Editor Code Assistance

Thankfully we don't have to go reinvent source parsers and diagnostics tools. gnome-code-assistance already does the heavy lifting for us here. So we made the choice early on to reuse as much of that as we can. Also, I hope to add some more services to GCA so they can be reused from Gedit. For example, I think it makes sense for our auto-completion engine to live there since it already has access to the parser and AST.

Code assistance works for Python too. It should work for anything that the GNOME Code Assistance diagnostics engine supports. So in other words, go help them and make all our projects better.

Editor State Tracking

We now keep track of the buffer's file on disk similar to Gedit. If the buffer was modified outside of the application you are notified the next time the widget is focused.

Closing Modified Documents

If you try to close modified documents, we will nag you now with a dialog lifted right from Gedit.

Editor Search

Search within the editor is looking pretty good these days.

Style Schemes

I created a style scheme just for Builder that feels very blueprint-like. It comes in two variants, light and dark. Neither is finished (in fact I only made the dark variant today). But I've been pretty happy with the light theme the last couple of weeks.


The preferences window got some work too.

Preferences Searching

Searching the preferences window now works. Tweaks that do not match your search query are hidden from view. This will look familiar to many of you: I copied the design from GNOME Tweak Tool, but lots of IDEs do this.

Style Scheme Selector

I made a fancy style scheme selector widget for editor styling. It could use some work, but I think it's easier than the try, look at editor, go back to preferences cycle people are used to.

Global Search

Global search is in its infancy. I just started on it the other day, mostly getting plumbing in place. I'll continue using it the next week or two so I get a feel of what works and what doesn't. After that, I'll make a bunch of changes and then start pushing that forward faster.

Command Bar

The Firefox-style command bar is in place. You can use it to execute actions (and even pass GVariant parameters) that are defined in the application.

It's very handy. I'll often run :save while in VIM mode. : will focus the command bar just like in VIM. save is the GAction we are executing.

As you might guess, you can even execute some VIM commands from this bar (when in VIM mode). Try something like :colorscheme tango or :%s/foo/bar/g.

Alex Larsson added tab completion to the command bar a few weeks ago.

Roll the Credits!

I wanted to create something special for those that choose to support me financially while I write Builder. The campaign will start very, very soon. Everyone that donates will get their name in the credits (among some other totally awesome perks).

I wanted to create a movie style credits widget. It was a fun bit of animation hacking.

December 17, 2014

Actually shipping AppStream metadata in the repodata

For the last couple of releases Fedora has been shipping the AppStream metadata in a package. First it was the gnome-software package, but this wasn’t an awesome dep for KDE applications like Apper and was a pain to keep updated. We then moved the data to an appstream-data package, but this was just as much of a hack, albeit one slightly more palatable for KDE. What I’ve wanted for a long time is to actually ship the metadata as metadata, i.e. next to the other files like primary.xml.gz on the mirrors.

I’ve just pushed the final patches to libhif, PackageKit and appstream-glib, which means that if you ship metadata of type appstream and appstream-icons in repomd.xml then they get downloaded automatically and decompressed into the right place so that gnome-software and apper can use the data magically.

I had not worked on this much before, as appstream-builder (which actually produces the two AppStream files) wasn’t suitable for the Fedora builders for two reasons:

  • Even just processing the changed packages, it took a lot of CPU, memory, and thus time.
  • Downloading screenshots from random websites all over the internet wasn’t something that a build server can do.

So, createrepo_c and modifyrepo_c to the rescue. This is what I’m currently doing for the Utopia repo.

createrepo_c --no-database x86_64/
createrepo_c --no-database SRPMS/
modifyrepo_c					\
	--no-compress				\
	/tmp/asb-md/appstream.xml.gz		\
modifyrepo_c					\
	--no-compress				\
	/tmp/asb-md/appstream-icons.tar.gz	\

If you actually do want to create the metadata on the build server, this is what I use for Utopia:

appstream-builder			\
	--api-version=0.8		\
	--origin=utopia			\
	--cache-dir=/tmp/asb-cache	\
	--enable-hidpi			\
	--max-threads=4			\
	--min-icon-size=48		\
	--output-dir=/tmp/asb-md	\
	--packages-dir=x86_64/		\
	--temp-dir=/tmp/asb-icons	\

For Fedora, I’m going to suggest getting the data files from during compose. It’s not ideal as it still needs a separate server to build them on (currently sitting in the corner of my office) but gets us a step closer to what we want. Comments, as always, welcome.

Good bye Bugzilla, welcome Phabricator.

<tl;dr>: Wikimedia migrated its bug tracking from Bugzilla to Phabricator in late November 2014.

After ten years of using Bugzilla with 73681 tickets and ~20000 user accounts and after months of planning, writing migration code, testing, gathering feedback, discussing, writing more code, writing documentation, communicating, et cetera, Wikimedia switched from Bugzilla to Phabricator as its issue tracking tool.
Phabricator is a collaboration platform and a forge that consists of several well-integrated applications. Maniphest is the name of the application for handling bug reports and tasks.
My announcement from May 2014 explained the idea (better collaboration and having fewer tools) and the decision-making process that led to choosing Phabricator and starting to work on making it happen.

Wikimedia Phabricator frontpage an hour after opening it for the public again after the migration from Bugzilla.


Quim already published an excellent summary of Wikimedia Phabricator right after the migration from Bugzilla, covering its main features and custom functionality that we implemented for our needs. Read that if you want to get an overview of how Phabricator helps Wikimedia with collaborating and planning in software development.
This blog post instead covers more details of the actual steps taken in the last months and the migration from Bugzilla itself. If you want even more verbose steps and information on the progress, check the status updates that I published every other week with links to the specific tickets and/or commits.


After reviewing our project management tools and closing the RfC, the team started to implement a Wikimedia SUL authentication provider (via OAuth) so that no separate account is needed, to work on an implementation to restrict access to certain tasks (access restrictions are on a task level and not on a project level), and to create an initial Phabricator module in Puppet.
We started to discuss how to convert information in Bugzilla (keywords, products and components, target milestones, versions, custom fields, …), which information to drop entirely (e.g. the severity field, the history of field value changes, votes, …), and which information to keep only as text in the initial description instead of in a dedicated field. More information about data migrated is available in a table. This constantly influenced the scope of the script for the actual data migration from Bugzilla (more information on code).

We already had a (now defunct) Phabricator test instance in Wikimedia Labs, which we now also started to use for planning the actual migration.
There’s a 7 minute video summary from June describing the general problem with our tools that we were trying to solve and the plan at that time. We also started to write help documentation.

As we got closer to launching the final production instance, we decided to split our planning into three separate projects to have a better overview: Day 1 of a Phabricator Production instance in use, Bugzilla migration, and RT migration.

On September 15th, the production instance launched with relevant content imported from the test instance which we had used for dogfooding. In the case of Wikimedia, this required setting up SNI and making it work with nginx and the certificate to allow using SUL and LDAP for login. After the production instance had launched, we also had another Hangout video session to teach the very basics of Phabricator.

To provide a short impression of further stuff happening in the background: Elasticsearch was set up as Phabricator’s search backend, some legal aspects (footer, determining the license of submitted content) were brought up, a new playground instance was set up, and we made several further customizations to user-visible strings and information on user pages within Phabricator. In the larger environment of Wikimedia infrastructure interacting with the issue tracker, areas like IRC bots, interwiki links, on-wiki templates, and automatic notifications in tasks about related patches in the code review system were dealt with or being worked on.

Paying attention to details: The “tracked” template on Wikimedia sites supports linking to tasks in Phabricator, while still redirecting links to Bugzilla tickets via URL redirects (see below).

We also had a chicken and egg problem to solve: Accounts versus tickets. Accounts in Bugzilla are defined by email addresses while accounts in Phabricator are user names. For weeks we were asking Bugzilla users and community users to already create an account in Phabricator and “claim” their Bugzilla accounts by entering the email address that they used in Bugzilla in their Phabricator account settings. The plan was to import the tickets and account ‘placeholders’ and then use cron jobs to connect the placeholder accounts with the actual users and to ‘claim’/connect their past Bugzilla contributions and activity by updating the imported data in Phabricator.
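A rough sketch of what one pass of such a claiming job could look like is shown below. The data model is an assumption made for illustration only: comments are plain dictionaries, the placeholder author is a single generic name, and claimed_emails maps a Bugzilla email address to a Phabricator user name. The real migration scripts are not reproduced here.

```python
def claim_contributions(comments, claimed_emails, placeholder="bzimport"):
    """Reassign placeholder-authored comments to users who claimed an email.

    comments: list of dicts with "author" and "bugzilla_email" keys
              (hypothetical structure).
    claimed_emails: dict mapping Bugzilla email -> Phabricator user name.
    Returns the number of comments updated in this pass.
    """
    updated = 0
    for comment in comments:
        if comment["author"] != placeholder:
            continue  # already claimed in an earlier pass
        username = claimed_emails.get(comment["bugzilla_email"])
        if username is not None:
            comment["author"] = username
            updated += 1
    return updated


comments = [
    {"author": "bzimport", "bugzilla_email": "alice@example.org"},
    {"author": "bzimport", "bugzilla_email": "bob@example.org"},
]
claims = {"alice@example.org": "Alice"}
first_pass = claim_contributions(comments, claims)
```

Because comments that are already claimed get skipped, the job can be re-run periodically as more users enter their Bugzilla email address.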

On October 23rd, we made a separate “bugzillapreview” test instance available on Wikimedia Labs with thousands of Bugzilla tickets imported. For two weeks, the community was invited to check how Bugzilla tickets are going to look in Phabricator after the migration and to identify more potential issues. The input was helpful and valuable: We received 45 reports and fixed 25 of them (9 were duplicates, 2 invalid, and 9 got declined).

A task imported from Bugzilla in the Phabricator preview instance.


Having reached a good overview, we created a consolidated list of known issues and potential regressions created by the migration from Bugzilla to Phabricator and defined a final date for the migration: November 21-23.

Keeping timestamps intact (such as the original creation date of a ticket in Bugzilla or when a certain comment was made) was still something to sort out at this point (and got tackled). Losing them would have been confusing and would have broken searches that triagers need when trying to clean up (e.g. for tickets which have not seen updates for two years).

It was also tricky performance-wise to keep the linear numbering order of reports, which many people requested so as not to depend solely on the URL redirects from Bugzilla to Phabricator that we planned to set up (more information on the redirect setup). As we already had ~1400 tickets in Phabricator, we went for the simple rule “report ID in Bugzilla + 2000 = task ID in Phabricator”.
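The numbering rule is simple enough to write down directly. The following sketch only encodes the offset stated above; the function names are made up for illustration, and the "T" prefix is how Phabricator displays task IDs.

```python
BUGZILLA_ID_OFFSET = 2000  # from the rule "Bugzilla ID + 2000 = task ID"


def bugzilla_to_task_id(bug_id):
    """Return the Phabricator task number for an imported Bugzilla report."""
    if bug_id < 1:
        raise ValueError("Bugzilla report IDs start at 1")
    return bug_id + BUGZILLA_ID_OFFSET


def task_label(bug_id):
    """Return the task label in Phabricator's T-number form."""
    return f"T{bugzilla_to_task_id(bug_id)}"


print(task_label(1))      # first Bugzilla report
print(task_label(73681))  # last of the 73681 migrated tickets
```

With an offset of 2000, the imported range starts safely above the ~1400 tasks that already existed in Phabricator.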

Regarding documentation and communication, we created initial project creation guidelines, sent one email to the 66 users of personal tags in Bugzilla warning that tags would not be migrated, and sent two emails to the 850 most recently active Bugzilla users asking them to log into Phabricator and provide the email address they used in Bugzilla, so that their imported contributions could be claimed as part of the migration (for comparison, the average number of active users per month in Bugzilla was around 500+ for the last months). We also put migration announcement banners on every page of our Bugzilla, among other places, and sent reminders to the wikitech-l, mediawiki-l, wikitech-ambassadors, and wmfall mailing lists.

After a last ‘Go versus No-Go’ meeting on November 12th, we set up the timeline with the list of steps to perform for the migration, ordered, with assignees defined for each task. This was mostly based on the remaining open dependencies of our planning task. We had two more IRC office hours on November 18 and 19 to answer questions regarding the migration and Phabricator itself.

While migrating, the team used a special X-Forwarded-For header to still be able to access Bugzilla and Phabricator via their browsers, while normal users trying to access Phabricator or Bugzilla were redirected to a wiki page telling them what was going on and where to escalate urgent issues (MediaWiki support desk or IRC) while no issue tracker was available. With the aforementioned URL redirects in place, we intended to move Bugzilla and keep it available for a while under a new address.
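The routing decision described here can be sketched as a small function. Everything concrete in it is an assumption: the marker value, the function name, and the idea that the check is a simple membership test on the header's comma-separated entries. The actual configuration used during the migration is not shown in this post.

```python
def route_request(headers, team_marker):
    """Return "pass" for migration-team requests, "redirect" otherwise.

    headers: dict of HTTP request headers.
    team_marker: hypothetical value the team put into X-Forwarded-For.
    """
    # HTTP header names are case-insensitive, so normalise them first.
    normalized = {name.lower(): value for name, value in headers.items()}
    forwarded = normalized.get("x-forwarded-for", "")
    entries = [entry.strip() for entry in forwarded.split(",") if entry.strip()]
    return "pass" if team_marker in entries else "redirect"


team = route_request({"X-Forwarded-For": "10.0.0.1, migration-team"}, "migration-team")
visitor = route_request({"User-Agent": "browser"}, "migration-team")
```

Requests without the marker would be sent to the maintenance wiki page; requests carrying it reach the trackers as usual.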


The page that users were redirected to while the migration was taking place.


The migration started by switching Bugzilla to read-only for good. Users can still log into Bugzilla (now available at a new address) and, e.g., run their search queries or access their list of votes on the outdated data, but they cannot create tickets or change existing ones.

We pulled and disabled its email interface, switched off the code review notification bot for Bugzilla, and switched off the scripts to sync Bugzilla tickets with Mingle and Trello.

The data migration consisted of applying a hack to work around a Bugzilla XML-RPC API issue (see below), running the migration fetch script (tasks and comments), reverting the hack, running the migration create script (attachments), moving Bugzilla to its new address, starting the cron jobs that assign Bugzilla activity to Phabricator users by replacing the generic “bzimport” user with the actual corresponding users, and setting up redirects from the old Bugzilla URLs.

A task before and after users have claimed their previous Bugzilla accounts (positions of comments in the right image manually altered for better comparison).


After several of those data migration steps we performed numerous tests. In parallel we prepared emails and announcements to send out and publish once we were finished, replaced links to Bugzilla with links to Phabricator on dozens of wiki pages, updated MediaWiki templates on the Wikimedia sites, and handled further small tasks.

Paying attention to details: The "infobox" template on MediaWiki extension homepages linking to the extension's bug reports at the bottom, now handled in Phabricator instead of Bugzilla.


For those curious about time spans: Fetching the 73681 Bugzilla tickets took ~5 hours, importing them ~25 hours, and claiming the imported user contributions of the single most active Bugzilla user took ~15 minutes.
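As a back-of-the-envelope check, those durations translate into the following average rates. The figures are the approximate ones quoted above, so the results are rough estimates rather than measurements.

```python
TICKETS = 73681  # number of migrated Bugzilla tickets


def tickets_per_second(tickets, hours):
    """Average throughput for a job that handled `tickets` in `hours`."""
    return tickets / (hours * 3600)


fetch_rate = tickets_per_second(TICKETS, 5)    # ~5 hours of fetching
import_rate = tickets_per_second(TICKETS, 25)  # ~25 hours of importing

print(f"fetch:  {fetch_rate:.2f} tickets/s")
print(f"import: {import_rate:.2f} tickets/s")
```

So fetching ran at roughly four tickets per second, while importing managed a bit under one ticket per second.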

But obviously we were pioneers that could not rely on Stackoverflow.
Even if you try to test everything, unexpected things happen while you are running the migration. I’m proud to say that we (well, rather Chase, Daniel, Mukunda and Sean when it came to dealing with code) managed to fix all of them. And while you try to plan everything, for such a complex move that nobody has tried before, there are things that you simply forget or have not thought about:

  • We had to work around an unresolved upstream XML-RPC API bug in Bugzilla by applying a custom hack when exporting comments in a first step and removing the hack when exporting attachments (with binary data) in a second step. Though we did so, it took us a while to realize that Bugzilla attachments imported into Phabricator were scrambled because the hack was still being applied for unknown reasons (some caching?). Rebooting the Bugzilla server fixed the problem, but we had to start from scratch with importing attachments.
  • Though we had planned to move Bugzilla from to after exporting all data, we hadn’t realized that we would need a certificate for that new subdomain. For a short time we had an ugly “This website might be insecure” browser warning when users tried to access the old Bugzilla until old Bugzilla was moved behind the Varnish/nginx layer with its wildcard * certificate.
  • Two Bugzilla statuses did not get converted into Phabricator tags. The code worked when we tested it but broke again at some later point without anybody realizing. This was noticed and fixed.
  • Bugzilla comments marked as private got public again once the cron jobs claiming contributions of that commenter were run. Again this was noticed and fixed.
  • We ended up with a huge feed queue and search indexing queue. We killed the feed daemon at some point. Realizing that it would have taken Phabricator’s daemons ~10 days to handle the queue, Chase and Mukunda debugged the problem together with upstream’s Evan and found a way to improve the SQL performance drastically.
  • We hadn’t thought about switching off some Bugzilla related cronjobs (minor) and I hadn’t switched off mail notifications from Bugzilla so some users still received "whining" emails until we stopped that.
  • We had a race condition in the migration code which did not always set the assignee of a Bugzilla ticket also as the assignee of the corresponding task in Phabricator. We realized early enough by comparing the numbers of assigned tickets for specific users and fixed the problem.
  • I hadn’t tested that aliases of Bugzilla reports actually get migrated. As this only affected ~120 tickets we decided to not try to fix this retroactively.
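
The assignee check mentioned in the list above (comparing the numbers of assigned tickets for specific users) could look roughly like the following. This is only a minimal sketch: the function name and the two input mappings (ticket id to assignee, with None for unassigned) are hypothetical and not taken from the actual migration code.

```python
from collections import Counter


def assignee_mismatches(bugzilla_assignees, phabricator_assignees):
    """Return {user: (bugzilla_count, phabricator_count)} for users whose counts differ.

    Both arguments are hypothetical exports mapping a ticket id to its
    assignee (or None when nobody is assigned).
    """
    bz_counts = Counter(user for user in bugzilla_assignees.values() if user)
    ph_counts = Counter(user for user in phabricator_assignees.values() if user)
    users = sorted(set(bz_counts) | set(ph_counts))
    return {
        user: (bz_counts[user], ph_counts[user])
        for user in users
        if bz_counts[user] != ph_counts[user]
    }
```

An empty result means the per-user counts agree; any entry points at a user whose tickets lost (or gained) an assignee during the import.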
Phabricator daemons being (too) busy handling the tasks mass-imported from Bugzilla.

We silently reopened Phabricator late on Sunday evening (UTC) and announced its availability on Monday morning (UTC) to the wikitech-l community and via the aforementioned blogpost.

A list of dependency tasks handled before completing the migration from Bugzilla to Phabricator is available.


Phabricator has many advantages compared to Bugzilla: Wikimedia users do not reveal their email addresses and users do not have another separate login and password. (These were the most popular complaints about Bugzilla.)

Integration with MediaWiki's Single User Login via OAuth - no separate login.

  • There is a preview when writing comments.
  • The initial description can be edited and updated like a summary while the discussion on a task evolves.
  • Users have a profile showing their latest activity.
  • There’s a global activity feed.
  • There is a notification panel on top.
  • The UI looks modern and works pretty well on devices with small screens.
  • Tasks can have either zero or one assignee. In Bugzilla an assignee must be set even if nobody plans to work on a ticket.
  • Tasks can have between zero and unlimited projects (such as code bases, sprints, releases, cross-project tags) associated. In Bugzilla, tickets must have exactly one product, exactly one component, exactly one target milestone, and between zero and unlimited cross-project keywords. That also solves Bugzilla’s problem of dealing with branches, e.g. setting several target milestones.
  • Projects have workboards (a card wall) with columns for planning sprints (Bugzilla only allowed getting lists of items which you cannot directly interact with from the list view). Thanks to Wikimedia Deutschland we now also have burndown charts for sprint projects.

The workboard of the Wikimedia Phabricator project, right after the Bugzilla migration.

Burndown chart for a two week sprint of the Wikimedia Analytics team.


From a bugmaster point of view there are also small disadvantages:
Some searches are not possible anymore via the web interface, e.g. searching for open tasks which have the same assignee set for more than 12 months ("cookie-licking") or tasks that have been closed within the last month.
Phabricator is more atomic when it comes to actions: I receive more mail notifications and it also takes me slightly longer to perform several steps in a single ticket (though my local Greasemonkey script saves me a little bit of time).

Furthermore, admins don’t have the same powers as in Bugzilla. The UI feels very clean though (breadcrumbs!):

Administrator view for setting policies in Maniphest.

New territories

Apart from the previous list of unexpected situations while migrating, there were also further issues we experienced before or after the migration.
Mass-importing huge amounts of data from an external system into Phabricator was new territory. For example, Phabricator initially had no API to create new projects or to import tickets from other systems. No Phabricator instance with >70000 tasks had existed before – before the migration we had a crash affecting anonymous users and after the migration the reports/statistics functionality became inaccessible (timing out). Those Phabricator issues were quickly fixed by upstream.
And of course in hindsight, there are always a few more things that you would have approached differently.

Next steps

All in all and so far, things work surprisingly well.
We are still consolidating good practices and guidelines for project management (we had a Hangout video session on December 11th about that), I’ve shared some queries helpful for triagers, and we keep improving our Phabricator and bug management related help and documentation. The workflow offered by Phabricator also creates interesting new questions to discuss. Just one example: When a task has several code related projects assigned that belong to different teams, who decides on the priority level of the task?

Next on the list is replacing RT (mostly used by the Operations team) and helping teams to migrate from Trello and Mingle (Language Engineering, Multimedia and parts of Analytics have already succeeded). In 2015 we plan to migrate code repository browsing from gitblit and code review from Gerrit.

OMG we made it

A huge huge thanks to my team: Chase (Operations), Mukunda (Platform), Quim (Engineering Community Team), the many people who contributed code or helped out (Christopher, Daniel, Sean, Valhallasw, Yuvi, and more that I’ve likely forgotten), and the even more people who provided input and opinions (developers, product managers, release management, triagers, bug reporters, …) leading to decisions.
I can only repeat that the upstream Phabricator team (especially Evan) have been extremely responsive and helpful by providing feedback incredibly fast, fixing many of our requests and being available when we ran into problems we could not easily solve ourselves.


Running Django on Docker: a workflow and code

It has been an extremely long time between beers (10 months!). I’ve gotten out of the habit of blogging and somehow I never blogged about the talk I co-presented at PyCon AU this year on Pallet and Forklift, the standard and tool we’ve developed at Infoxchange to help make it easier to develop web applications on Docker [1].

Infoxchange is one of the few places I’m aware of that runs Docker in prod. If you’re looking at using Docker to do web development, it’s worth checking out what we’ve been doing over on the Infoxchange devops blog.

  1. There’s also Straddle Carrier, a set of Puppet manifests for loading Docker containers on real infrastructure, but they’ve not been released yet as they rely too much on our custom Puppet config.

December 16, 2014


In the last year working at Xamarin, I have learned lots of new things (.NET, Cocoa, …), and since the beginning of that, I have been thinking about bringing some of that nice stuff to GNOME, but didn’t really have the chance to finish anything. But, fortunately, being free now (on vacation), I finally finished the 1st thing: GObservableCollection, a thread-safe collection implementation which emits signals on changes.

It is based on ideas from .NET’s ObservableCollection and concurrent collections, which I’ve used successfully for building a multi-threaded data processing app (with one thread updating the collection and another consuming it), so I thought it would be a good addition to GLib’s API. This class can be used in single-threaded apps to easily get notifications for changes in a collection, and in multi-threaded ones to easily share data between different threads, as mentioned above (and as can be seen in the simple test I wrote).
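
To illustrate the general idea of a thread-safe collection that notifies listeners on changes, here is a minimal Python sketch. It is not the GObservableCollection API: the class name, the method names and the callback signature are all made up for illustration, and plain callbacks stand in for GObject signals.

```python
import threading


class ObservableList:
    """Hypothetical thread-safe list that reports changes to registered handlers."""

    def __init__(self):
        self._items = []
        self._handlers = []
        self._lock = threading.Lock()

    def connect(self, handler):
        """Register handler(kind, index, item); kind is "added" or "removed"."""
        with self._lock:
            self._handlers.append(handler)

    def append(self, item):
        with self._lock:
            self._items.append(item)
            index = len(self._items) - 1
            handlers = list(self._handlers)
        # Handlers run outside the lock so they may safely touch the collection.
        for handler in handlers:
            handler("added", index, item)

    def remove_at(self, index):
        """Remove the item at a non-negative index and notify handlers."""
        with self._lock:
            item = self._items.pop(index)
            handlers = list(self._handlers)
        for handler in handlers:
            handler("removed", index, item)

    def snapshot(self):
        """Return a copy of the current contents."""
        with self._lock:
            return list(self._items)
```

Note that in this sketch handlers run in whichever thread performed the change; a GLib-based implementation could instead choose to dispatch notifications to a main loop.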

This is the 1st working version, so for sure it will need improvements, but instead of keeping it private for a few more months, I thought it would be better getting some feedback before I submit it as a patch for GLib’s GIO (if that’s the best place for it, which I guess it is).

December 15, 2014

Web Engines Hackfest 2014

For the 6th year in a row, Igalia has organized a hackfest focused on web engines. The 5 years before this one were actually focused on the GTK+ port of WebKit, but the number of web engines that matter to us as Free Software developers and consultancies has grown, and so has the scope of the hackfest.

It was a very productive and exciting event. It has already been covered by Manuel Rego, Philippe Normand, Sebastian Dröge and Andy Wingo! I am sure more blog posts will pop up. We had Martin Robinson telling us about the new Servo engine that Mozilla has been developing as a proof of concept for both Rust as a language for building big, complex products and for doing layout in parallel. Andy gave us a very good summary of where JS engines are in terms of performance and features. We had talks about CSS grid layouts, TyGL – a GL-powered implementation of the 2D painting backend in WebKit, the new Wayland port, announced by Zan Dobersek, and a lot more.

With help from my colleague ChangSeok OH, I presented a description of how a team at Collabora led by Marco Barisione made the combination of WebKitGTK+ and GNOME’s web browser a pretty good experience for the Raspberry Pi. It took a not so small amount of both pragmatic limitations and hacks to get to a multi-tab browser that can play YouTube videos and be quite responsive, but we were very happy with how well WebKitGTK+ worked as a base for that.

One of my main goals for the hackfest was to help drive features that were lingering in the bug tracker for WebKitGTK+. I picked up a patch that had gone through a number of iterations and rewrites: the HTML5 notifications support, and with help from Carlos Garcia, managed to finish it and land it on the last day of the hackfest! It provides new signals that can be used to authorize notifications, and to show and close them.

To make notifications work in the best case scenario, the only thing that the API user needs to do is handle the permission request, since we provide a default implementation for the show and close signals that uses libnotify if it is available when building WebKitGTK+. Originally our intention was to use GNotification for the default implementation of those signals in WebKitGTK+, but it turned out to be a pain to use for our purposes.

GNotification is tied to GApplication. This allows for some interesting features, like notifications being persistent and able to reactivate the application, but those make no sense in our current use case, although that may change once service workers become a thing. It can also be a bit problematic given we are a library and thus have no GApplication of our own. That was easily overcome by using the default GApplication of the process for notifications, though.

The show stopper for us using GNotification was the way GNOME Shell currently deals with notifications sent using this mechanism. It will look for a .desktop file named after the application ID used to initialize the GApplication instance and reject the notification if it cannot find that. Besides making this a pain to test (our test browser would need a .desktop file to be installed), that would not work for our main API user! The application ID used for all Web instances is org.gnome.Epiphany at the moment, and that is not the same as any of the desktop files used either by the main browser or by the web apps created with it.

For the future we will probably move Epiphany towards this new era, and all users of the WebKitGTK+ API as well, but the strictness of GNOME Shell would hurt the usefulness of our default implementation right now, so we decided to stick to libnotify for the time being.

Other than that, I managed to review a bunch of patches during the hackfest, and took part in many interesting discussions regarding the next steps for GNOME Web and the GTK+ and Wayland ports of WebKit, such as the potential introduction of a threaded compositor, which is pretty exciting. We also tried to have Bastien Nocera as a guest participant for one of our sessions, but it turns out that requires more than a notebook on top of a bench hooked up to a TV to work well. We could think of something next time ;D.

I’d like to thank Igalia for organizing and sponsoring the event, Collabora for sponsoring and sending ChangSeok and myself over to Spain from far away Brazil and South Korea, and Adobe for also sponsoring the event! Hope to see you all next year!

Web Engines Hackfest 2014 sponsors: Adobe, Collabora and Igalia

2014-12-15: Monday

  • Mail chew, interview, sync. with Matus, more mail; lunch. Product Team call, sync with Michal, call with GL guys, Consultancy Team call, sync. with CL guys. Fed babes; read stories. J. doing PCC training - wrote LXF column.

Web Engines Hackfest 2014

Last week I attended the Web Engines Hackfest. The event was sponsored by Igalia (also hosting the event), Adobe and Collabora.

As usual I spent most of the time working on the WebKitGTK+ GStreamer backend, and Sebastian Dröge kindly joined and helped out quite a bit. Make sure to read his post about the event!

We first worked on the WebAudio GStreamer backend. Sebastian cleaned up various parts of the code, including the playback pipeline and the source element we use to bridge the WebCore AudioBus with the playback pipeline. On my side I finished the AudioSourceProvider patch that was abandoned for a few months (years) in Bugzilla. It’s an interesting feature to have so that web apps can use the WebAudio API with raw audio coming from Media elements.

I also hacked on GstGL support for video rendering. It’s quite interesting to be able to share the GL context of WebKit with GStreamer! The patch is not ready yet for landing but thanks to the reviews from Sebastian, Mathew Waters and Julien Isorce I’ll improve it and hopefully commit it soon in WebKit ToT.

Sebastian also worked on Media Source Extensions support. We had a very basic, non-working, backend that required… a rewrite, basically :) I hope we will have this reworked backend soon in trunk. Sebastian already has it working on Youtube!

The event was interesting in general, with discussions about rendering engines, rendering and JavaScript.

December 14, 2014

2014-12-14: Sunday

  • Off to NCC, ran the older kids group looking at John 2; back for a fine lunch; applied slugging - slept on the sofa. Quartet practice, the Princess Bride "Get used to disappointment", tea, & sermon in bed.

December 12, 2014

Tips for contributing code to open source projects

I've spent a lot of time over the years contributing to and reviewing code changes to open source projects. It can take a lot of work for the submitter and reviewer to get a change accepted and often they don't make it. Here are the things in my experience that successful contributions do.

Use the issue tracker. Having an open issue means there is always something to point to, with all the history of the change, that won't get lost. Submit patches using the appropriate method (merge proposals, pull requests, attachments in the issue tracker etc).

Sell your idea. The change is important to you but the maintainers may not think so. You may be a 1% use case that doesn't seem worth supporting. If the change fixes a bug describe exactly how to reproduce the issue and how serious it is. If the change is a new feature then show how it is useful.

Always follow the existing coding style. Even if you don't like it. If the existing code uses tabs, then use them too. Match brace style. If the existing code is inconsistent, match the code nearest to the changes you are making.

Make your change as small as possible. Put yourself in the mind of the reviewer. The longer the patch the more time it will take to review (and the less appealing it will be to do). You can always follow up later with more changes. First time contributors need more review - over time you can propose bigger changes and the reviewers can trust you more.

Read your patch before submitting it. You will often find bits you should have removed (whitespace, unrelated variable name changes, debugging code).

Be patient. It's OK to check back on progress - your change might have been forgotten about (everyone gets busy). Ask if there's any more you can do to make it easier to accept.

Remote Reference Counting in Rapicorn

In the last months I finally completed and merged a long standing debt into Rapicorn. Ever since the Rapicorn GUI layout & rendering thread got separated from the main application (user) thread, referencing widgets (from the application via the C++ binding or the Python binding) worked mostly due to luck. I investigated and researched several [...]

Nautilus port to GAction, GMenu, and Popovers – Penultimate last step


Finally I have all working and change as per the mockup in

The positioning and size of icons are also changed to and



So, how much change is it?

The main patch touches 24 files, removing 6210 lines of code and adding 2802.

For me the most important part was deleting 6000 lines of code. Nautilus was using a lot of legacy code, written in an intricate way. Cleaning up those lines makes the maintenance of the application a lot more pleasant, and a little smarter.

So what is the work still needed?

Basically, make sure I deleted all the legacy code and make sure every case is still taken into account, now with a lot less code. Hopefully next week I will have a patch ready to review. I guess the review part will be a little long, since there are new ideas that reviewers will probably argue about.

After that, create the new API for extensions and make extensions work.

Hope you like the new changes in nautilus!

December 11, 2014

Ipsilon 0.3.0 released!

My last post about the identity provider project Ipsilon is now almost a month old, and we just hit another big milestone in Ipsilon: version 0.3.0 has been released!

This release includes a bunch of new features, partially because of the merge with FedOAuth, of which the most notable are, in no particular order:

  • A transaction system so Ipsilon can be used to authenticate in multiple browser tabs at the same time
  • A completely revamped admin panel to make it more user-friendly
  • The option to have completely file-based configuration in case you use configuration management
  • Ability to store configuration and other data in a SQL database
  • Addition of OpenID support including some often-used extensions
  • Addition of the Persona protocol
  • And lots more fixes and code cleanup

For more information, see the 0.3.0 release page.

We also now have a mailing list to get in contact with the Ipsilon developers to contribute or ask questions about Ipsilon.

You can also get in contact with us on #ipsilon on Freenode.

Thanks to everyone who helped with this release, and we hope to have more exciting news in the near future!

Web Engines Hackfest 2014

During the last days I attended the Web Engines Hackfest 2014 in A Coruña, which was kindly hosted by Igalia in their office. This gave me some time to work again on WebKit, and especially improve and clean up its GStreamer media backend, and even more important of course an opportunity to meet again all the great people working on it.

Apart from various smaller and bigger cleanups, and getting 12 patches merged (thanks to Philippe Normand and Gustavo Noronha reviewing everything immediately), I was working on improving the WebAudio AudioDestination implementation, which allows websites to programmatically generate audio via Javascript and output it. This should be in a much better state now.

But the biggest chunk of work was my attempt for a cleaner and actually working reimplementation of the Media Source Extensions. The Media Source Extensions basically allow a website to provide a container or elementary stream for e.g. the video tag via Javascript, and can for example be used to implement custom streaming protocols. This is still far from finished, but at least it already works on YouTube with their Javascript DASH player and should give a good starting point for finishing the implementation. I hope to have some more time for continuing this work, but any help would be welcome.

Next to the Media Source Extensions, I also spent some time on reviewing a few GStreamer related patches, e.g. the WebAudio AudioSourceProvider implementation by Philippe, or his patch to use GStreamer’s OpenGL library directly in WebKit instead of reimplementing many pieces of it.

I also took a closer look at Servo, Mozilla’s research browser engine written in Rust. It looks like a very promising and well designed project (both Servo and Rust, actually!). I’m sure I’ll play around with Rust in the near future, and will also try to make some time available to work a bit on Servo. Thanks also to Martin Robinson for answering all my questions about Servo and Rust.

And then there were of course lots of discussions about everything!

Good things are going to happen in the future, and WebKit still seems to be a very active project with enthusiastic people :)
I hope I’ll be able to visit next year’s Web Engines Hackfest again, it was a lot of fun.

December 10, 2014

A looking glass for GTK+

gnome-shell has a nice integrated developer tool called “Looking glass”. To bring it up, simply enter “lg” into the command prompt:

Alt-F2 lg

This brings up a translucent overlay with an interactive JavaScript prompt. Among the nice things it offers are tab completion and an object picker.

Here is how it looks:

Looking glass

If you haven’t tried it yet, you really should.

Over the years, many people have asked,

“Can I use Looking glass to debug my GTK+ application?”.

So far, the answer has always been no. But now, the always awesome Alex Larsson has stepped up and made essentially the same functionality available for the GTK+ Inspector:

Interactive Inspector

As you can see, it has an object picker, tab completion and a result history just like its ‘big brother’. Alex made a little demo video for it.

To avoid a gjs dependency in GTK+ itself and the associated cyclic dependency issues, this is implemented as an extension in a loadable module. The code currently lives on github, in Alex’s gjs-inspector repository.


Wikimedia in Google Code-in 2014: The first week

Wikimedia takes part in Google Code-in (GCI) 2014. The contest has been running for one week and students have already resolved 35 Wikimedia tasks. You can help increase that number (see below).

Google Code-in 2014

Some of the achievements:

  • Citoid offers export in BibTeX format (and more contributions)
  • Analytics’ Dashiki has a mobile-friendlier view
  • Echo’s badge label text has better readability; Echo uses the standard gear icon for preferences
  • Wikidata’s Wikibase API modules use i18n for help/docs
  • Two MediaWiki extensions received patches to not use deprecated i18n functions anymore
  • MediaWiki displays an error when trying to create a self-redirect
  • The sidebar group separator in MediaWiki’s Installer looks like in Vector
  • The Wikimedia Phabricator docs have video screencasts and an updated Bug report life cycle diagram
  • Huggle’s on-wiki docs were updated; exceptions received cleanup
  • Pywikibot’s replicate_wiki supports global args; optparse was replaced by argparse
  • Reasons for MediaWiki sites listed as defunct on WikiApiary were researched
  • Wikimedia received logo proposals for the European Wikimedia Hackathon 2015
  • …and many more.

Sounds good? Want to help? Then please spend five minutes to go through the tasks on your to-do list and identify simple ones to help more young people contribute! Got an idea for a task? Become a mentor!

December 09, 2014

sndflo 0.1: Visual sound programming in SuperCollider

SuperCollider is an open source project for real-time audio synthesis and algorithmic composition.
It is split into two parts; an interpreter (sclang) implementing the SuperCollider language and the audio synthesis server (scsynth).
The server has a directed acyclic graph of nodes which it executes to produce the audio output (paper|book on internals). It is essentially a dataflow runtime, specialized for the problem domain of real-time audio processing. The client controls the server through OSC messages which manipulate this graph. Typically the client is some SuperCollider code in the sclang interpreter, but one can also use Clojure, Python or other clients. It is in many ways quite similar to the Flowhub visual IDE (a FBP protocol client) and runtimes like NoFlo, imgflo and MicroFlo.
So we decided to make SuperCollider a runtime too: sndflo.


Growing list of runtimes that Flowhub can target

We used SuperCollider for Piksels & Lines Orchestra, an audio performance system which hooked into graphics applications like GIMP, Inkscape, MyPaint and Scribus, and sonified the user’s actions in the application. A lot of time was spent wrestling with SuperCollider, due to the number of new concepts, the myriad of ways to do things, and the lack of (well documented) best practices.
There is also a tendency to favor very short, expressive constructs (often opaque). An extreme example, here is an album of SuperCollider pieces composed with <140 characters (+ an analysis of some of them).

In contrast, sndflo is very focused and opinionated. It exposes Synths as components, which are wired together using Busses (edges in the graph), allowing audio effect pipelines to be built. There are several known issues and limitations, but it has now reached a minimally useful state. Creating Synth components (the individual effects) as a visual graph of UGen (primitives like Sin, Cos, Min, Max, LowPass) components is also within scope and planned for the next release.

Simple subtractive audio synthesis using a saw wave and low-pass filter

The sndflo runtime is itself written in SuperCollider, as an extension. This is to make it easier for those familiar with SuperCollider to understand the code, and to facilitate integration with existing SuperCollider code and tools. For instance, one can set up an audio pipeline visually using Flowhub+sndflo, then use the Event/Pattern/Stream system in SuperCollider to create an algorithmic composition that drives this pipeline.
Because a web browser cannot talk OSC (UDP/TCP) and SuperCollider does not talk WebSocket, a node.js wrapper converts FBP protocol messages between JSON over WebSocket and JSON over OSC.

sndflo also implements the remote runtime part of the FBP protocol, which allows seamless interconnection between runtimes. One can export ports in one runtime, and then use it as a component in another runtime, communicating over one of the supported transports (typically JSON over WebSocket).

YouTube demo video

In the above example sndflo runs on a Raspberry Pi, and is then used as a component in a NoFlo browser runtime to provide a web interface, both programmed with Flowhub. We could in the same way wire up another FBP runtime, for instance using MicroFlo on Arduino to integrate some physical sensors into the system.
Pretty handy for embedded systems, interactive art installations, internet-of-things or other heterogeneous systems.

A checklist for writing pkg-config files

tl;dr: Use AX_PKG_CHECK_MODULES to split public/private dependencies; use AC_CONFIG_FILES to magically include the API version in the .pc file name.

A few tips for creating a pkg-config file which you will never need to think about maintaining again — because one of the most common problems with pkg-config files is that their dependency lists are years out of date compared to the dependencies checked for in See lower down for some example automake snippets.

  • Include the project’s major API version [1] in the pkg-config file name. e.g. libfoo-1.pc rather than libfoo.pc. This will allow parallel installation of two API-incompatible versions of the library if it becomes necessary in future.
  • Split private and public dependencies between Requires and Requires.private. This eliminates over-linking when dynamically linking against the project, since in that case the private dependencies are not needed. This is easily done using the AX_PKG_CHECK_MODULES macro (and perhaps using an upstream macro in future — see pkg-config bug #87154). A dependency is public when its symbols are exposed in public headers installed by your project; it is private otherwise.
  • Include useful ancillary variables, such as the paths to any utilities, directories or daemons which ship with the project. For example, glib-2.0.pc has variables giving the paths for its utilities: glib-genmarshal, gobject-query and glib-mkenums. libosinfo-1.0.pc has variables for its database directories. Ensure the variables use a variable form of ${prefix}, allowing it to be overridden when invoking pkg-config using pkg-config --define-variable=prefix=/some/other/prefix. This allows use of libraries installed in one (read only) prefix from binaries in another, while installing ancillary files (e.g. D-Bus service files) to the second prefix.
  • Substitute in the Name and Version using @PACKAGE_NAME@ and @PACKAGE_VERSION@ so they don’t fall out of sync.
  • Place the template in the source code subdirectory for the library it’s for — so if your project produces multiple libraries (or might do in future), the files don’t get mixed up at the top level.
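
To see why defining ancillary variables in terms of ${prefix} matters, here is a small Python model of how variables in a .pc file expand, and how an override (as with --define-variable=prefix=…) propagates to everything derived from it. It is a simplified sketch, not pkg-config's real parser, and the servicedir variable and example file contents are hypothetical.

```python
import re

_NAME = r"[A-Za-z_][A-Za-z0-9_.]*"
_REFERENCE = re.compile(r"\$\{(" + _NAME + r")\}")
_ASSIGNMENT = re.compile(r"(" + _NAME + r")=(.*)$")

# Hypothetical .pc contents; servicedir is an example ancillary variable.
EXAMPLE_PC = """\
prefix=/usr
exec_prefix=${prefix}
libdir=${exec_prefix}/lib
servicedir=${prefix}/share/dbus-1/services
Name: my-project
Libs: -L${libdir} -lmy-project-1
"""


def expand_pc_variables(pc_text, overrides=None):
    """Return all variables from pc_text, fully expanded, after applying overrides."""
    variables = {}
    for line in pc_text.splitlines():
        match = _ASSIGNMENT.match(line.strip())
        if match:  # "Name: value" keyword lines do not match and are skipped
            variables[match.group(1)] = match.group(2).strip()
    variables.update(overrides or {})

    def resolve(name, seen=()):
        if name in seen:
            raise ValueError("circular variable reference: " + name)
        return _REFERENCE.sub(
            lambda ref: resolve(ref.group(1), seen + (name,)), variables[name]
        )

    return {name: resolve(name) for name in variables}
```

With no overrides, servicedir expands under /usr; with overrides={"prefix": "/opt/other"} it moves along with the prefix. A variable hard-coded as /usr/share/… would not.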

Given all those suggestions, here’s a template libmy-project/ file (updated to incorporate suggestions by Dan Nicholson):



Description: Some brief but informative description
Libs: -L${libdir} -lmy-project-@API_VERSION@
Cflags: -I${includedir}/my-project-@API_VERSION@

And here are a few snippets from a template

# Release version

# API version


# Dependencies


# The first list on each line is public; the second is private.
                     [glib-2.0 >= $glib_reqs gio-2.0 >= $gio_reqs],
                     [gthread-2.0 >= $gthread_reqs])
                     [nice >= $nice_reqs],


# Output files
# Rename the template .pc file to include the API version on configure

And finally, the top-level

# Install the pkg-config file; the directory is set using
pkgconfig_DATA = libmy-project/my-project-$(API_VERSION).pc

Once that’s all built, you’ll end up with an installed my-project-1.pc file containing the following (assuming a prefix of /usr):



Name: my-project
Description: Some brief but informative description
Version: 1.2.3
Libs: -L${libdir} -lmy-project-1
Cflags: -I${includedir}/my-project-1
Requires: glib-2.0 >= 2.40 gio-2.0 >= 2.42 nice >= 0.1.6
Requires.private: gthread-2.0 >= 2.40

All code samples in this post are released into the public domain.

  1. Assuming this is the number which will change if backwards-incompatible API/ABI changes are made. 

state of js implementations, 2014 edition

I gave a short talk about the state of JavaScript implementations this year at the Web Engines Hackfest.

29 minutes, vorbis or mp3; slides (PDF)

The talk goes over a bit of the history of JS implementations, with a focus on performance and architecture. It then moves on to talk about what happened in 2014 and some ideas about where 2015 might be going. Have a look if that's a thing you are in to. Thanks to Adobe, Collabora, and Igalia for sponsoring the event.

December 08, 2014

A look at new developer features

As the development window for GNOME 3.16 advances, I've been adding a few new developer features, selfishly, so I could use them in my own programs.

Connectivity support for applications

Picking up from where Dan Winship left off, we've merged support for applications to detect network availability, especially the "connected to a network but not to the Internet" case.

In glib/gio now, watch the value of the "connectivity" property in GNetworkMonitor.

Grilo automatic network awareness

This glib/gio feature allows us to show/hide Grilo sources from applications' view if they require Internet and LAN access to work. This should be landing very soon, once we've made the new feature optional based on the presence of the new GLib.


And finally, this means we'll soon be able to show a nice placeholder when no network connection is available, and there are no channels left.

Grilo Lua resources support

A long-standing request, GResources support has landed for Grilo Lua plugins. When a script is loaded, we'll look for a separate GResource file with ".gresource" as the suffix, and automatically load it. This means you can use a local icon for sources with the URL "resource:///org/gnome/grilo/foo.png". Your favourite Lua sources will soon have icons!

Grilo Opensubtitles plugin

The developers affected by this new feature may be a group of one, but if the group is ever to expand, it's the right place to do it. This new Grilo plugin will fetch the list of available text subtitles for specific videos, given their "hashes", which are now exported by Tracker.

GDK-Pixbuf enhancements

I can point you to the NEWS file for the latest version, but the main gains are that GIF animations won't eat all your memory, DPI metadata support in JPEG, PNG and TIFF formats, and, for image viewers, you can tell whether a TIFF file is multi-page to open it in a more capable viewer.

Batched inserts, and better filters in GOM

Does what it says on the tin. This is useful for populating the database more quickly than through piecemeal inserts; it also means you don't need to chain inserts when inserting multiple items.

Mathieu also worked on fixing the priority of filters when building complex queries, as well as supporting more than 2 items in a filter ("foo OR bar OR baz" for example).

summing up 66

i am trying to build a jigsaw puzzle which has no lid and is missing half of the pieces. i am unable to show you what it will be, but i can show you some of the pieces and why they matter to me. if you are building a different puzzle, it is possible that these pieces won't mean much to you, maybe they won't fit or they won't fit yet. then again, these might just be the pieces you're looking for. this is summing up, please find previous editions here.

  • federated education: new directions in digital collaboration, as advocates we're so often put in a situation where we have to defend the very idea that social media is an information sharing solution that we don't often get to think about what a better solution for collaboration would look like. because there are problems with the way social media works now. minority voices are squelched, flame wars abound. we spend hours at a time as rats hitting the skinner-esque levers of twitter and tumblr, hoping for new treats - and this might be ok if we actually then built off these things, but we don't. we're stuck in an attention economy feedback loop where we react to the reactions of reactions (while fearing further reactions), and then we wonder why we're stuck with groupthink and ideological gridlock. we're bigger than this and we can envision new systems that acknowledge that bigness. we can build systems that return to the vision of the forefathers of the web. the augmentation of human intellect. the facilitation of collaboration. the intertwingling of all things. this is one such proposal. maybe you have others. highly recommended
  • your app is good and you should feel good, there's no disincentive to honking at people for the slightest provocation. there's little recourse for abuse. it's such an asymmetrical, aggressive technology, so lacking in subtlety. it kind of turns everyone into a crying baby - you can let the people around you know that you're very upset, but not why. i think the internet is like this sometimes, too. the internet is like a car horn that you can honk at the entire world. recommended
  • the best investment advice you'll never get, don't try to beat the market and don't believe anyone who tells you they can - not a stock broker, a friend with a hot stock tip, or a financial magazine article touting the latest mutual fund. seasoned investment professionals have been hearing this anti-industry advice, and the praises of indexing, for years. but while wall street has considerable soul-searching to do, full blame for the gouging of naive investors does not lie with the investment management industry alone. there is an innate cultural imperative in this country to beat the odds, to do better than the joneses. it's simply difficult for most of us to accept average returns on our money, or on anything for that matter. recommended
  • forget shorter showers - why personal change does not equal political change, i think we're in a double bind. a double bind is where you're given multiple options, but no matter what option you choose, you lose, and withdrawal is not an option. at this point, it should be pretty easy to recognize that every action involving the industrial economy is destructive. so if we choose option one - if we avidly participate in the industrial economy - we may in the short term think we win because we may accumulate wealth, the marker of "success" in this culture. but we lose, because in doing so we give up our empathy, our animal humanity. and we really lose because industrial civilization is killing the planet, which means everyone loses. if we choose the "alternative" option of living more simply, thus causing less harm, but still not stopping the industrial economy from killing the planet, we may in the short term think we win because we get to feel pure, and we didn't even have to give up all of our empathy, but once again we really lose because industrial civilization is still killing the planet, which means everyone still loses. the third option, acting decisively to stop the industrial economy, is very scary for a number of reasons, including but not restricted to the fact that we'd lose some of the luxuries to which we've grown accustomed, and the fact that those in power might try to kill us if we seriously impede their ability to exploit the world - none of which alters the fact that it's a better option than a dead planet. any option is a better option than a dead planet

December 07, 2014

Moving on

Just a quick (and slightly delayed) note to let you all know that after seven years I have left Igalia. It has really been an amazing, rocking place to be, and I have the best wishes for the company and its people.

Malmö seaside view

Malmö seaside view, with the bridge to Denmark in the background.

Currently, I am living in Malmö (Sweden), where I am taking part in the Master in Interaction Design. This will be my second Master’s degree, after the one on HCI that I studied at the University of York in 2009. But whereas that was more focused on theory and academical research, this one is eminently practical: the course is structured around projects that try to go beyond the boundaries of mainstream everyday design, using many different techniques and tools to gather insights and quickly build working prototypes.

I don’t know yet what the future holds, but that is a question for the new year.

This blog will be frozen after this post; if you want to keep in touch, I’m  @felipeerias on Twitter.

December 06, 2014

layer by layer

I’ve been having some more fun working on Graphene, lately, thanks to Alex and his port of three.js to GObject, Gthree.

the little library that started in May with a bunch of basic types for vectors and matrices, as well as some 2D types that were required for convenience, has now grown to include some more 3D types, for the convenience of people developing 3D canvases:

  • graphene_triangle_t, a simple triangle shape made of co-planar 3D points
  • graphene_plane_t, a representation of a 2D plane in 3D space
  • graphene_box_t, a 3D volume, defined by a minimum and a maximum vertex
  • graphene_sphere_t, a sphere, unsurprisingly
  • graphene_frustum_t, a frustum defined by six clipping planes
  • graphene_euler_t, a rotation described by Euler angles

alongside these new types there are various new operators for vectors, as well as utility functions for matrices and quaternions. all in all, ~120 new symbols have been added to the public API since the 1.0 release, bringing the total symbols a tad over 400, all documented.

obviously, given the number of new types and entry points, I had to improve the coverage of the test suite, which is now at around 60%. still a ways to go before it is good instead of just passable, but definitely better than before.

for the department of the recycle bin of development history, I entertained the idea of adding a vectorized graphene_color_t type, useful if you’re interpolating or operating on texture vertices; but in the end I decided against it. it’s easy enough to use graphene_vec4_t for those cases, and the thing I don’t want to do is re-implement the RGBA parsing and HSL conversion and interpolation again, this time using SIMD instructions.

I guess, at this point, the time has come to stop adding stuff, cut the 1.2 release, and wait until users come up with data types and API for their particular needs. right now, Graphene has become a fairly sizeable library, but one that is easy to keep in your head in its entirety, and I’d hate to lose that characteristic.

December 05, 2014

WideOpenId –

Uh, I meant to blog about this a while ago, but somehow, it got lost… Anyway, I was inspired and intrigued by OpenID, so I set out to find an implementation that comes with an acceptable level of required effort to set up and run.

While the idea of federated authentication sounds nice, the concept gets a bit flawed if everybody uses Google or Stackexchange as their identity provider. Also, you might not really want to provide your very own OpenID for good reasons. Pretty much as with email, which is why you could make use of mailinator, yopmail, or others.

There is a list of server software on the OpenID page, but none of them really looked like low effort. I wouldn’t want to install Django or any other web framework. But I’d go with a bad Python solution before even looking at PHP.

There is an “official” OpenID example server which is not WSGI aware and thus requires more effort than I am willing to invest. Anyway, I took an existing OpenID server and adapted it such that anyone could log in. Always. When developing and deploying, I noticed that mod_wsgi’s support for virtualenv is really bad. For example, the PYTHONPATH cannot be inside Apache’s VirtualHosts declaration and you thus need a custom WSGI file which hard codes the Python version. It appears that there is also no helper on the Python level to “load” a virtual env. Weird.

woid server in action

Anyway, you can now enjoy OpenID by providing as your identity provider. The service will happily tell anyone that any ID is valid. So you can log in as any name you want. A bit like mailinator for OpenID.

To test whether the OpenID provider actually works, you can download the example consumer and start it.

Three Years and Counting!

Making a quick pit stop to mark this milestone in my professional career: today is my 3-year anniversary at Red Hat! Time has certainly flown by and I really cannot believe that it has been three years since I joined this company.

I know it is sort of cliche to say “I can not believe that it has been this long…” and so on and so forth, but it is so true. Back then I joined a relatively new project with very high ambitions, and the first few months had me swimming way out in the deepest part of the pool, trying to learn all ‘Red Hat-things’ and Clojure for the existing automation framework (now we are fully using Python).

I did a lot of swimming for sure, and through the next months, through many long days and weekends and hard work, tears and sweat (you know, your typical life for a Quality Engineer worth his/her salt), I succeeded in adding and wearing many types of hats, going from a Senior Quality Engineer, to a Supervisor of the team, to eventually becoming the Manager for a couple of teams, spread over 4 different countries. Am I bragging? Maaaybe a little bit :) but my point is really to highlight a major key factor that made this rapid ascension path possible: Red Hat’s work philosophy and culture of rewarding those who work hard and truly embrace the company! Sure, I worked really hard, but I have worked just as hard before in previous places and gotten nowhere really fast! Being recognized and rewarded for your hard work is something new to me, and I owe a great debt of gratitude to those who took the time to acknowledge my efforts and allowed me room to grow within this company!

The best part of being a Red Hatter for 3 years? Being surrounded by an enormous pool of talented, exciting people who not only enjoy what they do, but are always willing to teach you something new, and/or to drop what they’re working on to lend you a helping hand! There is not a single day that I don’t learn something new, and thankfully I don’t see any sign of this trend stopping :) Have I mentioned that I love my teammates too? What a great bunch of guys!!! Getting up early in the morning and walking to my home office (yeah, they let me work remotely too) day in, day out, is never a drag because I just know that there are new things to learn and new adventures and ‘achievements to unlock’ right around the corner.

I am Red Hat!!!

Mahalo for removing your shoes

The local custom in Kauai is for you to remove your shoes before coming into the house. Because I have an injured toe, I had been ignoring this.

Apparently the Kauai deities decided I should be taught a lesson.


December 03, 2014

Retro 0.1 RC

During last GUADEC, I had a chance to briefly present my project of having a powerful yet simple video game manager and player for GNOME. To make it a reality, a lot of work was needed on the backend side.

This article presents the release of the first version of this backend, in its release candidate form.


Libretro is a C/C++ API used mainly by retro video game console emulators and game engines. Writing an emulator and writing a GUI application require very different skills. Libretro isolates the backends (often called modules or cores) that implement the API from the frontends that use the API to manipulate them, which eases the porting of emulators and engines and offers application developers a multiplicity of cores to choose from.

The main frontend of Libretro is RetroArch, and it has been ported across multiple systems.


Retro (or retro-gobject) is a GObject based Libretro wrapping library written in Vala. It eases the creation of Libretro frontends by using OOP and automatic memory management (thanks to Vala and GObject); it allows the creation of Libretro frontends in languages other than C/C++ via GObject Introspection; and it allows multiple Libretro cores to be loaded at the same time, through the use of a global variable to store the calling core's identity, and through file copies to avoid global variable collisions in already used modules.

Retro's API is not to be considered stable yet, but it is very close to being so and shouldn't change much. Some objects are marked with a temporary internal visibility until they are tested and polished and can be part of the public API.

But most importantly, what Retro isn't is a fork of Libretro. All it does is to wrap it in a nice object-oriented layer and it is fully compatible with modules implementing Libretro.

Retro's git repository can be found here:

Retro version 0.1 RC can be found here:


Core is the most important class of Retro. It represents the functionalities associated with a Libretro module and allows you to run it.
A Core allows you to get information about its module, to set interfaces to handle the module's callbacks, and most importantly, to run it.


The Video interface is used by a Core to ask the frontend to render video and to set details about how the video must be rendered, such as the source video's pixel format.


The Audio interface is used by a Core to ask the frontend to play audio.


The Input interface is used by a Core to ask the frontend about the state of the input devices, such as gamepads, keyboards and mice.

Other interfaces

Lots of other interfaces can be set by the frontend or by the Core to communicate.
Some examples are the Log interface, to let the Core write log messages, to the standard output or a file for example; the DiskControl interface, to let the frontend control a Core's virtual disk drive (for example, to swap disks on a CD based game console emulator); or the Variables interface to allow the frontend to know about the Core's special options and to set them.


RetroGtk is a library written in Vala which links Retro, in order to run Libretro modules, and Gtk.
It offers Gtk widgets implementing various Retro interfaces, making it easy to display a Core's video, to forward keyboard and mouse events to it, and to mimic gamepads with a keyboard.

RetroGtk also contains non-Gtk related objects, like loops to help running a Core, an interface to forward a Core's log to a FileStream, and an interface to help manage a Core's variables.

You can use Clutter and ClutterGtk or Cairo to render the video, and I hope that the inclusion of a Gl widget in Gtk+ 3.16 will make it possible to get rid of the dependency on Clutter.
It currently can also play a Core's audio via PulseAudio, but it may be moved to some "RetroPa" library.
Proper joystick support is still to come, certainly in a separate library (RetroJs?).

RetroGtk's git repository can be found here:

Libretro module collection

Being able to run, control and render Libretro implementations is of no use if you have no such implementation to run, control and render.
To help solve this, I started collecting GPL compatible implementations in a git repository, adding a makefile to ease compiling and installing them.

It is still in its infancy but it already helped me quite a lot.

The machines currently emulated by the collection are:
  • Nintendo Entertainment System
  • Super Nintendo Entertainment System
  • Game Boy Advance
  • Saturn 
  • PC Engine
  • DOS

PlayStation support is to come, but an annoying bug in the emulator has to be solved first.

You can find this collection here:



To test all of this, I wrote a demo application, which uses Retro, RetroGtk and the module collection to run games.
It allows you to open several game files (see the modules in the collection to know what can be run) and to set the Core's options. It supports keyboard and mouse input, and can use the keyboard as a virtual gamepad. You can set the virtual gamepad's configuration via a button.

Theme Park running in DOSBox

Virtual gamepad configuration when running PC Genjin 2 in Mednafen

Nestopia's options

bSNES running Yoshi's Island, showing the "ungrab the pointer" message

Here is the demo's git repository:


The ultimate goal of this project is to write a video game manager and player for GNOME, with a user interface similar to GNOME Music or GNOME Video. This project is similar to OpenEmu for MacOS X systems.

New tablet UI for Firefox on Android

The new tablet UI for Firefox on Android is now available on Nightly and, soon, Aurora! Here’s a quick overview of the design goals, development process, and implementation.

Design & Goals

Our main goal with the new tablet UI was to simplify the interaction with tabs—read Yuan Wang’s blog post for more context on the design process.

In 36, we focused on getting a solid foundation in place with the core UI changes. It features a brand new tab strip that allows you to create, remove and switch tabs with a single tap, just like on Firefox on desktop.

The toolbar got revamped with a cleaner layout and simpler state changes.

Furthermore, the fullscreen tab panel—accessible from the toolbar—gives you a nice visual overview of your tabs and sets the stage for more advanced features around tab management in future releases.

Development process

At Mozilla, we traditionally work on big features in a separate branch to avoid disruptions in our 6-week development cycles. But that means we don’t get feedback until the feature lands in mozilla-central.

We took a slightly different approach in this project. It was a bit like replacing parts of an airplane while it’s flying.

We first worked on the necessary changes to allow the app to have parallel UI implementations in a separate branch. We then merged the new code to mozilla-central and did most of the UI development there.

This approach enabled us to get early feedback in Nightly before the UI was considered feature-complete.


In order to develop the new UI directly in mozilla-central, we had to come up with a way to run either the old or the new tablet UIs in the same build.

We broke up our UI code behind interfaces with multiple concrete implementations for each target UI, used view factories to dynamically instantiate parts of the UI, prefixed overlapping resources, and more.
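As a rough illustration of the interface-plus-factory idea, here is a minimal Python sketch. It is not the actual Firefox for Android code (which is Java), and every name in it (TabStrip, OldTabletTabStrip, NewTabletTabStrip, create_tab_strip, use_new_tablet_ui) is hypothetical.

```python
from abc import ABC, abstractmethod


class TabStrip(ABC):
    """Interface the rest of the app codes against (hypothetical)."""

    @abstractmethod
    def describe(self) -> str:
        ...


class OldTabletTabStrip(TabStrip):
    def describe(self) -> str:
        return "old tablet tab strip"


class NewTabletTabStrip(TabStrip):
    def describe(self) -> str:
        return "new tablet tab strip"


def create_tab_strip(use_new_tablet_ui: bool) -> TabStrip:
    """Factory: pick the concrete implementation at runtime."""
    if use_new_tablet_ui:
        return NewTabletTabStrip()
    return OldTabletTabStrip()
```

Callers only ever see the TabStrip interface, so both implementations can ship in the same build and the choice becomes a runtime switch.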

The new tab strip uses the latest stable release of TwoWayView which got a bunch of important bug fixes and a couple of new features such as smooth scroll to position.

Besides improving Firefox’s UX on Android tablets, the new UI lays the groundwork for some cool new features. This is not a final release yet and we’ll be landing bug fixes until 36 is out next year. But you can try it now in our Nightly builds. Let us know what you think!

Help Fund Open-Wash-Free Zones

Recently, I was forwarded an email from an executive at a 501(c)(6) trade association. In answering a question about accepting small donations for an “Open Source” project through their organization, the Trade Association Executive responded: “Accepting [small] donations [from individuals] is possible, but [is] generally not a sustainable way to raise funds for a project based on our experience. It's extremely difficult … to raise any meaningful or reliable amounts.”

I was aghast, but not surprised. The current Zeitgeist of the broader Open Source and Free Software community incubated his disturbing mindset. Our community suffers now from regular and active cooption by for-profit interests. The Trade Association Executive's fundraising claim — which probably even bears true in their subset of the community — shows the primary mechanism of cooption: encourage funding only from a few, big sources so they can slowly but surely dictate project policy.

Today, more revenue than ever goes to the development of code released under licenses that respect software freedom. That belabored sentence contains the key subtlety: most Free Software communities are not receiving more funding than before, in fact, they're probably receiving less. Instead, Open Source became a fad, and now it's “cool” for for-profit companies to release code, or channel funds through some trade associations to get the code they want written and released. This problem is actually much worse than traditional open-washing. I'd call this for-profit cooption its own subtle open-washing: picking a seemingly acceptable license for the software, but “engineering” the “community” as a proxy group controlled by for-profit interests.

This cooption phenomenon leaves the community-oriented efforts of Free Software charities underfunded and (quite often) under attack. These same companies that fund plenty of Open Source development also often oppose copyleft. Meanwhile, the majority of Free Software projects that predate the “Open Source Boom” didn't rise to worldwide fame and discover a funding bonanza. Such less famous projects still struggle financially for the very basics. For example, I participate in email threads nearly every day with Conservancy member projects who are just trying to figure out how to fund developers to a conference to give a talk about their project.

Thus, a sad kernel of truth hides in the Trade Association Executive's otherwise inaccurate statement: big corporate donations buy influence, and a few of our traditionally community-oriented Free Software projects have been “bought” in various ways with this influx of cash. The trade associations seek to facilitate more of this. Unless we change our behavior, the larger Open Source and Free Software community may soon look much like the political system in the USA: where a few lobbyist-like organizations control the key decision-making through funding. In such a structure, who will stand up for those developers who prefer copyleft? Who will make sure individual developers receive the organizational infrastructure they need? In short, who will put the needs of individual developers and users ahead of for-profit companies?

Become a Conservancy Supporter!

The answer is simple: non-profit 501(c)(3) charities in our community. These organizations are required by IRS regulation to pass a public support test, which means they must seek large portions of their revenue from individuals in the general public and not receive too much from any small group of sources. Our society charges these organizations with the difficult but attainable tasks of (a) answering to the general public, and never for-profit corporate donors, and (b) funding the organization via mechanisms appropriate to that charge. The best part is that you, the individual, have the strongest say in reaching those goals.

Those who favor for-profit corporate control of “Open Source” projects will always insist that Free Software initiatives and plans just cannot be funded effectively via small, individual donations. Please, for the sake of software freedom, help us prove them wrong. There's even an easy way that you can do that. For just $10 a month, you can join the Conservancy Supporter program. You can help Conservancy stand up for Free Software projects who seek to keep project control in the hands of developers and users.

Of course, I realize you might not like my work at Conservancy. If you don't, then give to the FSF instead. If you like neither Conservancy nor the FSF, then give to the GNOME Foundation. Just pick the 501(c)(3) non-profit charity in the Free Software community that you like best and donate. The future of software freedom depends on it.

Link pack #01

Following the lead of my dear friend Daniel and his fantastic and addictive “Summing up” series, here’s a link pack of recent stuff I read around the web.

Link pack is definitely a terrible name, but I’m working on it.

How to Silence Negative Thinking
On how to avoid the pitfall of being a Negatron and not an Optimist Prime. You might be your own worst enemy and you might not even know it:

Psychologists use the term “automatic negative thoughts” to describe the ideas that pop into our heads uninvited, like burglars, and leave behind a mess of uncomfortable emotions. In the 1960s, one of the founders of cognitive therapy, Aaron Beck, concluded that ANTs sabotage our best self, and lead to a vicious circle of misery: creating a general mindset that is variously unhappy or anxious or angry (take your pick) and which is (therefore) all the more likely to generate new ANTs. We get stuck in the same old neural pathways, having the same negative thoughts again and again.

Meet Harlem’s ‘Official’ Street Photographer
A man goes around Harlem with his camera, looking to give instead of taking. Makes you think about your approach to people and photography, things can be simpler. Kinda like Humans of New York, but in Harlem. And grittier, and on film —but as touching, or more:

“I tell people that my camera is a healing mechanism,” Allah says. “Let me photograph it and take it away from you.”

What Happens When We Let Industry and Government Collect All the Data They Want
Why “having nothing to hide” is not about the now, but about the later. It’s not that someone is going to judge you for pushing every detail of your life to Twitter and Instagram, it’s just that something you do might be illegal a few years later:

There was a time when it was essentially illegal to be gay. There was a time when it was legal to own people—and illegal for them to run away. Sometimes, society gets it wrong. And it’s not just nameless bureaucrats; it’s men like Thomas Jefferson. When that happens, strong privacy protections—including collection controls that let people pick who gets their data, and when—allow the persecuted and unpopular to survive.

The Sex-Abuse Scandal Plaguing USA Swimming
Abusive coaches and a bullying culture in sports training are the perfect storm for damaging children. And it’s amazing the extent to which a corporation or institution is willing to look the other way, as long as they save face. Very long piece, but intriguing to read.

What Cities Would Look Like if Lit Only by the Stars
Thierry Cohen goes around the world and builds beautiful and realistic composite images of how big cities would look if lit only by stars. The original page has some more cities: Villes éteintes (Darkened Cities).

On Muppets & Merchandise: How Jim Henson Turned His Art into a Business
Lessons from how Jim Henson managed to juggle both art and business without selling out for the wrong reasons. Really interesting, and reminds you to put Henson in perspective as a very smart man who managed to convince everyone to give him money for playing with muppets. The linked video on How the Muppet Show is Made is also cool. Made me curious enough to get the book.

Barbie, Remixed: I (really!) can be a computer engineer
Mattel launched the most misguided book about empowering Barbie to be anything but a computer engineer in a book about being a computer engineer. The internet did not disappoint and fixed the problem within hours. There’s now even an app for that (includes user submitted pages).

December 02, 2014

there are no good constant-time data structures

Imagine you have a web site that people can access via a password. No user name, just a password. There are a number of valid passwords for your service. Determining whether a password is in that set is security-sensitive: if a user has a valid password then they get access to some secret information; otherwise the site emits a 404. How do you determine whether a password is valid?

The go-to solution for this kind of problem for most programmers is a hash table. A hash table is a set of key-value associations, and its nice property is that looking up a value for a key is quick, because it doesn't have to check against each mapping in the set.

Hash tables are commonly implemented as an array of buckets, where each bucket holds a chain. If the bucket array is 32 elements long, for example, then keys whose hash is H are looked for in bucket H mod 32. The chain contains the key-value pairs in a linked list. Looking up a key traverses the list to find the first pair whose key equals the given key; if no pair matches, then the lookup fails.
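As a rough illustration of the bucket-and-chain layout described above, here is a minimal sketch in Python. The class and method names are made up for this example, and a Python list stands in for the linked list; it is not how any particular production hash table is written.

```python
class ChainedHashTable:
    """Toy bucket-and-chain hash table (illustrative only)."""

    def __init__(self, n_buckets=32):
        # Each bucket holds a chain of (key, value) pairs.
        self.buckets = [[] for _ in range(n_buckets)]

    def _chain(self, key):
        # Keys whose hash is H are looked for in bucket H mod n_buckets.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                chain[i] = (key, value)  # replace an existing mapping
                return
        chain.append((key, value))

    def lookup(self, key):
        # Walk the chain and return the first pair whose key matches.
        # Note the early return: longer chains and later matches take longer.
        for k, v in self._chain(key):
            if k == key:
                return v
        raise KeyError(key)
```

The early return in lookup and the per-key equality test are exactly the two places where timing can vary.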

Unfortunately, storing passwords in a normal hash table is not a great idea. The problem isn't so much in the hash function (the hash in H = hash(K)) as in the equality function; usually the equality function doesn't run in constant time. Attackers can detect differences in response times according to when the "not-equal" decision is made, and use that to break your passwords.
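To make the leak concrete, here is a small Python sketch. The helper names naive_equal and bytes_examined are hypothetical; hmac.compare_digest is the standard library's comparison that is designed not to short-circuit on the first mismatch.

```python
import hmac


def naive_equal(a: bytes, b: bytes) -> bool:
    # Typical equality: bail out at the first differing byte.
    if len(a) != len(b):
        return False
    for x, y in zip(a, b):
        if x != y:
            return False
    return True


def bytes_examined(secret: bytes, guess: bytes) -> int:
    # How many byte pairs naive_equal looks at before deciding.
    # This count is what an attacker estimates from response times.
    count = 0
    for x, y in zip(secret, guess):
        count += 1
        if x != y:
            break
    return count


def constant_time_equal(a: bytes, b: bytes) -> bool:
    # Examines the inputs without stopping at the first mismatch.
    return hmac.compare_digest(a, b)
```

A guess that gets the first two bytes right makes naive_equal examine three byte pairs instead of one, and that difference is what lets an attacker recover a secret one byte at a time.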

Edit: Some people are getting confused by my use of the term "password". Really I meant something more like "secret token", for example a session identifier in a cookie. I thought using the word "password" would be a useful simplification but it also adds historical baggage of password quality, key derivation functions, value of passwords as an attack target for reuse on other sites, etc. Mea culpa.

So let's say you ensure that your hash table uses a constant-time string comparator, to protect against the hackers. You're safe! Or not! Because not all chains have the same length, "interested parties" can use lookup timings to distinguish chain lookups that take 2 comparisons from those that take 1, for example. In general they will be able to determine the percentage of buckets for each chain length, and given the granularity will probably be able to determine the number of buckets as well (if that's not a secret).

Well, as we all know, small timing differences still leak sensitive information and can lead to complete compromise. So we look for a data structure that takes the same number of algorithmic steps to look up a value. For example, bisection over a sorted array of size SIZE will take ceil(log2(SIZE)) steps to find the value, independent of what the key is and also independent of what is in the set. At each step, we compare the key and a "mid-point" value to see which is bigger, and recurse on one of the halves.
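One way to get a step count that really is fixed is to shrink the search window by ceil(length/2) on every iteration, whichever way the comparison goes. The sketch below, in Python with a hypothetical function name, does that and also returns the number of steps so the claim can be checked. It only fixes the number of algorithmic steps; the comparisons and memory accesses inside each step are not constant-time.

```python
def fixed_step_search(sorted_items, key):
    """Bisect a sorted list in exactly ceil(log2(len)) steps.

    Returns (found, steps). The step count depends only on the
    size of the list, never on the key or on the contents.
    """
    n = len(sorted_items)
    if n == 0:
        return False, 0
    steps = (n - 1).bit_length()  # equals ceil(log2(n)) for n >= 1
    base, length = 0, n
    for _ in range(steps):
        half = length // 2
        # Compare the key with the mid-point and keep one half.
        if sorted_items[base + half] <= key:
            base += half
        # The window always shrinks to ceil(length / 2), so the
        # loop runs the same number of times for every key.
        length -= half
    return sorted_items[base] == key, steps
```

For a five-element list every lookup, hit or miss, takes three steps.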

One problem is, I don't know of a nice constant-time comparison algorithm for (say) 160-bit values. (The "passwords" I am thinking of are randomly generated by the server, and can be as long as I want them to be.) I would appreciate any pointers to such a constant-time less-than algorithm. However a bigger problem is that the time it takes to access memory is not constant; accessing element 0 of the sorted array might take more or less time than accessing element 10. In algorithms we typically model access on a more abstract level, but in hardware there's a complicated parallel and concurrent protocol of low-level memory that takes a non-deterministic time for any given access. "Hot" (more recently accessed) memory is faster to read than "cold" memory.

Non-deterministic memory access leaks timing information, and in the case of binary search the result is disaster: the attacker can literally bisect the actual values of all of the passwords in your set, by observing timing differences. The worst!

You could get around this by ordering "passwords" not by their actual values but by their cryptographic hashes (e.g. by their SHA256 values). This would force the attacker to bisect not over the space of password values but of the space of hash values, which would protect actual password values from the attacker. You still leak some timing information about which paths are "hot" and which are "cold", but you don't expose actual passwords.
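A minimal sketch of that idea in Python, using hashlib and bisect from the standard library. The names build_index and contains are made up for the example, and the final comparison uses hmac.compare_digest as one reasonable choice rather than a requirement.

```python
import bisect
import hashlib
import hmac


def build_index(tokens):
    # Store and sort the SHA-256 digests, not the tokens themselves.
    return sorted(hashlib.sha256(t).digest() for t in tokens)


def contains(index, candidate):
    # Bisect over hash space: timing differences now reveal
    # something about digests, not about the token values.
    digest = hashlib.sha256(candidate).digest()
    i = bisect.bisect_left(index, digest)
    return i < len(index) and hmac.compare_digest(index[i], digest)
```

An attacker who can steer the bisection would have to choose inputs by their SHA-256 values, which is the hard direction for a cryptographic hash.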

It turns out that, as far as I am aware, it is impossible to design a key-value map on common hardware that runs in constant time and is sublinear in the number of entries in the map. As Zooko put it, running in constant time means that the best case and the worst case run in the same amount of time. Of course this is false for bucket-and-chain hash tables, but it's false for binary search as well, as "hot" memory access is faster than "cold" access. The only plausible constant-time operation on a data structure would visit each element of the set in the same order each time. All constant-time operations on data structures are linear in the size of the data structure. Thems the breaks! All you can do is account for the leak in your models, as we did above when ordering values by their hash and not their normal sort order.
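The linear alternative can be sketched in a few lines of Python. The function name is hypothetical, and Python itself gives no hard timing guarantees, so treat this as an illustration of the access pattern rather than a hardened implementation.

```python
import hmac


def linear_scan_contains(tokens, candidate):
    # Visit every element, in the same order, on every lookup,
    # and never exit early, so the memory access pattern does
    # not depend on the candidate.
    found = 0
    for t in tokens:
        found |= hmac.compare_digest(t, candidate)
    return bool(found)
```

The cost is one comparison per stored token on every lookup, which is the linear price described above.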

Once you have resigned yourself to leaking some bits of the password via timing, you would be fine using normal hash tables as well -- just use a cryptographic hashing function and a constant-time equality function and you're good. No constant-time less-than operator need be invented. You leak something on the order of log2(COUNT) bits via timing, where COUNT is the number of passwords, but since that's behind a hash you can't use it to bisect on actual key values. Of course, you have to ensure that the hash table isn't storing values in sorted order and short-cutting early. This sort of detail isn't usually part of the contract of stock hash table implementations, so you probably still need to build your own.

Edit: People keep mentioning Cuckoo hashing for some reason, despite the fact that it's not a good open-hashing technique in general (Robin Hood hashes with linear probing are better). Thing is, any operation on a data structure that does not touch all of the memory in the data structure in exactly the same order regardless of input leaks cache timing information. That's the whole point of this article!

An alternative is to encode your data structure differently, for example for the "key" to itself contain the value, signed by some private key only known to the server. But this approach is limited by network capacity and the appropriateness of copying for the data in question. It's not appropriate for photos, for example, as they are just too big.
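Here is a minimal sketch of such a self-contained token in Python. It swaps in an HMAC (a symmetric message authentication code keyed with a server-only secret) for the signature, which is an assumption of this example, not something the text above prescribes. The function names and the value-then-tag layout are likewise made up.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of a SHA-256 HMAC tag in bytes


def make_token(server_key: bytes, value: bytes) -> bytes:
    # The token carries the value itself plus a tag that only
    # the holder of server_key can compute.
    tag = hmac.new(server_key, value, hashlib.sha256).digest()
    return value + tag


def read_token(server_key: bytes, token: bytes):
    # Returns the embedded value if the tag checks out, else None.
    if len(token) < TAG_LEN:
        return None
    value, tag = token[:-TAG_LEN], token[-TAG_LEN:]
    expected = hmac.new(server_key, value, hashlib.sha256).digest()
    if hmac.compare_digest(tag, expected):
        return value
    return None
```

With this encoding the server needs no lookup table at all: it verifies one tag per request, and the work does not depend on how many tokens have been issued.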

Edit: Forcing constant-time on the data structure via sleep() or similar calls is not a good mitigation. This either massively slows down your throughput, or leaks information via side channels. Remote attackers can measure throughput instead of latency to determine how long an operation takes.

Corrections appreciated from my knowledgeable readers :) I was quite disappointed when I realized that there were no good constant-time data structures and would be happy to be proven wrong. Thanks to Darius Bacon, Zooko Wilcox-O'Hearn, Jan Lehnardt, and Paul Khuong on Twitter for their insights; all mistakes are mine.

Free-riding and copyleft in cultural commons like Flickr

Flickr recently started selling prints of Creative Commons Attribution-Share Alike photos without sharing any of the revenue with the original photographers. When people were surprised, Flickr said “if you don’t want commercial use, switch the photo to CC non-commercial”.

This seems to have mostly caused two reactions:

  1. “This is horrible! Creative Commons is horrible!”
  2. “Commercial reuse is explicitly part of the license; I don’t understand the anger.”

I think it makes sense to examine some of the assumptions those users (and many license authors) may have had, and what that tells us about license choice and design going forward.

Free ride!!, by Dhinakaran Gajavarathan, under CC BY 2.0

Free riding is why we share-alike…

As I’ve explained before here, a major reason why people choose copyleft/share-alike licenses is to prevent free rider problems: they are OK with you using their thing, but they want the license to nudge (or push) you in the direction of sharing back/collaborating with them in the future. To quote Elinor Ostrom, who won a Nobel for her research on how commons are managed in the wild, “[i]n all recorded, long surviving, self-organized resource governance regimes, participants invest resources in monitoring the actions of each other so as to reduce the probability of free riding.” (emphasis added)

… but share-alike is not always enough

Copyleft is one of our mechanisms for this in our commons, but it isn’t enough. I think experience in free/open/libre software shows that free rider problems are best prevented when three conditions are present:

  • The work being created is genuinely collaborative — i.e., many authors who contribute similarly to the work. This reduces the cost of free riding to any one author. It also makes it more understandable/tolerable when a re-user fails to compensate specific authors, since there is so much practical difficulty for even a good-faith reuser to evaluate who should get paid and contact them.
  • There is a long-term cost to not contributing back to the parent project. In the case of Linux and many large software projects, this long-term cost is about maintenance and security: if you’re not working with upstream, you’re not going to get the benefit of new fixes, and will pay a cost in backporting security fixes.
  • The license triggers share-alike obligations for common use cases. The copyleft doesn’t need to perfectly capture all use cases. But if at least some high-profile use cases require sharing back, that helps discipline other users by making them think more carefully about their obligations (both legal and social/organizational).

Alternately, you may be able to avoid damage from free rider problems by taking the Apache/BSD approach: genuinely, deeply educating contributors, before they contribute, that they should only contribute if they are OK with a high level of free riding. It is hard to see how this can work in a situation like Flickr’s, because contributors don’t have extensive community contact.1

The most important takeaway from this list is that if you want to prevent free riding in a community-production project, the license can’t do all the work itself — other frictions that somewhat slow reuse should be present. (In fact, my first draft of this list didn’t mention the license at all — just the first two points.)

Flickr is practically designed for free riding

Flickr fails on all the points I’ve listed above — it has no frictions that might discourage free riding.

  • The community doesn’t collaborate on the works. This makes the selling a deeply personal, “expensive” thing for any author who sees their photo for sale. It is very easy for each of them to find their specific materials being reused, and see a specific price being charged by Yahoo that they’d like to see a slice of.
  • There is no cost to re-users who don’t contribute back to the author—the photo will never develop security problems, or get less useful with time.
  • The share-alike doesn’t kick in for virtually any reuses, encouraging Yahoo to look at the relationship as a purely legal one, and encouraging them to forget about the other relationships they have with Flickr users.
  • There is no community education about the expectations for commercial use, so many people don’t fully understand the licenses they’re using.

So what does this mean?

This has already gone on too long, but a quick thought: what this suggests is that if you have a community dedicated to creating a cultural commons, it needs some features that discourage free riding — and critically, mere copyleft licensing might not be good enough, because of the nature of most production of commons of cultural works. In Flickr’s case, maybe this should simply have included not doing this, or making some sort of financial arrangement despite what was legally permissible; for other communities and other circumstances other solutions to the free-rider problem may make sense too.

And I think this argues for consideration of non-commercial licenses in some circumstances as well. This doesn’t make non-commercial licenses more palatable, but since commercial free riding is typically people’s biggest concern, and other tools may not be available, it is entirely possible it should be considered more seriously than free and open source software dogma might have you believe.

  1. It is open to discussion, I think, whether this works in Wikimedia Commons, and how it can be scaled as Commons grows.

December 01, 2014

.NET Foundation: Advisory Council

Do you know of someone who would like to participate in the .NET Foundation, as part of the .NET Foundation Advisory Council?

Check the discussion on the role of the Advisory Council.

Network clock examples

Way back in 2006, Andy Wingo wrote some small scripts for GStreamer 0.10 to demonstrate what was (back then) a fairly new feature in GStreamer – the ability to share a clock across the network and use it to synchronise playback of content across different machines.

Since GStreamer 1.x has been out for over 2 years, and we get a lot of questions about how to use the network clock functionality, it’s a good time for an update. I’ve ported the simple examples to the new API and to the gobject-introspection based Python bindings, and put them up on my server.

To give it a try, fetch and onto 2 or more computers with GStreamer 1 installed. You need a media file accessible via some URI to all machines, so they have something to play.

Then, on one machine run, passing a URI for it to play and a port to publish the clock on:

./ http://server/path/to/file 8554

The script will print out a command line like so:

Start slave as: python ./ http://server/path/to/file [IP] 8554 1071152650838999

On another machine(s), run the printed command, substituting the IP address of the machine running the master script.

After a moment or two, the slaved machine should start playing the file in sync with the master:

Network Synchronised Playback

If they’re not in sync, check that you have the port you chose open for UDP traffic so the clock synchronisation packets can be transferred.

This basic technique is the core of my Aurena home media player system, which builds on top of the network clock mechanism to provide file serving and a simple shuffle playlist.

For anyone still interested in GStreamer 0.10 – Andy’s old scripts can be found on his server: and