24 hours a day, 7 days a week, 365 days per year...

July 30, 2014

Compiler Structure

This post is a follow up from my previous post. Here I will talk about parts of a compiler. A Compiler usually contains following parts:

  • Lexical Analyser (Lexer)
  • Syntax Analyser (Parser)
  • Semantic Analyser
  • Symbol Table
  • Intermediate Code Generator
  • Code Generator
  • Code Optimizer

The Source Code goes through the these phases and is finally converted to Target Language (which can be Machine Language or Byte Code).


Let us have a short talk about each one of them.

Lexical Analysis: This is the phase when the source code is converted into a stream of Tokens. These tokens are meaningful character strings. They would have some predefined meaning in the syntax of a language. For example, in C “int” could be converted to a Token called “Integer Type”. This token will then be used in the subsequent phases of the compiler which tells them that “int” describes “Integer Type”.

Syntax Analysis: This is the phase when the Source Code is analysed and converted into its Parse Tree or Syntax Tree. If any error is found in this phase then it is reported to the user. This phase takes the stream of Tokens from Lexical Analyser. Syntax Analyser is actually an Automata (which is a type of Turing Machine). It is created on the basis of the Syntax of Language. Syntax Analyser for C and of C++ are different. Most of the Syntax Analyser creates an Abstract Syntax Tree which actually depicts the structure of the source code.

Semantic Analysis: In this phase Compiler checks for Semantic errors of a language. Semantic errors mainly consists of type errors and whether variable has been declared or not. It makes a heavy use of Symbol Table. For every variable it encounters it gets variable’s type in Symbol Table and check with the language rules. It also performs Type Conversions based on the language rules. For example, the expression “1.5 + 2″ will give a Float but “2” is an Integer. In this phase compiler will convert “2” to Float and apply float addition operator.

Symbol Table: This part of compiler is responsible for storing all the Type Information about all variables or functions. For example, if you have declared your variable named “foo” as an integer then Symbol Table will store it. This part is used by all the stages of compilers. It is mainly created at the Syntax Analysis stage.

Intermediate Code: Intermediate Code may or may not be present in a Compiler structure. It depends on the Compiler Designers whether they want the Intermediate Code to be generated or not. Intermediate Code can be of many forms like Abstract Syntax Tree, Three Address Code, Static Single Assignment etc.

Code Optimizer: Code Optimizers mainly work on Intermediate Code. They apply many algorithms for optimizing the code. Similar to Intermediate Code, Code Optimization is not a necessary part of compiler. Even many compilers like GCC or LLVM implements a command line switch to turn off/on and set level of Code Optimization.

Code Generation: Code Generation is the final step of Compiler. At this step the target language code is generated. It can be Byte Code or x86 Assembly Code.

In the next post I will start with Lexical Analysis phase of compiler.

Cheers :D

gedit 3 for Windows


It has been a while since the last release of gedit for windows. After some work I am proud to announce that a new version of gedit is available. This version can be downloaded from here.

This new version features the latest unstable version of gedit 3.13.4 with all the unstable versions of the libraries we depend on (GTK+, Glib, atk, gtksourceview… etc). If the word “unstable” does not take you out the idea of trying it, please give it a try and report any issues that you may find. Right now it is known that the Start menu link is not working properly and that some of the python plugins crash the application.

On the next post I will explain how you can create the installer for gedit by yourself, how you can build gedit easily on windows and how you can actually do to build for windows your GTK application or library in a few steps.


Evince Hackfest

The Evince hackfest took place last week from 23rd to 25th July in Strasbourg. Yes, 3 days only, but very productive in my opinion, I’ll summarize all the cool stuff we worked on.


This work was initially started by Owen, and then Germán kept the patches up to date with evince git master. I reviewed all the pending patches and updated the thumbnails one and the result is that evince doesn’t look blurry on HiDPI screens any more.

Evince running with GDK_SCALE=2

Evince running with GDK_SCALE=2

Recent View

This was a GSoC project of 2013, but the patch provided by the student was never in an “upstreamable” state. Again Germán, who always loved this feature, took care of the patch addressing my review comments. At the beginning of the hackfest most of the work has already been done, we only needed a few more review iterations during the hackfest to finally push this feature to master. The idea is to show the list of recent documents as an icon view with thumbnails and documents metadata. This view is loaded when evince is launched without any document replacing the useless empty window we had before. It also replaces the recent documents submenu in the gear menu.

Evince Recent View

UI improvements

The move to the header bar recently made the toolbar look a bit cluttered, mainly because the title might use a lot of space. We discussed several ideas to improve the header bar and implemented some of them:

Evince header bar improvements


Juanjo Marín also wrote a patch to change the default zoom mode to “Automatic”, since several people commented that the current “Fit Width” mode doesn’t look good in screens with higher resolutions. The patch is still waiting review.


Giselle and Anuj, our GSoc students this year, worked on their projects to improve the annotations support in both Evince and poppler.

    • Anuj wrote some patches to add support for Free Text annotations to poppler glib API. After a couple of review iterations and discussions about the API, the patches are now in bugzilla waiting for a final review (I hope to find the time soon)
    • Giselle was focused on adding support for highlight annotations to Evince, since poppler already has all the required API for this. The patches are not yet ready, but they look really promising.


Caret navigation and accessibility

Joanie and API continued improving the evince a11y support and fixing some remaining issues from the FoG project. Antía fought with the caret navigation implementation again to implement some missing key bindings and fixing other issues.

Comics backend

Juanjo Marín focused on the comics backend, working on a patch to use libarchive to uncompress the documents instead of spawning external command line tools.


I started to review the gestures branch during the hackfest, patches looked clean and simple, but since I was not familiarized with the new GTK+ touch API and I didn’t have a touch screen to try it out either, I decided to wait after the hackfest and see it in action in garnacho’s laptop during GUADEC. Carlos explained to me how the touch API works in GTK+ and I could check it actually works great. The code doesn’t affect the normal use with the mouse, so the branch will be merged in master soon.

Evince hackfest dinner

And of course not everything was hacking


Many thanks to Alexandre Franke for the local organization, everything worked perfectly. Of course thanks to the GNOME Foundation for sponsoring the GSoC students, Giselle and Anuj, and Igalia for sponsoring all the Igalians attending the hackfest. Thanks also to Epitech for allowing us to do the hackfest there before the GUADEC.

Igalia S.L. GNOME FoundationEPITECH

Fewer auth dialogs for Print Settings

The latest version of system-config-printer adds a new button to the main screen: Unlock. This is a GtkLockButton attached to the “all-edit” polkit permission for cups-pk-helper.

The idea is to make it work a bit more like the Printing screen in the GNOME Settings application. To make any changes you need to click Unlock first, and this fetches all the permissions you need in order to make any changes.

Screenshot from 2014-07-30 10:20:43

This is a change from the previous way of working. Before this, all user interface elements that made changes were available immediately, and permission was sought for each change required (adding a printer, fetching a list of devices, etc).

Screenshot from 2014-07-30 10:20:55

Hopefully it will now be a little easier to use, as only one authentication dialog will generally be needed rather than several as before.

Exceptions are:

  • the dialog for reading/changing the firewall (when there are no network devices discovered but the user is trying to add a network queue), and
  • the dialog for adjusting server settings such as whether job data is preserved to allow for reprinting

These are covered by separate polkit permissions and I think a LockButton can only be responsible for one permission.

For those missing my talk on Builder

Long story short, I'm leaving MongoDB in September to work on Builder full time for a year. This is a pretty scary thing for me to do. I assume I'll start freaking out about lack of income in about a month :-)

July 29, 2014

LibreOffice under the hood: progress to 4.3.0

Today we release LibreOffice 4.3.0, packed with a load of new features for people to enjoy - you can read and enjoy all the great news about the user visible features from so many hardy developers, but there are of course also some contributors whose work is primarily behind the scenes in places that are not so easy to see. These are of course still vitally important to the project. It can be hard to extract those from the over fourteen thousand commits since LibreOffice 4.2 was branched, so let me expand:

User Interface Dialog / Layout

The UI migration to Glade based layout of VCL widgets is finally approaching the home straight; more than two hundred dialogs were converted this release; leaving the final dialogs rather hard to find - help appreciated. Many thanks to Caolán McNamara (Red Hat) - for his incredible work here, and also Szymon Kłos, Michal Siedlaczek, Olivier Hallot (EDX), Andras Timar (Collabora), Jan Holesovsky (Collabora), Katarina Behrens, Thomas Arnhold, Maxim Monastirsky, Manal Alhassoun, Palenik Mihály, and many others ... Thanks also to our translators who helped in the migration of strings.

Graph of progress in UI layout conversion

If you'd like to get involved in driving this to 100%, checkout Caolan's howto and his great blog: 99 to go update (now only 65) illustrated by this:

Build improvements

We've improved a lot this cycle in terms of buildability, and ease of comprehension - important for new contributors.

Visual Studio support

Not only did Jesus Corrius add initial support for Visual Studio 2013, but we had a major win from Honza Havlíček who (building on Bjoern Michaelsen (Canonical)'s similar KDevelop work) implemented building a Visual Studio project file - allowing much improved build / debugging support video or just: make vs2012-ide-integration.

OpenGL as a run-time dependency

In the past when we needed an OpenGL code-path we would link a separate shared library to OpenGL and then dynamically load that component - as for the OpenGL slideshow. In 4.3 we unified all of our OpenGL code to use glew and now have a central VCL API for initializing and linking in OpenGL, making it much easier to use in future. Another benefit of using glew is the ability to check for certain extensions at run-time dynamically to better adapt to your platform's capabilities rather than having to work vs. a baseline.

Pre-compiled-headers / PCH updates

Thomas Arhnold discovered that our pch files (used for accelerating windows building) had bit-rotted, and did a fine cleanup sweep across them. That significantly reduced build time for a number of modules.

Graph of compile-time speedup from improving pre-compiled headers

Mobile code-size reduction

A lot of work was put into LibreOffice 4.3 to allow us to shrink the code to fit a mobile footprint nicely. Thanks to Matus Kukan (Collabora) for splitting a large number of UNO components into individual factory functions - to allow the linker to garbage collect un-used components. Matus also created a python script solenv/bin/ to share the building of lists of components to statically link in for various combinations of functionality. Tor Lillqvist (Collabora) did some re-work on ICU to package the rather large data tables as a file instead of code. Vincent Saunders (Collabora) worked away to improve dwarfprofile to identify larger pieces of object file and where they came from. Jan Holesovsky de-coupled lots of accessibility code, and removed lots of static variables dragging in un-needed code. Miklos Vajna turned OOXML custom shape preset definitions (oox::drawingml::CustomShapeProperties::PresetsMap) from generated code to generated data: that allowed removal of 50k lines of code. Thanks to Tsahi Glik / CloudOn for funding this work.

Code quality work

There has been a lot of work on code quality and improving the maintainability and cleanliness of the code. Another 75 or so commits to fix cppcheck errors are thanks to Julien Nabet, along with the huge scad of daily commits to build without any compile warnings -Werror -Wall -Wextra on every platform with thanks primarily to Tor Lillqvist (Collabora), Caolán McNamara (Red Hat), and Thomas Arnhold.

Assert usage

Another tool that developers use to ensure they do not introduce new bugs is assertions; historically the OOo code base has had custom assertion facilities that can easily be ignored, and so most developers did just that; thanks to Stephan Bergmann (Red Hat), we have started to use the standard assert() macros in LibreOffice, which have the important advantage that they actually abort the program: if an assertion fails, developers see a crash that is rather harder to ignore than some text printed on the terminal. Thanks to all who asserted the truth.

Graph of number of run-time assertions
Rocking Coverity

We have been chewing through the huge amount of analysis from the Coverity Scan, well - in particular Caolán McNamara (Red Hat) has done an awesome job here; his blog on that is typically modest.

We now have a defect density of 0.08 - meaning 8x bugs in every 100,000 lines of code found by static checking. This compares rather favourably with the average open source project of this size which has 65 per 100,000 lines. Perhaps the most useful thing here is Coverity's report on new issues - many of which are rather more serious than the last few, lowest priority un-triaged reports.

This was achieved by 2679 commits, 88% of them from Caolán, and then Norbert Thiebaud, Miklos Vajna (Collabora), Noel Grandin, Stephan Bergmann (RedHat), Chris Sherlock, David Tardon (RedHat), Thomas Arnhold, Steve Yin (IBM), Kohei Yoshida (Collabora), Jan Holesovsky (Collabora), Eike Rathke (RedHat), Markus Mohrhard (Collabora) and Julien Nabet

Import and now export testing

Markus Mohrhard's great import/export crash testing has been expanded to 55,000+ problem/bug documents, now covering the PDF importer, and our crash and validation problem counts continue to drop. Markus also re-wrote and simplified the test script in python to make it simpler; however we routinely suffer from this test (running for 5 days and consuming a beefy machine) locking up Linux of several distributons, kernel versions, on both virtual and real hardware; which has a negative impact on usefulness.

Re-factoring big objects

In some cases LibreOffice has classes that seem to do 'everything' and include the kitchen sink too. Thanks to Valentin Kettner, Michael Stahl (RedHat) and Bjoern Michaelsen (Canonical) for helping to re-factor these. As an example SwDoc (a writer document) now inherits from only nine classes instead of nineteen, and the header file shrunk by more than three hundred lines.

Valgrind fixes

Valgrind continued to be a wonderful tool for finding and isolating leaks, and poor behavior of various bits of code - although normal code-paths are by now rather valgrind clean. Dave Richards from Largo very kindly donated us some CPU time on his new 80x CPU Linux machine to burn it in. We used that to run Markus' import/export testing under valgrind, and found and fixed a number of issues. valgrind logs here. We would be most happy to help others with their boxes in need of load testing.

Address / Leak Sanitizer

There are some great new ways of doing (compile time) code sanitisation, and thanks to Stephan Bergmann (RedHat) we're using them enthusiastically -fsanitize is available for Clang and gcc 4.9. It lets us do memory checking (like valgrind) but with visibility into stack corruption, and to do that very significantly faster. Some details on -fsanitize for libreoffice are available. Lots of leaks and badness have been fixed using the tool, thanks too to Markus Mohrhard, and Caolan McNamara.

Unit testing

We also built and executed more unit tests with LibreOffice 4.3 to avoid regressions as we change the code. Grepping for CPPUNIT_TEST() and CPPUNIT_ASSERT as last time we continued the trend of growth here:

Graph of number of unit tests and assertions
Our ideal is that every bug that is fixed gets a unit test to stop it ever recurring. With 1100 commits, and over eighty committers to the unit tests in 4.3 it is hard to list everyone involved here, apologies for that; what follows is a sorted list of those with over 20x commits to the qa/ directories: Miklos Vajna (Collabora), Kohei Yoshida (Collabora), Caolán McNamara (RedHat), Stephan Bergmann (RedHat), Jacobo Aragunde Pérez (Igalia), Tomaž Vajngerl (Collabora), Markus Mohrhard (Collabora), Zolnai Tamás (Collabora), Tor Lillqvist (Collabora), Michael Stahl (RedHat), Alexander Wilms


Traditionally C++ has allowed significant ambiguity in overriding methods, allowing the 'virtual' keyword to be ommitted in overrides, and also allowing accidentally polymorphic overrides. To prepare for the new C++ standard here we've annotated all of our virtual methods that are overridden in sub-classes with the SAL_OVERRIDE macro, to ensure that we are building our vtables correctly. Many thanks to Noel Grandin, and Stephan Bergmann (RedHat) for building a clang plugin to help to build annotation here with another to verify that the result stays consistent. That fixed several long-standing bugs. As a bonus when you read the code it is much easier to find the base virtual method declaration: it's the one that is not marked with SAL_OVERRIDE.

QA / bugzilla

This release the QA team has grown, and done some amazing work both triaging bugs, and also closing them, getting us back well under the totemic one thousand un-triaged bug barrier. Currently ~750 un-confirmed which is the lowest in over two years. Thanks to everyone for their great work there, sadly it is rather hard to extract credits for confirming bugs, but the respective hero list overlaps with the non-developer / top closers listed below.

We also had one of our best bug-hunting weekends ever around 4.3 see Joel Madero's write-up. The QA team are also doing excellent job with our bibisect git repositories to isolate regressions to small blocks of commits - which makes life significantly easier for developers.

One metric we watch in the ESC call is who is in the top ten in the freedesktop Weekly bug summary. Here is a list of the top twenty people who have appeared most frequently in the weekly list of top ten bug closers in order of frequency of appearance: Jorendc, Kohei Yoshida (Collabora), Maxim Monastirsky, tommy27, Joel Madero, Caolán McNamara (RedHat), Foss, Jay Philips, m.a.riosv, Julien Nabet, Sophie Gautier (TDF), Cor Nouws, Michael Stahl (RedHat), Jean-Baptiste Faure, Andras Timar (Collabora), Adolfo Jayme, ign_christian, Markus Mohrhard (Collabora), Eike Rathke (RedHat), Urmas. And thanks to the many others that helped to close so many bugs for this release.

Bjoern Michaelsen (Canonical) also write up a nice taxonomy of our twenty five thousand reported bugs so far, and provided the data for this nice breakdown:

Graph of bug stats over the development of 4.3

Code cleanup

Code that is dirty should be cleaned up - so we did a lot of that.

The final death of UniString

While we killed our last tools/ string class in 4.2 and switched to clean, uniform OUStrings everywhere - we were still using some 16bit quantities to describe text offsets elsewhere. Thanks to Caolán McNamara (Red Hat) for finally enabling writer to have >64k paragraphs - a long requested feature by a certain type of user, see the related blogpost.

VCL code / structure cleanup

The Visual Class Libraries - the LibreOffice native toolkit has not been given the love it deserves in recent years. Many thanks to Chris Sherlock for several hundred commits - starting to cleanup VCL. That involves lots of good things - giving the code a more logical structure so it is easy to find methods; systematically writing doxygen documentation for API methods, ensuring that API methods have sensible, descriptive names and starting to unwind some poor legacy design decisions; much appreciated.

Ongoing German Comment redux

We continued to make some progress on translating our last lingering German comments across the codebase to good, crisp technical English. Many thanks to Luc Castermans, Sven Wehner, Christian M. Heller, Philipp Weissenbacher, Stefan Ring, Philipp Riemer, Tobias Mueller, Chris Sherlock, Alexander Wilms and others. We also reduced the number of false positives and accelerated the bin/find-german-comments tool in this cycle.

Graph of remaining lines of German comment to translate
Automated code re-factoring using Clang

One hero of code cleaning is Noel Grandin who is constantly improving the code in many ways; eg. writing out un-necessary duplicate code to use standard wrappers such as SimpleReferenceObject. Noel has been heavily involved in Clang plugins to re-write a lot of our error prone binary file format / stream overrides pStream >> nVar seems like a great idea until you realise that an unexpected change to the type of nVar far away tweaks the file format. These operators are now all re-written to explicit ReadFloat type methods enhancing the robustness of the code to changes. Noel also created plugins to inline simple member functions, detect inefficient passing of uno::Sequence, and OUString. Stephan Bergmann (RedHat) also wrote a number of advanced linting tools, checks for de-referencing NULL pointers, quickly catching inlining problems on Linux that cause most grief on Windows, and re-writing un-necessary uses of sal_Bool to bool. Stephan also wrote a plugin to find unused functions and unused functions in templates, as well as warning on illicit conversions of literal to bool e.g. if (n == KIND_FOO || KIND_BAR). All of this improves the readability, consistency, reliability and in some cases performance of the code.

Improving lifecycle

Takeshi Abe invested lots of time this cycle in improving our often unhelpful object lifecycle. Using smart pointers not only makes the code more readable and often shorter, but also exception safe which is very useful.

DocTok cleanup

This cleanup saved nearly 80k lines of code and make the codebase much simpler to understand thanks to Miklos Vajna (Collabora) you can see the before & after pictures in his blog.

Holding the line on performance

Performance is one of those hard things to keep solid. It has an alarming habit of bit-rotting when your back is turned. That's why Matus Kukan (Collabora) has built a test machine that routinely builds LibreOffice and runs a suite of document loads, conversions etc. under callgrind. Using callgrind's simulated CPU has the beautiful property of ensuring repeatable behaviour, and thus making any small reduction or improvement in performance noticeable and fixable. It is easy to see that in a graph - admire the crisp flatness of the graph between significant events. The X axis is time (annotating the axis with git hashes is not so photogenic).

Graph of various documents performance

Often we only check performance just before a release, its interesting to see here the big orange hump from a performance fragility found and fixed as a direct result of these tests. Raw callgrind data is made available for trivial examination of the latest traces along with a flat ODS of the previous runs.

Getting involved

I hope you get the idea that more developers continue to find a home at LibreOffice and work together to complete some rather significant work both under the hood, and also on the surface. If you want to get involved there are plenty of great people to meet and work alongside. As you can see individuals make a huge impact to the diversity of LibreOffice (the colour legends on the right should be read left to right, top to bottom, which maps to top down in the chart):

Graph showing individual code committers per month

And also in terms of diversity of code commits, we love to see the unaffiliated volunteers contribution by volume, though clearly the volume and balance changes with the season, release cycle, and volunteers vacation / business plans:

Graph of number of commits per month by affiliation

Naturally we maintain a list of small, bite-sized tasks which you can use to get involved at our Easy Hacks page, with simple build / setup instructions. It is extremely easy to build LibreOffice, each easy-hack should have code pointers and be a nicely self contained task that is easy to solve. In addition some of them are really nice-to-have features or performance improvements. Please do consider getting stuck in with something.

Graph of progress closing easy hacks over time

Another thing that really helps is running pre-release builds and reporting bugs just grab and install a pre-release and you're ready to contribute alongside the rest of the development team.


LibreOffice 4.3 is the next in a series of releases that incrementally improve not only the features, but also the foundation of the Free Software office suite. Please be patient, it is just the first in a long series of monthly 4.3.x releases which will bring a stream of bug fixes and quality improvements over the next months as we start working in earnest on LibreOffice 4.4.

I hope you enjoy LibreOffice 4.3.0, thanks for reading, and thank you for supporting LibreOffice.

Raw data for many of the above graphs is available.

GUADEC 2014 talk notes

I put the notes of the GSK talk I gave at GUADEC 2014 online; I believe there should be a video coming soon as well.

the notes are available on this very website.

Builder Talk

You can find the slides for my Builder talk here.

Introduction to Compilers

Hey all,

I have been trying to create my own compiler for a simple language like C (by language I mean it don’t contains many construct, sure programming correctly in C is not easy :P) but I couldn’t find any good tutorial which go through every thing about creating compilers. And I don’t wanted to go through a 1000 pages Dragon Book (well who would :o). I found some other books too like Writing Compilers and Interpreters: A Software Engineering Approach, 3rd Edition but it concentrates on Top Down parsers instead of Bottom Up Parsers (don’t worry I will tell you what they mean). So, finally (Believe me it required so much of motivation) I decided to go through Dragon Book. And I was able to create a working compiler for a subset of C language for x86. The source code could be found here: CCompiler – GitHub. But I don’t think you would want to go through the source code because 1) The project is difficult to understand and 2) The way I wrote the code it is even more difficult to read it (After all it was just a test project).

Now, I am writing a new language named Leopard (I love Leopards :) ). Its source code will be good for sure. You can find more about the code here Leopard – GitHub. It will run on a Virtual Machine and is Garbage Collected and JITed. In this and the following blogs I will write about how to write Compilers, all the problems I faced while writing the Compiler, Assembler, Virtual Machine etc. I will tell you a solution for them. I didn’t use any Compiler Writers like Lex or Yacc. Instead everything has been written by hand (that is the beauty of Compilers, it is hard isn’t it :D). Compiler and Assembler has been written in C#, Mono and Virtual Machine in Java (yes, I want to write it in C# but I want to try my hands over Java too, lets see if Java is any better than C# :P).

Pre-requisites for Compilers are so many. You must know Theory of Computation, many Data Structures, any Assembly language (I would prefer x86) and many others. But no one would want to learn all these right? (It is human nature isn’t it? Who wants to study). So, I will tell you about all these things. These are things I am going to write about:

  1. Introduction to Compilers, Interpreters, Virtual Machines
  2. Automata, Languages, Grammars.
  3. Lexer
  4. Parser
  5. Semantic Analyser
  6. Intermediate Code Generation
  7. Code Generation
  8. Assembler
  9. Virtual Machine
  10. Garbage Collection
  11. JIT

I know its too much, but if you want to write your own compiler you have to do that. So, at the end of this blog series you would have the knowledge of how to develop a Compiler, Assembler, Virtual Machine with JIT and Garbage Collection. We will not use LLVM or any other compiler infrastructure project instead we will start with scratch.

Lets start with Introduction to Compilers.

It is said that there are two types of languages in this world.

  • Compiled
  • Interpreted

Compiled Languages compiles directly to the machine code. C, C++, Objective-C are examples of compiled languages. The code written in C is compiled directly to machine code. This is the typical work flow of a program written in compiled languages.compile

Source files contains the code, Compiler compiles it into Object File. These object files are then linked to different libraries (if required) and an executable file is produced. Now this executable file contains only binary code. This binary code can be executed directly by the processor. Compiled code gives us the best speed as it is directly executed by processor. But we do not have any control over such compiled code. It can be malicious and if executed by the processor it could lead to havoc.

Interpreted Languages do not compiles to machine code in one go. Rather a program known as Interpreter will read a line of source code, checks if it is correct and then executes it. If it founds an error then the program will be stopped by the Interpreter. Examples of such languages are BASIC, Shell Scripting, Perl, Python, Ruby. Unlike Compiler, processor doesn’t executes the code directly. A line is first converted to code and then executed by the processor. Interpreted Languages gives worst speed.  But, we have a control over such code. Interpreter knows which code is going to be executed and it could check whether that code could do some malicious operations or not. If it does then it will stop it right there.

As you can see, we could have good security but on the cost of speed and good speed on the cost of security. In 1970s, LISP was introduced. It was the first language to bring something in between of Compiled and Interpreted Languages. Later with Java this concept became very popular. Here is the basic idea about it.

The source code is compiled into Byte Code. This Byte Code is then executed by a Virtual Machine.


This is the work flow of a Java Program. The execution of Byte Code by Virtual Machine can be done is two ways:

  • Interpreting the Byte Code: Virtual Machine will behave similarly as an Interpreter to Byte Code. It will read each instruction an execute it. The speed of a program executed now is much better than Interpreted Languages but less than Compiled Languages.
  • Just In Time Compilation: Instead of read and converting each instruction of Byte Code one at a time, Virtual Machine can compile some amount of Byte Code (may be one function or one compound statement block) instruction into machine code and save the compiled code for future use. Whenever it founds that the Byte Code instructions to be compiled have already been compiled it will reload the saved compiled code. This increases the speed significantly as compared to “Interpreting the Byte Code”. It is even seen that JIT has surpassed the performance of C/C++ Compiled code. This is because at runtime JIT has information about the current processor being used and it can then apply some optimizations specific to that processor. But Optimized C/C++ Code is faster than JIT. When we will be developing JIT we will talk more about it.

Not only Java but today many languages follow this approach. It includes C#, VB.NET, C++/CLI, Python, Ruby.

Next time we will see a block diagram of Compiler and how it works.

Cheers :D

GNOME Beers @ LinuxCon + August ChicagoLUG Meeting

We are planning a GNOME beers event at LinuxCon North America. We're planning to meet at Howells and Hood at 8 pm on Thursday, August 2.  If you are going to be in town you can RSVP on the Chicago LUG's meetup page or you can put your name down on the Wiki page. Once we have an idea of how many people will be around we'll make a reservation :)

The ChicagoLUG will also be hosting a meeting at FreeGeek on Saturday, August 23rd at 2pm. If you'll be around after LinuxCon ends and you want to give a talk let us know :) or just come by and hang out.

See you then, and happy hacking.

July 28, 2014

A talk in 9 images

My talk at GUADEC this year was about GTK+ dialogs. The first half of the talk consisted of a comparison of dialogs in GTK+ 2, in GTK+ 3 under gnome-shell and in GTK+ 3 under xfwm4 (as an example of an environment that does not favor client-side decorations).

The main take-away here should be that in 3.14, all GTK+ dialogs will again have traditional decorations if that is what works best in the environment it is used in.

About DialogsPreference DialogsFile ChoosersMessage DialogsError DialogsPrint DialogsFont DialogsColor DialogsAction DialogsThe second part of my talk was discussing best practices for dealing with various issues that can come up with custom GTK+ dialogs; I’ve summarized the main points in this HowDoI page.

2014-07-28: Monday

  • Up early; mini team meeting, update, tried to deal with the most urgent E-mail; off to Carls for lunch with Robert. Back, continued to wade through the mail. Built ESC bug stats, sent some mail. Slugged with the family in the evening.

Pre commit hook for PO files on

After a little pestering by André I’ve made the following changes regarding PO file checking on

  1. PO files are actually checked using msgfmt
  2. Added a simple check to ensure keyword header in .desktop is properly translated

That PO files weren’t checked for syntax issues was pretty surprising for me, it seems no translator ever uploaded a PO file with a wrong syntax, else I assume the sysadmin team would’ve received a bugreport/ticket.

The check for a properly translated keyword header is implemented using sed:

sed -rn '/^#: [^\n]+\.desktop/{:start /\nmsgstr/!{N;b start};/\nmsgid [^\n]+;"/{/\nmsgstr [^\n]+;"/!p}}'

Or in English: match from /^#: .*\.desktop/ until /\nmsgstr/. Then if there is a msgid ending with ";, check if there also is a msgstr ending with ;". If msgstr doesn’t end with ;" but all other conditions apply: print the buffer (so #: line up to msgstr). The git hook itself just looks if the output is empty or not.

Example of the error message.

$ git push
Counting objects: 23, done.
Delta compression using up to 2 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 313 bytes | 0 bytes/s, done.
Total 3 (delta 2), reused 0 (delta 0)
remote: ---
remote: The following translation (.po) file should ensure the translated content ends with a semicolon. (When updating branch 'master'.)
remote: po/sl.po
remote: The following part of the file fails to do this. Please correct the translation and try to push again.
remote: #: ../data/
remote: msgid "folder;manager;explore;disk;filesystem;"
remote: msgstr "mapa;upravljalnik;datoteke;raziskovalec;datotečni sistem;disk"
remote: After making fixes, modify your commit to include them, by doing:
remote: git add sl.po
remote: git commit --amend
remote: If you have any further problems or questions, please contact the GNOME Translation Project mailing list <>. Thank you.
remote: ---
To ssh://
 ! [remote rejected] master -> master (pre-receive hook declined)
error: failed to push some refs to 'ssh://'

Note: explanation can probably be improved. The check is not just for Keywords, also for other things like e.g. mime types, etc.

Word Crimes

I have just one thing to add to Weird Al’s Word Crimes video: I want a Biohazard Bin too. (At 3:02.)


Well hello there, dear Internet.

So yes, it’s been quite some time since the last blogpost. A lot has been going on, but that will be for future blogposts.

Currently at GUADEC 2014 in Strasbourg, France. Lots of interesting talks and people around.

Quite a few discussions with several people regarding future GStreamer improvements (1.4.0 is freshly out, we need to prepare 1.6 already). I’ll most likely be concentrating on the whole QA and CI side of things (more builds, make them clearer, make them do more), plan how to do nightly/weekly releases (yes, finally, I kid you not !). We also have a plan to make it faster/easier/possible for end-users (non-technical ones) to get GStreamer in their hands. More on that soon (unless Sebastian blogs about it first).

If you want to come hack on GStreamer and Pitivi, or discuss with the various contributors, there’s a Hackfest taking place at GUADEC from Wednesday to Friday. More info at

GTG ! Past can not be changed

Hi everybody,

Sorry for the late update, was a little busy discussing about the implementation stuffs .

Anyways let me give you a brief  idea of what I have been able to implement and what’s changed so far .

Now we are following ” Past can not be changed ” rule for handling the edit event.

So What’s changed?

Initially we were storing all the recurring details with each instances, but storing all is useless once task gets overdue or completed as recurring details never gonna used .

So we have changed the little bit implementation stuffs like this :

If task gets overdue or completed then in that case we make the task as a normal task and next instance will be recurring . so eventually we are removing all recurring details when we are making it as a normal task except the rid attribute to keep track of instances which gets created.

How to handle the edit event ?

Following the rule “ Past can not be changed ” , we are providing the edit options ( edit current instance or all instances )  to only the task having the recurring pattern. There is a one “Hidden” task gets created when user open a task for editing, which is used to check that task is actually edited or not.

All instances : -

Doing a simple check that task is actually edited then only reflect the changes in the current instances and according to that future instances gets created .

Current instance only : -

Make the current instance as a normal task and make the Hidden task as active task.

With this, a majority of the work for my GSoC project is complete. Now I will be focusing on optimizing and tidy up my source code .

Progress in introspection for libical

A quick catchup, to add the support for introspection for libical, I create a description file for each structure and its APIs and then, use a generator program to automatically generate the introspectable library.

These two weeks, I continue to work on the generator and the rest of the description files. When I wrote the description files, I found that just like the source code, the description files themselves also include a lot of similar stuff which can be generated by some rules. So I decide to further cut the weight of the description files. If this project can be applied for some other non-introspectable libraries, the hard coding part will be minimized. 

In the next two weeks, the plan is:

1. Finish all the description files.

2. Set up the test environment and write tests for the generated codes.

3. Adjust the structure of this project so that the whole structure can be more clear and the parsing part can be put in one place so that it is easier to be debugged.

Have a wonderful and productive week!

Vale Peter Miller

Sad to receive news this morning that a long time friend and colleague, Peter Miller, had passed.

Peter Miller

“After fighting cancer for many years, finally lost”. No, not lost; if there was ever anyone who fought the battle of life and won it was be Peter. Even knowing he was at his last days he was unbowed. Visiting him last week he proudly showed us the woodworking plans and cut lists for some cabinets he was making for his wife MT. He had created the diagrams himself, writing C++ code to call manually drive a drawing library, outputting postscript. Let’s see you do architectural drawing without a CAD program. The date on the printout was two weeks ago.

“The world is a less interesting place today,” wrote another friend. No. Peter firmly believed that interest comes from within. The world is there to be explored, I can hear him saying. He taught us to go forth, wonder, and understand. And so we should.


July 27, 2014

Five or More tournament

Hey folks,

Ben Asselstine is running a short 2-week Five or More tournament starting on July 31st. He has created a special version of Five or More that posts scores to a scoreboard at  It’s all free-software, and it’s free to join.  There’s no prize for winning, and you don’t even have to sign up with an email address.  Once the tournament is over the scoreboard will stop accepting scores.

For full details, see

Details on the Five or More game are on the wiki page -

July 26, 2014

Month View for GTG’s Calendar Plugin

These past couple weeks I implemented a new view for the calendar plugin: the Month View.

Until now, the plugin only displayed a single week. Now the user can switch between views, using the drop-down menu, and select between ‘Week’ or ‘Month’.

This new view has very similar functionalities to the previous one, allowing a task to be dragged/dropped to change its dates, double clicking on it to change its content, and also allowing the creation and deletion of tasks.

Here is a preview of how things are going with the month view, even though it’s not working a 100% yet:month-view

I hope to finish the implementation and fix the issues with this new view soon, in order to be able to integrate it to GTG as well.

July 25, 2014

Not going to GUADEC

For the first time since I first started attending GUADEC in 2009, this year I am not going.
Strasbourg is very close to where I live (about four hours driving) but some factors just make it very difficult to attend the conference this year. We have a guest over and I didn’t want to go by myself because, even though Olivia is a very relaxed baby, I still notice how I am more tired than usual and it would be risky to drive all alone (besides, it’d leave all the responsibility of taking care of the baby to Helena). Taking everybody there could be an option but I could only go for the weekend anyway and, since we’re going for vacation a week after the event, logistics and timing are just not convenient.

I will surely miss having a beer with old friends.

Have a great GUADEC!


GSoC 2014 report 2

What an eventful month! Lots of code have been read, produced, reviewed and refused, several code designs have been proposed, but as everything was work in progress, I didn't fell it was exciting enough to blog about it at that time.

But here it is! The blog post that explains it all.

Multiple displays per machine

I mostly spent the third week trying to have multiple displays for a SPICE machine, which implied reading a lot virt-manager's code and some of spice-gtk's code.

I achieved to get the maximum number of displays available for a machine and, with Christophe Fergeau's help, to get some good results: we were able to have multiple displays each in its own window, but we were not able to open them up from Boxes yet (the session had to have them already opened).

Multiples machines

Jakub Steiner proposed a design focusing mainly on having multiple machines running side by side in different windows:
It settled what idea would be implemented, and a new bug report have been created to reflect that.

The window singleton

The main problem that have been encountered is that Boxes uses a static window instance that is accessed directly by a huge number of classes.

It is very problematic, as if you summon a second instance of this window's class, any action in the second window will have its reactions produced not on itself but on the static window instance, which is conceivably problematic who you want to implement a multi window feature.

Zeeshan proposed not to change the app's design to allow having multiple windows but to have multiple one window instances communicating via D-Bus.
It have its pros, as it would make Boxes more resistant to crashes, to slowdowns and to be be naturally more parallelized, but the design would most certainly have ended up being a gigantic mishmash.

This idea have been abandoned in favor of a code refactoring.

Getting rid of the window singleton

In Boxes' design, the app's controller is scattered over all the classes.
In my opinion it is problematic:
  • you can't use a class in any different way without changing the class' code
  • changing a simple behaviour may require modifying lots of classes
  • you can't have a good overview of the application's behaviour  which may confuse newcomers
I spent several weeks refactoring the app, trying to move the controller closer to the root of the composition tree by using signals to make the leaf classes dumber but more modular.
It worked preety well and Lasse, which reviewed my patches, and I  were preety proud of the newly produced code and design.

At the time when the code started to take shape, unfortunately, Zeeshan had harware failures which made him not able to check out our work.
Because of that, we knew quite lately that he preferred the code to stay as close as possible from the actual design, changing as less code as possible in the process.
It is pretty wise, as he haves to maintain the application, and so I started over with these considerations.

So far the last set of refactoring patches are well received, all they need is a little bit of polish to fully shine!

What's next

It's now the time of GUADEC, here I'll be able to chat with my colleagues and to meet other gnomes!
If everything go as planned, the patches will be merged during this period, and I'll be able to actually implement the multi window feature after that!

Maps BoF

I'm going to GUADEC again this year. Actually I should probably be packing since my flight leaves for Strasbourg in 4h. But first a quick announcement.

I'd like to announce my intention to hold a Maps BoF this GUADEC.
Maps is shaping up and after we've gotten the work from this years GSoC projects merged we are hopefully starting to look like a real world usable application.
Now would be a good time to look over what we've done and discuss where we're going with the larger GNOME community.
We want input and feedback on design, code and more. And we want help from you guys to decide what kind of an application we want Maps to become in the future.

So, if you're staying past the core days at GUADEC this year and are interested in Maps, please add yourself as an attendee here!

Regarding date and time, my suggestion is 14-19 on the 31st of July, but personally I'm flexible.

See you in Strasbourg!

July 24, 2014

L10n & Docs at GUADEC 2014

At this year’s GUADEC, there will be a number of sessions relevant to people interested in GNOME localization and documentation:

  • On Monday, Kat will give an update from the docs team called Documentation: state of the union. Her talk will detail what has happened in the documentation realm since GUADEC 2013, so be sure to attend.
  • Team Reports on Saturday will include localization and documentation.
  • There will be a screenshot automation BoF on July 30th. Vadim Rutkovsky from Red Hat’s Desktop QE has some sweet surprise for you, translators!
  • Finally, on July 30th and 3Ist, Daiki Ueno and Alexandre Franke are planning to organize an i18n hackfest to work on translation tools, spellcheckers, dictionaries, input methods, and related fields.

I’ll be arriving tomorrow evening with Christian Schaller and other desktop people from Red Hat Czech and leaving on the 31st – hope to see you all in Strasbourg!

sponsored-badge-shadow guadec-2014-badge-large


Open Help Conference 2014

This is a belated post on the Open Help Conference in Cincinnati, OH that I had the chance (thanks for sponsoring me, Red Hat!) to attend this year. It took place from June 14-15 at a nice venue provided by Garfield Suites Hotel. The Conference was followed by the GNOME Docs Sprint from June 16-18.

The Open Help Conference is much smaller in attendance than some of the large industry conferences for technical writers out there. This actually allows the attendees to actively participate in many talks and discussions, similarly to what you can usually experience at unconferences. It was the Conference’s primary focus on docs communities that made attending each of the sessions very relevant to those of us who work on open source documentation.

Cincinnati Skyline Cincinnati Skyline chili The venue The hackfest

Along with people representing other open source companies and communities (this included Eric Shepherd from Mozilla or Michael Downey from OpenMRS), there were also two fellow Red Hatters attending (Rich Bowen and David King). We had quite a few people from GNOME Docs, too. The Conference was organized by Shaun McCance who did a fantastic job running the whole event as he found time not only to take care of the venue and catering, but also of the social events on both conference days that took place in his lovely hometown of Cincinnati. Thanks again, Shaun!

You can check #openhelp and Michael Downey’s excellent notes to learn more about the different talks and sessions held at OH.

Open Help Conference 2014 Hackfest

The Open Help Conference 2014 Hackfest followed an unwritten tradition in the GNOME Documentation Project of having two GNOME docs hackfests or sprints annually. Unlike the sprint held earlier this year in Norwich, UK where the team worked mostly on updating the user help for the upcoming GNOME 3.12 release, the Cincinnati hackfest focused on finishing the GNOME System Administration Guide. We managed to completely rework the overall structure of the guide and redesigned the index page for the guide, following the earlier design mockups prepared for GNOME Help by Allan Day.

The restructured System Administration Guide now features the following main groups of topics:

  • User Settings (Lockdown, Pre-seed user defaults, Pre-seed email/calendar/groupware, Appearance, Setup)
  • Login Settings (Appearance, Session, Management)
  • Network (NetworkManager, etc.)
  • Software (Apps, Extensions, Management)
  • Troubleshooting / diagnosis

More details can be found on the Guide planning page.

Other things that caught my attention during the conference:

Duck Pages

Shaun’s plans for the future include an additional input format for Mallard-based documentation – so called Duck pages. A Duck page is essentially a plain text format based on Mallard XML that doesn’t use the often distracting XML syntax. Duck pages should make it easy to author single-sourced topic-based documentation with a Markdown or AsciiDoc-like syntax. Unlike Markdown and others, Duck pages aim to not only allow for quick creation of rich-formatted docs, but also to contain data necessary to integrate the document with the rest of your Mallard-based documents.


Shaun also presented another tool that he has been working on: Blip. It is a web application to monitor documentation projects that use SCM repositories. Some examples include:

Blip lets you not only browse through individual modules in your documentation project, but it also mines data to present information about contributors, their commit or mailing list activity, and much more.

  • An example of a project profile with complete data on branches, commits, authors, included documents, and more: Gnome User Documentation
  • An example of of a user profile with personal stats: Shaun McCance

GSoC Report 4: Stalking your friends with GNOME

More than one month without a GSoC report? Luckily my mentors are so kind to get angry with me ;) . The truth is that I was really busy getting ready for some final exams, but that stuff is finished and I am really motivated to continue.

What was done?

Stalking your friend with GNOME

I started to like the word stalk since I started to play GTA V and got known about the Lifeinvader in-game social network, now this word is coming to GNOME Maps. I was working in a feature to display check-ins from user friends and from the user him/herself. This feature is going to be available only for Foursquare users that have a Foursquare account configured in GNOME Online Accounts. Since I was mostly working in the non-GUI part of the feature I don't have much to show you, just one screenshot:

No! I'm not using maximize and minimize buttons in my environment! that's happening to me with the latest version of Gtk+, dunno why :)

These are markers that Maps will show to represent a check-in (for the current user or for the user itself), obviously, if you press on that marker, a bubble with more information about the check-in will appear, I'm working with Andreas to get a nice mockup for that bubble. Maybe some GNOMErs remember me asking for help in IRC to get a photo with a circle shape, that was what I was trying to do :) .

Maybe you want to get involved in this feature since a have some usability worries that I explain in this Bugzilla comment, any feedback related to this will be appreciated.

Reviews, reviews and more reviews

Now I'm starting to getting some nice feedback from Dario and Rishi (GSoC students working on Maps) about the markers and bubbles I'm coding.

I hope when everything get merged, we will have a really nice maps app for GNOME.

What's next?

Showing check-ins in Maps is the last goal for my GSoC project, so the next tasks is to finish this and work hard to get everything merged before 3.14.


Sadly, I am not going to be in Strasbourg this weekend, not because I couldn't get sponsorship, but due personal stuff. I hope next year to be different, because I really wanted to be there, hacking with you :(

Week 2, 3 and 4: Implementing History View and Revamping the UI of Gnome-Calculator

Hi everybody,

Sorry for the late update on the project, was a little busy writing the code for this project.
Anyways let me give you a brief description of what I have been able to implement so far!
I have added padding to the buttons so that they are aligned in well sepearated rows and columns. We are planning to keep the color of the buttons nearly the same except the buttons for the operators would have a darker shade of grey and the result button would be blue in color. Improving the functionality of Undo buttons has already been implemented by another developer so I just fixed a bug associated with the same. I have also implemented History View which is the most important subtask of this project. The History View stores the expression entered by the user along with the corresponding answer into well defined rows of Gtk.ListBox.You can select and edit the expression in a row by double clicking the corresponding row.
Link to the patch for adding padding to the buttons:

Link to the patch for implementing history view :

Link to image of History View in Gnome-Calculator:

We have more plans for history view and the UI so stay tuned. I would be back with more updates on my project soon!

GUADEC 2014, here I am


I just arrived in Strasbourg where the 2014 edition of the GUADEC will take place.

The city is beautiful and have an interesting architecture full of half-timbered buildings and lush vegetation, which contrasts a lot with my home town of Montpellier.

The GUADEC and my participation to the GSoC for Boxes

During the last weeks, I worked a lot with Christophe Fergeau, Zeeshan Ali and Lasse Schuirmann, this GUADEC will be a great occasion to meet them in person and to enhance our collaboration on Boxes.
The GUADEC will also be a great occasion to work on Boxes by chatting about (and solving?) pending patches and to chat about Boxes' future during a BoF.

I'll also attend Lasse's proposed BoF on pointerless navigation (i.e. keybaord navigation), completing it with my interest on forsaken input devices such as gamepads. Don't hesitate to come and chat with us!

Have a nice trip and see you in Strasbourg!

Just two days to GUADEC

Tomorrow I'm going to leave for Strasbourg and starting the next day I will attend GUADEC '14.
As Gnome intern, I'm going to give a lightning talk on Sunday starting at 5pm (UTC/GMT +2 hours).

I would like to thank the Gnome Foundation for sponsor my accommodation.
Can't wait for meeting developers and having interesting lectures!

I’m also going!

I am going to GUADEC and it will be a first timer. GNOME Foundation has been kind enough to sponsor me and I’d like to thank them for doing so. It will be an adventure for me for sure!

What’s happening

I’ll let most things flow, but there is a few things I have in mind for GUADEC:

    • I’ll be volunteering as runner and session chair on at least Sunday, Monday and probably more.
    • I’ll be participating in Engagement’s Birds of a Feather session.
    • I plan to perform a lighting talk around these promotional videos I tinker with.

Most importantly I’ll be meeting a lot of people, both new and known. It will be interesting, inspiring and motivating. A nice learning opportunity I’d say.


Another year, another GUADEC

It’s 2014, and like previous years:


This time I won’t give any talk, just relax and enjoy talks from others, and hope Strasbourg.

And what is more important, meet those hackers you interact with frequently, and maybe share some beers.

So if you go there, and you want to have a nice chat with me, or talk about Grilo project, don’t hesitate to do it. Igalia, which is kindly sponsoring my attendance, will have a place there during the core days, so likely you could find me around or ask anyone there for me.


Continuous testing and Wayland

The GNOME-Continuous  continuous integration and delivery system has been helping us to keep the quality of the GNOME code base up for a while now.

It is doing a number of things:

  • Builds changed modules
  • Creates vm images that can be downloaded for local testing
  • Smoke-tests the installed image  by verifying that it boots up to the login screen
  • Runs  more than 400 tests against the installed image
  • Launches all the applications that are part of the moduleset and takes screenshots of them

All of this happens after every commit, or at least very close to that, and the results are available at the website.

You can learn more about GNOME-Continuous here.

As a member of the the GNOME release team I am really thankful for this service, it has made our job a lot easier – at release time everything just builds and works most of the time nowadays.

Earlier this year, we’ve made the smoke-testing aspect of gnome-continuous more useful by verifying not just GNOME, but also GNOME Classic and GNOME on Wayland.

Today, we’ve had a minor breakthrough: for the first time, the GNOME/Wayland smoke test succeeded all the way to taking a screenshot of the session.

GNOME/Wayland(of course, GNOME on Wayland looks just like GNOME under X11, so the screenshot is not very exciting by itself).

To get this far, we had to switch to the egl-drm branch of mesa, which adds support for running KMS+DRM GL applications with QXL.

July 23, 2014

Mono Performance Team

For many years a major focus of Mono has been to be compatible-enough with .NET and to support the popular features that developers use.

We have always believed that it is better to be slow and correct than to be fast and wrong.

That said, over the years we have embarked on some multi-year projects to address some of the major performance bottlenecks: from implementing a precise GC and fine tuning it for a number of different workloads to having implemented now four versions of the code generator as well as the LLVM backend for additional speed and things like Mono.SIMD.

But these optimizations have been mostly reactive: we wait for someone to identify or spot a problem, and then we start working on a solution.

We are now taking a proactive approach.

A few months ago, Mark Probst started the new Mono performance team. The goal of the team is to improve the performance of the Mono runtime and treat performance improvements as a feature that is continously being developed, fine-tuned and monitored.

The team is working both on ways to track performance of Mono over time, implemented support for getting better insights into what happens inside the runtime and has implemented several optimizations that have been landing into Mono for the last few months.

We are actively hiring for developers to join the Mono performance team (ideally in San Francisco, where Mark is based).

Most recently, the team added a new and sophisticated new stack for performance counters which allows us to monitor what is happening on the runtime, and we are now able to export to our profiler (a joint effort between our performance team and our feature team and implemented by Ludovic). We also unified both the runtime and user-defined performance counters and will soon be sharing a new profiler UI.

Watch out for DRI3 regressions

DRI3 has plenty of necessary fixes for and Wayland, but it's still young in its integration. It's been integrated in the upcoming Fedora 21, and recently in Arch as well.

If WebKitGTK+ applications hang or become unusably slow when an HTML5 video is supposed to be, you might be hitting this bug.

If Totem crashes on startup, it's likely this problem, reported against cogl for now.

Feel free to add a comment if you see other bugs related to DRI3, or have more information about those.

Update: Wayland is already perfect, and doesn't use DRI3. The "DRI2" structures in Mesa are just that, structures. With Wayland, the DRI2 protocol isn't actually used.

Sandboxed applications for GNOME, part 2

This is the second of two posts on application sandboxing for GNOME. In my previous post, I wrote about why application sandboxing is important, and the many positive impacts it could have. In this post, I’m going to concentrate on the user experience design of application sandboxing.

The problem

As I previously argued, application sandboxing has the potential to greatly improve user experience. Privacy features can make users feel more secure and in control, and performance can be improved, for example. At the same time, there is a risk involved with adding security layers: if a security framework is bothersome, or makes software more difficult to use, then it can degrade the user experience.

Sandboxing could be truly revolutionary, but it will fall flat on its face if no one wants to use it. If application sandboxing is going to be a success, it therefore needs to be designed in such a way that it doesn’t get in the way, and doesn’t add too many hurdles for users. This is one reason why effective design is an essential part of the application sandboxing initiative.

Some principles

Discussion about sandboxed applications has been happening in the GNOME project for some time. As these discussions have progressed, we’ve identified some principles that we want to follow in ensuring that sandboxed application provide a positive user experience. Before I get into the designs themselves, it’s useful to quickly go over these principles.

Avoid contracts

Sandboxed applications have limited access to system API and user data and, if they want more access, they have to ask the user for it. One way to allow apps to ask for these permissions is to present a set of application requirements that the user must agree to at install time. Anyone who uses Android will be familiar with this model.

Asking for blanket user permissions at install time is not something we are in favour of, for a number of reasons:

  • People have a tendency to agree to contracts without reading them, or without understanding their implications.
  • It often isn’t clear why the application wants access to the things it wants access to, nor is it clear when it will use them. There is little feedback about application behaviour.
  • There’s no opportunity to try the app before you decide what it gets access to. This is an issue because, prior to using the application, the user is not in a good position to evaluate what it should get access to.
  • Asking for permissions up front makes software updates difficult, since applications might change what they want access to.

It doesn’t have to look like security

All too often, security requests can be scary and intimating things, and they can feel far removed from what a user is actually trying to do. It doesn’t have to be this way, though: security questions don’t have to be expressed using scary security language. They don’t even have to look like security – the primary purpose of posing a security question is to ascertain that a piece of software is doing what the user wants it to do, and often, you can verify this without the user even realising that they are being asked a question for security purposes.

We can take this principle even further. The moment when you ask a security question can be an opportunity to present useful informations or controls – these moments can become a valuable, useful, and even enjoyable part of the experience.

Privacy is the interesting part

People tend to be interested in privacy far more than security. Generally speaking, they are concerned with who can see them and how they appear, rather than with abstract security threats. Thinking in terms of privacy rather than security therefore helps us to shift the user experience in a more human-orientated direction. It prompts us to think about social and presentational issues. It makes us think about people rather than technology.

Real-time feedback

A key part of any security framework is getting real-time feedback about what kinds of access are occurring at any one time. Providing real-time feedback makes the system much more transparent and, as a result, builds trust and understanding, as well as the opportunity to quickly respond when undesirable access occurs. We want to build this into the design, so that you get immediate feedback about which devices and services are being used by applications.

Audit and revocation

This is another key part of security, and follows on from real-time feedback. A vital area that needs to be addressed is the ability to see which services and data have been accessed in the past and which application accessed them. It should be possible to revoke access to individual services or devices based on your changing needs as a user, so that you have genuine control over what your applications get to see and do.

Key design elements

User-facing mechanisms for applications to request access to services and data are one obvious thing that needs to be designed for sandboxed applications. We also need to design how feedback will be given when services are being accessed, and so on.

At the same time, application sandboxing also requires that we design new, enabling features. By definition, sandboxed applications are isolated and can have limited permissions. This represents a challenge, since an isolated application must still be able to function and, as a part of this, it needs (mediated, secure) mechanisms for basic functions, like importing content, or passing content items to other apps. This is the positive aspect of the sandboxing user experience.


Here, sharing is the kind of functionality that is commonly found on mobile platforms. It is a framework which allows a user to pass content items (images, documents, contacts, etc) from one app to another (or to a system service). This is one of the positive, enabling pieces of functionality that we want to implement around application sandboxing.

Sharing is important for sandboxing because it provides a secure way to pass content between applications. It means that persistent access to the user’s files will be less important for many applications.


The sharing system is envisaged to work like many others – each application can provide a share action, which passes a content item to the sharing service. The system determines which applications and services are capable of receiving the content item, and presents these as a set of choices to the user.

In the example shown above, an image that is being viewed in Photos is being shared. Other applications that can use this image are then listed in a system-provided dialog window. Notice that online accounts are also able to act as share points in this system, as are system services, like Bluetooth, Device Send (using DLNA), or setting the wallpaper.

Content selection

Content selection plays a similar role to sharing, but in reverse: where sharing allows a user to pass content items from a sandboxed application, content selection is a mechanism that allows them to pull them in.

Content selection has traditionally occurred through the file chooser dialog. There are a number of obvious disadvantages with this approach, of course. First, content items have to be files: you can’t select a contact or a note or an appointment through the file chooser. Second, content items have to be local: content from the cloud cannot be selected.

The traditional file chooser isn’t well-suited to sandboxed applications. Sandboxing implies that applications might not be able to save content to a common location on disk: this means that we need a much more flexible content selection framework.


Content selection should enable content from a range of applications to be selected. Content can be filtered by source application, and it can include items that aren’t files and aren’t even stored locally.

System authorisation

Sharing and content selection are intended to provide (system mediated) mechanisms for opening or sending individual content items from sandboxed applications. When access is required to hardware devices (like cameras or microphones), or permanent access is required to the user’s files or data (such as contacts or the calendar), the system needs to check that access is authorised.

For cases like this, there is little option but to present the user with a direct check – the system needs to present a dialog which asks the user whether the application should have access to the things it wants to have access to. The advantage of posing a direct question at the time of access is that it provides real-time feedback about what an application is attempting to do.



In line with the principles I outlined above, we’re pushing to take the sting out of these dialogs, by phrasing them as relatively friendly/useful questions, rather than as scary security warnings. We’re also exploring ways to make them into useful parts of the user experience, as you can see with the camera example: in this case, the security dialog is also an opportunity to check which microphone the user wants to use, as well as to indicate the input level.

A key requirement for the design is that these access request dialogs feel like they are part of your natural workflow – they shouldn’t be too much of an interruption, and they should feel helpful. One technique we’ll need to use here is to restrict when system authorisation dialogs can be shown, since we don’t want them popping up uninvited. It certainly shouldn’t be possible for an application to pop up an access request while you are using a different application.

Real-time feedback, audit and revocation


As I said above, providing real-time feedback about when certain services are being used is one of the goals of the design. Here, we plan to extend the system status area to indicate when cameras, microphones, location and other services and devices are in use by applications.

We also have designs to extend GNOME’s privacy settings, so that you can see which services and content have been accessed by applications. You will be able to restrict access to these different services, and block individual applications from accessing them.

Pulling it all together

One of the things that I’ve tried to demonstrate in this post is that implementing application sandboxing isn’t just about adding security layers to desktop infrastructure. It also requires that we carefully think about what it will be like for people to use these applications, and that security frameworks be designed with user experience in mind.

We need to think beyond security to actually making sandboxing into a positive thing that users and developers want to use. For me, one of the most exciting things about sandxboxing is that it provides the opportunity to add new, powerful features to the GNOME application platform. It can be enabling rather than being a purely restrictive technology.

These designs also show that application sandboxing isn’t just about low-level infrastructure. Work needs to be done across the whole stack in order to make sandboxing a reality. This will require a combined effort that we can all participate in and contribute to. It’s the next step for the Free Software desktop, after all.

The designs that I’ve presented here are in the early stages of development. They will evolve as these initiatives progress, and everyone working in this area will have the opportunity to help develop them with us.

GUADEC 2014 Map

Want a custom map for GUADEC 2014?

Here’s a map I made that shows the venue, the suggested hotels, transit ports (airport/train station), vegetarian & veggie-friendly restaurants, and a few sights that look interesting.

I made this with Google Map Engine, exported to KML, and also changed to GeoJSON and GPX.

If you want an offline map on an Android phone, I suggest opening up the KML file with Maps.Me (proprietary OpenStreeMap-based app, but nice) or the GPX on OSMand (open source and powerful, but really clunky).

You can also use the Google Maps Engine version with Google Maps Engine on your Android phone, but it doesn’t really support offline mode all so well, so it’s frustratingly unreliable at best. (But it does have pretty icons!)

See you at GUADEC!

Making Evince “Behave” & GUADEC!

Evince_logo_newThis is probably going to be the shortest post I will be writing, but then it’s GUADEC time!

My latest component is Evince, the document viewer. I am building up on the same infrastructre as we did for Weather testing, and have put together some basic tests for the application. These can be found HERE

On the lighter side of things, I could do with a baggage zipper (much like a file zipper) at a click to take care of my packing, and mail me and my zipped luggage to save us from the gruesome travel :P

Apart from these, I am helping the team organize Volunteers, and I realize I love organizational tasks for GUADEC. (Should take up more of these in the future too :D)

Anyways, returning to the old way of packing now. (I promise the next blog will be better!)

Quick-start guide to gst-uninstalled for GStreamer 1.x

One of the first tools that you should get if you’re hacking with GStreamer or want to play with the latest version without doing evil things to your system is probably the gst-uninstalled script. It’s the equivalent of Python’s virtualenv for hacking on GStreamer. :)

The documentation around getting this set up is a bit frugal, though, so here’s my attempt to clarify things. I was going to put this on our wiki, but that’s a bit search-engine unfriendly, so probably easiest to just keep it here. The setup I outline below can probably be automated further, and comments/suggestions are welcome.

  • First, get build dependencies for GStreamer core and plugins on your distribution. Commands to do this on some popular distributions follow. This will install a lot of packages, but should mean that you won’t have to play find-the-plugin-dependency for your local build.

    • Fedora: $ sudo yum-builddep gstreamer1-*
    • Debian/Ubuntu: $ sudo apt-get build-dep gstreamer1.0-plugins-{base,good,bad,ugly}
    • Gentoo: having the GStreamer core and plugin packages should suffice
    • Others: drop me a note with the command for your favourite distro, and I’ll add it here
  • Next, check out the code (by default, it will turn up in ~/gst/master)

    • $ curl | sh
    • Ignore the pointers to documentation that you see — they’re currently defunct
  • Now put the gst-uninstalled script somewhere you can get to it easily:

    • $ ln -sf ~/gst/master/gstreamer/scripts/gst-uninstalled ~/bin/gst-master
    • (the -master suffix for the script is important to how the script works)
  • Enter the uninstalled environment:

    • $ ~/bin/gst-master
    • (this puts you in the directory with all the checkouts, and sets up a bunch of environment variables to use your uninstalled setup – check with echo $GST_PLUGIN_PATH)
  • Time to build

    • $ ./gstreamer/scripts/
  • Take it out for a spin

    • $ gst-inspect-1.0 filesrc
    • $ gst-launch-1.0 playbin uri=file:///path/to/some/file
    • $ gst-discoverer-1.0 /path/to/some/file
  • That’s it! Some tips:

    • Remember that you need to run ~/bin/gst-master to enter the environment for each new shell
    • If you start up a GStreamer app from your system in this environment, it will use your uninstalled libraries and plugins
    • You can and should periodically update you tree by rerunning the script
    • To run gdb on gst-launch, you need to do something like:
    • $ libtool --mode=execute gdb --args gstreamer/tools/gst-launch-1.0 videotestsrc ! videoconvert ! xvimagesink
    • I find it useful to run cscope on the top-level tree, and use that for quick code browsing

July 22, 2014


I’m going to GUADEC 2014! My flight would be 16 hours from now, so I’m currently packing my baggage.

I used up a lot of time applying for passport, applying for visa, looking for cheap plane and train tickets, buying things I need for my trip, waiting for the electricity to come back after a typhoon, and lots of other stuff, so I haven’t done much about my Google Summer of Code project. Seems like I have to code a lot during and after GUADEC to catch up :) .

Anyway, just found out yesterday that I have to participate in the lightning talks so I’m currently working on my slides too. I wonder where Yahoo put the announcement e-mail about it because it is not in my inbox nor GSoC folder.

Looking at the Zooniverse code

Recently I’ve been looking over the Zooniverse citizen science project and its  source code on github, partly because it’s interesting as a user and partly because I thought writing an Android app for Galaxy Zoo would be a good learning exercise and something useful to open source.

So far my Android app can’t do more than show images, but I thought I’d write up some notes already. I hesitate to implement the Android App further because the classification decision tree is so tied up in the web site’s code, as I describe below.

Hopefully this won’t feel like an attack from a clueless outsider. I’m just a science enthusiast who happens to have spent years developing open source software in various communities. I’ve seen the same mistakes over and over again and I’ve seen how to make things better.

Zooniverse is organised by the Citizen Science Alliance (CSA). Incidentally, though the CSA has an organizational structure, I can’t see it’s actual legal form. Is it a foundation or a company? The and domains are registered to Chris Lintott. Maybe it’s just a loose association of researchers with academic institutions assigning funds to work they care about, and maybe that’s normal in academia. The various Zooniverse employees actually seem to work for the member organisations such as the Adler Planetarium or the University of Oxford, though I guess some funding is for specific Zooniverse projects and some funding is for the overall Zooniverse development and hosting. That probably makes coordination difficult, like a big open source project.

Open Source

Since early 2013, the main Galaxy Zoo project has the code for it’s website on github along with the Zooniverse JavaScript library that it shares with other Zooniverse projects.

But most projects highlighted at (such as Planet Four, Asteroid Zoo, Moon Zoo, or Solar StormWatch) don’t yet have their website’s code on github. This looks like a worrying trend. It doesn’t look like open sourcing has become the default as planned.

The zooniverse github repositories list is a poor overview, particularly because most of the respositories have no github description whatsover. Github should make them mandatory, even though they’d need to be updated later. Most projects don’t even have a basic description in their either. Furthermore, I’d like to see a clear separation between front-ends, server-side code, and utilities (for processing data or for installing/maintaining servers.), maybe presented out on a github wiki page.

Also, they apparently have no plans to open source the server-side code (Ouroboros at that serves new subjects (such as galaxy images) to classify and receives classifications of these subjects. I think I’ve read that it’s a Ruby-On-Rails system. The client-side and server-side code is tightly bound, so this is a bit awkward. There is clearly room at least for some of the data structure and descriptions to be abstracted out and shared between the server, the client, and the analysis tools.

I can’t find any real documentation about the various Zooniverse code or APIs so there’s an awful chance of this blog post being the only introductory documentation that exists. I’d really welcome corrections and I’d gladly help. Neither can I find any place for public discussion of the software’s development, such as a mailing list. It’s hard for any open source project to mature without at least somewhere to discuss it.


Arfon Smith at Zooniverse wrote some blog entries about the Zooniverse Domain Model, Tools and Technologies, and Server-side logic (my title).  (Arfon has since left Zooniverse to work at Github). I also found some useful  documentation at the zooniverse page. But I had to look at the code and the network traffic to get a more complete picture.

Languages, Libraries, Frameworks

The zooniverse front-end web-sites generally seem to be written in CoffeeScript (a mostly nicer language on top of JavaScript), using the Spine framework, which seems to make it easier to separate data and code into an MVC structure and to write code that deals asynchronously with the server while caching some data locally.

Some Coffeescript is written inline with the HTML, in Eco (.eco) files.

The CSS is written in the Stylus syntax, as expected by hem, which they use to bundle the code up for deployment.

I’m no JavaScript expert, but these seem like fairly wise choices.

Zooniverse web sites communicate with the Ouroboros server using RESTful GET (get subjects to classify) and POST (return a classification of a subject) HTTP requests, using JSON syntax. I think the JSON syntax is generated/parsed by the base Spine.Module.  I don’t know of any implementation-independent documentation for this web API.

The website code uses the Zooniverse library  as a helper to communicate with the server, for instance to login, to get subjects, and to submit classifications, and to support the lists of recent and favourite subjetct. The Zooniverse library is also implemented in Coffescript. Strangely, the generated JavaScript is also checked into git. The Api class seems to be most interesting..

Questions and Answers

Let’s look at the Galaxy-Zoo website though its maybe the most complicated. It allows users to classify images of galaxies. Those images may be from one of several astronomical surveys, such as Sloan or UKIDSS. Each survey has an ID and a Workflow ID listed in (with much duplication of magic numbers). Each survey has a human-readable description and title in the list of English strings.

Each survey has a question/decision tree under app/lib, such as Galaxy-Zoo’s  I wonder if this generated or duplicated from somewhere in the server software.  Why are the long question titles duplicated and used as IDs for leadsTo instead of using short codes? Is this tree validated somehow during the build?

These IDs, Workflow IDs, and decision trees are listed in the Subject class.

Question IDs

The zero-based index of the questions in the decision trees are used as IDs when submitting the classification. For instance, a submitted classification POST might contain the following parameter to show that, when classifying a Sloan image, for the “Is there any sign of a spiral arm pattern” question (sloan-3, and the 4th question asked of me) I answered “Spiral” (a-0):

classification[annotations][4][sloan-3]: "a-0"

These implicit IDs, such as sloan-3, are also used in the translations,  and throughout the code. For instance, to reuse some translation strings, to decide if there should be a talk-page link. That i18n hack in particular belongs as an association in the decision tree.

These implicit IDs are also used in the CSS (via the Stylus .styl files) to identify the relevant icons. The icons are in one workflow.png file in order to use the CSS Sprites technique for performance). The various sub-parts of that image are selected by CSS in common.styl.

This seems very fragile. It would be safer if the icon files were stored separately and then the combined file was generated, along with that .styl CSS. I guess that the icons are already stored separately somewhere, maybe as SVG. One parent file could define the decision tree and all the associated descriptions and icon files.

Ideally much of this structure would be described in configuration files separately from the code. That generalisation would allow more code reuse between Zooniverse projects and could allow reuse by other front-ends such as iPhone and Android apps. Presumably it’s this fragility that has caused Galaxy Zoo to withdraw its previous mobile apps. Even with such an improvement, you’d still need a proper release process to coordinate development of interdependent software.

Subject and Classification

Galaxy-Zoo has a Subject class, as does the Operation War Diaries project. These usually derive from the base Subject class in the zooniverse library  ,though the Snapshot Serengeti Subject class does not.

The Ouroboros server at at provides a list of subjects for each group (a group is a survey, I think) to be classified via JSON. Here is the list of subjects for Galaxy Zoo’s Sloan survey. And here is the subjects list for Snapshot Serengeti with a simpler URI because there is only one group/survey.

The surveyId (for the group) for Galaxy Zoo is chosen randomly, though it’s currently hard-coded to always choose the Sloan survey. This JSON message contains the URLs of images for each  subject, in the list of “locations”. The Subject’s fetch() method calls the Api.get() method from the Zooniverse library and then creates Subjects for each item that the JSON message mentions.

The Subject’s constructor seems to take theJSON fragment to populate its member fields using the default Spine.Model’s AJAX functionality.

Galaxy-Zoo has a Classification class, and Snapshot Serengeti has one too. There doesn’t seem to be any common base Classification class in the zooniverse library. The Classification’s send() method calls the Classification’s toJSON() method before POSTING the message to the server via the Zooniverse library’s method.

It’s hard to see any great commonality between the various projects.
For instance, a Galaxy Zoo classification is a series of answers to multiple-choice questions, with the questions being from a decision tree. I guess that Snapshot Serengeti’s animal classification is similar, though you can provide multiple sets of answers to the same questions about what animal it is and what it is doing, to identify multiple animals in each image. Moon Zoo and Planet Four ask you to draw specific shapes on an image and also classify each shape you’ve drawn, probably resulting in coordinates and identifications.

I wonder if the server-side code has any common model for these data structures or if the classifications just get fed into project-specific databases for project-specific data analysis later.