GNOME.ORG


June 26, 2017

an early look at p4 for software networking

Happy midsummer, hackfriends!

As you know, at work we have been trying to find ways to apply compiler technology to the networking space. We compile high-level configurations into low-level network processing graphs, search algorithms into lookup routines optimized for the target data structures, packet filters into code ready to be further trace-compiled, or hash functions into parallel AVX2 code.

On one side, we try to provide fast implementations of existing "languages"; on the other side we can't help but try out new co-designed domain-specific languages that can be expressive and run fast. As an example, with pfmatch we extended pflang, the tcpdump language, with a more match-action kind of semantics. It worked fine, but the embedding between pfmatch and the host language could have been smoother; in the end the abstractions it offers don't really apply to what we have needed to build. For a long time we have been wondering whether there is a better domain-specific programming language to apply to the networking domain.

P4 claims to be this language, and I think it's worth a look. P4's goal is to be able to define switches and other networking equipment in software, with the specific aim that P4 programs can be synthesized to ASICs, installed in the FPGA of a "smart NIC", or compiled to CPUs. It's a wide target domain, and the silicon-bakery side of things definitely constrains what is possible. Indeed P4 explicitly disclaims any ambition to be a general-purpose programming language. Still, I think they manage to achieve an admirable balance between declarative programming and transparent low-level compilability.

The best, most current intro to P4 out there is probably Vladimir Gurevich's slides from last month's P4 "developer day" in California. I think it does a good job linking the language's syntax and semantics with how they are intended to be applied to the target domain. For a more PL-friendly and abstract introduction, the P4_16 specification is a true delight.

Like I said, at work we build software switches and other network functions, and our target is commodity hardware. We write most of our work in Snabb, a powerful network toolkit built on LuaJIT, though we are branching out now to VPP/fd.io as well, just to broaden the offering a bit. Generally we try to build solutions that don't have any dependencies other than a commodity Xeon server and a commodity NIC like Intel's 82599. So how could P4 help us in what we're doing?

My first thought in this regard was that if there is a library of P4 building blocks out there, it would be really convenient to be able to incorporate a functional block written in P4 within the graph of a Snabb program. For example, if we have an IPFIX collector written in Snabb (and we do!), it would be cool to stick that in the middle of a P4 traffic conditioner.

(Immediately I run into the problem that I am straining my mind to think of a network function that we wouldn't rather just write in Snabb -- something valuable enough that we wouldn't want to "own" it and instead we would import someone else's black box into our data-plane. Maybe this interesting in-network key-value cache counts? But I digress, let's assume that something exists here.)

One question is, why bother doing P4 in software? I can understand that if you have 1Tbps ports that you definitely need custom silicon pushing your packets around. You would like to be able to program that silicon, so P4 looks to be a compelling step forward. But if your needs are satisfied with 40Gbps ports and you have chosen a software networking solution for its low cost, low lock-in, high flexibility, and sufficient performance -- well does P4 buy you something?

Right now it would seem that the answer is "no". A Cisco group wrote a custom P4 compiler to VPP, which is architecturally pretty much the same as Snabb, and they had to do some work to get the performance within a couple percent of the hand-coded application. The only win I can see is if people start building up libraries of reusable P4 components that can be linked together -- but the language itself currently doesn't support any more global composition primitive than #include (yes, it uses CPP :).

Additionally, at least as things are now, it doesn't seem that there's a library of reusable, open source P4 components out there to take advantage of. If this changes, I'll have to have another look. And of course it's worth keeping an eye on what kinds of cool things people are building :)

Thanks to Luke Gorrie for conversations leading to this blog post. All opinions and errors mine, of course!

A C++ developer looks at Go (the programming language), Part 1: Simple Features

I’m reading “The Go Programming Language” by Brian Kernighan and Alan Donovan. It is a perfect programming language introduction, clearly written and perfectly structured, with nicely chosen examples. It contains no hand-waving – it’s aware of other languages and briefly acknowledges the choices made in the language design without lengthy discussion.

As an enthusiastic C++ developer, and a Java developer, I’m not a big fan of the overall language. It seems like an incremental improvement on C, and I’d rather use it than C, but I still yearn for the expressiveness of C++. I also suspect that Go cannot achieve the raw performance of C or C++ due to its safety features, though that may depend on compiler optimization. But it’s perfectly valid to knowingly choose safety over performance, particularly if you get more safety and more performance than with Java.

I would choose Go over C++ for a simple proof-of-concept program using concurrency and networking. Goroutines and channels, which I’ll mention in a later post, are convenient abstractions, and Go has a standard API for HTTP requests. Concurrency is hard, and it’s particularly easy to choose safety over performance when writing network code.
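
For example, a trivial fetch needs nothing beyond the standard library (net/http, io/ioutil, log and fmt here); the URL is just a placeholder:

resp, err := http.Get("https://example.org/")
if err != nil {
  log.Fatal(err)
}
defer resp.Body.Close()

body, err := ioutil.ReadAll(resp.Body)
if err != nil {
  log.Fatal(err)
}
fmt.Printf("fetched %d bytes\n", len(body))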

Here are some of my superficial observations about the simpler features, which mostly seem like straightforward improvements on C. In part 2 I’ll mention the higher-level features and I’ll hopefully do a part 3 about concurrency. I strongly recommend that you read the book to understand these issues properly.

I welcome friendly corrections and clarifications. There are surely several mistakes here, hopefully none major.

No semicolons at the end of lines

Let’s start with the most superficial thing. Unlike C, C++, or Java, Go doesn’t need semicolons at the end of lines of code. So this is normal:

a = b
c = d

This is nicer for people learning their first programming language. It can take a while for those semicolons to become a natural habit.

No () parentheses with if and for

Here’s another superficial difference. Unlike C or Java, Go doesn’t put its conditions inside parentheses with if and for. That’s another small change that feels arbitrary and makes C coders feel less comfortable.

For instance, in Go we might write this:

for i := 0; i < 100; i++ {
  ...
}

if a == 2 {
  ...
}

Which in C would look like this:

for (int i = 0; i < 100; i++) {
  ...
}

if (a == 2) {
  ...
}

Type inference

Go has type inference, from literal values or from function return values, so you don’t need to restate types that the compiler should know about. This is a bit like C++’s auto keyword (since C++11). For instance:

var a = 1 // An int.
var b = 1.0 // A float64.
var c = getThing()

There’s also a := syntax that avoids the need for var, though I don’t see the need for both in the language:

a := 1 // An int.
b := 1.0 // A float64
d := getThing()

I love type inference via auto in modern C++, and find it really painful to use any language that doesn’t have this. Java feels increasingly verbose in comparison, but maybe Java will get there. I don’t see why C can’t have this. After all, they eventually allowed variables to be declared not just at the start of functions, so change is possible.

Types after names

Go has types after the variable/parameter/function names, which feels rather arbitrary, though I guess there are reasons, and personally I can adapt. So, in C you’d have

Foo foo = 2;

but in Go you’d have

var foo Foo = 2

Keeping a more C-like syntax would have eased C developers into the language. These are often not people who embrace even small changes in the language.

No implicit conversions

Go doesn’t have implicit conversions between types, such as int and uint, or floats and int. This also applies to comparison via == and !=.

So, these won’t compile:

var a int = -2
var b uint = a // Will not compile.
var c int = b // Will not compile.

var d float64 = 1.345
var e int = d // Will not compile.

C compiler warnings can catch some of these, but a) People generally don’t turn on all these warnings, and they don’t turn on warnings as errors, and b) the warnings are not this strict.

Notice that Go has the type after the variable (or parameter, or function) name, not before.

Notice that, unlike Java, Go still has unsigned integers. Unlike C++’s standard library, Go uses signed integers for sizes and lengths. Hopefully C++ will do that too one day.

No implicit conversions because of underlying types

Go doesn’t even allow implicit conversions between types that, in C, would just be typedefs. So, this won’t compile:

type Meters int
type Feet int
var a Meters = 100
var b Feet = a // Will not compile.

I think I’d like to see this as a warning in C and C++ compilers when using typedef.

However, you are allowed to implicitly assign a literal (untyped) value, which looks like the underlying type, to a typed variable, but you can’t assign from an actual typed variable of the underlying type:

type Meters int
var a Meters = 100 // No problem.

var i int = 100
var b Meters = i // Will not compile.

No enums

Go has no enums. You should instead use const values with the iota keyword. So, while C++ code might have this:

enum class Continent {
  NORTH_AMERICA,
  SOUTH_AMERICA,
  EUROPE,
  AFRICA,
  ...
};

Continent c = Continent::EUROPE;
Continent d = 2; // Will not compile

in Go, you’d have this:

type continent int

const (
  CONTINENT_NORTH_AMERICA continent = iota
  CONTINENT_SOUTH_AMERICA // Also a continent, with the next value via iota.
  CONTINENT_EUROPE // Also a continent, with the next value via iota.
  CONTINENT_AFRICA // Also a continent, with the next value via iota.
)

var c continent = CONTINENT_EUROPE
var d continent = 2 // But this works too.

Notice how, compared to C++ enums, particularly C++11 scoped enums, each value’s name must have an explicit prefix, and the compiler won’t stop you from assigning a literal number to a variable of the enum type. Also, the Go compiler doesn’t treat these as a group of associated values, so it can’t warn you, for instance, if you forget to mention one in a switch/case block.

Switch/Case: No fallthrough by default

In C and C++, you almost always need a break statement at the end of each case block. Otherwise, the code in the following case block will run too. This can be useful, particularly when you want the same code to run in response to multiple values, but it’s not the common case. In Go, you have to add an explicit fallthrough keyword to get this behaviour, so the code is more concise in the general case.
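
For example (a small sketch of my own, with made-up function names):

switch n {
case 1:
  doOne()
case 2:
  doTwo()
  fallthrough // explicitly run the next case's body as well
case 3:
  doTwoOrThree()
}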

Switch/Case: Not just basic types

In Go, unlike in C and C++, you can switch on any comparable value, not just on values known at compile time such as ints, enums, or other constexpr values. So you can switch on strings, for instance:

switch str {
case "foo":
  doFoo()
case "bar":
  doBar()
}

This is convenient and I guess that it is still compiled to efficient machine code when it uses compile-time values. C++ seems to have resisted this convenience because it couldn’t always be as efficient as a standard switch/case, but I think that unnecessarily ties the switch/case syntax to its original meaning in C when people expected to be more aware of the mapping from C code to machine code.

Pointers, but no ->, and no pointer arithmetic

Go has normal types and pointer types, and uses * and & as in C and C++. For instance:

var a thing = getThing()
var p *thing = &a
var b thing = *p // Copy a by value, via the p pointer

As in C++, the new keyword returns a pointer to a new instance:

var a *thing = new(thing)
var b thing = new(thing) // Compilation error

This is like C++, but unlike Java, in which any non-fundamental types (not ints or booleans, for instance) are effectively used via a reference (it just looks like a value), which can confuse people at first by allowing inadvertent sharing.

Unlike C++, you can call a method on a value or a pointer using the same dot operator:

var a *Thing = new(Thing) // You wouldn't normally specify the type.
var b Thing = *a
a.foo()
b.foo()

I like this. After all, the compiler knows whether the type is a pointer or a value, so why should it bother me with complaints about a . where there should be a -> or vice-versa? However, along with type inference, this can slightly obscure whether your code is dealing with a pointer (maybe sharing the value with other code) or a value. I’d like to see this in C++, though it would be awkward with smart pointers.

You cannot do pointer arithmetic in Go. For instance, if you have an array, you can’t step through that array by repeatedly adding 1 to a pointer value and dereferencing it. You have to access the array elements by index, which I think involves bounds checking. This avoids some mistakes that can happen in C and C++ code, leading to security vulnerabilities when your code accesses unexpected parts of your application’s memory.

Go functions can take parameters by value or by pointer. This is like C++, but unlike Java, which always takes non-fundamental types by (non const) reference, though it can look to beginner programmers as if they are being copied by value. I’d rather have both options with the code showing clearly what is happening via the function signature, as in C++ or Go.
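
For instance (a sketch of my own, with a hypothetical Thing type), the signature shows whether the callee can modify the caller’s value:

type Thing struct {
  count int
}

// Takes a pointer, so it can modify the caller's Thing.
func increment(t *Thing) {
  t.count++
}

// Takes a copy, so changes here never affect the caller.
func incrementCopy(t Thing) {
  t.count++
}

var a Thing
increment(&a)    // a.count is now 1
incrementCopy(a) // a.count is still 1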

Like Java, Go has no notion of const pointers or const references. So if your function takes a parameter as a pointer, for efficiency, your compiler can’t stop you from changing the value that it points to. In Java, this is often done by creating an immutable type, and many Java types, such as String, are immutable, so you can’t change them even if you want to. But I prefer language support for constness as in C++, for pointer/reference parameters and for values initialized at runtime. Which leads us to const in Go.

References, sometimes

Go does seem to have references (roughly, pointers that look like values), but only for the built-in slice, map, and channel types. (See below about slices and maps.) So, for instance, this function can change its input slice parameter, and that change will be visible to the caller, even though the parameter is not declared as a pointer:

func doThing(someSlice []int) {
  someSlice[2] = 3
}

In C++, this would be more obviously a reference:

void doThing(Thing& someSlice) {
  someSlice[2] = 3;
}

I’m not sure if this is a fundamental feature of the language or just something about how those types are implemented. It seems confusing for just some types to act differently, and I find the explanation a bit hand-wavy. Convenience is nice, but so is consistency.

const

Go’s const keyword is not like const in C (rarely useful) or C++, where it indicates that a variable’s value should not be changed after initialization. It is more like C++’s constexpr keyword (since C++11), which defines values at compile time. So it’s a bit like a replacement for macros via #define in C, but with type safety. For instance:

const pi = 3.14

Notice that we don’t specify a type for the const value, so the value can be used with various types depending on the syntax of the value, a bit like a C macro #define. But we can restrict it by specifying a type:

const pi float64 = 3.14

Unlike constexpr in C++, there is no concept of constexpr functions or types that can be evaluated at compile time, so you can’t do this:

const pi = calculate_pi()

and you can’t do this

type Point struct {
  X int
  Y int
}

const point = Point{1, 2}

though you can do this with a simple type whose underlying type can be const:

type Yards int
const length Yards = 100

Only for loops

All loops in Go are for loops – there are no while or do-while loops. This simplifies the language in one way compared, for instance, to C, C++, or Java, though there are now multiple forms of for loop.

For instance:

for i := 0; i < 100; i++ {
  ...
}

or, like a while loop in C:

for keepGoing {
  ...
}

And for loops have a range-based syntax for containers such as string, slices or maps, which I’ll mention later:

for i, c := range things {
  ...
}

C++ has range-based for loops too, since C++11, but I like that Go can (optionally) give you the index as well as the value.

A native (Unicode) String type

Go has a built-in string type, and built-in comparison operators such as ==, !=, and < (as does Java). Like Java strings, Go strings are immutable, so you can’t change them after you’ve created them, though you can create new strings by concatenating other strings with the built-in operator +. For instance:

str1 := "foo"
str2 := str1 + "bar"

Go source code is always UTF-8 encoded and string literals may contain non-ASCII UTF-8 code points. Go calls Unicode code points “runes”.

Although the built-in len() function returns the number of bytes, and the built-in operator [] for strings operates on bytes, there is a utf8 package for dealing with strings as runes (Unicode code points). For instance:

str := "foo"
l := utf8.RuneCountInString(str)

And the range-based for loop deals in runes, not bytes:

str := "foo"
for _, r := range str {
  fmt.Printf("rune: %q\n", r)
}

C++ still has no standard equivalent.

Slices

Go’s slices are a bit like dynamically-allocated arrays in C, though they are really views of an underlying array, and two slices can be views into different parts of the same underlying array. They feel a bit like std::string_view from C++17, or gsl::span, but they can be resized easily, like std::vector in C++ or ArrayList in Java.

We can declare a slice like so, and append to it:

a := []int{5, 4, 3, 2, 1} // A slice
a = append(a, 0)

Arrays (whose size cannot change, unlike slices) have a very similar syntax:

a := [...]int{5, 4, 3, 2, 1} // An array.
b := [5]int{5, 4, 3, 2, 1} // Another array.

You must be careful to pass arrays to functions by pointer, or they will be (deep) copied by value.

Slices are not (deep) comparable, or copyable, unlike std::array or std::vector in C++, which feels rather inconvenient.

Slices don’t grow in place beyond their capacity (which can be more than their current length). When an append would exceed the capacity, a new, larger underlying array is allocated and the existing elements are copied into it.

The built-in append() function may allocate a bigger underlying array if it needs more than the existing capacity (which can be more than the current length), so you should always assign the result of append() like so:

a = append(a, 123)

You can keep a pointer to an element in a slice (really to the element in the underlying array), and the garbage collector will keep that old underlying array alive for as long as the pointer exists. But if a later append() reallocates, your pointer still refers to the old array, not to the slice’s new storage.
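
A small sketch of what that means in practice: after an append() that exceeds the capacity, an old pointer still refers to the previous underlying array:

a := make([]int, 3, 3) // length 3, capacity 3
p := &a[0]             // pointer into the current underlying array
a = append(a, 4)       // exceeds the capacity, so a new array is allocated
a[0] = 99
fmt.Println(*p, a[0])  // "0 99": p still points into the old array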

Unlike C or C++ arrays, and unlike operator [] with std::vector, attempting to access an invalid index of a slice will result in a panic (effectively a crash) rather than just undefined behaviour. I prefer this, though I imagine that the bounds checking has some small performance cost.

Maps

Go has a built-in map type. This is roughly equivalent to C++’s std::map (balanced binary trees), or std::unordered_map (hash tables). Go maps are apparently hash tables but I don’t know if they are separate-chaining hash tables (like std::unordered_map) or open-addressing hash tables (like nothing in standard C++ yet, unfortunately).

Obviously, keys in hash tables have to be hashable and comparable. The book mentions comparability, but so few things are comparable that they would all be easily hashable too. Only basic types (int, float64, string, etc, but not slices) or structs made up only of basic types are comparable, so that’s all you can use as a key. You can get around this by using a basic type (such as an int or string) that is (or can be made into) a hash of your value. I prefer C++’s need for a std::hash<> specialization, though I wish it was easier to write one.
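
For instance, a struct made up only of basic types works as a key (a sketch of my own):

type point struct {
  X, Y int
}

counts := make(map[point]int)
counts[point{1, 2}]++
counts[point{1, 2}]++
fmt.Println(counts[point{1, 2}]) // 2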

Unlike C++, you can’t keep a pointer to an element in a map, so changing one part of the value means copying the whole value back into the map, presumably with another lookup. Go apparently does this to completely avoid the problem of invalid pointers when the map has to grow. C++ instead lets you take the risk, specifying when your pointer could become invalid.

Go maps are clearly a big advantage over C, where you otherwise have to use some third-party data structure or write your own, typically with very little type safety.

They look like this:

m := make(map[int]string)
m[3] = "three"
m[4] = "four"

Multiple return values

Functions in Go can have multiple return values, which I find more obvious than output parameters. For instance:

func getThings() (int, Foo) {
  return 2, getFoo()
}

a, b := getThings()

This is a bit like returning tuples in modern C++, particularly with structured bindings in C++17:

std::tuple<int, Foo> get_things() {
  return std::make_tuple(2, get_foo());
}

auto [i, f] = get_things();

Garbage Collection

Like Java, Go has automatic memory management, so you can trust that instances will not be released until you have finished using them, and you don’t need to explicitly release them. So you can happily do this:

func getThing() *Thing {
  a := new(Thing)
  ...
  return a
}

b := getThing()
b.foo()

I don’t know how Go avoids circular references or unwanted “leak” references, as Java or C++ would with weak references.

I wonder how, or if, Go avoids Java’s problem with intermittent slowdowns due to garbage collection. Go seems to be aimed at system-level code, so I guess it must do better somehow.

However, also like Java, and probably like all garbage collection, this is only useful for managing memory, not resources in general. The programmer is usually happy to have memory released some time after the code has finished using it, not necessarily immediately. But other resources, such as file descriptors and database connections, need to be released immediately. Some things, such as mutex locks, often need to be released at the end of an obvious scope. Destructors make this possible. For instance, in C++:

void Something::do_something() {
  do_something_harmless();

  {
    std::lock_guard<std::mutex> lock(our_mutex);
    change_some_shared_state();
  }
  
  do_something_else_harmless();
}

Go can’t do this, so it has defer instead, letting you specify something to happen when the enclosing function ends. It’s annoying that defer is associated with functions, not with scopes in general.

func something() {
  doSomethingHarmless()

  ourMutex.Lock()
  defer ourMutex.Unlock()
  changeSomeSharedState()

  // The mutex has not been released yet when this remaining code runs,
  // so you'd want to restrict the use of the resource (a mutex here) to
  // another small function, and just call it in this function.
  doSomethingElseHarmless()
}

This feels like an awkward hack, like Java’s try-with-resources.

I would prefer to see a language that somehow gives me all of scoped resource management (with destructors), reference-counting (like std::shared_ptr<>) and garbage collection, in a concise syntax, so I can have predictable, obvious, but reliable, resource releasing when necessary, and garbage collection when I don’t care.

Of course, I’m not pretending that memory management is easy in C++. When it’s difficult it can be very difficult. So I do understand the choice of garbage collection. I just expect a system level language to offer more.

Things I don’t like in Go

As well as the minor syntactic annoyances mentioned above, and the lack of simple generic resource (not just memory) management, I have a couple of other frustrations with the language.

(I’m not loving the support for object orientation either, but I’ll mention that in a later article when I’ve studied it more.)

No generics

Go’s focus on type safety, particularly for numeric types, makes the lack of generics surprising. I can remember how frustrating it was to use Java before generics, and this feels almost as awkward. Without generics I soon find myself having to choose between a lack of type safety and repeatedly reimplementing code for each type, and I feel like I’m fighting the language.

I understand that generics are difficult to implement, and they’d have to make a choice about how far to take them (probably further than Java, but not as far as C++), and I understand that Go would then be much more than a better C. But I think generics are inevitable once, like Go, you pursue static type safety.

Somehow Go’s slice and map containers are generic, probably because they are built-in types.

Lack of standard containers

Go has no queue or stack in its standard library. In C++, I use std::queue and std::stack regularly. I think these would need generics. People can use Go’s slice (a dynamically-allocated array) to achieve the same things, and you can wrap that up in your own type, but your type can only contain specific types, so you’ll be reimplementing it for every element type. Or your container can hold interface{} values (apparently a bit like a Java Object or a C++ void*), giving up type safety.
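
For instance, a stack wrapped around a slice looks something like this (a sketch of my own), and would have to be repeated for every element type:

type intStack struct {
  items []int
}

func (s *intStack) push(v int) {
  s.items = append(s.items, v)
}

func (s *intStack) pop() (int, bool) {
  if len(s.items) == 0 {
    return 0, false // empty stack
  }
  v := s.items[len(s.items)-1]
  s.items = s.items[:len(s.items)-1]
  return v, true
}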

Developing a plugin system for Pitivi

In the last post, Enabling Python support in Libpeas, I showed why it is not possible to implement a plugin system using Libpeas in Python-based applications. I also published a link to a very simple example of how to write a plugin manager in a Python program. I am using that patch in my branch of Pitivi to implement a plugin system. Actually, I started this adventure before Google Summer of Code started, but I needed to polish my code, improve the design, and address some other things I realized during the process.

Plugin Manager

Pitivi Plugin Manager. PluginB is buggy, so if you try to enable it, you will get a message on the infobar.

Although my first idea before GSoC started was to use the PeasGtkPluginManager, Alexandru Băluț told me to embed the plugin manager in the Pitivi Preferences Dialog. I did it, but to be honest, and as I told him, it didn’t look good. Then I remembered the design of GNOME Builder, and I did what you can see in the screenshot above. I have had to handle cases like enabling a plugin with dependencies, disabling a plugin when a dependent plugin is enabled, or loading a buggy plugin. I could have used libdazzle, but it has a strong dependency on GSettings and Pitivi uses ConfigParser. Now that I think this is finished, I will wait for my previous patches to be pushed to master before uploading these new ones. I am now working on finishing the integration of the Pitivi Developer Console with Pitivi, where I have to integrate its preferences into the Pitivi Preferences Dialog.

Stuck in the Challenge

My Sundays are reserved for the GNOME Peru Challenge 2017-1, and one key step toward success is to get an application running in order to fix a bug. In this matter, jhbuild and GNOME Builder are the way to achieve it.

Unfortunately, when we try to clone an app, missing libraries are the problem. These are some screenshots of the work of our group so far.

Felipe Moreno with Test Builder on Fedora   Martin Vuelta with Polari on Arch Linux

Randy Real with Polari on Fedora

Ronaldo Cabezas with Gedit on Fedora

Julita Inca with Cheese on Fedora

I hope we can overcome this situation and post in the near future “A successful way to clone apps with GNOME Builder” 😉

  • Thanks to Randy Real for arranging a lab at UIGV today, and for his support.
  • This time, we had the honor of counting on a developer from Brazil called Thiago!

Filed under: FEDORA, GNOME, τεχνολογια :: Technology Tagged: clone problem, clonning apps, fedora, Fedora + GNOME community, GNOME, gnome 3, GNOME bug, GNOME Builder, GNOME Peru Challenge, GNOME Peru Challenge 2017, Julita Inca, Julita Inca Chiroque

Code Search for GNOME Builder: Indexing

The goal of Code Search for GNOME Builder is to provide the ability to fuzzily search all symbols in a project and to jump from a reference of a symbol to its definition in GNOME Builder. To implement this we need a database of declarations, so I created a plugin in Builder called ‘Indexer’ which extracts information about declarations and stores it in a database.

What information needs to be extracted from source code, and how?
Since we need to search symbols by their names, the names of all declarations need to be extracted. We also need to jump from a reference of a symbol to its definition, so the keys of all global declarations need to be extracted as well. Note that the keys of local declarations need not be extracted, because whenever we want to jump from a reference to a local definition, that definition will be in the current file and can easily be found from the AST of the current file. An AST is a tree representation of source code in which all constructs are represented by tree nodes; we can traverse that tree and analyse the source code. To extract the names and keys of all declarations in a source file, first an AST of the source code is created, then we traverse that tree and extract the keys and names of all declaration nodes in it.

How to store keys and names of declarations?
First I thought of implementing a database which would store keys and names together. After various changes of plan, I am currently using DzlFuzzyIndex from libdazzle to store names. For storing keys I implemented a database, IdeSimpleTable, which stores strings and an array of integers associated with each one. So there will be two indexes: one for storing the names and the other for storing the keys of declarations.

Indexing Tool

I implemented a helper tool for indexing which takes as input a set of files to index, the cflags for those files, and the destination folder in which the index files need to be stored. It takes each input file, extracts names and declaration keys by traversing the AST of that source file, and stores them in the DzlFuzzyIndex and IdeSimpleTable indexes respectively. After indexing, it creates two files in the destination directory: one to store names and the other to store keys.

GNOME Builder - Indexer

Indexer Plugin

For indexing I implemented a plugin which creates an index of the source code using the above tool and stores it in the cache folder. It maintains a separate index for every directory. The reason for this is that if we had a single index for the whole project, the entire source code would need to be reindexed whenever there was a single change in the project. The plugin browses through the working directory and indexes a directory only if either there is no index for that directory or the index is older than the files in that directory. To index a directory, it gives the list of files in that directory, the cflags of those files, and the destination folder for the index to the above tool. After indexing of the project is done, the plugin loads all the indexes and is ready to take queries and process them.
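
The “index only if missing or stale” rule amounts to roughly the following (a sketch in Go for illustration only, using just os and io/ioutil; the actual plugin is written in C and these names are invented):

// needsReindex reports whether the per-directory index file is missing or
// older than any source file in that directory.
func needsReindex(dir, indexFile string) bool {
  idx, err := os.Stat(indexFile)
  if err != nil {
    return true // no index yet for this directory
  }
  entries, _ := ioutil.ReadDir(dir)
  for _, e := range entries {
    if !e.IsDir() && e.ModTime().After(idx.ModTime()) {
      return true // a source file changed after the index was built
    }
  }
  return false
}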

Here is the current implementation of both the indexing tool and the indexer plugin. Next I will use this index to implement Global Search and Jump to Definition.


GSoC: GNOME Builder: Improving word completion phase 1

Currently, Builder uses word completion from GtkSourceCompletionWords in the GtkSource* module. The idea is to mimic the word completion technique of Vim’s Ctrl-p and Ctrl-n, starting from the insertion cursor.

Scanning entities: current buffer, open buffers and #includes

Basic scan steps:

  • Find the insertion position in the current buffer. Scan the buffer looking for prefix matches till the end and wrap around.

  • After wrapping around to the start_iter of the buffer, if you hit relevant #includes, resolve the relative path and scan them.

  • Display the result in the completion window (GtkSourceCompletionProvider returns the proposal set).

The above process has to be made incremental. In layman’s terms, we cannot keep scanning the buffers all at once; we need to return results incrementally and display them. Otherwise, we might stall the drawing of the completion window, which would be waiting for all the proposals to come in first. In this regard, I learnt more about GdkFrameClock, which is a better way to cooperate with the GTK+ drawing cycle.

GdkFrameClock can help us update and paint every frame, i.e. avoid frame drops. Every new frame will get up-to-date results, somewhat in contrast to using timeouts. Still, we can use timeouts for now, along with making the process “incremental”, and that will work just fine. Going ahead, though, we need to keep in mind that GdkFrameClock is still the better way to do this.

A somewhat more verbose model of the incremental process:

static gboolean scan_buffer_incrementally (GtkSourceBuffer *buffer, gint64 deadline)
{
	...
	/* get insert cursor GtkTextIter and bounds of current buffer */
	insert_mark = gtk_text_buffer_get_insert (GTK_TEXT_BUFFER (buffer));
	gtk_text_buffer_get_iter_at_mark (GTK_TEXT_BUFFER (buffer), &insert_iter, insert_mark);
	gtk_text_buffer_get_bounds (GTK_TEXT_BUFFER (buffer), &start, &end);
	...

	while (TRUE)
	{
		if (g_get_monotonic_time () < deadline)
			;	/* keep scanning the buffer for prefix matches */
		else	/* time slab over */
			break;
	}

	/* adjust scan regions and iters to the next scan chunk */
	...
}

static gboolean scan_buffer (GtkSourceBuffer *buffer)
{
	...

	/* scan in 5 ms slabs so we never block drawing for long */
	deadline = g_get_monotonic_time () + 5000;
	scan_buffer_incrementally (buffer, deadline);

	/* return G_SOURCE_CONTINUE while there is still text left to scan */
}


static void file_loaded(GObject *object, GAsyncResult *res, gpointer data)
{
	...
	g_timeout_add_full (G_PRIORITY_DEFAULT, 50, (GSourceFunc)scan_buffer, buffer, NULL);


}

Thank you Christian Hergert (hergertme) for the overall guidance so far. Also thanks to Matthias Clasen (mclasen) for explaining the GTK+ drawing model to me, and to Debarshi Ray (rishi) for explaining GdkFrameClock.

More posts will follow soon. Lots of things are under experimentation. Stay tuned. Happy hacking. :)

June 25, 2017

GSoC: Show Me More

Last time we spoke, the documentation was temporarily implemented using GtkTooltip, which made it simple to show a snippet of the documentation but impossible to make it interactive. That’s where GtkPopover came in.

This widget provides options to mimic the behavior of the tooltip, yet provides an interface to customize the content to accommodate all the planned features. Now we can add more information from the documentation without taking up the entire screen.

Screen Shot 2017-06-25 at 15.29.04

Screen Shot 2017-06-25 at 15.29.07

The design aspect of the card is still a bit lacking, since I have not yet committed to how the XML file with the documentation is going to be parsed. I am balancing the fact that there is no need to analyze the entire file against the fact that more knowledge of its structure would make it easier to style the text.

The current implementation of documentation can be found here. Any feedback will be appreciated.

If the card doesn’t want to show up, you might want to check in the right panel whether you actually have the documentation. I haven’t found a centralized system for all the documentation, but if you install the *-doc package for the specific library, it will be automatically added to Builder through Devhelp, for instance:

  • gtk3-devel-doc
    • GTK+ 3 Reference Manual
    • GAIL Reference Manual
    • GDK 3 Reference Manual
  • glib2-doc
    • GLib Reference Manual
    • GIO Reference Manual
    • GObject Reference Manual
  • webkitgtk3-doc
    • WebKitGTK+ Reference Manual

First Round Talks of Fedora + GNOME at UPN

Today our local group traveled many miles to the north of Lima to present our recent work using Fedora and GNOME as users and developers. Thanks to the organizers of the IT Forum for inviting us and supporting our work as Linux volunteers, as very promising potential contributors to GNOME and Fedora, and as the group we have formed in Lima, Peru. I started with what Linux is, who created it, why they created it, why it is important, what Fedora is, what GNOME is, how I got involved in GNOME and Fedora, how anyone can get involved, and the awesome GNOME and Fedora community.

Then Solanch Ccasa, a Systems Engineering student from UPIG, shared her experiences in the workshops she has helped me with over the last year, as well as her experiences in the GNOME Peru Challenge. Great job so far, Solanch! 🙂

Toto is another fantastic potential contributor to both projects. He is a Systems Engineering student from UNTELS, and I have entrusted him with the general coordination of the GNOME Peru Challenge. Thank you Toto for all your effort!

After Toto came Leyla from UPN, our outstanding designer and a Python-trained student in the GNOME Peru Challenge. She studies Systems Engineering, she has organized many FLOSS events, and she is so lovely!

The hardworking Martin from UNMSM was also there supporting us! He is an experienced developer and a talented IT person; he studies physics and is also a smart and funny person. Say hi to Martin Vuelta! 😉

Alex from SENATI is another autodidact and inspired guy who is always helping us in this effort. He gave a talk about the terminal and the commands that help developers in their daily work 😀

Felipe Moreno from UNI, a Computer Science student, is another very well skilled student; he explained GTK and his experiences with Go and IT-related technologies with GNOME and Fedora. Keep growing in FLOSS, Felipe! 🙂

Sheyla is a Mechatronics student at UTP. She is a programmer and designer, she has been involved in the GNOME Peru Challenge lately, and I hope she will fix a bug soon!

Last but not least, Mario Antizana showed his work with mechatronics at UTP, his experiences with KIA Motors, and the Fedora project he proposed and won! Awesome!

There was no fee for the event; it was free, and we definitely value our attendees.

Screen Shot 2017-06-24 at 11.17.33 PM

Thanks to God first for the support of these talented and good people in building a Linux community that pushes me to be a better leader and person for the sake of our group! Every experience we have as a group is a new satisfying adventure, and we enjoy it this way, despite ignorance, ridicule and opposition! Thanks again, guys! More pics follow. Thanks again to UPN (Private University of the North) for everything!


Filed under: FEDORA, GNOME, τεχνολογια :: Technology Tagged: community, fedora, Fedora + GNOME community, FLOSS, FLOSS community, GNOME, GNOME 3.22, GNOME Peru Challenge, GNOME Peru Challenge 2017, Julita Inca, Julita Inca Chiroque, Lima, linux, Perú, UPN

June 24, 2017

GNOME Games : Progress so far

First up, I’d like to apologize for my last post on Planet GNOME. It was published on my blog months ago, where I use a ‘magic’ kind of theme. Well, for everyone’s convenience I’ll try my best to write as normally as possible from now on, even though I’m crazy.

Unfortunately, due to my prolonged end-of-term examinations, I had a late start to the coding period. However, I managed to complete the first few planned tasks successfully. The newly added (to gnome-games) DeSmuME libretro core works perfectly fine with Nintendo DS games. But don’t get me wrong: this merely runs the game ROMs; users couldn’t actually play the games. This is because the Nintendo DS has a touch screen, and the libretro core requires touch support in order for any game to be playable. Since I was lacking a touch screen for testing, adding support for touch screens was held up a bit. So, as a start, mouse touch support has been added. This means that, instead of a touch screen, a mouse can be used to play the game. Basically this was done by attaching a mouse widget to the libretro core which presents itself as a touch screen handler.

Sounds like a simple solution, right? True, but does it live up to expectations? Well, the following screen capture will tell you why it doesn’t.

ezgif.com-video-to-gif

Even though the inputs are detected, it is extremely hard to control the pointer, and the mouse pointer is not going to be in the same position as the touch pointer. This makes the game practically unplayable. Therefore, it would be better to handle the mouse-to-touch conversion ourselves, so that the user doesn’t have to deal with a secondary pointer managed by the core. For that, mouse events that come from the widget should be converted into libretro touch events. With this implemented, the mouse pointer will really represent the finger, or the touch pointer.
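
The conversion itself is mostly a coordinate mapping. As a rough sketch (in Go purely for illustration; the real work happens in retro-gtk, in C, and the ±0x7fff range is my reading of the libretro pointer API):

// toRetroPointer maps a mouse position inside the video widget to the
// libretro pointer coordinate space, which spans -0x7fff..0x7fff on both axes.
func toRetroPointer(x, y, width, height float64) (int16, int16) {
  rx := (x/width)*2 - 1  // 0..width  ->  -1..1
  ry := (y/height)*2 - 1 // 0..height ->  -1..1
  return int16(rx * 0x7fff), int16(ry * 0x7fff)
}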

However, this is a bit of a complex task. Initial steps have already been taken and some modifications to the retro-gtk parts have also been made. I hope to finish this up within a couple of days. Stay tuned for updates 🙂


GNOME Logs: Event Compression

Hi Everyone,

In this blog post, I am going to elaborate upon the event compression feature in GNOME Logs as implemented by me during the last three weeks. It’s been an exciting three weeks of hacking on the Logs code base and bringing this crucial usability feature to life. First, let me show in pictures what the implemented feature looks like.

A GNOME Logs window showing compressed events looks like below:

compressed-events-list

The rows which show numbers alongside the event messages represent groups of compressed events which have been hidden in the events list. I would like to call such rows “header rows”.

Clicking on a header row, toggles the visibility of the compressed event rows represented by it in the events list:

expanded-header-row

Here, a header row representing seven compressed events is expanded. The header row stores the details of one of these compressed events to be shown in the events list while it hides the others. This compressed event, whose details are shown in the header row, is selected such that it maintains the timestamp sorting order specified by the “sort-order” GSettings key.

To keep things simple initially, only events that are adjacent w.r.t. timestamp are checked against the compression/similarity criteria. The compression/similarity criteria for grouping these adjacent events under a header row are as follows:

  1. Events containing messages whose first word is the same (this includes exact duplicates)
  2. Events which have been generated by the same process.

This can be extended in the future to compress non-adjacent events in specific batch sizes and group them under a common header row accordingly.
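
To make the criteria concrete, the grouping amounts to something like the following (a sketch in Go for illustration only, using the standard strings package; GNOME Logs itself is written in C on top of sd-journal, and these names are invented):

type event struct {
  message string
  pid     int
}

func firstWord(s string) string {
  fields := strings.Fields(s)
  if len(fields) == 0 {
    return ""
  }
  return fields[0]
}

// similar reports whether two adjacent, timestamp-sorted events meet the
// compression criteria: same first word of the message and same process.
func similar(a, b event) bool {
  return firstWord(a.message) == firstWord(b.message) && a.pid == b.pid
}

// groupAdjacent collects runs of similar adjacent events; any run of two
// or more events would be represented by a header row.
func groupAdjacent(events []event) [][]event {
  var groups [][]event
  for _, e := range events {
    if n := len(groups); n > 0 && similar(groups[n-1][len(groups[n-1])-1], e) {
      groups[n-1] = append(groups[n-1], e)
    } else {
      groups = append(groups, []event{e})
    }
  }
  return groups
}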

When we click on any row (including the expanded compressed events) except a header row, the detailed information regarding the event is shown in a fixed-size popover:

detailed-popover

Longer detail fields like “Message” are wrapped over multiple lines so that they do not exceed the width of the popover. All the detail fields which were available in the previous detailed event view are available in the new popover as well.

Well, what about event rows containing super long messages, like coredumps or stacktraces? Such event details are handled gracefully in the popover using a GtkScrolledWindow.

For example,

detailed-popover-scrolling

I will now try to explain in simple words what exactly is going on behind the scenes. While fetching the events from the journal using the sd-journal API into the model array (backend), we count the number of adjacent events satisfying the earlier-mentioned compression/similarity criteria. If the detected number of compressed entries in a group is two or more, then a dummy event representing these compressed entries is added to the model array. This dummy event is nothing but our “header row”. This information is then parsed in the frontend, and accordingly the events marked as compressed in the model array are hidden in the events list. You can follow progress on merging this enhancement here.

Moreover, I have now started working on my next task, which is writing a shell search provider for GNOME Logs. Last year, I came up with patches containing a very basic implementation of a shell search provider. I will now be improving on those patches according to the review given by David. Currently, I am researching various possible approaches to fetch the events from the model in the shell search provider. Further progress regarding this can be tracked here.

I am very happy to tell you that I will be attending GUADEC 2017 in Manchester. Many thanks to the GNOME Foundation for sponsoring travel as well as accommodation for me. I look forward to meeting all of you at GUADEC 2017 🙂

going-to-guadec-badge


Installing a “full” disk encrypted Ubuntu 16.04 Hetzner server

I needed to install a server multiple times recently. Fully remotely, via the network. In this case, the machines stood at Hetzner, a relatively large German hoster with competitive prices. Once you buy a machine, they boot it into a rescue image that they deliver via PXE. You can log in and start an installer or do whatever you want with the machine.

The installation itself can be as easy as clicking a button in their Web interface. The manual install with their installer is almost as easily performed. You will get a minimal (Ubuntu) installation with the SSH keys or password of your choice deployed.

While having such an easy way to (re-)install the system is great, I’d rather want to have as much of my data encrypted as possible. I came up with a series of commands to execute in order to have an encrypted system at the end. I have put the “full” in the title in quotes, because I dislike the term “full disk encryption”. Mainly because it makes you believe that the disk will be fully encrypted, but it is not. Currently, you have to leave parts unencrypted in order to decrypt the rest. We probably don’t care so much about the confidentiality there, but we would like the contents of our boot partition to be somewhat integrity protected. Anyway, the following shows how I managed to install an Ubuntu with the root partition on LUKS and RAID.

Note: This procedure will disable your machine from booting on its own, because someone will need to unlock the root partition.

shred --size=1M  /dev/sda* /dev/sdb*
installimage -n bitbox -r yes  -l 1 -p swap:swap:48G,/boot:ext3:1G,/mnt/disk:btrfs:128G,/:btrfs:all  -K /root/.ssh/robot_user_keys   -i /root/.oldroot/nfs/install/../images/Ubuntu-1604-xenial-64-minimal.tar.gz


## For some weird reason, Hetzner puts swap space in the RAID.
#mdadm --stop /dev/md0
#mdadm --remove /dev/md0
#mkswap /dev/sda1
#mkswap /dev/sdb1

mount /dev/md3 /mnt
btrfs subvolume snapshot -r /mnt/ /mnt/@root-initial-snapshot-ro

mkdir /tmp/disk
mount /dev/md2 /tmp/disk
btrfs send /mnt/@root-initial-snapshot-ro | btrfs receive -v /tmp/disk/ 
umount /mnt/

luksformat -t btrfs  /dev/md3 
cryptsetup luksOpen /dev/md3 cryptedmd3

mount /dev/mapper/cryptedmd3  /mnt/

btrfs send /tmp/disk/@root-initial-snapshot-ro | btrfs receive -v /mnt/
btrfs subvolume snapshot /mnt/@root-initial-snapshot-ro /mnt/@

btrfs subvolume create /mnt/@home
btrfs subvolume create /mnt/@var
btrfs subvolume create /mnt/@images


blkid -o export /dev/mapper/cryptedmd3  | grep UUID=
sed -i  's,.*md/3.*,,'   /mnt/@/etc/fstab
echo  /dev/mapper/cryptedmd3   /     btrfs defaults,subvol=@,noatime,compress=lzo  0  0  | tee -a /mnt/@/etc/fstab
echo  /dev/mapper/cryptedmd3   /home btrfs defaults,subvol=@home,compress=lzo,relatime,nodiratime  0  0  | tee -a /mnt/@/etc/fstab

umount /mnt/
mount /dev/mapper/cryptedmd3 -osubvol=@ /mnt/

mount /dev/md1 /mnt/boot

mv /mnt//run/lock /tmp/
chroot-prepare /mnt/; chroot /mnt


passwd

echo  "termcapinfo xterm* ti@:te@" | tee -a /etc/screenrc
sed "s/UMASK[[:space:]]\+022/UMASK   027/" -i /etc/login.defs  
#echo   install floppy /bin/false  | tee -a    /etc/modprobe.d/blacklist
#echo "blacklist floppy" | tee /etc/modprobe.d/blacklist-floppy.conf

# Hrm. for some reason, even with crypttab present, update-initramfs does not include cryptsetup in the initrd except when we force it:
# https://bugs.launchpad.net/ubuntu/+source/cryptsetup/+bug/1256730
# echo "export CRYPTSETUP=y" | tee /usr/share/initramfs-tools/conf-hooks.d/forcecryptsetup



echo   cryptedmd3 $(blkid -o export /dev/md3  | grep UUID=) none luks     | tee  -a  /etc/crypttab
# echo   swap   /dev/md0   /dev/urandom   swap,cipher=aes-cbc-essiv:sha256  | tee  -a  /etc/crypttab


apt-get update
apt-get install -y cryptsetup
apt-get install -y busybox dropbear


mkdir -p /etc/initramfs-tools/root/.ssh/
chmod ug=rwX,o=   /etc/initramfs-tools/root/.ssh/
dropbearkey -t rsa -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear

/usr/lib/dropbear/dropbearconvert dropbear openssh \
        /etc/initramfs-tools/root/.ssh/id_rsa.dropbear \
        /etc/initramfs-tools/root/.ssh/id_rsa

dropbearkey -y -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear | \
        grep "^ssh-rsa " > /etc/initramfs-tools/root/.ssh/id_rsa.pub



cat /etc/initramfs-tools/root/.ssh/id_rsa.pub >> /etc/initramfs-tools/root/.ssh/authorized_keys

cat /etc/initramfs-tools/root/.ssh/id_rsa

 
update-initramfs -u -k all
update-grub2

exit

umount -l /mnt
mount /dev/mapper/cryptedmd3 /mnt/
btrfs subvolume snapshot -r /mnt/@ /mnt/@root-after-install

umount -l /mnt

Let’s walk through it.


shred --size=1M /dev/sda* /dev/sdb*

I was under the impression that results are a bit more deterministic if I blow away the partition table before starting. This is probably optional.


installimage -n somehostname -r yes -l 1 -p swap:swap:48G,/boot:ext3:1G,/mnt/disk:btrfs:128G,/:btrfs:all -K /root/.ssh/robot_user_keys -i /root/.oldroot/nfs/install/../images/Ubuntu-1604-xenial-64-minimal.tar.gz

This is Hetzner’s install script. You can look at the script here. It’s setting up some hostname, a level 1 RAID, some partitions (btrfs), and SSH keys. Note that my intention is to use dm-raid here and not btrfs raid, mainly because I trust the former more. Also, last time I checked, btrfs’ raid would not perform well, because it used the PID to determine which disk to hit.



mdadm --stop /dev/md0
mdadm --remove /dev/md0
mkswap /dev/sda1
mkswap /dev/sdb1

If you don’t want your swap space to be in the RAID, remove the array and reformat the partitions. I was told that there are instances in which it makes sense to have a raided swap. I guess it depends on what you want to achieve…



mount /dev/md3 /mnt
btrfs subvolume snapshot -r /mnt/ /mnt/@root-initial-snapshot-ro

mkdir /tmp/disk
mount /dev/md2 /tmp/disk
btrfs send /mnt/@root-initial-snapshot-ro | btrfs receive -v /tmp/disk/
umount /mnt/

Now we first snapshot the freshly installed image, not only in case anything breaks and we need to restore, but also because we need to copy our data off, set LUKS up, and copy the data back. We could also try some in-place trickery, but it would require more scripts and magic dust.


luksformat -t btrfs /dev/md3
cryptsetup luksOpen /dev/md3 cryptedmd3
mount /dev/mapper/cryptedmd3 /mnt/

Here we set the newly encrypted drive up. Remember your passphrase. You will need it as often as you want to reboot the machine. You could think of using pwgen (or similar) to produce a very very long password and save it encryptedly on a machine that you will use when babysitting the boot of the server.


btrfs send /tmp/disk/@root-initial-snapshot-ro | btrfs receive -v /mnt/
btrfs subvolume snapshot /mnt/@root-initial-snapshot-ro /mnt/@

Do not, I repeat, do NOT use btrfs add because the btrfs driver had a bug. The rescue image may use a fixed driver now, but be warned. Unfortunately, I forgot the details, but it involved the superblock being confused about the number of devices used for the array. I needed to re-set the number of devices before systemd would be happy about booting the machine.


btrfs subvolume create /mnt/@home
btrfs subvolume create /mnt/@var
btrfs subvolume create /mnt/@images

We create some volumes at our discretion. It’s up to you how you want to partition your device. My intention is to be able to backup the home directories without also backing up the system files. The images subvolume might become a non-COW storage for virtual machine images.


blkid -o export /dev/mapper/cryptedmd3 | grep UUID=
sed -i 's,.*md/3.*,,' /mnt/@/etc/fstab
echo /dev/mapper/cryptedmd3 / btrfs defaults,subvol=@,noatime,compress=lzo 0 0 | tee -a /mnt/@/etc/fstab
echo /dev/mapper/cryptedmd3 /home btrfs defaults,subvol=@home,compress=lzo,relatime,nodiratime 0 0 | tee -a /mnt/@/etc/fstab

We need to tell the system where to find our root partition. You should probably use the UUID= notation for identifying the device, but I used the device path here, because I wanted to eliminate a certain class of errors when trying to make it work. Because of the btrfs bug mentioned above I had to find out why systemd wouldn’t mount the root partition. It was a painful process with very little help from debugging or logging output. Anyway, I wanted to make sure that systemd attempts to take exactly that device and not something that might have changed.

Let me state the problem again: The initrd successfully mounted the root partition and handed control over to systemd. Systemd then wanted to ensure that the root partition is mounted. Due to the bug mentioned above it thought the root partition was not ready so it was stuck on mounting the root partition. Despite systemd itself being loaded from that very partition. Very confusing. And I found it surprising to be unable to tell systemd to start openssh as early as possible. There are a few discussions on the Internet but I couldn’t find any satisfying solution. Is it that uncommon to want the emergency mode to spawn an SSHd in order to be able to investigate the situation?


umount /mnt/
mount /dev/mapper/cryptedmd3 -osubvol=@ /mnt/

mount /dev/md1 /mnt/boot

mv /mnt//run/lock /tmp/
chroot-prepare /mnt/; chroot /mnt

Now we mount the actual root partition of our new system and enter its environment. We need to move the /run/lock directory out of the way to make chroot-prepare happy.


passwd

We start by creating a password for the root user, just in case.


echo "termcapinfo xterm* ti@:te@" | tee -a /etc/screenrc
sed "s/UMASK[[:space:]]\+022/UMASK 027/" -i /etc/login.defs
#echo install floppy /bin/false | tee -a /etc/modprobe.d/blacklist
#echo "blacklist floppy" | tee /etc/modprobe.d/blacklist-floppy.conf

Adjust some of the configuration to your liking. I want to be able to scroll in my screen sessions and I think having a more restrictive umask by default is good.


echo "export CRYPTSETUP=y" | tee /usr/share/initramfs-tools/conf-hooks.d/forcecryptsetup

Unless bug 1256730 is resolved, you might want to make sure that mkinitramfs includes everything that is needed in order to decrypt your partition. Please scroll down a little bit to check how to find out whether cryptsetup is indeed in your initramdisk.


echo cryptedmd3 $(blkid -o export /dev/md3 | grep UUID=) none luks | tee -a /etc/crypttab
# echo swap /dev/md0 /dev/urandom swap,cipher=aes-cbc-essiv:sha256 | tee -a /etc/crypttab

In order for the initramdisk to know where to find which devices, we populate /etc/crypttab with the name of our desired mapping, its source, and some options.


apt-get update
apt-get install -y cryptsetup
apt-get install -y busybox dropbear

Now, in order for the boot process to be able to decrypt our encrypted disk, we need to have the cryptsetup package installed. We also install busybox and dropbear to be able to log into the boot process via SSH. The installation should print you some warnings or hints as to how to further proceed in order to be able to decrypt your disk during the boot process. You will probably find some more information in /usr/share/doc/cryptsetup/README.remote.gz.


mkdir -p /etc/initramfs-tools/root/.ssh/
chmod ug=rwX,o= /etc/initramfs-tools/root/.ssh/
dropbearkey -t rsa -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear

/usr/lib/dropbear/dropbearconvert dropbear openssh \
/etc/initramfs-tools/root/.ssh/id_rsa.dropbear \
/etc/initramfs-tools/root/.ssh/id_rsa

dropbearkey -y -f /etc/initramfs-tools/root/.ssh/id_rsa.dropbear | \
grep "^ssh-rsa " > /etc/initramfs-tools/root/.ssh/id_rsa.pub

cat /etc/initramfs-tools/root/.ssh/id_rsa.pub >> /etc/initramfs-tools/root/.ssh/authorized_keys

cat /etc/initramfs-tools/root/.ssh/id_rsa

Essentially, we generate an SSH keypair, convert it for use with OpenSSH, leave the public portion in the initramdisk so that we can authenticate, and print out the private part, which you had better save on the machine that you want to use to unlock the server.


update-initramfs -u -k all
update-grub2

Now we need to regenerate the initramdisk so that it includes all the tools and scripts needed to decrypt the device. We also need to update the boot loader so that it includes the necessary Linux parameters for finding the root partition.


exit

umount -l /mnt
mount /dev/mapper/cryptedmd3 /mnt/
btrfs subvolume snapshot -r /mnt/@ /mnt/@root-after-install

umount -l /mnt

We leave the chroot and take a snapshot of the modified system, just in case… You might now think about whether you want your boot and swap partitions to be in a RAID and act accordingly. Then you can try to reboot your system. You should be able to SSH into the machine with the private key you hopefully saved. Maybe you can use a small script like this:


cat ~/.server_boot_secret | ssh -o UserKnownHostsFile=~/.ssh/server-boot.known -i ~/.ssh/id_server_boot root@server "cat - >/lib/cryptsetup/passfifo"

If that does not work so well, double check whether the initramdisk contains everything necessary to decrypt the device. I used something like


zcat /boot/initrd.img-4.4.0-47-generic > /tmp/initrd.cpio
mkdir /tmp/initrd
cd /tmp/initrd
cpio -idmv < ../initrd.cpio
find . -name '*crypt*'

If there is no cryptsetup binary, something went wrong. Double check the paragraph above about forcing mkinitramfs to include cryptsetup.

With these instructions, I was able to install a new machine with an encrypted root partition within a few minutes. I hope you will be able to as well. Let me know if you think anything needs to be adapted to make it work with more modern versions of either Ubuntu or the Hetzner install script.

Even faster GNOME Music

Hello my GNOMEish friends!

This afternoon, I felt an urge to hear some classical music. Perhaps because I’m overworking a lot these days, I wanted to grab a good hot tea, and listen to relaxing music, and rest for a few minutes.

My player of choice is GNOME Music.

In the past, I couldn’t use it. It was way too slow to be usable. After a round of improvements during a sleepless night, however, Music was usable for me again.

But it was not fast enough for me.

It was taking 15~20 seconds just to show the albums. That’s unacceptable!

The Investigation

Thanks to Christian Hergert we have a beautiful and super useful Sysprof app! After running Music under Sysprof, I got this:

Sysprof result of GNOME Music

Clearly, there’s an area where Music hits the CPU (the area that is selected in the picture above). And, check it out: in this area, the biggest offenders were libjpeg, libpixman and libavcodec. After digging into the code, there it was.

The performance issue was caused by the Album Art loading code.

The Solution

Looking at the code, I made a simple experiment: I tried to see how many parallel lookups (i.e. asynchronous calls) Music was performing.

The number is shocking: Music was running almost 1200 asynchronous operations in parallel.

These operations would be fired almost at the same time, would load Zeus knows how many album covers, and return almost at the same time. Precisely when these lookups finished, Music had that performance hit.

The solution, however, was quite simple: limit the number of active lookups, and queue them if needed. But, limit to what? 64 parallel lookups? Or perhaps 32?

I needed data.

The Research

DISCLAIMER: I know very well that the information below is not scientific data, nor a robust benchmark. It’s just a simple comparison.

I decided to try out a few lookup limits, and see what works best. I have a huge collection, big enough to squeeze Music. I’m on an i7 with 8GB of RAM and a 7200RPM spinning hard drive.

I measured (i) the time it took for the album list to show, (ii) the time for all album covers to be loaded, and (iii) a quick score I made up on the fly. All of them are of the “lower is better” type. I ran each limit 10 times, and used the average of the results.

Comparison of various lookup limits

The “No Limits” columns represent what Music does now. It takes a long time to show up, but the album covers are visible almost immediately after.

First conclusion: limiting the number of lookups always performs better than not limiting them. With that settled, it was just a matter of finding the optimal value.

After some trial and error, I found that 24 is an excellent limit.
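To make the queue idea concrete, here is a minimal Python sketch of the approach (this is not Music’s actual code, and all the names are made up): cap the number of in-flight asynchronous cover lookups at 24 and queue the rest.

from collections import deque

MAX_ACTIVE_LOOKUPS = 24  # the limit that worked best in my measurements

class CoverLookupQueue:
    """Caps the number of in-flight asynchronous cover lookups."""

    def __init__(self, start_lookup):
        # start_lookup(album, callback) is a stand-in for whatever async
        # call actually fetches the album art; it must invoke callback
        # when the lookup finishes.
        self._start_lookup = start_lookup
        self._pending = deque()
        self._active = 0

    def push(self, album):
        # Start immediately if a slot is free, otherwise queue it.
        if self._active < MAX_ACTIVE_LOOKUPS:
            self._start(album)
        else:
            self._pending.append(album)

    def _start(self, album):
        self._active += 1
        self._start_lookup(album, self._on_finished)

    def _on_finished(self, *args):
        # A lookup finished: free its slot and start the next queued one.
        self._active -= 1
        if self._pending:
            self._start(self._pending.popleft())

Whatever the exact mechanism, the important part is that only a bounded number of lookups hit the CPU at the same time.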

The Result

In general, the initial loading of albums with the performance improvement is around 73% faster than without it. That’s quite a gain!

But words cannot express the satisfaction of seeing this:

Enjoy!!

June 23, 2017

Contributing to OSS

This is going to be a series of posts that highlight my experience with contributing to OSS in general, but focusing on Polari in particular.

I’ve been using GNOME for a while now, a long while. The project is a masterpiece of clean, fresh design and simplicity, and it gets out of your way while you work. Many of the applications included in the GNOME ecosystem follow this paradigm.

One of these applications is Polari, an IRC client. It has a nice minimalist design and is functional, clean, and usable. In short, it does what is needed to get a user up and chatting quickly and with minimal fuss.

It did have a downside though, as I discovered; long user nicknames in chat could cause issues.

  • Some nicknames could cause an extra indent between the nick and message.
  • There was no safeguard against irresponsible users who insisted on using long nicks such as WindowsBunnyIsSuspiciousOfFish.

As a result of the above, two things were happening:

  • The pixel width of a nick was estimated as the average character width of the font’s character set multiplied by the number of characters in the nick. As such, if a nickname used characters that put its actual pixel width over the estimated width, an extra indent was inserted to compensate (see the sketch after this list for what an exact measurement looks like).
  • The indent between the displayed nick and its associated message is set by the above, so WindowsBunnyIsSuspiciousOfFish would shunt the indent over that far, leaving a massive void between regular nicknames and their associated message.
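For illustration, measuring the actual rendered width of a nick (instead of estimating it from an average character width) looks roughly like this with PyGObject; Polari itself is written in JavaScript, so this is only a sketch of the idea, not its code:

import gi
gi.require_version('Gtk', '3.0')
from gi.repository import Gtk

def nick_pixel_width(widget, nick):
    """Return the real rendered width of `nick` in pixels; widget is any Gtk.Widget."""
    # create_pango_layout() uses the widget's own font settings, so the
    # measurement matches what will actually be drawn on screen.
    layout = widget.create_pango_layout(nick)
    width, _height = layout.get_pixel_size()
    return width

# Hypothetical usage: size the nick column to the widest nick seen so far.
# indent = max(nick_pixel_width(chat_view, nick) for nick in nicks)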

This issue was enough for me to stop using particular channels where that one user participated. This of course meant I wasn’t using those channels, and that is not good enough for anyone.

So I’d found what would be classed as a gronked/needed feature, and set out to fix it.

How easy is it to contribute?

I have to say that GNOME is one of the friendlier and easier large projects to contribute to. There are many well written newbie guides to peruse, such as this one for Gnome Builder (https://wiki.gnome.org/Apps/Builder/Contribute) - which is rapidly rising to be the chosen development environment for Gnome applications - and the GNOME code contribution guide (https://wiki.gnome.org/Newcomers/CodeContributionWorkflow).

Essentially your workflow (for code contributions) will look like:

  1. Set up git
  2. Checkout the code repository for your chosen project
  3. Choose a bug to work on
  4. Create a branch for the work you do (with a name relating to the bug you’re working on)
  5. Once work is completed, submit a patch for review

It’s as simple as that. Really. Many other OSS projects will have a similar workflow to this, and in the case of projects hosted on github, you will only need to submit a pull request once you complete the work.

The Importance of Commit Messages

This is something I think a lot of people get wrong when they first start off - commit messages.

If you look at a git log, on github for example, you will usually see only the first line of a commit message. For this reason, it is very good practice to get in the habit of writing a short, concise description of the commit that fits on a single line - and then expanding on it in a new paragraph.

Why? Not only will this help others when analysing git logs, but it will help you in the long run by making it easy to see what was changed, why, and where. One of the best blog posts I’ve found that expounds on this is by Chris Beams (https://chris.beams.io/posts/git-commit/) - make sure to check it out.

Some projects have a guide for how they prefer their commit messages to be written (normally a plain text file named CONTRIBUTING); if this is the case, be sure to follow it.

Example of a GNOME-preferred commit message:

Short title describing your change

Context of the bug/issue/change.

Problem that it intends to fix and when it happens.

Fix.

Bugzilla link

The First Task

The first thing I wanted to do was shorten the nicknames that went over a certain limit. But before I could start, I needed to checkout the source, and build it to ensure I could test the work I do.

For Polari, this was made dead easy by Gnome Builder (https://wiki.gnome.org/Apps/Builder), which Christian Hergert works on full time. One of the powerful features of Builder is the ability to compile and run programs in a Flatpak (http://flatpak.org/) environment - for GNOME you can use either the stable or nightly Flatpak environments (Flatpak in itself is an incredible piece of work).

There are other ways to compile GNOME software, such as using Jhbuild (which I never got working due to repo errors), and in the case of openSUSE, using the Open Build Service (https://build.opensuse.org/). For non-GNOME projects, the project itself will generally include some form of instruction on how to build the software from scratch.

Other Ways to Contribute to OSS

You don’t have to be a programming wiz to contribute to the world of OSS. There are many other ways you can help:

  • Testing and writing bug reports
  • Art and design
  • Documentation - general docs, guides to setting up software or using it
  • Translations - the OSS world has a very wide reach; translations are always needed.
  • Advocate for the use of OSS
  • Organising events around OSS
  • Website design and development - Good websites are critical.

And this is just a short list! Generally speaking, if you think a project could benefit from something you can provide, go ahead and do it! Contact those people responsible for the project via their chosen means, tell them what you can offer, and go from there.

Onwards!

Armed with Builder, Flatpak, and the source to Polari, I got to work and started reading the code to see what did what. And this is where I will leave this post for now.

Coming up next will be topics such as: code reading, importance of documentation, maybe some advice on using git, and how/what I did with Polari.

Moving Linux to a new partition

Occasionally I have a need (or just feel a need) to move my Linux installation around its little universe - it could be for any number of reasons:

  • Redoing a partition table
  • Backing up an install
  • Moving to a new HDD
  • Experimenting on an install

Moving a Linux install around is pretty painless compared to something like Windows. The basic process is: use rsync to copy /, update /etc/fstab, update the bootloader, have a choice of beverage.

Preparing

The first thing to do is prepare your disks and mount points. Use your partitioning tool of choice to set up a new partition for where you will move the base install (/) to - be sure to make it large enough. If you are also moving your /home, now is a good time to do it (and if your /home isn’t a separate partition, you’re completely mad!)

Once you’ve created the partitions on the drive you’re going to use, you can mount them with sudo mkdir /mnt/root && sudo mount /dev/sdXX /mnt/root and if you’re moving /home, sudo mkdir /mnt/root/home && sudo mount /dev/sdXX /mnt/root/home.

/dev/sdXX is the device and partition you’ll be using; you can see a list of all current partitions on disks with lsblk, e.g.:

NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 119.2G  0 disk 
├─sda1   8:1    0   156M  0 part /boot/efi
├─sda2   8:2    0  25.3G  0 part /
├─sda3   8:3    0  25.3G  0 part 
├─sda4   8:4    0  25.3G  0 part 
├─sda5   8:5    0  25.3G  0 part 
└─sda6   8:6    0  18.1G  0 part [SWAP]
sdb      8:16   0 931.5G  0 disk 
└─sdb4   8:20   0 931.5G  0 part /home

Where sda is the first disk and sdb is the second; then in the tree, partitions are numbered 1 to n. If I’m moving to the first disk’s second partition, then I use sudo mount /dev/sda2 /mnt/root.

The Big Move

There is just one command you need (from the Arch wiki at that):

sudo rsync -aAXv --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*","/mnt/*","/media/*","/lost+found","/home/*"} / /mnt/root

If you’re also moving /home, then remove the ,"/home/*" from the exclude list (excluding /home/* rather than /home keeps the empty mount point on the new root). Now, grab a beer and sit back while a list of copied files scrolls on by - and if /home is being moved, watch your life flash before your eyes!

A quick breakdown of the rsync flags used above: -a is archive mode (recurse and preserve permissions, ownership, timestamps and symlinks), -A preserves ACLs, -X preserves extended attributes, -v gives verbose output, and the --exclude list skips pseudo-filesystems, mount points and anything else we don’t want copied.

Once everything has been copied over, it’s time to fix a few small things before they become problems.

fstab

/etc/fstab is what controls what gets mounted where at boot time. This will normally contain entries for /boot, /, /home, and swap, possibly /boot/efi too.

Mount the partition that you copied root to if it isn’t still mounted, e.g. sudo mount /dev/sdXX /mnt/root, and, using an editor with root permissions such as sudo vim in a terminal, edit the fstab on the new partition (/mnt/root/etc/fstab).

Quick Vi guide;

  • h, j, k, l are left, down, up, right respectively, or use arrow keys.
  • i activates insert mode.
  • esc drops you back to command mode where the following commands can be run.
  • cw deletes the word at the cursor and enters insert mode.
  • C will delete from the cursor to the end of the line and enter insert mode.
  • dw will cut the word at the cursor.
  • dd will cut the line.
  • p will paste the cut line or word.
  • u will undo the last change.
  • :q is quit, and :q! is force quit without saving.
  • :wq to save a file and exit.

There is much more to Vi and ViM than those commands, but they are sufficient for quick editing. Please check out Vim Tutor (http://www.openvim.com/) if you want to learn more about one of the most powerful and intuitive editors around.

The fstab drive references generally use a UUID for identification; this is a unique identifier that looks like a long hexadecimal string split into dash-separated groups. To get yours, open another terminal and run sudo blkid. This will give you the UUID of each partition, which you will need for the corresponding fstab entries (a hypothetical example follows below).
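A root entry on the new partition might then look roughly like this:

# hypothetical example - substitute the UUID blkid reports and your filesystem type
UUID=0b2d8c1e-3f47-4a6b-9e52-7c1a84d0f3ab   /   ext4   defaults   0 1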

Bootloader

The last step is to update the bootloader; you may or may not want to boot into the new partition (and delete the old one) before doing this. If you want to boot first and check all is okay, reboot and then on the grub screen press c to get to a console. The next steps are:

  • set root=(hd0,2), where hd0 is the disk the new partition is on (disk numbering starts at 0) and 2 is the partition number (which, in GRUB 2, starts at 1), so (hd0,2) corresponds to /dev/sda2.
  • linux /boot/vmlinuz root=/dev/sda2 - this sets the kernel to boot, and the root partition. Generally the kernel is symlinked as vmlinuz but this isn’t always true.
  • initrd /boot/initrd sets the ram disk for initial booting. This contains necessary stuff such as kernel modules for booting.

When using the grub console, you can press tab to autocomplete commands or get a list. This is also handy when checking which drives and partitions are available, e.g. set root=(hd followed by tab will print a list of drives and the partitions on them. linux /boot/v then tab will print a list of all files starting with ‘v’, or autocomplete if only one file starts with ‘v’.

Once booted, or if you prefer to update the bootloader first, open a terminal and run;

$> grub-mkconfig -o /boot/grub/grub.cfg

This will output a new config with the new partition included. Reboot and select the new entry for the new partition (you can check this by pressing ‘e’ on the entry and verifying that the partitions are listed correctly for root/linux).

If the new entry hasn’t shown up at all, you may need to check that grub’s os-prober hasn’t been disabled - run sudo vi /etc/default/grub in a terminal and make sure any GRUB_DISABLE_OS_PROBER line is set to "false" (or commented out).

Booted!

So you’ve booted successfully from the new partition; if you haven’t already, run the grub commands above to ensure you can boot from it easily next time. And if you feel like it, delete the old partition.

Cross-Widget Transitions

You can do some pretty flashy things with Dazzle. Tonight I added the missing pieces to be able to make widgets look like they transition between parents.

The following is a video of moving a GtkTextView from one GtkStack to another and flying between those containers. Compare this to GtkStack transitions that can only transition between their children, not their ancestors. Fun!

Dazzle spotlight – Multi Paned and Action Muxing

There really is a lot of code in libdazzle already. So occasionally it makes sense to spotlight some things you might be interested in using.

Action Muxing

Today, while working on various Builder redesigns, I got frustrated with a common downfall of GAction as used in Gtk applications. Say you have a UI split up into two sections: A Header Bar and Content Area. Very typical of a window. But it’s also very typical of an editor where you have the code editor below widgetry describing and performing actions on that view.

The way the GtkActionMuxer works is by following the widget hierarchy to resolve GActions. Since the HeaderBar is a sibling of the content area (and not a direct ancestor of it), you cannot activate the view’s actions from the header bar. It would be nice for the muxer to gain more complex support, but until then… Dazzle.

It does what most of us do already, copy the action groups between widgets so they can be activated.

// my_container should be either your headerbar or a parent of it
// my_view is your view containing the attached action groups
// the mux key is used to remove old action groups
dzl_gtk_widget_mux_action_groups (my_container,
                                  my_view,
                                  "a mux key");

Exciting, I know.

Multi Paned

The number of times I’ve been asked if Gtk has a GtkPaned that can have multiple handles is pretty high. So last year, while working on Builder’s panel engine, I made one. It also gained some new features this week as part of our Builder redesign.

It knows how to handle minimum sizes, natural sizes, expanding, dragging, and lots of other tricky bits. There are lots of really annoying bits about GtkPaned from its early days that make it a pain when dealing with window resizes. MultiPaned tries to address many of those and do the right thing by default.

It should be strictly a step up from using GtkPaned of GtkPaned of GtkPaned.

June 22, 2017

Enabling Python support in Libpeas

Libpeas has long been one of the most used libraries for implementing plugins in GNOME applications, such as Gedit, GNOME Videos, GNOME Builder and others. Libpeas’ README states:

Adding support for libpeas-enabled plugins in your own application is a matter
of minutes.

However, this has not been the case in Python, because there is a problem. Although it is currently possible to write Python-based plugins, as many applications already do (Gedit, for example, has its Python Console), it is not possible to implement plugin systems for applications in Python. The source of the problem resides in Libpeas’ function peas_extension_set_newv, which receives an array of GParameter structures. But GParameter is not introspectable! Due to this problem there was a long discussion in Bugzilla. After some discussion, Emmanuele Bassi suggested that GParameter should be deprecated and that a new function should be added with the following prototype:

  gpointer g_object_newv2 (GType gtype,
                           guint n_properties,
                           const char *names[],
                           const GValue values[]);

Some months ago, I decided to implement Emmanuele Bassi’s suggestion, and the patches were finally merged into the master branch; the function mentioned above was added, but with the name g_object_new_with_properties. After that, the same idea could be applied to Libpeas. I have written two new functions, peas_extension_set_new_with_properties and peas_engine_create_with_properties. The patch had been proposed before, but Garrett Regier asked me to add some checks and tests. So the new patch is just pending review, and it means that it will be possible to do the following:

extension_set = Peas.ExtensionSet.new_with_properties(engine, Peas.Activatable, ["object"], [a_gobject])

I have tried this function and it works. I have a very simple example that shows how to use Libpeas in Python with this patch. I think this function will be really useful for the whole GNOME community. Ideally, the function would use g_object_new_with_properties internally, but for now, after talking with Garrett Regier (maintainer of Libpeas), peas_extension_set_new_with_properties will keep using GParameter internally.

There is a very simple example of how to implement a plugin system using Libpeas in Python in my GitHub repository. I hope you can soon have this functionality in master.
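For context, a minimal Python plugin implementing Peas.Activatable typically looks something like the sketch below (the class name is just illustrative and not taken from my example repository):

import gi
gi.require_version('Peas', '1.0')
from gi.repository import GObject, Peas

class ExamplePlugin(GObject.Object, Peas.Activatable):
    """A minimal Peas.Activatable implementation in Python."""
    __gtype_name__ = 'ExamplePlugin'

    # Peas.Activatable expects an "object" property; the application hands
    # the plugin whatever object it is supposed to extend through it.
    object = GObject.Property(type=GObject.Object)

    def do_activate(self):
        print('Plugin activated for', self.object)

    def do_deactivate(self):
        print('Plugin deactivated')

An extension set created with new_with_properties is then what supplies that "object" property to every loaded plugin.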

 

The GInterface problem

According to Libpeas’ README:

One of the most frustrating limitations of the Gedit plugins engine was that it
only allows extending a single class, called GeditPlugin. With libpeas, this
limitation vanishes, and the application writer is now able to provide a set of
GInterfaces the plugin writer will be able to implement as his plugin requires.

The problem is that although PyGObject allows importing interfaces from libraries written in C, like Peas.Activatable, it is not possible to define new interfaces in Python. So we will be limited to using only Peas.Activatable, and thus probably to extending the application by accessing the GApplication directly.

I have been investigating the problem in PyGObject, but I was told to stop, probably because it may take too much time. I have been reading PyGObject’s source code and I have an idea of how to solve this problem. I think the solution is to add a metaclass to GObject.GInterface so that, as soon as a new interface is about to be defined, a new GType and a new GInterfaceInfo get registered. However, GObject.GInterface is actually written in C, and to be honest I had no idea how to do that in C, but I knew I could get to a solution by investigating. I knew the solution was to add a metaclass, but I couldn’t find much information about that, so I asked in the Python IRC channel. According to Ned Batchelder (nedbat), I was doing “something very very esoteric”, but after some discussion I had an idea. I could finally add a metaclass to GObject.GInterface, so I think I am on the right track, but I know it will take time. So, as my mentor and the Pitivi maintainers advised, I will not give it too much importance for now, and Pitivi will only have extension sets implementing Peas.Activatable for the while.

The case of GNOME Builder

I think that a clear example of the use of GInterface to add extension points is GNOME Builder. Gedit is also a good example, but it does not have as many extension points as GNOME Builder does. I have been reading the source code of GNOME Builder; it defines multiple interfaces:

[cfoch@localhost gnome-builder]$ find -name *-addin.c  | head -10
./libide/buildconfig/ide-buildconfig-pipeline-addin.c
./libide/editor/ide-editor-workbench-addin.c
./libide/editor/ide-editor-view-addin.c
./libide/editor/ide-editor-layout-stack-addin.c
./libide/buildsystem/ide-build-pipeline-addin.c
./libide/workbench/ide-workbench-addin.c
./libide/workbench/ide-layout-stack-addin.c
./libide/preferences/ide-preferences-addin.c
./libide/buildui/ide-build-workbench-addin.c
./libide/genesis/ide-genesis-addin.c

For example, you can see that IdeWorkbenchAddin defines the following vfuncs:

struct _IdeWorkbenchAddinInterface
{
  GTypeInterface parent;

  gchar    *(*get_id)          (IdeWorkbenchAddin      *self);
  void      (*load)            (IdeWorkbenchAddin      *self,
                                IdeWorkbench           *workbench);
  void      (*unload)          (IdeWorkbenchAddin      *self,
                                IdeWorkbench           *workbench);
  gboolean  (*can_open)        (IdeWorkbenchAddin      *self,
                                IdeUri                 *uri,
                                const gchar            *content_type,
                                gint                   *priority);
  void      (*open_async)      (IdeWorkbenchAddin      *self,
                                IdeUri                 *uri,
                                const gchar            *content_type,
                                IdeWorkbenchOpenFlags   flags,
                                GCancellable           *cancellable,
                                GAsyncReadyCallback     callback,
                                gpointer                user_data);
  gboolean  (*open_finish)     (IdeWorkbenchAddin      *self,
                                GAsyncResult           *result,
                                GError                **error);
  void      (*perspective_set) (IdeWorkbenchAddin      *self,
                                IdePerspective         *perspective);
};

Having an interface like this one avoids exposing everything through direct access to the main application. In GNOME Builder, in libide/workbench/ide-workbench.c, an extension set is created so that all extensions implementing this interface can do what they are asked to do from this file, by calling the methods of IdeWorkbenchAddinInterface:

  self->addins = peas_extension_set_new (peas_engine_get_default (),
                                         IDE_TYPE_WORKBENCH_ADDIN,
                                         NULL);

For example, to set the perspective: different plugins may have different ways of setting it.

  if (self->addins != NULL)
    peas_extension_set_foreach (self->addins,
                                ide_workbench_notify_perspective_set,
                                perspective);

And the plugins that implement these interfaces don’t need to know about other types (like the application). They just care about the perspective (and whatever other objects are passed as arguments to the interface’s virtual functions).

static void
ide_workbench_notify_perspective_set (PeasExtensionSet *set,
                                      PeasPluginInfo   *plugin_info,
                                      PeasExtension    *exten,
                                      gpointer          user_data)
{
  IdeWorkbenchAddin *addin = (IdeWorkbenchAddin *)exten;
  IdePerspective *perspective = user_data;

  g_assert (PEAS_IS_EXTENSION_SET (set));
  g_assert (plugin_info != NULL);
  g_assert (IDE_IS_WORKBENCH_ADDIN (addin));
  g_assert (IDE_IS_PERSPECTIVE (perspective));

  ide_workbench_addin_perspective_set (addin, perspective);
}

 

Pitivi: UI for the Ken-Burns effect

It’s been three weeks since the coding period for GSOC 2017 started, so it’s time to show the world the progress I made. A short recap: I’ve been working on building a user interface which allows simulating the Ken-Burns effect and other similar effects in Pitivi. The idea is to allow adding keyframes on x, y, width, height properties of a clip, much like we are doing with other effects.

Fortunately, my mentor, Thibault Saunier, implemented this feature about 2 years ago, but a rebase of that branch was impossible, as the codebase underwent a lot of changes in the meantime. Even so, having his work as a guideline allowed me to move pretty fast. By now, I’ve implemented an interface that can be used to add and remove keyframes on transformation properties, using the transformation box to specify values at various timestamps. Here is a short demo:

What remains to be done is integrate the newly added feature with the undo/redo system, as well as allow users to specify values at various timestamps by interacting with the viewer. I encourage you to keep reading my blog for further updates. You can also check out my branch.

Until next time!

 

 


GSOC 2017: And so it begins

Hey everyone,

This is the beginning of what seems to be a really exciting summer. Why, you ask? Well, it’s because I’m going to spend most of it as a true GNOME contributor, working on a really cool project.

But let’s start from the beginning. Only 4 months ago, I was making my first steps as a contributor in the open-source world. One of the first things I discovered is how amazing and helpful the GNOME community is. I started by trying out a lot of GNOME apps and looking through the code behind them, and that’s how I discovered Pitivi, a really great video editing solution. After my first patch on Pitivi got accepted, I was really hooked. Fast forward a couple of patches, and now I have the opportunity and great pleasure to work on my own project, UI for the Ken Burns effect, after being accepted for Google Summer of Code 2017. In this amazing journey, I’ve had some great mentoring: special thanks to Thibault Saunier (thiblahute), who is also my current mentor for GSOC 2017, and Alexandru Balut (aleb), who helped me along the way.

The goal of my GSOC project is to allow Pitivi users to create effects based on the positioning and zoom of their clips (like the Ken Burns effect). More precisely, when this project is completed, Pitivi will support adding values for the position and zoom of a clip at various timestamps and smoothly transitioning between these values.

I’m expecting a lot of fun while working on this project and I encourage you to keep reading my blog for further updates. 🙂

Happy coding!


Code Search for GNOME Builder : GSOC 2017

I am very happy to be part of GNOME and Google Summer of Code 2017. First of all, thank you to all the GNOME members for giving me this opportunity, and to Christian Hergert for mentoring and helping me with this project. In this post I will introduce my project and my approach to doing it.

The goal of the project is to enhance Go to Definition and Global Search in GNOME Builder for C/C++ projects. Currently in GNOME Builder, using Go to Definition, one can go from a reference of a symbol to its definition only if the definition is in the current file or in a file included by the current file. In this project, Go to Definition will be enhanced so that one can go from a reference of a symbol to its definition wherever it is in the project. Global Search will also be enhanced to allow fuzzily searching all symbols in the project right from the search bar.

Approaches:

These are the approaches for implementing Go to Definition and Global Search.

Go to Definition:

In C/C++ files we have declarations/definitions of various types of symbols, like variables, functions, structs, enums, unions, classes, types and many more. Each declaration has a name and some scope, and no two declarations of the same type of symbol in the same scope will have the same name (with some exceptions, like function overloading). Using this idea, a key can be generated for a declaration: key = scope + type + name. So we can use this key to uniquely identify a declaration. Thankfully, the libclang library generates this key for a declaration for us, making our lives easy.

Let’s take two files:

1.c

int foo ()
{
  return 0;
}

2.c

int foo ();
int bar ()
{
  int a;
  a = foo ();
  return a;
}

If we want to go to the definition of foo() from 2.c, line 5, first we go to the declaration present in 2.c and generate the key for it: key = File Scope + Function + foo = ::F::foo. (Going from a reference to its declaration in the same file can be done by libclang directly.) Now we use this key and search every file for a definition with the same key. If the search key matches the key of a definition, then that is the definition of our reference. If we form a database of the keys of all declarations, then we can find the definition of a reference very quickly. This is how Go to Definition will be implemented in GNOME Builder.
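To illustrate the idea (this is only a sketch using libclang’s Python bindings, not Builder’s actual implementation; the key described above corresponds to what libclang calls a USR):

import clang.cindex

def build_definition_index(files):
    """Map each definition's key (its USR) to the location of the definition."""
    index = clang.cindex.Index.create()
    definitions = {}
    for path in files:
        tu = index.parse(path)
        for cursor in tu.cursor.walk_preorder():
            if cursor.is_definition() and cursor.get_usr() and cursor.location.file:
                loc = cursor.location
                definitions[cursor.get_usr()] = (loc.file.name, loc.line)
    return definitions

def find_definition(reference_cursor, definitions):
    """Go from a reference to its definition via the shared key."""
    decl = reference_cursor.referenced  # declaration in the same file
    if decl is None:
        return None
    return definitions.get(decl.get_usr())

# Hypothetical usage:
# defs = build_definition_index(['1.c', '2.c'])
# location = find_definition(cursor_under_the_editor_caret, defs)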

Global Search:

The goal here is to search all symbols in the project fuzzily. To do that, the names of all declarations in all files will be extracted by traversing the clang AST and stored in a database. Whenever we get a query, we search this database for fuzzy matches. For storing all the names, the fuzzy index implementation in the libdazzle library, written by Christian Hergert (the mentor for my project), will be used. This is the idea for implementing Global Search.
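libdazzle’s fuzzy index is of course much more sophisticated, but the basic notion of a fuzzy match is roughly this (a toy sketch, not libdazzle’s algorithm):

def fuzzy_match(query, candidate):
    """True if the characters of `query` appear, in order, in `candidate`."""
    it = iter(candidate.lower())
    return all(ch in it for ch in query.lower())

# fuzzy_match("iwa", "IdeWorkbenchAddin")  -> True
# fuzzy_match("xyz", "IdeWorkbenchAddin")  -> False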

To do the above two things, an index of the code first needs to be created. This will be done by creating a new plugin in Builder which will perform the indexing. In conclusion, the implementation of the project is divided into three tasks:

  1. Indexing: Creating a plugin which will index the project and store that index on disk.
  2. Global Search: Implementing an interface which will search the index for a name and show fuzzy matches.
  3. Go to Definition: Implementing an interface which will search the index for a key and give the location of the corresponding definition.

Now I am working on the first part, and here is the progress in the implementation so far: patch. This patch does indexing, but a few things need to be improved, like doing the indexing in a subprocess rather than in the main process. Once the first part is complete, in my next post I will describe how the indexing is done.


GSOC Magic – Story so far

GSOC is a very special kind of magic that was founded by a great wizard of our time called Google. Even though I’m an arch mage, my knowledge of this magic is less than I want it to be. It’s time I changed that.

I always had an itch to get into GSOC. Somewhere at the end of 2016, I was browsing through past GSOC organizations and discovered GNOME. Even though I was already using GNOME products, I didn’t have much knowledge about the organization. I did background research on a few organizations of interest, but for some reason I liked GNOME more than the others, so I decided to commit to GNOME. Then there was the challenge of choosing a project to my liking. However, it didn’t take me much time to find one.

From the day I got a PC at the age of 8, I was passionate about gaming. As time passed, it turned out I’m passionate about engineering as well. Therefore I always wanted to do something that combines these two. So when I was going through GNOME projects, I had no doubt which one I should pursue.

With a lot of help from a lot of people on the IRC channels, I managed to build GNOME Games through JHBuild. Then I got right into understanding the code base and learning about the project of my liking, which was Making Nintendo DS a First Class Citizen. I was getting (and still get) a lot of help from Adrien Plazas, the announced mentor and the head of gnome-games. After a lot of hard work, I managed to fix a small bug, and it was my first contribution to GNOME and to open source in general.

Then it was time to craft my proposal. I didn’t have any experience writing a GSOC proposal, as this was my first time trying for GSOC. I tried to craft a good one to the best of my ability. All I did was be completely honest and describe my whole learning experience since the day I was intrigued by GNOME.

Currently, I’m working on getting to know more about gnome-games and GNOME in general. Whether I get selected or not won’t affect my interest in gnome-games or my desire to be an open source contributor, because this is just fun 🙂


Setting Alt-Tab behavior in gnome-shell

After updating my distro a few months ago, I somehow lost my tweaks to the Alt-Tab behavior in gnome-shell.

The default is to have Alt-Tab switch you between applications in the current workspace. One can use Alt-backtick (or whatever key you have above Tab) to switch between windows in the current application.

I prefer a Windows-like setup, where Alt-Tab switches between windows in the current workspace, regardless of the application to which they belong.

Many moons ago there was a gnome-shell extension to change this behavior, but these days (GNOME 3.24) it can be done without extensions. It is a bit convoluted.

With the GUI

If you are using X instead of Wayland, this works:

  1. Unset the Switch applications command. To do this, run gnome-control-center, go to Keyboard, and find the Switch applications command. Click on it, and hit Backspace in the dialog that prompts you for the keyboard shortcut. Click on the Set button.

  2. Set the Switch windows command. While still in the Keyboard settings, find the Switch windows command. Click on it, and hit Alt-Tab. Click Set.

That should be all you need, unless you are in Wayland. In that case, you need to do it on the command line.

With the command line, or in Wayland

The kind people on #gnome-hackers tell me that as of GNOME 3.24, changing Alt-Tab doesn't work on Wayland as in (2) above, because the compositor captures the Alt-Tab key when you type it inside the dialog that prompts you for a keyboard shortcut. In that case, you have to change the configuration keys directly instead of using the GUI:

gsettings set org.gnome.desktop.wm.keybindings switch-applications "[]"
gsettings set org.gnome.desktop.wm.keybindings switch-applications-backward "[]"
gsettings set org.gnome.desktop.wm.keybindings switch-windows "['<Alt>Tab', '<Super>Tab']"
gsettings set org.gnome.desktop.wm.keybindings switch-windows-backward  "['<Alt><Shift>Tab', '<Super><Shift>Tab']"

Of course the above works in X, too.

Changing windows across all workspaces

If you'd like to switch between windows in all workspaces, rather than in the current workspace, find the org.gnome.shell.window-switcher current-workspace-only GSettings key and change it. You can do this in dconf-editor, or on the command line with

gsettings set org.gnome.shell.window-switcher current-workspace-only true

reTrumplation, a Twitter bot experiment

A few years ago, somebody introduced me to Translation Party, a website which automatically translates a sentence back and forth until further translations produce the same English text. The results are mostly funny nonsense.

Recently at work we were talking about automatic translations, so I thought it could be funny to use the same principle for a Twitter bot which works on Donald Trump’s many tweets. The result is @reTrumplation.
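The core of the bot is just a fixed-point loop over machine translation. Here is a sketch of the principle (translate() is a placeholder for whatever translation API the bot actually calls, and the intermediate language is just an illustrative choice):

def translate(text, source, target):
    """Placeholder for a real machine-translation API call."""
    raise NotImplementedError

def retrumplate(tweet, other_language='ja', max_rounds=20):
    """Translate back and forth until the English text stops changing."""
    current = tweet
    for _ in range(max_rounds):
        foreign = translate(current, 'en', other_language)
        back = translate(foreign, other_language, 'en')
        if back == current:  # reached the fixed point
            return back
        current = back
    return current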

@reTrumplation, first example

@reTrumplation, second example

June 21, 2017

Midsommer Maps

So we just released the third development release of Maps in the 3.25 series (leading up to 3.26.0 in September).

Some noteworthy new features and fixes made it in. We gained a couple of new keyboard shortcuts:


Control+1 and Control+2 to switch the view between street (the “ordinary” map) and aerial view (these shortcuts were inspired by Nautilus), and Control+o to open shape layers (we had a bug report suggesting adding this feature, which indicates it might not have been discoverable enough). Using the standard “open a file” shortcut makes sense, since that might be something you may try without thinking too much about it. And of course it will also show up in the help overlay (as pictured above).

Furthermore, we now remember the mode of transport used for routing between runs, so it no longer reverts to car every time you start Maps (and it also uses the currently set mode when routing to a place from a marker bubble in the view).

Another annoying thing we've fixed is that we no longer reset the list of found itineraries when doing a transit route search and you click on “Load later alternatives” (or earlier) and no more results were found; instead we now show a notification and keep the previous results:


as can be seen in this screenshot showing results for a bus route which only runs on limited days.

We now also, thanks to work done by Robert Ancell, show religion information (for places of worship and book stores, for example) and information on the availability of toilets (when the data is provided in OSM, of course). These can also be edited for POIs in OSM.

And it didn't stop there: after the release another feature, implemented by Elias Entrup and Neha Yadav, has landed. The search popover no longer just disappears when no results are found, but rather shows an indication that nothing was found:


I also managed to mess up a little with the “remember the mode of transport” feature, so it was a little broken when currently using transit and clicking the route button in a place “bubble”. That has since been fixed in master.

And there were certainly some more things I have already forgotten :-)

But the control center still crashes..

In the last milestone, adding a TODOist provider to gnome-online-accounts looked like it had a bug which made the control-center crash. Here’s a brief description of what happened:

  • Run the control center.
  • Add the TODOist account.
      • The dialog to enter the credentials opens.
      • Once you get past the above step, it asks for permission to access resources from your account.
      • Granting the permissions, the accounts.conf file gets written and the account gets added.

    And BAM!! it crashes.

    Next time you run the control-center right away, the account would be there showing up all nice.

Hence we go to GDB to seek help about what’s the matter with gnome-online-accounts or the control-center. Now, GDB’s stack trace (which I’ve lost now and can’t reproduce due to some other graphics problems) pointed out some problem with the Intel graphics drivers, which I obviously did not know how to overcome.

Then, in a discussion with mclasen about this, he pointed out a workaround for the bug, which was running the control center with the prefix

GDK_BACKEND=x11 

and yay! it works now. This had something to do with the Wayland backend creating some issues with the Intel graphics drivers, which I cannot really decode at the moment ¯\_(ツ)_/¯

And closing this one with a realization I had after this bug:

“All software has bugs, and a working system is just one where most bugs are not triggered too frequently” –mclasen

All Systems Go! 2017 CfP Open

The All Systems Go! 2017 Call for Participation is Now Open!

We’d like to invite presentation proposals for All Systems Go! 2017!

All Systems Go! is an Open Source community conference focused on the projects and technologies at the foundation of modern Linux systems — specifically low-level user-space technologies. Its goal is to provide a friendly and collaborative gathering place for individuals and communities working to push these technologies forward.

All Systems Go! 2017 takes place in Berlin, Germany on October 21st+22nd.

All Systems Go! is a 2-day event with 2-3 talks happening in parallel. Full presentation slots are 30-45 minutes in length and lightning talk slots are 5-10 minutes.

We are now accepting submissions for presentation proposals. In particular, we are looking for sessions including, but not limited to, the following topics:

  • Low-level container executors and infrastructure
  • IoT and embedded OS infrastructure
  • OS, container, IoT image delivery and updating
  • Building Linux devices and applications
  • Low-level desktop technologies
  • Networking
  • System and service management
  • Tracing and performance measuring
  • IPC and RPC systems
  • Security and Sandboxing

While our focus is definitely more on the user-space side of things, talks about kernel projects are welcome too, as long as they have a clear and direct relevance for user-space.

Please submit your proposals by September 3rd. Notification of acceptance will be sent out 1-2 weeks later.

To submit your proposal now please visit our CFP submission web site.

For further information about All Systems Go! visit our conference web site.

systemd.conf will not take place this year; it made way for All Systems Go!. All Systems Go! welcomes all projects that contribute to Linux user space, which, of course, includes systemd. Thus, anything you think was appropriate for submission to systemd.conf is also fitting for All Systems Go!

2017-06-21 Wednesday.

  • Mail chew; massaged priorities. Thrilled to see the great work from the team ship in Collabora Online 2.1.2 with avatars, and various other UI features; well worth an upgrade.

June 20, 2017

2017-06-20 Tuesday.

  • Consultancy call, mail chew, built ESC agenda. Lunch. Commercial call; super hot - and 3x active building sites two next-door, and one across the road: wow.
  • Worked lateish and got my hash to wire-id conversion working nicely in online; more efficient and more readable.

casync — A tool for distributing file system images

Introducing casync

In the past months I have been working on a new project: casync. casync takes inspiration from the popular rsync file synchronization tool as well as the probably even more popular git revision control system. It combines the idea of the rsync algorithm with the idea of git-style content-addressable file systems, and creates a new system for efficiently storing and delivering file system images, optimized for high-frequency update cycles over the Internet. Its current focus is on delivering IoT, container, VM, application, portable service or OS images, but I hope to extend it later in a generic fashion to become useful for backups and home directory synchronization as well (but more about that later).

The basic technological building blocks casync is built from are neither new nor particularly innovative (at least not anymore), however the way casync combines them is different from existing tools, and that's what makes it useful for a variety of use-cases that other tools can't cover that well.

Why?

I created casync after studying how today's popular tools store and deliver file system images. To briefly name a few: Docker has a layered tarball approach, OSTree serves the individual files directly via HTTP and maintains packed deltas to speed up updates, while other systems operate on the block layer and place raw squashfs images (or other archival file systems, such as ISO9660) for download on HTTP shares (in the better cases combined with zsync data).

Neither of these approaches appeared fully convincing to me when used in high-frequency update cycle systems. In such systems, it is important to optimize towards a couple of goals:

  1. Most importantly, make updates cheap traffic-wise (for this most tools use image deltas of some form)
  2. Put boundaries on disk space usage on servers (keeping deltas between all version combinations clients might want to run updates between, would suggest keeping an exponentially growing amount of deltas on servers)
  3. Put boundaries on disk space usage on clients
  4. Be friendly to Content Delivery Networks (CDNs), i.e. serve neither too many small nor too many overly large files, and only require the most basic form of HTTP. Provide the repository administrator with high-level knobs to tune the average file size delivered.
  5. Simplicity to use for users, repository administrators and developers

I don't think any of the tools mentioned above are really good on more than a small subset of these points.

Specifically: Docker's layered tarball approach dumps the "delta" question onto the feet of the image creators: the best way to make your image downloads minimal is basing your work on an existing image clients might already have, and inherit its resources, maintaining full history. Here, revision control (a tool for the developer) is intermingled with update management (a concept for optimizing production delivery). As container histories grow individual deltas are likely to stay small, but on the other hand a brand-new deployment usually requires downloading the full history onto the deployment system, even though there's no use for it there, and likely requires substantially more disk space and download sizes.

OSTree's serving of individual files is unfriendly to CDNs (as many small files in file trees cause an explosion of HTTP GET requests). To counter that OSTree supports placing pre-calculated delta images between selected revisions on the delivery servers, which means a certain amount of revision management, that leaks into the clients.

Delivering direct squashfs (or other file system) images is almost beautifully simple, but of course means every update requires a full download of the newest image, which is both bad for disk usage and generated traffic. Enhancing it with zsync makes this a much better option, as it can reduce generated traffic substantially at very little cost of history/meta-data (no explicit deltas between a large number of versions need to be prepared server side). On the other hand server requirements in disk space and functionality (HTTP Range requests) are minus points for the use-case I am interested in.

(Note: all the mentioned systems have great properties, and it's not my intention to badmouth them. The only point I am trying to make is that for the use case I care about — file system image delivery with high-frequency update-cycles — each system comes with certain drawbacks.)

Security & Reproducibility

Besides the issues pointed out above I wasn't happy with the security and reproducibility properties of these systems. In today's world where security breaches involving hacking and breaking into connected systems happen every day, an image delivery system that cannot make strong guarantees regarding data integrity is out of date. Specifically, the tarball format is famously nondeterministic: the very same file tree can result in any number of different valid serializations depending on the tool used, its version and the underlying OS and file system. Some tar implementations attempt to correct that by guaranteeing that each file tree maps to exactly one valid serialization, but such a property is always only specific to the tool used. I strongly believe that any good update system must guarantee on every single link of the chain that there's only one valid representation of the data to deliver, that can easily be verified.

What casync Is

So much about the background why I created casync. Now, let's have a look what casync actually is like, and what it does. Here's the brief technical overview:

Encoding: Let's take a large linear data stream, split it into variable-sized chunks (the size of each being a function of the chunk's contents), and store these chunks in individual, compressed files in some directory, each file named after a strong hash value of its contents, so that the hash value may be used as a key for retrieving the full chunk data. Let's call this directory a "chunk store". At the same time, generate a "chunk index" file that lists these chunk hash values plus their respective chunk sizes in a simple linear array. The chunking algorithm is supposed to create variable, but similarly sized chunks from the data stream, and do so in a way that the same data results in the same chunks even if placed at varying offsets. For more information see this blog story.

Decoding: Let's take the chunk index file, and reassemble the large linear data stream by concatenating the uncompressed chunks retrieved from the chunk store, keyed by the listed chunk hash values.

As an extra twist, we introduce a well-defined, reproducible, random-access serialization format for file trees (think: a more modern tar), to permit efficient, stable storage of complete file trees in the system, simply by serializing them and then passing them into the encoding step explained above.

Finally, let's put all this on the network: for each image you want to deliver, generate a chunk index file and place it on an HTTP server. Do the same with the chunk store, and share it between the various index files you intend to deliver.

Why bother with all of this? Streams with similar contents will result in mostly the same chunk files in the chunk store. This means it is very efficient to store many related versions of a data stream in the same chunk store, thus minimizing disk usage. Moreover, when transferring linear data streams chunks already known on the receiving side can be made use of, thus minimizing network traffic.

Why is this different from rsync or OSTree, or similar tools? Well, one major difference between casync and those tools is that we remove file boundaries before chunking things up. This means that small files are lumped together with their siblings and large files are chopped into pieces, which permits us to recognize similarities in files and directories beyond file boundaries, and makes sure our chunk sizes are pretty evenly distributed, without the file boundaries affecting them.

The "chunking" algorithm is based on a the buzhash rolling hash function. SHA256 is used as strong hash function to generate digests of the chunks. xz is used to compress the individual chunks.

Here's a diagram that would hopefully explain a bit how the encoding process works, were it not for my crappy drawing skills:

Diagram

The diagram shows the encoding process from top to bottom. It starts with a block device or a file tree, which is then serialized and chunked up into variable sized blocks. The compressed chunks are then placed in the chunk store, while a chunk index file is written listing the chunk hashes in order. (The original SVG of this graphic may be found here.)

Details

Note that casync operates on two different layers, depending on the use-case of the user:

  1. You may use it on the block layer. In this case the raw block data on disk is taken as-is, read directly from the block device, split into chunks as described above, compressed, stored and delivered.

  2. You may use it on the file system layer. In this case, the file tree serialization format mentioned above comes into play: the file tree is serialized depth-first (much like tar would do it) and then split into chunks, compressed, stored and delivered.

The fact that it may be used on both the block and file system layer opens it up for a variety of different use-cases. In the VM and IoT ecosystems shipping images as block-level serializations is more common, while in the container and application world file-system-level serializations are more typically used.

Chunk index files referring to block-layer serializations carry the .caibx suffix, while chunk index files referring to file system serializations carry the .caidx suffix. Note that you may also use casync as direct tar replacement, i.e. without the chunking, just generating the plain linear file tree serialization. Such files carry the .catar suffix. Internally .caibx are identical to .caidx files, the only difference is semantical: .caidx files describe a .catar file, while .caibx files may describe any other blob. Finally, chunk stores are directories carrying the .castr suffix.

Features

Here are a couple of other features casync has:

  1. When downloading a new image you may use casync's --seed= feature: each block device, file, or directory specified is processed using the same chunking logic described above, and is used as preferred source when putting together the downloaded image locally, avoiding network transfer of it. This of course is useful whenever updating an image: simply specify one or more old versions as seed and only download the chunks that truly changed since then. Note that using seeds requires no history relationship between seed and the new image to download. This has major benefits: you can even use it to speed up downloads of relatively foreign and unrelated data. For example, when downloading a container image built using Ubuntu you can use your Fedora host OS tree in /usr as seed, and casync will automatically use whatever it can from that tree, for example timezone and locale data that tends to be identical between distributions. Example: casync extract http://example.com/myimage.caibx --seed=/dev/sda1 /dev/sda2. This will place the block-layer image described by the indicated URL in the /dev/sda2 partition, using the existing /dev/sda1 data as seeding source. An invocation like this could be typically used by IoT systems with an A/B partition setup. Example 2: casync extract http://example.com/mycontainer-v3.caidx --seed=/srv/container-v1 --seed=/srv/container-v2 /src/container-v3, is very similar but operates on the file system layer, and uses two old container versions to seed the new version.

  2. When operating on the file system level, the user has fine-grained control on the meta-data included in the serialization. This is relevant since different use-cases tend to require a different set of saved/restored meta-data. For example, when shipping OS images, file access bits/ACLs and ownership matter, while file modification times hurt. When doing personal backups OTOH file ownership matters little but file modification times are important. Moreover different backing file systems support different feature sets, and storing more information than necessary might make it impossible to validate a tree against an image if the meta-data cannot be replayed in full. Due to this, casync provides a set of --with= and --without= parameters that allow fine-grained control of the data stored in the file tree serialization, including the granularity of modification times and more. The precise set of selected meta-data features is also always part of the serialization, so that seeding can work correctly and automatically.

  3. casync tries to be as accurate as possible when storing file system meta-data. This means that besides the usual baseline of file meta-data (file ownership and access bits), and more advanced features (extended attributes, ACLs, file capabilities), a number of more exotic attributes are stored as well, including Linux chattr(1) file attributes, as well as FAT file attributes (you may wonder why the latter? — EFI is FAT, and /efi is part of the comprehensive serialization of any host). In the future I intend to extend this further, for example storing btrfs sub-volume information where available. Note that as described above every single type of meta-data may be turned off and on individually, hence if you don't need FAT file bits (and I figure it's pretty likely you don't), then they won't be stored.

  4. The user creating .caidx or .caibx files may control the desired average chunk length (before compression) freely, using the --chunk-size= parameter. Smaller chunks increase the number of generated files in the chunk store and increase HTTP GET load on the server, but also ensure that sharing between similar images is improved, as identical patterns in the images stored are more likely to be recognized. By default casync will use a 64K average chunk size. Tweaking this can be particularly useful when adapting the system to specific CDNs, or when delivering compressed disk images such as squashfs (see below). An example invocation is included in the command line examples further below.

  5. Emphasis is placed on making all invocations reproducible, well-defined and strictly deterministic. As mentioned above this is a requirement to reach the intended security guarantees, but is also useful for many other use-cases. For example, the casync digest command may be used to calculate a hash value identifying a specific directory in all desired detail (use --with= and --without= to pick the desired detail). Moreover the casync mtree command may be used to generate a BSD mtree(5) compatible manifest of a directory tree, .caidx or .catar file.

  6. The file system serialization format is nicely composable. By this I mean that the serialization of a file tree is the concatenation of the serializations of all files and file sub-trees located at the top of the tree, with zero meta-data references from any of these serializations into the others. This property is essential to ensure maximum reuse of chunks when similar trees are serialized.

  7. When extracting file trees or disk image files, casync will automatically create reflinks from any specified seeds if the underlying file system supports it (such as btrfs, ocfs, and future xfs). After all, instead of copying the desired data from the seed, we can just tell the file system to link up the relevant blocks. This works both when extracting .caidx and .caibx files — the latter of course only when the extracted disk image is placed in a regular raw image file on disk, rather than directly on a plain block device, as plain block devices do not know the concept of reflinks.

  8. Optionally, when extracting file trees, casync can create traditional UNIX hard-links for identical files in specified seeds (--hardlink=yes). This works on all UNIX file systems, and can save substantial amounts of disk space. However, this only works for very specific use-cases where disk images are considered read-only after extraction, as any changes made to one tree will propagate to all other trees sharing the same hard-linked files, as that's the nature of hard-links. In this mode, casync exposes OSTree-like behavior, which is built heavily around read-only hard-link trees.

  9. casync tries to be smart when choosing what to include in file system images. Implicitly, file systems such as procfs and sysfs are excluded from serialization, as they expose API objects, not real files. Moreover, the "nodump" (+d) chattr(1) flag is honored by default, permitting users to mark files to exclude from serialization.

  10. When creating and extracting file trees casync may apply an automatic or explicit UID/GID shift. This is particularly useful when transferring container images for use with Linux user namespacing.

  11. In addition to local operation, casync currently supports HTTP, HTTPS, FTP and ssh natively for downloading chunk index files and chunks (the ssh mode requires installing casync on the remote host, but an sftp mode not requiring that should be easy to add). When creating index files or chunks, only ssh is supported as a remote back-end.

  12. When operating on block-layer images, you may expose locally or remotely stored images as local block devices. Example: casync mkdev http://example.com/myimage.caibx exposes the disk image described by the indicated URL as a local block device in /dev, which you may then use the usual block device tools on, such as mount or fdisk (read-only, though). Chunks are downloaded with high priority on access, and at low priority in the background when idle. Note that in this mode, casync also plays a role similar to "dm-verity", as all blocks are validated against the strong digests in the chunk index file before passing them on to the kernel's block layer. This feature is implemented through Linux' NBD kernel facility.

  13. Similarly, when operating on file-system-layer images, you may mount locally or remotely stored images as regular file systems. Example: casync mount http://example.com/mytree.caidx /srv/mytree mounts the file tree image described by the indicated URL as a local directory /srv/mytree. This feature is implemented through Linux' FUSE kernel facility. Note that special care is taken that the images exposed this way can be packed up again with casync make and are guaranteed to return the bit-by-bit identical serialization they were mounted from. No data is lost or changed while passing things through FUSE (OK, strictly speaking this is a lie, we do lose ACLs, but that's hopefully just a temporary gap to be fixed soon).

  14. In IoT A/B fixed size partition setups the file systems placed in the two partitions are usually much smaller than the partition size, in order to keep some room for later, larger updates. casync is able to analyze the super-block of a number of common file systems in order to determine the actual size of a file system stored on a block device, so that writing a file system to such a partition and reading it back again will result in reproducible data. Moreover this speeds up the seeding process, as there's little point in seeding the empty space after the file system within the partition.

Example Command Lines

Here's how to use casync, explained with a few examples:

$ casync make foobar.caidx /some/directory

This will create a chunk index file foobar.caidx in the local directory, and populate the chunk store directory default.castr located next to it with the chunks of the serialization (you can change the name for the store directory with --store= if you like). This command operates on the file-system level. A similar command operating on the block level:

$ casync make foobar.caibx /dev/sda1

This command creates a chunk index file foobar.caibx in the local directory describing the current contents of the /dev/sda1 block device, and populates default.castr in the same way as above. Note that you may as well read a raw disk image from a file instead of a block device:

$ casync make foobar.caibx myimage.raw
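
As mentioned in the feature list above, the --chunk-size= option may be added to any of these make invocations. As a hedged illustration (the exact value syntax accepted by the option is an assumption on my part; check casync --help for the authoritative form), this picks a 16K average chunk size:

$ casync make --chunk-size=16384 foobar.caidx /some/directory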

To reconstruct the original file tree from the .caidx file and the chunk store of the first command, use:

$ casync extract foobar.caidx /some/other/directory

And similar for the block-layer version:

$ casync extract foobar.caibx /dev/sdb1

or, to extract the block-layer version into a raw disk image:

$ casync extract foobar.caibx myotherimage.raw

The above are the most basic commands, operating on local data only. Now let's make this more interesting, and reference remote resources:

$ casync extract http://example.com/images/foobar.caidx /some/other/directory

This extracts the specified .caidx onto a local directory. This of course assumes that foobar.caidx was uploaded to the HTTP server in the first place, along with the chunk store. You can use any command you like to accomplish that, for example scp or rsync. Alternatively, you can let casync do this directly when generating the chunk index:

$ casync make ssh.example.com:images/foobar.caidx /some/directory

This will use ssh to connect to the ssh.example.com server, and then place the .caidx file and the chunks on it. Note that this mode of operation is "smart": it will only upload chunks currently missing on the server side, and not re-transmit what is already available.

Note that you can always configure the precise path or URL of the chunk store via the --store= option. If you do not do that, then the store path is automatically derived from the path or URL: the last component of the path or URL is replaced by default.castr.
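
For example, to place the chunks in an explicitly named store rather than default.castr (a hedged illustration, with made-up paths):

$ casync make --store=/var/tmp/mystore.castr foobar.caidx /some/directory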

Of course, when extracting .caidx or .caibx files from remote sources, using a local seed is advisable:

$ casync extract http://example.com/images/foobar.caidx --seed=/some/existing/directory /some/other/directory

Or on the block layer:

$ casync extract http://example.com/images/foobar.caibx --seed=/dev/sda1 /dev/sdb2
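
When extracting on the file system layer, the --hardlink=yes mode mentioned in the feature list above can be combined with seeding; a hedged example (paths made up), suitable only for trees treated as read-only afterwards:

$ casync extract --hardlink=yes http://example.com/mycontainer-v3.caidx --seed=/srv/container-v2 /srv/container-v3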

When creating chunk indexes on the file system layer casync will by default store meta-data as accurately as possible. Let's create a chunk index with reduced meta-data:

$ casync make foobar.caidx --with=sec-time --with=symlinks --with=read-only /some/dir

This command will create a chunk index for a file tree serialization that has three features above the absolute baseline supported: 1s granularity time-stamps, symbolic links and a single read-only bit. In this mode, all the other meta-data bits are not stored, including nanosecond time-stamps, full UNIX permission bits, file ownership or even ACLs or extended attributes.

Now let's make a .caidx file available locally as a mounted file system, without extracting it:

$ casync mount http://example.com/images/foobar.caidx /mnt/foobar

And similar, let's make a .caibx file available locally as a block device:

$ casync mkdev http://example.com/images/foobar.caibx

This will create a block device in /dev and print the used device node path to STDOUT.
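
If the image contains a file system, a typical follow-up would be to mount the device read-only (the /dev/nbd0 node name here is just an illustration; use whatever path casync printed):

$ mount -o ro /dev/nbd0 /mnt/foobar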

As mentioned, casync is big about reproducibility. Let's make use of that to calculate a digest identifying a very specific version of a file tree:

$ casync digest .

This digest will include all meta-data bits casync and the underlying file system know about. Usually, to make this useful you want to configure exactly what meta-data to include:

$ casync digest --with=unix .

This makes use of the --with=unix shortcut for selecting meta-data fields. Specifying --with=unix selects all meta-data that traditional UNIX file systems support. It is a shortcut for writing out: --with=16bit-uids --with=permissions --with=sec-time --with=symlinks --with=device-nodes --with=fifos --with=sockets.

Note that when calculating digests or creating chunk indexes you may also use the negative --without= option to remove specific features while starting from the most precise set:

$ casync digest --without=flag-immutable

This generates a digest with the most accurate meta-data, but leaves one feature out: chattr(1)'s immutable (+i) file flag.

To list the contents of a .caidx file use a command like the following:

$ casync list http://example.com/images/foobar.caidx

or

$ casync mtree http://example.com/images/foobar.caidx

The former command will generate a brief list of files and directories, not too different from tar t or ls -al in its output. The latter command will generate a BSD mtree(5) compatible manifest. Note that casync actually stores substantially more file meta-data than mtree files can express, though.

What casync isn't

  1. casync is not an attempt to minimize serialization and downloaded deltas to the extreme. Instead, the tool is supposed to find a good middle ground that is economical on traffic and disk space, but not at the price of convenience or of requiring explicit revision control. If you care about updates that are absolutely minimal, there are binary delta systems around that might be an option for you, such as Google's Courgette.

  2. casync is not a replacement for rsync, git, zsync or anything like that. They have very different use-cases and semantics. For example, rsync permits you to directly synchronize two file trees remotely. casync just cannot do that, and it is unlikely it ever will.

Where next?

casync is supposed to be a generic synchronization tool. Its primary focus for now is delivery of OS images, but I'd like to make it useful for a couple other use-cases, too. Specifically:

  1. To make the tool useful for backups, encryption is missing. I have pretty concrete plans how to add that. When implemented, the tool might become an alternative to restic, BorgBackup or tarsnap.

  2. Right now, if you want to deploy casync in real-life, you still need to validate the downloaded .caidx or .caibx file yourself, for example with some gpg signature. It is my intention to integrate with gpg in a minimal way so that signing and verifying chunk index files is done automatically.

  3. In the longer run, I'd like to build an automatic synchronizer for $HOME between systems from this. Each $HOME instance would be stored automatically at regular intervals in the cloud using casync, and conflicts would be resolved locally.

  4. casync is written in a shared library style, but it is not yet built as one. Specifically this means that almost all of casync's functionality is supposed to be available as C API soon, and applications can process casync files on every level. It is my intention to make this library useful enough so that it will be easy to write a module for GNOME's gvfs subsystem in order to make remote or local .caidx files directly available to applications (as an alternative to casync mount). In fact the idea is to make this all flexible enough that even the remoting back-ends can be replaced easily, for example to replace casync's default HTTP/HTTPS back-ends built on CURL with GNOME's own HTTP implementation, in order to share cookies, certificates, … There's also an alternative method to integrate with casync in place already: simply invoke casync as a sub-process. casync will inform you about a certain set of state changes using a mechanism compatible with sd_notify(3). In future it will also propagate progress data this way and more.

  5. I intend to add a new seeding back-end that sources chunks from the local network. After downloading the new .caidx file off the Internet, casync would then search for the listed chunks on the local network first before retrieving them from the Internet. This should speed things up on all installations that have multiple similar systems deployed in the same network.

Further plans are listed tersely in the TODO file.

FAQ:

  1. Is this a systemd project? – casync is hosted under the GitHub systemd umbrella, and the projects share the same coding style. However, the code-bases are distinct and without interdependencies, and casync works fine both on systemd systems and systems without it.

  2. Is casync portable? — At the moment: no. I only run Linux and that's what I code for. That said, I am open to accepting portability patches (unlike for systemd, which doesn't really make sense on non-Linux systems), as long as they don't interfere too much with the way casync works. Specifically this means that I am not too enthusiastic about merging portability patches for OSes lacking the openat(2) family of APIs.

  3. Does casync require reflink-capable file systems to work, such as btrfs? — No it doesn't. The reflink magic in casync is employed when the file system permits it, and it's good to have it, but it's not a requirement, and casync will implicitly fall back to copying when it isn't available. Note that casync supports a number of file system features on a variety of file systems that aren't available everywhere, for example FAT's system/hidden file flags or xfs's projinherit file flag.

  4. Is casync stable? — I just tagged the first, initial release. While I have been working on it for quite some time and it is quite featureful, this is the first time I am advertising it publicly, and it has hence received very little testing outside of its own test suite. I am also not fully ready to commit to the stability of the current serialization or chunk index format. I don't see any breakages coming for it though. casync is pretty light on documentation right now, and does not even have a man page; I intend to correct that soon.

  5. Are the .caidx/.caibx and .catar file formats open and documented? – casync is Open Source, so if you want to know the precise format, have a look at the sources for now. It's definitely my intention to add comprehensive docs for both formats, however. Don't forget this is just the initial version right now.

  6. casync is just like $SOMEOTHERTOOL! Why are you reinventing the wheel (again)? — Well, because casync isn't "just like" some other tool. I am pretty sure I did my homework, and that there is no tool just like casync right now. The tools coming closest are probably rsync, zsync, tarsnap, restic, but each of them is quite a different beast.

  7. Why did you invent your own serialization format for file trees? Why don't you just use tar? — That's a good question, and other systems — most prominently tarsnap — do that. However, as mentioned above tar doesn't enforce reproducibility. It also doesn't really do random access: if you want to access some specific file you need to read every single byte stored before it in the tar archive to find it, which is of course very expensive. The serialization casync implements places a focus on reproducibility, random access, and meta-data control. Much like traditional tar it can still be generated and extracted in a stream fashion though.

  8. Does casync save/restore SELinux/SMACK file labels? — Not at the moment. That's not because I wouldn't want it to, but simply because I am not a guru of either of these systems, and didn't want to implement something I do not fully grok and cannot test. If you look at the sources you'll find that there are already some definitions in place that keep room for them, though. I'd be delighted to accept a patch implementing this fully.

  9. What about delivering squashfs images? How well does chunking work on compressed serializations? – That's a very good point! Usually, if you apply a chunking algorithm to a compressed data stream (let's say a tar.gz file), then changing a single bit at the front will propagate into the entire remainder of the file, so that minimal changes will explode into major changes. Thankfully this doesn't apply that strictly to squashfs images, as squashfs provides random access to files and directories and thus breaks up the compression streams at regular intervals to make seeking easy. This is beneficial for systems employing chunking, such as casync, as it means single bit changes might affect their vicinity but will not explode in an unbounded fashion. In order to achieve the best results when delivering squashfs images through casync the block sizes of squashfs and the chunk sizes of casync should be matched up (using casync's --chunk-size= option). How precisely to choose both values is left as a research subject for the user, for now.

  10. What does the name casync mean? – It's a synchronizing tool, hence the -sync suffix, following rsync's naming. It makes use of the content-addressable concept of git hence the ca- prefix.

  11. Where can I get this stuff? Is it already packaged? – Check out the sources on GitHub. I just tagged the first version. Martin Pitt has packaged casync for Ubuntu. There is also an ArchLinux package. Zbigniew Jędrzejewski-Szmek has prepared a Fedora RPM that hopefully will soon be included in the distribution.

Should you care? Is this a tool for you?

Well, that's up to you really. If you are involved with projects that need to deliver IoT, VM, container, application or OS images, then maybe this is a great tool for you — but other options exist, some of which are linked above.

Note that casync is an Open Source project: if it doesn't do exactly what you need, prepare a patch that adds what you need, and we'll consider it.

If you are interested in the project and would like to talk about this in person, I'll be presenting casync soon at Kinvolk's Linux Technologies Meetup in Berlin, Germany. You are invited. I also intend to talk about it at All Systems Go!, also in Berlin.

Fedora Workstation 26 and beyond

It feels like it has been too long since I did another Fedora Workstation update. We spend a lot of time trying to figure out how we can best spend our resources to produce the best desktop possible for our users, because even though Red Hat invests more into the Linux desktop than any other company by quite a margin, our resources are still far from limitless. So we continuously ask ourselves whether each of the areas we are investing in is the right one to give our users the things they need the most. Below is a sampling of the things we are working on.

Improving integration of the NVidia binary driver
This has been ongoing for quite a while, but things have started to land now. Hans de Goede and Simone Caronni have been collaborating, building on the work by NVidia and Adam Jackson around glvnd. So if you set up Simone's NVidia repository hosted on negativo17 you will be able to install the NVidia driver without any conflicts with the Mesa stack, and due to Hans' work you should be fairly sure that even if the NVidia driver stops working with a given kernel update you will smoothly transition back to the open source Nouveau driver. I have been testing it on my own Lenovo P70 system for the last week and it seems to work well under X. That said, once you install the binary NVidia driver that is what you're running on, which is of course not the behaviour you want from a hybrid graphics system. Fixing that last issue requires further collaboration between us and NVidia.
Related to this, Adam Jackson is currently working on a project he calls glxmux. glxmux will allow you to have more than one GLX implementation on the system, so that you can switch between Mesa GLX for the Intel integrated graphics card and NVidia GLX for the binary driver. While we can make no promises we hope to have the framework in place for Fedora Workstation 27. Having that in place should allow us to create a solution where you only use the NVidia driver when you want the extra graphics power, which will of course require significant work from NVidia on their side, so I can't give a definite timeline for when all the puzzle pieces will be in place. Just be assured we are working on it and talking regularly to NVidia about it. I will let you know here as soon as things come together.

On the Wayland side, Jonas Ådahl is working on putting the final touches on hybrid graphics support.

Fleet Commander ready for take-off
Another major project we have been working on for a long time is Fleet Commander. Fleet Commander is a tool to allow you to manage Fedora and RHEL desktops centrally. This is a tool targeted at, for instance, universities or companies with tens, hundreds or thousands of workstation installations. It gives you a graphical browser based UI (accessible through Cockpit) to create configuration profiles and deploy across your organization. Currently it allows you to control anything that has a gsetting associated with it, like enabling/disabling extensions and changing configuration settings in GTK+ and GNOME applications. It allows you to configure Network Manager settings, so if you're updating the company VPN or proxy settings you can easily push those changes out to all users in the organization. Or quickly migrate Evolution email settings to a new email server. The tool also allows you to control recommended applications in the Software Center and set bookmarks in Firefox. There is also support for controlling settings inside LibreOffice.

All these features can be set and controlled at either a user level, a group level, or organization-wide, thanks to the close integration we have with the FreeIPA suite of tools. The data is stored inside your organization's LDAP server alongside other user information so you don't need to have the clients connect to a new service for this, and while it is not there in this initial release we will in the future also support Active Directory.

The initial release and Fleet Commander website will be out alongside Fedora Workstation 26.

PipeWire
I talked about PipeWire before, when it was still called Pinos, but the scope and ambition for the project have significantly changed since then. Last time I spoke about it the goal was just to create something that could be considered a video equivalent of PulseAudio. Wim Taymans, who you might know co-created GStreamer and who has been a major PulseAudio contributor, has since expanded the scope and PipeWire now aims at unifying Linux audio and video. The long-term goal is for PipeWire to not only provide handling of video streams, but also handle all kinds of audio. Due to this Wim has been spending a lot of time making sure PipeWire can handle audio in a way that not only addresses the PulseAudio use-cases, but also the ones handled by Jack today. A big part of the motivation for this is that we want to make Fedora Workstation the best place to create content and we want the pro-audio crowd to be first class citizens of our desktop.

At the same time we don’t want to make this another painful subsystem transition so PipeWire so we will need to ensure that PulseAudio applications can still be run without modification.

We expect to start shipping PipeWire with Fedora Workstation 27, but at that point only have it handle video, as we need this both to enable good video handling for Flatpak applications through a video portal and to provide an API for applications that want to do screen capture under Wayland, like web browser applications offering screen sharing. We will then bring the audio features onboard in subsequent releases as we also try to work with the Jack and PulseAudio communities to make this a joint effort. We are also working now on a proper website for PipeWire.

Red Hat developer integration
A feature we are quite excited about is the integration of support for the Red Hat developer account system into Fedora. This means that you should be able to create a Red Hat developer account through GNOME Online Accounts, and once you have that account set up you should be able to easily create Red Hat Enterprise Linux virtual machines or containers on your Fedora system. This is a crucial piece for the developer focus that we want the workstation to have, and one that we think will make a lot of developers' lives easier. We were originally hoping to have this ready for Fedora Workstation 26, but at the moment it looks more likely to hit Fedora Workstation 27; we will keep you up to date as this progresses.

Fractional scaling for HiDPI systems
Fedora Workstation has been leading the charge in supporting HiDPI on Linux and we hope to build on that with the current work to enable fractional scaling support. Since we introduced HiDPI support we have been improving it step by step; for instance, last year we introduced support for dealing with different DPI levels per monitor for Wayland applications. The fractional scaling work will take this a step further. The biggest problem it will resolve is that for certain monitor sizes the current scaling options either leave things too small or too big. With the fractional scaling support we will introduce intermediate steps, so that you can scale your interface by 1.5x instead of having to go all the way to 2x. The set of technologies we are developing for handling fractional scaling should also allow us to provide better scaling for XWayland applications, as it provides us with methods for scaling that don't need direct support from the windowing system or toolkit.

GNOME Shell performance
Carlos Garnacho has been doing some great work recently improving the general performance of GNOME Shell. This comes on top of his earlier performance work that was very well received. How fast or slow GNOME Shell feels is often a subjective thing, but reducing overhead where we can is never a bad thing.

Flatpak building
Owen Taylor has been working hard on putting the pieces in place to start large-scale Flatpak building in Fedora. You might see a couple of test flatpaks appear in the Fedora Workstation 26 timeframe, but the goal is to have a huge Flatpak catalog ready in time for Fedora Workstation 27. Essentially what we are doing is making it very simple for a Fedora maintainer to build a Flatpak of the application they maintain through the Fedora package building infrastructure and push that Flatpak into a central Flatpak registry. And while this is of course mainly meant to benefit Fedora users, there is nothing stopping other distributions from offering these Flatpak-packaged applications to their users as well.

Atomic Workstation
Another effort that is marching forward is what we call Atomic Workstation. The idea here is to have an immutable OS image, kind of like what you see on, for instance, Android devices. The advantage to this is that the core of the operating system gets tested and deployed as a unit, and the chance of users ending up with broken systems decreases significantly, as we don't need to rely on packages getting applied in the correct order or scripts executing as expected on each individual workstation out there. This effort is largely based on the Project Atomic effort, and the end goal here is to have an image-based OS install and Flatpak-based applications on top of it. If you are very adventurous and/or want to help out with this effort you can get the ISO image installer for Atomic Workstation here.

Firmware handling
Our Linux Firmware project is still going strong with new features being added and new vendors signing on. As Richard Hughes recently blogged, the latest vendor joining the effort is Logitech, who will now upload their firmware into the service so that you can keep your Logitech peripherals updated through it. It is worthwhile pointing out here how we worked with Logitech to make this happen, with Richard working on the special tooling needed and thus reducing the threshold for Logitech to start offering their firmware through the service. We have other vendors we are having similar discussions and collaborations with, so expect to see more. At this point I tend to recommend people get a Dell to run Linux, due to their strong support for efforts such as the Linux Firmware Service, but other major vendors are in the final stages of testing, so expect more of them to start pushing firmware updates soon.

High Dynamic Range
The next big thing in the display technology field is HDR (High Dynamic Range). HDR allows for deeper, more vibrant colours; it is a feature seen on a lot of new TVs these days, and game consoles like the PlayStation 4 support it. Computer monitors are appearing on the market too now with this feature, for instance the Dell UP2718Q. We want to ensure Fedora and Linux are leaders here, for the benefit of video and graphics artists using Fedora and Red Hat Enterprise Linux. We are thus kicking off an effort to make sure this technology matures as quickly as possible and is fully supported. We are not the only ones interested in this so we will hopefully be collaborating with our friends at Intel, AMD and NVidia on this. We hope to have the first monitors delivered to our office within a few weeks.

Codecs
While playback these days has moved to streaming, where locally installed codecs are of less importance for the consumption use-case, having a wide selection of codecs available is still important for media editing and creation use-cases, so we want you to be able to load a variety of old media files into your video editor, for instance. Luckily we are at a crossroads now where a lot of widely used codecs are seeing their essential patents expire (MP3, AC3 and more), while at the same time the industry focus seems to have moved to royalty-free codec development going forward (Opus, VP9, Alliance for Open Media). We have been spending a lot of time with the Red Hat legal team trying to clear these codecs, which resulted in MP3 and AC3 now shipping in Fedora Workstation. We have more codecs on the way though, so this effort is in no way over. My goal is that over the course of this year the situation of software patents being a huge issue when dealing with audio and video codecs on Linux will be considered a thing of the past. I would like to thank the Red Hat legal team for their support on this issue, as they have had to spend significant time on it; a big company like Red Hat does need to do its own due diligence when it comes to these things, and we can't just trust statements from random people on the internet that these codecs are now free to ship.

Battery life
We have been looking at this for a while now and hope to be able to start sharing information with users on which laptops to get for good battery life under Fedora. Christian Kellner is now our point man on battery life and he has taken up improving the Battery Bench tool that Owen Taylor wrote some time ago.

QtGNOME platform
We will have a new version of the QtGNOME platform in Fedora 26. For those of you who have not yet heard of this effort, it is a set of themes and tools to ensure that Qt applications run without any major issues under GNOME 3. With the new version the theming expands to include the accessibility and dark themes in Adwaita, meaning that if you switch to one of these themes under GNOME Shell it will also switch your Qt applications over. We are also making sure things like cut'n'paste and drag and drop work well. The version in Fedora Workstation 26 is a big step forward for this effort and should hopefully make Qt applications first class citizens under your Fedora Workstation desktop.

Wayland polish
Ever since we switched the default to Wayland we have kept the pressure up and kept fixing bugs and finding solutions for corner cases. The result should be an improved Wayland experience in Fedora Workstation 26. A big thanks to Olivier Fourdan, Jonas Ådahl and the whole Wayland community for their continued efforts here. Two major items Jonas is working on, for instance, are fractional scaling and screen sharing. The fractional scaling work is about ensuring that your desktop scales to an optimal size on HiDPI displays of various sizes: what we currently have is limited to 1x or 2x, which is either too small or too big for some screens, but with this work you can also do 1.5x scaling. He is also preparing an API that will allow screen sharing under Wayland, so that for instance sharing your slides over video conferencing can work.

C++: std::string_view not so useful when calling C functions

string_view does not own string data

C++17 adds std::string_view, which is a thin view of a character array, holding just a pointer and a length. This makes it easy to provide just one method that can efficiently take either a const char*, or a std::string, without unnecessary copying of the underlying array. For instance:

void use_string(std::string_view str);

You can then call that function like so:

use_string("abc");

or

std::string str("abc");
use_string(str);

This involves no deep copying of the character array until the function’s implementation needs to do that. Most obviously, it involves no copying when you are just passing a string literal to the function. For instance it doesn’t create a temporary std::string just to call the function, as would be necessary if the function took std::string.

string_view knows nothing of null-termination

However, though the string literal (“abc”) is null-terminated, and the std::string is almost-certainly null-terminated (but implementation defined), our use_string() function cannot know for sure that the underlying array is null-terminated. It could have been called like so:

const char* str = "abc"; // null-terminated
use_string(std::string_view(str, 2));  // not 3.

or even like so:

const char str[] = {'a', 'b', 'c'}; //not null-terminated
use_string(std::string_view(str, 3));

or as a part of a much larger string that we are parsing.

Unlike std::string, there is no std::string_view::c_str() which will give you a null-terminated character array. There is std::string_view::data() but, like std::string::data(), that doesn’t guarantee that the character array will be null-terminated. (update: since C++11, std::string::data() is guaranteed to be null-terminated, but std::string_view::data() in C++17 is not.)

So if you call a typical C function, such as gtk_label_set_text(), you have to construct a temporary std::string, like so:

void use_string(std::string_view str) {
  gtk_label_set_text(label, std::string(str).c_str());
}

But that creates a copy of the array inside the std::string, even if that wasn’t really necessary. std::string_view has no way to know if the original array is null-terminated, so it can’t copy only when necessary.

This is understandable, and certainly useful for pure C++ code bases, or when using C APIs that deal with lengths instead of just null termination. I do like that it’s in the standard library now. But it’s a little disappointing in my real world of integrating with typical C APIs, for instance when implementing gtkmm.

Implementation affecting the interface

Of course, any C function that is expected to take a large string would have a version that takes a length. For instance, gtk_text_buffer_set_text(), so we can (and gtkmm will) use std::string_view as the parameter for any C++ function that uses that C function. But it’s a shame that we can’t have a uniform API that uses the same type for all string parameters. I don’t like when the implementation details dictate the types used in our API.
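
As a minimal sketch of that case (the wrapper name set_text() is made up here, but gtk_text_buffer_set_text() is the real C API mentioned above, which takes an explicit length):

#include <gtk/gtk.h>
#include <string_view>

// No null-terminated copy is needed: the C function takes a pointer
// plus a length, which is exactly what std::string_view provides.
void set_text(GtkTextBuffer* buffer, std::string_view str) {
  gtk_text_buffer_set_text(buffer, str.data(), static_cast<gint>(str.size()));
}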

There is probably no significant performance issue for small strings, even when using the temporary std::string() technique, but it wouldn’t be nice to make the typical cases worse, even just theoretically.

In gtkmm, we could create our own string_view type, which is aware of null termination, but we are trying to be as standard as possible so our API is as obvious as possible.
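
A minimal sketch of what such a type could look like (this is not gtkmm API; the name cstring_param and all details here are just an illustration of the idea):

#include <string>
#include <string_view>

// Records at construction time whether the viewed array is known to be
// null-terminated, so a copy is made only when necessary.
class cstring_param {
public:
  cstring_param(const char* s)          // assumed to be a null-terminated C string
    : view_(s), terminated_(true) {}
  cstring_param(const std::string& s)   // std::string data is null-terminated since C++11
    : view_(s), terminated_(true) {}
  cstring_param(std::string_view v)     // arbitrary view: termination unknown
    : view_(v), terminated_(false) {}

  // Returns a null-terminated pointer, copying into the caller-provided
  // buffer only when the view is not known to be null-terminated.
  const char* c_str(std::string& buffer) const {
    if (terminated_)
      return view_.data();
    buffer.assign(view_.data(), view_.size());
    return buffer.c_str();
  }

private:
  std::string_view view_;
  bool terminated_;
};

A function taking a cstring_param could then call, say, gtk_label_set_text(label, str.c_str(buffer)) with a local std::string buffer, paying for a copy only when it was actually given a plain std::string_view.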

The first weeks of GSoC

So some time has passed since the coding period started and I want to give you a quick update on how things are going so far.

I started working on the improved setup experience of the Nextcloud client, by adding a list of providers that offer hosted Nextcloud installations for their customers. As a second step support for a registration API will be integrated so that users can create an account right from within the setup wizard. A proof of concept implementation for that API is almost ready for the Nextcloud registration app.

I also started migrating Carlos' proposal of a cloud providers API to a gtk-independent library that provides basic DBus interfaces for cloud providers and software that wants to access the functions provided by those, like Nautilus.

Although it required quite some time to wrap my head around gdbus and especially the generated code of gdbus-codegen, it's getting clearer, and I took some notes in order to write a HowDoI entry for the GNOME wiki to help others get started with it in the future.

Over the next 2 weeks I’ll be continuing migrating the cloud providers library to use gdbus-codegen as well as adding support for the cloud providers API to the GtkPlacesSidebar.

That's it for now; I'll keep you updated on the progress, of course. Thanks for reading.

GSoC Report 1

The first two weeks of GSoC coding period have ended. I started to work early in the bonding period and got many things done.

May 4 - May 30

I was a bit scared after Adrien ported the gamepad system to C. I am a huge fan of Vala and had never done Gtk with C before. I spent a lot of time understanding how C with GObject works. GObject has to support many language bindings and hence the C code is highly verbose. One thing for sure is that C makes us more aware of many language features that we often take for granted.

After I found and fixed a few bugs that popped up during the porting, I started working on the gamepad mapping configuration.

What is it?

The goal is to allow the user to configure what each gamepad input does. These are the main components of the gamepad mapper system:

  • Gamepad View
    Show the standard gamepad and highlight its individual inputs.

  • Gamepad Mapping Builder
    Build the SDL string from the user provided inputs.

  • Gamepad Mapper
    Coordinate the work between UI and backend, and display relevant messages.

  • Gamepad Preferences Window
    Show the gamepads1.

What is interesting?

  • How to highlight gamepad inputs?

    I could have used Adrien’s old demo code to manually draw the gamepad:

    [image: the manually drawn gamepad]

    But SVGs provide a more accurate representation of the image, and also allow us to query and highlight their elements, so I started looking for relevant libraries.

    “Do you want to render non-animated SVGs to a Cairo surface with a minimal, no-nonsense API?” - Librsvg’s README

    All Librsvg can do for us is to render SVGs! How do I get it to highlight stuff?

    I still favoured Librsvg because it is a part of GNOME. I found two functions of use in Librsvg’s Vala API, one to render an SVG and one to render an SVG sub. I knew it could be exploited somehow.

    • Try 1: Render the gamepad SVG and render the button’s sub again

      What? How did I even expect it to work?

    • Try 2: Render the gamepad SVG and render the button’s sub with Cairo.Operator.CLEAR

      [image: result of Try 2]

    • Try 3: Paint the surface with blue, render the gamepad SVG and render the button’s sub with Cairo.Operator.CLEAR

      “Hello darkness, my old friend”

      [image: result of Try 3]

    I tried a lot of things, but before I and darkness became best friends, I understood that with my negligible Cairo skills, I wouldn’t be able to do it. So I went to Cairo’s IRC and @psychon immediately solved my problem.

    • Try 4: Draw button’s sub to another surface and use it as a mask to paint the gamepad SVG’s surface

      [image: result of Try 4]

  • SDL string for gamepad mapping

    Here’s an example of an SDL string. My task was to build them for user’s input mappings.

    03000000790000000600000010010000,DragonRise Inc. Generic USB Joystick,platform:Linux,x:b3,a:b2,b:b1,y:b0,back:b8,start:b9,dpleft:h0.8,dpdown:h0.4,dpright:h0.2,dpup:h0.1,leftshoulder:b4,lefttrigger:b6,rightshoulder:b5,righttrigger:b7,leftstick:b10,rightstick:b11,leftx:a0,lefty:a1,rightx:a3,righty:a4,

    Games was already using the SDL strings provided in SDL GameControllerDB as defaults. I read how Games parsed those strings, and reverse engineered2 how to build them.

    Let’s take one part of it as an example: start:b9, if button 9 is pressed on this gamepad with this mapping, then it will map to the start button of the standard gamepad. So while building the mapping, if I highlight the start button on the gamepad, and if button 2 is pressed then I’ll build a string with start:b2.

    This button-to-button mapping example was simple, but the dpad mappings were a lot more complicated; it took me a while to get them done completely.

June 1 - June 15

I was fairly busy for the first two weeks of June. I made some major enhancements to the previous work but didn't go full steam. My aim is to finish the gamepad mapping configuration for the first GSoC evaluation.

What is remaining?

  • Save the mappings to user directory
  • Allow user to reset mappings to default

I also end up doing many things at once. Whenever I find bugs or possible enhancements in nearby code, I start working on them. While working on the gamepad mapper, I filed new bugs and dived into code that I don't even need for this project.

I will add the working videos/gifs in the next post. Stay tuned :D



  1. We’ll get back to this in July 

  2. I had to use it somewhere, sounds so cool! 

June 19, 2017

GNOME Tweak Tool 3.25.3

Today I released the second development snapshot (3.25.3) of what will be GNOME Tweak Tool 3.26.

I consider the initial User Interface (UI) rework proposed by the GNOME Design Team to be complete now. Every page in Tweak Tool has been updated, either in this snapshot or the previous development snapshot.

The hard part still remains: making the UI look as good as the mockups. Tweak Tool’s backend makes this a bit more complicated than usual for an app like this.

Here are a few visual highlights of this release.

The Typing page has been moved into an Additional Layout Options dialog in the Keyboard & Mouse page. Also, the Compose Key option has been given its own dialog box.

Florian Müllner added content to the Extensions page that is shown if you don’t have any GNOME Shell extensions installed yet.

A hidden feature that GNOME has had for a long time is the ability to move the Application Menu from the GNOME top bar to a button in the app’s title bar. This is easy to enable in Tweak Tool by turning off the Application Menu switch in the Top Bar page. This release improves how well that works, especially for Ubuntu users where the required hidden appmenu window button was probably not pre-configured.

Some of the ComboBoxes have been replaced by ListBoxes. One example is on the Workspaces page where the new design allows for more information about the different options. The ListBoxes are also a lot easier to select than the smaller ComboBoxes were.

For details of these and other changes, see the commit log or the NEWS file.

GNOME Tweak Tool 3.26 will be released alongside GNOME 3.26 in mid-September.

Improving the Search of Nautilus

This summer I’m really glad to be working again on Nautilus as part of Google Summer of Code. This time, the goal of the project is to improve the Search. Currently, it misses some features that would make searching easier and there are also some performance issues.

So far I have worked on Full Text Search. Until now, this could only be done from Desktop Search (tracker-needle). Since one of the main functions of Nautilus is searching files, it makes sense for it to include this feature.

Now, if the user chooses so, the search results will no longer include only matches with the file name, but also with the contents of the file. Also, to be more relevant, a short snippet with the context in which the text was found is offered. To get this information, a Tracker query is used, which means that in order to find the files, they will need to be indexed by Tracker.

Until last week I was busy with my exams and those took quite a bit of my time, but now that they’re finally done, I can give my full attention to Nautilus.

Next, I’ll focus on tags, since this is a feature that would sure come in handy at organizing files. So, there are more updates to come in the following weeks 🙂

 


Two hackathons in a week: thoughts on NoFlo and MsgFlo

Last week I participated in two hackathons, events where a group of strangers forms a team for two or three days and builds a product prototype. In the end all teams pitch their prototypes, and the best ones are given prizes.

Hackathons are typically organized to get feedback from developers on some new API or platform. Sometimes they’re also organized as a recruitment opportunity.

Apart from the free beer and camaraderie, I like going to hackathons since they're a great way to battle-test the developer tools I build. The time from idea to a running prototype is short, and people are used to different ways of working and different toolkits.

If our tools and flow-based programming work as intended, they should be ideal for these kinds of situations.

Minds + Machines hackathon and Electrocute

Minds + Machines hackathon was held on a boat and focused on decarbonizing power and manufacturing industries. The main platform to work with was Predix, GE’s PaaS service.

Team Electrocute

Our project was Electrocute, a machine learning system for forecasting power consumption in a changing climate.

1.5°C is the global warming target set by the Paris Agreement. How will this affect energy consumption? What kind of generator assets should utilities deploy to meet these targets? When and how much renewable energy can be utilized?

The changing climate poses many questions to utilities. With Electrocute’s forecasting suite power companies can have accurate answers, on-demand.

Electrocute forecasts

The system was built with a NoFlo web API server talking over MsgFlo with a Python machine learning backend. We also built a frontend where users could see the energy usage forecasts on a heatmap.

NoFlo-Xpress in action

Unfortunately we didn’t win this one.

Recoding Aviation and Skillport

Recoding Aviation was held at hub:raum and focused on improving the air travel experience through usage of open APIs offered by the various participating airports.

Team Skillport

Skillport was our project to make long layovers more bearable by connecting people who’re stuck at the airport at the same time.

Long layovers suck. But there is ONE thing amazing about them: You are surrounded by highly skilled people with interesting stories from all over the world. It sometimes happens that you meet someone randomly - we all have a story like that. But usually we are too shy and lazy to communicate and see how we could create a valuable interaction. You never know if the other person feels the same.

We built a mobile app that turns airports into a networking, cultural exchange and knowledge sharing hub. Users tell each other through the app that they are available to meet and what value they can bring to an interaction.

The app connected with a J2EE API service that then communicated over MsgFlo with NoFlo microservices doing all the interactions with social and airport APIs. We also did some data enrichment in NoFlo to make smart recommendations on meeting venues.

MsgFlo in action

This time our project went well with the judges and we were selected as the winner of the Life in between airports challenge. I’m looking forward to the helicopter ride over Berlin!

Category winners

Skillport also won a space at hub:raum, so this might not be the last you’ll hear of the project…

Lessons learned

Benefits of a message queue architecture

I’ve written before on why to use message queues for microservices, but that post focused more on the benefits for real-life production usage.

The problems and tasks for a system architecture in a hackathon are different. Since the time is short, you want to enable people to work in parallel as much as possible without stepping on each other’s toes. Since people in the team come from different backgrounds, you want to enable a heterogeneous, polyglot architecture where each developer can use the tools they’re most productive with.

MsgFlo is by its nature very suitable for this. Components can be written in any language that supports the message queue used, and we have convenience libraries for many of them. The discovery mechanism makes new microservices appear on the Flowhub graph as soon as they start, enabling services to be wired together quickly.

Mock early, mock often

Mocks are a useful way to provide a microservice to the other team members even before the real implementation is ready.

For example in the GE Predix hackathon, we knew the machine learning team would need quite a bit of time to build their model. Until that point we ran their microservice with a simple msgflo-python component that just gave random() as the forecast.

This way everybody else was able to work with the real interface from the get-go. When the learning model was ready we just replaced that Python service, and everything was live.

Mocks can be useful also in situations where you have a misbehaving third-party API.

Don’t forget tests

While shooting for full test coverage is probably not realistic within the time constraints of a hackathon, it still makes sense to have at least some “happy path” tests. When you’re working with multiple developers each building different parts of the service, interface tests serve a dual purpose:

  • They show the other team members how to use your service
  • They verify that your service actually does what it is supposed to

And if you’re using a continuous integration tool like Travis, the tests will help you catch any breakages quickly, and also ensure the services work on a clean installation.

For a message queue architecture, fbp-spec is a great tool for writing and running these interface tests.

Talk with the API providers

The reason API and platform providers organize these events is to get feedback. As a developer that works with tons of different APIs, this is a great opportunity to make sure your ideas for improvement are heard.

On the flip side, this usually also means the APIs are in a pretty early stage, and you may be the first one using them in a real-world project. When the inevitable bugs arise, it is good to have a channel of communication open with the API provider on site, so you can get them resolved or worked around quickly.

Room for improvement

The downside of the NoFlo and MsgFlo stack is that there is still quite a bit of a learning curve. NoFlo documentation is now in a reasonable place, but with Flowhub and MsgFlo we have tons of work ahead on improving the onboarding experience.

Right now it is easy to work with if somebody sets it up properly first, but getting there is a bit tricky. Fixing this will be crucial for enabling others to benefit from these tools as well.

BuildStream and host tools

It’s been a while since I had to build a whole operating system from source. I’ve mostly been working on compilers so far this year at Codethink in fact, but my new project is to bring up some odd target systems that aren’t supported by any mainstream distros.

We did something similar about 4 years ago using Baserock and it worked well; this time we are using the Baserock OS definitions again but with BuildStream as a build tool. I’ve not had any chance to get involved in BuildStream up til now (beyond observing it) so this will be good.

The first thing I’m getting my head around is the “no host tools” policy. The design of BuildStream is that every build is run in a sandbox that’s isolated from the host. Older Baserock tools took a similar approach too and it makes a lot of sense: it’s a lot easier to maintain build instructions if you limit the set of environments in which they can run, and you are much more likely to be able to reproduce them later or on other people’s machines.

However your sandbox is going to need a compiler and a shell environment in there if it’s going to be able to build anything, and BuildStream leaves open the question of where those come from. It’s simple to find a prebuilt toolchain at least for mainstream architectures — pretty much every Linux distro can provide one, so the only questions are which one to use and how to get it into BuildStream’s sandbox.

GNOME and Freedesktop base runtime and SDK

The Flatpak project has a similar need for a controlled runtime and build environment, and is producing a GNOME SDK, and a lower level Freedesktop SDK. These are at present built on top of Yocto.

Up to date versions of these are made available in an OSTree repo at http://sdk.gnome.org/repo. This makes it easy to import them into BuildStream using an ‘import’ element and the ‘ostree’ source:

kind: import
description: Import the base freedesktop SDK
config:
  source: files
  target: usr
host-arches:
  x86_64:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/x86_64/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 0d9d255d56b08aeaaffb1c820eef85266eb730cb5667e50681185ccf5cd7c882
  i386:
    sources:
      - kind: ostree
        url: gnomesdk:repo/
        track: runtime/org.freedesktop.BaseSdk/i386/1.4
        gpg-key: keys/gnome-sdk.gpg
        ref: 16036b747c1ec8e7fe291f5b1f667cb942f0267d08fcad962e9b7627d6cf1981

The main downside to using these is that they are pretty large — the GNOME 3.18 SDK weighs in at 1.5 GB uncompressed and around 63,000 files. Creating a hardlink tree using `ostree checkout` takes up to a minute on my (admittedly rather old) laptop. The Freedesktop SDK is smaller but still not ideal. They are also only built for a small set of architectures — I think just some x86 and ARM families at the moment.

Debian in OSTree

As part of building GNOME’s jhbuild modulesets inside BuildStream Tristan created a script to produce Debian chroots for various architectures and commit them to an OSTree repo. The GNOME components are then built on top of these base Debian images, with the idea that in future they can be tested on top of a whole variety of distros in addition to Debian to make us catch platform-specific regressions more quickly.

The script, which uses the awesome Multistrap tool to do most of the heavy lifting, lives here and pushes its results to a temporarily hosted repo that is signed with this key.

The resulting sysroot is 2.7 GB in size and contains 105,320 files. This again takes up to a minute to check out on my laptop. Like the GNOME SDK, this sysroot contains every external dependency of GNOME, which adds up to a lot of stuff.

Alpine Linux Toolchain

I want a lighter weight set of host tools to put in my build sandbox. Baserock’s OS images can be built with just a C++ toolchain and a minimal shell environment, so there’s no need to start copying gigabytes of dependencies around.

Ultimately the Baserock project could build its own set of host tools, but to save faff while prototyping things I decided to try Alpine Linux, which is a minimal distribution.

Alpine Linux provides “mini root filesystem” tarballs. These can’t be used directly as they contain device nodes (so extracting them requires privileges) and don’t contain a toolchain.

Here’s how I produced a workable host tools sysroot. I’m using Bubblewrap (the same tool used by BuildStream to create build sandboxes) as a simple container driver to run the `apk` package tool as root without needing special host privileges. This won’t work on every OS; you can use something like Docker or plain old `chroot` instead if needed.

wget https://nl.alpinelinux.org/alpine/v3.6/releases/x86_64/alpine-minirootfs-3.6.1-x86_64.tar.gz
mkdir -p sysroot
tar -x -f alpine-minirootfs-3.6.1-x86_64.tar.gz -C sysroot --exclude=./dev

# Run commands inside the sysroot with bwrap: fresh namespaces (keeping network
# access), ./sysroot bind-mounted as /, the host /etc/resolv.conf bind-mounted
# read-only for DNS, and uid/gid 0 so that apk believes it is running as root.
alias alpine_exec='bwrap --unshare-all --share-net --setenv PATH /usr/bin:/bin:/usr/sbin:/sbin --bind ./sysroot / --ro-bind /etc/resolv.conf /etc/resolv.conf --uid 0 --gid 0'
alpine_exec apk update
alpine_exec apk add bash bc gcc g++ musl-dev make gawk gettext-dev gzip linux-headers perl e2fsprogs mtools

tar -z -c -f alpine-host-tools-3.6.1-x86_64.tar.gz -C sysroot .

This produces a 219MB host tools sysroot containing 11,636 files. This is not as minimal as you can go with a GNU C/C++ toolchain but it’s around the right order of magnitude and it checks out from BuildStream’s artifact store into the build directory in a matter of seconds.

We include gawk as it is needed during the GCC build (BusyBox awk is not enough), and gettext-dev is needed by glibc (at least, libintl.h is needed and in Alpine only gettext provides that header). Bash is needed by scripts/config from linux.git, and bc, GNU gzip, linux-headers and Perl are also needed for building Linux. e2fsprogs and mtools are useful for creating disk images.
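
As a quick sanity check (not part of the recipe above), the alpine_exec alias can be reused before repacking the tarball to confirm the toolchain actually compiles and runs something:

# Build and run a trivial C program inside the sysroot
alpine_exec sh -c 'echo "int main(void){return 0;}" > /tmp/smoke.c &&
    gcc /tmp/smoke.c -o /tmp/smoke && /tmp/smoke && echo toolchain OK'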

I’ve integrated this into my builds in a pretty lazy way for now:

kind: import
description: Import an Alpine Linux C/C++ toolchain
host-arches:
  x86_64:
    sources:
    - kind: tar
      url: file:///home/sam/src/buildstream-bootstrap/alpine-host-tools-3.6.1-x86_64.tar.gz
      base-dir: .
      ref: e01d76ef2c7e3e105778e2aa849a42d38dc3163f8c15f5b2de8f64cd5543cf29

This element is obviously not something I can share with others — I’d need to upload the tarball somewhere or set up a public OSTree repo that others could pull from, and then have the element reference that.
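
The committing side of that would be straightforward enough; here is a rough sketch with a made-up repo path and branch name (the hard part, as noted below, is hosting the result and keeping it maintained):

# Commit the Alpine sysroot into an OSTree repo that could later be served over HTTP
ostree init --repo=host-tools-repo --mode=archive-z2
ostree commit --repo=host-tools-repo -s "Alpine host tools 3.6.1" \
    --branch=sysroots/alpine-host-tools/x86_64/3.6 --tree=dir=sysroot

The import element above could then use an ‘ostree’ source pointing at wherever that repo ends up, instead of the local tarball.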

However, this is just the first step towards some much deeper work which will result in me needing to move beyond Alpine in any case. In future I hope that it’ll be pretty straightforward to obtain a minimal toolchain as a sysroot that can be pulled into a sandbox using OSTree. The work required to produce such a thing is simple enough to automate, but it requires a server to host the binaries, which then needs ongoing maintenance for security updates, so I’m not yet going to commit to doing it …


Killer Climate

This past weekend a big wildfire in the center of Portugal (the Pedrógão Grande area) killed 62 people, left about as many injured, and left around 150 families without their homes.
Every year the country has a number of wildfires, many of them caused directly by people. However, according to the Portuguese authorities this fire was caused by lightning, combined with the record-high temperatures.

One thing that made a big impression on me is that most of the dead were not people whose homes were surrounded by forest; they were drivers who were caught by the fire while on the road. So something like this can happen to anyone, not only to people living in the affected areas.

The firefighters have been tireless and are still trying to control a big fire that spawned from the one in Pedrógão Grande. While Spain, France, and Italy have deployed more resources to help fight the fire, the majority of the Portuguese firefighters are actually volunteers who risk their lives every year.
The 150 families who lost their homes come from a rural area; many also lost their cattle and, needless to say, will struggle to start over. They could use everyone’s help.

There are a number of local initiatives to bring food and supplies to the area, and also a couple of bank accounts set up for donations. You can find the details in this Público article; Google Translate should be good enough if you don’t speak Portuguese. As a reference, I have donated to the account at the Caixa Geral de Depósitos bank that is listed in the article.

Surely many things will be said about the fire: that the forests could have been properly cleared (in order to better contain the fire), that the roads could have been closed sooner (so nobody would have been trapped), that it was “just” a tragedy. However, when every year the news report higher temperature records, it’s not crazy to think that global warming contributed to these deadly conditions. So when a man says he’s for Pittsburgh, not Paris, that’s not only a stupid argument and stance, it may also very well be a deadly one.

libinput 1.8 switching to git-like helper tool naming "libinput sometool"

I just released the first release candidate for libinput 1.8. Aside from the build system switch to meson, one of the more visible things is that the helper tools have switched from a "libinput-some-tool" to a "libinput some-tool" approach (note the space). This is similar to what git does, so it won't take a lot of adjustment for most developers. The actual tools are now hiding in /usr/libexec/libinput. This gives us a lot more flexibility in writing testing and debugging tools and shipping them to users without cluttering up the default PATH.

There are two potential breakages here. One is that the two existing tools, libinput-debug-events and libinput-list-devices, have been renamed too. We currently ship compatibility wrappers for those, but expect the wrappers to go away in future releases. The second breakage is of lesser impact: typing "man libinput" used to bring up the man page for the xf86-input-libinput driver; now it brings up the man page for the libinput tool, which then points to the man pages of the various features. That's probably a good thing, as it puts the documentation a bit closer to the user. For the driver, you now have to type "man 4 libinput" though.
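
Concretely, the switch looks like this from the command line (just an illustration of the renames and man page change described above):

# Old names, still available via the compatibility wrappers for now:
libinput-list-devices
# New git-style invocations; the real tools live in /usr/libexec/libinput:
libinput list-devices
libinput debug-events
# The xf86-input-libinput driver man page moved to section 4:
man 4 libinput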

June 18, 2017

Not rebasing an old feature branch

Ultimately, my aim was to provide a code-level interface for developers (rather than designers) to improve the GUI of video effects. GStreamer effects are handled very beautifully in Pitivi; the main focus was to use this existing underlying infrastructure and set up, on top of it, a way of easily adding custom UI for effects.

One of the ways of avoiding ‘duplication of effort’ in Open Source projects is to document everything, even failed or blocked attempts. Thanks to nekohayo (the previous Pitivi maintainer) opening task T3263, his work from 2013 towards providing such an interface is now up and running again.

Unless you want to preserve the commits, rebasing a very old feature branch that is not yours is pointless. I did not want to waste time resolving merge conflicts in code that was then unfamiliar to me. Following a bottom-up approach, I started working on top of the current Pitivi master, integrating the old code into it step by step, one function at a time: compiling, understanding the errors, and then fixing them. I found this approach rather systematic, and I think it is much faster since you start porting the code as you read it.

After I had completed the porting, for the first time I ran into git-core’s lack of support for multiple authors on a single commit. I simply settled on the temporary solution of using

Co-authored-by: Some One <some.one@example.foo>

in my commit messages.
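
In practice that just means adding the trailer as the last paragraph of the commit message; for example (the commit message here is made up):

git commit -m "effects: integrate the 2013 custom UI work into current master" \
           -m "Co-authored-by: Some One <some.one@example.foo>"

Each -m becomes its own paragraph, so the trailer ends up where tools that understand it expect to find it.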

Finally, at the end of all this, I was able to get an example interface for the alpha filter, built using Glade, working via the above mechanism.

[Image: Glade UI file]

[Image: Working UI]

The exact API will, most likely, undergo change. I will describe it in detail in my next post in a couple of weeks. You can check out my work so far in D1744 and D1745, and feel free to ping me on #pitivi on freenode :)
