Coincidentally, three different people asked me in the last month to write about the new technologies they should know to become more eligible for a job in a startup. All of them have been C/C++ programmers in big, established companies for about a decade now. Some of them have had only glimpses of modern technologies.
I have tried a little bit (with moderate success) to work in all layers of programming with most of the popular modern technologies, by writing little-more-than-trivial programs (long before I heard of the fancy title "full stack developer"). So here I am, writing a "technology catchup" post, hoping that it may be useful for people who want to know what has happened in technology in the last decade or so.
Disclaimer 1: The opinions expressed here are entirely my own and totally biased. You should work with the individual technologies to know their true merits.
Disclaimer 2: Instead of learning everything, I personally recommend that people pick whatever they feel connected to. I, for example, could not feel connected to node-js even after toying with it for a while, but fell in love with Go. Tastes differ and nothing is inferior. So give everything a good try and take your pick. Also remember what Richard Feynman said about "the difference between knowing the name of something and knowing something". So learn deeply.
Disclaimer 3: From whatever I have observed, getting hired in a startup is more about being in the right circles of connection than about being a technology expert. A surprisingly large number of startups start with a familiar technology rather than the right technology, and then change their technology once the company is established.
Disclaimer 4: This is not a complete list of things one should know. These are just things that I have come across and experimented with at least a little. There are a lot more interesting things that I will have missed. If you feel something should have been in the list, please comment :-)
With those disclaimers out of the way, let us cut to the chase.
Version Control Systems
The most prominent change in the open source arena in the last decade or so is the invention of Git. It is a version control system initially designed for keeping the Linux kernel sources and has since become the de-facto VCS for most modern companies and projects. GitHub is a website that allows people to host their open source projects. Startups often recruit people based on their GitHub profile. Even big companies like Microsoft, Google, Facebook, Twitter, Dropbox etc. have their own GitHub accounts. I personally have received more job queries through my GitHub projects than via my LinkedIn profile in the last year. Bitbucket is another site that allows people to host code and even offers private repos. A lot of the startups that I know of use it, along with the Jira project management software. This is your equivalent of MS Project in some sense.
I have observed that most of the startups founded by people who come from banking or finance companies use Subversion. Git is the choice for people from tech companies though. Mercurial is another open source, distributed VCS which has lost a lot of the limelight in recent times, due to Git. Fossil is another VCS, from the author of SQLite, Dr. Richard Hipp. If you can learn only one VCS for now, start with Git.
JavaScript has evolved to be a leading programming language of the last decade. It is even referred to as the x86 of the web. From its humble beginnings as a client-side scripting language for validating whether the user has typed a number or text, it has grown into a behemoth and has even entered server-side programming through the Node.js runtime. There are also frameworks for incorporating the Model-View-Controller pattern. JS is a dynamically typed language, and to bring in some of the goodness of statically typed languages, we have CoffeeScript.
is a web framework built on Python that makes it easy to develop web applications. In addition to being used in a lot of startups, it is used even in big companies like Google and Dropbox. There are variants of the Python runtime, so you can run it on the JVM using Jython or on the .NET CLR using IronPython. I have personally found this language to be lacking in performance though, which is elaborated on in a subsequent section.
Ruby is an old programming language that shot to fame in recent years through the popular web application framework Ruby on Rails, often called just Rails. I have learnt a lot of engineering philosophies such as DRY etc. while learning RoR.
All the above languages and frameworks use a package manager such as npm etc. to install libraries easily.
Go is my personal favorite among the new languages to learn. I see Go becoming as vital and prominent a programming language as C, C++ or Java in the next decade. It was developed at Google for creating large scale systems. It is a statically typed language with automatic memory management that generates native machine code and makes it easy to write concurrent code.
Go has been the default language that I use for any programming task for the last year or so. It is amazingly fast even though (just because?) it is still in the 1.X series. In my day job we did a prototype in both Go and Python, and for a highly concurrent workflow on the same hardware, Go far outperformed Python (20 seconds vs 5 minutes). I won't be surprised if a lot of Python and Ruby code gets converted to Go in the next round of rewrites. Personally, I have found the quality of Go libraries to be much higher compared to Ruby or nodejs as well, probably because not everyone has adopted this language yet. However, this could be just my personal biased opinion.
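To give a flavor of why concurrent code is easy to write in Go, here is a minimal sketch of a worker pool built from goroutines, a channel and a sync.WaitGroup. It is not the prototype from my day job; the function name squareAll and the toy workload (squaring numbers) are made up purely for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// squareAll squares every input using a fixed number of worker goroutines.
// The name and the workload are hypothetical; a real job would do I/O or
// heavier computation inside the worker loop.
func squareAll(inputs []int, workers int) []int {
	if workers < 1 {
		workers = 1 // avoid blocking forever when no worker is started
	}
	results := make([]int, len(inputs))
	jobs := make(chan int) // carries indexes into inputs

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				// Each index is handed to exactly one worker,
				// so no two goroutines write the same slot.
				results[i] = inputs[i] * inputs[i]
			}
		}()
	}

	for i := range inputs {
		jobs <- i
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()   // wait until every worker is done
	return results
}

func main() {
	fmt.Println(squareAll([]int{1, 2, 3, 4}, 2)) // prints [1 4 9 16]
}
```

The channel hands out work and the WaitGroup tells the caller when everything has finished, so there are no explicit locks in the sketch.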
If you like to get fancy with functional programming, then you can learn Scala (on top of the JVM), F# (on top of .NET), Haskell, etc. The last two are very old btw but in use even today. Most recently, WhatsApp was known to use Erlang. D is also seen in the news, mostly thanks to Facebook. Dart is another programming language that is aimed at high-performance concurrent systems. But I have not played around with it, as they don't maintain a stable API and they are not 1.0 yet. Julia is another programming language aimed at distributed systems, about which I have heard a lot of praise, but it still remains an exotic language afaik. R is another language which I have seen in a lot of corporate demos where the presenters wanted to show statistics and charts. Learning it may be useful even if you are not a programmer but work with numbers (like a project manager).
There is the Swift programming language from Apple for writing iOS apps. I have not tried Swift yet, but from my experience of using Objective C, it cannot be worse.
Bootstrap is a nice web framework from Twitter, which provides various GUI elements that you can incorporate into your application to rapidly prototype beautiful applications that stay fluid even when viewed on mobile.
jquery
Cascading Style Sheets (CSS for short) is a style sheet language that helps configure the style of the web page UI elements. CSS has matured to the extent of handling animations too. You should ideally spend a few weeks learning HTML5 and CSS.
Sublime Text is what the cool kids use these days as their editor. I have found the tutorial on tutsplus to be extraordinarily good at explaining Sublime. It is free (as in beer) software and not open source.
is another editor that I have heard good things about. Lime is an editor developed in Go that aims to be an open-source replacement for Sublime Text.
Personally, after trying various text editors, I have always come back to using vim. A few good plugins for vim have appeared in recent times. Plugin managers such as Vundle ease the installation of plugins. YouCompleteMe is a nice plugin for auto-completion. vim-spf13 is a nice distro of vim, where various plugins and colorschemes are pre-packaged.
In the modern day of computing, most programs are driven by a Service Oriented Architecture (SOA for short). Web services are the preferred way of communication among servers as well. While we are talking about services, please read this nice piece by Steve Yegge.
Memcached is a distributed (across multiple machines) caching system which can be used in front of your database. It was initially developed by Brad Fitzpatrick while he was the head of LiveJournal; he is now (2014) a member of the Go team at Google. While at Google, he started GroupCache, which, as the project page says, is a replacement for memcache in many cases.
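The idea of putting a cache in front of a database can be sketched in a few lines. The snippet below is only an in-process toy, assuming a plain map guarded by a mutex; it does not use the real memcached or GroupCache client APIs, and the names Cache, GetOrLoad and slowLookup are made up for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a tiny in-process stand-in for a memcached-style cache.
// A real deployment would talk to cache servers over the network.
type Cache struct {
	mu   sync.Mutex
	data map[string]string
}

func NewCache() *Cache {
	return &Cache{data: make(map[string]string)}
}

// GetOrLoad returns the cached value for key. On a miss it calls load
// (standing in for the database query), stores the result and returns it.
func (c *Cache) GetOrLoad(key string, load func(string) (string, error)) (string, error) {
	c.mu.Lock()
	cached, ok := c.data[key]
	c.mu.Unlock()
	if ok {
		return cached, nil // cache hit: the database is not touched
	}

	loaded, err := load(key)
	if err != nil {
		return "", err // do not cache failures
	}

	c.mu.Lock()
	c.data[key] = loaded
	c.mu.Unlock()
	return loaded, nil
}

func main() {
	dbHits := 0
	// slowLookup pretends to be an expensive database query.
	slowLookup := func(key string) (string, error) {
		dbHits++
		return "value-for-" + key, nil
	}

	cache := NewCache()
	for i := 0; i < 3; i++ {
		v, _ := cache.GetOrLoad("user:42", slowLookup)
		fmt.Println(v)
	}
	fmt.Println("database hits:", dbHits) // prints database hits: 1
}
```

Only the first request reaches the "database"; the next two are served from memory. A real cache also has to deal with expiry and invalidation, which this sketch leaves out.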
The Google File System (GFS) paper is a seminal paper on how Google created a filesystem to suit their large data processing needs. There is a database built on top of this filesystem named BigTable, which powered Google's infrastructure. Apache Hadoop is an open source implementation of these concepts, which was originally started at Yahoo and is now a top-level Apache project. HDFS is the equivalent of GFS for Hadoop. Technologies such as Hive are used to query and analyze data in Hadoop.
As with the evolution of any software, GFS has evolved into the Colossus filesystem and BigTable has evolved into the Spanner distributed database. I recommend that you read these papers even if you are not going to do any distributed computing development.
Cassandra is another distributed database, which was initially started at Facebook but is used in many companies such as Netflix and Twitter. I have used Cassandra more than any other distributed project and actually like it a lot. It uses a SQL-like query language called CQL (Cassandra Query Language). It is modelled after the Dynamo paper from Amazon. I am very tempted to write an alternative to it in Go, just to get the experience of writing a large scale distributed system instead of merely using one as a client, but I have not got around to finding a good dataset or use case with which I can test it.
is another document-oriented database, which I tried using for a pet project of mine. I don't remember exactly, but there were some problems with respect to unicode handling. The project was done before Go reached 1.0, so the problem could have been at either end.
Most of the new age databases are called NoSQL databases, but what that really means is that the database skips a lot of functions (such as datatype validation, stored procedures, etc.) and tries to grow by scaling out instead of scaling up.
is a suite of open source projects that help you create a private cloud. DeltaCloud is a project which was initially started by RedHat and is now a top-level Apache project; it provides a single API layer that works across any cloud in the backend. This project is written in Ruby. I was initially interested in participating in its development, until I got introduced to Go and went off on a different tangent.
Starting a software company is very easy in today's world. The public clouds are becoming cheaper every day and their capacity can be provisioned instantly. Amazon Web Services provides an umbrella of various public cloud offerings. I have used Amazon EC2, which is a way to create a Linux (or Windows) VM that runs in Amazon's datacenters. The machines come in various sizes. Amazon S3 is a cloud offering that gives you a way to store data in buckets. It is used heavily by Dropbox for storing all your data. There are various other services too. In some of our prototyping, we found the performance of Amazon EC2 to be mostly consistent, even in the free tier.
Google is not lagging behind with their cloud offerings either. When Google Reader was shut down, I used Google's App Engine to deploy an alternative FOSS product, and I was blown away by the simplicity of creating applications on top of it. Google Compute is the way to get VMs running on the Google Cloud. As with Amazon, there are plenty of other services.
There are plenty of other players like Microsoft Azure, Heroku etc. but I do not have any experience with their applications. While we are talking about the cloud, you should probably read about orchestration and know about at least Zookeeper.
These are databases which you can embed into your application, without needing a dedicated server. They run inside your application's own process.
SQLite is among the world's most widely deployed software, and it competes with fopen to become the default way to store data for your desktop applications (if you are still writing them ;) ). A new branch is coming with the latest rage in storage data structures, a log-structured merge tree.
LevelDB is a database written by the eminent Googlers (and trendsetters of technology in the last decade or so) Jeff Dean and Sanjay Ghemawat, who gave us MapReduce, GFS etc. It has been forked by Facebook into RocksDB. There are other projects in this space as well.
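Since the log-structured merge tree keeps coming up with these embedded databases, here is a toy sketch of the core idea: writes go into an in-memory table, a full table is flushed as an immutable sorted run, and reads check the newest data first. This is an assumption-laden illustration and not how any of the databases above are actually implemented; the names LSM, Put, Get and flush are made up, and real stores add a write-ahead log, on-disk files and background compaction.

```go
package main

import (
	"fmt"
	"sort"
)

type entry struct{ key, value string }

// LSM is a toy log-structured merge store kept entirely in memory.
type LSM struct {
	memtable map[string]string // recent writes
	limit    int               // flush once the memtable holds this many keys
	runs     [][]entry         // immutable sorted runs, oldest first
}

func NewLSM(limit int) *LSM {
	return &LSM{memtable: make(map[string]string), limit: limit}
}

// Put records a write in the memtable and flushes it when it is full.
func (s *LSM) Put(key, value string) {
	s.memtable[key] = value
	if len(s.memtable) >= s.limit {
		s.flush()
	}
}

// flush turns the memtable into a sorted run and starts a fresh memtable.
func (s *LSM) flush() {
	run := make([]entry, 0, len(s.memtable))
	for k, v := range s.memtable {
		run = append(run, entry{k, v})
	}
	sort.Slice(run, func(i, j int) bool { return run[i].key < run[j].key })
	s.runs = append(s.runs, run)
	s.memtable = make(map[string]string)
}

// Get looks in the memtable first, then in the runs from newest to oldest,
// so the most recent write for a key always wins.
func (s *LSM) Get(key string) (string, bool) {
	if v, ok := s.memtable[key]; ok {
		return v, true
	}
	for i := len(s.runs) - 1; i >= 0; i-- {
		run := s.runs[i]
		pos := sort.Search(len(run), func(k int) bool { return run[k].key >= key })
		if pos < len(run) && run[pos].key == key {
			return run[pos].value, true
		}
	}
	return "", false
}

func main() {
	db := NewLSM(2)
	db.Put("a", "1")
	db.Put("b", "2") // memtable is full, so it is flushed as the first run
	db.Put("a", "3") // the newer value for "a" lives in the memtable
	v, _ := db.Get("a")
	fmt.Println(v, len(db.runs)) // prints 3 1
}
```

Writes never modify an existing run, which is what makes this layout friendly to sequential disk I/O in the real implementations.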
Since we have covered GFS, HDFS, etc. earlier, we will look at other popular filesystems here. btrfs is a copy-on-write filesystem in Linux. It is intended to be the de-facto Linux filesystem in the future, possibly making the ext series obsolete in the longer run. XFS is a filesystem that initially came from SGI to Linux. This is my personal favorite and I have been using it on all my Linux machines. In addition to good performance, it offers robustness and comes with a load of features that are useful to me, like defragmentation.
We also have the big daddy of filesystems, zfs, on Linux too.
is another interesting distributed filesystem that works in kernel space and has been part of the Linux kernel sources for a long time now. GlusterFS is another distributed filesystem, which works in userspace. Both of these filesystems focus on scaling out instead of scaling up.
Pick any of these technologies that you like and start writing a toy application on it, maybe something as simple as a ToDo application, and learn through all the stages. This approach has helped me. It may help you also.
I have written this post from a Thinkpad T430 running openSUSE Factory and GNOME Shell with a bunch of KDE tools. I like this machine. However, in the past few months I have realized that, in today's world, if you are a developer, it is best to run Linux on your server and Mac on your laptop.