Median Tracking

Median tracking is the process of tracking the median of a collection of values as elements are inserted. Remember that the median is not the average value but the value in the middle. So, for example, in an array of 3 items [1,2,3] the median element, assuming 0-based indexing, is the item at index 1, which is 2. Now we already know that a heap is a good data structure for tracking the minimum or maximum of a set of values. With a heap we can read an array and push elements onto the heap as they are read; if the heap is a max heap, the largest of the elements will always be at the top, and if it's a min heap the smallest will be at the top, ready to be popped whenever we want. We can exploit this by using 2 heaps to track the median value. One heap is a max heap containing the lower half of the numbers and the other is a min heap containing the upper half. Thus, the top of the max heap is the maximum of all the lower values and the top of the min heap is the minimum of all the upper values. The median element is then one of those two values. If the heap sizes are uneven, pick the element at the top of the larger heap as the median. If the heap sizes are even, pick one heap and always use the top of that heap. The algorithm is shown in C++ below.
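
A minimal sketch of the two-heap approach follows; the class and method names are illustrative, not those of the full listing.

[source,cpp]
----
#include <functional>
#include <queue>
#include <vector>

// Tracks the median of a stream using two heaps:
// `lower` is a max heap holding the smaller half of the values,
// `upper` is a min heap holding the larger half.
class MedianTracker {
    std::priority_queue<int> lower;                                        // max heap
    std::priority_queue<int, std::vector<int>, std::greater<int>> upper;   // min heap
public:
    void push(int x) {
        if (lower.empty() || x <= lower.top()) lower.push(x); else upper.push(x);
        // Rebalance so the heap sizes never differ by more than one.
        if (lower.size() > upper.size() + 1) { upper.push(lower.top()); lower.pop(); }
        else if (upper.size() > lower.size() + 1) { lower.push(upper.top()); upper.pop(); }
    }
    // Assumes at least one element has been pushed.
    int median() const {
        // Uneven sizes: the larger heap holds the median.
        // Even sizes: always use the top of the max heap (the lower median), by convention.
        if (lower.size() >= upper.size()) return lower.top();
        return upper.top();
    }
};
----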

The full source code is here

Additionally it would be trivial to extend this idea to track the kth smallest number by limiting the size of the max heap to be k.
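
As a sketch of that variation (names are illustrative): keep the max heap at no more than k elements, and once it is full its top is the kth smallest value seen so far.

[source,cpp]
----
#include <cstddef>
#include <queue>

// Bounded max heap: after each push the heap holds the k smallest
// elements seen so far, so its top is the kth smallest.
class KthSmallestTracker {
    std::size_t k;
    std::priority_queue<int> heap;   // max heap of the k smallest values
public:
    explicit KthSmallestTracker(std::size_t k) : k(k) {}
    void push(int x) {
        if (heap.size() < k) heap.push(x);
        else if (x < heap.top()) { heap.pop(); heap.push(x); }
    }
    // Valid once at least k elements have been pushed.
    int kth_smallest() const { return heap.top(); }
};
----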

Karger’s Min Cut

A minimum cut of a graph is the smallest set of edges that, when removed, disconnects the graph. Karger's min cut algorithm is a process that takes a given graph and finds these edges. It's a randomized algorithm because the edge it picks at each stage is randomly chosen. That edge is removed and one endpoint of the edge collapses onto the other endpoint. The edges attached to the collapsed vertex are reattached to the surviving vertex, the vertex that is collapsed to. The process repeats, choosing another edge at random and collapsing its endpoints. The process is illustrated in the following sequence of diagrams.

Initial Graph

Here the edge {1,3} is randomly selected and we decide to collapse vertex 3 to vertex 1.

Collapse edge 1 3

This results in the following graph where vertex 3 has disappeared into vertex 1 and its edges are now effectively attached to vertex 1. Now we select one of the {1,2} edges and collapse vertex 2 onto vertex 1.

Collapse edge 2 1

This results in the following graph with a loop from vertex 1 to vertex 1. This was the other {2,1} edge in the previous graph. Self loops like this are removed. This is the final graph. Once we get down to 2 vertices we stop. We can see that the number of remaining edges is 2, therefore the size of the cut found by this run, which here is the minimum cut of the graph, is 2.


Note that this isn't the only possible outcome. The algorithm, because it randomly picks edges, could have come up with a cut of 3 instead. Therefore you have to run the algorithm multiple times to increase the probability of finding the true minimum cut.

My first attempt at this was in C++. I modelled the adjacency list of vertices using a vector, and each element contained a vector of edges that were connected to the vertex. Each edge contained two endpoints which were pointers to vertices. The adjacency list of edges was a vector of these edges. My first attempt didn't work. I got my head in a spin trying to reason about pointers from vertices to edges that themselves contained pointers back to vertices, and the whole thing didn't work properly. I was surprised that I couldn't get it right, and frustrated since I'd spent a few days on it.

Eventually I bit the bullet and decided to start from scratch again, but this time choosing a simpler representation of the graph. I still used an adjacency list but, instead of pointers, used the vertex labels. For example, the vertex list for the graph above is [1 => [4,3,2], 2 => [1,3], 3 => [2,4,1], 4 => [1,3]]. The corresponding edge list is [{1,2},{2,3},{3,4},{4,1},{1,3}]. Then, with pen and paper, I stepped through what would happen if an edge was removed and the endpoints collapsed. It allowed me to observe the changes to the lists themselves and come up with a different algorithm for collapsing a vertex, one not based on the pictorial representation above, which I coded in C++. The steps are listed below, followed by a sketch of the collapse step.

  • Pick a random edge; call its endpoints i and j. Let j be the vertex that is collapsed.
  • Remove edge {i,j} from the list of edges.
  • In the list of edges, replace all occurrences of j with i.
  • Remove self loops from the edge list.
  • Find the vertices connected to vertex j, excluding i.
  • Append those vertices to vertex i's list of vertices.
  • Remove vertex j from the list of vertices.
  • For each vertex, replace all occurrences of j with i in that vertex's list of vertices.
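
For illustration, here is a sketch of that collapse step in C++ using the label-based lists above; the surrounding min-cut loop and the random edge selection are left out.

[source,cpp]
----
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>;
using AdjList = std::map<int, std::vector<int>>;

// Collapse vertex j onto vertex i, mirroring the steps listed above.
// Assumes {i,j} is currently present in `edges`.
void collapse(int i, int j, std::vector<Edge>& edges, AdjList& vertices) {
    // Remove the chosen {i,j} edge from the edge list.
    auto chosen = std::find_if(edges.begin(), edges.end(), [&](const Edge& e) {
        return (e.first == i && e.second == j) || (e.first == j && e.second == i);
    });
    edges.erase(chosen);

    // Replace every remaining occurrence of j with i, then drop the self loops this creates.
    for (auto& e : edges) {
        if (e.first == j) e.first = i;
        if (e.second == j) e.second = i;
    }
    edges.erase(std::remove_if(edges.begin(), edges.end(),
                               [](const Edge& e) { return e.first == e.second; }),
                edges.end());

    // Move j's neighbours (excluding i) over to i, then remove j from the vertex list.
    for (int v : vertices[j])
        if (v != i) vertices[i].push_back(v);
    vertices.erase(j);

    // Relabel j as i in every remaining neighbour list, and drop the
    // self references this creates in i's own list.
    for (auto& entry : vertices)
        for (int& v : entry.second)
            if (v == j) v = i;
    auto& ni = vertices[i];
    ni.erase(std::remove(ni.begin(), ni.end(), i), ni.end());
}
----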

I reimplemented the algorithm in Elixir and decided not to publish the source code since this is an assignment question from the Stanford Algorithm Design course on Coursera.

Written in Asciidoc using Vim.

No Debugging Programming in Elixir

At last I have found a language that I do not need to debug. I can just write the code and have it run on the input data, and it will give me the correct result. The language is Elixir. This is the Holy Grail!

So this is what happened. I'm studying algorithms and doing a course, and one of the assignments was to write an inversion counter. So what's an inversion? It's basically where, in a list of numbers, a pair of numbers is out of order: a larger number appears before a smaller one.

Example of no inversions


Example of inversion. Here 3 is out of order.


Here there are 6 inversions.


A brute force algorithm that calculates this would have to check each number against all the other numbers in the list and therefore would be O(n^2), which is inefficient. The trick is to piggyback on mergesort, which has a running time of O(n log n). The reason this works is that the merge step of mergesort detects inversions for us. When writing out the merged array from the two subarrays, every time an element from the right subarray is copied out while elements remain in the left subarray, that element forms an inversion with each of those remaining left elements, so we add the number of remaining left elements to the count. At the end of the day a few subtle modifications to the mergesort algorithm are all that's needed. And that's what I did in Elixir. I stumbled along and my tired fingers and sore eyes cranked out some Elixir. The biggest surprise was that it worked (ignoring compilation errors) right from the start. I didn't have an input case that produced the wrong answer for the assignment. I didn't have to go back and rewrite any of the pattern matches. I didn't have to fix the recursion because I'd put the calls in the right place. Here it is.
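
A sketch of the merge-and-count step, written here in C++ rather than Elixir for illustration (function names are mine):

[source,cpp]
----
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counts inversions by piggybacking on mergesort: whenever an element from the
// right half is merged while elements remain in the left half, it forms an
// inversion with every one of those remaining left elements.
std::uint64_t sort_and_count(std::vector<int>& a, std::size_t lo, std::size_t hi) {
    if (hi - lo < 2) return 0;
    std::size_t mid = lo + (hi - lo) / 2;
    std::uint64_t count = sort_and_count(a, lo, mid) + sort_and_count(a, mid, hi);

    std::vector<int> merged;
    merged.reserve(hi - lo);
    std::size_t i = lo, j = mid;
    while (i < mid && j < hi) {
        if (a[i] <= a[j]) {
            merged.push_back(a[i++]);
        } else {
            merged.push_back(a[j++]);
            count += mid - i;   // a[j] is smaller than all remaining left elements
        }
    }
    while (i < mid) merged.push_back(a[i++]);
    while (j < hi) merged.push_back(a[j++]);
    std::copy(merged.begin(), merged.end(), a.begin() + lo);
    return count;
}

std::uint64_t count_inversions(std::vector<int> a) {
    return sort_and_count(a, 0, a.size());
}
----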

The problem with my Elixir solution is that it is slow. On an input set of 100000 numbers it took 30 seconds to compute. It contains a hidden operation that makes the summarize function O(n^2). The hidden operation is the ++ that appends the element to the end of the list. To do this it has to walk to the end of the list, which is an O(n) operation, turning summarize from an O(n) function into an O(n^2) function. The way to fix it is to prepend to the head of the list and then reverse the list at the end, as shown below.
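
The same prepend-then-reverse pattern, sketched here in C++ with a singly linked list standing in for an Elixir list (purely illustrative):

[source,cpp]
----
#include <forward_list>
#include <vector>

// Appending to a singly linked list means walking to the end each time: O(n)
// per append and O(n^2) overall. Prepending is O(1), so build the list in
// reverse order and do a single O(n) reversal at the end.
std::forward_list<int> to_list(const std::vector<int>& xs) {
    std::forward_list<int> out;
    for (int x : xs) out.push_front(x);   // O(1) prepend per element
    out.reverse();                        // one O(n) pass restores the original order
    return out;
}
----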

Applying algorithms intuitively

Algorithms need to be internalized such that you can produce a solution without conscious effort, similar to how you produce words without thinking explicitly about them. In fact, thoughts are created in the subconscious and the conscious mind only becomes aware of them afterwards; there's no conscious control over them. What you do have control over is what you program into your subconscious via learning, repetition and active problem solving. I believe this can be applied to different problem domains such that, after training for a period of time, one can intuitively, at a subconscious level, see a real world problem and map it to a set of algorithms that solve it.

Xnodes – Architecture of a Scalable Distributed System

Note that this document is a work in progress and represents my ideas about how a scalable, distributed, highly available cluster should work.

What is Enodes

  • A platform to enable a distributed highly available heterogeneous manageable cluster.
  • A platform for Internet of Things devices.
  • A platform for building network elements.

Let's analyze the first description. There are a lot of terms in the sentence "A platform to enable a distributed highly available heterogeneous manageable cluster" that need to be defined.

  • Distributed. Nodes in the system can be geographically distant. They need not be on the same local area network.
  • Highly Available. Services running in the system are highly available. The loss of one instance of a service does not affect the delivery of the service. Overloading one instance of a service does not block clients from using that service.
  • Heterogeneous. Nodes in the system do not have to be on the same hardware or same version of software.
  • Manageable. Facilities are provided for the installation of applications and for their upgrade and downgrade, with preservation of transforms to each application's configuration.
  • Cluster. A set of computing units or nodes connected in a network.

Availability or 99.9999% uptime
Say we want to have a highly available service. Let's pretend it's a database. We want it to tolerate a failure on the host machine so that it is still available. The simplest solution is to have it running on another host machine that acts as a standby. But what if numerous requests come in, overloading the active database? Well, let's make the standby node active as well. Now we just need a load balancer at the front that round-robins connections from clients between the two active databases. Simple. No need for overly complicated redundancy policies. Now what if we want to scale out to three nodes in our cluster? Easy: we just add another node, make it active and configure our load balancer to round-robin connections between the three nodes in our cluster. If one node dies the other two will pick up the slack. Again, no need for complicated redundancy groups. Availability is ensured by having multiple instances of the service running at any one time. If the services need some sort of logic to decide which instance is the master, they can build it themselves or Enodes can provide a service that makes that decision.
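
As a toy sketch of that round-robin idea (the node names are made up): adding a node to the cluster is just another entry in the backend list.

[source,cpp]
----
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A load balancer in miniature: hand out backends in round-robin order.
// Assumes at least one backend has been configured.
class RoundRobinBalancer {
    std::vector<std::string> backends;
    std::size_t next = 0;
public:
    explicit RoundRobinBalancer(std::vector<std::string> b) : backends(std::move(b)) {}
    const std::string& pick() {
        const std::string& chosen = backends[next];
        next = (next + 1) % backends.size();
        return chosen;
    }
};

// e.g. RoundRobinBalancer lb({"db-node-1", "db-node-2", "db-node-3"});
----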

Data consistency
Ok, great, so we have a cluster of nodes and we have a way to ensure that a service is always available. But what about the configuration data of that service? In other words, which node is the authority on the information stored? And if we have a service running on multiple nodes, surely all instances of the same service should share the same view of the data? Surely we need consistency between the values in the different databases? I say rethink that assumption. How do you think Facebook and Amazon ensure such high availability and scalability? (Hint: they don't ensure consistency, only that eventually the system will tend to a consistent state.) It's up to the application designer to decide which parts of their data they definitely want to keep consistent, and in the case where there is contention, it's up to the application to handle it. In short, it becomes a user interface problem. This model is not as bad as it sounds and in practice is far better than trying to enforce consistency on a database that has to scale and be available. The alternative, forcing consistency, requires every client application to agree to the transaction that is about to be processed, slowing the responsiveness of every application since they all have to wait. This has the side effect of reducing availability: even if a client has established a connection with the database service, while it's waiting for the database to answer another client's request, that database is effectively offline.

This has been thought through in the past; if it hadn't, Amazon, Google, Facebook etc. would not be in business. Even git uses this conceptual model. Every time we use git we are sacrificing consistency and gaining availability and scalability. We sacrifice consistency because we can make changes to our local repository without informing the upstream repository. When we commit a change locally, the local and upstream repositories diverge. When someone else commits a change locally, all other local repositories are no longer in sync. Importantly, git doesn't take any steps to enforce consistency. Consistency happens as part of the workflow and as part of the user interface. The users of git bear the responsibility and do the work required to maintain consistency. We know that our local and upstream repositories will eventually be consistent because after everyone pushes their local commits, and everyone pulls from the upstream repository, all repositories will be consistent.

Similarly, Apache Cassandra is a database that does what I've just outlined: all nodes will eventually tend towards a consistent state. One description of Cassandra puts it this way: “Cassandra weak consistency comes in the form of eventual consistency which means the database eventually reaches a consistent state. As the data is replicated, the latest version of something is sitting on some node in the cluster, but older versions are still out there on other nodes, but eventually all nodes will see the latest version.” In Cassandra's case the applications, and not the user as is the case with git, bear the responsibility of resolving conflicts. For another view on the tradeoff between data consistency and availability, see the references at the end of this post.


These are the properties we want a configuration database to satisfy.

  • Scale from 1 to thousands of nodes.
  • High availability.
  • Eventually consistent (always allow writes).
  • Tolerate network partitions.
  • Throughput and speed of read/write transactions aren't a concern. It's intended to be used as a configuration database, and configuration data doesn't change frequently.


SQL databases
Traditional SQL databases are strongly consistent and can be made highly available. They tend to be run on a handful of top-of-the-line servers, depending on application demands. Scaling out with them is hard because they favour strong consistency, so the more servers you add, the more time it takes for the service as a whole to maintain its consistency guarantees on transactions. They would be unsuitable for the Enodes platform, since we require horizontal scalability.


Key value databases
These are databases, like Riak and Redis, that store keys and values. Simple queries are straightforward, but more complicated queries like "give me key x if key y satisfies some condition" require more code on the application side. Essentially the application has to implement the logic that SQL queries give you for free. However, the benefit of these databases is that they scale horizontally very easily, work well as a cache, and are good for structureless/schemaless data.
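
For example, a "give me x if y satisfies some condition" lookup has to be coded in the application; the sketch below uses an in-memory map as a stand-in for the key-value store, with made-up keys.

[source,cpp]
----
#include <optional>
#include <string>
#include <unordered_map>

// Stand-in for a key-value store; a real client would expose similar
// get/put calls over the network.
using KVStore = std::unordered_map<std::string, std::string>;

// "Give me the value of x, but only if the value of y satisfies a condition."
// With SQL this is one query; with a key-value store the application does the work.
std::optional<std::string> get_if(const KVStore& db, const std::string& x,
                                  const std::string& y,
                                  bool (*condition)(const std::string&)) {
    auto yit = db.find(y);
    if (yit == db.end() || !condition(yit->second)) return std::nullopt;
    auto xit = db.find(x);
    if (xit == db.end()) return std::nullopt;
    return xit->second;
}

// e.g. get_if(db, "service/port", "service/enabled",
//             [](const std::string& v) { return v == "true"; });
----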

Graph databases
These databases are specialized to store information in a graph. They are application specific, used to model a social network or other domains with a graph-like structure. I've got a feeling that this is the kind of database you turn to when you already know that you need it. In other words, if you're questioning whether you need this, then you don't.

Document databases
These store data in what essentially look like structs or records from a programming language, or a JSON object. They map easily to the code one might write for an application, since almost every language I've seen (except Bash) has some facility to define structs or records.

Column family databases
These are large volume, high availability databases, like Cassandra, that are used when all your data will not fit on a single server, which means you probably have several hundred terabytes of data. In the case of an enode, there will definitely not be more configuration data than the enode can handle, so I think we can safely ignore these types of databases.

A note on monitored/read-only data

Monitored data requires fast reads and writes and has a different set of properties from config data. Most noticeably, it's not configurable. The user can't change the value of a database entry that is used for monitoring some aspect of a system, like the packets per second on a NIC. It's not persisted either. I'm not sure what to use for monitored data yet. It would be possible to use the same database as the configuration database, but that's not ideal.

Hypothetical Use Cases

To give you an idea of what Enodes helps with, the following use cases are outlined as a thought experiment.

Running a web server
The typical scenario is an organization running a web server that delivers content to external clients. Enodes will manage changes to the configuration of the server. Transforms to the service's config are preserved so that a rollback can be performed. Additionally, a snapshot of the content associated with that instance of the config is taken. This allows a proper rollback of the data; it's not sufficient to only record the configuration changes. Dependencies between configuration and associated binaries are also tracked.

For example, assume that the Apache Web Server (httpd) has a dependency on OpenSSL. The current system is running version A of httpd and version A of OpenSSL. Then one day OpenSSL implements a new version of the TLS protocol and releases version B of OpenSSL. Later, the Apache developers release a version, B, of httpd that supports version B of OpenSSL. They also add another key to the httpd.conf configuration file. This means that this config can only work with version B of httpd and requires the new version of the OpenSSL shared object on the host machine. Anyway, the system admin performs the upgrade, installing version B of httpd and version B of OpenSSL and updating the httpd.conf file to use the new configuration option. A few days later, for some reason, they decide to roll back httpd. Maybe an exploit was discovered in the latest version of httpd, something they couldn't test for before upgrading. Enodes allows them to do this safely because the configuration data associated with version A of httpd is preserved. This state becomes the state that all nodes in the system will tend to. Content can also be snapshotted along with the version of the software installed.

Installing software
Installing a new piece of software changes the configured state of the system. Software is first installed on any one node in the system, and then the other nodes, following that node, automatically install the software. In cases where the installation fails, an alarm is raised.

Upgrading changes the configured state of an existing software package. The old state is snapshotted (including config data and content), the binaries are updated and the new config put in place. Configuration transforms are executed. Again, this is done on a single node in the Xnode system first. All other nodes then tend to this node's configuration.

When downgrading, the software binaries are reverted to the previous version. The previous configured state is restored. Other nodes tend to this configuration.

When uninstalling, the application binaries are removed from the running instance of the node. Configuration state is preserved. All nodes will tend to this new state. A system admin can always roll the system config back to a time when the application was installed.

In production testing
Say you want to roll out the latest version of WordPress. You have an Xnode system running but want to test this latest version of WordPress. Enodes will allow you to 1) select the nodes to isolate and 2) roll out the WordPress upgrade on the isolated nodes. Then you can test the new version of WordPress by configuring your network to redirect a percentage of requests to the isolated set of nodes.

Configuration data

The Enodes database is CouchDB. It is intended to store system configuration information, like the packages installed on the Enodes system and their configuration. Changes to the configuration are not immediately synchronized. Instead, because of the way the database replicates, the state of the system will eventually become consistent. By trading away immediate consistency, Enodes can provide scalability and availability. The following is a quote from the OpenSAF IMM Programmers Guide and describes a design Enodes will avoid:

“During the time when sync is in progress, the IMMSv is not writable. The IMMNDs will reject any request that implies a mutation to the repository with SA_AIS_ERR_TRY_AGAIN. Read requests are of course still serviced. Updates to cached non-persistent runtime attributes are also allowed during sync.”

Open Issues

How are services handled that require redundancy but where not every instance of the service can be in use at once? For example, how would Enodes handle a DHCP service?


References

  • Beating the CAP Theorem Checklist
  • Time, Clocks, and the Ordering of Events in a Distributed System
  • The Byzantine Generals Problem
  • CAP Twelve Years Later

People that drag you down …

People that are bad at what they do and don't know it are funny to laugh at. But imagine having to work with someone like that. Their mistakes might be amusing for a while, but it doesn't take long for it to get tiring and frustrating, and for me to lose my motivation. Additionally, having to work with below-average colleagues, who are supposed to be senior fucking developers, doesn't extend me. I can't grow and develop if I have to play nanny to questions and thought patterns that a 5 year old could work out. The key here is work out, or solve. It's not so much the facts that matter in development but how you use them. Here's an example:

Wing: I can’t build. It errors with ‘eu-strip not found’.
Me: The error is telling you that you need eu-strip. Install the package that contains eu-strip.

… some time passes….

Wing: ‘apt-get install eu-strip’ doesn’t work.

I’m already annoyed at the first question. But then the second question? Did she even stop to think?

Hell of the North

Hell of the North is a cozy little restaurant with a rustic feel in Fitzroy, Melbourne. The food is French inspired, with a fairly straightforward, no-fuss menu with words you can recognize. Chicken, fish, free range pig. I didn't see 'jus de …' anywhere on the menu. The most foreign thing on it is 'pommes frites'. Unpretentiousness is a good thing, and here they let the food do the talking. The entrees were a black-inked squid stuffed in a crispy ball and a twice-cooked cheese souffle. I didn't like the squid dish, though I feel it just wasn't to my taste and that's in no way a negative reflection on its quality; it had a well balanced seafood flavour that never overpowered, but it was eclipsed by the twice-cooked cheese souffle. I'm an unsophisticated diner, so a lot of the subtleties present in the dishes were lost on me. The simplicity of the menu understates the complex nature of the food, which was interesting. Overall I thought it was very good.

Cured Kingfish with Apple

Black inked squid balls