Colors of The Dark Knight trilogy

[Three color strips, one per film of the trilogy]


These three images above show the average color of every single second of The Dark Knight trilogy.

In order to create them, I used ffmpeg to split each movie into separate PNG files at 1 fps, i.e. one file per second. For each film there were around 8,000 of these files.

I then wrote a simple Go program that reads each image file and calculates the average RGB values across the whole frame. Thanks to Go's concurrency patterns I could easily run 1,000 of these tasks at once before generating the final images; my computer started to struggle beyond 1,000 goroutines, so it made sense to process the frames in 1,000-second blocks.
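To make the idea concrete, here is a rough Python sketch of the per-frame averaging and the batched fan-out (the original was Go; `frames` is assumed to be any sequence of decoded RGB pixel lists, and the ffmpeg invocation in the comment is illustrative rather than the exact command used):

```python
from concurrent.futures import ThreadPoolExecutor

# Frames would come from something like: ffmpeg -i film.mp4 -vf fps=1 frame%05d.png

def average_rgb(pixels):
    """Mean (r, g, b) over one frame's pixels, given as (r, g, b) tuples."""
    n = len(pixels)
    return tuple(sum(p[channel] for p in pixels) // n for channel in range(3))

def average_frames(frames, workers=32):
    """One task per frame, mirroring the post's batches of concurrent workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(average_rgb, frames))
```

The per-frame results can then be painted as one column (or pixel) per second to produce the final strip.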

The whole project was reasonably easy and only took a few hours, and most of the processing time seemed to be spent in ffmpeg rather than Go.

From the final images it is reasonably easy to discern which scenes are which, if you are familiar with the movies.


My first Chrome extension


My tumblr, visualized

After spending the weekend on some reddit data visualizations, using text to create 'ASCII/word circles' that show how each letter is connected to the next, I made my first Chrome extension. It is essentially a visualization of Markov chains.
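For the curious, the data behind such a visualization is just letter-to-letter transition counts; a minimal sketch (my own function, not the extension's actual code):

```python
from collections import defaultdict

def letter_transitions(text):
    """Count how often each letter is immediately followed by each other
    letter -- the Markov-chain data behind the circle visualization."""
    counts = defaultdict(lambda: defaultdict(int))
    letters = [c for c in text.lower() if c.isalpha()]
    for a, b in zip(letters, letters[1:]):
        counts[a][b] += 1
    return counts
```

Each non-zero count becomes a chord between two letters on the circle, weighted by frequency.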

You can get the Chrome extension from the Chrome Web Store now. Simply click the button in your toolbar to view the visualization for any web page.

Parsing HTML easily in Objective-C with ObjectiveGumbo

Previously, if you wanted to parse HTML on iOS/OS X with Objective-C, you had to use an XML parser. I have now written an Objective-C wrapper around Gumbo, Google's new C HTML5 parsing library, so that you can parse HTML with minimum hassle.

To get started you will first need to download the Gumbo source code from its GitHub repo and follow the getting started instructions there.

Once you have a working local copy of Gumbo (you can validate this by running one of the samples), download the source code for ObjectiveGumbo from GitHub. To add ObjectiveGumbo to your Xcode project, do the following:

  1. Add the ObjectiveGumbo folder from the repository – this contains the source code for the framework, which is basically just a few classes
  2. Add the src folder from the Gumbo repo (only the .c and .h files)
  3. Ensure that Xcode is set to compile all .c and .m files (Xcode 5 doesn't do this by default when adding files to a project)
  4. Validate that the project builds correctly
  5. Import “ObjectiveGumbo.h” when you want to use it in your app

Usage of ObjectiveGumbo is pretty simple. You can either parse a whole document, which gives you access to the DOCTYPE information, or just the root element (i.e. the body). These can be loaded from an instance of NSData, NSURL or NSString.

A simple example that fetches the current top stories on Hacker News:

OGNode *data = [ObjectiveGumbo parseDocumentWithUrl:[NSURL URLWithString:@""]];
NSArray *tableRows = [data elementsWithClass:@"title"];
for (OGElement *tableRow in tableRows) {
    if (tableRow.children.count > 1) {
        OGElement *link = tableRow.children[0];
        NSLog(@"%@", link.attributes[@"href"]);
    }
}

More detail is explained in the README and there is also a more developed Hacker News example (iOS) and simple Markdown to HTML parser (OSX – not complete) in the GitHub repo.

Building color palettes from images with C#


This image shows the Wikimedia ‘Best picture of 2012’ with the generated color palette.

Today I was playing around a little with C# and I wrote a simple tool that will generate a color palette of the colors that make up an image. You can get the source on GitHub.

The code works by first loading the image and then counting the number of pixels for each value of hue and saturation. The mean count per (hue, saturation) pair is then calculated, the results are plotted to a 255px by 360px image, and each cell's color is determined by how far its count exceeds the mean. Below are some more sample palettes that I produced:
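The counting and thresholding steps can be sketched as follows (in Python rather than the tool's C#, and with bucket scales of 0-359 for hue and 0-255 for saturation as my own assumption):

```python
import colorsys
from collections import Counter

def palette_buckets(pixels):
    """Count pixels per (hue, saturation) bucket and keep the buckets whose
    count beats the mean -- the thresholding described above."""
    counts = Counter()
    for r, g, b in pixels:
        h, s, _ = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        counts[(int(h * 360) % 360, int(s * 255))] += 1
    mean = sum(counts.values()) / len(counts)
    return {bucket: n for bucket, n in counts.items() if n > mean}
```

The surviving buckets are what get painted into the palette image, brighter the further they sit above the mean.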

N-sided shapes

You’ll probably know that the formula for calculating the interior angle of a regular polygon is 180(n – 2) / n where n is the number of sides:

Angle formula

So if we put 4 (i.e. a square) into the formula we get 90 degrees; if we stick in 5 (a pentagon) we get 108 degrees. Simple. In fact, let's create a JavaScript function to draw these shapes given a starting coordinate, side length and number of sides:

function drawShape(c, startX, startY, sideLength, sideCount) {
  var interiorAngleDegrees = (180 * (sideCount - 2)) / sideCount;
  var interiorAngle = Math.PI - Math.PI * interiorAngleDegrees / 180; // exterior angle in radians
  c.translate(startX, startY);
  c.moveTo(0, 0);
  for (var i = 0; i < sideCount; i++) {
    c.lineTo(sideLength, 0); // draw one side along the current x axis
    c.translate(sideLength, 0);
    c.rotate(interiorAngle); // turn ready for the next side
  }
  c.stroke();
}

Example of a square drawn with function

Now, what happens if we take our original formula and stick in something that isn't an integer? If we put in 2.5 we get out 36 degrees. What does a 2.5-sided shape look like with the above function?


A ‘2.5’ sided shape.

Clearly this isn’t a full polygon, so perhaps it is better to think of 2.5 as 5/2. Let’s adapt the function a little:

function drawShape(c, startX, startY, sideLength, sideCountNumerator, sideCountDenominator) {
  var sideCount = sideCountNumerator * sideCountDenominator;
  var decimalSides = sideCountNumerator / sideCountDenominator;
  var interiorAngleDegrees = (180 * (decimalSides - 2)) / decimalSides;
  var interiorAngle = Math.PI - Math.PI * interiorAngleDegrees / 180; // exterior angle in radians
  c.translate(startX, startY);
  c.moveTo(0, 0);
  for (var i = 0; i < sideCount; i++) {
    c.lineTo(sideLength, 0);
    c.translate(sideLength, 0);
    c.rotate(interiorAngle);
  }
  c.stroke();
}

Now the function can draw 5 / 2 shapes:


A 5 / 2 sided regular polygon.

If you want to check out a full source code example, you can see it here.

Stereoscopic 3D on iOS


iOS devices don’t have any kind of 3D option built in, which I’m glad of because it is a pointless gimmick that gives me headaches. Having said that, I quite like the ‘retro’ stereoscopic 3D that could be achieved with red-cyan glasses.

The premise of 3D images is really simple: you take two images from two positions that are roughly eye distance apart, and then you have to make sure each image reaches only one eye.

About three months ago I was learning the basics of OpenGL ES and I created a dumb Minecraft style world:

'Minecraft' style world

I actually got it pretty well optimised and it runs happily at about 60fps. Looking at the code again this morning, I figured out that stereoscopic rendering wouldn't be that hard to add.

The method is actually really simple:

  • Create two offscreen textures and associated frame buffers* that are the same dimensions as your view
  • On each frame, create two view matrices from your original view matrix, each shifted a little to the side (I went for about 5mm on an iPad screen)
  • Render the scene twice, once with each view matrix, into the associated offscreen textures
  • Present both textures blended together with red and blue

The end result (this is the same view as above) looks a bit like this:

[iOS Simulator screenshot of the anaglyph render]

There are a number of disadvantages to this technique. The first is that you can’t go to really high resolutions. I got this running at 60fps on a retina iPad, but I had to render at 1024 * 768 (rather than native 2048 * 1536) without anti-aliasing. The second is that you lose a lot of color information; I had to grayscale my images and even then they appeared quite dull compared to the original image. The third is that this doesn’t scale well across devices because of the distance between the left/red camera and the right/blue camera. I added a pinch to change distance feature so that I could compare between the iOS simulator and real devices.

Despite this, it is actually quite a cool technique, although I don't think it will become particularly mainstream any time soon.

[Two more iOS Simulator screenshots of the stereoscopic effect]

Update: After some discussion on Reddit I’ve now updated the source a little so that the blending is a little different: the final pixel is made up of the red component of the left pixel and the green and blue components of the right pixel. This maintains color and produces brighter images:
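That blend is easy to state precisely; a minimal sketch, treating images as nested lists of (r, g, b) tuples rather than OpenGL textures:

```python
def anaglyph(left, right):
    """Red channel from the left-eye image, green and blue from the right."""
    return [
        [(l[0], r[1], r[2]) for l, r in zip(lrow, rrow)]
        for lrow, rrow in zip(left, right)
    ]
```

Because each output pixel keeps two full channels from one eye, far less color information is thrown away than with the grayscale red/cyan approach.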

As per request, I’ve also stuck the code on GitHub. It is a little verbose at the moment, but it’s still readable.

*Oh yeah, no GLKit here!

Things I wanted from Android

A year ago today (March 24th 2012) I published a post on my old blog about things that I wanted from Android. I thought I would do a follow up post now, a year later, to see whether or not those things did ever happen:

  • SVG support everywhere: the key advantage of using SVG for UI graphics is that you don’t have problems with pixel density on the screen. Unfortunately Android still doesn’t support the use of SVG like this, so we are stuck with PNGs for UI graphics. The ADT does now make it easier to import one high-resolution graphic and produce lower resolution versions, and 9-patch PNGs are always an option
  • Better font support: Roboto was introduced with Android 4, but Android still lags behind big time on the number of fonts
  • Higher PPI displays: At the time nothing really competed with the iPhone 4/4S, however now there are a good number of Android phones that do.
  • One Android: I had envisioned that all devices would come with one standard Android. This hasn't happened, but the growing popularity of Nexus devices has led to more devices using stock/Google Android
  • Google Play gift cards: these have been introduced, and sales have increased as a result 🙂
  • Better integration between Eclipse and Google Play: when you finish producing an iOS app, it only takes a couple of clicks in Xcode to send it off to Apple. With Android, you still have to produce an APK and upload it via the web interface. Obviously this isn’t much of a pain, but I like Apple’s simple system.
  • OTA updates: obviously automatic updating of the Android system is something that vendors handle rather than Google (aside from Nexus devices, whose increasing dominance is a very good thing for developers), yet the difference between the proportion of users on the latest version of iOS and of Android is still huge. Interestingly, long before the original post was written, Google had announced the 'Android Update Alliance' in 2011, which promised timely updates by manufacturers for 18 months after a device's release. This never happened, and I imagine the reason is that the devices of early 2011 would have been running Android 2.x, whereas the devices released last summer would have been running Android 4.x, which required much higher performance
  • A better emulator: today the emulator is much faster, especially thanks to the introduction of GPU/hardware acceleration and a new Intel (rather than ARM) image that makes the execution of apps a hell of a lot faster

Obviously Android still hasn't met all of these requirements, and I don't think it will meet some of them for quite some time, if ever. I'm pleased, however, to see the changes that have happened and that they have worked. Here is what I would still like from Android, from a development perspective:

  • Faster development time: This is most likely due to my lack of experience with Android development compared to iOS development, however I’ve found that I can write an iOS app much faster than I can write an Android app. Setting up the UI takes longer, linking the UI to code takes longer, the IDE is slower, the emulator* is slower, deploying the app takes longer and the export process takes longer
  • A better IDE: Eclipse has been a great IDE and it is still very usable, however IntelliJ seems to have a lot more useful features
  • Java?: I'm not sure that we need to stick with Java for Android development at all. Xamarin have done an awesome job of bringing C# to Android, and they claim that performance is just as good as (if not better than) Dalvik. That's impressive, and with a better IDE, a nicer language and full API support (but a high cost) it is certainly a viable option

*The iOS ’emulator’ is actually a simulator. When you test iOS apps on desktop it compiles an x86 version instead of an ARM version.

Estimating Syllables

The English language, unlike many others, has no set rules for what does and doesn't make a syllable; the dictionary definition is 'an uninterrupted sound' that helps to form a word. Some dictionaries go on to explain that syllables can be estimated by counting the number of vowels that are surrounded on either side by a consonant.

This basic method actually works in about 75% of the cases I tried. In fact, if you treat y as a vowel, it works for every word in the previous sentence. The problem, however, is that vowels can appear next to one another, and in some cases they form one continuous sound (such as the io in action) while in others they form two separate sounds (such as the io in lion).

What, therefore, can we do to combat this? The first method is to look for sounds that contain multiple vowels and remove one of the vowels, so ome becomes om and ime becomes im. Using the previous method, sometime would have four syllables; by replacing these two sounds we instead correctly identify somtim as having two. I also found it necessary to convert ine to in, ely to ly and ure to ur.

This then solves quite a few problems. Here are some example sentences using solely this method:

  • Shall I compare thee to a summer’s day? = 11
  • Thou art more lovely and more temperate = 13
  • The winds do shake the darling buds of may = 11

Clearly there are faults with the current algorithm, given that Shakespearean sonnets are written in iambic pentameter and therefore have 10 syllables per line (I did find examples where the pronunciation of a word is completely different today, so my algorithm doesn't work as effectively on Shakespearean English). By outputting the number of syllables the program thought was in each word, I was able to identify a fault:

  • Compare = 3 syllables
  • More = 2 syllables
  • Temperate = 4 syllables

If the word ended in an e (and, I later discovered, provided it didn't end in le) then there was one fewer syllable than originally counted, so I simply adapted the program to subtract one where appropriate, as well as handling plural words too (so compares has the s stripped off and e is treated as its last letter).

-e wasn’t the only suffix that I’ve had problems with; -ing causes similar problems. For example, according to the current version walking has two syllables whereas going or flying have only one. Thankfully the rule for fixing this is relatively simple: if the word ends in -ing (or -ings) and the letter immediately before that is a vowel then one syllable should be added. I final suffix that causes problems in a similar fashion (there are others, I won’t discuss them all here) is -ed because happened only has two syllables but the program estimates 3. Despite this, fainted does have two syllables which are correctly found. Therefore I have to examine whether or not there is a proceeding t or d and if there is not then remove a syllable.

So how accurate is the algorithm in its current state? Let’s compare the lines of the sonnet again:

  • Shall I compare thee to a summer’s day? = 10
  • Thou art more lovely and more temperate = 10
  • Rough winds do shake the darling buds of may = 10
  • And summer’s lease hath all to short a date = 10

OK, this is a pretty significant improvement, but the algorithm is still far from perfect. For example, running it on the first sentence of this post incorrectly identifies didn't as having only one syllable and languages as having only two, but it gets everything else correct.

In my case I did not want to write a 100% accurate algorithm; I believe doing so would require a great number of defined 'special cases' that would bloat the algorithm, which would be a shame because it currently occupies just over 20 lines of C# code.

This algorithm is suited to counting syllables, but it is not yet capable of finding them. In some cases people pronounce a word differently (such as long-er and lon-ger), and in some cases it is not clear which letter starts or ends a syllable. Furthermore, language (common-use English especially) is slowly drifting towards fewer syllables per word; missing out the letter t (a pet hate of mine) significantly reduces the number of syllables in one's speech.

You can check out the source code on GitHub.

HTML5 apps on the App Store

There are many HTML5 apps on the App Store but they are not first class citizens on iOS, unlike on other platforms. They have to be executed by UIWebView, a component allowing the presentation of web pages using WebKit but with the major disadvantage that the JavaScript performance is massively throttled compared to performance in Safari (because Safari uses the Nitro engine). This was expressed dramatically by the old Facebook app, and continues to be shown by many other apps.

Apple could quite easily allow developers to produce iOS apps with HTML5 and package them in Xcode for the App Store. Rather than being executed by a UIWebView they could instead be executed by the full Safari engine with its faster JavaScript performance. A similar feature already exists in Safari that allows you to 'pin' websites to the Home Screen; opening them launches them like an app, run by Safari but without the navigation bar.

This would be incredibly advantageous for web developers because it would allow them to charge for their web apps on mobile platforms and lock down the source code if they were distributed via the App Store. It would also make porting between devices a lot easier, although this wouldn’t be as advantageous for Apple.

There are a few problems with this, of course. The first is In-App Purchases. They currently account for a huge proportion of App Store revenue, but JavaScript has no way of accessing them unless Apple allowed the transaction to launch the App Store app, which could prove effective.

Another disadvantage is appearance. Should Apple allow web developers to produce regular HTML markup or should it extend the standard to contain UIViews? The advantage of the former is it would make apps easier to design because existing tools and frameworks could be used. The disadvantage is that it would ruin the current design standards that Apple runs on. Interestingly, Intel announced a UIKit toolkit for JavaScript earlier this month.

Native APIs are also a disadvantage. How does an HTML5 app send notifications (Chrome does have an API for this)? How does an HTML5 app access the camera (again, the W3C has some drafts on this)? How does an HTML5 app access photos or contacts? Either Apple doesn't let them, placing them at a disadvantage against native apps, or it ports the Foundation framework to JavaScript; but relying on NS* rather than web standards would fragment JavaScript.

Unfortunately there really isn't much in it for Apple, because it involves either a huge port of their libraries or a huge loss of design consistency in apps. The bare minimum would be to allow current (major?) apps that rely on UIWebView for the majority of their content to access Nitro, ensuring faster performance.

Ultimately I don’t think that Apple will make HTML5 apps first class citizens on the App Store any time soon, but accessing Nitro could provide a major boost to mobile web apps.

The great GitHub migration

Over the last few months I’ve noticed a pattern emerging on Google Code’s Project Hosting. Almost every single major project (excluding, although sometimes including, Google’s own projects) on the site seems to have migrated over to GitHub. I decided that I would do a little investigation to calculate the proportion of projects that have moved over.

Google Code allows you to search for projects by tag, so you can view projects written in a specific language. On the homepage there is a list of tags that I assume are ordered by some measure of popularity (most committed to, or most downloaded). I went through this list and recorded the total number of projects for each tag; obviously there will be some overlap. I then carried out each search again with 'GitHub' appended, hopefully identifying projects whose descriptions mention that they have moved. A more accurate approach would be to take a full listing of all projects on Google Code and try to find each one on GitHub, but my method seemed accurate enough: all of the results I clicked on definitely showed themselves as having migrated.

I then recorded all of this in a spreadsheet:

[Spreadsheet of per-tag project counts]

Obviously there are a few things to note from this. Firstly, almost 30% of all projects mentioned GitHub in their description. Secondly, the totals returned by different searches were often very, very similar, so I cannot be certain that these numbers are accurate (it could be that every Android project is also tagged with Database, but that is clearly not true).

A very important thing to note is the order. I assume that the tags are listed in some kind of popularity/activeness order on the Google Code homepage. It certainly isn’t total number of repos. Therefore we may assume that more popular projects written in Java and Python have also been more active historically, however these projects are now moving to GitHub.

Very little can be concluded from this, except that there is definitely some migration of projects away from Google Code and onto GitHub (I get almost 4 million search results for 'google code to github'); even without the statistics, it is clear the open source community is converging on GitHub.