Realtime updates from Postgres to Elasticsearch

Recently I’ve been evaluating elasticsearch, and more specifically how to get data into elasticsearch indices from source-of-truth databases. elasticsearch is sometimes lumped in with the general NoSQL movement, but it’s more usually used as a secondary denormalised search system accompanying a more traditional normalised datastore, e.g. an SQL database.

The trick with this pattern is getting the data out of the master store and into the search store in an appropriate timeframe. There is already a mechanism for updates from SQL databases in the form of the JDBC river (‘rivers’ being the elasticsearch external data-feed mechanism), but this operates by polling the database intermittently to retrieve any new or updated data. That is fine, and sufficient for most applications (e.g. an online storefront). However some of the systems I work on are less tolerant of delay (and as a rule I prefer event-based systems to polling), so I was curious to see whether it’s possible to implement event-driven updates from PostgreSQL that propagate to the search cluster immediately.

tl;dr: It is, but requires some non-standard components; the steps required are described below, and a proof-of-concept test implementation exists. Also, this mechanism is not elasticsearch specific, so could be applied to other secondary datastores (e.g. an Infinispan cache grid).

The basic idea behind this is pretty simple: we can use SQL triggers and PostgreSQL’s notify/listen mechanism to tell a dedicated gateway server that a change has occurred. Notifications are delivered asynchronously, so this doesn’t block the Postgres trigger procedure. The gateway then reads the changed data and injects it into the elasticsearch cluster.

The first problem with this concept is that I’m working on a JVM platform, and the PostgreSQL Java driver doesn’t actually support asynchronous updates via notify. It instead requires you to poll the server for any new notifications, effectively negating the benefits of using notify. In fact, the driver doesn’t support a lot of newer Postgres features such as multi-dimensional arrays.

However, while searching for possible workarounds I came across an alternative Java driver that attempts to fix the deficiencies of the current one, including adding true asynchronous notifications.

The second issue with this concept is that notifications are not queued, so if the gateway server is down for any period of time updates will be lost. One possible workaround is to maintain a ‘modified’ column on the tables and read any newer entries on gateway startup. This is fine for simple data-models, but for more hierarchical data it rapidly becomes a maintenance pain (as child tables may need to trigger an update on the parent tables). A better workaround is an intermediate staging table that stores references to updated data: on each update the gateway reads from it and then deletes the reference, and on startup it is read for any unretrieved references that accumulated during downtime.

So the final workflow looks like this (a minimal SQL sketch follows the list):

  1. Create a trigger against any tables that need to be pushed to the search cluster on modification
  2. The trigger calls a function that adds a reference to the staging table, then raises a notification with that reference as the payload.
  3. On notification the gateway reads referenced data, pushes it to the search cluster and then deletes the reference in the staging table. This should be done in a transaction to avoid loss of references in case of a crash.
  4. On startup the gateway performs a read/update of any outstanding references from the staging table and then deletes them.
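
To make the workflow concrete, here is a minimal SQL sketch of the database side. All of the names (the ‘documents’ table being watched, the ‘pending_updates’ staging table and the ‘es_update’ notification channel) are hypothetical, and the actual proof-of-concept may differ in detail:

CREATE TABLE pending_updates (
    id          bigserial   PRIMARY KEY,
    table_name  text        NOT NULL,
    row_id      bigint      NOT NULL,
    changed_at  timestamptz NOT NULL DEFAULT now()
);

CREATE OR REPLACE FUNCTION queue_es_update() RETURNS trigger AS $$
DECLARE
    ref bigint;
BEGIN
    -- Stage a reference to the changed row, then notify the gateway
    -- with that reference as the payload.
    INSERT INTO pending_updates (table_name, row_id)
         VALUES (TG_TABLE_NAME, NEW.id)
      RETURNING id INTO ref;
    PERFORM pg_notify('es_update', ref::text);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER documents_es_update
    AFTER INSERT OR UPDATE ON documents
    FOR EACH ROW EXECUTE PROCEDURE queue_es_update();

Because the notification is delivered asynchronously and the indexing happens in the gateway, the original write never has to wait on the search cluster.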

As a test of the principles I’ve implemented a Clojure-based proof-of-concept project that will propagate changes between a PostgreSQL server and an elasticsearch cluster in under 500ms; these results are for a PostgreSQL server and elasticsearch node running inside a Vagrant/VirtualBox VM on a standard rotating disk, so I’d expect to see better results in a tuned production environment. If you’re interested in trying this yourself, the gateway, Vagrant config and test code are all available on Bitbucket.

Atlassian User Group London slides

In February I had the pleasure of speaking with the London Atlassian user group about some of our experiences with continuous delivery and deployment at Atlassian. The slides for this are available below; there was no video this time, but there will be a longer blog post on the Atlassian developer blog appearing soon recapping some of the things we discussed.

SGI Screen font for OS X

I use the terminal a lot, even under OS X[1], and use bitmap fonts there. Until recently I had been using the Proggy family of fonts, as they were the best bitmap fonts available for the Mac. However I recently updated my machine to Mavericks, which managed to mess up the font rendering on iTerm2 (truncating the bottom of dangling characters such as ‘g’, which can get a bit confusing when you use git a lot).

Rather than just adopting an officially sanctioned but less usable Apple font, I got nostalgic for the old SGI Screen font. This is my terminal font of choice under Linux, but it does not exist in a Mac-compatible format. However I dug up a copy of the PCF files online (the font has been relicensed under MIT and now ships with openSUSE) and ran them through FontForge (via Alistair Buxton’s bitmap2ttf wrappers). This produced a passable TTF version of the font that installs on OS X and is usable under iTerm2:

[Screenshot: SGI Screen font in iTerm2]

The glyphs have some artifacts in the font-book previewer and when antialiased, but they work well in bitmap mode in iTerm2 (i.e. with antialiasing disabled).

The TTF and PCF files are available in this git repo:

https://bitbucket.org/tarkasteve/sgi-screen-ttf

[1] And if anybody can tell me how to do the equivalent of Alt-Backspace (delete backwards by word) in iTerm2 I’d be eternally grateful. Note that Ctrl-W is not the same thing: e.g. given the line cat myfile.txt, Alt-BS will just delete txt, whereas Ctrl-W will delete myfile.txt.

Update: It turns out this can be fixed by configuring iTerm2 to send the hex codes 0x1b, 0x08 in response to Command-Delete or Ctrl-Delete. More details are available in this blog post.

A related issue is that Ctrl-Left/Right for backward/forward-word doesn’t work under Mac. This is due to OS X not shipping with an /etc/inputrc (used by the Readline library). This can be fixed by copying this file from a Linux host (this one from Arch should suffice); either to /etc/inputrc or ~/.inputrc.
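
For reference, the relevant Readline bindings look something like the following; the exact escape sequences depend on what your terminal sends for Ctrl-Left/Right (these are the common xterm ones):

"\e[1;5C": forward-word
"\e[1;5D": backward-word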

New Clojure library: ‘geolocation’

I’ve written a small Clojure library for geolocation routines for a side project I’m working on. The routines in it so far are based on the algorithms described by Jan Philip Matuschek, but converted to be more idiomatic Clojure.

The code, etc:

Some more talks

While uploading the slides from my Devopsdays talks, I thought I’d upload some other talks I’ve given. The main ones of historical interest are the talk to accompany my paper on AccessGrid over XMPP for APAC ’05, and my presentation to the Sydney SIGGraph chapter on the emerging technologies we were pursuing at the Sydney University visualisation lab.

My talks from Devopsdays London

This November I represented Atlassian at Devopsdays London, giving two ignite talks. The ignite format is a 5-minute talk with 20 slides and a fixed 15 seconds per slide. It’s not a format I’m particularly comfortable with, as it removes the ability to ad-lib and go off on tangents, which I like to do when speaking. I think this shows in the talks; the bits where I sound most comfortable are where I briefly go ‘off script’.

The first talk was intended to raise the idea that much of the perceived separation between dev/ops and other aspects of the business is purely that, perception. As more and more desktop and manual tools migrate into the cloud the difference is largely moot. With this in mind, I suggest that the advantages of devops culture and tools should apply equally to other functions within the company, and provide some concrete suggestions on how to do this. After all, if we’re breaking down silos, why limit those silos to ‘dev’ and ‘ops’?

The second ignite started as a longer talk on the rebuilding of the Atlassian order system to be atomic, but I ended up paring it down to a few key points. It outlines the credit card pre-authorisation technique we use to attempt to wring robustness out of notoriously unreliable credit card gateways. The deeper point was about the necessity of anticipating the effects of catastrophic system failure and preparing for it.

The talk sparked a few conversations afterwards, with others sharing their woes at making credit card systems reliable. It turns out others have used this technique too (and I think I may be partially responsible for its adoption).

Friend and colleague Otto Jongerius was also there, and presented on scaling Atlassian’s OnDemand ops.

Update: The slides for the talks are up on Slideshare:

Haltcondition: Now in SPDY (where available)

SPDY is the next big thing in web technology. Nominally it is intended to speed up websites by multiplexing multiple site requests over a single connection; however there is some question about how effective it is at this. Personally I see its advantages in the datacenter: by reducing the number of TCP connections required to serve up a page to one, the resources required for file descriptors and firewall entries are massively reduced for high-volume sites. I suspect this is why sites such as Twitter and Facebook are adopting it before its usefulness for the end user has been proven.

Always one to jump on a passing bandwagon, Haltcondition is now being served via SPDY if your browser supports it. This is possible via the recently-released Nginx patches. Prior to this I had been testing the official Google Apache module, but this proved unstable: it is incompatible with mod_php, and running WordPress under FCGI proved flaky. Adding Nginx as a caching/SPDY/SSL frontend allowed me to continue using Apache as an application container for WordPress.

To enable SPDY on Haltcondition I took the following strategy:

  • Download the Nginx patches and follow the instructions to build an SSL/SPDY-enabled instance. Personally I installed it under /opt/nginx…
  • Modify the existing Apache/Wordpress vhost to bind to a different port; 8080 is traditional.
  • Configure Nginx to serve HTTP and HTTPS, and forward requests to 8080.
  • On the HTTP vhost, configure Nginx to send the ‘Alternate-Protocol: 443:npn-spdy/2’ header; this tells the browser that SPDY is available on the HTTPS port (see the config sketch after this list).
  • Configure your system to start Nginx; personally I use daemontools with Nginx in foreground mode.
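
For illustration, a heavily stripped-down version of the resulting Nginx config looks roughly like the following; the hostname, ports and certificate paths are placeholders, and the exact SPDY directives depend on the patch version in use, so treat this as a sketch rather than a drop-in config:

server {
    listen 80;
    server_name example.com;

    # Tell browsers that SPDY is available on the HTTPS port
    add_header Alternate-Protocol "443:npn-spdy/2";

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

server {
    listen 443 ssl spdy;
    server_name example.com;

    ssl_certificate     /etc/ssl/example.com.crt;
    ssl_certificate_key /etc/ssl/example.com.key;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto https;
    }
}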

One gotcha is that WordPress doesn’t handle this sort of proxy-chaining very well and will tend to go into redirect loops. The workaround is to disable the ‘redirect_canonical’ filter; there’s no official way to do this, but the ‘Fix Multiple Redirects’ plugin will do it for you.
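
For reference, disabling the filter by hand presumably amounts to something like this single line of PHP in a tiny plugin of your own:

remove_filter('template_redirect', 'redirect_canonical');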

IPv6 is stalking you (and what you can do about it)

Imagine a not-too-distant future where IPv6 is starting to see widespread adoption. On Sunday evening you log in to Amazon.com on your laptop and purchase some sex-toys for you and your wife for your upcoming anniversary; good for you for keeping it interesting. Naturally you enable privacy mode in Firefox so it won’t show up in your history, society being what it is.

On Monday you head into your job at a large daycare center where you’re a manager in HR. There’s an upcoming restructure and you want to make sure the employees are reassured that it’s a good thing; in between meetings you flick through some change-management books on Amazon on your laptop, but can’t see anything useful.

Congratulations! Amazon.com (and anyone they feel free to share with) now know that you have sex-toys and access to young children. No logins, no cookies; all they need to do is look in their logs for your laptop’s unique identifier and then match your work’s network block to your purchases at Amazon.

How does this work? First a bit of background (the following skips a few details but is basically true for most people)…

Every piece of network hardware in every computer, phone, etc. in the world has a unique identifier: the Media Access Control address, or MAC. This address is 48 bits long, and different from the IP address you use on the internet; it is used purely for finding machines on your local network.

Although it was never a deliberate design decision, the IPv4 internet has a few privacy mechanisms built into it, almost as a side-effect of its limitations. IPv4 addresses are 32 bits long, far too small to contain any significant portion of the MAC address or any other identifier; the MAC address is quietly dropped the moment your traffic enters the wider internet. And although the IP assigned to you or your employer by your ISP is globally unique, in practice its tracking potential is limited: your home IP is regularly reused by your ISP for other customers, and at work the public address is shared by dozens or even hundreds of employees due to NAT.

With IPv6 it’s a different story. A 128-bit IPv6 address consists of two components: a network address that identifies your whole network (usually 64 bits) and a local component that identifies your machine on that network. This local component is based on your MAC address, and by default is included in all communication with the wider internet. Because it’s bound to your physical hardware, the local part always stays the same regardless of which network you’re connected to; it is in essence a global tracking code, and can be used by remote sites to infer some interesting information about you. The example above is the simplest I could come up with; advertising providers operating across multiple sites are going to be able to do some truly stunning pattern matching. And hardware vendors will already have massive databases mapping MAC addresses to users and credit cards; some of them (e.g. Apple) have deep ties with organisations such as the RIAA, who would dearly love to be able to match an IP address to a name and mailing address without any of that inconvenient subpoena stuff.
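
To illustrate with a made-up example: a network card with the MAC address 00:1b:63:aa:bb:cc has ff:fe inserted into its middle and a single bit flipped, giving the interface identifier 021b:63ff:feaa:bbcc. Whether the machine is on your home network or in your office, its full address (e.g. 2001:db8:1:2:21b:63ff:feaa:bbcc at home) always ends in those same 64 bits.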

Luckily this problem was anticipated during the IPv6 specification process and a solution added: RFC 3041 privacy extensions. The gist of this is that your operating system can generate a random, short-lived fake local address that is used for outgoing connections. In the example above, assuming the temporary address is set to a short enough timeout, by the time you’re at work the next day the address you used from home will have been replaced by a new one.

There’s only one problem; it’s not enabled by default in all operating systems. Here’s how to enable it in some of the common ones:

Linux desktop/server distributions

Most Linux distributions seem to have temporary addresses disabled by default. Enabling them is simple enough though:

sudo sysctl -w net.ipv6.conf.all.use_tempaddr=2
sudo sysctl -w net.ipv6.conf.default.use_tempaddr=2
echo net.ipv6.conf.all.use_tempaddr=2 | sudo tee -a /etc/sysctl.conf
echo net.ipv6.conf.default.use_tempaddr=2 | sudo tee -a /etc/sysctl.conf

Android

Temporary addresses seem to be disabled by default in Android. However if you have rooted your phone then you can use the Linux method. Either use an Android terminal app or ‘adb’ from the SDK to get a root shell:

mount -o remount,rw /system
cd /system/etc/
echo net.ipv6.conf.all.use_tempaddr=2 >> sysctl.conf
echo net.ipv6.conf.default.use_tempaddr=2 >> sysctl.conf

Then reboot your phone.

Mac OS X

As of 10.6.7 temporary addresses are disabled. Enabling them is similar to the Linux method:

sudo sysctl -w net.inet6.ip6.use_tempaddr=1
echo net.inet6.ip6.use_tempaddr=1 | sudo tee -a /etc/sysctl.conf

iPhone/iPad

This security advisory implies that iOS 4.3 has this enabled by default. For older releases you’re probably out of luck though.

Windows XP/Vista/7

IPv6 temporary addresses seem to be enabled by default; if you can confirm, please comment.

Haltcondition: Now in IPv6 (where available)

Well, as of Friday the 4th of February 2011 IANA is officially out of IPv4 addresses. It’s now up to the regional registries to dole out the remaining addresses as they see fit, which will be increasingly sparingly.

To celebrate the beginning of the end of IP as we know it, Haltcondition.net is now available over IPv6:

I’ve also added an IPv6 detection widget on the right, courtesy of Patux. The IPv6 connectivity is provided by a Hurricane Electric tunnel to my Linode box; the fact that I even need to use a tunnel at a professional hosting site is a sign of how painful the next couple of years are going to be.

Luckily my ISP are currently trialling consumer-level IPv6, so I can at least test the site. However at this point setting up IPv6 in the home is far from simple; I had to convert from DD-WRT to OpenWRT on my router and do a lot of manual configuration to get an end-to-end connection. It’s going to be a painful transition.

Update: Linode have announced provisional support for IPv6, so this blog is now native end-to-end if your ISP has support. The Linode setup is a bit odd (they only provide a single IP rather than the usual /64) but appears to work.

One of the more intriguing speculations doing the rounds is that Linode rolled this out early as Slicehost are gearing up for IPv6 as they transition into Rackspace’s cloud. If so this is promising, as I hadn’t expected IPv6 to be a product differentiator for some time.

XBMC on the Giada N20

We finally updated our old CRT TV to a shiny new 1080p LCD/LED TV. Unfortunately this meant the end-of-life of my trusty hacked v1 XBox, which served as our HTPC via XBMC. The XBox won’t do 1080p though, and realtime decoding of HD x264 requires dedicated hardware such as the NVidia ION chipset.

I originally planned on getting a Boxee Box, but initial reviews were disappointing. I considered building my own rig; there are some nice fanless Intel Atom mini-ITX boards out there, but then I saw mention of the Giada N20 on Whirlpool. The N20 is an Atom D525 with a GT218 ION chipset, 2GB of RAM, a 320GB HDD, gigabit LAN, 802.11n, and the clincher: a built-in IR remote. In short, it’s a near-perfect HTPC; the only thing missing is a Blu-ray drive, but as the TV came with a free PS3 I didn’t need or want one.

Out of the box the N20 comes installed with Ubuntu and XBMC; however it’s a very grab-bag install with a lot of additional cruft on the system, whereas an HTPC should be cut down to boot fast and ‘just work’. I was going to roll my own Ubuntu-based install, but after a quick trial of the XBMC-Live distribution I was so impressed I went with it as-is. XBMC-Live is Ubuntu-based anyway (10.04/Lucid LTS) so is highly customisable, but has some nice polish such as an XBMC boot-splash. Despite the name it installs straight to the HD. It mostly works out of the box but requires a few tweaks to get the most out of it, so here’s a step-by-step run-through.

Installing XBMC-Live

To do the install you’ll need the following:

  • A USB drive; a 2GB thumb-drive should be plenty
  • The XBMC-Live image from here
  • UNetbootin
  • A live internet connection
  • A wired ethernet connection (as wireless doesn’t work during the install)
  • A USB keyboard for the install phase

To do the install, back-up anything you want from the original distribution and then:

  1. Burn the XBMC-Live image to the USB drive using UNetbootin (Ubuntu’s USB drive creator doesn’t appear to like the image).
  2. Plug in the ethernet, keyboard and USB drive, then start the N20.
  3. When the splash screen shows press Delete to bounce to the BIOS
  4. Change the boot order to boot the USB drive first, save the config and reboot; XBMC-Live should now start
  5. If you wish you can now boot into the live XBMC and play around
  6. To do a full install, reboot and select install during the startup
  7. The installer is the Ubuntu text-based one; instructions for using it are on the Ubuntu wiki but the defaults are fine for most users

On completion you will have a mostly-working XBMC installation, including traditional problem areas such as power-on by remote. Suspend/hibernate work out of the box, but with a ~1 minute boot-up from power-on to a responding system I haven’t found them necessary.

But a few tweaks are needed to get the most out of the system …

HDMI Audio

To get XBMC fully working over HDMI the following tweaks are required:

In System Config->System Settings->Audio Setting change the following:

  • Set Audio Output to HDMI
  • Unset “Device is DTS Capable”
  • Set Audio Output Device to “HDA NVidia HDMI”
  • Set Audio Passthrough Device to “HDA NVidia HDMI”

This will get audio working for playback. However the menu feedback sounds do not work; this is because the analog output is the default and XBMC doesn’t appear to use the audio device setting above for UI sounds. This can be worked around by changing the default in ALSA; simply create the file /etc/asound.conf (or ~/.asoundrc) and add the following:

pcm.!default {
		type plug
		slave {
			pcm "hw:1,3" 
		}
}

(“hw:1,3” is the HDMI device, found by getting the device list with ‘aplay -L’; see the ALSA docs for details.)

Enabling more keys on the remote

The IR receiver is interesting, in that it doesn’t interact with LIRC/IrDA but appears to the system as a keyboard/mouse combo. By default XBMC expects LIRC events; it’s technically possible to turn these keypresses into such events, but it’s easier to just tell XBMC to use the receiver as a keyboard:

  • Go to System Config->System Settings->Input
  • Enable “Remote Control sends keyboard presses”
  • Disable “Enable Mouse”

This gets the core buttons working, including power on/off. One unsolved problem is that some of the more specialised buttons don’t work. This is more than a case of mapping buttons; as far as I can tell many of them don’t even register as events in the Linux subsystem. I’ll need to look into this some more.

Configuring wireless

The N20 has an Atheros AR9285 chip; this is fully supported by the ath9k driver out of the box. Ubuntu normally controls networking via NetworkManager, but that is not installed with XBMC-Live. However we can fall back to the powerful but less user-friendly Debian interfaces method:

  • Edit /etc/network/interfaces
  • Add the following lines:
    auto wlan0
    iface wlan0 inet dhcp
      wpa-ssid YOURNETWORKSSID
      wpa-psk YOURNETWORKPASSWORD
    
  • Do ‘sudo ifup wlan0’ to bring the wireless network up

Extra tweaks

Adding Add-Ons

The latest version of XBMC supports ‘add-ons’, which enable extra functionality. While there are only a few official add-ons, there are a number of unofficial repositories that supply third-party modules. For Australian users the ‘Catchup TV’ repository adds support for the various channels’ online streaming services, including the ABC’s iView.

It’s also worth reiterating that this is a dual-core, hyperthreaded machine, equivalent to a high-end workstation of just a few years ago, and has access to the full Ubuntu software repositories. As such it is more than capable of running the full suite of P2P and download apps in the background with no effect on playback performance. Personally I use a Sabnzbd/Sickbeard combo to automatically download US current-affairs programs that are otherwise unavailable in Australia.

Removing the stand

The N20 is designed to be used upright on a (surprisingly sturdy) stand. This didn’t fit into my TV cabinet, but just laying it down didn’t seem like a good idea as it would partially block the air intake. However I found some small stick-on ~1cm feet at Jaycar that gave it sufficient height for decent airflow.

To-dos and other possibilities

As mentioned above, there are a number of buttons on the remote that would be useful to have but don’t show up in XBMC. This may be a Linux or Xorg driver-level question, but I need to investigate further.

As well as supporting power-on via the remote, the Giada BIOS has support for Wake-on-LAN; this would be useful for remote administration but I haven’t played with it yet. Update: it turns out I was wrong about this; the N20 doesn’t have WOL.

While I’m happy with the system as-is, it would be nice to have the option to modify it at a later date (such as adding an SSD for even faster boot times). The case looks well-sealed, but it should be possible to get it open somehow. Update: see the comment by Rich below about opening the case and replacing the drive.
