Monday, 9 June 2014

Maverick Adventures with Google Authenticator

Please note, this article assumes you're somewhat familiar with running commands in a terminal and editing system files. If you're not comfortable with that, I'd suggest holding off on messing around with these things just yet.

Today I decided to improve the login security for my OS X install with the use of 2-factor authentication with the Google Authenticator PAM. Turns out there's not much information on this, and what is there neglects some interesting steps, so I'll write up my own experience here in the hope that it'll help someone.

First, download the source code for the Google Authenticator PAM module.

Extract it, and in the resulting directory run make && sudo make install. This will compile the PAM library and the command line tool for creating your secret. It will also put the command line tool in the correct place of /usr/local/bin, and the PAM library in the wrong place of /usr/lib.

So, now we put the library in the correct place with sudo mv /usr/lib/pam_google_authenticator.so /usr/lib/pam/

The library is now installed, so it's on to configuration. At this point it's possible to lock yourself out, so it would be really smart to make sure you have SSH turned on for your Mac, with another machine handy to SSH in from... just in case (n.b. I didn't do this. I then had to figure out how to unmangle my PAM file to log in again... you don't want to have to do this).

If you want, you can also install the qrencode library, which will allow a QR code to be generated in the next step. With Homebrew, this is brew install qrencode.

First part of the config is generating your authenticator secret. Run the command line tool google-authenticator. This will generate a secret and a set of backup tokens. It will also ask you a few questions, answer as appropriate.
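For reference, this step is just running the tool as the user who will be logging in; the secret and scratch codes end up in ~/.google_authenticator:

google-authenticator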

Now you get to decide what services will use 2-factor authentication. I went for SSH, sudo, login and unlock, which involved the PAM files authorization, login, screensaver, sudo and sshd (all in the /etc/pam.d directory). The configs for these are as follows:
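As an illustration, a modified /etc/pam.d/sshd might end up looking something like this. The surrounding lines should be whatever your stock Mavericks file already contains (only the pam_google_authenticator.so line is new), so check against your own files rather than copying this verbatim:

# sshd: auth account password session
auth       optional       pam_krb5.so use_kcminit
auth       optional       pam_ntlm.so try_first_pass
auth       optional       pam_mount.so try_first_pass
auth       required       pam_opendirectory.so try_first_pass
auth       required       pam_google_authenticator.so
account    required       pam_nologin.so
account    required       pam_sacl.so sacl_service=ssh
account    required       pam_opendirectory.so
password   required       pam_opendirectory.so
session    required       pam_launchd.so
session    optional       pam_mount.so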

In each of the files, the important modifications are around the line
auth required pam_google_authenticator.so

This line, and the options following it, insert the Google Authenticator PAM into your authentication chain. With SSH and sudo, this is done with a 'challenge-response' prompt, where it will ask you for a verification code. However, the OS X login lacks the ability to insert another prompt, so for logging in or unlocking the screen we use the forward_pass option, which expects the password and TOTP to be provided together as <password><totp>. This may cause your keychain to complain about an incorrect password at login - don't change the keychain password, as the 'incorrect' password it saw is just your real password with the TOTP appended. Instead, when you have finished logging in, the keychain will prompt for your password again. Type it in normally and you're all set.
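So for the login and screensaver files, the added line would look something like this instead:

auth required pam_google_authenticator.so forward_pass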

If you wanted, you could also try the following for sudo:

# sudo: auth account password session
auth       sufficient     pam_google_authenticator.so
auth       required       pam_opendirectory.so
account    required       pam_permit.so
password   required       pam_deny.so
session    required       pam_permit.so

This will alter your sudo authentication so that you are prompted for a TOTP first. If it's correct, sudo authentication succeeds; if it fails, it falls back to the traditional password prompt.

Tuesday, 4 February 2014

Encrypted Chef Node Data

When dealing with Chef, you frequently want to store data in an encrypted form so that your Chef Server doesn't become a single point of vulnerability for the security of your entire infrastructure. For this purpose, Chef has the concept of an Encrypted Data Bag.

Also when dealing with Chef, you frequently want to generate certain bits of data and remember them for future runs. When dealing with a Chef Server, you also sometimes want to let other nodes grab this data for their own configuration. For this, you have standard, plain text Node data.

So what happens when you want to generate something that needs to be kept secure, such as the passwords for your database users? You could put them in an Encrypted Data Bag, but then you need to generate this data in advance. You could put them in the node data, but then your passwords are stored in plaintext on the Chef Server.

Neither of these solutions is ideal. What we want is to be able to generate the data on the node and store it in an encrypted form. Unfortunately, Data Bags can't be updated during a run (and doing so would lead to race conditions anyway: what would happen if several nodes tried to update the same data bag at the same time? Distributed race conditions are never fun).

How about encrypting the data in the node though? This seems like a good approach, as the node data is safe(ish) from distributed race conditions, but it would be fairly tedious to have to build our own encryption mechanism for nodes, so maybe we can re-use some of the encryption used for Encrypted Data Bags? This is surprisingly easy to do, as the data bag encryption is well factored into code for Secrets, Encryptors and Decryptors separately from the main Encrypted Data Bag code that binds them together (although the namespace doesn't lead to the clearest code at this point).

Without belabouring the point too much, by placing something like this:
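# libraries/node_encryption.rb
#
# A minimal sketch, assuming Chef 11.x's Encrypted Data Bag internals
# (Chef::EncryptedDataBagItem::Encryptor / Decryptor). The module and
# method names here are illustrative.
require 'chef/encrypted_data_bag_item'

module NodeEncryption
  # The same shared secret used for encrypted data bags.
  def self.secret
    Chef::EncryptedDataBagItem.load_secret
  end

  # Wrap a plain value in the encrypted data bag wire format.
  def self.encrypt(value)
    Chef::EncryptedDataBagItem::Encryptor.new(value, secret).for_encrypted_item
  end

  # Recover a value wrapped by NodeEncryption.encrypt.
  def self.decrypt(wrapped)
    Chef::EncryptedDataBagItem::Decryptor.for(wrapped, secret).for_decrypted_item
  end
end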

into libraries/node_encryption.rb in a cookbook, you can then make use of this in your recipe in the following way:
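# In a recipe - a sketch with hypothetical attribute names. Generate a
# database password once, persist it on the node in encrypted form, and
# decrypt it whenever it's needed.
require 'securerandom'

node.set['myapp']['db_password'] ||= NodeEncryption.encrypt(SecureRandom.hex(16))

db_password = NodeEncryption.decrypt(node['myapp']['db_password'])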

This does still rely on your secret being shared between your servers, in the same way as Encrypted Data Bags, so it isn't perfect. Our own infrastructure makes use of multiple sets of secrets in order to mitigate this on an environment basis (the secrets for our development servers are separate from the ones for our production environment). It would be really fun if this could be integrated into a tool like Chef-Vault (see also this blog post on chef-vault), so you could generate a value on a node and specify a search that determines which nodes are allowed to decrypt it. This would extend the usefulness of Chef-Vault a bit, as you could then rely on your cookbooks to handle some of the re-encrypting that currently needs to be done manually with Chef-Vault (such as re-encrypting a value when a new node is added, to grant access with the new node's public/private keypair).

Monday, 24 June 2013

Asset Pipeline Organisation

MIQly (our soon-to-be flagship product) is a Rails application, but could equally be classed as 3 large-scale JavaScript applications (setting questions, marking assessments, providing student feedback). As it exists in the Rails 3.1+ ecosystem, the JavaScript (written one step removed, as CoffeeScript) lives inside the asset pipeline. I can already hear the shudders of dread [1] from many Rails people at this thought, but we managed to make the asset pipeline work for us.

This is how.

The out-of-the-box organisation of assets within Rails provides very little control over the ordering and packaging of JavaScript and CSS assets. Problems with the standard (and rather painful) approach of a default, single application.js manifest file become apparent as an application grows, especially because of one default line in application.js:

//= require_tree .

This one line causes major headaches with indeterminate code ordering. This is especially apparent if the assets are developed on one operating system and then precompiled on another, as filesystems have different default orderings of files with respect to capitalisation. The result is painful manual resolution of dependency ordering problems, and frequently ends up with Sprockets (the library that does a lot of the heavy lifting for the asset pipeline) being criticised for what is ultimately a code organisation issue.

We needed to impose a more suitable structure within the asset pipeline to avoid having to repeatedly perform manual resolutions as our code base grew. We considered the use of a dedicated JavaScript module system, but the Rails integrations were immature at the time, and the asset pipeline itself provided advantages in terms of an integrated build process that we didn't want to abandon. As such, we wanted a solution that worked well within the asset pipeline rather than attempting to force a different solution on top of it.

Our solution is mostly organisational, and not tooled beyond the Sprockets tooling already available. The key part is in recognising and distinguishing two distinct types of Sprockets require directive and two different types of file. These get paired together as follows.

The first type is the 'typical' Rails usage, with a manifest that pulls various files into a precompiled asset. This is something we term an 'assembly', and it defines a compilation unit. The items that get pulled into such an assembly are components, not lower-level units of composition. The initial application.js file in a new Rails project is an example of this (albeit a poor one, as it uses require_tree rather than a more stable dependency declaration method). One of our manifest files looks something like this:
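// app/assets/javascripts/marking.js - an 'assembly' manifest
// (the file and component names here are illustrative, not our real ones)
//= require marking/assessment_screen
//= require marking/feedback_panel
//= require marking/toolbar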

The second type of file is an implementation file. These also have Sprockets directives that pull in required dependencies. For example, a screen widget will pull in lower-level widgets and components (such as autogrowing text fields, or in-line editor code). The lower-level code will also pull in its own dependencies, eventually 'grounding out' in either dependency-less code or base libraries like jQuery (n.b. in MIQly, jQuery is pulled in separately to avoid certain issues with inter-manifest conflicts). One of our implementation files looks somewhat like this:
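# app/assets/javascripts/marking/feedback_panel.js.coffee
# An implementation file: it declares its own direct dependencies, which
# in turn declare theirs. (Names here are illustrative.)
#= require widgets/autogrow_text_field
#= require widgets/inline_editor

class window.FeedbackPanel
  constructor: (@element) ->
    # widget behaviour goes here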

This approach lets us assemble various components into compiled files and be sure of the dependency loading and code ordering within these files with very little effort: we have the asset pipeline resolving all our dependency ordering issues for us across several thousand lines of CoffeeScript and CSS.

[1] These 3 links were picked after skim-reading some results for the query 'asset pipeline criticism'. Feel free to provide more criticisms (or praise) in the comments :)

Monday, 29 April 2013

Time for some paper prototyping... Tooltips everywhere!!

Recently, as a team exercise, we sat around a sketch book to decide where to place helpful tooltips on our interactive marker application. With a little cutting and writing, this is what we came up with:

Conventionally, designing a set of tooltips like this might result in a long and boring email thread discussing specifications and showing edited screenshots, or a couple of hours of design activity in a browser inspector (whose contents could be lost by accidentally pressing back or refresh).

Instead we did this as a fun visual exercise that we finished in less than an hour, with everyone understanding what was needed on the page.

Saturday, 9 March 2013

Avoiding developer interrupts

I'm going to start posting here about our lean and agile development processes; they fit better here than on the main Hedtek blog. A fitting start is the elimination of waste, in particular the waste caused by developers not being 'zoned in' and focusing totally on development. This state is often called flow (Wikipedia has a great background article). Being in flow has two aspects: how to get into flow, and how to avoid the interrupts that jerk a developer out of the zone.

Positive ways of getting into flow are something we haven't touched on so far, although we are aware of the need for a good working environment with relatively good levels of quietness, and strive for one. What we have been actively working on (with a good level of success) is avoiding unnecessary interrupts.

What is the problem with interrupts? Don't you just hate it when you are six levels deep in code and your manager walks in to say to the team: "Hey guys, don't you think we should transmogrify the hubbleity widget?" Instant loss of concentration and exit from the zone, followed by a struggle to get back to where you were.

One research study (blog post, research paper) examined the behaviour of 414 developers across 10,000 programming sessions, and found that, for the developers in the sample:

  • Devs took 10-15 minutes to resume editing code after an interruption.
  • Devs were likely to get only one uninterrupted 2-hour period for development in a day.

Both of these are alarming statistics, so avoid interruptions at all costs. As a practical measure, we identified what each of our devs tends to be doing when zoned in or not, and now we simply look for these behaviours before asking something. For example:

  • May be zoned in: Wearing headphones can be a sign of being zoned in.
  • Definitely zoned in: Not looking at ancillary sources for information on the web, and only focusing on and flicking between vi buffers or IDE tabs.
  • Not zoned in: Looking at a mobile (cell) phone.

Better still, we have adopted Gmail chat and email for question requests, knowing that a developer will ignore these until they have left the zone. Similarly, to remove more potential interrupts when I need to disappear from the office, there's a "Where's Mark" corner on one of our whiteboards: I simply write where I am and when I'm expected back, avoiding the need to announce it to the dev team when I leave.

Of course, pragmatism still rules: we can still interrupt each other, but we save that for emergencies.

Additional sources

Monday, 4 March 2013

Testing Strategy part 3 - Acceptance Tests

After looking at code coverage and unit tests, I'd like to jump to the other end of the testing spectrum and examine what sort of Bayesian evidence an acceptance test provides. Again, this is skirting around the central question of "How can you demonstrate that your tests are correct?"

Friday, 1 March 2013

Helping others

It's an old cliché, but the saying 'if you want to learn a subject, teach it' holds a certain amount of truth. The act of ordering your thoughts and understanding of a subject lets you spot the holes in that understanding, whether the subject is how to use a text editor or how nuclear fission works.

I've noticed this in many people, myself included. The primary case I would cite is the helpers in the #rubyonrails IRC channel on Freenode. The majority of newcomers will come into the room, ask a question, and leave (or fall silent) as soon as they have an answer. I've seen some of these people still asking the same sort of questions months or years after first appearing. The other newcomers are people who ask some questions but also try to answer other people's questions straight away. These people tend to develop all aspects of their development skills much more quickly, as they are constantly exposing themselves to new situations that they would otherwise never have encountered.

Writing blog posts, doing podcasts, writing newsletters, helping out in a support channel: they're all ways to teach other people, and by extension to develop your own understanding further.

What are your preferences for helping develop the skills of team-mates, colleagues and your wider development community?