I recently found a security issue in the open source Alaveteli project that meant an attacker could pretend to be the government responding to a Freedom of Information request. I reported the issue and it has since been fixed and publicly disclosed.
This post explains how I found the problem and the code that caused it.
What is Alaveteli?
Alaveteli is an open source project developed by mySociety that can be used to create websites that make it easy for people to submit Freedom of Information (FOI) requests. It was originally built for the UK-based WhatDoTheyKnow and now is used to power many similar FOI request sites in other countries.
I find Alaveteli interesting because few Ruby on Rails projects are as large and as old (8 years in this case) and also open source. You can’t rely on security through obscurity in such a project: the code and the versions of the libraries and frameworks you are using are public for all to see.
Verifying Identities
I started looking at WhatDoTheyKnow after I read a FOI request to the Home Office for the recent internet history of the Home Secretary, Theresa May [1].
I saw that there was a link on the page to allow the Home Office to respond to the request. I wondered how WhatDoTheyKnow verified that users were acting on behalf of the Home Office, and thus were allowed to reply.
Looking at the Alaveteli codebase I saw that a user can submit a FOI request to a public body. A public body has a request email which any new request from a user is sent to. I saw that anyone could reply to the request as the public body if their user account had an email belonging to the same domain as the public body’s request email.
Putting My White Hat On
I wondered: if I wanted to exploit this system, how would I do so?
First I looked at whether I could register an account with an email address on the ‘homeoffice.gov.uk’ domain. As far as I could see I couldn’t, because Alaveteli requires all accounts to verify their email address.
I said above that Alaveteli lets you respond to the request if your email address is on the same domain as the public body. I investigated how this worked and found that it is done by the PublicBody#is_foi_officer? method, which checks the user given as an argument. It looks something like this:
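What follows is a simplified sketch of that check rather than the exact Alaveteli source. The User struct, the request_email attribute and the cut-down extract_domain_from_email are stand-ins assumed for illustration so that the example runs on its own; only the name is_foi_officer? and the domain comparison come from the description above.

```ruby
# Hypothetical stand-in for a user record; only the email matters here.
User = Struct.new(:email)

class PublicBody
  attr_reader :request_email

  def initialize(request_email)
    @request_email = request_email
  end

  # Simplified stand-in: returns the lower-cased part after the "@",
  # or nil if there is none. The real method does more than this.
  def self.extract_domain_from_email(email)
    match = /@(.*)/.match(email.to_s)
    match && match[1].downcase
  end

  # True when the user's email domain equals the domain of the
  # body's request email.
  def is_foi_officer?(user)
    user_domain = self.class.extract_domain_from_email(user.email)
    our_domain = self.class.extract_domain_from_email(request_email)
    return false if user_domain.nil? || our_domain.nil?

    our_domain == user_domain
  end
end
```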
I then found the extract_domain_from_email method that it makes use of:
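Again, this is a reconstruction from the behaviour described in this post, not a copy of the project's code. The ‘.gsi.’ substitution follows from the ‘gsi.gov.uk’ example in this post; the other two substrings (‘.x.’ and ‘.pnn.’) are assumptions included only to show the shape of the three substitutions.

```ruby
# Sketch of the domain normalisation described in the post.
def extract_domain_from_email(email)
  # Capture everything after the "@"; give up if there is no "@".
  match = /@(.*)/.match(email.to_s)
  return nil if match.nil?

  # Lower-case so that comparisons ignore case.
  domain = match[1].downcase

  # remove special email domains for UK Government addresses
  # (".gsi." is from the post; ".x." and ".pnn." are assumed)
  domain = domain.sub('.gsi.', '.')
  domain = domain.sub('.x.', '.')
  domain.sub('.pnn.', '.')
end
```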
The first part of the method uses a regular expression to extract the domain from the email by taking everything after the ‘@’ character. It then downcases the domain so that case is ignored when domains are compared.
The interesting part of this method is what comes next. There are three string substitutions that occur before the extracted domain is returned.
As the comment implies, this is done to handle the email infrastructure specific to the UK government. Government departments may send email from a centralised domain such as ‘gsi.gov.uk’. For example, the email ‘chris@homeoffice.gov.uk’ might also take the form ‘chris@homeoffice.gsi.gov.uk’ [2]. Alaveteli tries to normalise these email addresses by removing the extra domains.
Given that a public body is set up with a request email of ‘foi@homeoffice.gov.uk’, here are some example user emails and the return value of is_foi_officer?:

User email | is_foi_officer? | Is this correct?
---|---|---
chris@homeoffice.gov.uk | true | Yes, anyone with that domain can reply.
chris@homeoffice.co.uk | false | Yes, this is a different domain.
chris@HOMEOFFICE.gov.uk | true | Yes, capitalisation doesn’t matter.
chris@homeoffice.gsi.gov.uk | true | Yes, ‘gsi.gov.uk’ is a domain owned by the government used to send outgoing email.
chris@homeoffice.gov.gsi.uk | true | No, ‘gov.gsi.uk’ is a different domain that is not owned by the government.
Because of the way it does string substitution, the is_foi_officer? method is flawed. It means that two emails might be considered to be on the same domain even if they are not.
The Attack
Here are the steps that would have to be followed to reproduce this attack:
- A public body is registered with a FOI email address of ‘foi@example.gov.uk’
- A user submits a request to this body on Alaveteli
- An attacker registers the domain ‘gsi.uk’, and sets up a subdomain of ‘example.gov.gsi.uk’
- An attacker creates a user account on Alaveteli with the email ‘foi@example.gov.gsi.uk’
- The attacker can then respond to the user’s request, appearing as if they were a member of the public body.
The attack can be considered low risk. It doesn’t directly expose any personal user details, and upon investigation an administrator of the site would be able to see that the response came from an unauthorised domain and remove it. It might however spread false information and damage the credibility of sites that are powered by Alaveteli.
Is the Attack Feasible?
Since some of these domains (such as ‘gsi.uk’) are already registered, the attack could be carried out by their current owners or by someone who purchases the domain from them.
For UK domains, the attack has become easier since June 2014 when second level ‘.uk’ domains became available to register (although you currently need to own the corresponding ‘.co.uk’ or similar domain if it already exists).
The UK government is not the only body vulnerable to impersonation. Alaveteli powers websites across the world, so it may be possible to pretend to be other governments or public bodies. The feasibility of each attack depends on the domain of the body being targeted and on whether other domains that could confuse the system are available to register.
Prevention
mySociety made two changes to prevent this attack:
- The UK-specific code was moved from the main codebase to another project used only by the UK WhatDoTheyKnow site. This avoids the problem for most Alaveteli deployments.
- To fix it in the WhatDoTheyKnow project, the substitutions were made more specific so that they are only applied to subdomains of email domains ending in ‘gov.uk’. I believe this to be the original intent of the code.
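To illustrate the second change, here is a minimal sketch of a more specific substitution. It is not the actual patch: the method name normalise_gov_uk_domain, the constant and the exact list of special labels are assumptions. The point is only that the pattern is anchored to the end of the domain, so a special label is removed only when it sits directly in front of a trailing ‘gov.uk’.

```ruby
# Hypothetical stricter normaliser. The special labels are only stripped
# when they appear immediately before a trailing "gov.uk".
# (The label list gsi/x/pnn is an assumption.)
GOV_UK_SPECIAL = /\.(?:gsi|x|pnn)\.gov\.uk\z/

def normalise_gov_uk_domain(domain)
  domain.downcase.sub(GOV_UK_SPECIAL, '.gov.uk')
end
```

With this version ‘homeoffice.gsi.gov.uk’ still normalises to ‘homeoffice.gov.uk’, while ‘homeoffice.gov.gsi.uk’ is left unchanged and so no longer matches the public body’s domain.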
More generally, and for other projects, when writing regular expressions and string substitutions it’s important to be as specific as possible, otherwise you might make substitutions that lead to unexpected behaviour. For structured data such as an email address or domain, it might be safer to parse it first (ideally with an already existing library) before performing any manipulation, as you’re less likely to end up with data in an unexpected form.
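As a rough illustration of the parse-first idea, the sketch below splits the domain into labels and only then decides what to drop. It uses plain string splitting instead of a dedicated library, and the names gov_uk_labels, same_gov_uk_domain? and SPECIAL_LABELS are hypothetical; it is restricted to ‘gov.uk’ addresses to keep the example short.

```ruby
# Assumed list of special labels used by UK government email systems.
SPECIAL_LABELS = %w[gsi x pnn].freeze

# Returns the normalised domain labels for a gov.uk address, or nil if the
# address has no usable domain or does not end in "gov.uk".
def gov_uk_labels(email)
  local, at, domain = email.to_s.rpartition('@')
  return nil if at.empty? || local.empty? || domain.empty?

  labels = domain.downcase.split('.')
  return nil unless labels.last(2) == %w[gov uk]

  # Drop a special label only when it is the one directly before "gov.uk".
  labels.delete_at(-3) if labels.length > 3 && SPECIAL_LABELS.include?(labels[-3])
  labels
end

# True only when both addresses normalise to the same gov.uk domain.
def same_gov_uk_domain?(email_a, email_b)
  labels_a = gov_uk_labels(email_a)
  !labels_a.nil? && labels_a == gov_uk_labels(email_b)
end
```

Because the decision is made on whole labels in known positions, a label such as ‘gsi’ appearing anywhere else in the domain has no effect on the comparison.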
Thanks to Louise and Gareth at mySociety for their prompt response and fix for the issue.
More Links
- The announcement of Alaveteli 0.23.1.1 which patches this issue.
- Find out how to get involved in Alaveteli and other projects by mySociety
[1] This FOI request was submitted in response to the Investigatory Powers Bill (also known as the ‘snoopers charter’) to troll the Home Secretary. As an aside, if you oppose the bill consider supporting the Open Rights Group’s campaign against it.
[2] Some of these domains are to support the government’s secure email system.