Introduction to Hansard

Evan Odell

2017-03-24

hansard is an R package to pull data from the UK Parliament through the http://www.data.parliament.uk/ API. It emphasises simplicity and ease of use, so that users unfamiliar with APIs can easily retrieve large volumes of high-quality data. Each function accepts a single argument at a time, and functions that need additional information to retrieve the data you requested will ask for it after you execute the function. Functions retrieve data in JSON format and convert it to a data frame. The hansard_generic function supports building API requests for XML, CSV or HTML formats if required. Note that the API is rate limited to returning 5500 rows per request in some circumstances.

Installing hansard

From CRAN

install.packages("hansard")

From GitHub (Development Version)

install.packages("devtools")
devtools::install_github("EvanOdell/hansard")

Load hansard

library(hansard)

Using hansard

hansard contains functions for calling data from the UK Parliament API. Most of the functions are designed to call data from a specific http://www.data.parliament.uk/ API, but several functions have special or experimental features:

The members_search function

Looking up information on an individual MP or Lord through the Parliamentary API requires knowing their parliamentary ID number. This can be hard to find on the web, but luckily you can look it up through the API. We want information on the voting record of Diane Abbott, the Labour MP for Hackney North and Stoke Newington, but we don’t know her ID number, so we search for her:

library(hansard)
members_search("abbot")
#>   MP.ID additionalName                      constituency constituencyCode
#> 1   172          Julie Hackney North and Stoke Newington           146966
#> 2  1651      Granville                              <NA>               NA
#> 3  4249           <NA>                      Newton Abbot           147092
#> 4  3827         Edmond                              <NA>               NA
#>   familyName                       fullName gender  givenName
#> 1     Abbott                Ms Diane Abbott Female      Diane
#> 2    Hodgson Lord Hodgson of Astley Abbotts   Male      Robin
#> 3     Morris              Anne Marie Morris Female Anne Marie
#> 4  Neuberger   Lord Neuberger of Abbotsbury   Male      David
#>                            homePage
#> 1     http://www.dianeabbott.org.uk
#> 2                              <NA>
#> 3 http://www.annemariemorris.co.uk/
#> 4       http://www.judiciary.gov.uk
#>                                                      label        party
#> 1                Biography information for Ms Diane Abbott       Labour
#> 2 Biography information for Lord Hodgson of Astley Abbotts         <NA>
#> 3              Biography information for Anne Marie Morris Conservative
#> 4   Biography information for Lord Neuberger of Abbotsbury         <NA>
#>                             twitter
#> 1 https://twitter.com/HackneyAbbott
#> 2                              <NA>
#> 3    https://twitter.com/AMMorrisMP
#> 4                              <NA>

The search function is not case sensitive, and searches the names and constituencies of all MPs and Lords. So even though we spelled her surname incorrectly, we can still find her. This API provides limited biographical details; to retrieve more detailed biographical information, use the related mnis package, which retrieves data from the Members’ Names Information Service.
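
To reuse the ID in later calls, it can help to pull it out of the search results programmatically. The sketch below is illustrative only: it rebuilds a few columns of the output above by hand so that it runs without calling the API, and the case-insensitive grepl match is an assumption that only approximates how the API search behaves.

```r
# A few columns of the search results shown above, rebuilt by hand so the
# example runs without calling the API.
results <- data.frame(
  MP.ID = c(172, 1651, 4249, 3827),
  familyName = c("Abbott", "Hodgson", "Morris", "Neuberger"),
  fullName = c("Ms Diane Abbott", "Lord Hodgson of Astley Abbotts",
               "Anne Marie Morris", "Lord Neuberger of Abbotsbury"),
  constituency = c("Hackney North and Stoke Newington", NA,
                   "Newton Abbot", NA),
  stringsAsFactors = FALSE
)

# Assumption: approximate the search as a case-insensitive substring match
# on names and constituencies. grepl() returns FALSE for NA values.
hits <- grepl("abbot", results$fullName, ignore.case = TRUE) |
  grepl("abbot", results$constituency, ignore.case = TRUE)

# Narrow the matches to an exact family name and extract the ID.
abbott_id <- results$MP.ID[results$familyName == "Abbott"]
abbott_id
```

With the real package, the same subsetting would be applied to the data frame returned by members_search("abbot").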

The commons_divisions and mp_vote_record functions

The commons_divisions function returns divisions in the House of Commons, including the result of votes and information on what was being voted on. mp_vote_record returns a data frame of the voting record of a given MP on each division they voted in. The example below returns all Commons divisions where Diane Abbott voted aye.

x <- mp_vote_record(172, "aye")
#> Retrieving page 1 of 4
#> Retrieving page 2 of 4
#> Retrieving page 3 of 4
#> Retrieving page 4 of 4
head(x)
#>  _about                                                              title
#> 1 http://data.parliament.uk/resources/626911                            Opposition Motion: Community Pharmacies
#> 2 http://data.parliament.uk/resources/626953                           Opposition Motion: Police Officer Safety
#> 3 http://data.parliament.uk/resources/621252                                           Opposition Motion: Yemen
#> 4 http://data.parliament.uk/resources/607490 Closure Motion: Sexual Offences (Pardons Etc.) Bill second reading
#> 5 http://data.parliament.uk/resources/582798      Opposition Motion: NHS sustainabiliy and transformation plans
#> 6 http://data.parliament.uk/resources/576587                                Finance Bill: Report Stage Amdt 174
#>                 uin date._value date._datatype
#> 1 CD:2016-11-02:142  2016-11-02       dateTime
#> 2 CD:2016-11-02:143  2016-11-02       dateTime
#> 3 CD:2016-10-26:139  2016-10-26       dateTime
#> 4 CD:2016-10-21:138  2016-10-21       dateTime
#> 5 CD:2016-09-14:134  2016-09-14       dateTime
#> 6 CD:2016-09-06:125  2016-09-06       dateTime
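
The returned data frame can be summarised with base R. As a minimal sketch, the code below rebuilds the six rows shown above by hand (so it runs without the API), converts the date._value strings to Date objects, and counts divisions per month. The column names are taken from the output above; the objects votes, by_month and opposition are names chosen for this illustration.

```r
# The six rows printed above, rebuilt by hand for a self-contained example.
votes <- data.frame(
  title = c(
    "Opposition Motion: Community Pharmacies",
    "Opposition Motion: Police Officer Safety",
    "Opposition Motion: Yemen",
    "Closure Motion: Sexual Offences (Pardons Etc.) Bill second reading",
    "Opposition Motion: NHS sustainabiliy and transformation plans",
    "Finance Bill: Report Stage Amdt 174"
  ),
  date._value = c("2016-11-02", "2016-11-02", "2016-10-26",
                  "2016-10-21", "2016-09-14", "2016-09-06"),
  stringsAsFactors = FALSE
)

# The dates arrive as character strings; convert them before doing date work.
votes$date <- as.Date(votes$date._value)

# Count divisions per calendar month.
by_month <- table(format(votes$date, "%Y-%m"))
by_month

# Keep only the divisions whose title starts with "Opposition Motion".
opposition <- votes[grepl("^Opposition Motion", votes$title), ]
nrow(opposition)
```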

The results of votes in the House of Lords can be retrieved with the lords_divisions function. The voting record of individual Lords can be retrieved using the lords_vote_record function.

The research_briefings function

The research_briefings function includes the experimental feature of requesting data using lists created with the research_briefings_lists functions:

#> research_topics_list <- research_topics_list()
#> 
#> research_subtopics_list <- research_subtopics_list()
#> 
#> research_types_list <- research_types_list()
#>
#> research_topics_list[[7]]
#> [1] "Defence"
#>
#> research_subtopics_list[[7]][10]
#> [1] "Falkland Islands"
#>
#> research_types_list[[1]]
#> [1] "Lords Library notes"

In this case I have given them the same name as their function, but you can assign any name you wish to them.

Once created, the lists can be used to specify which topics and subtopics to call, although strings can also be used.

#> x <- research_briefings(topic = research_topics_list[[7]])
#>
#> x <- research_briefings(topic = research_topics_list[[7]], subtopic=research_subtopics_list[[7]][10])
#>
#> x <- research_briefings(topic = "Defence")

If a specific subtopic is called, but the topic is not specified, the function will still return all data within that specific subtopic. Note that this is slower than specifying both the topic and the subtopic.

#> x <- research_briefings(subtopic = research_subtopics_list[[7]][10])
#> 
#> x <- research_briefings(subtopic = "Falkland Islands")
#>
#> system.time(without_topic <- research_briefings(subtopic = research_subtopics_list[[7]][10]))
#> Retrieving page 1 of 1
#>   user  system elapsed 
#>   1.12    2.59    4.71 
#>
#> system.time(with_topic <- research_briefings(topic = research_topics_list[[7]], subtopic=research_subtopics_list[[7]][10]))
#> Retrieving page 1 of 1
#>    user  system elapsed 
#>    0.47    1.31    1.89 
#>
#> all.equal(with_topic, without_topic)
#> [1] TRUE

If a specified subtopic is not a subtopic of the specified topic, the function will not return any data.
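
Because a mismatched pair returns no data, it may be worth checking the pairing before making a request. The sketch below assumes that the subtopics list is indexed in parallel with the topics list, as the [[7]] examples above suggest. The helper is_subtopic_of is hypothetical and not part of hansard, and the two small lists are stand-ins for the real ones, with a placeholder second entry.

```r
# Stand-in lists for illustration; the real ones come from
# research_topics_list() and research_subtopics_list().
topics <- list("Defence", "Placeholder topic")
subtopics <- list(
  c("Falkland Islands"),
  c("Placeholder subtopic")
)

# Hypothetical helper: TRUE when `subtopic` is listed under `topic`,
# assuming subtopics[[i]] holds the subtopics of topics[[i]].
is_subtopic_of <- function(topic, subtopic, topics, subtopics) {
  idx <- match(topic, unlist(topics))
  if (is.na(idx)) {
    return(FALSE)
  }
  subtopic %in% subtopics[[idx]]
}

is_subtopic_of("Defence", "Falkland Islands", topics, subtopics)
is_subtopic_of("Placeholder topic", "Falkland Islands", topics, subtopics)
```

The first call returns TRUE and the second returns FALSE, so only the first pairing would be worth passing to research_briefings.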

The hansard_generic function

The hansard_generic function allows you to put in your own paths to the API. Information on all the paths available in the API can be found on the DDP Explorer website.

x <- hansard_generic("commonsansweredquestions.json")

Note that the API defaults to returning 10 items per page, but allows up to 500 items per page.
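
The page size matters because it determines how many requests a query needs. As a rough sketch, the hypothetical helper pages_needed below computes the number of pages for a given result size. The figure of 5500 items is used purely as an illustration, borrowed from the row limit mentioned in the introduction.

```r
# Hypothetical helper: number of pages required to fetch `total_items`
# when the API returns `page_size` items per page.
pages_needed <- function(total_items, page_size) {
  ceiling(total_items / page_size)
}

pages_needed(5500, 10)   # default page size: 550 pages
pages_needed(5500, 500)  # maximum page size: 11 pages
```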