Getting started 1: Parsing JSON with jsonlite

The jsonlite package is a JSON parser/generator optimized for the web. Its main strength is that it implements a bidirectional mapping between JSON data and the most important R data types. Thereby we can convert between R objects and JSON without loss of type or information, and without the need for any manual data munging. This is ideal for interacting with web APIs, or to build pipelines where data structures seamlessly flow in and out of R using JSON.

library(jsonlite)
identical(mtcars, fromJSON(toJSON(mtcars)))
[1] TRUE

This vignette introduces basic concepts to get started with jsonlite. For a more detailed outline and motivation of the mapping: arXiv:1403.2805.

Simplification

Simplification is the process where homogeneous JSON arrays automatically get converted from a list into a more specific R class. The fromJSON function has 3 arguments which control the simplification process: simplifyVector, simplifiyDataFrame and simplifyMatrix. Each one is enabled by default.

JSON structure Example JSON data Simplifies to R class Argument in fromJSON
Array of primitives ["foo", "bar"] Atomic Vector simplifyVector
Array of objects [{"x" : "foo"}, {"x": "bar"} ] Data Frame simplifyDataFrame
Array of arrays [[1,2,3], [4,5,6]] Matrix simplifyMatrix

Atomic Vectors

A JSON array containing primitives (strings, numbers or booleans) gets converted into an atomic vector. Any possible null values turn into NA.

# An array with primitives
json <- '["Mario", "Peach", null, "Bowser"]'

#This turns into an (atomic) vector
fromJSON(json)
[1] "Mario"  "Peach"  NA       "Bowser"

Without simplification, an array always turns into a list:

#If we disable simplifyVector it would be a list
fromJSON(json, simplifyVector = FALSE)
[[1]]
[1] "Mario"

[[2]]
[1] "Peach"

[[3]]
NULL

[[4]]
[1] "Bowser"

Data Frames

A JSON array containing JSON objects (key-value pairs) turns into a data frame. Missing fields turn into NA values.

json <-
'[
  {"Name" : "Mario", "Age" : 32, "Occupation" : "Plumber"}, 
  {"Name" : "Peach", "Age" : 21, "Occupation" : "Princess"},
  {},
  {"Name" : "Bowser", "Occupation" : "Koopa"}
]'
mydf <- fromJSON(json)
mydf
    Name Age Occupation
1  Mario  32    Plumber
2  Peach  21   Princess
3   <NA>  NA       <NA>
4 Bowser  NA      Koopa

The data frame can be converted back into the original JSON structure using toJSON (whitespace and line breaks are ignorable in JSON).

mydf$Ranking <- c(3, 1, 2, 4)
toJSON(mydf, pretty=TRUE)
[
  {
    "Name" : "Mario",
    "Age" : 32,
    "Occupation" : "Plumber",
    "Ranking" : 3
  },
  {
    "Name" : "Peach",
    "Age" : 21,
    "Occupation" : "Princess",
    "Ranking" : 1
  },
  {
    "Ranking" : 2
  },
  {
    "Name" : "Bowser",
    "Occupation" : "Koopa",
    "Ranking" : 4
  }
]

Matrices and Arrays

A JSON array containing equal-length arrays turns into a matrix or higher order array. Using simplifyMatrix only makes sense in conjunction with simplifyVector.

json <- '[
  [1, 2, 3, 4],
  [5, 6, 7, 8],
  [9, 10, 11, 12]
]'
mymatrix <- fromJSON(json)
mymatrix
     [,1] [,2] [,3] [,4]
[1,]    1    2    3    4
[2,]    5    6    7    8
[3,]    9   10   11   12

Again, we can use toJSON to convert the matrix or array back into the original JSON format:

toJSON(mymatrix)
[[1,2,3,4],[5,6,7,8],[9,10,11,12]]

The simplification works for arrays of arbitrary dimensionality, as long as the dimensions (length of the arrays) match, because R has no special data structure for ragged arrays.

json <- '[
   [[1, 2], 
    [3, 4]],
   [[5, 6], 
    [7, 8]],
   [[9, 10],
    [11, 12]]
]'
myarray <- fromJSON(json)
myarray[1, , ]
     [,1] [,2]
[1,]    1    2
[2,]    3    4
myarray[ , ,1]
     [,1] [,2]
[1,]    1    3
[2,]    5    7
[3,]    9   11