autoharp User Manual

A Motivating Example

The purpose of this package is to assist an instructor in grading R code. We therefore begin with a simple example: the sections below describe the assignment, the solution file, and two student scripts, and then demonstrate how the package can help.

When autoharp is installed, a folder called examples is created at the top level of the installed package. It contains example question sheets, solution templates and sample student scripts. The path to this folder varies with the OS and the R installation, so we use system.file to locate it:

system.file("examples", package="autoharp")
#> [1] "/tmp/RtmpIdF3w0/Rinst2183750ffd93b/autoharp/examples"
list.dirs(system.file("examples", package="autoharp"))
#> [1] "/tmp/RtmpIdF3w0/Rinst2183750ffd93b/autoharp/examples"                
#> [2] "/tmp/RtmpIdF3w0/Rinst2183750ffd93b/autoharp/examples/question_sheets"
#> [3] "/tmp/RtmpIdF3w0/Rinst2183750ffd93b/autoharp/examples/soln_templates" 
#> [4] "/tmp/RtmpIdF3w0/Rinst2183750ffd93b/autoharp/examples/student_scripts"

There are two example question sheets included with the installation of autoharp. In this vignette, we utilise sample_questions_01.Rmd to demonstrate how to use the main functions within the package.

Sample Questions 01 (Exact Rmd Worksheet Content)

Consider the following probability density function (pdf) over the support \((0,1)\):

\[\begin{equation*} f(x) = \begin{cases} 4x^3 & \text{if } 0 < x < 1 \\ 0 & \text{otherwise} \end{cases} \end{equation*}\]

Write a function called rf, that generates i.i.d observations from this pdf. It should take in exactly one argument, \(n\), that determines how many random variates to return. For instance:

set.seed(33)
rf(n = 5)
#> [1] 0.8171828 0.7925983 0.8339701 0.9790711 0.9584520

Now generate 10,000 random variates from this pdf and store them in a vector named X. Your script must generate a function named rf and a vector named X.

Solution Template 01

If this is an elementary course in R, or if it is just the first assignment of a class that uses R, the instructor may wish to test just a few details in each student’s solution.

  1. Is the number of formal arguments for rf equal to 1?
  2. Is the length of the vector X equal to 10,000?
  3. Compute the mean and standard deviation of the values in the X vector.
    • The theoretical mean and s.d. are 0.8 and 0.16 respectively.
  4. Has a for loop been used within the function definition of rf?
    • It shouldn’t be used, because most operations in R are vectorised.
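The theoretical values in item 3 can be checked directly from the pdf. A short derivation, using only the density given in the question sheet:

\[\begin{aligned} E[X] &= \int_0^1 x \cdot 4x^3 \, dx = \frac{4}{5} = 0.8, \\ E[X^2] &= \int_0^1 x^2 \cdot 4x^3 \, dx = \frac{4}{6} = \frac{2}{3}, \\ \operatorname{sd}(X) &= \sqrt{\frac{2}{3} - \left(\frac{4}{5}\right)^2} = \sqrt{\frac{2}{75}} \approx 0.163. \end{aligned}\]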

Student Scripts

Consider the two student submissions that are included with the package. Student 1 (qn01_scr_01.R) has a model solution. Student 2 (qn01_scr_02.R), however, has made a few mistakes, which the package output in the next section brings out.
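For orientation, a correct vectorised submission could be built on inverse-transform sampling: the CDF of the pdf is F(x) = x^4 on (0, 1), so U^(1/4) with U uniform on (0, 1) has the required density. The sketch below is an assumption about what such a script might look like. It is not a copy of qn01_scr_01.R or qn01_scr_02.R, and rf_loop is a hypothetical name used only to illustrate the loop-based style that the fourth check is meant to flag.

```r
# Vectorised sampler: F(x) = x^4 on (0, 1), so F^{-1}(u) = u^(1/4).
rf <- function(n) {
  runif(n)^(1/4)
}

# Loop-based equivalent (hypothetical name): same distribution,
# but written in the style that the for-loop check targets.
rf_loop <- function(n) {
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- runif(1)^(1/4)
  }
  out
}

set.seed(2024)
X <- rf(10000)
length(X)   # 10000
mean(X)     # should be close to the theoretical mean, 0.8
sd(X)       # should be close to the theoretical s.d., about 0.163
```

Both functions take exactly one formal argument, so they would be indistinguishable to a test that only inspects the function signature or the distribution of X; telling them apart requires looking at the code itself.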

Package Output

Here is how the autoharp package can be used to assess the scripts.

library(autoharp)

# retrieve soln template path
soln_template_path <- system.file("examples", "soln_templates",  
                                  "soln_template_01.Rmd", package="autoharp")
# retrieve installation-specific filenames of examples
stud_script_paths <- system.file("examples", "student_scripts", package="autoharp")
stud_script_names <- file.path(stud_script_paths, c("qn01_scr_01.R", "qn01_scr_02.R"))

if(rmarkdown::pandoc_available()) {
  # populate solution environment
  s_env <- populate_soln_env(soln_template_path, pattern="test", getwd(),
                             output=tempfile())
  
  # run autoharp function "render_one" on student scripts.
  corr_out <- lapply(stud_script_names, render_one, out_dir = tempdir(),
                     knit_root_dir = getwd(), soln_stuff = s_env)
  
  # combine output, dropping initial columns which pertain to runtime stats.
  do.call("rbind", corr_out)[, -(1:5)]
}
#> # A tibble: 2 x 5
#>   for_loop mean.X  sd.X lenX  lenfn
#>   <lgl>     <dbl> <dbl> <lgl> <lgl>
#> 1 FALSE     0.802 0.161 TRUE  TRUE 
#> 2 TRUE      0.472 0.128 FALSE TRUE

The first column indicates whether a “for” loop was used in the function definition. The next two columns contain the mean and s.d. of the X vectors from the respective students. The last two columns assess whether the length of the created X is 10,000, and whether the rf function has exactly one argument.

As we can see, the package correctly detected that student 1 did not use a for loop. The objects created in their script also passed the unit tests. Student 2, on the other hand, used a for loop. The mean and s.d. of their X were also incorrect, and X did not have the required length.

What Can This Package Do?

This package would be useful to an instructor who runs a class that requires students to submit short to medium length assignments in R. In those cases, it can assist the instructor in the following ways:

  1. To run all the scripts submitted by students, regardless of what packages they used. It generates an HTML page of thumbnails presenting all the images generated by students, with links to their actual HTML output.
  2. To generate features from each script that has been submitted. These features fall under three categories:
    • Runtime statistics: How long did it take to run the script and how much memory did the final set of objects take?
    • Correctness statistics: Do the objects generated achieve a basic level of correctness? For instance, the mean of the vector above should be very close to 0.8.
    • R code quality: By representing the R expressions in the script as trees, the package allows one to detect coding styles or expressions with quite a bit of flexibility - hence the ability to detect for loops within the function defined.
  3. To provide a shiny server interface for students to check against before they submit their code (detailed in another vignette). This lets them confirm that their code will run on the instructor’s machine and generate the correct output, and so avoid the problems of
    • using incorrect relative path specifications
    • using incorrect names for objects e.g. case-sensitive mismatches.
    • accidentally running their solution based on objects in their local environments.

The package requires some basic knowledge of unit testing with the testthat package.

The autoharp Framework

In this section, we detail the framework that autoharp uses to achieve the tasks above. Before that, we provide an overview of what we envision the instructor has to do. Ideally, they simply have to prepare a question paper and a solution template, and autoharp should do the rest.

[Figure: Overview]

Elements of the Framework

The Question Paper

The question paper details what the students need to create within their submission, which could be a plain R script, or an Rmd file. The required objects could be any R object, such as a data frame, a vector, a list, or a function. The question paper should clearly state the name of the object(s) and their key attributes.

For instance, if a function is to be created, the question paper should specify its name, the number and names of its formal arguments, and the return value.

The Solution Key or Template

The solution template must be an Rmd file. It is where you specify the things that should be checked about the student script. First of all, it should generate the correct versions of the objects. These “model” objects can then be used to check against student-created objects.

testthat chunks

Next, it can contain two types of test chunks: testthat chunks and autoharp scalar chunks. testthat chunks contain testthat unit tests. These typically take the form

test_that("check X properties", {
  expect_true(exists("X"))
  expect_equal(mean(X), mean(.X))
})

If the chunk above were in a solution template, autoharp would check whether an object named X exists at the end of running the student script. It would then compare the mean of the student-created X with the mean of the model X. The “model” X is referred to as .X within the solution template.

There can be more than one testthat chunk, and each one can contain several tests.
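When the object under test is random, as X is in the motivating example, an exact comparison of means only succeeds if the student and the model used the same seed and the same algorithm. A minimal sketch of a more forgiving test is shown below. It assumes testthat is installed, and it simulates both .X and X in the same session purely for illustration; in a real solution template these objects would come from the solution and student environments respectively.

```r
library(testthat)

# Stand-ins for the model object (.X) and the student's object (X).
set.seed(1)
.X <- runif(10000)^(1/4)
set.seed(2)
X <- runif(10000)^(1/4)

test_that("check X properties", {
  expect_true(exists("X"))
  expect_length(X, length(.X))
  # Different seeds give different draws, so compare with a tolerance.
  expect_equal(mean(X), mean(.X), tolerance = 0.05)
})
```

The tolerance here is an arbitrary choice for the sketch; a suitable value depends on the sample size and on how strict the instructor wishes to be.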

autoharp chunks

The other type of solution chunk is the autoharp scalar chunk. These chunks contain normal R code that utilises the objects created by students. They could also contain autoharp code that analyses the structure of student R code. For instance, the following chunk would extract the maximum and minimum values from a student-created X object.

max_X <- max(X)
min_X <- min(X)

Alternatively, the following autoharp-specific code would extract the number of calls made to mutate in the student script:

f1 <- rmd_to_forestharp(.myfilename)
mutate_count <- fapply(f1, count_fn_call, combine = TRUE, pattern="mutate")

rmd_to_forestharp, count_fn_call and fapply are autoharp functions. We shall see more about them soon. The variable .myfilename is hard-coded to contain the path to the current student script. It allows the autoharp to access the student file from within the solution script.
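To give a feel for what counting function calls in a script involves, here is a rough base R analogue that works on the parser's token table rather than on autoharp's tree representation. It is only a sketch of the general idea, not how rmd_to_forestharp or count_fn_call are implemented, and the student code shown is invented for the example.

```r
# Invented student code; it is parsed but never evaluated.
student_code <- c(
  "df2 <- mutate(df, y = x + 1)",
  "df3 <- mutate(df2, z = y * 2)",
  "summary(df3)"
)

# keep.source = TRUE is needed so that parse data is retained.
parsed <- parse(text = student_code, keep.source = TRUE)
pd <- utils::getParseData(parsed)

# Function-call names appear as tokens of type SYMBOL_FUNCTION_CALL.
mutate_count <- sum(pd$token == "SYMBOL_FUNCTION_CALL" & pd$text == "mutate")
mutate_count  # 2
```

A token-based count like this cannot answer structural questions, such as whether a loop occurs inside a particular function definition, which is where a tree representation becomes useful.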

Here is a visual summary of what the solution template should contain:

[Figure: Solution template]

Before proceeding, ensure that your solution template can knit. This is an important part of the process. The other thing to note is that all the chunks with testthat code or autoharp scalars code must be labelled with the same prefix, e.g. “test_”.

The Student Scripts

The student script can be an R script or an Rmd file. This is where most of the uncertainty enters the process. Student scripts can go wrong in a multitude of ways: they could contain infinite loops, use obscure packages, overwrite your own datasets(!), or call interactive functions such as View.

How the Elements Work Together

The job of the autoharp is to run the testthat and scalars chunks from the solution script in the student environment. Here are the detailed steps.

  1. First, the solution environment has to be populated. This is the job of populate_soln_env from autoharp. Its inputs are the solution template, a pattern that identifies which chunks are testthat or autoharp scalar chunks, and the directory in which to knit the solution template.
    • The function first runs all the code within the solution template and stores the resulting objects in an environment. Let’s call this the soln_env.
    • Two autoharp-specific knitr hooks are then processed. If, for instance, one of the testthat chunks contains the hook autoharp.objs = c("X", "rf"), then dot-prefixed copies of these objects (.X and .rf) are also placed in the soln_env, where they can be used in test code.
    • All the chunks whose labels are prefixed with “test” are extracted and placed in a solution script within a temporary directory. Within this script, every chunk that carries the hook autoharp.scalars is wrapped in a try expression. At the end of the script, some extra lines are added to copy the named scalars into the student environment.
    • The return value is a list of length 2, containing the solution environment and the path to the solution script. For the example in the Package Output section above, these are the two items contained in s_env:
if(rmarkdown::pandoc_available()){
  names(s_env)
}
#> [1] "env"        "test_fname"

The solution environment contains the following objects:

if(rmarkdown::pandoc_available()){
  ls(s_env$env, all.names=TRUE)
}
#>  [1] ".X"               ".myfilename"      ".rf"              ".scalars_to_keep"
#>  [5] "X"                "f1"               "for_loop"         "lenX"            
#>  [9] "lenfn"            "mean.X"           "rf"               "sd.X"

The contents of the solution script s_env$test_fname are:

library(autoharp)
library(rlang)
try_out <- try({
lenX <- (length(X) == length(.X))
lenfn <- (length(fn_fmls(rf)) == length(fn_fmls(.rf)))
})
try_out <- try({
mean.X <- mean(X)
sd.X <- sd(X)
})
try_out <- try({
f1 <- rmd_to_forestharp(.myfilename)
for_loop <- fapply(f1, detect_for_in_fn_def, fn_name = "rf", combine=TRUE, 
                   combiner_fn = function(x) any(unlist(x)))
})
 
get_objs <- mget(.scalars_to_keep, ifnotfound=NA)
mapply(base::assign, x=.scalars_to_keep, value=get_objs, MoreArgs = list(envir=.myenv))
  2. The next step is to render the student script or Rmd file into an HTML file. This is done within the render_one function of autoharp. Once the student file has been successfully rendered, its objects are stored in the student environment. At this point, the runtime statistics have already been generated. In preparation for the correctness check, the “model” objects from the solution environment are then copied into the student environment. These will not conflict with the student’s own objects because their names begin with a period. For instance, the student environment will now contain .X (from the solution template) and X (from the student).

  3. Correctness is assessed by running the solution script (from step 1) with test_file, with the student environment supplied as the env argument. The testthat output is parsed to retrieve the success or failure of each expectation. These results are appended as columns to the runtime statistics, along with the new scalars generated within the student environment.

This figure contains a more detailed breakdown of an instructor’s workflow when using the package. The instructor essentially needs to prepare a question paper and the solution template; ideally, populate_soln_env and render_one should do the rest.

[Figure: Instructor workflow]

This final figure zooms in on the tasks that render_one accomplishes:

[Figure: Details of render_one]

  4. If the instructor needs an overview of the images created by students, they can run generate_thumbnails on the output directory. This stitches all the images together on an HTML page, and provides links to the original student output files.
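The environment handling in steps 1 to 3 can be mimicked in a few lines of base R. The sketch below is a simplified illustration under stated assumptions: it does not call autoharp, knitr or testthat, the "student" code is invented and deliberately flawed, and the names soln_env, stud_env and checks are hypothetical.

```r
# Step 1: solution environment holding the model objects,
# plus dot-prefixed copies of the objects named for testing.
soln_env <- new.env()
evalq({
  rf <- function(n) runif(n)^(1/4)
  X <- rf(10000)
}, soln_env)
for (nm in c("X", "rf")) {
  assign(paste0(".", nm), get(nm, envir = soln_env), envir = soln_env)
}

# Step 2: student environment, here with a wrong transform and length.
stud_env <- new.env()
evalq({
  rf <- function(n) runif(n)^(1/2)
  X <- rf(500)
}, stud_env)

# Copy the model objects across; the dot prefix avoids name clashes.
for (nm in c(".X", ".rf")) {
  assign(nm, get(nm, envir = soln_env), envir = stud_env)
}

# Step 3: evaluate the checks inside the student environment.
checks <- evalq(
  list(
    lenX = length(X) == length(.X),
    lenfn = length(formals(rf)) == length(formals(.rf)),
    mean.X = mean(X)
  ),
  stud_env
)
str(checks)
```

Because both X and .X live in the same environment at check time, test code can refer to them by name without any explicit environment arguments, which keeps the test chunks in the solution template short.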

Workhorse Functions

This section goes into more detail about the two main functions discussed above.

populate_soln_env

This function takes the following arguments:

args(populate_soln_env)
#> function (soln_fname, pattern, knit_root_dir, render_only = FALSE, 
#>     output = NULL) 
#> NULL

The first step is to knit soln_fname within a new environment. This environment has now been populated with the objects created by the R code in soln_fname. Let’s call this e_soln.

In addition, the autoharp.objs hooks within soln_fname will trigger knitr to duplicate the “model” objects within the same environment e_soln, with a different name. For instance, X will be copied as .X, and so on.

A second hook, autoharp.scalars, will keep track of all scalars that need to be generated and returned from the student environment. The namelist of scalars will be stored in a vector within e_soln as .scalars_to_keep.

The testthat chunks will be parsed (using autoharp) and the test names will be stored. These will be matched with testthat output later on.

Finally, a solution script is generated. This is done by extracting all chunks whose labels have the “test” prefix and writing them to a temporary file. Any chunks with the autoharp.scalars hook will be wrapped in a try expression because they could fail in a student environment. In addition, lines will be added to the bottom of this script to copy the scalars out to the student environment. This is necessary because this file will be run through test_file, which does not write to the environment it runs in. We force those objects to be created using these last two lines of code.

render_one

This function takes the following arguments:

args(render_one)
#> function (rmd_name, out_dir, knit_root_dir, log_name, soln_stuff, 
#>     max_time_per_run = 120, permission_to_install = FALSE) 
#> NULL

Here’s how it works:

  1. A connection to the log file is opened.
  2. The search path is accessed and stored in a vector.
  3. The libraries that the Rmd file uses are extracted. If necessary, libraries will be installed.
  4. The student Rmd file is rendered into a student environment, which we shall call e_stud. This should contain all the objects that the question asked for.
  5. The solution script (in the temporary folder) is evaluated in e_stud using test_file() from the testthat package. This results in a testthat output object, which is then parsed to see which unit tests passed and which failed.
  6. The scalars generated are extracted from e_stud and combined with the unit test results into a data frame. This data frame, which has one row, is then returned.

FAQ

Why do we need those extra lines at the bottom of the solution script?

When you take a look at the test file that is output from populate_soln_env, you might see a few lines added to the solution template that you provided, for instance:

get_objs <- mget(.scalars_to_keep, ifnotfound=NA)
mapply(base::assign, x=.scalars_to_keep, value=get_objs, MoreArgs = list(envir=.myenv))

They are there for the following reason. When we run the test file through test_file(), the code is evaluated in a fresh environment, which is then discarded. As a result, new objects created in the autoharp scalar chunks would be lost. These final few lines force the created objects to be copied to the student environment, from which render_one retrieves them.
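The effect of those two lines can be reproduced outside autoharp. In the sketch below, a function body stands in for the throwaway environment that the test code runs in. The names run_in_fresh_env and max.X are hypothetical, and the explicit envir = environment() argument to mget is added here only to make the lookup location unambiguous.

```r
.myenv <- new.env()                       # stands in for the student environment
.scalars_to_keep <- c("mean.X", "sd.X", "max.X")

run_in_fresh_env <- function() {
  X <- c(0.7, 0.8, 0.9)
  # This chunk succeeds and creates two scalars locally.
  try_out <- try({
    mean.X <- mean(X)
    sd.X <- sd(X)
  }, silent = TRUE)
  # This chunk fails, so max.X is never created.
  try_out <- try({
    max.X <- max(object_that_does_not_exist)
  }, silent = TRUE)
  # Collect the scalars (NA for any that are missing) and copy them out.
  get_objs <- mget(.scalars_to_keep, envir = environment(), ifnotfound = NA)
  mapply(base::assign, x = .scalars_to_keep, value = get_objs,
         MoreArgs = list(envir = .myenv))
  invisible(NULL)
}

run_in_fresh_env()
ls(.myenv)
```

After the call, the local variables of the function are gone, but .myenv holds all three scalars, with NA recorded for the one whose chunk failed. This is why wrapping the scalar chunks in try and using ifnotfound = NA go together: a failing chunk produces a missing value rather than aborting the whole check.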

Is it necessary to have both types of chunks in the solution template?

No, your solution template could have only testthat chunks, or only autoharp-scalar chunks, or both.

Can we run the correctness check by hand, i.e. without render_one?

Yes, but this takes a little bit of hard-coding. If your solution contains autoharp-scalar chunks, you need to create an environment called .myenv, populate it with the student objects, and pass it to check_correctness. If your test code creates TreeHarp objects from the student solution, you will need to populate .myenv with the student filename too.


if(rmarkdown::pandoc_available()){
  .myenv <- new.env()
  rmarkdown::render(stud_script_names[2], output_dir = tempdir(), envir = .myenv)
  .myenv$.myfilename <- normalizePath(stud_script_names[2])
  check_correctness(.myenv, s_env$env, s_env$test_fname)
}
#> processing file: qn01_scr_02.spin.Rmd
#> output file: qn01_scr_02.knit.md
#> /usr/lib/rstudio/bin/pandoc/pandoc +RTS -K512m -RTS qn01_scr_02.utf8.md --to html4 --from markdown+autolink_bare_uris+tex_math_single_backslash --output /tmp/RtmpzC0byE/qn01_scr_02.html --lua-filter /home/viknesh/R/x86_64-pc-linux-gnu-library/4.0/rmarkdown/rmarkdown/lua/pagebreak.lua --lua-filter /home/viknesh/R/x86_64-pc-linux-gnu-library/4.0/rmarkdown/rmarkdown/lua/latex-div.lua --self-contained --standalone --section-divs --template /home/viknesh/R/x86_64-pc-linux-gnu-library/4.0/rmarkdown/rmd/h/default.html --no-highlight --variable highlightjs=1 --variable 'theme:bootstrap' --include-in-header /tmp/RtmpzC0byE/rmarkdown-str218753c2d07fb.html --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'
#> 
#> Output created: /tmp/RtmpzC0byE/qn01_scr_02.html
#>   for_loop    mean.X      sd.X  lenX lenfn
#> 1     TRUE 0.4771899 0.1167338 FALSE  TRUE

If your solution template does not create TreeHarp objects, you do not need the .myfilename variable in .myenv.

References

The autoharp framework capitalises on the concept of environments in R. It sets up one environment for the solution objects, one for the student objects, and then runs test code within the student environment. In order to understand more about the package, it would be useful to know about environments and the testthat package.

  1. The testthat package.
  2. Advanced R, chapter 7
  3. The knitr documentation, for more information on how knitr works and how knitr chunk hooks work.