
An Introduction to excerptr

Andreas Dominik Cullmann

2021-08-03, 16:26:49

excerptr is an R interface to the Python package excerpts. See the documentation of excerpts for the motivation behind it.

Suppose you have a script like this one:

path <- system.file("tests", "files", "some_file.R", package = "excerptr")
cat(readLines(path), sep = "\n")

#######% % All About Me
#######% % Me
####### The above defines a pandoc markdown header.
####### This is more text that will not be extracted.
#######% **This** is an example of a markdown paragraph: markdown 
#######% recognizes only six levels of heading, so we use seven or
#######% more levels to mark "normal" text.
#######% Here you can use the full markdown 
#######% [syntax](http://daringfireball.net/projects/markdown/syntax).
#######% *Note* the trailing line: markdown needs an empty line to end
#######% a paragraph.
#######%

#% A section
##% A subsection
### Not a subsubsection but a plain comment.
############% Another markdown paragraph.
############%
####### More text that will not be extracted.

Say you want to excerpt the comments marked by ‘%’ into a file that serves as a table of contents for your script. Then

excerptr::excerptr(file_name = path, run_pandoc = FALSE, output_path = tempdir())

## [1] 0

gives you

cat(readLines(file.path(tempdir(), sub("\\.R$", ".md", basename(path)))), 
    sep = "\n")

% All About Me
% Me
**This** is an example of a markdown paragraph: markdown 
recognizes only six levels of heading, so we use seven or
more levels to mark "normal" text.
Here you can use the full markdown 
[syntax](http://daringfireball.net/projects/markdown/syntax).
*Note* the trailing line: markdown needs an empty line to end
a paragraph.

# A section
## A subsection
Another markdown paragraph.
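
The marking rule visible in the example above can be sketched in a few lines of base R. This is only an illustration inferred from the input and output shown here, not how excerptr or the Python package excerpts implement it; the function excerpt_sketch and its arguments are hypothetical. It assumes that a line is extracted when its leading run of comment characters is directly followed by ‘%’, that one to six comment characters become a markdown heading of that level, and that seven or more yield plain text.

```r
# Hypothetical helper, not part of excerptr: mimic the marking rule in base R.
excerpt_sketch <- function(lines, comment_char = "#", magic_char = "%") {
    # One or more comment characters, the magic character, an optional blank,
    # then the text to keep.
    pattern <- paste0("^(", comment_char, "+)", magic_char, " ?(.*)$")
    marked <- grep(pattern, lines, value = TRUE)
    # Number of leading comment characters on each marked line.
    level <- nchar(sub(pattern, "\\1", marked))
    text <- sub(pattern, "\\2", marked)
    # Up to six levels become markdown headings, deeper levels plain text.
    ifelse(level <= 6, paste(strrep("#", level), text), text)
}

script <- c("#######% % All About Me",
            "####### This is more text that will not be extracted.",
            "#% A section",
            "##% A subsection",
            "### Not a subsubsection but a plain comment.",
            "############% Another markdown paragraph.",
            "############%")
writeLines(excerpt_sketch(script))
```

Unmarked comments are dropped because no ‘%’ follows their comment characters, and a bare marker such as the last line yields the empty line that ends a markdown paragraph.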

If you have pandoc installed, you can convert the markdown output into HTML:

# Both executables must be on the search path.
is_pandoc_installed <- nchar(Sys.which("pandoc")) > 0 &&
    nchar(Sys.which("pandoc-citeproc")) > 0
is_pandoc_version_sufficient <- FALSE
if (is_pandoc_installed) {
    reference <- "1.12.3"
    # The first line of `pandoc --version` reads "pandoc <version>".
    version <- strsplit(system2(Sys.which("pandoc"), "--version", stdout = TRUE),
                        split = " ")[[1]][2]
    if (utils::compareVersion(version, reference) >= 0)
        is_pandoc_version_sufficient <- TRUE
}
# Run pandoc only if a sufficiently recent version was found.
if (is_pandoc_version_sufficient)
    excerptr::excerptr(file_name = path, pandoc_formats = "html",
                       output_path = tempdir())

This runs pandoc on your excerpted comments and generates an HTML file whose contents you can print via:

if (is_pandoc_version_sufficient) 
    cat(readLines(file.path(tempdir(), sub("\\.R$", ".html", basename(path)))), 
        sep = "\n")

You can open it in your browser via

browseURL(file.path(tempdir(), sub("\\.R$", ".html", basename(path))))
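
The output path is built the same way several times above, so it may be convenient to wrap that expression in a small helper. The following is a minimal sketch; output_file and browse_output are hypothetical names and not part of excerptr. It assumes the output file sits in the output directory and differs from the script name only in its extension, as in the calls above, and it opens the browser only in an interactive session and only if the file exists.

```r
# Hypothetical helpers, not part of excerptr.

# Build the path of an output file from the script's path.
output_file <- function(path, extension, directory = tempdir()) {
    file.path(directory, sub("\\.R$", paste0(".", extension), basename(path)))
}

# Open the html output in a browser if that is possible.
# Returns (invisibly) whether the browser was started.
browse_output <- function(path, directory = tempdir()) {
    html <- output_file(path, "html", directory)
    can_browse <- interactive() && file.exists(html)
    if (can_browse) utils::browseURL(html)
    invisible(can_browse)
}
```

With these, `browse_output(path)` replaces the call above and does nothing if pandoc was not run.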
