Module 3

This module is an introduction to RMarkdown.

Module Objectives

Why R

RGui and RStudio

Getting Started

RStudio

What are R Packages?

R packages extend the functionality of R by providing additional functions, data, and documentation. They are written by a worldwide community of R users and can be downloaded for free from the internet.

A good analogy for R packages is that they are like apps you can download onto a mobile phone.

So R is like a new mobile phone: while it has a certain number of features when you use it for the first time, it doesn’t have everything. R packages are like the apps you can download onto your phone from Apple’s App Store or Android’s Google Play.

Let’s continue this analogy by considering the Instagram app for editing and sharing pictures. Say you have purchased a new phone and you would like to share a photo you have just taken with friends on Instagram. You need to:

  1. Install the app: Since your phone is new and does not include the Instagram app, you need to download the app from either the App Store or Google Play. You do this once and you’re set for the time being. You might need to do this again in the future when there is an update to the app.

  2. Open the app: After you’ve installed Instagram, you need to open it.

Once Instagram is open on your phone, you can then proceed to share your photo with your friends and family. The process is very similar for using an R package. You need to:

  1. Install the package: This is like installing an app on your phone. Most packages are not installed by default when you install R and RStudio. Thus if you want to use a package for the first time, you need to install it first. Once you’ve installed a package, you likely won’t install it again unless you want to update it to a newer version.

  2. “Load” the package: “Loading” a package is like opening an app on your phone. Packages are not “loaded” by default when you start RStudio on your computer; you need to “load” each package you want to use every time you start RStudio.
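In R code, the two steps look roughly like the sketch below. It uses the tools package only because it ships with R, so the example runs without a download; that choice is an assumption for illustration, and you would substitute the name of the package you actually need.

```r
# Step 1: install the package, but only if it is not already on this machine.
# "tools" ships with R, so the install branch is normally skipped here.
pkg <- "tools"
already_installed <- requireNamespace(pkg, quietly = TRUE)
if (!already_installed) {
  install.packages(pkg)
}

# Step 2: load the package. This must be repeated in every new R session.
library(pkg, character.only = TRUE)

# Check that the package is now attached to the search path.
is_loaded <- paste0("package:", pkg) %in% search()
```

In everyday use you would usually type the shorter forms, for example install.packages("dplyr") once and library(dplyr) at the top of each script.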

Errors, warnings, and messages

What is RMarkdown?¹

Overview

R Markdown provides an authoring framework for data science. You can use a single R Markdown file to both save and execute code and generate high-quality reports that can be shared with an audience.

Installation

Like the rest of R, R Markdown is free and open source. You can install the R Markdown package from CRAN with:

install.packages("rmarkdown")

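Once you have installed rmarkdown, you can create an R Markdown document, a plain text file with the .Rmd extension.

Notice that the file contains three types of content: an optional YAML header surrounded by --- lines, R code chunks surrounded by triple backticks, and text mixed with simple Markdown formatting.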

Rendering output

To generate a report from the file, run the render command:

library(rmarkdown)
render("1-example.Rmd")

Better still, use the “Knit” button in the RStudio IDE to render the file and preview the output with a single click.

R Markdown generates a new file that contains selected text, code, and results from the .Rmd file. The new file can be a finished web page, PDF, MS Word document, slide show, notebook, handout, book, dashboard, package vignette or other format.

How it works


When you run render, R Markdown feeds the .Rmd file to knitr, which executes all of the code chunks and creates a new markdown (.md) document which includes the code and its output.

The markdown file generated by knitr is then processed by pandoc which is responsible for creating the finished format.

This may sound complicated, but R Markdown makes it extremely simple by encapsulating all of the above processing into a single render function.
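The two stages can also be run by hand, which makes the pipeline easier to see. The sketch below assumes knitr is installed; the file names and temporary folder are hypothetical, and the pandoc step is skipped when rmarkdown or pandoc is not available.

```r
library(knitr)

# Write a tiny R Markdown file to a temporary folder.
work_dir <- tempfile("rmd-demo-")
dir.create(work_dir)
rmd_file <- file.path(work_dir, "demo.Rmd")
fence <- strrep("`", 3)  # three backticks, built in code to keep this example readable
writeLines(c(
  "# Demo",
  "",
  paste0(fence, "{r}"),
  "1 + 1",
  fence
), rmd_file)

# Stage 1: knitr runs the chunk and writes a Markdown file with code and output.
md_file <- knit(rmd_file, output = file.path(work_dir, "demo.md"), quiet = TRUE)
md_lines <- readLines(md_file)

# Stage 2: pandoc converts the Markdown file to the finished format.
if (requireNamespace("rmarkdown", quietly = TRUE) && rmarkdown::pandoc_available()) {
  rmarkdown::pandoc_convert(md_file, to = "html",
                            output = file.path(work_dir, "demo.html"))
}
```

Printing md_lines shows the original chunk code followed by its result, which is exactly what pandoc receives in the second stage.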

Knowing the details is beyond the scope of this class, but the gist is that we can create documents that contain both code and its output. This facilitates reproducible reports.

Imagine having to generate the same reports each week or month from Excel spreadsheets. Each time, you would have to re-enter the formulas that produce the numbers and summary statistics for the report. But if your original report contained the code needed to generate the desired output, all you would have to do is update the data.

Code Chunks

You can quickly insert chunks like these into your file with

  • the keyboard shortcut Ctrl + Alt + I (OS X: Cmd + Option + I),

  • the Add Chunk command in the editor toolbar, or

  • by typing the chunk delimiters ```{r} and ```.

When you render your .Rmd file, R Markdown will run each code chunk and embed the results beneath the code chunk in your final report.

Chunk Options

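Chunk output can be customized with knitr options, arguments set in the {} of a chunk header. Five commonly used arguments are:

  • include = FALSE prevents code and results from appearing in the finished file. R Markdown still runs the code in the chunk, and the results can be used by other chunks.

  • echo = FALSE prevents code, but not the results, from appearing in the finished file.

  • message = FALSE prevents messages that are generated by code from appearing in the finished file.

  • warning = FALSE prevents warnings that are generated by code from appearing in the finished file.

  • fig.cap = "..." adds a caption to graphical results.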

Exercise #1

Answer the following questions. Do not peek at the answers. 👊

  1. Add echo=FALSE to the following chunk in your R Markdown file. What happens?

    ```{r cars}
    summary(cars)
    ```

    Answer

  2. Add include=FALSE to the following chunk in your R Markdown file. What happens?

    ```{r cars}
    summary(cars)
    ```

    Answer

  3. Add a caption to the plot produced by the following chunk in your R Markdown file.

    ```{r pressure, echo=FALSE}
    plot(pressure)
    ```

    Answer

See the R Markdown Reference Guide for a complete list of knitr chunk options.

Global Options

To set global options that apply to every chunk in your file, call knitr::opts_chunk$set in a code chunk, usually the first one in the file. Knitr will treat each option that you pass to knitr::opts_chunk$set as a global default that can be overridden in individual chunk headers.
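As a minimal sketch, a setup chunk might contain something like the following. The particular options chosen here are only an example, not a recommendation from the course.

```r
library(knitr)

# Set defaults for every chunk that follows in the document.
opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)

# Inspect the current default for one option.
echo_default <- opts_chunk$get("echo")
```

A single chunk can still show its code by setting echo=TRUE in its own header, because chunk-level options take precedence over the global defaults.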

Caching

If document rendering becomes time-consuming due to long computations, you can use knitr caching to improve performance. The knitr chunk and package options documentation describes how caching works, and the cache examples provide additional details.

Inline Code

Code results can be inserted directly into the text of a .Rmd file by enclosing the code with `r `.
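To see what inline code turns into, the sketch below knits a one-line file and reads back the result. It assumes knitr is installed; the file names and temporary folder are hypothetical.

```r
library(knitr)

work_dir <- tempfile("inline-demo-")
dir.create(work_dir)
rmd_file <- file.path(work_dir, "inline.Rmd")

# One line of text with an inline R expression between backticks.
writeLines("The cars data set has `r nrow(cars)` rows.", rmd_file)

# Knit the file; the inline expression is replaced by its value.
md_file <- knit(rmd_file, output = file.path(work_dir, "inline.md"), quiet = TRUE)
rendered <- readLines(md_file)
```

After knitting, rendered contains the sentence with the expression replaced by the number of rows in cars.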

Copy the following code and add it to your default .Rmd file:

# Inline Code example

```{r inline, include=FALSE}
cars_variable<-"dist"
#cars_variable<-"speed"
```


You can use `<-` to store information in objects:
`object <- information`. The code above stores `r cars_variable` in cars_variable.


```{r select, message=FALSE}
library(dplyr)

# Keep only the column whose name is stored in cars_variable
cars %>%
  select(all_of(cars_variable))
```

Code Languages

knitr can execute code in many languages besides R. Some of the available language engines include Python, SQL, Bash, Rcpp, Stan, JavaScript, and CSS.

To process a code chunk using an alternate language engine, replace the r at the start of your chunk declaration with the name of the language. To run SQL, first connect to the database using R code. See the following example:

CODE

```{r, echo=TRUE}
library(DBI)
db = dbConnect(RSQLite::SQLite(), dbname = "data/03-Data-file-classroom-exercise-Chinook_Sqlite.sqlite")
```

OUTPUT

library(DBI)
db = dbConnect(RSQLite::SQLite(), dbname = "data/03-Data-file-classroom-exercise-Chinook_Sqlite.sqlite")

Next, start your chunk with the declaration of the language and pass the connection through the connection option. Below is the answer to the first question from the first exercise in module 1.

CODE

```{sql, connection=db, echo=TRUE}
SELECT count(*) 
FROM pragma_table_info("Customer")
```

OUTPUT

SELECT count(*) 
FROM pragma_table_info("Customer")
Table 1: 1 records
count(*)
13
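To try the same kind of query without the Chinook file, the sketch below uses plain R with DBI. It assumes the DBI and RSQLite packages are installed, and it substitutes an in-memory database holding the built-in cars data for the course database, so the table name and result differ from the output above.

```r
library(DBI)

# Assumption: an in-memory SQLite database stands in for the Chinook file.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "cars", cars)

# Same query shape as above: count the columns of a table.
n_columns <- dbGetQuery(
  con,
  "SELECT count(*) AS n FROM pragma_table_info('cars')"
)

# Close the connection when finished.
dbDisconnect(con)
```

Here dbGetQuery returns a data frame, so the count is available as n_columns$n. In an {sql} chunk, knitr runs the query through the same kind of connection object for you.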

Note that chunk options like echo and results are all valid when using another language engine.

Learn more about using other languages with R Markdown in knitr Language Engines.
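The engines registered in your own knitr installation can be listed from R. This is a small sketch assuming knitr is installed; the exact list depends on the knitr version.

```r
library(knitr)

# knit_engines$get() returns a named list with one entry per language engine.
engines <- sort(names(knit_engines$get()))

# Show the first few engine names.
head(engines)

# Check for a specific engine before relying on it.
has_sql <- "sql" %in% engines
```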

Parameters

R Markdown documents can include one or more parameters whose values can be set when you render the report.

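Update your .Rmd file with the following.

YAML CODE

Watch the spacing and placement: params belongs inside the YAML header at the top of the file, and each parameter is indented beneath it.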

params: 
  cars_var: "dist"

CODE

# Parameters

The YAML code above stores  `r params$cars_var` in params$cars_var.


```{r select-2, message=FALSE}
library(dplyr)

# Keep only the column named by the cars_var parameter
cars %>%
  select(all_of(params$cars_var))
```

Declaring Parameters

Parameters are declared using the params field within the YAML header of the document. For example, the file above creates the parameter cars_var and assigns it the default value "dist".

Using Parameters in Code

Parameters are made available within the knit environment as a read-only list named params. To access a parameter in code, call params$<parameter name>.
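When testing chunk code interactively, outside of knitting, no params object exists yet. One workaround, sketched below, is to create a plain list with the same name and fields. This stand-in is an assumption for illustration only: unlike the real params, it is an ordinary list and is not read-only.

```r
# Stand-in for the list that R Markdown builds from the YAML header.
params <- list(cars_var = "dist")

# Use the parameter to pick a column of the built-in cars data.
selected <- cars[, params$cars_var, drop = FALSE]

# Look at the first rows of the selected column.
head(selected)
```

Remove the stand-in before knitting so that it does not mask the values supplied through the YAML header or render.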

Setting Parameter Values

Add a params argument to render to create a report that uses a new set of parameter values. Here we modify our report to use the speed variable:

render("rmarkdown_demo.Rmd", params = list(cars_var = "speed"))

Better yet, click the “Knit with Parameters” option in the dropdown menu next to the RStudio IDE knit button to set parameters, render, and preview the report in a single user-friendly step.

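Parameters are useful when you want to re-render the same report with distinct values for various key inputs, for example a report for a specific department or region, a report covering a specific time period, or several versions of a report built on different core assumptions.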

Learn more about parameters at Parameterized Reports.

Tables

By default, R Markdown displays data frames and matrices as they would be in the R terminal (in a monospaced font). If you prefer that data be displayed with additional formatting you can use the knitr::kable function.

CODE

# Tables

**knitr kable**
```{r kable-tables, echo=FALSE, results='asis'}
library(knitr)
kable(head(cars, 5), caption = "Kable Table")
```

**default table**
```{r default-tables, echo=FALSE}
head(cars, 5)
```

Note the use of the results='asis' chunk option. This is required to ensure that the raw table output isn’t processed further by knitr.
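To see what kable actually hands to the rest of the pipeline, you can call it in the console. The sketch below assumes knitr is installed and asks for the "pipe" format explicitly, which is an illustrative choice; inside a document, knitr normally picks the format for you.

```r
library(knitr)

# Build a Markdown pipe table from the first three rows of cars.
tbl <- kable(head(cars, 3), format = "pipe", caption = "Kable Table")

# Printing shows the plain-text Markdown that pandoc later turns into a table.
print(tbl)
```

The result is a character vector of Markdown lines rather than a data frame, which is why it can be styled by the output format.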

Markdown Basics

Format the text in your R Markdown file with Pandoc’s Markdown, a set of markup annotations for plain text files. When you render your file, Pandoc transforms the marked up text into formatted text in your final file format.

Syntax                                        Becomes

plain text                                    plain text
End a line with two spaces                    End a line with two spaces to start a
to start a new paragraph                      new paragraph
*italics* and _italics_                       italics and italics (shown in italics)
**bold** and __bold__                         bold and bold (shown in bold)
superscript^2^                                superscript²
~~strikethrough~~                             strikethrough (shown struck through)
[link](www.rstudio.com)                       link (shown as a hyperlink)
# Header 1                                    Header 1 (largest heading)
## Header 2                                   Header 2
### Header 3                                  Header 3
#### Header 4                                 Header 4
##### Header 5                                Header 5
###### Header 6                               Header 6 (smallest heading)
endash: --                                    endash: –
emdash: ---                                   emdash: —
ellipsis: ...                                 ellipsis: …
inline equation: $A = \pi*r^{2}$              inline equation: \(A = \pi*r^{2}\)
image: ![](images/VERTICAL_PRINT_maroon_cmyk.jpg)
                                              image: (the image is displayed)
horizontal rule (or slide break): ***         horizontal rule (or slide break): (a horizontal line)
> block quote                                 block quote (shown indented)

* unordered list                              • unordered list
* item 2                                      • item 2
    + sub-item 1                                  • sub-item 1
    + sub-item 2                                  • sub-item 2

1. ordered list                               1. ordered list
2. item 2                                     2. item 2
    + sub-item 1                                  • sub-item 1
    + sub-item 2                                  • sub-item 2

Table Header | Second Header                  Table Header    Second Header
------------- | -------------                 Table Cell      Cell 2
Table Cell | Cell 2                           Cell 3          Cell 4
Cell 3 | Cell 4

CHEATSHEET

CHEATSHEET2

Exercise #2

  1. Create an Exercise 2 heading in your Rmd file and give at least 5 examples of syntax (don’t include any headers). Be sure to knit the file to check the result.

Output Formats

Set the output_format argument of render to render your .Rmd file into any of R Markdown’s supported formats. For example, the code below renders NAMEOFYOURFILE.Rmd to a Microsoft Word document.

library(rmarkdown)

render("NAMEOFYOURFILE.Rmd", output_format = "word_document")

If you do not select a format, R Markdown renders the file to its default format, which you can set in the output field of a .Rmd file’s header. The header of NAMEOFYOURFILE.Rmd shows that it renders to an HTML file by default.

The RStudio IDE knit button renders a file to the first format listed in its output field. You can render to additional formats by clicking the dropdown menu beside the knit button.

The following output formats are available to use with R Markdown.

Documents

Presentations (slides)

More

You can also build books, websites, and interactive documents with R Markdown.

Output Options

Each output format is implemented as a function in R. You can customize the output by passing arguments to the function as sub-values of the output field.

To learn which arguments a format takes, read the format’s help page in R, e.g. ?html_document.
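Besides the help page, the argument names of a format function can be listed directly from R. A small sketch, assuming the rmarkdown package is installed:

```r
library(rmarkdown)

# Each output format is an ordinary R function, so its arguments can be inspected.
html_args <- names(formals(html_document))

# Show the first few argument names, e.g. the table-of-contents settings.
head(html_args)
```

Any of these names can then be used as a sub-value of the output field in the YAML header, for example toc under html_document.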

Assignment

Turn in an .Rmd file and the knitted HTML file containing the questions and answers from Module 1. You have an example of how to connect to your database, and you already have your SQL code. Copy and paste the questions and add the answers as code chunks. Be sure to show both the code and the results.


  1. The information for this module is from RStudio.