class: center, middle, inverse, title-slide

# Intro to R Markdown
## Automate! Reproduce! Win!
### Matt Dray
### First presented 14 May 2018

---

# Hello

* Who's used R?
* Who's used R Markdown?

This talk is for *beginners*. It isn't about how to code. It's about improving what we do.

???

R training sessions are coming! Materials are available from the [R Training Course site](https://dfe-analytical-services.github.io/r-training-course/).

---

# The problem is the process

The Excel-Word approach is looooong and error-prone.

Exaggerated example:

1. Receive non-machine-readable data in Excel format
2. Use a pre-prepared Excel template to reformat the data
3. Generate figures, tables and plots
4. Copy and paste to ~100-page Word document
5. Find error/need to alter something
6. Fix it
7. Spend ages re-copy-pasting the updated bits
8. Get confused about the current working version
9. Repeat steps 5 to 7 until 4am

---

# Is this reproducible?

Questions to ask yourself:

* are the steps recorded?
* what's the chance of human error (e.g. bad copy-paste)?
* is it easy to keep track of changes?
* could someone else reproduce it from scratch independently?

---

# Minimise the risk

We can reduce error and go faster with R.

1. Receive machine-readable data in Excel format
2. Run a pre-prepared file written with **R Markdown**
3. Make minor tweaks if needed and re-render stress-free
4. Use the saved time to prepare more informative commentary and have a cup of tea

---

# What's an R Markdown file?

A document with filetype .Rmd in which you:

* write plain text that's 'marked-up' with symbols (`*`, `[]`, `^`, etc)
* embed R code to read, process and visualise data
* click a button to render it to an HTML, Word or PDF document

???

This is already being done in the department and across government ([Reproducible Analytical Pipelines](https://dataingovernment.blog.gov.uk/2017/03/27/reproducible-analytical-pipeline/)).

---

# What does it look like?
<img src="img/example-rmd-rstudio.PNG" width="100%" alt="RStudio window showing example R Markdown file on the left and the rendered output on the right">

---

# What does it look like?

<img src="img/example-rmd-rstudio-notes.jpg" width="100%" alt="RStudio window showing example R Markdown file on the left ('header', 'text' and 'code chunk' highlighted) and the rendered output on the right">

---

# Header

Enclosed in three hyphens. Specify metadata and what output should be produced. There's also space for some style information.

```
---
title: "Your title"
author: "Your name"
date: "The date"
output:
  word_document:
    reference_docx: mystyles.docx
    highlight: "tango"
---
```

???

The header section doesn't contain R code. It's a language called YAML, which originally stood for *Yet Another Markup Language* (it now officially stands for *YAML Ain't Markup Language*)!

---

# Body text: formatting

You type:

```
*Italic*, **bold**, super^script^ and a [link](https://www.gov.uk).
```

You get:

> *Italic*, **bold**, super^script^ and a [link](https://www.gov.uk).

???

If you type these in RStudio, the text will change colour to indicate the formatting.

<img src="img/format-colour.PNG" width="100%" alt="Text highlighting in RStudio">

---

# Body text: inline code (maths)

You can write some R code in the middle of a sentence!

Wrap the code in backticks (the key under the Esc key) and start it with the letter `r`.

You type:

```r
The answer to 1 + 1 is `r 1 + 1`
```

You get:

> The answer to 1 + 1 is 2

???

You start with the letter `r` to indicate that R code is contained within the backticks.

---

# Body text: inline code (stored values)

Let's say `my_value <- "Lotad"`

You type:

```r
The best Pokemon is `r my_value`
```

???

Your choice of Pokemon is *wrong*.
<img src="https://cdn.bulbagarden.net/upload/thumb/2/2c/Lotad_PMRS.png/200px-Lotad_PMRS.png" width="75%" alt="The mighty Pokemon Lotad is held aloft by an excited fellow">

You get:

> The best Pokemon is Lotad

---

# Code chunks: input

````
```{r chicks, echo=FALSE}
chickwts %>%
  group_by(feed) %>%
  ggplot() + geom_col(aes(x = feed, y = weight))
```
````

Don't worry about the code (it assumes the `dplyr` and `ggplot2` packages are already loaded), just know that it's going to produce something.

---

# Code chunks: output

![Bar chart of chick weight by feed type](index_files/figure-html/chicks-1.png)

---

# Simple steps

1. In RStudio: File > New File > R Markdown
2. Write your text and embed your code
3. Click 'Knit' to render the document
4. The file is output to the same folder as your .Rmd file

**DEMO TIME!**

---

# To summarise

<img src="img/knit-button.jpg" width="70%" alt="RStudio window with 'Knit' button highlighted and content of R Markdown tab shown">

---

# What's actually happening?

Your R Markdown is rendered into 'plain' markdown by the package `knitr` (which is why you 'knit' to render the document), then something called `pandoc` converts it from markdown to your output format.

.Rmd ➡️ `knitr` ➡️ .md ➡️ `pandoc` ➡️ .html/.pdf/.docx

This isn't *essential* knowledge.

---

# Further reading

* [R Markdown: The Definitive Guide](https://bookdown.org/yihui/rmarkdown/) by Yihui Xie, JJ Allaire and Garrett Grolemund
* In RStudio: Help > Cheatsheets for hints and tips.
* [Introduction to R Markdown](https://rmarkdown.rstudio.com/lesson-1.html) (webpages featuring video)
* [Getting started with R Markdown](https://www.rstudio.com/resources/webinars/getting-started-with-r-markdown/) (video, 1 hr)
* [See the R Markdown gallery for examples of other document types](https://rmarkdown.rstudio.com/gallery.html), including dashboards, interactive documents, presentations (like this one) and books with R Markdown
* '[Knitting club](https://matt-dray.github.io/knitting-club/)', a Coffee & Coding session (webpage)
* A workflow with R Markdown works much better using an [R Project](https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects) and [version control](http://happygitwithr.com/), which are subjects for another day
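
---

# Appendix: the `knitr` step in code

For the curious, the first half of the .Rmd ➡️ `knitr` ➡️ .md pipeline can be tried in the R console without clicking 'Knit'. This is only a minimal sketch, assuming the `knitr` package is installed; the names `rmd_source` and `md_output` are made up for illustration. It feeds the sentence from the inline code slide to `knitr::knit()` and prints the plain markdown that comes back.

```r
library(knitr)

# One line of R Markdown source, with inline R code wrapped in backticks.
# (rmd_source is a hypothetical name used only for this example.)
rmd_source <- "The answer to 1 + 1 is `r 1 + 1`"

# knit() runs the embedded R code and, because the input is supplied via
# `text` rather than a file, returns the resulting markdown as a string.
md_output <- knit(text = rmd_source, quiet = TRUE)

# Print the markdown; compare it with the 'You get' line on the inline code slide.
cat(md_output, "\n")
```

The `pandoc` half of the pipeline isn't shown here. In everyday use the 'Knit' button (or `rmarkdown::render()`) runs both steps for you.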