{a11ytables}

#EarlConf, September 2022
NHS-R, January 2023

Matt Dray, Civil Service

tl;dr

Generate best-practice spreadsheets with help from {a11ytables}


co-analysis.github.io/a11ytables

github.com/co-analysis/a11ytables

QR code that points to the documentation website of the R package 'a11ytables'.

Ahoy

  • R for reproducibility (official)
  • Off-piste R (personal)

@mattdray@fosstodon.org

matt-dray.com

16 by 16 pixelated image of the author's face, which has big eyebrows and a goatee.

An opportunity

It looks like you’re trying to improve a spreadsheet.

An unofficial version of the Microsoft Office mascot Clippy, a paperclip with eyes.

It looks like you’re trying to improve a spreadsheet.

An unofficial version of the Microsoft Office mascot Clippy, a paperclip with eyes.

An unofficial version of the Microsoft Office mascot Clippy, a paperclip with eyes.

An unofficial version of the Microsoft Office mascot Clippy, a paperclip with eyes.

An unofficial version of the Microsoft Office mascot Clippy, a paperclip with eyes.

Improve

  1. User experience
  2. Producer experience

User experience

Producer experience

Tool requirements

  1. Simple
  2. Opinionated
  3. Compliant
  4. Known

Fresh eggs

Install

install.packages("remotes")

remotes::install_github(
  "co-analysis/a11ytables",
  build_vignettes = TRUE
)

library(a11ytables)

Workflow

create_a11ytable() |>
  generate_workbook() |>
  openxlsx::saveWorkbook()

Workflow

create_a11ytable(
  tab_titles, sheet_types, sheet_titles, # required char vectors
  blank_cells, sources,                  # optional char vectors
  tables                                 # required list of dataframes
) |>
  generate_workbook() |>
  openxlsx::saveWorkbook("file.xlsx")

# The package also has an RStudio Addin with templates

Pre-prepared inputs

source("eggs/eggs.R")  # load pre-prepared data objects
ls()                   # print environment objects
[1] "egg_blank_cells"  "egg_sheet_titles" "egg_sheet_types"  "egg_sources"     
[5] "egg_tab_titles"   "egg_tables"      

Pre-prepared inputs

source("eggs/eggs.R")  # pre-prepared data objects
ls()                   # print environment objects
[1] "egg_blank_cells"  "egg_sheet_titles" "egg_sheet_types"  "egg_sources"     
[5] "egg_tab_titles"   "egg_tables"      
egg_sheet_titles[1:4]  # first few sheet titles
[1] "UK egg packing station throughput and prices"
[2] "Contents"                                    
[3] "Notes"                                       
[4] "Table 1: Throughput by country, quarterly"   
class(egg_tables)      # a list of tables
[1] "list"

Combine

create_a11ytable()

Combine

create_a11ytable(
  tab_titles   = egg_tab_titles, 
  sheet_types  = egg_sheet_types, 
  sheet_titles = egg_sheet_titles,
  blank_cells  = egg_blank_cells, 
  sources      = egg_sources,
  tables       = egg_tables
) -> egg_at

# Provides warnings/errors if there's problems

Combine

create_a11ytable(
  tab_titles   = egg_tab_titles, 
  sheet_types  = egg_sheet_types, 
  sheet_titles = egg_sheet_titles,
  blank_cells  = egg_blank_cells, 
  sources      = egg_sources,
  tables       = egg_tables
) -> egg_at

# Provides warnings/errors if there's problems

class(egg_at)
[1] "a11ytable"  "tbl"        "data.frame"

A dataframe!

egg_at
# a11ytable: 9 x 6
  tab_title sheet_type sheet_title                          blank…¹ source table
  <chr>     <chr>      <chr>                                <chr>   <chr>  <nam>
1 Cover     cover      UK egg packing station throughput a… <NA>    <NA>   <df> 
2 Contents  contents   Contents                             <NA>    <NA>   <df> 
3 Notes     notes      Notes                                <NA>    <NA>   <df> 
4 Table_1   tables     Table 1: Throughput by country, qua… <NA>    Quart… <df> 
5 Table_2   tables     Table 2: Throughput by system, quar… <NA>    Quart… <df> 
6 Table_3   tables     Table 3: Average packer to producer… <NA>    Quart… <df> 
7 Table_4   tables     Table 4: Throughput by country, ann… <NA>    Quart… <df> 
8 Table_5   tables     Table 5: Throughput by system, annu… <NA>    Quart… <df> 
9 Table_6   tables     Table 6: Average packer to producer… <NA>    Quart… <df> 
# … with abbreviated variable name ¹​blank_cells

One row is one sheet

str(egg_at[egg_at$tab_title == "Table_2", ] )
Classes 'a11ytable', 'tbl' and 'data.frame':    1 obs. of  6 variables:
 $ tab_title  : chr "Table_2"
 $ sheet_type : chr "tables"
 $ sheet_title: chr "Table 2: Throughput by system, quarterly"
 $ blank_cells: chr NA
 $ source     : chr "Quarterly UK Egg Packing Station Survey."
 $ table      :List of 1
  ..$ qtrly_system:'data.frame':    424 obs. of  4 variables:
  .. ..$ Year                : chr  "1996" "1996" "1996" "1996" ...
  .. ..$ Quarter             : chr  "Q1" "Q1" "Q1" "Q1" ...
  .. ..$ System [note 1]     : chr  "Enriched" "Barn" "Free Range" "Organic" ...
  .. ..$ Eggs (million dozen): chr  "151" "3.2" "16.6" "[z]" ...

Convert to Workbook

egg_wb <- generate_workbook(egg_at)
class(egg_wb)
[1] "Workbook"
attr(,"package")
[1] "openxlsx"

Structured, styled

egg_wb
A Workbook object.
 
Worksheets:
 Sheet 1: "Cover"
 
    Custom row heights (row: height)
     2: 34, 4: 34, 6: 34, 8: 34, 10: 34, 12: 34 
    Custom column widths (column: width)
      1: 72 
 

 Sheet 2: "Contents"
 
    Custom column widths (column: width)
      1: 16, 2: 56 
 

 Sheet 3: "Notes"
 
    Custom column widths (column: width)
      1: 16, 2: 56 
 

 Sheet 4: "Table_1"
 
    Custom column widths (column: width)
      1: 16, 2: 16, 3: 16, 4: 16 
 

 Sheet 5: "Table_2"
 
    Custom column widths (column: width)
      1: 16, 2: 16, 3: 16, 4: 16 
 

 Sheet 6: "Table_3"
 
    Custom column widths (column: width)
      1: 16, 2: 16, 3: 16, 4: 16 
 

 Sheet 7: "Table_4"
 
    Custom column widths (column: width)
      1: 16, 2: 16, 3: 32 
 

 Sheet 8: "Table_5"
 
    Custom column widths (column: width)
      1: 16, 2: 16, 3: 32 
 

 Sheet 9: "Table_6"
 
    Custom column widths (column: width)
      1: 16, 2: 16, 3: 32 
 

 
 Worksheet write order: 1, 2, 3, 4, 5, 6, 7, 8, 9
 Active Sheet 1: "Cover" 
    Position: 1

Write

openxlsx::saveWorkbook(
  egg_wb,
  "eggs/2022-07-28_eggs-packers_test.xlsx",
  overwrite = TRUE
)

Crack it open

An unofficial version of the The Microsoft Office mascot Clippy, a paperclip with eyes, with a red heart above his head.

An unofficial version of the The Microsoft Office mascot Clippy, a paperclip with eyes, with a red heart above his head.

An unofficial version of the The Microsoft Office mascot Clippy, a paperclip with eyes, with a red heart above his head.

An unofficial version of the The Microsoft Office mascot Clippy, a paperclip with eyes, with a red heart above his head.

An unofficial version of the The Microsoft Office mascot Clippy, a paperclip with eyes, with a red heart above his head.

An unofficial version of the The Microsoft Office mascot Clippy, a paperclip with eyes, with a red heart above his head.

But

Not perfect

  • Final tweaks needed
  • Assumes simple spreadsheets
  • Garbage in, garbage out

Clunk!

  • {openxlsx} has limitations
  • Output isn’t ODS
  • Fiddly prep?

Future

  • Bugfixes, requests, user testing
  • YAML input?
  • Converge with gptables

So what?

Forget spreadsheets!

  1. Improve user experience
  2. Make things easier for you
  3. Create common tools

tl;dr

Generate best-practice spreadsheets with help from {a11ytables}

co-analysis.github.io/a11ytables

github.com/co-analysis/a11ytables

matt-dray.com

QR code that points to the documentation website of the R package 'a11ytables'.

Credits

  • Guidance docs: Hannah Thomas, Analysis Function
  • gptables: Rowan Hemsi and contributors
  • {a11ytables} contributors: Tim Taylor, Matt Kerlogue
  • {openxlsx} and {pillar} contributors/maintainers
  • Quarto presentations with Revealjs
  • OpenMoji: paperclip, heart (edited, CC BY-SA 4.0)