Install

You can install {altcheckr} from GitHub using the {remotes} package.

install.packages("remotes")
remotes::install_github("matt-dray/altcheckr")
library(altcheckr)

Get image elements

Use the alt_get() function to scrape the attributes of each <img> element on a web page that you name in the url argument.

The function uses {xml2} and {rvest} to scrape a given web page and extract image attributes, with a little bit of {purrr} to get it into a data frame.
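As a rough illustration of that approach, the sketch below pulls the attributes of every <img> element out of a small inline HTML snippet and row-binds them into a data frame. It is not the package's actual source: the HTML, the object names (page, img_attrs, img_df) and the use of dplyr::bind_rows() are assumptions made for the example, and it expects {rvest} 1.0.0 or later.

```r
library(rvest)  # read_html() is re-exported from {xml2}
library(purrr)

# A tiny stand-in page so the example runs without a network connection
page <- read_html(
  '<html><body>
     <img src="a.png" alt="A red kite in flight">
     <img src="b.png">
   </body></html>'
)

# One named character vector of attributes per <img> element
img_attrs <- html_attrs(html_elements(page, "img"))

# Turn each attribute vector into a one-row data frame
img_rows <- map(
  img_attrs,
  ~ as.data.frame(as.list(.x), stringsAsFactors = FALSE)
)

# Row-bind; an attribute missing from an element is filled with NA
img_df <- dplyr::bind_rows(img_rows)
img_df
```

The second image has no alt attribute, so its alt value comes out as NA after the rows are bound together.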

get_img <- alt_get("https://www.bbc.co.uk/news")

The function returns a tibble where each row is an image element from that page. The columns are the image source (src), the alt text (alt) and, if it exists, a link to a file with a longer description (longdesc), which is sometimes used for complex images. The alt column is created and filled with NA if it isn’t present.

Setting the argument all_attributes to TRUE will return all the attributes provided in the <img> element, not just src, alt and longdesc.

Here is a preview of the tibble that is output from alt_get():

print(get_img)
#> # A tibble: 50 x 2
#>    src                                    alt                                   
#>    <chr>                                  <chr>                                 
#>  1 https://a1.api.bbc.co.uk/hit.xiti?&co… ""                                    
#>  2 https://ichef.bbci.co.uk/news/320/cps… "Alexei Navalny (centre) is escorted …
#>  3 data:image/gif;base64,R0lGODlhAQABAIA… "Guatemalan soldiers and police beat …
#>  4 data:image/gif;base64,R0lGODlhAQABAIA… "Militia groups gather to protect pro…
#>  5 data:image/gif;base64,R0lGODlhAQABAIA… "Capitol Police officer wearing a MAG…
#>  6 data:image/gif;base64,R0lGODlhAQABAIA… "Police in DC"                        
#>  7 data:image/gif;base64,R0lGODlhAQABAIA… "A traveller passes through O'Hare In…
#>  8 data:image/gif;base64,R0lGODlhAQABAIA… "Members of a rescue team work at the…
#>  9 data:image/gif;base64,R0lGODlhAQABAIA… "Australian player Bernard Tomic pict…
#> 10 data:image/gif;base64,R0lGODlhAQABAIA… "Britain's Health Secretary Matt Hanc…
#> # … with 40 more rows

Check alt text

You can then pass the output of alt_get() to alt_check() to perform a series of basic assessments of each image’s alt text.

(You can also pass any data frame that contains a src and alt column, where alt contains the text to be assessed by alt_check(). For example, {altcheckr} has a built-in dataset: example_get.)
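For instance, a hand-made data frame could be checked like this. The file paths and alt text are made up for illustration, and the call to alt_check() is guarded so the snippet still runs if {altcheckr} is not installed.

```r
# A hand-made data frame with the two required columns; values are made up
my_imgs <- data.frame(
  src = c("images/chart.png", "images/logo.png"),
  alt = c("Bar chart of monthly rainfall in 2020.", ""),
  stringsAsFactors = FALSE
)

# Run the checks only if {altcheckr} is available
if (requireNamespace("altcheckr", quietly = TRUE)) {
  my_checks <- altcheckr::alt_check(my_imgs)
  print(my_checks)
}
```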

check_img <- alt_check(get_img)

This returns the same tibble as alt_get(), with new columns appended.

Each new column is the outcome of a check for a possible accessibility issue with the alt text, such as whether the alt text actually exists and whether it is too long.
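To give a feel for what such checks involve, here is a minimal base R sketch of three of them. It is an illustration, not the package's implementation: the thresholds (min_char, max_char), the "Missing" label and the punctuation pattern are all assumptions chosen for the example.

```r
# Made-up alt text values, including an empty string and a missing value
alt <- c("", "Police in DC", NA,
         "A traveller passes through O'Hare International Airport.")

# Thresholds are assumptions for illustration only
min_char <- 20
max_char <- 125

# Is the alt text present, empty, or missing altogether?
# ("Missing" is a hypothetical label for NA values)
alt_exists <- ifelse(is.na(alt), "Missing",
                     ifelse(alt == "", "Empty", "Exists"))

# Character count, left as NA where there is nothing to count
nchar_count <- ifelse(alt_exists == "Exists", nchar(alt), NA_integer_)

# Label each count as Short, Long or OK
nchar_assess <- ifelse(
  nchar_count < min_char, "Short",
  ifelse(nchar_count > max_char, "Long", "OK")
)

# Does the text end in terminal punctuation?
terminal_punct <- ifelse(alt_exists == "Exists", grepl("[.!?]$", alt), NA)

checks <- data.frame(alt_exists, nchar_count, nchar_assess, terminal_punct)
checks
```

Rows without usable alt text get NA for the later checks, which mirrors how the first row of the output appears in the printed tibble.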

print(check_img)
#> # A tibble: 50 x 10
#>    src   alt   alt_exists nchar_count nchar_assess file_ext self_evident
#>    <chr> <chr> <chr>            <int> <chr>        <lgl>    <lgl>       
#>  1 http… ""    Empty               NA <NA>         NA       NA          
#>  2 http… "Ale… Exists             103 OK           FALSE    TRUE        
#>  3 data… "Gua… Exists             105 OK           FALSE    FALSE       
#>  4 data… "Mil… Exists             185 Long         FALSE    FALSE       
#>  5 data… "Cap… Exists              41 OK           FALSE    FALSE       
#>  6 data… "Pol… Exists              12 Short        FALSE    FALSE       
#>  7 data… "A t… Exists              55 OK           FALSE    FALSE       
#>  8 data… "Mem… Exists             174 Long         FALSE    FALSE       
#>  9 data… "Aus… Exists              72 OK           FALSE    TRUE        
#> 10 data… "Bri… Exists             171 Long         FALSE    FALSE       
#> # … with 40 more rows, and 3 more variables: terminal_punct <lgl>,
#> #   spellcheck <list>, not_basic <list>

Here is the structure of the returned tibble:

dplyr::glimpse(check_img)
#> Rows: 50
#> Columns: 10
#> $ src            <chr> "https://a1.api.bbc.co.uk/hit.xiti?&col=1&from=p&ptag=…
#> $ alt            <chr> "", "Alexei Navalny (centre) is escorted by police in …
#> $ alt_exists     <chr> "Empty", "Exists", "Exists", "Exists", "Exists", "Exis…
#> $ nchar_count    <int> NA, 103, 105, 185, 41, 12, 55, 174, 72, 171, 44, 104, …
#> $ nchar_assess   <chr> NA, "OK", "OK", "Long", "OK", "Short", "OK", "Long", "…
#> $ file_ext       <lgl> NA, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, F…
#> $ self_evident   <lgl> NA, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TR…
#> $ terminal_punct <lgl> NA, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, FA…
#> $ spellcheck     <list> [<>, <"Navalny", "centre", "Khimki">, "Chiquimula", <…
#> $ not_basic      <list> [<>, <"alexei", "navalny", "centre", "escorted", "pol…
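One way to act on these columns is to pick out the rows that deserve a manual look. The sketch below uses a small stand-in data frame with a few of the column names shown above. The values are illustrative, and reading the spellcheck list-column as a vector of flagged words per image is an assumption based on the output.

```r
# Stand-in for a few rows of alt_check() output; values are illustrative
checks <- data.frame(
  alt_exists   = c("Empty", "Exists", "Exists"),
  nchar_assess = c(NA, "Long", "OK"),
  stringsAsFactors = FALSE
)

# List-column: one character vector of flagged words per image
checks$spellcheck <- list(character(0), c("Navalny", "Khimki"), character(0))

# Rows worth reviewing: empty alt text, or alt text assessed as long
needs_review <- checks$alt_exists == "Empty" | checks$nchar_assess %in% "Long"
to_review <- checks[needs_review, ]

# Number of words flagged for each image
checks$n_flagged <- lengths(checks$spellcheck)
```

Using %in% rather than == for nchar_assess keeps NA values from propagating into the row filter.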