Reference management

How to use citations and include your bibliography in R Quarto.
module 1
week 3
Quarto
authoring
BibTeX
programming
Author
Affiliation

School of Life Sciences, University of Hawaii

Published

January 31, 2023

Pre-lecture materials

Read ahead

Before class, you can prepare by reading the following materials:

  1. Authoring in R Markdown from RStudio
  2. Citations from Reproducible Research in R from the Monash Data Fluency initiative
  3. Bibliography from R Markdown Cookbook

Acknowledgements

Material for this lecture was borrowed and adapted from

Learning objectives

At the end of this lesson you will:

  • Know what types of bibliography file formats can be used in an R Markdown file
  • Learn how to add citations to a Quarto file
  • Know how to change the citation style (e.g., APA, Chicago, Evolution)

Introduction

Your citation list is a critical aspect of research, as all research builds on the contributions of prior work. One of the most tedious tasks in science is hand-editing bibliographies, and one of the great benefits of the LaTeX environment is the automated formatting of references for bibliographies. 🎊

It’s worth taking a little time to learn how to use these tools, as citations are critical for every paper, project report, or academic website, and are often required even in paper assignments. Since R Markdown and Quarto work nicely with LaTeX (and BibTeX, the associated reference management software), we get a bonus: access to the power of BibTeX without writing any LaTeX, making it easy to add citations to our Quarto markdown projects.

Quarto documentation currently doesn’t include citations, but Quarto seems to support all the features present in R Markdown. We will be using .bib format references in this tutorial, but R Markdown (and presumably Quarto) supports other citation formats as well (for more info see).

The Parts

There are three basic parts:

  1. The citation data, which is stored in the .bib file (Section 4),
  2. In-text citations included by the author in the .qmd document (Section 7), and
  3. Linking the .bib file in the YAML header (Section 6).

From this, both the in-text citations as well as the citation list at the end of the document will be rendered.

There are additional customization options. You can change the style of the bibliography using style and class files (Section 8). Because this is set at the level of the entire document, it is easy to switch. Within the document, there are also many options for in-text citation styles (Section 9). You can also customize the style file by editing the LaTeX.

Reference managers are helper software (independent of BibTeX) that help you collect and organize your citation data (Section 5).

The Bibliography is in .bib files

The citation data are stored in a .bib file, which is a plain text file. Most people don’t type these entries themselves (unless it’s for a new publication!) but rather download them from journal databases such as Web of Science or PubMed, directly from journal websites, etc.

R provides a nice function, citation(), that generates the citation information for R and for R packages. Let’s try generating citation text for the R environment by using the following command:

citation("base")

To cite R in publications use:

  R Core Team (2022). R: A language and environment for statistical
  computing. R Foundation for Statistical Computing, Vienna, Austria.
  URL https://www.R-project.org/.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2022},
    url = {https://www.R-project.org/},
  }

We have invested a lot of time and effort in creating R, please cite it
when using it for data analysis. See also 'citation("pkgname")' for
citing R packages.

You can replace “base” with any R package name, such as “rmarkdown” or my package “ouch”. You can see that the format is simple: the information is stored as key = value pairs, and each part has a tag.

Multiple citations are stored in your .bib file, with each citation separated by a blank line.
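To make the key = value structure concrete, here is a minimal sketch that builds one entry with bibentry() from R’s built-in utils package and converts it to BibTeX text with toBibtex(). The entry itself (the key Doe:2023, the author, title, and journal) is a made-up placeholder for illustration, not a real publication.

```r
# Build a BibTeX entry from R.
# All field values below are placeholders (assumptions), not a real reference.
entry <- utils::bibentry(
  bibtype = "Article",
  key     = "Doe:2023",                 # citation key, later cited as @Doe:2023
  author  = utils::person("Jane", "Doe"),
  title   = "A placeholder title",
  journal = "Journal of Placeholder Studies",
  year    = "2023"
)

# toBibtex() returns the entry as lines of BibTeX text
bib_lines <- utils::toBibtex(entry)
print(bib_lines)

# Append the entry, followed by a blank separator line, to a .bib file.
# A temporary file is used here so nothing in your project is overwritten.
bib_file <- tempfile(fileext = ".bib")
cat(as.character(bib_lines), "", file = bib_file, sep = "\n", append = TRUE)
```

In the same way, toBibtex(citation("base")) prints only the BibTeX part of the citation() output shown above.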

Reference Managers

Personally, I really like BibDesk (https://bibdesk.sourceforge.io) for macOS to manage my citations. It is easy to use and has all the features I need. A cross-platform option that is popular now is JabRef (https://github.com/JabRef/jabref). JabRef is written in Java, is open source, and supports Linux, Windows, and macOS.

Citation management software

In addition to .bib (BibTeX) there are a lot of file formats in use including .medline (MEDLINE), .ris (RIS), and .enl (EndNote), among others. You can generally download the results of your literature search in the format of your choice (some citation manager software can convert formats as well).

If you recall the output from citation() above, one option is to copy and paste the BibTeX output into a text file with a .bib extension or into citation management software. Instead, we can use the write_bib() function from the knitr package to create a bibliography file.

Let’s run the following code in order to generate a my-refs.bib file

knitr::write_bib("rmarkdown", file = "my-refs.bib")

You can output multiple citations by passing a vector of package names:

knitr::write_bib(c("rmarkdown","base"), file = "my-refs.bib")

Now we can see we have the file saved locally.

[1] "ButlerPapers.bib" "evolution.csl"    "index 2.html"     "index.qmd"       
[5] "index.rmarkdown"  "my-refs.bib"     

If you open up the my-refs.bib file, you will see

@Manual{R-base,
  title = {R: A Language and Environment for Statistical Computing},
  author = {{R Core Team}},
  organization = {R Foundation for Statistical Computing},
  address = {Vienna, Austria},
  year = {2022},
  url = {https://www.R-project.org/},
}

@Manual{R-rmarkdown,
  title = {rmarkdown: Dynamic Documents for R},
  author = {JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang and Richard Iannone},
  year = {2021},
  note = {R package version 2.8},
  url = {https://CRAN.R-project.org/package=rmarkdown},
}

@Book{rmarkdown2018,
  title = {R Markdown: The Definitive Guide},
  author = {Yihui Xie and J.J. Allaire and Garrett Grolemund},
  publisher = {Chapman and Hall/CRC},
  address = {Boca Raton, Florida},
  year = {2018},
  note = {ISBN 9781138359338},
  url = {https://bookdown.org/yihui/rmarkdown},
}

@Book{rmarkdown2020,
  title = {R Markdown Cookbook},
  author = {Yihui Xie and Christophe Dervieux and Emily Riederer},
  publisher = {Chapman and Hall/CRC},
  address = {Boca Raton, Florida},
  year = {2020},
  note = {ISBN 9780367563837},
  url = {https://bookdown.org/yihui/rmarkdown-cookbook},
}

Note three of these keys, which we will use later on:

  • R-rmarkdown
  • rmarkdown2018
  • rmarkdown2020
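If a .bib file grows large, it can help to list its keys from R rather than scanning the file by eye. The sketch below uses only base R. The helper name extract_bib_keys() is hypothetical (it is not part of knitr or any other package), and it assumes each entry starts on its own line in the form @Type{key, as in the file above.

```r
# Write a small example .bib file to a temporary location.
# The two entries are shortened copies of the ones shown above.
bib_path <- tempfile(fileext = ".bib")
writeLines(c(
  "@Manual{R-base,",
  "  title = {R: A Language and Environment for Statistical Computing},",
  "  year = {2022},",
  "}",
  "",
  "@Book{rmarkdown2018,",
  "  title = {R Markdown: The Definitive Guide},",
  "  year = {2018},",
  "}"
), bib_path)

# Hypothetical helper (not from any package): return the citation keys
extract_bib_keys <- function(path) {
  lines <- readLines(path, warn = FALSE)
  # An entry starts with "@Type{key," so keep only those lines
  entry_lines <- grep("^\\s*@\\w+\\{", lines, value = TRUE, perl = TRUE)
  # Capture the text between "{" and the first comma
  sub("^\\s*@\\w+\\{\\s*([^,\\s]+)\\s*,.*$", "\\1", entry_lines, perl = TRUE)
}

keys <- extract_bib_keys(bib_path)
print(keys)
```

This simple pattern does not handle special blocks such as @Comment or @String, or keys placed on a separate line from the entry type.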

Linking .bib file with .rmd or .qmd files

In order to use references within an R Markdown file, you will need to specify the name and location of a bibliography file using the bibliography metadata field in the YAML metadata section. For example:

---
title: "My top ten favorite R packages"
output: html_document
bibliography: my-refs.bib
---

You can include multiple reference files using the following syntax. Alternatively, you can concatenate two .bib files into one.

---
bibliography: ["my-refs1.bib", "my-refs2.bib"]
---
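Concatenating .bib files can be done in any text editor, but here is a minimal base R sketch of the idea. The two input files are tiny placeholders written to temporary locations so the example is self-contained; in practice you would point to your own files, such as my-refs1.bib and my-refs2.bib.

```r
# Two small placeholder .bib files stand in for my-refs1.bib and my-refs2.bib
refs1 <- tempfile(fileext = ".bib")
refs2 <- tempfile(fileext = ".bib")
writeLines(c("@Manual{R-base,", "  year = {2022},", "}"), refs1)
writeLines(c("@Book{rmarkdown2018,", "  year = {2018},", "}"), refs2)

# Read each file and add a blank line after it so entries stay separated
combined <- unlist(lapply(c(refs1, refs2), function(path) {
  c(readLines(path, warn = FALSE), "")
}))

# Write everything into one file that the YAML header can point to
all_refs <- tempfile(fileext = ".bib")
writeLines(combined, all_refs)
```

Note that this sketch does not check for duplicate keys across the two files, so it is worth confirming that each key is unique after merging.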

Inline citation

Now we can start citing the bib keys we just saw, using the following syntax:

  • [@key] for a single citation
  • [@key1; @key2] for multiple citations, separated by semicolons
  • [-@key] to suppress the author name and display just the year
  • [see @key1 p 12; also this ref @key2] is also valid syntax, with extra text around the keys

Let’s start by citing the rmarkdown package and two books, then press the Knit button. The rendered text looks like this:


I have been using the amazing Rmarkdown package (Allaire et al. 2022)! These look like great books to read (Xie, Allaire, and Grolemund 2018; and Xie, Dervieux, and Riederer 2020).


Pretty cool, huh??
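The rendered sentence above would come from source text along the following lines. This is a sketch, not the exact source of this page: the file name citation-demo.qmd, the title, and the last sentence are placeholders, while the citation keys are the ones from my-refs.bib. The R code simply writes the document to a temporary directory so you can inspect it.

```r
# Source text for a small Quarto document. The file name and title are
# placeholders; the citation keys come from my-refs.bib shown earlier.
qmd_lines <- c(
  "---",
  "title: \"Citation demo\"",
  "format: html",
  "bibliography: my-refs.bib",
  "---",
  "",
  "I have been using the amazing Rmarkdown package [@R-rmarkdown]!",
  "These look like great books to read [@rmarkdown2018; and @rmarkdown2020].",
  "",
  "The definitive guide appeared in [-@rmarkdown2018]."
)

# Write the document to a temporary directory
qmd_path <- file.path(tempdir(), "citation-demo.qmd")
writeLines(qmd_lines, qmd_path)
```

To try it, place my-refs.bib in the same directory as the .qmd file, then render it with the Render button in RStudio or with quarto render citation-demo.qmd on the command line.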

Citation styles

By default, Pandoc will use a Chicago author-date format for citations and references.

To use another style, specify a CSL (Citation Style Language) file in the csl metadata field. In ecology and evolution, for example, we often use the evolution.csl style; the example below uses biomed-central.csl:

---
title: "My top ten favorite R packages"
output: html_document
bibliography: my-refs.bib
csl: biomed-central.csl
---

Resources

The Zotero Style Repository makes it easy to search for and download your desired style. Just save it in the same directory as your .qmd file.

CSL files also can be tweaked to meet custom formatting requirements. For example, we can change the number of authors required before “et al.” is used to abbreviate them. This can be simplified through the use of visual editors such as the one available at https://editor.citationstyles.org.

Other cool features

Add an item to a bibliography without using it

By default, the bibliography will only display items that are directly referenced in the document. If you want to include items in the bibliography without actually citing them in the body text, you can define a dummy nocite metadata field and put the citations there. They will be included in the references cited at the end, even though there is no in-text citation.

---
nocite: |
  @item1, @item2
---

Add all items to the bibliography

If we do not wish to explicitly state all of the items within the bibliography but would still like to show them in our references, we can use the following syntax:

---
nocite: '@*'
---

This will force all items to be displayed in the bibliography.

Resources

You can also have an appendix appear after the bibliography. For more on this, see:

A Comment

We have learned that within your bibliography file (e.g. my-refs.bib), each reference gets a key, which is a shorthand that is generated by the reference manager or by yourself.

For instance, I use an author:year format (for example, Butler:2023) because it’s easy for me to remember the literature that I want to cite that way. It makes it easier to write the content.

When there are multiple citations in a year by the same author last name, my BibDesk software adds two unique letters after the year (I usually manually change it to one letter, i.e., Butler:2023a) so that each citation has a unique key.

You can choose any format you wish, just pick one and use it forever. As you write more papers, you can easily copy citations into your different bibliographies. You are building a database that is reusable.

In your R Markdown document, you can then cite the reference by adding the key, such as ...in the paper by Butler et al. [@Butler:2022]....
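The author:year convention described above, including the letter suffix for repeated author and year combinations, can be sketched in a few lines of base R. The function name make_cite_keys() is hypothetical, and the author names and years in the example call are illustrative only. This is not how BibDesk generates its keys, just one way to follow the same convention.

```r
# Hypothetical helper: build author:year keys, adding a, b, c, ... only when
# the same author and year combination occurs more than once
make_cite_keys <- function(last_names, years) {
  base_keys <- paste0(last_names, ":", years)
  # TRUE for every key that appears more than once
  is_dup <- base_keys %in% base_keys[duplicated(base_keys)]
  # Position of each key within its group of identical keys (1, 2, ...)
  position <- ave(seq_along(base_keys), base_keys, FUN = seq_along)
  ifelse(is_dup, paste0(base_keys, letters[position]), base_keys)
}

keys <- make_cite_keys(c("Butler", "Butler", "Xie"), c(2023, 2023, 2018))
print(keys)
```

The sketch only has 26 letters to work with, so it assumes no author has more than 26 entries in a single year.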

Post-lecture materials

Practice

Here are some post-lecture tasks to practice some of the material discussed.

Questions

Try out the following:

  1. What do you notice that’s different when you run citation("tidyverse") (compared to citation("rmarkdown"))?

  2. Install the following packages:

install.packages(c("bibtex", "RefManageR"))

What do they do? How might they be helpful to you in terms of reference management?

  3. Practice using a different CSL file to change the citation style.

References

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2022. rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown.
Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.