Chapter 1 Motivation & TLDR

The purpose of this reference manual is to help early-career trainees in biostatistics to have a place to begin. This work is a reference manual, which means most of its contents are not new. Rather, the novelty in what I am providing here is in its form and organization. During the last couple of years of my PhD program, I set out to gather resources together to help the newer GRAs on my team, with the goal of helping them by sharing the links, lectures, and tips that have helped me the most. What I am publishing here is the result of that work. Although this reference manual is intended to serve graduate students and early career professionals in biostatistics, some of those working in the fields of statistics, data science, and bioinformatics may also find this manual relevant to their work.

The objective of this manual is twofold: to provide starting points from which to address common questions of practice, and to summarize the references and tools which I have found most helpful in my journey through gradate school and my early years as a biostatistician. Overarching these two goals is a desire to make a contribution toward training biostatisticians whose work is professional, accessible, rigorous, and reproducible.

If even one other person is encouraged and sharpened by this work, my goal will be accomplished.

1.1 Context

The meta-question behind this short book is ‘where to begin?’. I started graduate school in August of 2019, coming straight from an undergrad math program 1 to a doctoral program in biostatistics. In the same week that I started classes, I started my work as a graduate research assistant (GRA) in the Division of Biostatistics and Computational Biology at the University of Iowa College of Dentistry. It was a steep learning curve. Now heading into my 6th year as a Ph.D. student, I began compiling some notes/tips to share with the other GRA’s who joined our team after me. Those notes have become this reference manual.

When you are starting off as a GRA in a new field, you need to know what to Google. Yes, a significant component of graduate level study is learning how to learn. You need to struggle with hard concepts and new ideas – the learning is truly in the struggle. At the same time, however, you need to know where to start. This creates a tension: new GRAs need to engage in independent study work, but they also need resources that teach them some first principles to guide their search. If you don’t know what your searching for, you won’t know if you’ve found it. Moreover, if you are a novice who thinks you know what you are searching for, you could easily come across bad advice (by ‘bad’, I mean that which leads to suboptimal outcomes with respect to rigor and reproducibility).

With this tension in mind, the goal of this reference manual is to (1) outline some of the main ideas that you will need to be successful, and (2) to provide you with some vocabulary to help you as you learn independently. Many of the chapters here outline some general concepts and then provide links to further reading.

1.2 What this book assumes

This is a reference manual, not a textbook. As such, my philosophy in writing this is to summarize the most helpful things I have learned in a way that is conducive to quick review. The chapters that are titled after broad areas of research (e.g., ‘survival analysis’) contain more links than new content. I assume throughout this writing that the readers are either familiar with foundational statistical concepts or are currently in courses that are teaching those concepts.

1.3 Acronymns

Throughout this work, I use the following acronyms:

  • GRA: Graduate research assistant
  • PI: Primary investigator
  • CC: Carbon copy (as in an email)
  • UI: University of Iowa
  • HPC: High performance computing
  • SAS: Statistical Analysis System software
  • GWAS: Genome-wide association study

1.4 Working with the garage door open

This manual is not comprehensive, and it is a work in progress. I am writing this as I work my way through my own program, making notes as I go. This is in the spirit of working with the garage door open, an approach to learning that really resonates with me.

If you think of a reference that you’d like to see included, open a GitHub issue or make a pull request to let me know.

1.5 Dedication

This work is dedicated in memory of my father, Bill Peter (1955-2022).


  1. I got a B.S. in math from Wheaton College, IL↩︎