In case you don’t know me personally or this is the first article you’ve read by me, I study biostatistics as a full-time job. I’ve spent the last two and a half years studying this stuff, but I have not quite nailed down how to describe what I do to friends and family. With finals breathing down my neck this week, I don’t have the same amount of time to dedicate to a Journal Club article or to describing a statistical concept. So instead, I’ll step back even further and have an honest go at answering this question. At the end of it all, I hope I can have a simple ten-word sentence that I can just parrot whenever I’m asked.
Statistics vs Biostatistics?
First off, is there any meaningful difference between statisticians and biostatisticians? From my perspective, not really. The main difference lies in the type of data that these two positions work with. Whereas a statistician might work with many types of data in general, a biostatistician will be more focused on health-oriented data. The human and ethical elements of clinical trials and health bring some extra considerations that may not be relevant to something like, say, economic or poll data. In the end, both deal with data to answer research questions; the questions just might be different.
That being said, the skill sets are almost identical from my point of view. Some biostatistics departments have their own classes, especially if that university has its own school of public health (e.g., Columbia or Johns Hopkins). Other universities may actually guide their biostatistics students to take classes with the statistics students themselves, who may be more associated with the mathematics department (e.g., UCSD). In this article, I’ll refer to biostatisticians, but you can just as easily substitute the term “statistician” instead and it will be 95% correct.
I think that one of the easiest ways to understand what a biostatistician does is to see what they write. If you’ve ever gone to the PubMed website to look for different research manuscripts, then you’ve most likely encountered a familiar structure that each paper follows: Introduction, Methods, Results, and Discussion.
When I was an undergraduate reading through these papers, I thought that the Methods section was the least informative section of the paper. Get to the results already, and tell me why the paper is important!
Little did I know that I would be going to grad school for 6+ years to be the literal person who writes the Methods section of the research manuscript. The Methods section is where the researcher/biostatistician describes how the data was analyzed. The “how” involves exciting decisions such as: what statistical test should I use to answer my question? How did I analyze the data? Was the data processed in a special way that I should know about?
More often than not, it will be biostatisticians that actually perform the analysis in the first place. The primary researcher is in charge of creating the research question since they most likely have the expert perspective on the particular biology/question/study area. This doesn’t necessarily mean they will know the correct way to analyze the data in a way that best answers the questions they come up with. That is the bread and butter of the biostatistician.
Enabling Good Science
In last week’s issue, I discussed how biostatisticians often go into consultation and help researchers with their research problems. These problems run the gamut from “help me design this study” to “help me analyze this data”. What all of these problems have in common is that they’re intended to make the resulting manuscript as strong as possible so that it has a chance to be published.
For the uninitiated, professors live and die by their research output, measured as the number of articles that they publish. These papers don’t just get submitted and end up in a journal; they must first be vetted and criticized by peers in the field. If the prospective paper has shaky methodology or a questionable research aim, the peer review is meant to detect this and help a researcher improve their paper. This is the essence of good science: if someone can poke holes in your research, then you should address these weaknesses.
Biostatistical consulting is there to help researchers address these weaknesses (hopefully) in advance or (realistically) perform triage when the peer reviews come back. Sometimes researchers will use an analysis tool (like a t-test or linear regression) to answer a question when it might not be appropriate for their unique situation. A biostatistician is trained to be familiar with the nuances of different research situations and offer the most suitable course of action.
You might think of a biostatistician as the person with a metaphorical toolkit. Each tool is good for a particular analytic situation. There is one for randomized trials, one for observational studies, one for causal inference, and the list goes on and on. A biostatistician might see a researcher using an incorrect “tool” on their data and suggest a better one.
At this point, you might wonder if there’s any difference between a biostatistician and a glorified data analyst. My experiences so far have come from a sort of student-apprentice perspective; most of the time, I’m there to hear about the intent of the study and perform the analysis itself. In consultation, I acted as a sort of data nit-picker.
It’s the bona fide PhDs who perform the tasks that really help distinguish a biostatistician from a mere data analyst. In some cases, a biostatistician will help design an experiment. Good experiment design is really concerned with questions like:
How many people should we recruit?
What things should we measure? Do these things really capture a clinical variable of interest?
Should we make the experiment adaptive as opposed to fixed?
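To make the first of these questions a little more concrete, here is a minimal sketch of a sample size calculation for a trial comparing the means of two equally sized groups. It uses the common normal-approximation formula, n per group = 2 * ((z_(1 - alpha/2) + z_power) / d)^2, where d is the standardized effect size. The function name, the default significance level of 0.05, and the default power of 0.80 are my own illustrative assumptions, not a prescription for any real study.

```python
import math
from statistics import NormalDist


def sample_size_per_group(effect_size: float,
                          alpha: float = 0.05,
                          power: float = 0.80) -> int:
    """Approximate number of participants needed in EACH of two groups.

    Assumes a two-sided test comparing two means, equal group sizes, and a
    normal approximation. `effect_size` is the standardized difference
    (difference in means divided by the common standard deviation).
    Hypothetical helper for illustration only.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1) or not (0 < power < 1):
        raise ValueError("alpha and power must be between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for desired power

    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)  # round up: you cannot recruit a fraction of a person


if __name__ == "__main__":
    # A "medium" standardized effect versus a "small" one
    print(sample_size_per_group(0.5))
    print(sample_size_per_group(0.2))
```

The point of the sketch is the trade-off rather than the exact numbers: the smaller the effect you hope to detect, the more people you need to recruit. A real design would also account for things like dropout, unequal group sizes, and the specific test being used.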
Experiment design is interesting enough to warrant its own articles, so I won’t delve too deeply here. It’s important to note that designing experiments is distinct from actually performing the experiments themselves. Statisticians have an indispensable role in research, so much so that the Harvard Business Review published a famous article calling data scientist the “sexiest job” of the 21st century. Data scientist and biostatistician are not synonymous, but towards the end of the article, we see:
Hal Varian, the chief economist at Google, is known to have said, “The sexy job in the next 10 years will be statisticians.”
Articles like these are certainly helpful for convincing family members I’m pursuing a good degree, but they don’t quite capture what I do on a daily basis.
Finally, The Sentence
You: So, what do you do as a biostatistician?
Me: I help design and run experiments by telling scientists the best way to approach them.