In this issue, I’d like to talk about my research. Yesterday, I did my advancement to candidacy exam and locked in the three ideas that I’ve committed to researching and publishing. Thankfully, I passed!
This means that I’m no longer a Ph.D. student, but a “Ph.D. candidate”. All that stands between me and my doctorate is my dissertation. I thought it would be a great idea to tell my subscribers what I do outside of YouTube and Substack. It also helps that I was so slammed for time that I couldn’t write another issue earlier in the week.
I’ll do my best to make it as accessible as possible! Let’s get started.
My general area of interest
I’m interested in a specific type of clinical trial: the N-of-1 design. Here’s a schematic of how the design works.
As their name suggests, N-of-1 trials are trials that focus on a single individual. Instead of just taking a few observations from this individual, the trial switches the treatment that this person is on and follows them over time. This is referred to as multiple cross-over.
Data is collected where a person is on either some treatment “A” or another treatment “B”. One of these is usually a placebo or standard of care. By comparing the observations on one treatment to another, we can estimate an individual treatment effect. Not only is this an individual effect, but if the treatments are randomized and blinded, then this effect can be thought of as causal. That’s huge.
Therefore, N-of-1 trials are trials that can help guide individuals to better treatment that works for them. I learned about this type of clinical trial in my Master’s degree, and I decided to pursue them in my Ph.D.
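To make the idea of an individual treatment effect concrete, here is a minimal Python sketch. It simulates one person's outcomes over several blocks in which the order of treatments "A" and "B" is randomized, then estimates the effect as a simple difference in means. The function names, the block structure, and the noise model are all assumptions made for illustration. A real analysis would also need to handle carryover between periods and correlation over time, which this sketch ignores.

```python
import random
import statistics


def simulate_n_of_1(true_effect, n_blocks=6, obs_per_period=5, noise_sd=1.0, seed=0):
    """Simulate one person's outcomes across randomized A/B blocks.

    Hypothetical setup: treatment A has mean outcome 0 and treatment B
    has mean outcome `true_effect`, with independent Gaussian noise.
    """
    rng = random.Random(seed)
    records = []  # list of (treatment, outcome) pairs
    for _ in range(n_blocks):
        # Within each block, the order of A and B is randomized.
        order = ["A", "B"]
        rng.shuffle(order)
        for treatment in order:
            for _ in range(obs_per_period):
                mean = true_effect if treatment == "B" else 0.0
                records.append((treatment, rng.gauss(mean, noise_sd)))
    return records


def estimate_effect(records):
    """Estimate the individual treatment effect as mean(B) minus mean(A)."""
    a = [y for t, y in records if t == "A"]
    b = [y for t, y in records if t == "B"]
    return statistics.mean(b) - statistics.mean(a)


records = simulate_n_of_1(true_effect=2.0)
estimate = estimate_effect(records)
print(round(estimate, 2))
```

With more blocks or more observations per period, the estimate should settle closer to the simulated true effect.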
Aim 1: Additive Bayesian Networks
This project started when I first came into the program. It has evolved a lot since then. I’m not super proud of it because my research skills weren’t as developed at the time, but it’s still work I’ve done.
The context that this project takes place in is observational N-of-1 designs. These are variations of N-of-1 trials that do not have an experimental component (aka the thing that researchers control). Instead, the goal of this variant is to observe how one or several variables evolve and interact over time in an individual. Think physical activity or moment-to-moment emotions.
With these designs, there are several exposure-outcome relationships that a researcher might want to study simultaneously. But it may not be known ahead of time which exposures should be used for each outcome.
What I wanted to do was to suggest a new type of model to be used for this type of data. The answer I eventually arrived at was the Additive Bayesian Network.
An additive Bayesian network is a set of (generalized) linear models that are subject to the constraints of a directed, acyclic graph, or DAG. In a DAG, nodes represent variables, while edges represent exposure-outcome relationships.
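As a rough illustration of that definition, the Python sketch below stores a small network as a mapping from each variable to its parents and their coefficients, checks that the graph is acyclic, and simulates data by evaluating one linear model per node in topological order. The variable names (sleep, activity, mood) and the coefficients are hypothetical. This only shows the structure of the model; it does not learn the edges from data.

```python
import random
from graphlib import CycleError, TopologicalSorter

# Hypothetical network: each node maps to {parent: coefficient}.
parents = {
    "sleep": {},
    "activity": {"sleep": 0.5},
    "mood": {"sleep": 0.3, "activity": 0.8},
}


def topological_order(parents):
    """Return the nodes with parents first; raise ValueError if there is a cycle."""
    graph = {node: set(p) for node, p in parents.items()}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        raise ValueError("graph is not a DAG") from err


def simulate(parents, n, seed=0):
    """Draw n rows, where each node is a linear function of its parents plus noise."""
    rng = random.Random(seed)
    order = topological_order(parents)
    rows = []
    for _ in range(n):
        row = {}
        for node in order:
            # Linear predictor: weighted sum of the already-simulated parents.
            linear_predictor = sum(coef * row[p] for p, coef in parents[node].items())
            row[node] = linear_predictor + rng.gauss(0.0, 1.0)
        rows.append(row)
    return rows


order = topological_order(parents)
rows = simulate(parents, n=100)
```

The acyclicity check is what makes the simulation possible: every node's parents are guaranteed to have values before the node itself is computed.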
There are several algorithms available that try to learn the edges in a network of variables, given some data. I tried to show that this type of model performs the best among other plausible candidates on observational N-of-1 data and then applied it to an actual real-world dataset. In short, the novelty here is that this type of model has not been applied to this type of dataset. Its value is that it makes it easier to discover possibly interesting trends in your data that can be pursued in future experiments.
Aim 2: The Platform-of-1 Design
This project was inspired by previous work that my advisor had done with her postdoc. I was looking for a project that I could incrementally extend so that I could practice dealing with the statistical considerations of a small change.
This project proposes a new clinical trial design. In other words, I am proposing a plan for how data should be collected such that it accomplishes a given goal. The goal of a randomized controlled trial is to demonstrate the comparative effectiveness of a new treatment against current standards. The goal of this project is to identify an optimal treatment for an individual among a set of possibly many candidates. Here’s a schema of my proposed design:
I call this design the “Platform-of-1” because it combines elements of a platform trial and an N-of-1 trial together. It allows for interim decisions and adaptive randomization to more quickly identify the optimal treatment. The novelty here is that most N-of-1 trials are not designed with multiple treatments in mind, and the value is that it provides recommendations for how someone should plan a trial around them.
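To give a feel for what adaptive randomization among several treatments can look like, here is a minimal Python sketch that uses Thompson sampling with a binary outcome. This is a generic technique chosen for illustration, not the actual Platform-of-1 design: the binary outcome, the Beta(1, 1) priors, the success probabilities, and the function name are all assumptions.

```python
import random


def thompson_trial(success_probs, n_periods=200, seed=1):
    """Assign one treatment per period by Thompson sampling.

    Hypothetical setup: each treatment yields a binary success with a
    fixed probability, and each arm starts with a Beta(1, 1) prior.
    """
    rng = random.Random(seed)
    k = len(success_probs)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_periods):
        # Draw once from each arm's Beta posterior and pick the largest draw.
        draws = [rng.betavariate(1 + successes[i], 1 + failures[i]) for i in range(k)]
        arm = draws.index(max(draws))
        # Observe a simulated binary outcome for the chosen treatment.
        if rng.random() < success_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


successes, failures = thompson_trial([0.3, 0.5, 0.8])
allocations = [s + f for s, f in zip(successes, failures)]
print(allocations)
```

Treatments that look better get assigned more often as evidence accumulates, which is the basic reason adaptive randomization can identify a good treatment more quickly than a fixed allocation.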
Aim 3: A Multivariate N-of-1 Trial
This last project is an extension of my Aim 2, but this time, the changes I’m suggesting are rather large. I wanted this project to stretch my programming abilities and give me more experience implementing statistical models from scratch. I consider this to be one of the skills I’m sorely lacking as a biostatistician, in addition to mathematical theory. The idea behind this project is to explicitly use and plan for multiple primary outcomes in an N-of-1 setting. For context, most trials are planned around a single primary outcome, meaning that things like sample size calculations are centered on it. As a result, secondary outcomes may not be adequately powered.
But most diseases and treatments can be characterized in multiple ways. By considering this information in the design process, my thinking is that it will improve the efficiency (read: smaller sample sizes) of the design. Furthermore, it allows individuals to specify their own goals and desired outcomes that trial designers can work with.
I’ve only done a literature review for this project, so I have no interesting images to show you. Using multiple, possibly mixed, outcomes is a completely new topic to me. None of my coursework has directly addressed multiple outcomes like this, so it’s a much higher intellectual lift than the previous two projects. But I think it’s doable in a year and a half, and I have confidence in my ability to implement the ideas I see in code.
Last Remarks
There’s lots of detail I needed to leave out of these descriptions. I tried to focus on the precise problem I was trying to solve, rather than the solution that I came up with. I hope you find this interesting. If you plan to steal my idea, then go for it. We’ll probably both produce some cool stuff along the way.
See you next week.
Christian
😵‍💫 What am I working on right now?
Still working on a video about correlation and causation. This should be published next week.
💀 Moving 💀
🧐 What am I enjoying right now?
Books: Charles Duhigg (author of The Power of Habit) wrote a new book, so I put everything else down to start listening to it. The new book is called Supercommunicators (affiliate link), and I’ve liked it so far. Good communication is a keystone skill for biostatisticians and for educating on YouTube, so I’m always ready to learn something new here.
📺 Recent videos
Explaining Confidence Intervals and The Critical Region: an explainer video on the critical region/confidence interval method of making decisions for hypothesis tests. A (better) alternative to Fisher’s p-value method, and a necessity for understanding standard statistical analyses in research.
Since the research is heavily based on statistics and probability, we have frequentist and Bayesian statistics. I have always believed that these two schools are outdated. The reason is that frequentist statistics focuses on only one element (factor), which is frequency, while Bayesian statistics is kind of an add-on to the frequentist approach. My idea is: what if there are many elements (factors) that decide the probability of an event, but we have just relied on one factor, which is frequency? In the old days, when people discovered something, they stuck to it and didn’t try to change the roots of the theory, such as by considering factors other than frequency. Come to think of it, it does not seem very convincing that what decides the probability of an event is just how many times it occurs, though it is a working theory.