Software for analyzing global family-based association studies

Penalized linear mixed models for correlated genetic data with application to orofacial clefts

Summary

Oral cleft research was limited by the inability to fully analyze a complex multinational genetic dataset due to the absence of appropriate statistical tools. My role involved developing a custom statistical methodology capable of handling this intricate data structure. I rigorously tested and refined this method through simulations and then built a robust, open-source R/C++ software package for its practical application. The successful analysis of the dataset with this software provided valuable new insights into the genetics of oral clefts, contributing to the ongoing effort to understand, prevent, and treat these conditions.

Humanizing the project - meeting the participants

On an otherwise unremarkable day of my doctoral work, I recognized something extraordinary about this dataset: Colombia was the most heavily represented country in the study. Yes, Colombia had the most participants, hundreds more than the US, which is astonishing given how little genetic data worldwide represents Latin American populations. Less than 2% of genome-wide association study (GWAS) data represent people of Hispanic or Latin American ancestry. I marveled at how the directors of the grant had managed to collect such rich data from a Colombian population. I wondered: was there a large research hospital in Colombia’s capital that was an exceptionally productive data collection site? It was in pursuing this question that I encountered the work of the Clínica Noel, a relatively small hospital, not in the national capital, that provides exceptional treatment for children and hope for their families. This hospital was the leading recruitment site for the genetics study I was researching. I was so deeply moved by what I read about the Clínica Noel’s work that I decided I had to go.

This was much deeper than one doctoral student’s desire for adventure. It was a principled decision, born of the need to humanize the genetic data. In my deeply computational work, I had lost sight of the humanity in the data. Somewhere in the R/C++ code, I had forgotten that the purpose driving this work was children and their families. Reading the Clínica Noel website brought me near tears: the stories of parents who found hope for their kids, the children’s smiles, and the stories of children who received cleft lip/palate treatment at the Clínica Noel years ago and then grew up to become surgeons and specialists who serve in that clinic today.

Through the directors of the grant, I was able to contact some of the cleft care specialists at the Clínica Noel. We discussed the potential for a qualitative, community-based research project with the goal of humanizing the genetic research, in the sense of integrating the elements of human personhood. We wanted to incorporate participating families’ stories, experiences, and questions into the genetic research. Our idea became a grant that the Fulbright research program funded for the 2024-2025 academic year. I lived in Medellín for seven months of that academic year conducting this research with my colleagues from the Clínica Noel, and it was a privilege to be a part of the work the clinic is doing.

A lecture I gave at the Clínica Noel about our research is available as a YouTube video. An article detailing our work at the Clínica Noel is in preparation.