I chose to participate in an internship in the genetics department at the Texas Biomedical Research Institute during the spring of 2013. I have always been interested in anthropology, and it ultimately became my major, but I was not interested in genetics until I took both statistics and genetics in the spring of 2012. The classes seemed to fit together perfectly in my mind, and I enjoyed attending them and building connections where their scope overlapped. I knew the genetics department at Texas Biomed would be the perfect fit because Dr. Comuzzie is an anthropologist whose primary concern is statistical genetics. His focus is the environmental influence of diet on diabetes and heart disease.
I worked on data sets from three studies during my internship: the Oman Family Study, the Strong Heart Family Study, and the GOCADAN study. For each, I analyzed problem sets covering heritability, multipoint gene mapping, and correlations between variables. All three studies emphasized the relationships between body weight, obesity, diabetes, and diet.
My immediate supervisor was Dr. Voruganti. She is a genetic scientist, and her background in nutrition was essential to understanding and interpreting blood test results. In addition, she understood how processes in the human body interact with one another. She introduced me to the statistical analysis phase of genetic research. She showed me how to use the SYSTAT program, and I practiced with data from the Strong Heart Study (SHS) to find the mean, range, standard error, p-value, minimum and maximum values, correlations, and variance.
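The descriptive statistics listed above can be sketched in a few lines of Python. This is only an illustration of the calculations, not the SYSTAT workflow I used during the internship; the function names and the sample numbers are hypothetical, and the p-value is left out because it depends on which test is being run.

```python
import math
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    variance = statistics.variance(values)  # sample variance (n - 1 denominator)
    return {
        "n": n,
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "variance": variance,
        "std_error": math.sqrt(variance / n),  # standard error of the mean
    }


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Made-up values for illustration only; not data from any study.
    weights = [2, 4, 4, 6]
    print(describe(weights))
    print(pearson_r([1, 2, 3], [2, 4, 6]))
```

Running the same two functions on each variable, or pair of variables, in a data set gives the kind of summary table I compiled for review.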
During my internship, I worked with the local SSH server used by Texas Biomed. I manipulated the files through the SSH server, but to perform calculations, I learned to use the SOLAR and PEDSYS software systems. These programs required me to enter lines of code to instruct the computer in the assigned calculation. Most importantly, I used them to identify whether the sample had too high a variance or the findings had too high an error rate. I calculated heritability percentages, skewness in the data, and gene mapping information. Each morning, Dr. Voruganti would give me a study, and I would calculate and compare its data. She would explain where the study took place and why it was conducted. She explained how the study was carried out, what data were gathered, and what tests were necessary to analyze the data set. I asked questions about anything I could not figure out, but I always gave myself about 20 minutes to think about the problem and try multiple solutions before asking for the answer. This strategy helped me learn my way around the commands in SOLAR and the SSH secure network. When my work was mostly complete, I would compile the findings into a Word document that clearly showed the tests performed and their outcomes. Dr. Voruganti would review the data, and if something significant was found, she and I would discuss what could be interpreted from the findings.
SOLAR stands for Sequential Oligogenic Linkage Analysis Routines. It is a software package for genetic variance components analysis that includes linkage analysis, quantitative genetic analysis, SNP analysis, and covariate relationships. Operations are included for calculating marker-specific or multipoint matrices in pedigrees of great size and complexity. It is also used for linkage analysis of multiple quantitative traits and/or discrete traits, which may involve multiple loci.
PEDSYS is a database system developed as a specialized tool for management of genetic, pedigree and demographic data. It was designed principally for use with pedigree analysis of either human or non-human subjects.
As I worked and listened to the explanations of why each task was needed, I was surprised that I understood most of the genetic terminology and calculations. I realized that I had a great genetics professor who taught us what we needed to know, and it was very helpful that I remembered most of it. When I had to do statistical calculations, I asked questions mostly to remind myself of what I had already learned. My classes in anthropology, statistics, and genetics gave me a good foundation for understanding anthropological genetic testing and interpretation. If I had not taken both statistics and genetics before starting the internship, I know that I would have felt completely lost. As it turned out, I felt confident in my work, and I know I was expanding on my prior knowledge of genetics and statistics.
Going into my internship, I thought I would just be analyzing data sets, but I learned so much more. I left with a basic understanding of how genetics projects are conducted from start to finish. I was able to talk with the field scientists and learn how data sets are gathered, analyzed, and interpreted. I was introduced to the whole picture of genetic studies and not just the analysis. I enjoyed the statistical work at Texas Biomed, and I am thankful for the instruction I received from Dr. Saroja Voruganti. She not only showed me how to complete the work, but she also explained why the experiments were done and what they all meant. I was very satisfied with my work because I knew what I was doing and why I was doing it. My internship concluded during the last week of April, and I am very grateful for the skills I acquired.