“R” U ready?: a case study using R to analyze changes in gene expression during evolution
As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in t...
Main Authors: | Amy E. Pomeroy, Andrea Bixler, Stefanie H. Chen, Jennifer E. Kerr, Todd D. Levine, Elizabeth F. Ryder |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2024-03-01 |
Series: | Frontiers in Education |
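Subjects: | high-throughput data analysis; R programming; case studies; evolutionary biology; data cleaning; data visualization |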
Online Access: | https://www.frontiersin.org/articles/10.3389/feduc.2024.1379910/full |
_version_ | 1797248698681393152 |
---|---|
author | Amy E. Pomeroy; Andrea Bixler; Stefanie H. Chen; Jennifer E. Kerr; Todd D. Levine; Elizabeth F. Ryder |
author_sort | Amy E. Pomeroy |
collection | DOAJ |
description | As high-throughput methods become more common, training undergraduates to analyze data must include having them generate informative summaries of large datasets. This flexible case study provides an opportunity for undergraduate students to become familiar with the capabilities of R programming in the context of high-throughput evolutionary data collected using macroarrays. The story line introduces a recent graduate hired at a biotech firm and tasked with analysis and visualization of changes in gene expression from 20,000 generations of the Lenski Lab’s Long-Term Evolution Experiment (LTEE). Our main character is not familiar with R and is guided by a coworker to learn about this platform. Initially this involves a step-by-step analysis of the small Iris dataset built into R which includes sepal and petal length of three species of irises. Practice calculating summary statistics and correlations, and making histograms and scatter plots, prepares the protagonist to perform similar analyses with the LTEE dataset. In the LTEE module, students analyze gene expression data from the long-term evolutionary experiments, developing their skills in manipulating and interpreting large scientific datasets through visualizations and statistical analysis. Prerequisite knowledge is basic statistics, the Central Dogma, and basic evolutionary principles. The Iris module provides hands-on experience using R programming to explore and visualize a simple dataset; it can be used independently as an introduction to R for biological data or skipped if students already have some experience with R. Both modules emphasize understanding the utility of R, rather than creation of original code. Pilot testing showed the case study was well-received by students and faculty, who described it as a clear introduction to R and appreciated the value of R for visualizing and analyzing large datasets. |
first_indexed | 2024-04-24T20:18:44Z |
format | Article |
id | doaj.art-a3a46ee370a84cfeba8f6d3b68d0193c |
institution | Directory of Open Access Journals |
issn | 2504-284X |
language | English |
last_indexed | 2024-04-24T20:18:44Z |
publishDate | 2024-03-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Education |
spelling | doaj.art-a3a46ee370a84cfeba8f6d3b68d0193c; 2024-03-22T12:39:56Z; eng; Frontiers Media S.A.; Frontiers in Education; 2504-284X; 2024-03-01; 9; 10.3389/feduc.2024.1379910; 1379910; “R” U ready?: a case study using R to analyze changes in gene expression during evolution; Amy E. Pomeroy (Department of Pharmacology, Computational Medicine Program, UNC Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC, United States); Andrea Bixler (Biology Program, Clarke University, Dubuque, IA, United States); Stefanie H. Chen (Department of Biological Sciences, North Carolina State University, Raleigh, NC, United States, and Biotechnology Program, North Carolina State University, Raleigh, NC, United States); Jennifer E. Kerr (Department of Biology, Notre Dame of Maryland University, Baltimore, MD, United States); Todd D. Levine (Department of Life Sciences and Prairie Springs Environmental Education Center, Carroll University, Waukesha, WI, United States); Elizabeth F. Ryder (Department of Biology and Biotechnology, Worcester Polytechnic Institute, Worcester, MA, United States) |
title | “R” U ready?: a case study using R to analyze changes in gene expression during evolution |
topic | high-throughput data analysis; R programming; case studies; evolutionary biology; data cleaning; data visualization |
url | https://www.frontiersin.org/articles/10.3389/feduc.2024.1379910/full |
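
The description above says the Iris module has students calculate summary statistics and correlations and make histograms and scatter plots from the Iris dataset built into R. The following is a minimal sketch of what such a first session might look like, using only base R. It is not the authors' code, which this record does not include; the variable names `species_means` and `sepal_petal_cor` and the choice of columns are illustrative assumptions.

```r
# Illustrative sketch only: not taken from the case study itself.
# Load the Iris dataset that ships with base R (no packages required).
data(iris)

# Summary statistics (min, quartiles, mean, max) for every column.
print(summary(iris))

# Mean petal length for each of the three species.
# `species_means` is a hypothetical name chosen for this example.
species_means <- tapply(iris$Petal.Length, iris$Species, mean)
print(species_means)

# Pearson correlation between sepal length and petal length.
# `sepal_petal_cor` is likewise a hypothetical name.
sepal_petal_cor <- cor(iris$Sepal.Length, iris$Petal.Length)
print(sepal_petal_cor)

# Histogram of petal length across all 150 flowers.
hist(iris$Petal.Length,
     main = "Petal length, all species",
     xlab = "Petal length (cm)")

# Scatter plot of sepal length against petal length, colored by species.
species_codes <- as.integer(iris$Species)
plot(iris$Sepal.Length, iris$Petal.Length,
     col = species_codes, pch = 19,
     xlab = "Sepal length (cm)", ylab = "Petal length (cm)")
legend("topleft", legend = levels(iris$Species),
       col = seq_along(levels(iris$Species)), pch = 19)
```

The same four steps (summarize, correlate, histogram, scatter plot) are the ones the description says carry over to the larger LTEE gene expression dataset, although the column names and data loading for that dataset are not given in this record.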