Question

For example, given 10 profiles of people who can roll their tongue and those who cannot, I am trying to identify what is different between these people and what's similar.

A typical record is in this format:

Rsid, chromosome, Position on chromosome, actual 2 letter mutation

How can such records be compared to identify similarities for a given phenotype? Does RSID uniquely identify a mutation, or is it a combination of rsid and position?

In other words, if one user has rsid 123 at position 555 and another one has rsid 123 at position 666, and the mutation is the same, is this a similarity, or does position matter?

Explanation / Answer

Each rsid identifies a unique SNP in the genome. Thus there should not be any entries in the files that have the same rsid but different chromosomes and/or positions. If you do find this, it is likely that your data come from different versions of the assembled genome.
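As a quick sanity check before comparing anything, you could scan the profiles for rsids that show up at more than one location. The sketch below assumes each profile has already been parsed into a list of (rsid, chromosome, position, genotype) tuples; the function name `find_inconsistent_rsids` and the sample records are made up for illustration.

```python
from collections import defaultdict

def find_inconsistent_rsids(profiles):
    """Return rsids that map to more than one (chromosome, position) pair.

    `profiles` is a hypothetical structure: a list of profiles, each a list
    of (rsid, chromosome, position, genotype) tuples.
    """
    locations = defaultdict(set)
    for profile in profiles:
        for rsid, chromosome, position, _genotype in profile:
            locations[rsid].add((chromosome, position))
    # Keep only rsids seen at two or more distinct locations.
    return {rsid: sorted(locs) for rsid, locs in locations.items() if len(locs) > 1}

# Made-up records mirroring the example in the question.
profile_a = [("rs123", "1", 555, "AG"), ("rs456", "2", 1000, "CC")]
profile_b = [("rs123", "1", 666, "AG"), ("rs456", "2", 1000, "CT")]

conflicts = find_inconsistent_rsids([profile_a, profile_b])
print(conflicts)
```

Any rsid reported here suggests that the files were generated against different genome versions and should be brought to a common one before the genotypes are compared.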

To find genetic variants associated, i.e. correlated, with your trait, you need to focus on the genotype at each of the SNPs. For example, SNP rs3094315 is an A/G polymorphism, i.e. an individual can have the genotype AA, AG, or GG. To test whether it is associated with tongue rolling, you would count the number of AA, AG, and GG individuals (alternatively, you could sum the number of A and G alleles) in the group that can roll their tongue and compare those counts to the ones from the group that cannot. You would then repeat this for each SNP that you want to test. For most SNPs, there will be no difference in proportions between the two groups.
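The counting step for a single SNP might look like the sketch below. It assumes you have already collected the two-letter genotype at that SNP for every individual in each group; the genotype lists are invented for illustration and are not real data for rs3094315 or any other SNP. Genotypes such as "GA" are treated as identical to "AG", on the assumption that the letter order carries no meaning in the files.

```python
from collections import Counter

def genotype_counts(genotypes):
    """Count genotypes at one SNP, treating 'AG' and 'GA' as the same."""
    return Counter("".join(sorted(g)) for g in genotypes)

def allele_counts(genotypes):
    """Count individual alleles (each person contributes two)."""
    counts = Counter()
    for g in genotypes:
        counts.update(g)  # a two-letter string adds one count per letter
    return counts

# Made-up genotypes at a single A/G SNP, one entry per individual.
rollers = ["AA", "AG", "AG", "GG", "AA"]
non_rollers = ["GG", "GG", "AG", "GA", "GG"]

for label, group in (("rollers", rollers), ("non-rollers", non_rollers)):
    print(label, dict(genotype_counts(group)), dict(allele_counts(group)))
```

The resulting counts form a small contingency table (group by genotype, or group by allele) that can then be passed to a standard test of association such as a chi-square or Fisher's exact test. With only 10 profiles, such a test will have very little power.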

That being said, this is not a modest undertaking, as it is a cornerstone of human and statistical genetics. There are many issues that could strongly bias your results (e.g. population stratification, genotyping error, linkage disequilibrium, multiple testing).