Thursday, July 19, 2007

Formatting

This week I was supposed to work on analyzing the data I had gathered. I had hundreds of sequences available, but not all of the genes had enough individuals to be useful, so I deleted every gene that did not have at least seven individuals. With fewer than 14 haplotypes (two per individual), you can’t draw very reliable conclusions about a particular gene. It turned out that only ten of the twenty-seven genes were fit for analysis.

First, I had to transfer the Sequencher files to MacClade, a program that helps you align the base pairs so that you aren’t starting in the middle of a codon. MacClade lets me see all the sequences for a gene at the same time. The SNPs show up very clearly because each base is a different color; if there were no SNPs, the gene sequences would just look like pinstripes. Then I had to export the MacClade files as FASTA files. I had to format all the sequences individually, which was time consuming, but now I am finally done, and all I have to do is give them to Ralph, who has written a program to analyze selection and variation frequency for sequences.

I was previously using a program called DnaSP, which does essentially the same thing as Ralph’s program, but it would not read any of the data files I made on my computer. Also, Ralph’s program not only gives you the numbers but also tells you how certain it is of each one; DnaSP just gives you the number. On Friday and Monday I’ll probably work on fixing my poster.
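For anyone curious, the two checks described above (keeping only genes with at least 14 haplotypes, and spotting the SNP columns in an aligned FASTA file) could be sketched in Python roughly like this. This is not Ralph’s program or anything I actually ran; the function names, the 14-haplotype constant, and the choice to ignore gap and unknown characters are all assumptions made for illustration.

```python
MIN_HAPLOTYPES = 14  # assumption: seven individuals, two haplotypes each


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of {sequence name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new sequence record.
            name = line[1:].strip()
            records[name] = ""
        elif name is not None:
            # Sequence data may be wrapped over several lines.
            records[name] += line.upper()
    return records


def has_enough_haplotypes(records, minimum=MIN_HAPLOTYPES):
    """Return True if the gene has at least `minimum` sequences."""
    return len(records) >= minimum


def variable_sites(records):
    """Return 0-based alignment columns where more than one base appears.

    Gap and unknown characters ('-', 'N', '?') are ignored (an assumption).
    """
    seqs = list(records.values())
    if not seqs:
        return []
    length = min(len(s) for s in seqs)
    sites = []
    for i in range(length):
        bases = {s[i] for s in seqs} - {"-", "N", "?"}
        if len(bases) > 1:
            sites.append(i)
    return sites
```

A gene’s exported file would be read as text, passed to `read_fasta`, and kept only if `has_enough_haplotypes` returns true; `variable_sites` then lists the columns that would break up the pinstripe pattern.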
