For a long time I’ve been fascinated by the power of DNA sequencing and informatics. From my days pouring polyacrylamide sequencing gels at Emory and (in some cases) interpreting Sanger sequences by hand, to the culmination of my strictly medical education at Cornell, I’ve gained both a molecular and a big-picture medical view of the technology’s power.
Cheap, complete DNA sequencing is on the cusp of reality thanks to companies like Pacific Biosciences and Complete Genomics, and even start-ups like Halcyon Molecular. But despite the breakthroughs in sequencing, we’re going to need piles and piles of data to really understand what’s going on in the chaotic world of genomics. As David Goldstein’s article in the New England Journal of Medicine back in April made clear, we may have identified many genes and know that they play critical roles in disease and traits, but we have miles and miles to go: the 20 genes we’ve identified in height variation explain only 2% of the actual variation. Clinicians are finally beginning to realize that a disease like breast cancer arises from many gene combinations, which vary from case to case. News articles, mostly driven by corporate PR, only add confusion and obscure the heart of the matter.
I’m fascinated by the space and will continue to comment as news comes out and ideas come to me. So many questions remain. What do we do about the 97.5% of the genome we barely understand? How do we correlate genes with disease? How do we understand elements of the genetic code that do not code for a gene but still affect the ‘strength’ of a gene’s expression? How is that represented statistically? Understanding this absolutely chaotic variation in human traits and disease will be a foundation of medicine in the future, but for now, we’ve got lots of little problems and topics to cover.
1 comment:
What would be your first step?