In 2003, the Human Genome Project made history when it sequenced 92% of the human genome. But for nearly two decades since, scientists have struggled to decipher the remaining 8%. Now, a team of nearly 100 scientists from the Telomere-to-Telomere (T2T) Consortium has unveiled the complete human genome – the first time it’s been sequenced in its entirety, the researchers say.
“Having this complete information will allow us to better understand how we form as an individual organism and how we vary not just between other humans but other species,” Evan Eichler, a Howard Hughes Medical Institute investigator at the University of Washington and the research leader, said Thursday.
The new research introduces 400 million letters to the previously sequenced DNA – an entire chromosome’s worth. The full genome will allow scientists to analyze how DNA differs between people and whether these genetic variations play a role in disease.
The research, published Thursday in the journal Science, had previously been released as a preprint, allowing other teams to use the sequence in their own studies.
Until now, it was unclear what the genes in these previously unsequenced regions coded for.
“It turns out that these genes are incredibly important for adaptation,” Eichler said. “They contain immune response genes that help us to adapt and survive infections and plagues and viruses. They contain genes that are … very important in terms of predicting drug response.”
Eichler also said that some of the recently uncovered genes are even responsible for making human brains larger than those of other primates, providing insight into what makes humans unique.
This remaining 8% of the human genome had stumped scientists for years because of its complexity. For one thing, it contains highly repetitive DNA regions, which made it challenging to string the DNA together in the correct order using previous sequencing methods.
The researchers relied on two DNA sequencing technologies that emerged over the past decade to bring this project to fruition: the Oxford Nanopore DNA sequencing method, which can sequence up to 1 million DNA letters at once but with some mistakes, and the PacBio HiFi DNA sequencing method, which can read 20,000 letters with 99.9% accuracy.
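To put those figures in perspective, the short calculation below works out how many mistaken letters a 20,000-letter read would contain on average at 99.9% accuracy, and how many reads of each size it would take to tile a genome once, end to end. The genome length of roughly 3 billion letters is an assumption used only for illustration, and the function names are hypothetical. Real projects read every region many times over, so this is a rough sketch rather than a description of the team's workflow.

```python
def expected_errors(read_length: int, accuracy: float) -> float:
    """Average number of wrong letters in one read at the given per-letter accuracy."""
    return read_length * (1.0 - accuracy)


def reads_for_one_pass(genome_length: int, read_length: int) -> int:
    """Fewest reads needed to tile the genome once, end to end (ceiling division)."""
    return -(-genome_length // read_length)


# Figures quoted in the article.
HIFI_READ_LENGTH = 20_000
HIFI_ACCURACY = 0.999
NANOPORE_MAX_READ_LENGTH = 1_000_000

# Assumption for illustration only: a genome of about 3 billion letters.
ASSUMED_GENOME_LENGTH = 3_000_000_000

hifi_errors = expected_errors(HIFI_READ_LENGTH, HIFI_ACCURACY)
hifi_reads = reads_for_one_pass(ASSUMED_GENOME_LENGTH, HIFI_READ_LENGTH)
nanopore_reads = reads_for_one_pass(ASSUMED_GENOME_LENGTH, NANOPORE_MAX_READ_LENGTH)

print(f"Expected errors per HiFi read: about {hifi_errors:.0f}")
print(f"HiFi-length reads for one pass: {hifi_reads:,}")
print(f"Longest nanopore-length reads for one pass: {nanopore_reads:,}")
```

Under these assumptions, the longer reads need far fewer pieces to cover the genome, while the shorter, more accurate reads contain only a handful of errors each, which is one way to see why the two methods complement each other.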
Sequencing DNA is like solving a jigsaw puzzle, Eichler said. Scientists must first break the DNA into smaller parts and then use sequencing machines to piece it together in the correct order. Previous sequencing tools could sequence only small sections of DNA at once.
With a 10,000-piece puzzle, it’s hard to correctly arrange small pieces that look alike, just as it is hard to order short sections of repetitive DNA. But with a 500-piece puzzle, it’s much easier to arrange larger pieces – or, in this case, longer segments of DNA.
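The puzzle analogy can be sketched with a toy example. The made-up 28-letter "genome" below contains the same 8-letter stretch twice. The hypothetical helper `repeated_overlaps` counts how many distinct overlaps between neighboring reads appear at more than one place in the sequence. If that count is zero, every read has exactly one possible neighbor; if not, some pieces look alike and could be joined in the wrong place. This is a simplified illustration, not the assembly software the consortium used.

```python
from collections import Counter


def repeated_overlaps(genome: str, read_len: int) -> int:
    """Count distinct overlaps (length read_len - 1) that occur more than once.

    Zero means every read overlaps its neighbor at a unique spot, which is
    enough to order the reads without ambiguity.
    """
    k = read_len - 1
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    return sum(1 for count in counts.values() if count > 1)


# Made-up toy sequence: the same 8-letter repeat sits between unique flanks.
REPEAT = "GATTACAG"
TOY_GENOME = "CCGT" + REPEAT + "TTGC" + REPEAT + "AACG"

# Reads shorter than the repeat cannot tell the two copies apart.
short_read_ambiguity = repeated_overlaps(TOY_GENOME, read_len=5)

# Reads longer than the repeat always include some unique flanking letters.
long_read_ambiguity = repeated_overlaps(TOY_GENOME, read_len=10)

print("Ambiguous overlaps with 5-letter reads:", short_read_ambiguity)
print("Ambiguous overlaps with 10-letter reads:", long_read_ambiguity)
```

In this toy case, the ambiguity disappears once the reads are longer than the repeated stretch, which is the same reason long-read sequencing helps with repetitive regions.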
A second challenge was finding cells that contained only one genome.
Standard human cells contain two sets of DNA, a maternal copy and a paternal copy, but this team used DNA from a group of cells called a complete hydatidiform mole, which contains a duplicate of the paternal set of DNA. A complete hydatidiform mole is a rare complication of pregnancy caused by the abnormal growth of cells that originate from the placenta. This approach simplifies the genome so that scientists need to sequence only one set of DNA rather than two.
Because the duplicated paternal set of DNA in these cells carried an X chromosome but no Y, the scientists were initially unable to sequence the Y chromosome. According to lead study author Adam Phillippy, the team has since managed to sequence the Y chromosome using a different set of cells.
A complete set of 24 sequenced chromosomes is available on the University of California, Santa Cruz (UCSC) Genome Browser.
Decoding this gapless sequence came at a high price. Phillippy, who is also head of the Genome Informatics Section at the National Human Genome Research Institute, said that altogether, the project cost a few million dollars or more. But that’s a fraction of the almost $450 million that it cost the Human Genome Project to achieve its final sequence in 2003. And with new technology, sequencing is only getting cheaper.
For now, it’s still too costly and time-consuming for everyone to sequence their own genome. But research is underway that uses this genome to identify whether certain genetic differences are linked with specific cancers. Knowing the genetic variations could also allow doctors to better tailor treatments, said Michael Schatz, another researcher on the team and a professor of computer science and biology at Johns Hopkins University.
Phillippy said he hopes that within the next 10 years, sequencing individuals’ genomes can become a routine medical test that costs less than $1,000. His team continues to work toward that goal.
Charles Rotimi, scientific director of the National Human Genome Research Institute, said in a statement that this scientific achievement is “moving us closer to individualized medicine for all humanity.” Rotimi was not involved in the research.
Correction: A previous version of this story misspelled Evan Eichler’s name.