
The Human Genome Project

The project to sequence the complete human genome was launched at the end of the 1980s, building on new technology for automated DNA sequencing with fluorescent detection. The project was initially controversial because of the staggering cost of completing it. However, several methodological advances were made in the 1990s, including expressed sequence tags (ESTs) and whole-genome assembly based on shotgun sequencing, both first described by Craig Venter’s laboratory. Several technological advances followed as well, such as more efficient instruments for fluorescent sequencing and automated methods for sample preparation. The solid-phase methods for sequencing (see Milestone 2) and the next-generation sequencing methods (see Milestone 3) were also described during this time, but they were not introduced to the research community until several years after the completion of the human genome sequence.

On June 26, 2000, President Clinton held a press conference in the White House to announce the completion of a “rough draft” of the human genome sequence, achieved by both private and public initiatives. The sequencing and analysis efforts were described in two landmark papers published in 2001. In these initial publications, the number of protein-coding genes in the human genome was estimated to be around 40,000, which turned out to be a gross overestimate; the number has since been revised down to fewer than 20,000.

The genome sequence has allowed the Human Protein Atlas (HPA) to generate antibodies to proteins corresponding to nearly all of the genes predicted from the sequence, including those not studied previously. Thus, the HPA portal has provided the first available information about many thousands of proteins encoded by the human genome.

Figure legend: President Clinton, flanked by J. Craig Venter and Francis Collins, in the White House on June 26, 2000, announcing the completion of a “rough draft” of the human genetic code.

Key facts

  • The completion of a draft of the human genome sequence was announced at a White House press conference in June 2000
  • The announcement was followed by two landmark publications in 2001
  • The publication by Venter et al. (2001) has been cited by more than 16,000 publications
  • The publication by Lander et al. (2001) has been cited by more than 19,000 publications
  • The original estimate of around 40,000 protein-coding genes, based on the human genome sequence, was later recognized to be an overestimate and was revised down to around 20,000