From Promise to Progress: 21 Years After the Human Genome Project

Genomics and Personalised Medicine | Ali Nguyen and Aariana Rao

It’s almost two metres long and consists of 3.2 billion repeating A’s, C’s, T’s, and G’s coding for 30,000 genes, all intricately coiled and nestled within every single nucleus in your body [1]. This remarkable molecule, DNA, holds the power to cause devastating genetic diseases or, for the fortunate few, grant extraordinary abilities. DNA is undeniably one of life’s most complex and fascinating molecules, encoding the blueprint for every function and trait in living organisms. Unlocking its secrets was once a distant dream, but today, accessing the human genome is as simple as a Google search, thanks to an array of free and comprehensive databases. The far-reaching implications of this achievement extend across fields, from medical research to the study of human evolution. The Human Genome Project (HGP) made this possible, ushering in an era of unprecedented scientific exploration. In this article, we will delve into how the HGP has revolutionised science, driving advancements in genomics, and paving the way for new possibilities in diagnosing and treating genetic diseases. By exploring the project's milestones and its enduring impact, we can fully appreciate the profound changes it has brought to our understanding of human biology and its potential for future breakthroughs.

The Human Genome Project was one of the largest and most ambitious international scientific efforts ever undertaken, with the goal of sequencing the entire human genome. Before the project began, genes for specific diseases were already known and sequenced, but no effort had been made to sequence human DNA in its entirety [3]. Launched in 1990 and completed in 2003, the project was led by the National Institutes of Health (NIH) and the US Department of Energy (DOE), alongside collaborators from the UK, Germany, France, China, and Japan, while the private company Celera Genomics ran a competing sequencing effort in parallel [4]. The DNA sequence came from 20 randomly selected individuals from Buffalo in New York and represents African, European, Asian, and admixed American ancestry [5]. Mapping a genome involves creating different types of maps, notably genetic maps and physical maps. Genetic maps show the relative positions of genes and other genetic markers, inferred from how often they are inherited together, while physical maps show the actual positions of sequence features, measured in base pairs along the DNA [6]. In other words, the physical map deals with the nucleotide sequence itself, whereas the genetic map gives the relative locations of genetic markers (Figure 1).

Figure 1: The difference between genetic and physical maps [2].
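
The distinction between the two map types can be made concrete with a small sketch. The Python below is illustrative only: the three markers and their positions are invented, and the use of centimorgans (cM) for genetic distance is an assumption added here as the conventional unit. It shows that the same markers keep their order on both maps while the spacing between them can differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Marker:
    name: str          # hypothetical marker label
    genetic_cm: float  # position on the genetic map, in centimorgans
    physical_bp: int   # position on the physical map, in base pairs


# Invented values for illustration only; these are not real human markers.
markers = [
    Marker("M1", 0.0, 1_000_000),
    Marker("M2", 1.0, 1_500_000),
    Marker("M3", 6.0, 2_000_000),
]


def intervals(ms):
    """Return (pair, cM distance, bp distance, cM per Mb) for adjacent markers."""
    rows = []
    for a, b in zip(ms, ms[1:]):
        d_cm = b.genetic_cm - a.genetic_cm
        d_bp = b.physical_bp - a.physical_bp
        rows.append((f"{a.name}-{b.name}", d_cm, d_bp, d_cm / (d_bp / 1_000_000)))
    return rows


for row in intervals(markers):
    print(row)
```

In this toy data both intervals span the same physical distance (500,000 base pairs), yet the second is five times longer on the genetic map, which is why the two maps cannot simply be rescaled into one another.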

The overarching goal of the Human Genome Project was to decipher the chemical sequence of the entire human genome, identifying all of its genes, then estimated at 50,000 to 100,000, while providing research tools to analyse this genetic information [7]. Alongside this ambitious research agenda, a variety of promises were made to the public by the NIH and other influential figures, including presidents. The initial proposal envisioned that the HGP would advance the field of genetics and drive the development of more advanced technologies. It cited the benefits of having complete cancer genomes that could be advantageous in treatments [8]. Closer to the launch of the programme in 1990, the narrative began to change and the claims became more exaggerated: the fields of biology, biotechnology, and drug development would be revolutionised, with monumental societal impacts. Furthermore, predictions were made about the personalisation of therapies and the liberation of previously unusable drugs through the identification of genome sequences that could act as direct targets [8]. It was extrapolated that common perilous diseases and problematic behavioural traits might be solved, or their effects diminished, through an understanding of the fully sequenced human genome. The media and the wider public became particularly fixated on a follow-on idea, in which ‘super babies’ could be created and genomic material manipulated to breed certain favoured traits into children and the wider population [8]. Looking back, the claims made to the public about the importance and significance of the HGP were highly overstated.

In June 2000, President Bill Clinton stated “Genome science will have a real impact on all our lives - and even more, on the lives of our children. It will revolutionise the diagnosis, prevention, and treatment of most, if not all, human diseases” [9]. This bold statement was made eight months before the paper would appear in Nature, with some analyses yet to be conducted. Tony Blair, the UK’s Prime Minister in 2000, stood with President Clinton and said in a televised statement “...we are witnessing today a revolution in medical science whose implications far surpass even the discovery of antibiotics, the first great technological triumph of the 21st century” [10]. These claims, while based on some truth, were overdramatised. While genome sequencing may be a vital part of understanding disease pathology, it has not revolutionised modern science.

Undoubtedly, the most monumental feat of the HGP was the successful completion of the human DNA sequence. In 2000, the International Human Genome Sequencing Consortium (IHGSC) announced the completion of a working draft sequence, covering 85% of the genome [12]. This milestone was celebrated with a historic White House event that featured UK Prime Minister Tony Blair and US President Bill Clinton [13]. A year later, the journal Nature published a 62-page paper detailing the draft sequence, which covered 90% of the genome, alongside its initial analysis [14], [15]. Finally, in 2003 the IHGSC announced that all project goals had been met, marking the end of the HGP two years ahead of schedule [16]. The final sequence covered 99% of the genome’s gene-containing regions at 99.99% accuracy, and its analysis was published in Nature in October of the following year [17]. Beyond the human genome, the project also achieved its goal of sequencing the genomes of various model organisms. The C. elegans worm genome was successfully sequenced in 1998 [18], the D. melanogaster fruit fly genome in 2000 [19], and the draft sequence for the mouse genome in 2002 [20], to name a few. Today, these sequences are freely accessible through continually updated genome browsers such as the UCSC Genome Browser, Ensembl, and the NIH’s Genome Data Viewer.

Another ambitious goal of the HGP was pursued through its Ethical, Legal, and Social Implications (ELSI) programme. Through various educational initiatives, workshops, and media outreach, ELSI has worked, and continues to work, to inform professionals and the public about the HGP and the ethical, legal, and social implications surrounding genome sequencing and genetic testing [21]. At a legislative level, ELSI played a crucial role in shaping policies and regulations in the US by addressing concerns about genetic discrimination, experimentation, and testing. One of its significant contributions was laying the foundation for the Genetic Information Nondiscrimination Act (GINA), a landmark civil rights law that protects individuals from genetic discrimination in the workplace and by insurance companies while offering protection for the privacy of genetic information.

The legislative journey of GINA began with the Genetic Information Nondiscrimination in Health Insurance Act of 1995, which aimed to prohibit insurance providers from raising premiums or denying coverage based on genetic information, and banned insurers from requiring individuals to disclose genetic information or undergo genetic testing without consent [22]. In 1999, the proposed bill was expanded to include the protection of individuals against employment discrimination based on genetic services or predictive genetic information [23]. The momentum for these protections continued to grow and in February of 2000, President Clinton signed an Executive Order prohibiting genetic discrimination in federal workplaces [24]. After years of advocacy and legislative efforts, the bill finally passed the House in 2007 and the Senate in 2008, ending with President George W. Bush signing it into law in the same year [25].

The Human Genome Project drove major advances in sequencing technology and, naturally, in the field of genetics itself. One of the main findings of the project was that only ~2% of the human genome codes for protein, corresponding to around 20,000 protein-coding genes [26]. This came as a shock to the scientific community, as previous assumptions had predicted that there would be around 100,000 protein-coding genes. At only 20,000 protein-coding genes, humans are put on the same level as nematodes, a type of worm, indicating that the human genome is not as exceptional as people initially believed [26]. This raised the idea that complexity derives from the variety of proteins that genes can encode, rather than from the number of genes themselves, leading to the revision of the ‘one gene, one protein’ concept. Much of the remaining DNA, including repetitive sequences, was termed ‘junk DNA’, as researchers had not identified a use for it. It was hypothesised that figuring out the roles of junk DNA would lead to an understanding of how new genes are created or how old genes are modified to create new genes [26]. To date, junk DNA has been researched extensively and we have developed a greater understanding of the 98% of the genome that does not code for protein. Regardless, there are still more questions than answers in this area, which, in our opinion, reduces the overall impact that the HGP has had in terms of real-life applications in science.

Figure 2: ELSI Program Goals for the final years of the HGP [11].

DNA sequencing was approached on a large scale, which advanced the development of sequencing technology to increase capacity and decrease the overall size of equipment [7]. Automated machines were created that reduced the time and cost of biochemical processes, including sequencing, while improving analyses and facilitating data entry into databases. Robotic devices were also created to speed up the repetitive tasks inherent in large-scale research projects, while simultaneously reducing the chance of error in sequencing and mapping steps [7]. This was a substantial improvement, as sequencing time was reduced from days or weeks to mere hours. A further benefit was the development of databases with greater storage capacity, along with improved software to retain and analyse data.

Despite this improvement, its public impact has been relatively insignificant. DNA sequencing is used in very controlled environments, with regulatory boards dictating what is allowed to be done. While this varies from country to country, in the US the Food and Drug Administration, Centers for Medicare & Medicaid Services, and Federal Trade Commission are three of the largest agencies that regulate next-generation sequencing testing and innovation for genomic medicine [27]. The applications of genome sequencing in modern science are limited, with a multitude of regulatory, legal, and ethical considerations in place. Gene editing technologies have been implemented in several clinical trials, investigating conditions including cancer, cardiovascular diseases, blindness caused by retinitis pigmentosa, haemophilia, beta thalassemia, sickle cell disease, and cystic fibrosis [28]. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) is one of the most widely known and used gene editing technologies. CRISPR-Cas9 is a relatively cheap and easy-to-use tool for editing and modifying genomic sequences. It has been applied in cellular and animal models, functional genomic screens, and live imaging of cellular genomes. Irrespective of the advancements made in the field, the clinical relevance of CRISPR-Cas9 and other gene editing technologies is negligible for human health and treatment [29]. In the future, there is potential for gene therapies for specific conditions, as mentioned above, though this will be a long time coming. In real-life applications, genomic science lacks the impact that President Bill Clinton and UK Prime Minister Tony Blair promised it would have.

While there are benefits and drawbacks to the HGP, it did provide scientists with a previously undetermined reference sequence to be used as a foundation for further research [30]. Genome-wide association studies (GWAS), which build on that reference, are difficult to conduct, requiring thorough quality control and a robust study design to yield results that may be clinically relevant. The HGP remains the most influential foundation for GWAS to date, though future GWAS would be encouraged to broaden the scope of the research and potentially introduce more real-life applications in genomic science [31].

An important consideration when assessing the HGP’s outcomes is that identifying genes and linking them to certain characteristics are two different things. While genes may be identified in common populations, there is usually background biological noise that interferes in large datasets, raising the question: how can researchers tell if there is a true biological signal and distinguish this from confounding biological information [30]? The HGP made a valuable step forward in establishing genomic science as an integral part of biology and medicine; however, it is only one small step when considering the big picture. The extent to which individual genomic elements are functional remains to be seen, with some associations being made only in very specific contexts [30].
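
One common way to approach the signal-versus-noise question in genome-wide studies is to tighten the significance threshold as the number of tests grows. The sketch below is illustrative only: it simulates p-values for 100,000 hypothetical variants that have no real effect and compares a naive 0.05 cut-off with a Bonferroni-corrected one. The Bonferroni correction, the variant count, and all names are assumptions chosen for this example, not methods taken from the HGP or from [30].

```python
import random


def count_hits(p_values, threshold):
    """Count how many tests would be declared significant at this threshold."""
    return sum(p < threshold for p in p_values)


random.seed(0)  # fixed seed so the simulation is repeatable

n_tests = 100_000  # hypothetical number of variants tested
# With no true association, p-values are uniformly distributed between 0 and 1.
null_p_values = [random.random() for _ in range(n_tests)]

alpha = 0.05
bonferroni_threshold = alpha / n_tests  # divide alpha by the number of tests

naive_hits = count_hits(null_p_values, alpha)
corrected_hits = count_hits(null_p_values, bonferroni_threshold)

print(f"Hits at p < {alpha}: {naive_hits}")
print(f"Hits at p < {bonferroni_threshold:.1e}: {corrected_hits}")
```

Even though every variant here is pure noise, roughly 5% of them pass the naive threshold, which is the kind of spurious association that a stricter threshold and replication are meant to filter out.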

It is the researcher’s job to remain neutral and unbiased when reporting on their studies; otherwise, incorrect associations may be drawn. This can have damaging effects on the populations studied and diminish the trust that the public has in scientists and researchers. A prime example of this is the Māori ‘Warrior Gene’. A group of researchers from the Institute for Environmental Science and Research, the leading scientific council of New Zealand’s Ministry of Health, announced at the 2006 International Congress of Human Genetics that they had found a genetic polymorphism associated with higher levels of monoamine oxidase (MAO) in Māori people [32]. MAO had been linked to addictive behaviours in previous studies. The researchers extrapolated from this finding, without appropriate scientific evidence, that Māori had a propensity towards violence and aggression. They further suggested that this violence was due to the very nature of Māori people themselves as a result of having a specific gene, which they dubbed the ‘Warrior Gene’ [32]. Other scientists, researchers, and the wider public later tore apart these findings, as there was no scientific evidence to support them and the comments made were devastating to the Māori community. This shows that despite the beneficial outcomes the HGP introduced, there is still a lot that is unknown, and researchers have a huge responsibility to report scientific knowledge appropriately and without bias.

Although the HGP was a monumental achievement, it did not accomplish everything initially hoped for in the sequencing and understanding of DNA. Contrary to popular belief, the project did not fully sequence the human genome. The HGP managed to sequence 92% of the genome, focusing primarily on gene-containing regions, due to the technological capabilities at the time. It wasn't until 2022, with the efforts of the Telomere-to-Telomere (T2T) consortium, that the first truly complete human genome sequence was achieved [33]. This gap in the HGP’s accomplishments was largely due to the highly repetitive nature of DNA, especially in regions around the centromere and telomeres, where analysis has traditionally relied on the study of individual chromosomes rather than the large-scale sequencing methods used by the HGP [34].

Additionally, the majority of the DNA used during the project came from individuals with European ancestry, reflecting a broader trend in genetic research and genome-wide association studies. This lack of diversity has led to significant gaps in our understanding of the genomes of other ancestries, which in turn restricts our ability to fully comprehend health inequalities. Consequently, integrating this genetic research into clinical practice and policy-making risks the end product being incomplete and potentially inaccurate [35].

The National Human Genome Research Institute (NHGRI) is aware of the gaps that remain in the field of genetics. The HGP has served as a catalyst for numerous subsequent initiatives, such as the 1000 Genomes Project, which sequenced and catalogued variants in the human genome, and The Cancer Genome Atlas (TCGA), which aims to characterise the mutations responsible for cancer, among others [36].

Overall, even with its shortcomings, the Human Genome Project has given us an invaluable tool, used every day to improve our society. It is slowly inching closer to making a significant clinical impact, while making leaps and bounds in technological and scientific advancement. It asks important questions regarding the implications of genome sequencing and, now more than ever, genetic engineering. It brought both the international science world and, most importantly, our entire species closer together than ever before. Truly, it is a miracle of science.

[1] A. McKenna et al., “The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data,” Genome Research, vol. 20, no. 9, pp. 1297–303, 2010, doi: https://doi.org/10.1101/gr.107524.110.

[2] E. Green, “Physical Map,” Genome.gov, 2024. https://www.genome.gov/genetics-glossary/Physical-Map

[3] U.S. Department of Energy, “Understanding Our Genetic Inheritance The U.S. Human Genome Project; The First Five Years: Fiscal Years 1991-1995,” 2024. https://doe-humangenomeproject.ornl.gov/understanding-our-genetic-inheritance-the-u-s-human-genome-project-the-first-five-years-fiscal-years-1991-1995/

[4] E. S. Lander et al., “Initial sequencing and analysis of the human genome,” Nature, vol. 409, no. 6822, pp. 860–921, Feb. 2001, doi: https://doi.org/10.1038/35057062.

[5] National Human Genome Research Institute, “Human Genome Project,” National Human Genome Research Institute, Aug. 24, 2022. https://www.genome.gov/about-genomics/educational-resources/fact-sheets/human-genome-project.

[6] T. A. Brown, "Mapping Genomes" in Genomes. Oxford: Wiley Liss, 2012, ch.5. Available: https://www.ncbi.nlm.nih.gov/books/NBK21116/

[7] F. S. Collins and L. Fink, “The Human Genome Project,” Alcohol Health and Research World, vol. 19, no. 3, pp. 190–195, 1995. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6875757/

[8] R. A. Gibbs, “The Human Genome Project changed everything,” Nature Reviews Genetics, vol. 21, no. 10. 2020. doi: 10.1038/s41576-020-0275-3.

[9] F. Collins, “Has the revolution arrived?,” Nature, vol. 464, no. 7289. 2010. doi: 10.1038/464674a.

[10] N. Wade, “Scientists Complete Rough Draft of Human Genome,” The New York Times. https://archive.nytimes.com/www.nytimes.com/library/national/science/062600sci-human-genome.html

[11] F. S. Collins, “New Goals for the U.S. Human Genome Project: 1998-2003,” Science, vol. 282, no. 5389, pp. 682–689, Oct. 1998, doi: https://doi.org/10.1126/science.282.5389.682.

[12] C. Yarbrough and A. Thompson, “International Human Genome Sequencing Consortium Announces "Working Draft" of Human Genome,” Genome.gov, 2013. https://www.genome.gov/10001457/2000-release-working-draft-of-human-genome-sequence

[13] Office of the Press Secretary, “June 2000 White House Event,” Genome.gov, 2012. https://www.genome.gov/10001356/june-2000-white-house-event

[14] International Human Genome Sequencing Consortium, “Initial sequencing and analysis of the human genome,” Nature, vol. 409, no. 6822, pp. 860–921, 2001.

[15] G. Spencer, “International Human Genome Sequencing Consortium Publishes Sequence and Analysis of the Human Genome,” Genome.gov, 2012. https://www.genome.gov/10002192/2001-release-first-analysis-of-human-genome

[16] National Human Genome Research Institute, “International Consortium Completes Human Genome Project,” Genome.gov, 2012. https://www.genome.gov/11006929/2003-release-international-consortium-completes-hgp

[17] International Human Genome Sequencing Consortium, “Finishing the euchromatic sequence of the human genome,” Nature, vol. 431, no. 7011, pp. 931–945, Oct. 2004, doi: https://doi.org/10.1038/nature03001.

[18] M. Berks, “The C. elegans genome sequencing project. C. elegans Genome Mapping and Sequencing Consortium.,” Genome Research, vol. 5, no. 2, pp. 99–104, Sep. 1995, doi: https://doi.org/10.1101/gr.5.2.99.

[19] M. D. Adams, “The Genome Sequence of Drosophila melanogaster,” Science, vol. 287, no. 5461, pp. 2185–2195, Mar. 2000, doi: https://doi.org/10.1126/science.287.5461.2185.

[20] Mouse Genome Sequencing Consortium, “Initial sequencing and comparative analysis of the mouse genome,” Nature, vol. 420, no. 6915, pp. 520–562, Dec. 2002, doi: https://doi.org/10.1038/nature01262.

[21] D. Drell and A. Adamson, “DOE ELSI Program Emphasizes Education, Privacy A Retrospective (1990-2000),” 2001. https://doe-humangenomeproject.ornl.gov/wp-content/uploads/2022/08/elsiprog.pdf

[22] L. M. Slaughter, Genetic Information Nondiscrimination in Health Insurance Act of 1995. 1995. https://www.congress.gov/bill/104th-congress/house-bill/2748/text

[23] L. M. Slaughter, Genetic Nondiscrimination in Health Insurance and Employment Act of 1999. 1999. https://www.congress.gov/bill/106th-congress/house-bill/2457/text

[24] The White House, “Executive Order 13145 to Prohibit Discrimination in Federal Employment Based on Genetic Information,” Genome.gov, 2010. https://www.genome.gov/10002084/2000-release-barring-genetic-discrimination

[25] L. M. Slaughter, “H.R.493 - 110th Congress (2007-2008): Genetic Information Nondiscrimination Act of 2008,” www.congress.gov, May 21, 2008. https://www.congress.gov/bill/110th-congress/house-bill/493

[26] F. Moraes and A. Góes, “A decade of human genome project conclusion: Scientific diffusion about our genome knowledge,” Biochemistry and Molecular Biology Education, vol. 44, no. 3, 2016, doi: 10.1002/bmb.20952.

[27] F. Luh and Y. Yen, “FDA guidance for next generation sequencing-based testing: balancing regulation and innovation in precision medicine,” npj Genomic Medicine, vol. 3, no. 1. 2018. doi: 10.1038/s41525-018-0067-2.

[28] H. Li, Y. Yang, W. Hong, M. Huang, M. Wu, and X. Zhao, “Applications of genome editing technology in the targeted therapy of human diseases: mechanisms, advances and prospects,” Signal Transduction and Targeted Therapy, vol. 5, no. 1. 2020. doi: 10.1038/s41392-019-0089-y.

[29] M. Redman, A. King, C. Watson, and D. King, “What is CRISPR/Cas9?,” Archives of Disease in Childhood: Education and Practice Edition, vol. 101, no. 4, 2016, doi: 10.1136/archdischild-2016-310459.

[30] L. Hood and L. Rowen, “The human genome project: Big science transforms biology and medicine,” Genome Medicine, vol. 5, no. 9, 2013, doi: 10.1186/gm483.

[31] E. Uffelmann et al., “Genome-wide association studies,” Nature Reviews Methods Primers, vol. 1, no. 1. 2021. doi: 10.1038/s43586-021-00056-9.

[32] L. Perbal, “The ‘warrior gene’ and the mãori people: The responsibility of the geneticists,” Bioethics, vol. 27, no. 7, 2013, doi: 10.1111/j.1467-8519.2012.01970.x.

[33] S. Nurk et al., “The complete sequence of a human genome,” Science, vol. 376, no. 6588, pp. 44–53, Mar. 2022, doi: https://doi.org/10.1126/science.abj6987.

[34] N. Altemose et al., “Complete genomic and epigenetic maps of human centromeres,” Science, vol. 376, no. 6588, Apr. 2022, doi: https://doi.org/10.1126/science.abl4178.

[35] G. Sirugo, S. M. Williams, and S. A. Tishkoff, “The Missing Diversity in Human Genetic Studies,” Cell, vol. 177, no. 1, pp. 26–31, Mar. 2019, doi: https://doi.org/10.1016/j.cell.2019.02.048.

[36] E. D. Green, J. D. Watson, and F. S. Collins, “Human Genome Project: Twenty-five years of big biology,” Nature, vol. 526, no. 7571, pp. 29–31, Sep. 2015, doi: https://doi.org/10.1038/526029a.

Ali is a second-year Biomedical Science student at AUT, minoring in Psychology and Molecular Genetics. She has a keen interest in personalised medicine and hopes to pursue a future in the growing sector.

Ali Nguyen - BSc, Biomedical Science

Aariana is currently doing her postgraduate studies in Biomedical Science. Her main area of interest is neuroscience, particularly the consequences of pathology and injury in the brain.

Aariana Rao - PGDip, Biomedical Science