The genetic make-up of an organism or an individual lies in its DNA sequences. Differences between two individuals are naturally reflected in differences in their nucleotide sequences.
Human Genome Project
These differences can be known only if the entire human genome is mapped. Once genetic-engineering techniques made it possible to isolate and clone any piece of DNA, and simple, fast techniques for determining DNA sequences became available, a very ambitious project to sequence the human genome was launched in 1990.
![Human Genome Project- DNA](https://bioneethub.in/wp-content/uploads/2024/07/DNA.gif)
HGP was an international, collaborative research program whose goal was the complete mapping and understanding of all the genes of human beings. All the genes in a haploid set of chromosomes are together known as the genome.
The Human Genome Project, described as a ‘mega project’, was a 13-year project co-ordinated by the US Department of Energy and the National Institutes of Health.
The Wellcome Trust (UK) soon joined the project as a major partner, and additional contributions came from Japan, France, Germany, China and others. The project was completed in 2003. HGP has been called a megaproject due to:-
(i) Huge cost, estimated at 9 billion US dollars, based on an estimated cost of US$3 for sequencing 1 bp.
(ii) Very large number of base pairs (3 × 10^9 bp) to be identified and sequenced.
(iii) The need for a large number of scientists, technicians and supporting staff.
(iv) Storage of the data generated, which if typed out would require some 3300 books, each with 1000 pages and each page having 1000 typed letters. High-speed computational devices for storage, retrieval and analysis of data made this task manageable.
(v) The science of bioinformatics also developed during this period and helped HGP.
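The cost and storage figures above follow from simple arithmetic. The short Python sketch below reproduces them; the variable names are illustrative and the inputs are only the approximate values quoted in the list.

```python
# Back-of-the-envelope check of the figures quoted in the list above.
GENOME_BP = 3 * 10**9        # approximate number of base pairs in the human genome
COST_PER_BP_USD = 3          # estimated cost of sequencing one base pair

total_cost_usd = GENOME_BP * COST_PER_BP_USD

BOOKS = 3300
PAGES_PER_BOOK = 1000
LETTERS_PER_PAGE = 1000

# One typed letter stands for one base (A, T, G or C).
letters_stored = BOOKS * PAGES_PER_BOOK * LETTERS_PER_PAGE

print(f"Estimated cost: {total_cost_usd / 1e9:.0f} billion US dollars")
print(f"Letters held by {BOOKS} books: {letters_stored:.2e}")
```

The 3300 books hold 3.3 × 10^9 letters, slightly more than the roughly 3 × 10^9 bases to be recorded.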
Goals of HGP
Following are the important goals of HGP:-
(i) To identify all the approximately 20,000–25,000 genes in human DNA.
(ii) To determine the sequences of the 3 billion chemical base pairs that make up human DNA.
(iii) To store this information in databases.
(iv) To improve tools for data analysis.
(v) To transfer related technologies to other sectors, such as industries.
(vi) ELSI: To address the ethical, legal and social issues that may arise from the project.
(vii) Bioinformatics, i.e. the close association of HGP with the rapid development of a new area in biology.
![Human Genome Project-Bioinformatic](https://bioneethub.in/wp-content/uploads/2024/07/Bioinformatics.png)
(viii) Sequencing of model organisms: DNA sequences of non-human organisms can lead to an understanding of their natural capabilities, which can be applied towards solving challenges in health care, agriculture, energy production and environmental remediation. Many non-human model organisms, such as bacteria, yeast, Caenorhabditis elegans (a free-living, non-pathogenic nematode), Drosophila, and plants like rice and Arabidopsis, have been sequenced.
Some examples:-

| Organism | Base pairs | No. of genes |
| --- | --- | --- |
| E. coli | 4.7 million | 4,000 |
| Saccharomyces cerevisiae | 12 million | 6,000 |
| Caenorhabditis elegans | 97 million | 18,000 |
| Drosophila melanogaster | 180 million | 13,000 |
| Arabidopsis | 130 million | 25,000 |
| Oryza sativa | 430 million | 32,000 – 50,000 |

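One way to read the table above is to compare how densely genes are packed in each genome. The Python sketch below computes genes per million base pairs from the tabulated values; the helper name `genes_per_million_bp` is illustrative, and for rice the lower end of the quoted range is assumed.

```python
# (organism, base pairs, approximate number of genes), copied from the table above.
# Assumption: for Oryza sativa the lower bound (32,000) of the quoted range is used.
genomes = [
    ("E. coli", 4.7e6, 4_000),
    ("Saccharomyces cerevisiae", 12e6, 6_000),
    ("Caenorhabditis elegans", 97e6, 18_000),
    ("Drosophila melanogaster", 180e6, 13_000),
    ("Arabidopsis", 130e6, 25_000),
    ("Oryza sativa", 430e6, 32_000),
]


def genes_per_million_bp(base_pairs, genes):
    """Gene density: number of genes per million base pairs."""
    return genes / (base_pairs / 1e6)


for name, bp, genes in genomes:
    density = genes_per_million_bp(bp, genes)
    print(f"{name:26s} {density:7.1f} genes per million bp")
```

On these figures the bacterial genome is by far the most gene-dense, while the larger plant and animal genomes carry far fewer genes per million base pairs.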
Methodologies
The methods involved two major approaches:
(i) ESTs/Expressed Sequence Tags :- Identifying all genes that are expressed as RNA.
(ii) Sequence Annotation: Sequencing the whole genome, containing all the coding and non-coding sequences, and later assigning functions to the different regions in the sequence.
For sequencing, the total DNA from a cell is isolated and converted into random fragments of relatively smaller sizes (recall that DNA is a very long polymer, and there are technical limitations in sequencing very long pieces of DNA), which are then cloned in a suitable host using specialized vectors.
Cloning amplified each DNA fragment so that it could subsequently be sequenced with ease. The commonly used hosts were bacteria and yeast, and the vectors were called BAC (bacterial artificial chromosomes) and YAC (yeast artificial chromosomes).
The fragments were sequenced using automated DNA sequencers that worked on the principle of a method developed by Frederick Sanger. Sanger is also credited with developing the method for determining amino acid sequences in proteins. The sequenced fragments were then arranged based on the overlapping regions they shared, which required generating overlapping fragments for sequencing.
![Human Genome Project-Frederick Sanger](https://bioneethub.in/wp-content/uploads/2024/07/Frederick-Sanger.jpg)
Aligning these sequences by hand was not humanly possible, so specialized computer-based programs were developed. The sequences were subsequently annotated and assigned to each chromosome. The sequence of chromosome 1 was completed only in May 2006 (this was the last of the 24 human chromosomes, 22 autosomes plus X and Y, to be sequenced).
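The idea of arranging fragments by their overlapping regions can be illustrated with a toy example. The Python sketch below is not the software used by HGP; it is a minimal greedy merge under simplifying assumptions (error-free reads, a single strand, no repeated regions), and the function names `overlap` and `greedy_assemble` are illustrative.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such match is shorter than min_len.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def greedy_assemble(fragments, min_len=3):
    """Repeatedly merge the two fragments that share the longest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    # Drop fragments that are fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]

    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            break  # nothing overlaps any more; leave separate pieces
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)
    return frags


# Three made-up overlapping fragments, given in shuffled order.
reads = ["GTTAGC", "ATGCGTAC", "GTACGTTA"]
print(greedy_assemble(reads))
```

With these three reads the sketch rebuilds a single 14-base sequence. Real genomes contain long repeats and sequencing errors, which is why far more sophisticated programs were needed in practice.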
![](https://bioneethub.in/wp-content/uploads/2024/07/diagram-of-human-genome-project.jpg)
Salient Features of Human Genome:-
Some of the salient observations drawn from human genome project are as follows:-
(i) The human genome contains 3164.7 million nucleotide bases.
(ii) The average gene consists of 3000 bases, but size varies greatly, with the largest known gene being dystrophin at 2.4 million bases and the TDF gene the smallest at 14 bases.
(iii) The total number of genes is estimated at 30,000, much lower than previous estimates of 80,000 to 1,40,000 genes. Almost all (99.9 percent) nucleotide bases are exactly the same in all people.
(iv) The functions are unknown for over 50 percent of discovered genes.
(v) Less than 2 percent of the genome codes for proteins.
(vi) Repeated sequences make up a very large portion of the human genome.
(vii) Repetitive sequences are stretches of DNA that are repeated many times, sometimes hundreds to thousands of times. They are thought to have no direct coding function, but they shed light on chromosome structure, dynamics and evolution.
(viii) Chromosome 1 has the most genes (2968) and the Y chromosome has the fewest (231).
(ix) Scientists have identified about 1.4 million locations where single-base DNA differences occur in humans. These are known as SNPs (single nucleotide polymorphisms, pronounced ‘snips’). This information promises to revolutionise the process of finding chromosomal locations for disease-associated sequences and tracing human history.
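To make the idea of a single-base difference concrete, the Python sketch below compares two short made-up sequences position by position. It assumes the sequences are already aligned and of equal length; the function name `find_snps` and the example sequences are illustrative, not real human data.

```python
def find_snps(seq_a, seq_b):
    """Return (position, base_in_a, base_in_b) for every single-base difference.

    Positions are 0-based. Assumes both sequences are already aligned
    and have the same length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b))
        if a != b
    ]


# Two hypothetical individuals differing at one position.
person_1 = "ATGCCGTAAGCT"
person_2 = "ATGCTGTAAGCT"

for pos, base_1, base_2 in find_snps(person_1, person_2):
    print(f"position {pos}: {base_1} in person 1, {base_2} in person 2")
```

Here the two sequences differ only at position 4 (C versus T), which is the kind of site that would be recorded as a SNP.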
Applications and Future Challenges
Completion of the first phase of the Human Genome Project has been compared to the discovery of antibiotics because it has opened up a vast database of knowledge about various aspects of the human genome.
Soon we shall be mapping all the human genes, all sequences, transposons and junk DNA.
More than 1200 genes are implicated in common cardiovascular ailments, endocrine diseases like diabetes, Alzheimer’s disease, cancers and other neurological ailments. Once these genes have been characterised, it may become possible to find ways to alter them and reduce the risk of these disorders.
Single-gene defects cause a number of hereditary diseases, which could then be corrected.
It will be possible to study interactions between various genes and proteins, as well as the mechanisms by which tissues, organs and tumours form and by which an organism switches between developmental stages.
It holds the promise of healthier and longer lives, designer drugs and genetically modified diets tailored to the needs of individual human beings.