Nucleotide Excision Repair & Single Nucleotide Polymorphism


Nucleotide excision repair

The excision repair mechanism is a standard method for repairing mismatched or damaged DNA. The process involves cutting out, removing and re-synthesizing the mismatched or damaged portion of the DNA. Three distinct excision repair mechanisms have been described: mismatch repair, base excision repair and nucleotide excision repair. All three follow the same simple sequence of cutting, duplicating and ligating stages. In the cutting stage, an enzyme complex removes the mismatched base or the damaged fragment of DNA.

In the replicating or duplicating stage, a DNA polymerase (usually DNA polymerase I in E. coli) copies the template strand to replace the mismatched or damaged fragment. The polymerase starts DNA synthesis from the free 3′ end left at the site of the damage or mismatch. Finally, in the ligation stage, DNA ligase seals the remaining nick, completing the repair and producing intact DNA.

In the nucleotide excision repair (NER) mechanism, the mismatched or damaged nucleotide is removed along with its neighbouring nucleotides and replaced with nucleotides synthesized using the undamaged DNA strand as a template. NER eliminates pyrimidine dimers formed after exposure to UV radiation, as well as bulky chemical adducts. The common feature of the damage repaired by NER is that the damaged or modified nucleotides cause a significant distortion of the DNA double helix. NER occurs in practically all life forms.

The enzymes catalyzing NER in E. coli are the UvrABC excinuclease and the UvrD helicase. The genes encoding these enzymes were first identified through mutants that are highly sensitive to damage induced by UV light. Wild-type E. coli cells, in contrast, are killed only by prolonged exposure to UV radiation.

The mutant strains are considerably more sensitive to UV radiation because they are defective in functions required for UV resistance. By collecting a large number of such mutants and testing, in different combinations, their capacity to restore UV resistance, researchers identified four complementation groups. These correspond to the genes uvrA, uvrB, uvrC and uvrD, whose protein products play a significant role in the NER mechanism.

The enzymes encoded by the uvr genes have been studied exhaustively. The uvrA, uvrB and uvrC genes code for the subunits of the UvrABC excinuclease, a multisubunit enzyme. The UvrABC complex recognizes damage-induced structural changes in the DNA, such as pyrimidine dimers, and then cuts the DNA on both sides of the damage.

Next, UvrD (also known as helicase II), the product of the uvrD gene, unwinds the DNA and helps release the damaged fragment. In this system, therefore, the UvrABC and UvrD proteins carry out the cutting and excision of the damaged DNA. The gap created by the excision is filled by DNA polymerase, and the newly synthesized segment is sealed by DNA ligase.

In summary, the UvrABC proteins form a complex that identifies the damage and makes endonucleolytic cuts on both sides of it. The helicase activity of UvrD helps remove the excised fragment. The undamaged strand then serves as the template for DNA polymerase to fill the gap and restore the duplex DNA, so the newly formed segment no longer carries the damage.

In more detail, a UvrA dimer (UvrA2) and UvrB form a (UvrA)2 UvrB complex that recognizes the damaged site. UvrA, which acts as an ATPase, hydrolyzes ATP and then dissociates. After UvrA leaves, UvrB forms a complex with UvrC at the site of damage, and this UvrBC complex is the active nuclease.

It cuts the DNA on both sides of the damage, again using ATP. The sugar-phosphate (phosphodiester) backbone is cleaved eight nucleotides away from the damage on the 5′ side and 4-5 nucleotides away on the 3′ side. Finally, the helicase UvrD unwinds the DNA and removes the cleaved fragment, which then dissociates from the UvrBC complex. All three of the steps described above require ATP hydrolysis.
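The cut, excise, fill and seal sequence described above can be sketched as a toy string model. This is only an illustration under stated assumptions, not a model of the real enzymes: the function name `excision_repair`, the "X" marker for damaged bases, the example sequence and the choice of 8 and 4 flanking nucleotides are all assumptions made for the example.

```python
# Toy model of excision repair: cut, excise, resynthesize, ligate.
# Assumptions (not from the article): strands are plain strings, index i of the
# template pairs with index i of the damaged strand, and damaged bases are
# marked with "X".
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-pairing partner of each nucleotide in seq."""
    return "".join(COMPLEMENT[base] for base in seq)


def excision_repair(damaged, template, five_prime_offset=8, three_prime_offset=4):
    """Excise the lesion plus flanking nucleotides and refill the gap."""
    lesion_start = damaged.index("X")
    lesion_end = damaged.rindex("X") + 1
    # Cutting stage: one incision on each side of the lesion.
    cut5 = max(0, lesion_start - five_prime_offset)
    cut3 = min(len(damaged), lesion_end + three_prime_offset)
    excised = damaged[cut5:cut3]
    # Duplicating stage: copy the undamaged template across the gap.
    patch = complement(template[cut5:cut3])
    # Ligation stage: join the new patch to the flanking DNA.
    repaired = damaged[:cut5] + patch + damaged[cut3:]
    return excised, repaired


original = "ATGCGTACGTTAGCTAGGCTAACG"
template = complement(original)                 # undamaged partner strand
damaged = original[:9] + "XX" + original[11:]   # two adjacent T's become a dimer

excised, repaired = excision_repair(damaged, template)
print(excised)               # TGCGTACGXXAGCT
print(repaired == original)  # True
```

The repaired strand matches the original only because the partner strand was undamaged, which is the reason excision repair depends on an intact template.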

Figure: Nucleotide Excision Repair Mechanism

The NER system is highly active in mammalian cells, as it is in many other organisms. The DNA of a typical skin cell exposed to sunlight would accumulate thousands of dimers each day if this repair process did not eliminate them. One human genetic disease, xeroderma pigmentosum (XP), is a skin disorder caused by defects in the enzymes that remove UV-induced DNA lesions.

Fibroblasts from XP patients are extremely sensitive to UV radiation when grown in culture, just like the uvr mutants of E. coli. These XP cell lines can be grown in culture and tested for the capacity to restore resistance to UV damage.

NER operates in two modes in mammals, yeast and bacteria:

– global genome repair, which acts on the entire genome.

– transcription-coupled repair, which acts on DNA that is being transcribed.

The XPC protein, the product of one of the XP genes, forms a complex with the hHR23B protein that can sense DNA damage.

In transcription-coupled NER, RNA polymerase stalls at the damaged site on the template strand; this stalling may itself serve as the damage-recognition step for this mode of NER. One of the basal transcription factors that accompanies RNA polymerase II (TFIIH) plays an essential role in both kinds of NER. A rare human genetic disorder, Cockayne syndrome (CS), is also associated with a defect in transcription-coupled repair.

Two complementation groups, CSA and CSB, have been identified. Determining the nature and activity of the proteins they encode will give further insight into the repair of transcribed DNA. The phenotype of CS patients is pleiotropic and includes premature ageing, severe developmental and neurological disorders and sensitivity to light. These symptoms are more severe than those of XP patients with no detectable NER. This indicates that the CS proteins, which act in transcription-coupled repair, have functions beyond nucleotide excision repair.

Several other genetic diseases result from deficiencies in DNA repair, for example Fanconi’s anaemia and Bloom’s syndrome. These are active areas of research. A good resource for up-to-date information on genetic diseases is the Online Mendelian Inheritance in Man (OMIM) portal.

Ataxia telangiectasia (AT) illustrates the effect of alterations not only in proteins that carry out repair but also in proteins involved in the signalling required for proper repair of damaged DNA. AT is characterized by ataxia (unsteady gait), telangiectasia (dilation of blood vessels in the eyes and face), premature ageing, immune deficiencies, intellectual disability, cerebellar degeneration and an increased susceptibility to malignancies.

This last phenotype draws particular attention to the locus, since heterozygotes, who make up about 1% of the population, also appear to be more prone to malignancies. The gene that is mutated in AT is called ATM.

The ATM gene does not seem to encode a protein that participates directly in DNA repair (unlike the genes that cause XP when mutated). Instead, AT appears to result from a defect in a cell signalling pathway, judging from the similarity of the ATM protein to other proteins. The product of the ATM gene may also be involved in cell cycle progression and in regulating the length of telomeric DNA.

The C-terminal domain of the ATM protein shows homology to phosphatidylinositol-3-kinase (which also has Ser/Thr protein kinase activity); this points to a role in signalling pathways. The ATM protein also shows homology to the DNA-dependent protein kinase, which requires breaks or gaps in the DNA to show its kinase activity. These findings suggest that the ATM protein is involved in targeting the nucleotide excision repair machinery.

Single Nucleotide Polymorphism

A single nucleotide polymorphism (SNP, often pronounced “snip”) is the most common type of genetic variation in human DNA. An SNP is a variation in a single nucleotide at a specific position in the DNA, for example, a cytosine (C) replaced by a thymine (T) in a polynucleotide strand.
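As a minimal sketch of what a single-nucleotide difference looks like in data, the snippet below compares two short sequences position by position. The function name `find_snps` and both example sequences are hypothetical, and the sequences are assumed to be already aligned and of equal length. A real SNP call would also require the variant to recur across a population, which this sketch does not check.

```python
def find_snps(reference, sample):
    """List single-base differences between two aligned sequences.

    Returns (position, reference_base, sample_base) tuples with 1-based
    positions. Assumes the sequences are already aligned.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


reference = "AAGCCTA"
sample = "AAGCTTA"   # the C at position 5 is replaced by T
snps = find_snps(reference, sample)
print(snps)  # [(5, 'C', 'T')]
```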

On average, an SNP occurs about once in every 1,000 nucleotides, which means that roughly 4 million SNPs are present in the genome of every human being. After examining genomes from populations around the world, researchers have catalogued more than 100 million SNPs. Most of them lie in non-coding DNA, between genes or within introns.

SNPs can serve as biomarkers that help researchers locate genes associated with a particular disease. Sometimes an SNP occurs within an exon or a regulatory region of a gene, where it can directly affect the gene’s function. Such an SNP may play a direct role in causing the disease.

Figure: Demonstration of Single Nucleotide polymorphism

Most SNPs have no effect on an individual’s health; the exceptions are the few SNPs that lie in exons or regulatory regions and influence gene function. SNP analysis is nevertheless an important tool for studying human health, because it offers a way to predict a genetic predisposition to developing a disease.

SNP analysis is helpful in predicting disease risk, susceptibility to toxins and other environmental factors and an individual’s response to drugs, and in tracking the inheritance of disease-linked genes within a family. Scientists are working to identify SNPs linked to chronic diseases such as cancer, heart disease and diabetes. When an SNP is linked to a trait, the neighbouring stretches of DNA can be examined to find the genes responsible for that trait.

Applications of SNPs

SNP analysis is used for genotyping, detecting copy number changes, genome-wide analysis, and detecting cancer mutations and other disease-associated variants. SNP microarray technology can detect both polymorphisms and dosage changes in an individual’s DNA, including small changes in gene copy number.

SNP microarrays used for genotyping can also detect loss of heterozygosity, other changes in the genome and changes in the methylation pattern of the DNA of cancer cells.

SNP microarrays are also used for disease prediction and for identifying tumour suppressor genes and oncogenes in cancer cells. SNPs therefore have good scope in selecting pharmacologically active molecules, disease prognosis, malignancy risk assessment and similar applications.

How many nucleotides make up a codon?

A codon is a sequence of three consecutive nucleotides in a polynucleotide strand of DNA or RNA. Most codons specify an amino acid to be incorporated into the protein, while a few, known as stop codons, end the translation process. DNA and RNA are written in a four-letter language of nucleotides, whereas the language of proteins uses 20 amino acids. Codons provide the key that allows one language to be translated into the other.

Every codon specifies an amino acid, except for the three codons that stop translation. The entire set of codons is known as the genetic code. A three-letter code built from the four nucleotides of DNA gives 4 × 4 × 4 = 64 possible codons.
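The count of 64 can be checked by simply listing every triplet. The sketch below uses the RNA letters A, C, G and U; the variable names are chosen for illustration only.

```python
from itertools import product

NUCLEOTIDES = "ACGU"

# Every ordered choice of three letters from the four-letter alphabet.
codons = ["".join(triplet) for triplet in product(NUCLEOTIDES, repeat=3)]

print(len(codons))   # 64
print(codons[:5])    # ['AAA', 'AAC', 'AAG', 'AAU', 'ACA']
```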

Figure: Codon (a triplet sequence of nucleotides) is recognized by specific tRNA (transfer RNA) molecule which brings standard amino acids to the site of protein synthesis.

Of the 64 codons, 61 code for specific amino acids, while the remaining three are stop codons. For instance, the codon AUG codes for the amino acid methionine, and UAA is a stop codon. The genetic code is described as degenerate because most amino acids are coded by more than one codon. It is also non-overlapping, which means that the codons are read one after another without reusing or skipping nucleotides.
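These properties can be illustrated with a small translation sketch. It builds the standard genetic code from the compact one-letter-per-codon string used in public translation tables (codons ordered U, C, A, G at each position, with "*" marking a stop). The function name `translate` and the example mRNA are assumptions made for the example, and the reader is assumed to start at the first nucleotide rather than searching for a start codon.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code: one letter per codon in UCAG order, "*" = stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read non-overlapping codons from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
leucine_codons = [c for c, aa in CODON_TABLE.items() if aa == "L"]

print(stop_codons)                 # ['UAA', 'UAG', 'UGA']
print(len(leucine_codons))         # 6
print(translate("AUGUUUGGCUAA"))   # MFG
```

Leucine having six codons is an example of degeneracy, and the step of three in the loop is what makes the reading non-overlapping.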

Genetic code

The genetic code is the set of rules defining how the four-letter code of DNA is translated into the 20 standard amino acids, the building blocks of the proteins of our body. The code is read in groups of three nucleotides known as codons, each of which corresponds to a particular amino acid or a stop signal.

The concept of the codon was first described by Francis Crick and his colleagues in 1961. In the same year, Marshall Nirenberg and Heinrich Matthaei carried out experiments to decipher the genetic code. They showed that the RNA sequence UUU codes specifically for phenylalanine (one of the 20 standard amino acids). Following this finding, Nirenberg, Philip Leder and Har Gobind Khorana worked out the rest of the genetic code, assigning every three-letter codon to its amino acid.

Figure: Genetic Code

There are 64 possible three-letter codons that can be formed from the four nucleotides. Of these 64 codons, 61 correspond to standard amino acids, and the remaining three are stop codons. Although each codon specifies only one amino acid (or a stop signal), the genetic code is described as redundant because a single amino acid can be coded by multiple codons.

Besides, the genetic code is nearly universal, with a few rare exceptions. For example, mitochondria use a genetic code that differs slightly from the one used for nuclear genes.


In this article, we have discussed SNPs and the nucleotide excision repair mechanism.

Interview Q & A related to this article

Q1. What is the main difference between a gene and a nucleotide sequence?

Answer: A gene is a functional unit of DNA (a sequence that can be expressed, for example as a protein), while a nucleotide is the structural unit of DNA (the building block added during synthesis).

Q2. What is the difference between an SNP (single nucleotide polymorphism) and a mutation? What does polymorphism mean?

Answer: An SNP is a variation in a single nucleotide at a specific position in the DNA, for example, a cytosine (C) replaced by a thymine (T) in a polynucleotide strand.

A mutation is any change in the DNA sequence. The change may affect a single nucleotide or multiple nucleotides.

Polymorphism is the property of DNA (or anything else) of existing in more than one form.

Q3. Can a single nucleotide have both deoxyribose and ribose? 

Answer: A single nucleotide can have only one type of sugar: either ribose or deoxyribose.

Q4. How many nucleotides are present in the DNA of a bacteriophage?

Answer: The DNA of a bacteriophage contains thousands of nucleotides. For example, the genome of bacteriophage φX174 contains 5,386 nucleotides.

Q5. A protein is produced that contains seven amino acids. What will be the length of the mRNA?

Answer: A codon of three nucleotides codes for one amino acid.

So, seven amino acids will be coded by 7 × 3 = 21 nucleotides.

But the mRNA also contains a stop codon (3 nucleotides) to end the process of translation.

Thus, the coding region of an mRNA for seven amino acids will contain at least 21 + 3 = 24 nucleotides.
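The same arithmetic can be written as a one-line helper. The function name `min_coding_length` is hypothetical, and the result is a minimum for the coding region only, since a real mRNA also carries untranslated sequence on either side.

```python
def min_coding_length(num_amino_acids, codon_length=3):
    """Nucleotides needed for the amino acid codons plus one stop codon."""
    return (num_amino_acids + 1) * codon_length


print(min_coding_length(7))  # 24
```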

Q6. How many water molecules are removed during the formation of a nucleotide?

Answer: Two water molecules will be removed during the formation of a nucleotide.

One water molecule is removed when the nitrogenous base binds with the ribose sugar, and the other water molecule is released when ribose sugar binds to the phosphate group.

Q7. Is adenine a nucleotide or a nitrogenous base?

Answer: Adenine is a nitrogenous base, while a nucleotide such as adenosine monophosphate contains a phosphate group, a ribose sugar and a nitrogenous base (adenine).

Q8. If DNA is made of 6 nucleotides instead of 4, what is the total number of triplet codons possible?

Answer: Number of triplet codon combinations = (types of nucleotides)^3

If DNA contains six kinds of nucleotides, then the total number of triplet codon combinations will be 6^3 = 216

Q9. How are single nucleotide polymorphisms identified?

Answer: DNA microarrays can easily identify SNPs.

Q10. How does a virus express multiple proteins from the same nucleotide sequence?

Answer: Viruses do this by two mechanisms:

– Alternative splicing

– Gene overlapping

Q11. Why is the amino acid sequence so much shorter than the nucleotide sequence?

Answer: Each amino acid is encoded by a codon of three nucleotides, so the amino acid sequence is at most one third as long as the coding sequence. In addition, the mRNA formed after transcription undergoes splicing: the non-coding regions (introns) are removed, and the mature mRNA contains only the exons.

Q12. Name the nucleotide with three phosphate groups.

Answer: A nucleotide with three phosphate groups is known as a nucleoside triphosphate. Examples are ATP (adenosine triphosphate) and GTP (guanosine triphosphate).
