Every time a cell divides asexually, each daughter cell must contain an identical copy of the mother cell's genome.  The original strands serve as templates from which the daughter strands are synthesized.  Once replication starts, it continues until the genome is copied, and cell division cannot occur until replication is over.   DNA replication is similar in all organisms, suggesting that all life on earth today has only modified the same ancestral mechanisms.  Archaebacteria possess some characteristics which were formerly thought to exist in eukaryotes alone.  Eukaryote genomes are more complex and DNA replication in eukaryotes is likewise more complex.


1) STEP 1: An Origin of Replication is Found

     In E. coli there is one origin on the chromosome (a region of 245 base pairs called OriC) which is recognized by initiation proteins that open up the double helix.  Because eukaryotic chromosomes are so much larger, it would require 830 hours to replicate one if there were only one origin of replication.  As a result, eukaryotes have many origins of replication for each chromosome (up to 60,000 for the sum of mammalian chromosomes); each region which is replicated from a certain origin is called a replicon.  Replicons vary from around 40 to 100 kb in eukaryotes.  If all replicons were active at once, a mammalian S phase could be completed in an hour; typically it requires around 6 hours since usually only around 15% of replicons are active at any one time, although there are some exceptions (such as in Drosophila embryos).  Six proteins are required to form the origin recognition complex (ORC); they are conserved throughout eukaryotes from yeast through humans.
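
The arithmetic behind these figures can be sketched in a few lines of Python.  This is only a back-of-the-envelope illustration: the chromosome length of 200 million base pairs and the genome size of 3 billion base pairs are assumed round numbers, not values from this text, and the fork speed is the eukaryotic rate of 2,000 bases per minute quoted later in this article.  The function names are hypothetical.

```python
# Back-of-the-envelope replication timing (illustrative assumptions only).

FORK_RATE = 2_000            # bases per minute per fork (eukaryotic rate quoted in this article)
CHROMOSOME_BP = 200_000_000  # ASSUMED length of a large chromosome
GENOME_BP = 3_000_000_000    # ASSUMED round figure for a mammalian genome


def single_origin_hours(length_bp, fork_rate=FORK_RATE, forks=2):
    """Hours to copy a chromosome from one origin with forks moving in both directions."""
    minutes = length_bp / (fork_rate * forks)
    return minutes / 60


def mean_replicon_kb(genome_bp, n_origins):
    """Average replicon size in kb if the origins are spread evenly."""
    return genome_bp / n_origins / 1_000


def s_phase_hours(hours_if_all_active, fraction_active):
    """S phase length if only a fraction of replicons is active at any one time."""
    return hours_if_all_active / fraction_active


print(round(single_origin_hours(CHROMOSOME_BP)))  # 833, close to the 830 hours quoted
print(mean_replicon_kb(GENOME_BP, 60_000))        # 50.0, within the 40 to 100 kb range
print(round(s_phase_hours(1.0, 0.15), 1))         # 6.7, close to the 6 hours quoted
```

Under these assumptions the three quoted numbers (830 hours, 40 to 100 kb replicons, and a 6 hour S phase) are mutually consistent.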

     In eukaryotes, the origin recognition complex is a complex of proteins (six in yeast, four of which have homologs in humans) which binds the DNA at the site where replication begins.  The helicase loading factor Cdc6/Cdc18 (two homologs known from two different species of yeast) is in the same gene family as two of the proteins of the origin recognition complex.  There are similarities between Cdc6/Cdc18 and the prokaryotic helicase loader DnaC.  CDKs target Cdc6/Cdc18.  Six members of the minichromosome maintenance family (MCMs) seem to function as a helicase homologous to that of prokaryotes (Leatherwood, 1998).


  The genes for humans are designated as follows:


ORC1L is the largest subunit.  It requires G1-cyclin dependent kinase activity and its concentration varies throughout the cell cycle.




ORC3L possesses helix-loop-helix motifs.






ORC6L is the smallest subunit.  During mitosis it localizes to kinetochores and functions in chromosome separation.  Inactivating ORC6L through silencing RNA leads to multipolar spindles, multinucleate cells, and other abnormalities.



2A) Unwinding and Separating the Strands

     Two molecules of the enzyme helicase bind to the initiator proteins and move along the strands in each direction, separating the strands. 



      Helicases typically use ATP to unwind DNA or RNA (thus using ATP and binding to nucleotides are characteristics which they share with ABC proteins, another gene family to which they are related).  Those which bind DNA can function in DNA replication, DNA repair, and even control of gene expression.  Those which bind RNA can function in transcription, translation, RNA processing, and transport of mRNA from the nucleus.  There are a variety of helicases in the human genome (estimated 40-50?), many of which are members of subfamilies which have ancient origins and are widely distributed among organisms.


a) RecQ-like family

     These 5 proteins are homologs of the helicase in E. coli which unwinds DNA.

            --RecQ-like 2 mutations are the cause of Werner syndrome which causes premature aging; children can show many symptoms of old age and can die before they reach their teens

            --RecQ-like 3 mutations cause genetic instability characterized by hypermutability and high levels of recombination.  The normal gene seems to encode an anti-recombinase that suppresses tumor formation.

            --RecQ-like 4 mutations cause the premature aging of Rothmund-Thomson syndrome.


b) DEAD/H Box RNA helicases

     This family has a number of conserved regions, including an Asp-Glu-Ala-Asp (DEAD) region. 

            --DDX6 is also known as Oncogene Rck and is overexpressed in some tumors

            --DDX9 can unwind double-stranded DNA and RNA

            --DDX15 is a pre-mRNA splicing factor

            --DDX19 is located in nuclear pores and functions in mRNA transport

--BRCA-interacting protein interacts with BRCA1; mutations in this gene cause early onset breast cancer



c) DBX and DBY are located on the sex chromosomes; DBX is not inactivated in the formation of Barr bodies


d) Nucleolar Protein H is a helicase expressed in the fetal and adult brain.


e) Eukaryotic Translation Initiation Factors EIF4A and EIF4B

--These proteins bind to the mRNA cap and unwind the secondary structure of mRNA to begin translation.  EIF4A is overexpressed in some tumors.


f) C10ORF2

     This gene encodes the protein twinkle which is a helicase active in the mitochondrial nucleoid regions.  Mutations cause ophthalmoplegia.



g) ATRX encodes a helicase enzyme whose mutations have been linked to α-thalassemia, X-linked mental retardation, and genital abnormalities which in some cases result in complete male to female sex reversal.  It is present on the X chromosome in humans and mice but it is present on both the X and Y in marsupials.   In marsupials, the Y chromosome copy (ATRY) is expressed only in the testis while ATRX is expressed widely throughout the body.  In placental mammals, ATRX is expressed throughout the body and in the testes; apparently the ATRY gene was lost in the evolution of the placental Y chromosome.  While SRY initiates the cascade of signals resulting in male-specific fetal development in most placental mammals, some placental mammals have lost SRY, there is no evidence that marsupial SRY is involved in sex determination, and no SRY homolog has yet been found in monotremes.  Given the mutations of ATRX in humans that result in sex reversal and the testis-specific ATRY expression in marsupials, it is possible that ATRY was an ancestral signal for male-specific development which is now downstream of the SRY signal of placental mammals (Pask, 1999).


h) CHD family

--CHD1 binds DNA and may be involved in chromosome structure and transcription.

--CHD3 mutations cause dermatomyositis.

--CHD4 is involved in nucleosome remodelling.


i) HELLS is a lymphoid specific helicase required for T cell proliferation.


j) HELZ is a helicase with a zinc finger domain involved in embryonic development.


k) RuvBL genes (1 and 2)

    These helicases are homologous to bacterial helicases involved in recombination and double stranded break repair.



XH2 is a transcriptional regulator which, when mutated, can cause mental retardation (ATR-X syndrome) and α-thalassemia (Gibbons, 1995).



2A) Unwinding and Separating the Strands

DNA topoisomerases are required for many essential aspects of DNA action including DNA replication, transcription, recombination, and chromosome condensation. The main families of these enzymes [Topo IA, Topo IB, Topo IC (Topo V), Topo IIA and Topo IIB (Topo VI)] are non-homologous and apparently have independent origins (Forterre, 2007).

Type I topoisomerases introduce breaks into one of the DNA strands. The subfamilies Topo IA, Topo IB, and Topo IC (Topo V) arose independently. Topo IA enzymes are known from all three domains of life and some viruses. Topo IB enzymes are not homologous to Topo IA enzymes and can relax both positive and negative supercoiling. Topo IB enzymes are primarily known in eukaryotes although they are also present in poxviruses and some bacteria. Topo IC is known only in an archaeon which grows optimally above 100 degrees Celsius.

Type II topoisomerases break both DNA strands and force another double-stranded DNA molecule through the break. These enzymes can wind and unwind DNA and knot and unknot circular DNA strands. All require ATP. Topo IIA enzymes are found throughout diverse organisms while gyrases are known in bacteria. Plants apparently obtained the enzyme from the cyanobacterial ancestor of chloroplasts. Topo IIB is known from archaea, plants, and a few bacteria and is unrelated to Topo IIA (Forterre, 2007).

Like the DNA polymerases, the unusual distribution of topoisomerases throughout the domains of life supports the idea that LUCA was functioning in an RNA world and separate mechanisms to manage DNA evolved separately. Since topoisomerases and polymerases which are unique to viruses are known, it is possible that DNA viruses introduced DNA into RNA-based precells (Forterre, 2007).

Unwinding DNA is a problem since it is difficult to unwind a very long, tightly wound molecule in a small space.  DNA molecules can be wound or supercoiled either in the same direction as the helix twists (positive supercoiling) or in the opposite direction (negative supercoiling).  Thus the same sequence can exist in several differently coiled forms called topoisomers; the enzymes which control this twisting are called topoisomerases.  Topoisomerases twist and untwist the DNA ahead of and behind replication by making cuts in either one strand or both strands that are resealed after the new twisting has occurred. This prevents tangles and keeps the molecule from whipping around at 3000 rpm.  Topoisomerases are needed for DNA strand separation, chromatin compaction, DNA repair, and recombination. Topoisomerase 1 makes a break in 1 DNA strand; topoisomerase 2 breaks both strands.  Topoisomerases are abundant proteins; it is estimated that HeLa cell nuclei possess 100,000 topoisomerase molecules, which is about 0.1% of the nuclear extract.   There are several human topoisomerases.
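
The scale of the problem can be illustrated with a little arithmetic.  The sketch below assumes roughly 10.5 base pairs per helical turn (a standard figure for B-form DNA that is not given in this text) and uses the E. coli rate of 25,000 bases per minute quoted later in this article; the function names and the 4,200 bp example circle are hypothetical.  It also records the usual textbook rule of thumb that type I enzymes change the linking number in steps of one and type II enzymes in steps of two.

```python
BP_PER_TURN = 10.5  # ASSUMED helical repeat of B-form DNA (not from this text)

# Linking-number change per catalytic cycle (textbook rule of thumb).
STEP = {"type I": 1, "type II": 2}


def rotation_rpm(bases_per_minute, bp_per_turn=BP_PER_TURN):
    """Turns per minute the helix would have to make as the fork advances."""
    return bases_per_minute / bp_per_turn


def relaxed_linking_number(length_bp, bp_per_turn=BP_PER_TURN):
    """Linking number of a relaxed closed circle of the given length."""
    return round(length_bp / bp_per_turn)


def supercoil_sign(linking_number, length_bp):
    """Classify a topoisomer relative to the relaxed state."""
    delta = linking_number - relaxed_linking_number(length_bp)
    if delta > 0:
        return "positive"
    if delta < 0:
        return "negative"
    return "relaxed"


print(round(rotation_rpm(25_000)))    # 2381 turns per minute, the same order as 3000 rpm
print(relaxed_linking_number(4_200))  # 400
# Three cycles of a type II enzyme that each remove two links:
print(supercoil_sign(400 - 3 * STEP["type II"], 4_200))  # negative
```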


TOPO1 mutations may be involved in Fanconi Anemia and ataxia telangiectasia.  One polymorphism determines sensitivity to the anti-tumor drug camptothecin. 


TOPO1, mitochondrial, is homologous to nuclear TopoI.


TOPO2A is involved in chromatin changes which occur during transcription.  One polymorphism affects sensitivity to the drug amsacrine.


TOPO2B exists as two separate genes whose isoenzymes are 72% identical.  In mice, homozygous recessive mutants die before birth.


TOPO3A and TOPO3B are involved in meiotic recombination.  In mice, TOPO3A mutations are lethal while TOPO3B mutants survive but with a shorter life than normal.    


Homolog to Topoisomerase-Related Function Protein 4-2 is needed for sister chromatid cohesion.  Variants of this protein affect sensitivity to DNA damage by certain drugs.


One of the DNA polymerases (POLS, mentioned below) is also known as the topoisomerase-like protein.


2) Stabilizing Single Stranded DNA

Once single stranded DNA is created, single stranded binding proteins (ssb) stabilize the strands.  The single stranded binding proteins form a gene family and they share common intron/exon boundaries.  All of them can be alternatively spliced.  Humans have 4 SSBP genes.

SSBP1 exists as a homotetramer in mitochondria.




3)  RNA primers are created

A new DNA strand cannot be begun from nothing; new DNA bases have to be added to an existing molecule.  One class of RNA polymerase called primase adds a few RNA bases to the template strand to create a short RNA primer (in humans, 8 bases beginning either with A or G; in Drosophila 7-9 bases beginning with A; in yeast 8-12 beginning with A).  This RNA polymerase is only used in DNA replication.  Helicase and primase are bound together in a moving structure known as the primosome.  Humans have 3 primase genes.
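
Primer synthesis can be mimicked with simple string handling.  In the sketch below the template is written 3' to 5' so that the primer comes out 5' to 3'; the function name, the example sequence, and the rule of scanning for the first position that yields a primer starting with A or G are illustrative assumptions rather than a description of how primase actually chooses its start site.

```python
# DNA template base -> complementary RNA base
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}


def make_primer(template_3to5, length=8, allowed_first="AG"):
    """Return (start index, RNA primer written 5'->3') for the first usable site.

    Returns None when no window of the requested length gives a primer
    whose first base is one of `allowed_first`.
    """
    for start in range(len(template_3to5) - length + 1):
        window = template_3to5[start:start + length]
        primer = "".join(RNA_PAIR[base] for base in window)
        if primer[0] in allowed_first:
            return start, primer
    return None


template = "GATCCGTAAGCTTGCA"  # hypothetical template, written 3'->5'
print(make_primer(template))   # (2, 'AGGCAUUC')
```

The defaults follow the human figures given above (8 bases, beginning with A or G); passing `allowed_first="A"` and a different `length` would mimic the Drosophila or yeast figures.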


Primase Polypeptide 1; PRIM1 is 49 kD in size.


PRIM2A is 55 kD in size and PRIM2B is 58 kD.  Both are located in 6p12-11.



     New DNA nucleotides which complement the template strand are attached by DNA polymerase.  There are 3 DNA polymerases in E. coli (I, II, III) and 5-6 in eukaryotes.  DNA polymerase III (or DNA polymerase delta in eukaryotes) is the primary enzyme which attaches new nucleotides; this huge protein has 10 subunits.  In E. coli, 25,000 new bases can be inserted per minute; in eukaryotes 2,000 per minute.  DNA polymerases can function in both replication and DNA repair.  Why repair?  Most DNA polymerases not only have the ability to bind new nucleotides to an elongating strand, they can also remove incorrect bases which have just been inserted; this ability allows them to proofread the new chain for mismatches (which may bulge).  With this preliminary proofreading (there are other mechanisms), a mistake is made about once every million base pairs; without it a mutation might occur once per thousand bases.  DNA polymerases can only insert bases in the 5' to 3' direction (remember this refers to which end of the strand has a free 5' carbon and which has a free 3' carbon).
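
Template-directed synthesis and the payoff from proofreading can both be illustrated with a short sketch.  The two error rates are the ones given above (about one mistake per thousand bases without proofreading, one per million with it); the genome size of 4.6 million base pairs for E. coli is an assumed round figure that does not come from this text, and the function names and example sequences are hypothetical.

```python
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

ECOLI_BP = 4_600_000  # ASSUMED genome size, for illustration only


def synthesize(template_3to5):
    """Build the new strand 5'->3' by pairing a base with each template base in turn."""
    return "".join(DNA_PAIR[base] for base in template_3to5)


def mismatches(template_3to5, new_5to3):
    """Positions where the new strand fails to pair with the template.

    These are the sites a proofreading exonuclease would need to correct.
    """
    return [
        i
        for i, (t, n) in enumerate(zip(template_3to5, new_5to3))
        if DNA_PAIR[t] != n
    ]


def expected_errors(genome_bp, bases_per_error):
    """Average number of mistakes per round of replication."""
    return genome_bp / bases_per_error


print(synthesize("TACGGA"))                  # ATGCCT
print(mismatches("TACGGA", "ATGACT"))        # [3]
print(expected_errors(ECOLI_BP, 1_000))      # 4600.0 without proofreading
print(expected_errors(ECOLI_BP, 1_000_000))  # 4.6 with proofreading
```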

     Humans have a number of DNA polymerase genes.

POLA (DNA polymerase alpha)

     This enzyme is located on the X chromosome of placental mammals but is autosomal in marsupials and monotremes. 


POLB (DNA polymerase beta)

     This enzyme functions in base excision repair by inserting bases into the damaged area and removing the 5’ phosphate.  In mice, mutations in POLB cause a hypersensitivity to certain mutagens.  Mutations in humans may contribute to colorectal cancers and Werner syndrome.

DNA polymerase β belongs to the polX DNA polymerase family and also to the larger nucleotidyl transferase family.  These proteins have a helix-loop-helix DNA binding region.  The only nonviral nucleotidyl transferase in vertebrates is TdT, the template-independent terminal nucleotidyl transferase which is functional in the production of immunoglobulins and T cell receptors.  Yeast have homologues of both human DNA polymerase β (pol λ) and TdT (pol μ).  (Aoufouchi, 2000).


POLD1 and POLD2 (DNA polymerase delta)

     This enzyme's exonuclease activity is required in DNA repair.  Mice with mutations in POLD1 are at a higher risk of cancer.  The D1 subunit of POLD has the polymerase and exonuclease functions; the function of the smaller POLD2 subunit is unknown.  Both have been conserved in eukaryotes.


POLE (DNA polymerase epsilon)

     This enzyme is not present in all eukaryotes.  In mammals, it functions in repair and perhaps in replication.

POLE2 is involved in DNA repair and replication and is homologous to DPB2 in yeast.

POLE3 is a histone binding protein.  It binds to DNA and histones and interacts with other protein complexes in the control of transcription, replication and DNA packaging.  It is homologous to DPB4 of yeast.

POLE4 interacts with POLE 3 and is homologous to DPB3 of yeast.


POLG (DNA polymerase gamma)

     POLG forms a homotetramer and is located in both the nucleus and the mitochondria.  It is the DNA polymerase in mitochondria, and is homologous to the mitochondrial DNA polymerase in yeast.  It is most highly expressed in muscle.

     POLG2 is a small accessory subunit of POLG and is similar to the POLG beta subunit of Drosophila.  It is homologous to class II aminoacyl tRNA synthetases.


POLH (DNA polymerase eta)

     This enzyme is not as accurate as other polymerases and lacks an exonuclease function, but it allows cells to bypass areas of DNA which have been damaged by ultraviolet light.  It is homologous to other enzymes which function in UV damage repair such as UmuC and DinB of E. coli and RAD30 and Rev1 of yeast.  In humans, mutations of POLH can cause a variant type of xeroderma pigmentosum.


POLK (DNA polymerase kappa)

    This enzyme also belongs to the family of enzymes homologous to RAD30 of yeast but it differs from the others in its inability to bypass thymine dimers resulting from UV light damage.


POLI (DNA polymerase iota)

     This enzyme is also a member of the family of enzymes homologous to RAD30 of yeast.  It works in conjunction with POLZ in areas of DNA damage.  While POLI inserts nucleotides in regions opposite the damage, POLZ functions as a mispair extender.  POLI is involved in the immunoglobulin mutosome.


POLZ (DNA polymerase zeta) also called Homolog of Yeast Protein REV3 (REV3L)

     This is a nonessential enzyme in yeast which functions in damaged regions of DNA but increases the rate of mutation.  The human protein seems to function similarly.    Cells without functional POLZ grow normally; they develop fewer mutations as a result of UV light but are slightly more likely to die as a result of UV exposure.



     This enzyme’s expression is dependent upon damage to DNA.


POLM (DNA polymerase mu)

    This enzyme is most highly expressed in lymphoid tissue.  Some splice variants of both POLL and POLM produce nonfunctional proteins.


POLQ (DNA polymerase theta)



     This enzyme is homologous to the yeast enzyme TRF4.  It is important in the pairing of sister chromatids during replication.  This DNA polymerase is homologous to topoisomerases and is also known as the topoisomerase-like protein.



5) Joining Segments of DNA

DNA polymerase I removes RNA primers (RNase performs this task in eukaryotes) and then replaces them with DNA bases.  The last bond between the fragments is sealed by the enzyme ligase.  Okazaki fragments created during DNA replication must also be joined.  Humans have several ligase genes.
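
Primer removal and ligation can be pictured with a toy model in which each fragment is a string whose lowercase letters stand for the RNA primer and whose uppercase letters stand for DNA.  The representation, the function names, and the example sequences are all assumptions made for illustration; the real enzymes act on a base-paired duplex, not on free strings.

```python
# RNA primer base (lowercase) -> DNA base with the same pairing partner
RNA_TO_DNA = {"a": "A", "u": "T", "g": "G", "c": "C"}


def replace_primer(fragment):
    """Swap each lowercase RNA base for the equivalent DNA base."""
    return "".join(RNA_TO_DNA.get(base, base) for base in fragment)


def has_primer(fragment):
    """True while any RNA (lowercase) base remains in the fragment."""
    return any(base in RNA_TO_DNA for base in fragment)


def ligate(fragments):
    """Replace the primers, then seal the all-DNA fragments end to end."""
    processed = [replace_primer(fragment) for fragment in fragments]
    return "".join(processed)


# Three hypothetical Okazaki fragments, each starting with a 4-base primer.
okazaki = ["augcGATTC", "ggauCCGTA", "uacgTTAGC"]
print(ligate(okazaki))  # ATGCGATTCGGATCCGTATACGTTAGC
```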


LIG1 mutations in mice allow normal development up until hematopoiesis occurs in the liver.  In humans, mutations cause a number of physical and developmental abnormalities.


LIGIII is highly expressed in primary spermatocytes and may seal the breaks resulting from recombination.  (Ligase II and III are now recognized as the same enzyme.)


LIGIV alleles vary in their sensitivity to ionizing radiation and their effects on the recombination of antibodies.


Flap Structure-Specific Endonuclease FEN1

    This enzyme processes the ends of Okazaki fragments and the fragments occurring during DNA repair.  It has been conserved from archaebacteria through humans.  In yeast, a deletion of the homologous gene RAD27 increases genome instability and sensitivity to mutagens.



6) Termination of Replication

In E. coli there are terminator sequences and terminator proteins although replication can be completed without them.  A topoisomerase separates circular chromosomes; DNA polymerase I and ligase seal the last segments.  In eukaryotes, special events occur at the ends of chromosomes since there will be a gap at the 5' end after the primer is removed.  The enzyme telomerase prevents chromosome shortening by adding telomere repeats to the ends of chromosomes.
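
The end-replication problem and its remedy can be sketched numerically.  In the toy model below each round of replication shortens the chromosome end by the length of one primer, and telomerase, when active, adds back whole repeats.  The primer length of 8 bases is the human figure given earlier in this article; the starting telomere length, the number of repeats added per round, and the six-base repeat unit (TTAGGG, the vertebrate telomere repeat, which this text does not name) are assumptions for illustration, and the function name is hypothetical.

```python
REPEAT = "TTAGGG"  # vertebrate telomere repeat (not named in this text)


def telomere_after(divisions, start_bp=6_000, loss_per_division=8, repeats_added=0):
    """Telomere length in bp after a number of divisions in this toy model.

    Each division removes `loss_per_division` bases (one primer's worth) and,
    if telomerase is active, adds `repeats_added` copies of the repeat.
    """
    length = start_bp
    for _ in range(divisions):
        length -= loss_per_division
        length += repeats_added * len(REPEAT)
        if length <= 0:
            return 0  # telomere exhausted
    return length


print(telomere_after(50))                   # 5600, shortening without telomerase
print(telomere_after(50, repeats_added=2))  # 6200, telomerase outpaces the loss
```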


Replication Factor C1; RFC1

    RFC1 is involved in DNA synthesis and telomere stability.  It associates with BRCA1 and other proteins in a multiprotein complex involved in tumor suppression.  The ATP binding region of RFC1 is homologous to that of the other RFCs, other DNA binding proteins, and bacterial transcription factors.






Among the metazoans, only sponges retain high levels of telomerase activity in somatic cells.  It may be a related observation that even differentiated cells such as choanocytes in sponges can give rise to gametes.  Sponges can even undergo sex reversal, indicating that gender is determined by physiologic rather than genetic mechanisms (Muller, W.E.G., from Muller, 1998.89).



Geoujon, Christophe.  A common mechanism for ATP hydrolysis in ABC transporter and helicase superfamilies.  Trends in Biochemical Sciences 26:539-544, 2001.




     Eukaryotes use RecA family members for repair of double stranded breaks, and homologous recombination is essential for meiosis.  The RecA family includes RecA of eubacteria, RadA in Archaea, and Rad51 and Dmc1 in eukaryotes (Gasior, 2001).   Yeast and mammals share two homologs of RecA, Dmc1 and Rad51, whose functions share a number of similarities with that of RecA.   Rad51 mutations are embryonic lethal in mice.  Rad51 functions in protection from ionizing radiation and interacts with p53.

The bacterial recA gene has homologs in eukaryotes which apparently were introduced as a result of endosymbiotic events which established the eukaryotic lineages. Another homolog, RAD, had undergone duplication to produce a pair of genes in archaea (RADA and RADB). Additional duplications produced a series of genes in eukaryotes (RAD51, RAD51B, RAD51C, RAD51D, DMC1, XRCC2, XRCC3), some of which perform vital functions in meiosis (Lin, 2006).

     A bacterium such as E. coli may suffer up to 5,000 DNA lesions in a generation, most of which are caused by oxygen.  RecA functions in the repair of these lesions and RecA mutations can lead to a 50% mortality in bacterial cells.

Homologous recombination is required for proper segregation of chromosomes in meiosis (Cox, 1997).  The ability of homologous chromosomes to undergo genetic recombination through the action of RecA/Rad51 is an advantage that DNA has over RNA and may have been important in the establishment of DNA as the universal genetic material (Shibata, 2001).


RAD51 forms a complex with both BRCA1 and BRCA2 proteins.  Some mutations cause breast cancer while others increase the risk of breast cancer but only in those women who are carriers for BRCA mutations.


RAD51-like 1 is expressed in areas where recombination occurs, such as the ovary, testis, thymus and spleen.

RAD51-like 3 is most highly expressed in the testis.


Dmc1 is involved in the synapsis of homologous chromosomes in meiosis.  Mutations cause sterility in which meiosis stops in prophase I (Li, 1997).  


Double-strand breaks in chromosomes are one of the first steps in recombination during meiosis.  This is mediated by the Spo11 protein which belongs to a gene family which includes proteins from nematodes and even archaea.  It is similar to archaebacterial topoisomerase enzymes which are unlike topoisomerases of eubacteria and eukaryotes (Keeney, 1997).