In the cells of the developing embryo of this lancelet (a primitive chordate), a number of tasks must be performed if the DNA message is to be converted to the amino acid sequence of a protein.  Some of these tasks are performed not by proteins but by RNA molecules which are never translated into protein.  In addition to the ribosomal RNAs and tRNAs discussed previously, there are a few other kinds of functional RNA molecules in humans.  Interestingly, they are involved in processes such as RNA splicing and the formation of ribosomes, functions which the first cells would have needed to perform without the benefit of protein.  Because of the diversity of functions performed by RNA in modern cells, it is appropriate to speak of the “modern RNA world” rather than only of an “RNA World” which existed in the distant past.  In addition to the functional, non-translated RNA molecules which are already known, many functional RNAs probably have yet to be discovered because their genes are more difficult to locate than those which encode proteins (Bompfunewerer, 2005).

Some have calculated that less than 1.5% of the genome actually codes for proteins compared to the 24% composed of introns and 75% composed of intergenic DNA. Much of the genome which is not translated may produce regulatory RNAs which are capable of binding DNA, RNA, or protein (Tannenbaum, 2005).

Metazoans share a U7-snRNP mechanism which processes the pre-mRNA of histone proteins (Bompfunewerer, 2005).



Small Nuclear RNAs

snRNAs are not unique to humans: they are RNAs which function in splicing pre-mRNAs to determine which sequences will be translated into protein.

    snRNAs (small nuclear RNAs) range in size from 80 to 350 nucleotides.  They exist in all eukaryotes and form small nuclear ribonucleoproteins (snRNPs; made with snRNAs U1 through U6), which assemble into a structure known as the spliceosome that controls the splicing of pre-mRNAs to produce mRNAs.  The RNA is critical in this process: U2 and U6 can begin splicing even without the protein, and mutations in the RNA sequences affect the specificity of the splicing.  It is estimated that about 15% of the single point mutations which cause human disease affect mRNA splicing (Maniatis, 2002).  Splicing variations are a significant source of evolutionary diversity since they allow tissue-specific variants from common genes; this mechanism is considered the most significant source of protein diversity in vertebrates.

     The spliceosome is one of the most complex structures in the cell, containing about 145 proteins, many of which have roles in gene expression (other than splicing).  The components of the spliceosome are highly conserved from yeast to humans.  Introns of the pre-mRNAs are removed on the basis of 3 sequences: a 5’ splice site, a branch point, and a 3’ splice site.  snRNAs U1 through U6 (except U3, which is a snoRNA) are incorporated in the spliceosome, and two other snRNAs, U6atac and U4atac, are also present in low abundance.  Introns typically start with GT and end with AG (the GT-AG rule), and splice sites are generic and not tissue specific.  In principle, any GT site can interact with an AG site, allowing alternate splicing possibilities; some sites bind snRNPs less well than others.  About 1 in 10,000 introns begins with AT and ends with AC; these introns utilize different snRNAs.  The spliceosome can frequently produce different versions of the same original transcript through alternative splicing; the human gene RBP-MS can produce at least 12 different transcripts (OMIM; Sharp, 1985; Zhou, 2002).
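
     The combinatorial consequence of the GT-AG rule can be illustrated with a minimal Python sketch.  It is only a toy: the sequence is invented, the function names (candidate_introns, splice) and the min_len cutoff are assumptions made for illustration, and real splice-site choice also depends on the branch point and on how well each site binds the snRNPs.  The sequence is written in DNA letters, so the GT of the rule corresponds to GU in the pre-mRNA.

```python
def candidate_introns(seq, min_len=4):
    """Return (start, end) pairs such that seq[start:end] begins with GT
    and ends with AG. min_len is an arbitrary illustrative cutoff."""
    seq = seq.upper()
    # Positions of every GT (possible 5' splice site) and AG (possible 3' site).
    donors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "GT"]
    acceptors = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "AG"]
    pairs = []
    for d in donors:
        for a in acceptors:
            end = a + 2  # the intron includes the AG itself
            if end - d >= min_len:
                pairs.append((d, end))
    return pairs


def splice(seq, intron):
    """Remove one candidate intron and join the flanking sequence."""
    start, end = intron
    return seq[:start] + seq[end:]


toy_gene = "ATGGTCCAGCCGTTTAGAA"  # invented sequence, not a real gene
for intron in candidate_introns(toy_gene):
    print(intron, splice(toy_gene, intron))
```

     Even this short sequence, with two GT sites and two AG sites, yields three different candidate products, which gives a sense of how a single transcript can give rise to several alternatively spliced forms.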

     One group of snRNAs is classified as the U family (for “uridine rich”).  The snRNAs U1 to U6 (with the exception of U5) exist in multiple copies: there may be 30 functioning copies of a sequence and 15 to 30 times that number of pseudogenes (roughly 450 to 900).

     About 200,000 copies of the snRNA 7SK are produced in the nucleus.  This RNA binds to CDK9 and inhibits CDK9/cyclin T.  The snRNPs can be antigens against which the immune system reacts in lupus.  U2 sequences are among several sequences where adenovirus 12 can insert itself, sometimes leading to cancer.


     Another sequence, snRNA export adaptor RNUXA, functions in the export of snRNA precursors from the nucleus.  The phosphorylated form helps to assemble the snRNA export complex.

      Not all introns require spliceosomes to be removed: some introns remove themselves.  This is accomplished by different mechanisms, and the 2 main groups of self-splicing introns differ in their internal organization and are not related to each other.  Group I and group II introns are found in both organelles and bacteria; group I introns are also found in the nuclei of lower eukaryotes.  Such catalytic RNAs are called ribozymes.




Small nucleolar RNAs are not unique to humans.  They may very well be descendants of functional RNA sequences from very early cells.


     snoRNAs (small nucleolar RNAs) are never translated into protein; the RNA is functional and helps the maturation of the rRNA in ribosome formation.  More than thirty kinds are known.  snoRNAs can modify the bases of other RNA molecules.  snoRNAs are known in both eukaryotes and archaea.  Functional RNA molecules, especially those involved in peptidyl transfer in the ribosome, often contain modified nucleotides which have undergone 2’-O-ribose methylation and pseudouridylation.  snoRNAs guide many of these modifications (Bachellerie, 2002).  Archaea and eukaryotes use snoRNAs with a C/D box to perform 2’-O-methylation of nucleotides, while box H/ACA snoRNAs convert uridine to pseudouridine.  The core of the complex which modifies RNA nucleotides in archaea and eukaryotes is similar to part of the ribosome, suggesting a common origin (Tran, 2004).  While some snoRNAs are ubiquitous, others are tissue specific (many being found only in the brain, for example) and some are imprinted (Bompfunewerer, 2005).

     There are several families of snoRNAs; the largest group contains sequences called C and D boxes and associates with the protein fibrillarin, while others complex with the proteins NOP1, SOF1, GAR1, and SSB1.  snoRNAs exist in ribonucleoproteins (snoRNPs) after complexing with these proteins.  A few snoRNAs are widely conserved throughout organisms (such as four which humans and yeast share; three of these are present in plants as well).  Three snoRNAs have been shown to be essential for different steps of rRNA processing (Mishra, 1997).  Some snoRNAs are transcribed on their own without being contained in the introns of other genes.

     The U3 snoRNA (known from protozoa, yeast, plants, invertebrates, vertebrates, and perhaps even archaebacteria) forms a large ribonucleoprotein with 28 proteins.  It processes pre-rRNAs and seems to correspond to the terminal knobs visible on nascent pre-rRNAs.  Only 4 of 21 snoRNAs in yeast were essential for growth (the others may have functions, simply not essential ones).  Some snoRNPs modify RNA bases (Dragon, 2002; Maxwell, 1995).

     Perhaps the most interesting aspect of the snoRNAs is that most of them (with a few exceptions, such as U17A and U17B on human chromosome 1p36) are not encoded by their own genes but are rather encoded by the introns of other genes.  U108 is in an intron of LAMR1 (there are up to 16 copies of this gene in humans), U20 is in an intron of the nucleolin gene, and U21 is in an intron of the RPL5 gene.  U14 is in an intron of the heat shock 70 gene (in vertebrates).  UP73 is also encoded in the introns of a ribosomal protein gene: both humans and mice have copies in the 3rd and 4th introns, but humans only use the copy from intron 4 while mice use both.  Perhaps the most striking case is the UHG gene (U22 Host Gene).  This gene appears to make absolutely no protein, and thus its exons are functionless.  Its introns, however, code for 8 snoRNAs: U22, U25, U26, U27, U28, U29, U30, and U31.  The sequences of these introns are conserved in mice and humans while the exons (which mice don’t use either) are poorly conserved.  Could it be that the oldest introns were actually functional RNA molecules and their originally useless exons gradually became protein coding?  (Maxwell, 1995; Poole, 1998; OMIM).

     Although the snoRNA U86 is conserved from yeast through humans, it can be positioned in an intron (in humans) or in an open reading frame (yeast and frogs) (Filippini, 2001).  Eukaryotic nucleoli contain snoRNAs which function in rRNA processing.  Giardia are primitive eukaryotes which lack nucleoli but do possess the protein fibrillarin.  Giardia fibrillarin appears to interact with the snRNAs and snoRNAs of higher eukaryotes (Ghosh, 2001).



     Several small cytoplasmic RNAs called Y RNAs form Ro ribonucleoproteins.  These ribonucleoproteins are known from vertebrates and invertebrates (Matera, 1995).  They are transcribed by RNA polymerase III and range in size from 69 to 112 nucleotides.  In addition to their location in the cytoplasm, they can also be found around the nucleoli in “perinucleolar compartments”.  The human genome possesses genes for RNAY1, RNAY3, RNAY4, and RNAY5.

     Ro RNPs are known from a variety of vertebrates as well as from nematodes.  In humans, autoantibodies directed against them are associated with lupus erythematosus and Sjögren’s syndrome.  Although their function is not completely known, it seems they are involved in the production of 5S rRNA molecules (Matera, 1995).

     Y RNAs are known from frogs, and some of the human genes (Y1 and Y3 but not Y4 and Y5) are known from mice.  Y4 seems to have resulted from a duplication of Y1 or Y3 in primates.  All 4 human Y RNA genes are located on chromosome 7, and many pseudogenes exist in the genome (e.g. at least 100 Y4 pseudogenes exist in the human genome).  Their expression is highest in the heart and brain (Maraia, 1994).




    Short interfering RNAs (siRNAs) are double stranded RNA molecules about 21-25 nucleotides in length known in plants and animals.  They can inhibit gene expression by targeting specific sequences of mRNA for degradation by ribonucleases (Caplen, 2001).  RNA interference occurs in nematodes.  Exposure to small amounts of dsRNA can affect gene expression throughout a nematode’s body and even in its progeny (Hannon, 2002).
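
    How an siRNA singles out one mRNA for degradation comes down to base-pair complementarity, which can be sketched in a few lines of Python.  This is only an illustration under simplifying assumptions: the guide sequence is invented, perfect complementarity over the whole guide is required, and the function names (reverse_complement, find_target_sites) are hypothetical.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Return the sequence that would base-pair with rna, written 5' to 3'."""
    return rna.upper().translate(COMPLEMENT)[::-1]


def find_target_sites(mrna, guide):
    """Return 0-based start positions in mrna that are perfectly
    complementary to the guide strand of an siRNA."""
    site = reverse_complement(guide)
    mrna = mrna.upper()
    hits = []
    pos = mrna.find(site)
    while pos != -1:
        hits.append(pos)
        pos = mrna.find(site, pos + 1)
    return hits


guide = "UUGCCAUGGAUCCUAGCUAGC"  # invented 21-nucleotide guide strand
# Build a toy mRNA that carries one matching site after a 6-nucleotide leader.
mrna = "AUGAAA" + reverse_complement(guide) + "CCCUAA"
print(find_target_sites(mrna, guide))
```

    An mRNA lacking the complementary site returns an empty list, which mirrors the sequence specificity of the silencing described above.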

Many eukaryotic cells regulate gene activity through RNA interference, in which small interfering RNAs (siRNAs) collaborate with an RNA-induced silencing complex (RISC) to cleave specific mRNAs. MicroRNAs (miRNAs) act through a related pathway; to date, more than a thousand miRNAs have been identified in the human genome. These noncoding RNAs are transcribed by RNA polymerase II to produce primary miRNAs (pri-miRNAs), which are processed in the nucleus (by Drosha) into pre-miRNAs, which are in turn processed in the cytoplasm (by Dicer) into mature miRNAs of about 22 nucleotides (Sioud, 2007).

Plants and invertebrates utilize siRNAs to cleave viral RNAs as part of innate immunity. In mammals, it seems that the importance of siRNAs in immunity was reduced as the interferon mechanisms became more prominent (Sioud, 2007).


Small temporal RNA: stRNA

     This 21-nucleotide RNA seems to be widespread (if not universal) in bilaterian animals (including humans) but not in more primitive organisms.  In C. elegans it is involved in development and seems to function in the down-regulation of genes (OMIM).



     The signal recognition particle (SRP) is a ribonucleoprotein composed of a 7S RNA and 6 proteins.  Eubacteria possess several SRP subunits, and archaebacteria possess two which are homologous to two of the six known mammalian SRP subunits.  The ribonucleoprotein is involved in directing proteins from ribosomes to biological membranes (Hainzi, 2002).



     In the mitochondria of some protozoa, RNA transcripts are modified after transcription by small RNA molecules (guide RNA or gRNA).  Guide RNA molecules are also coded by intergenic mitochondrial DNA and form an RNA-RNA duplex at the sites which are to be edited with the addition or deletion of uracil residues (Blum, 1990a & b).
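
     The effect of this kind of editing on a transcript can be sketched as follows.  This is a deliberately simplified model, not the biochemical mechanism: in the cell the guide RNA specifies the edits through base pairing, whereas here a plain list of (position, count) pairs stands in for that information, and the function name apply_u_edits and the example sequence are assumptions made for illustration.

```python
def apply_u_edits(pre_mrna, edits):
    """Apply uridine insertions and deletions to a transcript.

    edits is a list of (index, n) pairs given in coordinates of the
    unedited transcript: n > 0 inserts n U residues before index,
    n < 0 deletes |n| U residues starting at index.
    """
    rna = pre_mrna.upper()
    # Work from the 3' end so that earlier coordinates stay valid.
    for index, n in sorted(edits, reverse=True):
        if n > 0:
            rna = rna[:index] + "U" * n + rna[index:]
        elif n < 0:
            count = -n
            if rna[index:index + count] != "U" * count:
                raise ValueError("only U residues can be deleted")
            rna = rna[:index] + rna[index + count:]
    return rna


# Invented example: insert two U residues before position 3 and
# delete two U residues starting at position 4.
print(apply_u_edits("AGGAUUUGCA", [(3, 2), (4, -2)]))
```

     Because only U residues are added or removed, the edited transcript can differ in length and reading frame from the sequence encoded in the mitochondrial DNA.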



     When a ribosome has stalled its translation of an mRNA strand, tmRNA which has been charged with alanine enters the A site, acting as a tRNA.  It then displaces the mRNA and the ribosome proceeds to translate the open reading frame of tmRNA (thus it acts as an mRNA) until a stop codon is reached (Gillet, 2001).


Given that the majority of non-coding RNAs are small and often lack strong consensus sequences, their identification has been slow (Bompfunewerer, 2005).

Many micro-RNAs (miRNAs) can base-pair to messenger RNA to facilitate or inhibit translation. Their average size is about 22 nucleotides. Bacteria such as E. coli possess 50 small RNAs which include those which regulate translation and roughly half of the transcription products from human chromosomes 21 and 22 represent non-coding RNAs. One miRNA, let-7, seems to have evolved in higher flatworms. Although no miRNAs have been identified which are conserved between animals and plants, there are forms which are conserved within animals and other forms conserved within plants (Bompfunewerer, 2005). Plant microRNAs (miRNAs) function in a variety of mechanisms including the formation of leaves, roots, and flowers and stress responses (Yang, 2007).

The longer classes of noncoding RNA tend to be about 2,000 nucleotides in size, compared to the microRNAs, which are typically about 25 nucleotides. The lack of sequence conservation in these longer RNAs is not necessarily evidence that they perform no function (Pang, 2006).

The low levels of mutation in microRNA sequences suggest that selection pressure favors the retention of the original sequences (Borenstein, 2006).

MicroRNAs are known which are complementary to both the MIR/LINE-2 elements and Alu elements. Apparently these elements affect cellular physiology enough to provide a selective advantage for elements which regulate them (Smalheiser, 2006).



      A variety of eubacteria possess an RNA molecule which can function as both a tRNA (being charged with alanine by alanyl-tRNA synthetase) and as an mRNA (Karzai, 2000).  Small noncoding RNA genes which regulate translation are known in bacteria (Bompfunewerer, 2005).  Eubacteria also possess a small number of sRNAs.  Functional viral RNA which is never translated is known as well (Bompfunewerer, 2005).


Vertebrates utilize 7SK RNP which has been shown in mammals to control transcriptional elongation (Bompfunewerer, 2005).


RNase P

Ribonuclease P (RNase P) is a complex of RNA and protein shared by all three domains of life which processes tRNA. In prokaryotes, but not eukaryotes, the RNA portion is the catalytic region and truly an enzyme. Eubacteria utilize only one protein in this complex, archaea involve more, and eukaryotes incorporate even more proteins (Evans, 2006). RNase P and MRP are endonucleases which are required for the processing of tRNAs and mitochondrial DNA replication, respectively.  RNase P is known from all kingdoms of life while MRP is known only in eukaryotes (Bompfunewerer, 2005). RNase P excises portions of the 5’ end of precursor tRNAs (ptRNAs) to enable function (Willkomm, 2007). Prokaryotic RNase P RNA is capable of function without its protein component (Hartmann, 2003). One model for RNase P evolution is that it was originally a catalytic RNA to which proteins were added in the archaeal and eukaryotic lineages. Many of these proteins are homologous. The RNA portion typically measures 350 to 450 nucleotides in length (Hartmann, 2003).


Telomerase RNA includes a core which is conserved among eukaryotes.  H/ACA box snoRNAs convert uridine to pseudouridine, and vertebrate telomerase RNA possesses this H/ACA domain (Bompfunewerer, 2005).



MicroRNAs (miRNAs) are small single-stranded chains of about 22 nucleotides.  They are known in both animals and plants and underwent expansion in the vertebrate lineages.  The functions of many of them are not yet known (Bompfunewerer, 2005).


Eukaryotes possess noncoding RNA which is similar to mRNA but is never translated (Bompfunewerer, 2005).


Xist is the signal released from the X chromosome that will be inactivated; it initiates and spreads this X inactivation.  It is repressed by its antisense sequence, Tsix (Bompfunewerer, 2005).


H19 is an imprinted gene expressed during fetal development (Bompfunewerer, 2005).

A 27-nucleotide RNA molecule produced by an intron of the endothelial nitric oxide synthase gene reduces the transcription of the gene, perhaps as a negative feedback mechanism (Zhang, 2005).

The expression of small RNAs is involved in the virulence of pathogenic Staphylococcus (Pichon, 2005).

Antisense RNAs can form double stranded RNA which can trigger mRNA degradation and block transcription (Bompfunewerer, 2005).


     Other RNA molecules are known to convert a single cytosine to uracil in mammalian apolipoprotein B mRNA, convert C to U or U to C in plant RNA, add 54 extra C nucleotides in the mRNA for ATP synthetase in Physarum mitochondria, and add G residues in paramyxovirus P mRNA (Simpson, 1990).  It is possible that these are remnants of an ancient mechanism in which RNA edited the gene content of DNA (Simpson, 1990).


In addition to functional molecules which are not translated, the RNA structure of mRNA can have significance as well.  RNA molecules may include structural motifs which guide them to certain positions in a cell, such as the mRNA of the Drosophila gene bicoid which accumulates at one pole of the cell to establish an axis for the embryo (Bompfunewerer, 2005).



RNAi is a mechanism capable of silencing virus and transposon expression (Abrusan, 2006). RNA interference is involved in the regulation of MAP-1 levels controlling apoptosis (Tan, 2005). Many transposable elements are prevented from activity by hypermethylation in both plants and animals. One study demonstrated that primate pseudogenes are more likely to be hypermethylated when they exist in multiple copies and proposed a mechanism of homology-dependent methylation which may utilize RNA silencing (Meunier, 2005).


The growing role of RNA is so significant that some have proposed that RNA is actually the brain which organizes the eukaryotic cell through its interactions with DNA and protein (Tannenbaum, 2006).