Amphipathic molecules can interact with both polar and nonpolar molecules because one side of the molecule is polar and another is nonpolar.  Detergents and the phospholipids of the cell membrane are amphipathic.  When some proteins form α-helices, polar and charged amino acids occur on one side of the helix (where they can interact with the watery environment inside the cell) while the other side of the helix possesses nonpolar amino acids that can interact with nonpolar organic molecules.  Some proteins which regulate transcription are amphipathic, including the leucine zipper proteins (which possess an array of leucine residues on the nonpolar side of the helix, through which two such helices pair to form the dimer that binds DNA) and helix-loop-helix proteins.
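One common way to put a number on how two-faced a helix is, is the hydrophobic moment: each residue contributes a vector whose length is its hydrophobicity and whose direction is its position around the helix axis.  The sketch below is only an illustration under stated assumptions: it uses the Kyte-Doolittle hydropathy scale, an ideal α-helix with 100 degrees of rotation per residue, and two made-up 18-residue peptides that are not real protein sequences.

```python
import math

# Kyte-Doolittle hydropathy values (positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(sequence, degrees_per_residue=100.0):
    """Mean hydrophobic moment of a peptide modeled as an ideal helix.

    Assumption: 100 degrees per residue, the value for an ideal alpha helix.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    x = y = 0.0
    for i, residue in enumerate(sequence.upper()):
        angle = math.radians(i * degrees_per_residue)
        h = KYTE_DOOLITTLE[residue]
        x += h * math.cos(angle)
        y += h * math.sin(angle)
    return math.hypot(x, y) / len(sequence)


# Hypothetical peptides: one uniformly nonpolar, one with leucine on one
# face of the helix and lysine on the opposite face.
uniform = "L" * 18
two_faced = "".join(
    "L" if math.cos(math.radians(100 * i)) > 0 else "K" for i in range(18)
)

print("uniform:  ", round(hydrophobic_moment(uniform), 3))
print("two-faced:", round(hydrophobic_moment(two_faced), 3))
```

The uniform peptide gives a moment close to zero because its residues cancel around the helix, while the two-faced peptide gives a much larger value, which is the signature of an amphipathic helix.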

     Helix-loop-helix (HLH) proteins are a family of 13-20 kDa proteins which share a domain of two amphipathic α helices separated by an intervening sequence (Norton, 1998).  During embryonic development, they are essential in neurogenesis, myogenesis, lymphogenesis, and sex determination.  Some basic HLH (bHLH) proteins (Class I), such as E2A and E47, bind to DNA as either homodimers or heterodimers and are known as E proteins.  Other bHLH proteins (Class II) must form heterodimers with Class I bHLH proteins and typically are expressed in a more restricted set of tissues.  The HLH region interacts with other bHLH proteins in dimerization and the basic region interacts with specific DNA sequences.

     Group B bHLH proteins are known in fungi (Caruso, 2002) and in angiosperms (Damerval, 2003) while groups A, C, D, and E are only known in animals.  There are five bHLH genes known in yeast and they function to facilitate transcription in general and to regulate the cell cycle (Ledent, 2001).  One plant bHLH functions in the transduction of light signals from photoreceptors (Ni, 1998).  Plants and animals have amplified the gene family independently.  Of the 44 families of bHLH proteins, 36 occur only in animals.  The ancestor of coelomates may have possessed as many as 35 bHLH genes.  Mammals may possess 90 bHLH genes, including homologs of all the bHLH genes known in flies (Ledent, 2001).  bHLH proteins bind to a DNA sequence of CANNTG known as the E box.  Twist proteins bind best to CATATG while MyoD binds best to CACCTG (Castanon, 2002).
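The E box consensus CANNTG translates directly into a pattern search.  The following is a minimal sketch, assuming a plain DNA string of A, C, G, and T; the function names and the demo sequence are made up for illustration.  Because the reverse complement of CANNTG is again CANNTG, scanning one strand finds every site, but the central two bases read differently on the other strand (CACCTG appears as CAGGTG), so the lookup checks both orientations.

```python
import re

# E box consensus from the text: CANNTG, where N is any base.
# The lookahead lets overlapping sites be reported as well.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

# Preferred sites quoted in the text (Castanon, 2002).
PREFERRED_FACTOR = {"CATATG": "Twist", "CACCTG": "MyoD"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def find_e_boxes(dna):
    """Return (0-based position, site) for every CANNTG match."""
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(dna.upper())]


def preferred_factor(site):
    """Name the factor the text lists for this site, on either strand."""
    site = site.upper()
    return PREFERRED_FACTOR.get(site) or PREFERRED_FACTOR.get(
        reverse_complement(site)
    )


# Hypothetical demo sequence containing one Twist-type and one MyoD-type site.
demo = "GGCATATGTTCACCTGAA"
for position, site in find_e_boxes(demo):
    print(position, site, preferred_factor(site))
```

On the demo sequence this reports CATATG at position 2 (Twist) and CACCTG at position 10 (MyoD).  Sites with other central bases are still found but are left unlabeled, since the text names preferences only for these two.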



1) Group A bHLH


     Cnidarians (such as the hydra depicted below) express homologs of Achaete-Scute in sensory cells.

  Achaete-Scute and Atonal had been duplicated by the last common ancestor of metazoans.  Ash homologs are involved in secretory/digestive cells in both cnidarians and bilaterians.  Atonal1 and NP1 are expressed only in neural tissue in bilaterians but in both neural and muscle tissue in cnidarians (such as that of the sea anemone in the adjacent photo), supporting a common origin for neurons and muscle cells (Seipel, 2004; Muller, 2003).

More than 200 HLH proteins are known in eukaryotes from yeast to humans.  In yeast they are involved in phosphate and phospholipid pathways.  Twist was first identified in Drosophila as a transcription factor which functioned at gastrulation (Massari, 2000).


     The processes involved in the differentiation of neurons have been highly conserved in coelomates (Kageyama, 1997).  In Drosophila, proneural HLH genes such as achaete-scute and atonal promote the formation of neural tissue from neurectoderm while other HLH proteins such as hairy and Enhancer of split inhibit the action of proneural genes and the formation of neural tissue.  In mammals, the proneural genes Mash1 and Mash2 are homologs of achaete-scute (although they are named Cash in chicks and Xash in Xenopus) and Math, neurogenin, and NeuroD are homologs of atonal (Kageyama, 1997).

     The differentiation of neuroepithelial cells (neurogenesis) involves a number of basic helix-loop-helix (bHLH) transcription factors, including Mash1 (a homolog of achaete-scute in Drosophila), neurogenin, and NeuroD.  Other HLH proteins, such as Hes-1 and Hes-5 (the homologs of hairy and Enhancer of Split), are negative regulators of this process (Nakashima, 2001).


Achaete-Scute Homolog 2; HASH2 (mouse versions called mash)


Mash-1 mutations disrupt the differentiation of autonomic, olfactory, and retinal neurons (Kageyama, 1997; OMIM).


Mash-1 and Mash-2 are homologs of achaete-scute in Drosophila.  A-S homologs are imprinted in both humans and mice.  Mash-1 is required for neural development.  Mash-2 functions in the trophoblast (Norton, 1998) and the developing placenta (Morasso, 1999).




Neurogenin, NeuroD1, NeuroD2, and NeuroD3 are expressed in developing neurons.  Their overexpression causes non-neural cells to differentiate as neurons (Norton, 1998).


Neurogenin 1; NeuroG1

Neurogenin 1 promotes neurogenesis and inhibits the differentiation of astrocytes (OMIM).


Neurogenin 2; NeuroG2

Neurogenin 2 promotes the differentiation of neurons from precursor cells.  A neuron is pictured below.


Neurogenin 3; NeuroG3

Mice with mutations in NeuroG3 fail to develop endocrine pancreas cells and die of diabetes.  NeuroG3 is also expressed in the nervous system (OMIM).


Neurogenic Differentiation 1; NeuroD1

NeuroD1 is expressed in the mammalian brain and mutations in mice result in the absence of the granule cell layer of the hippocampus and seizures.  It is also expressed in the pancreas and mutations have been linked to diabetes.  The endocrine cells of the pancreas are depicted below.


Neurogenic Differentiation 2; NeuroD2

NeuroD2 is important in the differentiation of neurons and its expression persists into adulthood.



A family of Twist homologs expressed in developing mesoderm includes bHLH-EC2, scleraxis, and dermo-1 (Hidai, 1998).  Twist and Snail function in the differentiation of mesoderm in both flies and vertebrates (Muller, 2003).  Twist binds the promoters of a number of genes, affecting their transcription.  Many targets of Twist, such as FGFR-like genes and NK homeodomain genes (including tinman), have been conserved evolutionarily in animals as diverse as nematodes, flies, and vertebrates (Castanon, 2002).  In Drosophila, the Twist gene plays important roles in the formation of mesoderm from gastrulation through the patterning of muscle.  The involvement of Twist in gastrulation is not conserved across all organisms and Twist may actually have opposite effects on muscle differentiation (promoting vs. inhibiting) in different organisms (vertebrates compared to flies, for example).  Twist homologs are known in cnidarians, nematodes, protostomes, and deuterostomes.  A second known domain called the WR motif exists in Twist proteins in diverse organisms, although it is most highly conserved in vertebrates (Castanon, 2002).

     Although jellyfish have both smooth and striated muscle, they technically lack mesoderm.  The entocodon which produces this muscle is described as ‘mesoderm-like’.  Twist functions in the differentiation of mesoderm in triploblast bilaterians but it is also present in diploblast cnidarians.  Twist expression first occurs in the entocodon, the mesoderm-like layer from which muscle tissue differentiates (Spring, 2000; Castanon, 2002).  The development of striated and smooth muscle from this third cell layer has led some to consider jellyfish triploblastic (Muller, 2003).

    Mammalian Twist homologs are also involved in inflammation and repression of osteoblast differentiation.  Mutations in humans cause craniosynostosis, Robinow-Sorauf syndrome, and Saethre-Chotzen syndrome (OMIM).


Twist2 (Dermo1)

     Twist2 functions in bone differentiation and represses the myogenic bHLH proteins (OMIM).



These proteins are involved in the differentiation of muscle.  Overexpression of these genes induces other cells to progress towards a muscle fate, and mutations in MyoD and Myf-5 can cause the absence of muscle in mice (Norton, 1998).  MyoD and Myf-5 are involved in the differentiation of myoblasts while myogenin functions in the subsequent differentiation of muscle (Lu, 1999).

     Jellyfish are known to possess at least four bHLH transcription factors.  The sequence, dimerization, and expression of the JellyD1 protein in striated muscle indicate that it is a homolog of MyoD in bilaterians (it is more similar to MyoD than to any other bHLH protein).  Vertebrate MyoD proteins, which can form both homodimers and heterodimers, can form dimers with JellyD1.  This indicates that the striated muscle of jellyfish is homologous to that of bilaterians (Muller, 2003).

Myogenic factor 4; MYOG

Myogenin, or MYOG, is a factor in myogenesis.



Group A and B HLH proteins include the original E2A gene (which can be alternately transcribed to produce the E12 and E47 proteins) and a number of proteins which promote the differentiation of neurons or muscle.  Cnidarians and bilaterians use similar mechanisms in muscle and neuronal differentiation, including bHLH proteins (Seipel, 2004).  E2A/Daughterless is a ubiquitously expressed Class I bHLH protein.  Many bHLH proteins expressed in specific tissues form inactive homodimers unless they can form heterodimers with E2A (Ledent, 2001).


Helix-Loop-Helix Transcription Factor 4 (HTF4)

HTF4 is similar to E2A and is present in many tissues.


Atonal Homolog; Atoh1

Cnidarians possess homologs of Ash and Atonal (Seipel, 2004).  Atoh1 is required for the formation of the granule layer of the cerebellum (OMIM).

The development of hair cells in the mammalian ear requires the expression of the bHLH gene MATH1, a homolog of the Drosophila gene atonal (Bermingham, 1999).



MIST1 regulates the development of the exocrine portion of the pancreas, pictured in the following photo (OMIM).




Musculin (MyoR)

MyoR represses muscle differentiation (Lu, 1999) and is expressed in a number of lymphatic cells, such as activated B cells.  Mutant mice lack chewing muscles derived from the first branchial arch such as the masseter, temporalis, and medial and lateral pterygoids (OMIM).  Two of these muscles are evident in the rhesus monkey below.


Mesoderm Posterior ; Mesp

Mesp1 seems to play a role in the early development of mesoderm and of primordial germ cells (OMIM).


bHLH proteins Mesp2 and paraxis play a role in somitogenesis (McGrew, 1998).  Mesp2 expression in mesoderm decreases after the formation of somites (OMIM).


Transcription Factor 15 TCF15 (Paraxis)

TCF15 mutations in mice disrupt somite formation and the patterning of skeletal muscle (OMIM).



HAND1 is required for the formation of both heart muscle and the embryonic trophoblast (OMIM).  The developing heart of a chick embryo is pictured below.


HAND2 (dHAND) is involved in the development of the heart (Norton, 1998; OMIM).  HAND helix-loop-helix transcription factors are known from fruit flies to humans.

     HAND genes are closely related to Twist and are known in both flies and vertebrates (2 HAND genes exist in humans).  In vertebrates they are involved in the development of the heart, pharyngeal arches, gut, and nervous system, while in flies HAND functions in the development of the heart, nervous system, and other structures.

C. elegans possesses a HAND-like gene (Firulli, 2003).



Pancreas Transcription Factor 1; PTF1A

PTF1A is required for the formation of the pancreas (OMIM).



Mutations in vertebrate Scleraxis or the Twist homolog Dermo1 cause the absence of mesoderm (Spring, 2000).  SCL is required for hematopoiesis and mutations may cause leukemia (Norton, 1998).


 Lyl-1 is similar to SCL and can also be involved in leukemia (Norton, 1998).




The bHLH protein Hxt promotes the differentiation of trophoblast, the first cell lineage to develop in mammalian embryos (Cross, 1995).


2) B—Many members of the B group also possess a leucine zipper.

There are a number of families in the B group of bHLH proteins which are only known in plants or yeast.




     The myc gene family comprises major proto-oncogenes in the control of cell division and differentiation, and changes in myc protein production, whether through gene amplification, chromosomal translocation, or the insertion of a virus, are a factor in many cancers.  The members of this gene family, which includes myc, mad, and mxi1, are helix-loop-helix DNA binding proteins which regulate transcription.  Max, which interacts with myc, is more distantly related.  C-myc is widely expressed throughout the body and can contribute to a wide variety of tumors.  In contrast, N-myc and L-myc demonstrate a more specific tissue expression and are involved in a narrower range of tumors.  Sequence comparisons suggest that early in vertebrate evolution, there was a duplication which led to c-myc and the ancestral gene for N-myc and L-myc, which diverged after a later gene duplication (Atchley, 1995).

     Myc, Mad, and Max contain both HLH and leucine zipper domains.  Myc and Mad can both bind Max.  Myc/Max heterodimers activate transcription while Mad/Max heterodimers repress transcription.  Although these dimers can bind to many of the same genes, there are some targets of Myc/Max (such as apoptotic genes) which are not targets of Mad/Max (James, 2002).  Both Myc and Twist exist in both flies and vertebrates (Muller, 2003).
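The switch between activation and repression described above depends only on which partner Max is paired with, so it can be written as a small lookup.  This is a toy sketch of the rule as stated in the text, not a model of binding affinities or protein levels; the function name and the returned labels are made up for illustration.

```python
# Effect of a Max-containing heterodimer, as stated in the text.
MAX_PARTNER_EFFECT = {"Myc": "activate", "Mad": "repress"}


def dimer_effect(protein_a, protein_b):
    """Return 'activate' or 'repress' for a Max heterodimer.

    Returns None for any pairing whose effect the text does not describe
    (for example Myc with Mad, or a Max homodimer).
    """
    pair = {protein_a, protein_b}
    if "Max" not in pair or len(pair) != 2:
        return None
    (partner,) = pair - {"Max"}
    return MAX_PARTNER_EFFECT.get(partner)


for a, b in [("Myc", "Max"), ("Mad", "Max"), ("Myc", "Mad")]:
    print(f"{a}/{b}: {dimer_effect(a, b)}")
```

Because Myc and Mad both need Max, the relative supply of the two partners determines which kind of dimer predominates; the lookup leaves that competition out on purpose.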

    The mad family genes include Mad1, Mxi1, Mad3, and Mad4 (James, 2002).  Mnt and Mlx, members of the Myc bHLH protein family, control cell division and apoptosis.  Homologs of these genes are known in Drosophila (Peyrefitte, 2001).


Upstream Stimulatory Factor; USF1

USF1 is ubiquitously expressed.   Mutations can cause susceptibility to hyperlipidemia.

Upstream Stimulatory Factor; USF2

USF2 is ubiquitously expressed and is involved in glucose metabolism.



Microphthalmia-associated transcription factor; MITF
MITF is a homodimeric transcription factor.  In mice, mutations cause loss of pigmentation, reduced eye size, and deafness.  In humans, mutations can cause Waardenburg syndrome, type IIA, whose effects include hearing loss and loss of pigmentation (OMIM).


Transcription Factor EC; TFEC





MAX-Like Protein; MLX

MLX and MAX can repress transcription from E-box elements after association with MAD1, MAD4, or MNT (OMIM).


MAX-Interacting Protein 1; MXI1

MXI1 suppresses the activity of Myc and thus is a potential tumor suppressor protein (OMIM).


3) C

     Proteins containing PAS domains are integral in the circadian clocks of both flies and vertebrates.  PAS proteins are known in organisms as distant as bacteria and have functions ranging from the detection of blue light to development.  Many PAS proteins also contain bHLH domains.  The PAS domain is about 200 amino acids long and includes two duplicate regions (PAS-A and PAS-B) (Hahn, 1998; Whitmore, 1998).

Mutations in CLOCK affect the length of circadian rhythms (Whitmore, 1998).  The CLOCK and cycle genes, possessed by both mice and flies, encode bHLH-PAS domain proteins involved in circadian rhythms.  Drosophila per and vertebrate mper also contain PAS domains and mper may also possess a bHLH domain (Whitmore, 1998).  Clock and mper proteins are expressed in a diversity of tissues.  It seems that there are multiple independent circadian pacemakers in both flies and vertebrates.  In vertebrates, the hypothalamus (SCN), retina, and pineal gland can establish their own rhythms (Whitmore, 1998).


    In bHLH/PAS proteins, the PAS domain is the source of dimerization and DNA binding specificity unlike that of other bHLH proteins (Kewley, 2004).


Neuronal PAS protein-1, 2

NPAS1 is expressed most highly in the brain.


NPAS2 is expressed in the brain and is involved in circadian rhythms.  Mutations in mice resulted in deficits in specific types of memory (OMIM).




Clock is highly expressed in the suprachiasmatic nucleus (a part of the hypothalamus depicted below which is involved in mammalian circadian rhythms in all species studied), the supraoptic nucleus, and the cerebellum.  Clock proteins regulate circadian rhythms in both vertebrates and invertebrates.  Clock proteins form heterodimers with ARNTL.



All eukaryotes and some prokaryotes are capable of maintaining circadian rhythms.  In mammals, the expression of Per1 in the retina varies over a daily period, being highest at the onset of dusk.  Daily variation of expression also occurs in the suprachiasmatic nucleus of the hypothalamus, the pars tuberalis, and the Purkinje cells of the cerebellum.  The pars tuberalis secretes LH which regulates melatonin.  Mice with mutations in Per1 and Per2 lacked circadian rhythms (OMIM).



Single-minded (SIM1), 2

SIM1 is expressed in the brain and kidney.  It is required for formation of two of the nuclei of the hypothalamus and mutant mice developed early-onset obesity (OMIM).


SIM2 mutations may be involved in a number of disorders of the nervous system including (given its location on chromosome 21) Down Syndrome.


     AHR receptors are known in flies, mollusks, and nematodes (Butler, 2001).  Aryl hydrocarbon receptors (AHR) bind a variety of halogenated aromatic hydrocarbons such as dioxin and mediate their effects on the body.  Once bound to its ligand, AHR dimerizes with ARNT and binds DNA, increasing the expression of a number of genes, such as the cytochrome P450 enzymes which metabolize xenobiotics.  AHR and ARNT are both bHLH-PAS transcription factors (Hahn, 1998).  The N-terminal domain is homologous to the human gene ARNT and the Drosophila genes Sim and Per while the helix-loop-helix region is similar to those found in Myc and MyoD (Burbach, 1992; Kewley, 2004).

     Halogenated aromatic hydrocarbons such as dioxin affect cell activity and are highly toxic.  They affect cells through aryl hydrocarbon receptors, which belong to the PAS family of proteins whose members play diverse physiological roles.  Vertebrates have multiple AHR genes resulting from gene duplication near the base of the vertebrates and, among the PAS family members known in invertebrates, possible homologs of vertebrate AHR receptors are known (Hahn, 1997).




ARNT1 and 2—AHR nuclear translocator

ARNT induces P450IA1, P450IA2, and other enzymes which metabolize xenobiotics.  These enzymes can convert substances in cigarette smoke, smog, and cooked meat to carcinogens.  The AHR/ARNT heterodimer can interact with dioxin-like substances and activate estrogen receptors (OMIM).  A mutation has been linked to leukemia.


ARNT2 is only expressed in the dorsal spinal cord and in the first branchial arch.  It may have a role in cellular responses to reduced oxygen.


AHR nuclear translocator-Like 1; ARNTL (BMAL1)

ARNTL functions in establishing circadian rhythms and can form heterodimers with Clock proteins.


Hypoxia-inducible factor 1; HIF1

HIF1 is expressed in cells undergoing hypoxia and mediates cellular responses to hypoxia (such as angiogenesis, apoptosis, and metabolic changes).  Mutations in mice affect the activity of the carotid bodies in response to hypoxia (OMIM; Kewley, 2004). 


Endothelial PAS protein-1

EPAS1 heterodimerizes with ARNT, is expressed in well-vascularized tissues, and may regulate endothelial responses to low oxygen levels.



Steroid receptor coactivator (SRC-1)

Transcriptional intermediary factor (TIF2)

Receptor associated coactivator 3 (RAC3)

 (Hahn, 1998).



4) D


5) E

Hairy; E(spl)

      Notch is a cell membrane protein which, after interacting with its ligand delta, causes Suppressor of Hairless/RBP-J (Drosophila/vertebrate homologs) to move to the nucleus where it increases the transcription of Enhancer of Split/Hes1/Hes5.  Notch activates anti-neuronal genes.  This pathway has been conserved in coelomates (Kageyama, 1997).  In flies, notch signaling is required for the differentiation of all three germ layers.  Most of its activity is mediated through a group of seven bHLH genes located in the Enhancer of Split complex, which are homologous to the 4 HES proteins in vertebrates.  Hairy and deadpan are also homologs of E(spl)-C but do not occur in the complex and do not mediate Notch signaling (Wumbach, 1999).

     In flies, hairy is a pair rule gene expressed in every other segment of the developing embryo.  In chick embryos, hairy is expressed in every somite as a result of oscillations of an autonomous developmental clock along the longitudinal axis.  Disruption of hairy causes the arrest of somitogenesis.  The utilization of hairy during the embryonic development of segmentation in both vertebrates and invertebrates suggests that ancestral mechanisms of segmentation have been preserved (Palmeirim, 1997).  Embryonic somites of a pig are depicted below.


Hairy/Enhancer of Split-Related with YRPW Motif; HEY1

HEY1  and HEY2 seem to be mediators of Notch signals in the development of blood vessels (OMIM).


Cardiovascular bHLH factor; CHF1 (HEY2)

     CHF1 is involved in the development of blood vessels.


Sharp1 is a member of the Hairy/Enhancer of Split family known from mice where it is expressed in the brain sensory system, vertebrae, liver, and several glands (Azmi, 2002).


In vertebrates, anti-neuronal genes include homologs of Drosophila hairy and Enhancer of split named Hes1-5 and homologs of extramacrochaetae (emc) named Id1-4 (Kageyama, 1997).  BHLHB3 (Dec2) is expressed in the suprachiasmatic nucleus of the hypothalamus in a circadian fashion (OMIM).



BHLHB2 (Dec1) is expressed in the suprachiasmatic nucleus of the hypothalamus in a circadian fashion.


Hairy/Enhancer of Split Homologs (HES)

Hes-1, Hes-2, Hes-3, and Hes-5 are mammalian homologs of Drosophila Enhancer of split.

Hes-3 is expressed only in Purkinje cells of the cerebellum.

Hes-1 and Hes-2 are widely expressed.


Hes1 has important roles in neurogenesis, myogenesis, and gender determination (OMIM).




Hes-5 is expressed in the developing nervous system (Norton, 1998).  The developing brain of a chick embryo is depicted below.



Transcription Factor 3 TCF3 (E2A/E47)

bHLH proteins form heterodimers with the two alternate transcripts of the TCF3 gene (E12 and E47) in order to bind DNA.  These alternate transcripts can also dimerize with each other.  TCF3 is expressed in most cells and functions in the differentiation of B cells; mutations can cause leukemia (OMIM; Norton, 1998).


Inhibitor of DNA binding 1; ID1

Most HLH proteins possess a basic domain adjacent to the HLH domain and are referred to as bHLH proteins.  The Id group of HLH proteins lacks this basic region and can inhibit cell differentiation by binding to bHLH proteins and preventing their action.  Sequence similarities and common intron-exon boundaries indicate that the Id proteins represent a gene family resulting from duplications of an ancestral gene.  A single Id protein is known in flies (extramacrochaetae) while four are known in mammals (Id1-4).  Id proteins form heterodimers with bHLH proteins of classes A and E but not class B.  As regulators of cell growth, they can be involved in cancer (Norton, 1998).
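The inhibitory logic of Id proteins can be sketched in a few lines.  The sketch below rests on a simplifying assumption that goes slightly beyond the text: a dimer is treated as able to bind DNA only when both partners contribute a basic region, so a partner without one (such as Id) yields an inactive dimer.  The class and function names are hypothetical, and the class restriction on Id pairing mentioned above is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HLHProtein:
    name: str
    has_basic_region: bool  # True for bHLH proteins, False for Id proteins


def dimer_binds_dna(a, b):
    """Assumption: DNA binding needs a basic region from both partners."""
    return a.has_basic_region and b.has_basic_region


E47 = HLHProtein("E47", has_basic_region=True)
MYOD = HLHProtein("MyoD", has_basic_region=True)
ID1 = HLHProtein("Id1", has_basic_region=False)

for a, b in [(MYOD, E47), (ID1, E47), (ID1, MYOD)]:
    state = "binds DNA" if dimer_binds_dna(a, b) else "inactive"
    print(f"{a.name}/{b.name}: {state}")
```

Under this assumption any bHLH protein captured by Id1 is taken out of circulation, which is one way to picture how Id proteins hold back differentiation.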


Inhibitor of DNA binding 4; ID4

ID4 produces several alternate transcripts.


ID3 mutant mice possessed deficient humoral responses.


Proneural bHLH transcription factors can foster both neuronal differentiation and the development of specific neuronal traits (Talikka, 2002).

Proneural HLH proteins can activate transcription after forming a heterodimer with another HLH protein, E2A.  Anti-neuronal proteins also form dimers with proneural genes, preventing their binding to E2A.  Anti-neuronal genes may also bind to the control elements of the proneural genes, inhibiting their transcription (Kageyama, 1997).


6) F--COE family with COE domain

Transcription Factor 21 TCF21 (Capsulin) is a mammalian bHLH transcription factor expressed in progenitors of smooth muscle which may be homologous to a Drosophila gene expressed in the progenitors of visceral muscle.  It also functions in the development of the heart and of tubes lined by epithelia (Hidai, 1998).  Its expression in podocytes coincides with their differentiation and it is expressed in the epicardium and coronary blood vessels (OMIM).


Sterol Regulatory Element-Binding Transcription Factor 1; SREBF1

SREBF1 is involved in cellular responses to low sterol levels and in LDL metabolism.  It is larger than most bHLH proteins (OMIM).



Oligodendrocyte Lineage Transcription Factor 1: OLIG1

Oligodendrocyte Lineage Transcription Factor 2: OLIG2

OLIG1 and OLIG2 are primarily expressed in oligodendrocytes.  Mutations in both genes prevent the differentiation of oligodendrocytes.



Transcription Factor-like 5; TCFL5

TCFL5 is expressed in the testes.


Transcription Factor 4; TCF4

TCF4  trinucleotide repeats (CTG) may be a factor in determining susceptibility to bipolar disorder.


T-Cell Transcription Factor; TFEB

Misexpression of TFEB seems to be a factor in some renal carcinomas.


Nescient Helix Loop Helix 1; NHLH1

NHLH1 mutations cause early death, heart rhythm abnormalities, and decreased baroreceptor function in mice.  The gene is expressed in the developing brain.


Nescient Helix Loop Helix 2; NHLH2

NHLH2 is expressed in the developing brain and may be involved in pubertal onset.  Mutations in mice affected male mouse behavior, testes development, and spermatogenesis.


Early B-cell Factor EBF (OLF1)

EBF is expressed in olfactory neuronal precursors and in B cells.  Mutant mice do not develop B cells.  The protein can form a homodimer.


T-Cell Acute Lymphoblastic Leukemia 2; TAL2

Tal2 is a transcription factor similar to Lyl1 and both are considered oncogenes.  Mutations can cause leukemia.


Transcription factor for immunoglobulin heavy chain enhancer 3; TFE3

TFE3 is expressed in many tissues and binds to immunoglobulin enhancers (OMIM).