My name's Mark Timmers.
I've been interested in the process of gene regulation
for a very long time, and in doing so,
we've gathered a number of insights.
I'm going to share with you what we've learned
about basal transcription by RNA polymerase II.
But before we go into the details
of the basal transcription process
itself, it's important to
review the elements which
are controlling the process of transcription.
These are typically, as indicated in orange here,
the enhancer sequences which
can be located either upstream
or downstream or even in the gene,
and a locus control region, which can act
over larger distances. The
function of these DNA elements
feeds into the events that are happening
at the start site, indicated by the arrow.
The start site is part of
the core promoter sequence,
which surrounds the start site
and is only about 50 base
pairs in total.
Although we know that these elements
are important, it's very difficult
to find these functional elements
simply by looking at the DNA
sequences.
These DNA elements function
by attracting proteins.
From the DNA sequences of a number of genomes,
we can now determine the number of genes which
are involved in expression of the genome.
For a simple eukaryote like yeast,
which has about 6,000 genes,
there are about 170 gene-specific transcription factors
which bind to upstream sequences like enhancers.
There are about 250 or so chromatin
remodeling and modifying factors.
And if we focus on the basal machinery,
there are about 60 to 70 general
transcription machinery proteins.
In addition to this, yeast has about 20 elongation proteins,
and there are a number of upstream
regulatory factors like kinases,
ubiquitin, repair proteins, mRNA splicing proteins, export proteins.
I'm not going to talk about those.
So the total set is about 60 proteins
which are involved in the basal machinery.
If you now look at the more
complex eukaryotes like ourselves,
we have about 20,000
messenger RNA coding genes.
There are a number of microRNAs, and there are
about 5,000 to 10,000 long non-coding RNAs.
So in total, that's about
30,000 to 35,000 different RNAs
which are produced by RNA polymerase.
If you now look at the number
of regulatory proteins,
you see an expansion of the number
of gene-specific transcription
factors, of chromatin proteins,
but the basal machinery is still
comprised of about 75, in this case,
general transcription machinery proteins.
The elongation protein family has
expanded, but what's also important
is that the general
machinery of yeast or human
is quite similar in its
makeup and the number
of genes that are involved.
In a way, the process of transcription
can be described in
a rather simple way.
An upstream activator binds to an enhancer.
In this case, I've taken NF-κB as an example.
NF-κB is a gene-specific transcription factor,
which is involved in the inflammatory pathway and
mediates a signal to the
basal machinery, to RNA polymerase,
to activate genes,
in this case the interleukin-2 gene.
This activation of transcription is not accomplished by NF-κB itself.
It requires a number of cofactors.
These cofactors are different
in their identities.
Mediator is an important cofactor, and I'll
talk about this in this presentation.
There are also chromatin modifiers,
chromatin remodelers, a number
of histone proteins, which are not going
to be covered in this lecture.
So I mentioned that factors like NF-κB
mediate a response over long distances.
The question now becomes how
to locate a core promoter?
How does the RNA polymerase
know how to start?
One problem is the accessibility
of the DNA, the core promoter DNA.
On the left, you can see a mitotic
chromosome in the condensed form,
and on the right, you can see
an electron micrograph of a cell
with the DNA in the interphase state.
There the DNA fiber is visible and is more accessible.
The problem still remains how does the polymerase
know where to find the core promoter?
I'll take yeast as an example.
Yeast has about 6,000 core promoters.
If you take 50 base pairs per core promoter,
that means that about 300,000
base pairs in the yeast genome
are core promoter sequence.
Which means that if you divide the
yeast genome into windows of 50 base
pairs, 1 out of 40 windows is a core promoter.
That's the region where the RNA polymerase has to land.
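The window arithmetic for yeast can be checked with a few lines of Python. This is only a sketch: the function name `promoter_window_odds` is made up for illustration, and the yeast genome size of roughly 12 million base pairs is an assumption that is not stated in the lecture.

```python
def promoter_window_odds(genome_bp, n_promoters, window_bp=50):
    """Return (promoter bp in total, number of windows, windows per promoter)."""
    promoter_bp = n_promoters * window_bp   # sequence occupied by core promoters
    n_windows = genome_bp // window_bp      # non-overlapping windows in the genome
    return promoter_bp, n_windows, n_windows / n_promoters


# Assumption: a yeast genome of roughly 12 million base pairs.
promoter_bp, n_windows, windows_per_promoter = promoter_window_odds(12_000_000, 6_000)
print(promoter_bp)            # base pairs of core promoter
print(windows_per_promoter)   # one core promoter per this many windows
```

With these inputs the result is 300,000 base pairs of core promoter and one core promoter per 40 windows, which matches the figures quoted for yeast.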
In the case of human DNA, the human genome,
the process is much more complicated.
As an example, I take chromosome 22.
It has about 400 genes, so 400 core promoters,
in total spanning about 20,000 base pairs.
And these 20,000 base pairs of core promoters
are embedded in 48 million base pairs.
So 1 out of 2,400 windows is a core promoter.
How does RNA polymerase deal with this complexity?
Let's look at the problem of
locating core promoters
from the product side, from the mRNA ends.
When we map the 5' ends
of messenger RNAs,
we find the core promoter.
The cluster of 5' ends surrounds the
transcription start site.
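The idea of finding promoters from clusters of mRNA 5' ends can be sketched in Python. This is a simplified illustration, not the method used in the analysis described here: the function `cluster_five_prime_ends`, the `max_gap` value, and the coordinates are all hypothetical.

```python
def cluster_five_prime_ends(positions, max_gap=100):
    """Group 5'-end coordinates into clusters.

    A new cluster starts whenever the distance to the previous
    5' end is larger than max_gap (a hypothetical cutoff).
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= max_gap:
            clusters[-1].append(pos)   # close enough: same cluster
        else:
            clusters.append([pos])     # too far away: start a new cluster
    return clusters


# Made-up 5'-end coordinates on one strand of one chromosome.
ends = [1000, 1002, 1002, 1005, 5000, 5040, 5120, 5190]
clusters = cluster_five_prime_ends(ends)
for cluster in clusters:
    print(min(cluster), max(cluster), len(cluster))
```

Each cluster marks a candidate core promoter region. Real data would also need strand information and a minimum read count per cluster.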
One of the surprises from this analysis
was that only a small proportion of
these promoters contain a TATA-box.
The previous analysis had indicated that a TATA-box is one
of the crucial elements of a core promoter.
The majority of the
promoters in our genome
are a very broad type of promoter.
They reside in CpG islands,
and they make up about 50 to 70%
of the total.
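Whether a stretch of DNA looks like a CpG island can be estimated from its sequence. The sketch below uses commonly cited criteria (length of at least 200 base pairs, GC content above 50%, observed/expected CpG ratio above 0.6). These thresholds are not given in the lecture and are an assumption here, as are the function names.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    observed = seq.count("CG")        # number of CpG dinucleotides
    expected = c * g / n              # expected count from base composition
    ratio = observed / expected if expected else 0.0
    return (c + g) / n, ratio


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the assumed thresholds to one candidate region."""
    gc_content, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc_content > min_gc and ratio > min_ratio


print(looks_like_cpg_island("CG" * 100))   # artificial CpG-rich sequence
print(looks_like_cpg_island("AT" * 100))   # artificial AT-only sequence
```

In practice the test is run over sliding windows along the genome rather than on one fixed region.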
There is a functional difference
between these two promoter types.
Whereas a TATA promoter
typically focuses transcription
on a single start site, in the CpG island type
of promoter the transcription start sites
are scattered over a larger
window of about 200 base pairs.
So the mRNA ends are scattered
over a larger region.
That's why we call them broad-type promoters.
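The difference between a focused and a broad promoter can be expressed as the spread of the mapped start sites. The following Python sketch is an illustration only: `classify_promoter` and its `focused_width` cutoff of 10 base pairs are hypothetical choices, and the coordinates are made up.

```python
from collections import Counter


def classify_promoter(tss_positions, focused_width=10):
    """Classify one promoter from the 5'-end positions mapped to it.

    Returns (shape, width in bp, fraction of 5' ends at the dominant site).
    focused_width is a hypothetical cutoff, not a value from the lecture.
    """
    width = max(tss_positions) - min(tss_positions) + 1
    _, dominant_count = Counter(tss_positions).most_common(1)[0]
    dominant_fraction = dominant_count / len(tss_positions)
    shape = "focused" if width <= focused_width else "broad"
    return shape, width, dominant_fraction


# Made-up examples: one tight cluster, one scattered over about 200 bp.
tata_like = [500, 500, 500, 501, 500]
cpg_like = [5000, 5040, 5120, 5190]
print(classify_promoter(tata_like))
print(classify_promoter(cpg_like))
```

The first example comes out as focused, with most 5' ends on one position. The second comes out as broad, with the 5' ends spread over a window of roughly 200 base pairs.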