The basal transcription machinery for RNA polymerase II

Published on February 4, 2014   42 min

Other Talks in the Series: Epigenetics, Chromatin, Transcription and Cancer

0:00
My name's Mark Timmers. I've been interested in the process of gene regulation for a very long time, and over that time we've gathered a number of insights. I'm going to share with you what we've learned about basal transcription by RNA polymerase II.
0:15
But before we go into the details of the basal transcription process itself, it's important to review the elements which control the process of transcription. These are typically, as indicated in orange here, the enhancer sequences, which can be located either upstream or downstream or even within the gene, and a locus control region, which can act over larger distances. The function of these DNA elements feeds into the events that are happening at the start site, indicated by the arrow. The start site is part of the core promoter sequence, which surrounds the start site and is only about 50 base pairs in total. Although we know that these elements are important, it's very difficult to find these functional elements from the simple act of looking at the DNA sequences.
1:03
These DNA elements function by attracting proteins. From the DNA sequence of a number of genomes, we can now determine the number of genes which are involved in expression of the genome. For a simple eukaryote like yeast, which has about 6,000 genes, there's about 170 gene-specific transcription factors which bind to upstream sequences like enhancers. There's about 250 or so chromatin remodeling and modifying factors. And if we focus on the basal machinery, there's about 60 to 70 general transcription machinery proteins. In addition to this, yeast has about 20 elongation proteins, and there's a number of upstream regulatory factors like kinases, ubiquitin-related proteins, mRNA splicing proteins, export proteins. I'm not going to talk about those. So the total set is about 60 proteins which are involved in the basal machinery.
1:58
If you now look at the more complex eukaryotes like ourselves, we have about 20,000 messenger RNA coding genes. There's a number of microRNAs, and there are about 5,000 to 10,000 long non-coding RNAs. So in total, that's about 30,000 to 35,000 different RNAs which are produced by RNA polymerase. If you now look at the number of regulatory proteins, you see an expansion of the number of gene-specific transcription factors and of chromatin proteins, but the basal machinery is still comprised of about 75, in this case, general transcription machinery proteins. The elongation protein family has expanded, but what's also important is that the general machinery of yeast and human is quite similar in its makeup and in the number of genes that are involved.
2:45
In a way, the process of regulated transcription can be described rather simply. An upstream activator binds to an enhancer. In this case, I've taken NF-κB as an example. NF-κB is a gene-specific transcription factor which is involved in the inflammatory pathway and which mediates a signal to the basal machinery, to RNA polymerase, to activate genes. In this case, the interleukin 2 gene. This activation of transcription is not accomplished by NF-κB itself. It requires a number of cofactors. These cofactors are different in their identities. Mediator is an important cofactor, and I'll talk about this in this presentation. There's also chromatin modifiers, chromatin remodelers, and a number of histone proteins, which are not going to be covered in this lecture.
3:32
So I mentioned that factors like NF-κB mediate a response over long distances. The question now becomes how to locate a core promoter. How does the RNA polymerase know where to start? One problem is the accessibility of the DNA, the core promoter DNA. On the left, you can see a mitotic chromosome in the condensed form, and on the right, you can see an electron micrograph of a cell with the DNA in the interphase state. The DNA fiber is visible and is more accessible. The problem still remains: how does the polymerase know where to find the core promoter? I'll take yeast as an example. Yeast has about 6,000 core promoters. If you take 50 base pairs per core promoter, that means that about 300,000 base pairs in the yeast genome are core promoter. Which means that if you divide the yeast genome into windows of 50 base pairs, 1 out of 40 windows is a core promoter. That's the region where the RNA polymerase has to land. In the case of human DNA, the human genome, the process is much more complicated. As an example, I take chromosome 22. It has about 400 genes, so 400 core promoters, in total spanning about 20,000 base pairs. And these 20,000 base pairs of core promoters are embedded in 48 million base pairs. So 1 out of 2,400 windows is a core promoter. How does RNA polymerase deal with this complexity?
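The window arithmetic in this passage can be checked with a short calculation. This is only a sketch: the promoter counts, the 50 base pair promoter size and the 48 million base pairs for chromosome 22 are the figures quoted in the talk, while the yeast genome size of about 12 million base pairs is an assumption that is not stated in the talk, and the function name `windows_per_promoter` is made up for illustration.

```python
CORE_PROMOTER_BP = 50  # core promoter size quoted in the talk


def windows_per_promoter(n_promoters, genome_bp, window_bp=CORE_PROMOTER_BP):
    """Return how many windows of window_bp exist for each core promoter."""
    total_windows = genome_bp // window_bp
    return total_windows / n_promoters


# Yeast: 6,000 core promoters (from the talk).
# ASSUMPTION: genome size of ~12 million bp, not given in the talk.
yeast = windows_per_promoter(6_000, 12_000_000)

# Human chromosome 22: 400 core promoters in 48 million bp (from the talk).
chr22 = windows_per_promoter(400, 48_000_000)

print(f"yeast: 1 window in {yeast:.0f} is a core promoter")
print(f"chromosome 22: 1 window in {chr22:.0f} is a core promoter")
```

With 50 base pairs per promoter, 400 promoters cover 20,000 base pairs, which is why the chromosome 22 search space comes out dozens of times sparser than the yeast one.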
4:57
Let's look at the problem of locating core promoters from the product side, from the mRNA ends. When we map the five-prime ends of messenger RNAs, we'll find the core promoter. The cluster surrounds the transcription start site. One of the surprises from this analysis was that only a small proportion of these promoters contain a TATA-box. The previous analysis had indicated that a TATA-box is one of the crucial elements of a core promoter. The majority of the promoters in our genome are a very broad type of promoter. They reside in CpG islands, and they make up about 50 to 70% of the total. There's a functional difference between these promoters. Whereas a TATA promoter typically focuses transcription on a single start site, in the CpG island type of promoter the transcription start sites are scattered over a larger window of about 200 base pairs. So the mRNA ends are scattered over a larger region. That's why we call them the broad type of promoter.