Transcription is the first step in gene expression, and is the process by which a gene’s DNA sequence is copied (transcribed) to make an RNA molecule.
Prokaryotic and eukaryotic transcription differ in a number of ways, which are discussed below. One difference is location: in eukaryotic cells transcription takes place in the nucleus, whereas in prokaryotic cells, which lack a nucleus, it takes place in the cytoplasm. Both types of transcription involve three stages: initiation, elongation, and termination. Before transcription can take place, the DNA double helix must unwind near the gene that is being transcribed. The region of opened-up DNA is called a transcription bubble. Transcription uses one of the two exposed DNA strands as a template; this strand is called the template strand. The RNA product is complementary to the template strand and is almost identical to the other DNA strand, called the nontemplate strand. However, there is one important difference: in the newly made RNA, all of the T (thymine) nucleotides are replaced with U (uracil) nucleotides.
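The relationship between the two DNA strands and the RNA product can be checked with a short sketch. The function names and the example sequence are illustrative assumptions, not part of the original text. Strands are plain strings, and the template is written 3′ to 5′ so that it lines up position by position with the nontemplate strand written 5′ to 3′.

```python
# Illustrative sketch: function names and the example sequence are assumptions.

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}  # template base -> RNA base


def template_from_nontemplate(nontemplate_5to3: str) -> str:
    """Return the template strand, written 3'->5', aligned with the input."""
    return "".join(DNA_COMPLEMENT[base] for base in nontemplate_5to3)


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


nontemplate = "ATGGCTTAC"
template = template_from_nontemplate(nontemplate)
rna = transcribe(template)

# The RNA matches the nontemplate strand except that every T is a U.
assert rna == nontemplate.replace("T", "U")
print(template, rna)
```

Because the RNA is built by pairing against the template, it ends up carrying the same sequence as the nontemplate strand, with U in place of T.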
The site on the DNA from which the first RNA nucleotide is transcribed is called the +1 site, or the initiation site. Nucleotides that come before the initiation site are given negative numbers and said to be upstream. Nucleotides that come after the initiation site are marked with positive numbers and are referred to as downstream.
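The numbering scheme can be made concrete with a small helper. This is a sketch under stated assumptions: the function names are hypothetical, sequences are indexed from 0 as in most programming languages, and the helper follows the common convention (not stated in the text above) that there is no position 0, so the nucleotide immediately upstream of +1 is -1.

```python
# Illustrative sketch: names are hypothetical; indices are 0-based string indices.

def position_label(index: int, start_index: int) -> str:
    """Label a nucleotide relative to the +1 initiation site.

    Assumes the usual convention that there is no position 0.
    """
    offset = index - start_index
    if offset >= 0:
        return f"+{offset + 1}"
    return str(offset)


def region(index: int, start_index: int) -> str:
    """Classify a nucleotide as upstream, the initiation site, or downstream."""
    if index < start_index:
        return "upstream"
    if index == start_index:
        return "initiation site"
    return "downstream"


start = 35  # hypothetical 0-based index of the +1 nucleotide
for i in (0, 25, 34, 35, 36):
    print(i, position_label(i, start), region(i, start))
```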
Transcription is performed by enzymes called RNA polymerases. Using a DNA template, RNA polymerase builds a new RNA molecule through base pairing. For instance, if there is a G nucleotide in the DNA template, RNA polymerase will add a C nucleotide to the growing RNA strand. RNA polymerases are large enzymes with multiple subunits, even in simple organisms like bacteria. Humans and other eukaryotes have three different kinds of RNA polymerase: I, II, and III. Each one specializes in transcribing certain classes of genes. Take a look at a diagram of an RNA polymerase in action below:
To begin transcribing a gene, RNA polymerase binds to the DNA of the gene at a region called the promoter, which marks where on the DNA the polymerase should begin transcribing. Each gene (or, in bacteria, each group of genes transcribed together) has its own promoter. You can get a sense of where the promoter is located relative to the gene downstream of it in the image below:
Prokaryotic transcription initiation begins when the transcription machinery binds to the promoter region of a DNA sequence. The specific sequence of a promoter is very important because it determines whether the corresponding gene is transcribed all the time, some of the time, or infrequently. Although promoters vary among prokaryotic genomes, a few elements are conserved. At the -10 and -35 regions upstream of the initiation site (that is, 10 and 35 nucleotides upstream of the site), there are two promoter consensus sequences.
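A simple way to look for the two conserved elements is to slide a window along the upstream sequence and count mismatches against each consensus. The sketch below makes several assumptions that are not in the text above: it uses TTGACA (-35) and TATAAT (-10), the consensus hexamers commonly cited for E. coli promoters, and the function name, the synthetic upstream sequence, and the 17-nucleotide spacer are all illustrative.

```python
# Illustrative sketch. The consensus hexamers are the ones commonly cited for
# E. coli and are an assumption here; the source text does not name them.

MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"


def best_match(seq: str, consensus: str) -> tuple[int, int]:
    """Return (start index, mismatches) of the window most similar to consensus."""
    best = (-1, len(consensus) + 1)
    for i in range(len(seq) - len(consensus) + 1):
        window = seq[i:i + len(consensus)]
        mismatches = sum(a != b for a, b in zip(window, consensus))
        if mismatches < best[1]:  # keep the earliest window on ties
            best = (i, mismatches)
    return best


# Synthetic upstream region; its last nucleotide sits at position -1.
upstream = "CG" + MINUS_35_CONSENSUS + "GC" * 8 + "G" + MINUS_10_CONSENSUS + "GCGCGC"

for name, consensus in (("-35", MINUS_35_CONSENSUS), ("-10", MINUS_10_CONSENSUS)):
    index, mismatches = best_match(upstream, consensus)
    position = index - len(upstream)  # negative position of the hexamer's first base
    print(name, "element starts at", position, "with", mismatches, "mismatches")
```

Real promoters rarely match the consensus exactly, which is why the sketch reports a mismatch count instead of requiring a perfect hit.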
Unlike prokaryotic RNA polymerase, which can bind to a DNA template on its own, eukaryotic RNA polymerases require several other proteins, called transcription factors, to first bind to the promoter region and then recruit the appropriate polymerase. The completed assembly of transcription factors and RNA polymerase binds to the promoter, forming a transcription pre-initiation complex (PIC).
The most well-studied promoter element in eukaryotes is a short DNA sequence known as a TATA box, found 25-30 base pairs upstream from the start site of transcription. The TATA box is the binding site for a transcription factor called TATA-binding protein (TBP). Several additional transcription factors and RNA polymerase combine around the TATA box to form the pre-initiation complex.
As mentioned earlier, there are three RNA polymerases in eukaryotes. RNA polymerase I is located in the nucleolus, a specialized nuclear substructure in which ribosomal RNA (rRNA) is transcribed, processed, and assembled into ribosomes. RNA polymerase I synthesizes rRNAs. RNA polymerase II is located in the nucleus and synthesizes all protein-coding nuclear pre-mRNAs. RNA polymerase III is also located in the nucleus. This polymerase transcribes a variety of structural RNAs that includes the 5S pre-rRNA, transfer pre-RNAs (pre-tRNAs), and small nuclear pre-RNAs.
Once RNA polymerase is in position at the promoter, the next step of transcription—elongation—can begin.
During elongation, RNA polymerase “walks” along one strand of DNA, known as the template strand, in the 3′ to 5′ direction. For each nucleotide in the template, RNA polymerase adds a matching (complementary) RNA nucleotide to the 3′ end of the RNA strand, as shown below.
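The step-by-step nature of elongation can be sketched as a loop that reads the template one nucleotide at a time and appends the complementary RNA nucleotide to the 3′ end of the transcript. The function name and the example template are illustrative assumptions; the template is written 3′ to 5′, the direction in which the polymerase reads it.

```python
# Illustrative sketch: the function name and example template are assumptions.

RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}  # template base -> RNA base


def elongate(template_3to5: str):
    """Yield the growing RNA (written 5'->3') after each nucleotide is added."""
    rna = ""
    for base in template_3to5:    # polymerase moves 3'->5' along the template
        rna += RNA_PARTNER[base]  # the new nucleotide joins the 3' end of the RNA
        yield rna


for step, transcript in enumerate(elongate("TACGGT"), start=1):
    print(step, transcript)
```

Each pass through the loop lengthens the transcript by one nucleotide, so the RNA grows in the 5′ to 3′ direction while the template is read 3′ to 5′.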
RNA polymerase will keep transcribing until it receives a stop signal. The process of ending transcription is called termination, and it happens once the polymerase transcribes a sequence of DNA known as a terminator.
In prokaryotes, there are two kinds of termination signals: one is protein-based and the other is RNA-based. Rho-dependent termination is controlled by the rho protein, which follows behind the polymerase on the growing mRNA chain. Near the end of the gene, the polymerase encounters a run of G nucleotides on the DNA template, causing it to stall (or stop). As a result, the rho protein collides with the polymerase. The interaction with rho releases the mRNA from the transcription bubble.
Rho-independent termination is controlled by specific sequences in the DNA template strand. As the polymerase nears the end of the gene being transcribed, it encounters a region rich in C and G nucleotides. The RNA transcribed from this region folds back on itself, and the complementary C and G form a stable hairpin that causes the polymerase to stall and fall off.
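Whether a stretch of RNA can fold back on itself can be tested by searching for a segment that is followed, after a short loop, by its own reverse complement. The sketch below is a simplified illustration: the function names, the stem and loop sizes, and the example transcript are assumptions, and it only detects perfectly paired stems, whereas real hairpins tolerate mismatches.

```python
# Illustrative sketch: names, stem/loop sizes, and the example RNA are assumptions.
from typing import Optional

RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(RNA_COMPLEMENT[base] for base in reversed(rna))


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C nucleotides in seq."""
    return sum(base in "GC" for base in seq) / len(seq)


def find_hairpin(rna: str, stem_len: int = 6, min_loop: int = 3,
                 max_loop: int = 8) -> Optional[tuple[int, int]]:
    """Return (stem start, loop length) of the first perfect stem-loop, else None."""
    for start in range(len(rna) - 2 * stem_len - min_loop + 1):
        partner = reverse_complement(rna[start:start + stem_len])
        for loop_len in range(min_loop, max_loop + 1):
            second = start + stem_len + loop_len
            if rna[second:second + stem_len] == partner:
                return start, loop_len
    return None


transcript = "AA" + "GCCGCC" + "GAAA" + "GGCGGC" + "AA"
hit = find_hairpin(transcript)
if hit is not None:
    start, loop_len = hit
    stem = transcript[start:start + 6]
    print("stem", stem, "loop length", loop_len, "GC fraction", gc_fraction(stem))
```

A stem made entirely of C and G pairs, as in this example, is the kind of stable structure the paragraph above describes.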
Termination in eukaryotes begins when a polyadenylation signal appears in the RNA transcript. This is a sequence of nucleotides that marks where an RNA transcript should end. The polyadenylation signal is recognized by an enzyme that cuts the RNA transcript nearby, releasing it from RNA polymerase before transcription actually terminates.
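The recognize-then-cut logic can be sketched as a string search followed by a split. Several details here are assumptions not given in the text above: the signal is taken to be AAUAAA, the most commonly cited polyadenylation signal, and the cut is placed at a fixed, hypothetical offset downstream of it, whereas in cells the cleavage position varies. The function name and example transcript are also illustrative.

```python
# Illustrative sketch. The signal sequence and the fixed cut offset are
# assumptions; the source text says only that the cut is made "nearby".
from typing import Optional

POLY_A_SIGNAL = "AAUAAA"


def cleave_at_poly_a(transcript: str, offset: int = 15) -> Optional[tuple[str, str]]:
    """Split a transcript at a fixed offset downstream of the first signal.

    Returns (released RNA, remainder still attached to the polymerase),
    or None if the transcript contains no signal.
    """
    pos = transcript.find(POLY_A_SIGNAL)
    if pos == -1:
        return None
    cut = min(pos + len(POLY_A_SIGNAL) + offset, len(transcript))
    return transcript[:cut], transcript[cut:]


transcript = "GGCACUG" + POLY_A_SIGNAL + "C" * 15 + "GUGUGU"
result = cleave_at_poly_a(transcript)
if result is not None:
    released, remainder = result
    print("released:", released)
    print("still being transcribed:", remainder)
```

The non-empty remainder reflects the point made above: the transcript is released before the polymerase has actually stopped transcribing.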
After termination, transcription is finished. An RNA transcript that is ready to be used in translation is called a messenger RNA (mRNA). In eukaryotes, newly transcribed RNAs must first undergo a series of processing steps to form the mature mRNA.
Practice Questions
Khan Academy
Mechanisms of antibody variability during B-cell development
Reverse transcriptase-polymerase chain reaction (RT-PCR) of a UV-dependent gene
MCAT Official Prep (AAMC)
Biology Question Pack, Vol. 1 Question 83
Biology Question Pack, Vol. 1 Question 119
Biology Question Pack, Vol. 2 Question 39
Biology Question Pack, Vol. 2 Question 102
Practice Exam 2 B/B Section Passage 10 Question 53
Key Points
• Transcription initiation occurs when RNA polymerase binds to a promoter sequence near the beginning of a gene (directly or through helper proteins).
• Transcription elongation occurs in a bubble of unwound DNA, where RNA polymerase uses one strand of DNA as a template to catalyze the synthesis of a new RNA strand in the 5′ to 3′ direction.
• Transcription ends in a process called termination, which is triggered once the polymerase transcribes a DNA sequence called a terminator.
• In prokaryotes, two promoter consensus sequences are at the -10 and -35 regions upstream of the initiation site.
• Eukaryotic transcription is carried out in the nucleus of the cell.
• Eukaryotes require transcription factors to first bind to the promoter region and then help recruit the appropriate RNA polymerase.
• Termination in prokaryotes can be protein-based or RNA-based. Rho-dependent termination occurs when the rho protein collides with the polymerase, which has stalled at a stretch of G nucleotides on the DNA template near the end of the gene. Rho-independent termination is caused by the polymerase stalling at a stable hairpin formed by a region of complementary C and G nucleotides at the end of the mRNA.
• In eukaryotes, RNA polymerase II is the polymerase responsible for transcribing mRNA that encodes proteins.
• In eukaryotes, RNA polymerase I and RNA polymerase III terminate transcription in response to specific termination sequences in either the DNA being transcribed (RNA polymerase I) or in the newly synthesized RNA (RNA polymerase III).
Key Terms
Transcription: The first step in gene expression, where DNA is transcribed into RNA.
Prokaryotic: Describes single-celled organisms that lack a nucleus and other membrane-bound organelles.
Eukaryotic: Describes organisms whose cells have a nucleus and other membrane-bound organelles.
Initiation: The first stage of transcription, when RNA polymerase binds to the promoter of a gene.
Elongation: The addition of nucleotides to the 3′-end of a growing RNA chain during transcription.
Termination: The last stage of transcription when the production of an RNA transcript ends.
Transcription bubble: A region where the DNA helix has unwound near a gene that is being transcribed.
Template strand: The strand of exposed DNA that is used as a template during transcription.
Nontemplate strand: The complementary strand of exposed DNA that is not used as the template during transcription.
Initiation site: The site in a DNA sequence where the first RNA nucleotide is transcribed.
Upstream: Nucleotides that are located before the initiation site.
Downstream: Nucleotides that are located after the initiation site.
RNA polymerase: Any of various enzymes that catalyze the formation of polymers of RNA using an existing strand of DNA as a template.
-10 and -35 regions: In prokaryotes, the two promoter consensus sequences where RNA polymerase binds during initiation.
Transcription factors: Proteins that bind to the promoter region of DNA and help recruit RNA polymerase in eukaryotes.
Transcription pre-initiation complex (PIC): In eukaryotes, the completed assembly of transcription factors and RNA polymerase that binds to the promoter to initiate transcription.
TATA box: The most well-understood promoter element in eukaryotes where the transcription pre-initiation complex forms.
TATA-binding protein (TBP): A transcription factor that recognizes and binds to the TATA box during initiation.
Terminator: A DNA sequence that marks the end of transcription.
Rho-dependent termination: In prokaryotes, protein-dependent transcription termination.
Rho-independent termination: In prokaryotes, protein-independent transcription termination that instead depends on the DNA sequence being transcribed.
Polyadenylation signal: In eukaryotes, a sequence of nucleotides in RNA that marks where the RNA transcript should end.
Messenger RNA (mRNA): An RNA transcript that is ready to be used for protein synthesis, or translation.