Gene Expression & Regulation
How does the information locked in DNA become a functioning organism? Unit 6 traces the central dogma from DNA replication through transcription and translation, then explores how gene expression is precisely regulated — in prokaryotes via operons, in eukaryotes via transcription factors and epigenetics. Mutations, biotechnology, and cell specialization complete this second highest-weighted unit.
DNA and RNA Structure
Genetic information is stored in DNA and transmitted to subsequent generations. This topic connects Unit 1 (nucleic acid structure) to Unit 6's focus on how that information is expressed. Understanding the physical organization of chromosomes sets the stage for replication and gene regulation.
Chromosomal Organization
| Feature | Prokaryotes | Eukaryotes |
|---|---|---|
| Chromosome shape | Single, circular chromosome in nucleoid region | Multiple, linear chromosomes in nucleus |
| DNA packaging | DNA associated with histone-like proteins; some supercoiling | DNA tightly wound around histone proteins → nucleosomes → chromatin → chromosomes |
| Extra-chromosomal DNA | Often contain plasmids (small circular DNA) carrying extra genes (e.g., antibiotic resistance) | Mitochondria contain their own circular DNA; chloroplasts contain DNA in plants and algae (remnants of endosymbiotic ancestors). Plasmids are far less common in eukaryotes than in prokaryotes and are not a defining feature of eukaryotic cells. |
| Chromosome access | DNA always accessible to RNA polymerase | Chromatin must be unpacked for transcription — histone modification controls access |
Chromatin Structure and Gene Regulation Preview
In eukaryotes, DNA wraps around histone protein complexes (octamers), forming nucleosomes ("beads on a string"). Nucleosome packing compacts DNA ~6-fold. Further coiling achieves the ~10,000-fold compaction seen in metaphase chromosomes. Tightly packed chromatin (heterochromatin) is generally silenced; loosely packed chromatin (euchromatin) is accessible for transcription. This packaging is actively regulated by histone modifications (Topic 6.5).
Plasmids are extra-chromosomal and are found in both prokaryotes AND eukaryotes (though far more commonly in prokaryotes). Plasmids are central to bacterial transformation (Topic 6.8) and gene cloning.
Histones and DNA packaging: The degree of chromatin condensation directly affects whether genes can be transcribed. This is the physical basis of epigenetic gene regulation (Topic 6.5). Exam questions may ask why a gene is "turned off" even without a mutation — tightly packed chromatin (heterochromatin) is the answer.
DNA Replication
DNA replication copies the genome before cell division, ensuring each daughter cell receives an identical copy. Replication is semiconservative: each new double helix contains one original (parental) strand and one newly synthesized strand — shown by the Meselson-Stahl experiment (1958) using ¹⁵N/¹⁴N isotope labeling.
Key Enzymes and Steps
| Enzyme | Function | Direction |
|---|---|---|
| Helicase | Unwinds the double helix by breaking hydrogen bonds between base pairs at the replication fork | Moves along DNA unwinding it |
| Topoisomerase | Relieves tension (supercoiling) ahead of the replication fork by cutting and rejoining DNA strands — prevents the DNA from over-twisting as helicase unwinds it | Works ahead of helicase |
| Primase (RNA polymerase) | Synthesizes a short RNA primer (~10 nucleotides) complementary to the template strand — provides the free 3'–OH needed to start DNA synthesis (DNA polymerase cannot initiate de novo) | 5'→3' |
| DNA Polymerase | Adds complementary deoxyribonucleotides to the 3' end of the primer/growing strand; proofreads by removing mismatched bases (3'→5' exonuclease activity) | Synthesizes only 5'→3' |
| Ligase | Seals the nicks (gaps) between adjacent DNA fragments — joins Okazaki fragments on the lagging strand into a continuous strand. Note: ligase seals the phosphodiester backbone; the removal of RNA primers and their replacement with DNA is performed by other enzymes (RNase H and DNA polymerase). | Seals nicks in DNA |
Leading vs. Lagging Strand
Because DNA polymerase can only synthesize in the 5'→3' direction, and the two template strands are antiparallel, replication of the two strands proceeds differently:
Leading strand: Template runs 3'→5'. DNA polymerase synthesizes continuously in the 5'→3' direction, moving toward the replication fork. Requires only one RNA primer at the origin.
Lagging strand: Template runs 5'→3'. DNA polymerase must work away from the replication fork in the 5'→3' direction. Synthesis occurs in short, discontinuous segments called Okazaki fragments, each requiring a new RNA primer. Ligase joins the fragments.
Semiconservative replication evidence: The Meselson-Stahl experiment is a classic AP question. Bacteria grown in ¹⁵N medium were transferred to ¹⁴N medium. After one round of replication, ALL DNA was hybrid (intermediate density), with no separate heavy and light bands, which rules out conservative replication. After two rounds, half was hybrid and half was light, which rules out dispersive replication (that model predicts a single band of intermediate density). Together, these results support semiconservative replication.
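The band pattern described above can be reproduced with a small simulation. This is a minimal sketch, assuming each DNA duplex is modeled as a pair of strand labels; the function names `replicate` and `density_classes` are illustrative and not part of any library.

```python
from collections import Counter

def replicate(molecules):
    """Semiconservative replication in 14N medium: each duplex splits, and
    each parental strand is paired with one newly made light (14N) strand."""
    daughters = []
    for strand_a, strand_b in molecules:
        daughters.append((strand_a, "14N"))
        daughters.append((strand_b, "14N"))
    return daughters

def density_classes(molecules):
    """Classify each duplex by how many heavy (15N) strands it contains."""
    labels = {2: "heavy", 1: "hybrid", 0: "light"}
    return Counter(labels[duplex.count("15N")] for duplex in molecules)

generation0 = [("15N", "15N")]        # cells grown in 15N: fully heavy DNA
generation1 = replicate(generation0)  # one round in 14N
generation2 = replicate(generation1)  # two rounds in 14N

print(dict(density_classes(generation1)))  # {'hybrid': 2}
print(dict(density_classes(generation2)))  # {'hybrid': 2, 'light': 2}
```

After one round every duplex is hybrid; after two rounds the population is half hybrid and half light, matching the observed bands.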
Why RNA primer is needed: DNA polymerase can only ADD nucleotides to an existing 3'–OH group — it cannot start a new strand from scratch. Primase (RNA polymerase) can initiate synthesis de novo, creating the short RNA primer that provides the 3'–OH starting point for DNA polymerase.
Lagging strand has more components: Many RNA primers → many Okazaki fragments → many DNA polymerase events → ligase seals them all. The leading strand is simpler. These extra steps are often cited as one reason replication errors may be more likely on the lagging strand.
A scientist adds a drug that specifically inhibits ligase during DNA replication. Which of the following would be the most likely consequence?
- (A) The replication fork would fail to open because helicase requires ligase to function.
- (B) Both the leading and lagging strands would fail to be synthesized at all.
- (C) The lagging strand would remain fragmented, with Okazaki fragments unable to be joined into a continuous strand.
- (D) RNA primers could not be synthesized, stopping all replication initiation.
Transcription and RNA Processing
Transcription is the process by which the information in a DNA sequence is copied into an RNA molecule. The enzyme RNA polymerase reads the template (antisense) DNA strand 3'→5' and synthesizes a complementary RNA strand in the 5'→3' direction. Unlike DNA replication, transcription copies only one gene (or a few genes) at a time, not the entire genome.
The Three Stages of Transcription
Initiation: Transcription factors bind to the promoter region (a specific DNA sequence upstream of the gene, often containing a TATA box). This recruits RNA polymerase to the correct start site. The DNA double helix unwinds locally. RNA polymerase does NOT need a primer; it initiates de novo.
Elongation: RNA polymerase moves along the template strand 3'→5', adding ribonucleotides complementary to the template (A pairs with U in RNA, not T). The RNA transcript grows 5'→3'. The DNA double helix reforms behind the polymerase. Only one strand (the template strand) is read; the other (coding/sense strand) has the same sequence as the RNA (except T→U).
Termination: RNA polymerase reaches a terminator sequence in the DNA. The RNA transcript is released. In prokaryotes, termination can be intrinsic (hairpin loop in RNA) or Rho-dependent. In eukaryotes, a poly-A signal sequence triggers cleavage and release of the pre-mRNA.
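The base-pairing rule used during elongation can be sketched in a few lines. This is an illustrative example only: the template sequence is hypothetical, and the function name `transcribe` is an assumption, not a standard API. Because the template is written 3'→5' and RNA grows 5'→3', pairing each template base in order gives the mRNA already written 5'→3'.

```python
# DNA template base -> RNA base it pairs with (A pairs with U in RNA, not T)
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the mRNA (5'->3') made from a DNA template written 3'->5'."""
    return "".join(PAIRING[base] for base in template_3_to_5)

template = "TACGGTTCAATT"   # hypothetical template strand, written 3'->5'
mrna = transcribe(template)
print(mrna)                 # AUGCCAAGUUAA

# The coding (sense) strand matches the mRNA except that it has T where RNA has U.
coding_strand = mrna.replace("U", "T")
print(coding_strand)        # ATGCCAAGTTAA
```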
RNA Types and Functions
| RNA Type | Full Name | Function | Made By |
|---|---|---|---|
| mRNA | Messenger RNA | Carries the genetic message from nucleus to ribosome; the sequence of codons determines the amino acid sequence | RNA polymerase II (eukaryotes) |
| tRNA | Transfer RNA | Adaptor molecule: one end carries a specific amino acid; the other end has an anticodon that base-pairs with the complementary codon on mRNA during translation | RNA polymerase III |
| rRNA | Ribosomal RNA | Structural and catalytic component of ribosomes; rRNA in the large ribosomal subunit has peptidyl transferase activity (it is a ribozyme) and catalyzes peptide bond formation | RNA polymerase I |
| snRNA | Small nuclear RNA | Component of the spliceosome — involved in intron removal (splicing) during mRNA processing | RNA polymerase II/III |
| miRNA / siRNA | Micro/small interfering RNA | Small regulatory RNAs that silence gene expression by degrading complementary mRNA or blocking translation (RNA interference / RNAi) | Processed from precursor transcripts |
Eukaryotic mRNA Processing (Pre-mRNA → Mature mRNA)
In eukaryotes, the initial transcript (pre-mRNA) is extensively modified before it leaves the nucleus. Three key processing steps:
5' cap: A modified guanine nucleotide (7-methylguanosine) is added to the 5' end of the pre-mRNA early in transcription. Functions: (1) ribosome recognition (ribosomes use the cap to initiate translation), (2) protection from degradation by nucleases at the 5' end.
Poly-A tail: A string of ~100–200 adenine nucleotides (poly-A) is added to the 3' end of the pre-mRNA by poly-A polymerase after cleavage at a poly-A signal (AAUAAA). Functions: (1) stabilizes mRNA by protecting the 3' end from nuclease degradation, (2) aids export from nucleus, (3) involved in translation initiation.
Splicing: Introns (intervening sequences) are non-coding regions removed from the pre-mRNA by the spliceosome (a complex of snRNAs and proteins). Exons (expressed sequences) are joined together. Alternative splicing: the same pre-mRNA can be spliced differently in different cell types, generating different mature mRNAs and thus different proteins, which greatly expands the proteome from a limited number of genes.
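Splicing and alternative splicing can be illustrated with a toy model. The sketch below assumes a hypothetical pre-mRNA described as an ordered list of labeled segments; the sequences, segment names, and the `splice` function are invented for illustration and do not represent a real gene.

```python
# Hypothetical pre-mRNA as ordered (kind, name, sequence) segments.
pre_mrna = [
    ("exon",   "E1", "AUGGCU"),
    ("intron", "I1", "GUAAGUCAG"),
    ("exon",   "E2", "GAUCCA"),
    ("intron", "I2", "GUGAGUUAG"),
    ("exon",   "E3", "UUCUAA"),
]

def splice(segments, skip_exons=()):
    """Remove all introns and join the remaining exons in their original order.
    Exons listed in skip_exons are left out, modeling alternative splicing."""
    return "".join(
        seq for kind, name, seq in segments
        if kind == "exon" and name not in skip_exons
    )

full_length = splice(pre_mrna)                  # all three exons joined
isoform = splice(pre_mrna, skip_exons={"E2"})   # exon E2 skipped

print(full_length)  # AUGGCUGAUCCAUUCUAA
print(isoform)      # AUGGCUUUCUAA
```

The same pre-mRNA yields two different mature mRNAs, which is the sense in which one gene can give rise to more than one protein.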
Prokaryotes: No nucleus → transcription and translation are coupled (occur simultaneously on the same mRNA). No pre-mRNA processing — mRNA goes directly to ribosomes. Ribosomes begin translating the 5' end of mRNA while it is still being transcribed at the 3' end.
Eukaryotes: Transcription in nucleus, translation in cytoplasm — processes are separated in space and time. Extensive pre-mRNA processing (cap, poly-A, splicing) occurs before export through nuclear pores. This allows more regulatory control over gene expression.
A eukaryotic gene has the following DNA template strand sequence (3'→5'): 3'-TACGCAATTGCGG-5'. What is the sequence of the mRNA transcribed from this template, written 5'→3'?
- (A) 3'-AUGCGUUAACGCC-5'
- (B) 5'-AUGCGUUAACGCC-3'
- (C) 5'-TACGCAATTGCGG-3'
- (D) 5'-ATGCGTAACGCC-3'
Translation
Translation is the process of decoding the mRNA sequence into a polypeptide (protein). The ribosome reads mRNA codons (triplets of bases) and recruits tRNA molecules carrying the correct amino acids, building the protein one amino acid at a time. Translation occurs on ribosomes in the cytoplasm of both prokaryotes and eukaryotes, and on the rough ER in eukaryotes for secreted/membrane proteins.
The Genetic Code
Each codon — a sequence of 3 consecutive mRNA bases — specifies one amino acid (or a start/stop signal). Key properties of the genetic code:
- Triplet code: 4³ = 64 possible codons; 61 specify the 20 amino acids (including AUG, which doubles as the start codon) and 3 are stop codons
- Degenerate (redundant): Most amino acids are encoded by more than one codon — the third base often doesn't matter (wobble position). Leucine has 6 codons; methionine and tryptophan each have only 1.
- Start codon: AUG (codes for methionine); translation of every polypeptide begins with Met
- Stop codons: UAA, UAG, UGA; signal termination, and no amino acid is inserted
- Universal: Nearly identical in all living organisms, which is strong evidence for common ancestry
- Non-overlapping: Read as consecutive, non-overlapping triplets from the start codon
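The triplet arithmetic can be checked by enumerating every possible codon. This is a small sketch using only the Python standard library; the variable names are illustrative.

```python
from itertools import product

# Every ordered triplet of the four RNA bases.
codons = ["".join(triplet) for triplet in product("UCAG", repeat=3)]

STOP_CODONS = {"UAA", "UAG", "UGA"}
sense_codons = [codon for codon in codons if codon not in STOP_CODONS]

print(len(codons))        # 64
print(len(sense_codons))  # 61
```

With 61 amino-acid-specifying codons for only 20 amino acids, most amino acids must be encoded by more than one codon, which is what degeneracy means.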
The Three Stages of Translation
| Stage | Key Events |
|---|---|
| Initiation | Small ribosomal subunit binds to the 5' cap and scans for the AUG start codon; initiator tRNA (Met-tRNA) binds to AUG in the P site; large ribosomal subunit joins; the A site is open and ready to receive the next tRNA |
| Elongation | (1) Codon recognition: Aminoacyl-tRNA enters the A site; anticodon base-pairs with mRNA codon. (2) Peptide bond formation: Ribosome catalyzes transfer of the growing polypeptide chain from P-site tRNA to the amino acid on A-site tRNA (peptidyl transferase activity of rRNA). (3) Translocation: Ribosome advances 3 nucleotides (one codon) in the 5'→3' direction; A-site tRNA moves to P site; previous P-site tRNA exits via E site; new A site is ready |
| Termination | A stop codon (UAA, UAG, or UGA) enters the A site; no tRNA can bind stop codons — a release factor binds instead; the polypeptide is released from the ribosome; ribosomal subunits dissociate |
tRNA Structure and Function
tRNA molecules have two critical regions: (1) the acceptor stem at the 3' end, where the specific amino acid is covalently attached (by aminoacyl-tRNA synthetase — a highly specific enzyme, one per amino acid), and (2) the anticodon loop, which contains 3 bases that are complementary and antiparallel to the mRNA codon. Codon–anticodon base pairing ensures the correct amino acid is inserted.
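Because the anticodon is complementary and antiparallel to the codon, it can be computed as a reverse complement. The sketch below assumes strict Watson-Crick pairing and deliberately ignores wobble pairing at the third codon position; `anticodon_5_to_3` is a hypothetical helper name.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_5_to_3(codon):
    """Return the anticodon, written 5'->3', for an mRNA codon written 5'->3'.
    The codon is reversed because the two RNAs pair in antiparallel orientation."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))

# Codon 5'-AUG-3' pairs with anticodon 3'-UAC-5', which reads 5'-CAU-3'.
print(anticodon_5_to_3("AUG"))  # CAU
```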
Retroviruses — An Exception to the Central Dogma
HIV and other retroviruses carry RNA genomes. After infection, the viral enzyme reverse transcriptase copies the RNA genome into a DNA copy (cDNA). This DNA then integrates into the host cell's chromosomes (becoming a provirus) and is transcribed and translated to produce new viral proteins. This RNA → DNA flow is an additional pathway in information transfer — it does not overturn the central dogma but demonstrates that genetic information can also flow from RNA back to DNA.
Reading a codon table: Given mRNA codons, look up the amino acid. Start codon AUG = Methionine. Stop codons UAA, UAG, UGA = no amino acid. This skill is tested directly on the AP exam — a codon chart is provided on the exam.
Counting the polypeptide: If an mRNA has a start codon and a stop codon, the number of amino acids = number of codons between AUG and stop codon (inclusive of AUG, exclusive of stop). Common calculation question.
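The counting rule can be sketched as a short translation loop. This is a minimal example with a deliberately partial codon table (only the codons needed here); the mRNA sequence is hypothetical, and `translate` is an illustrative name rather than a library function.

```python
# Partial codon table for illustration only; None marks a stop codon.
CODON_TABLE = {
    "AUG": "Met", "GGU": "Gly", "UCA": "Ser", "CAG": "Gln",
    "UAA": None, "UAG": None, "UGA": None,
}

def translate(mrna):
    """Translate from the first AUG, reading non-overlapping triplets,
    and stop (without adding an amino acid) at the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid is None:   # stop codon: release, nothing inserted
            break
        peptide.append(amino_acid)
    return peptide

mrna = "GCAUGGGUUCACAGUAAGG"   # hypothetical: 5' leader, AUG ... UAA, 3' trailer
peptide = translate(mrna)
print(peptide)       # ['Met', 'Gly', 'Ser', 'Gln']
print(len(peptide))  # 4
```

The count includes Met from the start codon and excludes the stop codon, as stated above.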
Universality of the genetic code as evolution evidence: The same codon (AUG) means methionine in bacteria, fungi, plants, and humans — this shared code is powerful evidence that all life descended from a common ancestor.
An mRNA molecule has the sequence: 5'-AUGCCGUUAAGCUGA-3'. How many amino acids are in the polypeptide produced, and what is the last amino acid added before termination? (Use AUG=Met, CCG=Pro, UUA=Leu, AGC=Ser, UGA=Stop)
- (A) 5 amino acids; the last is Serine
- (B) 4 amino acids; the last is Leucine
- (C) 4 amino acids; the last is Serine
- (D) 5 amino acids; the last is encoded by UGA
Regulation of Gene Expression
Not all genes are expressed all the time in all cells. Cells regulate which genes are transcribed (and how much) in response to developmental stage, cell type, and environmental signals. This control operates at multiple levels: chromatin structure, transcription initiation, post-transcriptional processing, and post-translational modification.
Prokaryotic Gene Regulation — The Operon Model
In prokaryotes, related genes are often grouped in an operon — a cluster of genes under control of a single promoter and operator. The most important AP example is the lac operon in E. coli.
Components: Promoter (where RNA polymerase binds), Operator (where repressor binds), Structural genes (lacZ, lacY, lacA — encode enzymes for lactose metabolism), and the lac repressor protein (encoded by separate regulatory gene).
Default state (no lactose present): The lac repressor protein binds the operator → blocks RNA polymerase from transcribing the structural genes → genes are OFF. The cell saves energy by not making enzymes it doesn't need.
When lactose is present: Lactose is converted to allolactose (an isomer), which acts as the inducer — it binds the lac repressor → repressor changes shape → can no longer bind operator → RNA polymerase can now transcribe → lac genes are expressed → lactose-digesting enzymes are made.
Catabolite repression (positive regulation by cAMP/CAP): When glucose is also present, cAMP levels are low → CAP (catabolite activator protein) is inactive → transcription reduced even with lactose present. When only lactose is available (no glucose), cAMP is high → CAP binds the promoter → enhances RNA polymerase binding → maximum transcription. This ensures glucose is used first (preferred energy source).
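The combined effect of the repressor and CAP can be summarized as a simple decision rule. This is a simplified sketch of the logic described above, not a quantitative model: real cells show a low basal level of transcription even in the "off" state, and the function name and the labels "off", "low", and "high" are assumptions for illustration.

```python
def lac_operon_transcription(lactose, glucose):
    """Qualitative lac operon output from two environmental signals."""
    repressor_bound = not lactose   # no allolactose: repressor stays on the operator
    cap_active = not glucose        # low glucose: cAMP is high, so CAP is active
    if repressor_bound:
        return "off"
    return "high" if cap_active else "low"

for lactose in (False, True):
    for glucose in (False, True):
        level = lac_operon_transcription(lactose, glucose)
        print(f"lactose={lactose}, glucose={glucose}: {level}")
```

Only the combination of lactose present and glucose absent gives maximum transcription, which is why glucose is consumed first.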
Eukaryotic Gene Regulation
Eukaryotic gene regulation is far more complex than prokaryotic, operating at many levels simultaneously:
| Level | Mechanism | Effect |
|---|---|---|
| Chromatin remodeling | Histone acetylation (loosens DNA–histone binding → euchromatin); histone deacetylation or methylation (tightens → heterochromatin) | Controls physical access to DNA for transcription |
| DNA methylation | Addition of methyl groups to cytosine bases (usually at CpG islands in promoters) → generally silences gene expression | Heritable gene silencing; involved in genomic imprinting, X-inactivation |
| Transcription factors | Activator proteins bind enhancers (can be far from gene) → recruit coactivators and RNA polymerase → ↑ transcription. Repressors bind silencers → ↓ transcription | Cell-type specific gene expression — same DNA, different proteins expressed in liver vs. muscle cells |
| Alternative splicing | Different exons included/excluded from same pre-mRNA in different cell types | One gene → multiple protein isoforms |
| mRNA stability | Poly-A tail length, miRNA binding → affects mRNA half-life and availability for translation | Controls amount of protein made without changing transcription rate |
| Post-translational modification | Phosphorylation, glycosylation, ubiquitination → activates, inactivates, or marks proteins for degradation | Fine-tunes protein activity and turnover |
Epigenetics
Epigenetics refers to heritable changes in gene expression that do NOT involve changes in the DNA sequence itself. Epigenetic modifications include histone modifications (acetylation, methylation, phosphorylation) and DNA methylation. These marks can be passed from parent cell to daughter cells during mitosis and, in some cases, even to offspring. Key examples: genomic imprinting (one parental allele silenced by methylation), X-inactivation (Barr bodies — one X chromosome per cell condensed into heterochromatin), and cancer (abnormal methylation patterns).
Describe the mechanism by which the lac operon in E. coli is regulated. In your answer, explain what happens when (a) neither glucose nor lactose is present, and (b) only lactose is present.
(a) Neither glucose nor lactose present:
The lac repressor protein (encoded by the regulatory gene) is in its active form, with high affinity for the operator. It binds to the operator sequence, physically blocking RNA polymerase from transcribing the structural genes (lacZ, lacY, lacA). Because glucose is absent, cAMP is high and CAP is active, but the bound repressor still prevents transcription. The lac operon is OFF. No lactose-metabolizing enzymes are made. The cell saves energy because there is no need to produce enzymes for a substrate that is absent.
(b) Only lactose present (no glucose):
Lactose is converted to allolactose (the true inducer). Allolactose binds to the lac repressor, causing a conformational change — the repressor loses affinity for the operator and detaches. RNA polymerase can now bind the promoter and transcribe the structural genes. Additionally, because glucose is absent, cAMP levels are high. High cAMP activates CAP (catabolite activator protein). CAP binds to the CAP site near the promoter and recruits RNA polymerase more efficiently → strong transcription of the lac operon. Lactose-digesting enzymes (β-galactosidase, permease) are produced.
Gene Expression and Cell Specialization
All cells in a multicellular organism contain the same genome (same DNA sequence), yet liver cells, muscle cells, and neurons look and function completely differently. This diversity arises from differential gene expression — different cells express different subsets of their genes. Gene regulation is therefore the molecular basis of development and cell differentiation.
How Transcription Factors Drive Cell Specialization
RNA polymerase alone cannot bind a promoter efficiently. It requires transcription factors — proteins that bind specific DNA sequences (promoters, enhancers, silencers) and either recruit or block RNA polymerase:
- Activators bind enhancer sequences (can be thousands of base pairs away from the gene, even downstream) and interact with the transcription initiation complex to increase transcription rate.
- Repressors bind silencer sequences and inhibit transcription — either by blocking RNA polymerase binding or by recruiting chromatin-compacting enzymes.
- Each cell type has a unique combination of transcription factors → expresses a unique set of genes → has a unique proteome → exhibits unique structure and function.
Sequential Gene Expression in Development
Development proceeds through induction: signals from one cell type trigger transcription factor expression in neighboring cells, which then express a new set of target genes, creating cascading waves of gene activation. HOX genes encode master regulatory transcription factors that determine body plan patterning (which end is the head, which is the tail; which segment develops limbs). HOX gene mutations cause homeotic transformations, e.g., legs growing where antennae should be in Drosophila. HOX genes are conserved across most animal phyla, which is another line of evidence for common ancestry.
Small RNA Molecules and Gene Silencing
miRNA (microRNA) and siRNA (small interfering RNA) are short (~21–23 nucleotide) regulatory RNA molecules that act post-transcriptionally. They bind complementary sequences in target mRNAs (for miRNAs, typically in the 3' untranslated region, or 3'UTR), leading to either mRNA degradation or blocked translation. This RNA interference (RNAi) pathway is a widespread gene-silencing mechanism. miRNAs regulate development, cell cycle, apoptosis, and stress responses. Their dysregulation is associated with cancer.
All cells in an organism have the same DNA, but different cells express different genes. This fundamental principle is tested frequently: a liver cell and a neuron have identical genotypes but very different phenotypes due to differential gene expression. Stem cells can differentiate into many cell types (totipotent cells into any cell type, pluripotent cells into nearly all) because their gene expression patterns have not yet been locked in.
Enhancers and promoters are separate elements. Promoters are immediately upstream of the gene; enhancers can be far away. Both affect transcription rate, but enhancers are position- and orientation-independent. Mutations in enhancer sequences can cause disease without altering the protein-coding sequence itself.
Mutations
A mutation is any heritable change in the DNA sequence. Mutations are the ultimate source of all genetic variation — they provide the raw material for evolution. Mutations can be beneficial, neutral, or detrimental depending on their effect on protein function and the environmental context in which they occur.
Types of Point Mutations (Single Nucleotide Changes)
| Mutation Type | Change | Effect on Protein | Example |
|---|---|---|---|
| Missense | One nucleotide substituted → different amino acid in protein | May alter or eliminate protein function, depending on where the change occurs (active site vs. elsewhere) and the chemical similarity of amino acids | Sickle cell anemia: GAG→GTG → Glu→Val in hemoglobin β chain |
| Silent (synonymous) | One nucleotide substituted → same amino acid (due to degeneracy of genetic code) | No change in protein sequence; typically no effect on phenotype | UCA→UCG both code for Serine |
| Nonsense | One nucleotide substituted → premature stop codon | Truncated (shorter) protein — usually non-functional; often causes disease | CAG (Gln) → UAG (Stop) → truncated protein |
Frameshift Mutations (Insertions and Deletions)
Adding or deleting one or two nucleotides shifts the reading frame of the mRNA: all codons downstream of the mutation are altered because the ribosome reads in non-overlapping triplets from the start codon. This typically destroys protein function by altering nearly every amino acid after the mutation site. Adding or deleting 3 nucleotides (or a multiple of 3) inserts or removes one or more whole amino acids but does NOT cause a frameshift.
In general, from least to most severe effect on protein function:
Silent (no AA change) < Missense at non-critical site < Missense at active site < Nonsense (truncation) < Frameshift (alters all downstream amino acids).
However, context matters: a silent mutation in a splice site could cause aberrant splicing and be highly damaging; a missense mutation might be beneficial in a new environment.
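The mutation categories above can be expressed as a small classifier. This sketch uses a partial codon table built from the mRNA codons in this section's examples; `classify_substitution` and `reading_frame` are hypothetical helper names, and the frameshift sequence is invented for illustration.

```python
# Partial codon table (mRNA codons) for illustration only.
CODON_TABLE = {
    "GAG": "Glu", "GUG": "Val",   # sickle cell example (Glu -> Val)
    "UCA": "Ser", "UCG": "Ser",
    "CAG": "Gln", "UAG": "Stop",
}

def classify_substitution(original, mutant):
    """Classify a single-codon substitution as silent, nonsense, or missense."""
    before, after = CODON_TABLE[original], CODON_TABLE[mutant]
    if before == after:
        return "silent"
    if after == "Stop":
        return "nonsense"
    return "missense"

def reading_frame(mrna):
    """Split an mRNA into non-overlapping triplets, dropping any leftover bases."""
    usable = len(mrna) - len(mrna) % 3
    return [mrna[i:i + 3] for i in range(0, usable, 3)]

print(classify_substitution("UCA", "UCG"))  # silent
print(classify_substitution("GAG", "GUG"))  # missense
print(classify_substitution("CAG", "UAG"))  # nonsense

normal = "AUGGAGUCACAGUAA"
deleted = normal[:3] + normal[4:]           # delete one base after the start codon
print(reading_frame(normal))   # ['AUG', 'GAG', 'UCA', 'CAG', 'UAA']
print(reading_frame(deleted))  # ['AUG', 'AGU', 'CAC', 'AGU']
```

After the single-base deletion, every codon downstream of the start codon changes and the original stop codon is no longer read in frame.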
Causes and Consequences of Mutations
- Spontaneous mutations: Errors during DNA replication; spontaneous chemical changes in bases (deamination, depurination); tautomeric shifts
- Induced mutations (mutagens): UV radiation (forms thymine dimers → transcription/replication errors); ionizing radiation (X-rays — breaks DNA strands); chemical mutagens (base analogs, intercalating agents, alkylating agents)
- Chromosomal-level mutations: Nondisjunction → aneuploidy (Unit 5.2); deletions, duplications, inversions, translocations of large chromosome segments
Horizontal Gene Transfer in Prokaryotes — A Source of Variation
Unlike eukaryotes, which inherit genes almost entirely vertically (parent → offspring), prokaryotes also transfer genes horizontally between cells, even between different species. This greatly accelerates evolution (especially the spread of antibiotic resistance) and is a source of genetic variation:
Transformation: Uptake of free DNA fragments from the environment directly through the cell membrane. Natural competence varies by species. Basis of lab bacterial transformation used in biotech (Topic 6.8).
Transduction: A bacteriophage (bacterial virus) accidentally packages host bacterial DNA and injects it into a new host cell during infection. Transfers genes between bacteria via viral vectors.
Conjugation: Direct cell-to-cell DNA transfer through a pilus (protein tube connection). The donor cell transfers a plasmid or chromosomal genes to the recipient. Major mechanism for antibiotic resistance spread.
Transposition: Mobile genetic elements (transposons, or "jumping genes") move DNA segments from one location to another within or between DNA molecules; they can disrupt genes or create new combinations.
A mutation changes the 5th codon in a gene from GGU (Glycine) to GGC. Which type of mutation is this, and what is its most likely effect on the resulting protein?
- (A) Nonsense mutation; the protein will be truncated at the 5th amino acid position.
- (B) Missense mutation; Glycine will be replaced by a different amino acid at position 5.
- (C) Silent mutation; the amino acid at position 5 will still be Glycine, so the protein sequence is unchanged.
- (D) Frameshift mutation; all amino acids after position 5 will be changed.
Biotechnology
Modern biotechnology exploits the molecular machinery of cells to analyze and manipulate genetic information. The AP exam focuses on the conceptual understanding of four key techniques: PCR, gel electrophoresis, bacterial transformation, and DNA sequencing. You need to know what each does and why, not the detailed molecular mechanics.
Polymerase Chain Reaction (PCR)
PCR amplifies (makes millions of copies of) a specific DNA sequence in vitro — even from a tiny starting sample. The process mimics DNA replication in a thermal cycler:
Denaturation: High heat breaks hydrogen bonds between DNA strands → double helix separates into two single strands (template strands). Same as helicase action in the cell, but using heat instead of an enzyme.
Annealing: Temperature drops → short, synthetic single-stranded DNA primers (complementary to sequences flanking the target) bind to the template strands. Primers define the boundaries of the amplified region.
Extension: Taq polymerase (heat-stable DNA polymerase from Thermus aquaticus) extends the primers in the 5'→3' direction, synthesizing new complementary strands. After ~30 cycles: 2³⁰ ≈ 1 billion copies of the target sequence (in theory, per starting molecule).
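The doubling arithmetic can be written as a one-line formula. The sketch below assumes ideal doubling each cycle by default; the `efficiency` parameter is an added assumption (not from the text) to show how less-than-perfect cycles would reduce the yield, and `pcr_copies` is an illustrative name.

```python
def pcr_copies(starting_molecules, cycles, efficiency=1.0):
    """Theoretical copy number after PCR.
    efficiency=1.0 means every template is copied in every cycle (perfect doubling)."""
    return starting_molecules * (1 + efficiency) ** cycles

ideal = pcr_copies(1, 30)
print(ideal)                           # 1073741824.0, about 1 billion
print(pcr_copies(1, 30, 0.9) < ideal)  # True: imperfect cycles yield fewer copies
```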
Applications: Forensic DNA profiling; diagnosis of genetic diseases and infections (COVID-19 PCR test); amplifying ancient DNA; paternity testing; cancer mutation detection; research.
Gel Electrophoresis
Gel electrophoresis separates DNA (or RNA or protein) fragments by size. DNA is negatively charged (from phosphate groups) and migrates through an agarose gel matrix toward the positive electrode when an electric current is applied. Smaller fragments migrate faster and travel further than larger fragments.
- DNA migrates from the negative (−) electrode to the positive (+) electrode (DNA is negative)
- Smaller fragments travel further (lower molecular weight = faster migration)
- A DNA ladder (size standard) of known fragment sizes is run alongside — allows determination of unknown fragment sizes by comparison
- DNA is stained (e.g., ethidium bromide or SYBR Green) and visualized under UV light as bands
- Each band = a population of fragments of the same size
- Forensic DNA fingerprinting: comparison of VNTR (variable number tandem repeat) patterns between individuals; match probabilities are calculated from population allele frequencies, and it is extremely unlikely that two individuals (other than identical twins) share the same pattern across many loci
Bacterial Transformation
Bacterial transformation introduces foreign (recombinant) DNA into a host bacterial cell. The procedure: (1) target gene is inserted into a vector (usually a bacterial plasmid with an antibiotic resistance marker and a promoter) to create recombinant DNA; (2) bacteria are made "competent" to take up DNA; (3) bacteria are plated on antibiotic medium — only transformed bacteria (carrying the plasmid with antibiotic resistance) survive; (4) selected bacteria express the foreign gene and produce the desired protein.
Applications: Mass production of human insulin (insulin gene in E. coli), human growth hormone, clotting factors, vaccines; generating transgenic organisms; producing recombinant proteins for research.
DNA Sequencing
DNA sequencing determines the exact order of nucleotides in a DNA molecule. Modern next-generation sequencing (NGS) can sequence an entire human genome in hours. Applications include clinical diagnosis of genetic diseases, cancer mutation profiling, evolutionary analysis (phylogenetics), and personal genomics.
DNA profiling (DNA fingerprinting) is a separate technique that compares characteristic DNA markers — typically short tandem repeats (STRs) or variable number tandem repeats (VNTRs) — between individuals. Rather than reading the full sequence, it generates a pattern of fragment sizes unique to each individual (except identical twins). Applications: forensic identification, paternity testing, ancestry analysis. DNA profiling is often visualized using gel electrophoresis (see Topic 6.8 above).
A forensic scientist runs a gel electrophoresis comparing DNA samples from a crime scene and four suspects (A, B, C, D). The DNA ladder shows bands at 1000, 800, 600, 400, and 200 bp. The crime scene sample shows bands at approximately 800 and 400 bp. Suspect B shows bands at 800, 600, and 200 bp. Suspect C shows bands at 800 and 400 bp. Which suspect's DNA pattern matches the crime scene sample, and what does this mean?
Interpretation: The matching band pattern means Suspect C has the same DNA fragment sizes at the loci analyzed as the DNA found at the crime scene. This is consistent with Suspect C being the source of the crime scene DNA. However, this is probabilistic evidence, not absolute proof — a probability calculation based on allele frequencies in the population would accompany this result to determine how likely it is that the match occurred by chance. The pattern of Suspect B (800, 600, 200) does not match (different set of fragments).
Key gel reading principles applied: (1) Smaller fragments travel further from the well (lower on gel). (2) Each band represents fragments of identical size. (3) Matching bands at the same positions = same fragment sizes = same genetic profile at those loci.
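The band comparison in this example can be sketched as a set comparison. This is a simplified illustration that assumes each band has already been rounded to the nearest ladder size; only the two suspects whose bands are given in the question are included, and the function names are hypothetical.

```python
# Band sizes in base pairs, taken from the worked example above.
crime_scene = {800, 400}
suspects = {
    "B": {800, 600, 200},
    "C": {800, 400},
}

def gel_order(bands):
    """List bands from the well downward: the largest fragments stay nearest
    the well, and the smallest travel farthest."""
    return sorted(bands, reverse=True)

def matching_suspects(evidence, profiles):
    """Return the names whose band pattern is identical to the evidence."""
    return [name for name, bands in profiles.items() if bands == evidence]

print(gel_order(crime_scene))                    # [800, 400]
print(matching_suspects(crime_scene, suspects))  # ['C']
```

A match here only means identical fragment sizes at the loci tested; the statistical weight of that match still depends on population allele frequencies.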
Mixed Practice Questions
A mutation changes a single nucleotide in the middle of a gene, converting a codon for glutamic acid (a charged, polar amino acid) to a codon for valine (a nonpolar amino acid) in the active site of an enzyme. Trace the full consequences of this mutation from DNA to phenotype.
Step 1 — DNA: A single base-pair substitution (a point mutation) occurs within the gene's coding sequence. For example, the coding-strand triplet GAA becomes GTA, so the corresponding template-strand triplet (written 3'→5') changes from CTT to CAT.
Step 2 — Transcription: When the gene is transcribed by RNA polymerase, the mutant DNA template produces a mutant mRNA. The codon for glutamic acid is replaced by a codon for valine (e.g., GAA → GUA).
Step 3 — Translation: During translation, the ribosome reads the mutant mRNA. When the altered codon is reached, a valine-tRNA (anticodon complementary to the valine codon) enters the A site instead of the glutamic acid-tRNA. Valine is incorporated into the polypeptide at that position instead of glutamic acid.
Step 4 — Protein structure: The primary structure (amino acid sequence) of the enzyme is changed. Glutamic acid is negatively charged and polar; valine is nonpolar. This change in R-group chemistry disrupts the normal noncovalent interactions (ionic bonds, hydrophobic interactions) that maintain the enzyme's tertiary structure. The active site shape is altered — the active site may no longer properly accommodate the substrate.
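Step 5 — Phenotype: The enzyme has reduced or no catalytic activity (cannot bind substrate or catalyze the reaction efficiently). If this enzyme is required for a critical metabolic pathway, the pathway is disrupted, leading to an altered phenotype, possibly a metabolic disease, depending on the importance of the enzyme. Example: The same kind of substitution (Glu→Val in the β-globin chain of hemoglobin) underlies sickle cell anemia. Hemoglobin is an oxygen-transport protein rather than an enzyme, and there the nonpolar valine causes deoxygenated hemoglobin molecules to stick together instead of disrupting an active site, but the chain of effects from DNA to phenotype is the same.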
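The DNA → mRNA → amino acid part of this trace can be checked with a short Python sketch. It assumes the example codons used above (coding-strand GAA mutated to GTA) and uses a deliberately partial codon table containing only the codons needed here. The helper names are hypothetical.

```python
# Partial codon table (standard genetic code): only the codons for this example.
CODON_TABLE = {"GAA": "Glu", "GAG": "Glu", "GUA": "Val", "GUG": "Val"}

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Pairing of template DNA bases with the RNA bases added by RNA polymerase.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def template_strand(coding_5to3):
    """Return the template strand written 3'->5' (base-by-base complement)."""
    return "".join(DNA_COMPLEMENT[base] for base in coding_5to3)


def transcribe(template_3to5):
    """Read the template 3'->5' and build the mRNA 5'->3'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


def translate_codon(codon):
    """Look up the amino acid for one mRNA codon."""
    return CODON_TABLE[codon]


for label, coding in [("wild type", "GAA"), ("mutant", "GTA")]:
    template = template_strand(coding)
    mrna = transcribe(template)
    print(label, coding, template, mrna, translate_codon(mrna))
# wild type GAA CTT GAA Glu
# mutant GTA CAT GUA Val
```

Note that the mRNA matches the coding strand with U in place of T, which is why the mutant codon reads GUA.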
A scientist uses PCR to amplify a segment of DNA from a 5000-year-old bone sample. After 35 cycles of PCR, the scientist has enough DNA for gel electrophoresis analysis. Which enzyme is most critical for the extension step of PCR, and why?
- (A) Standard E. coli DNA polymerase, because it is the most accurate polymerase available
- (B) RNA polymerase, because RNA primers must be extended
- (C) Taq polymerase, because it is heat-stable and remains active after the denaturation step at ~95°C
- (D) Ligase, because Okazaki fragments must be joined during PCR synthesis
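To see why 35 cycles yield enough DNA for a gel, and why the polymerase has to survive a ~95°C denaturation step in every one of those cycles, here is an idealized doubling calculation in Python. The starting copy number and the `efficiency` parameter are assumptions for illustration; real reactions fall short of perfect doubling, especially in late cycles and with degraded ancient DNA.

```python
def pcr_copies(starting_copies, cycles, efficiency=1.0):
    """Idealized copy number after PCR.

    Each cycle multiplies the template by (1 + efficiency);
    efficiency=1.0 means perfect doubling every cycle.
    """
    return starting_copies * (1 + efficiency) ** cycles


# Hypothetical case: a single intact template molecule, perfect doubling.
print(f"{pcr_copies(1, 35):,.0f}")  # 34,359,738,368
```

Under these assumptions one template molecule becomes roughly 34 billion copies after 35 cycles, so the enzyme doing the extension must stay active through 35 rounds of heating.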
High-Frequency Errors to Avoid
- 🧬 Saying RNA polymerase reads the coding strand (sense strand): RNA polymerase reads the TEMPLATE strand (antisense strand, 3'→5') and synthesizes RNA in the 5'→3' direction. The resulting RNA has the same sequence as the CODING strand (except T→U). This direction matters for transcription problems, so always identify which strand is the template.
- 🔄 Saying transcription and translation are coupled in eukaryotes: Coupling (simultaneous transcription and translation) occurs only in PROKARYOTES (no nucleus). In eukaryotes, transcription occurs in the nucleus, mRNA is processed and exported, and translation occurs in the cytoplasm. These are spatially and temporally separated.
- ⛔ Saying stop codons code for an amino acid: UAA, UAG, and UGA are stop codons; they do NOT encode any amino acid. They signal termination of translation. Release factor proteins (not tRNAs) bind stop codons, causing the polypeptide to be released.
- 🔬 Confusing introns and exons: Introns are INtervening sequences; they are spliced OUT of the pre-mRNA and degraded. Exons are EXpressed sequences; they are retained and joined to form the mature mRNA. Memory trick: "EXons are EXpressed; INtrons are IN the trash."
- 📊 Thinking larger DNA fragments travel further in gel electrophoresis: Smaller fragments travel FURTHER (faster) through the gel. Larger fragments are slowed more by the agarose matrix. The wells are at the TOP; DNA migrates DOWNWARD toward the positive electrode. Bands near the bottom = small fragments; bands near the top = large fragments.
- 🌿 Saying all mutations are harmful: Mutations can be beneficial (provide a survival advantage), neutral (no effect on fitness, especially silent mutations), or detrimental. The effect depends on environmental context: sickle cell heterozygotes have an advantage in malaria-endemic regions. Mutations are the raw material for evolution, so evolution literally depends on mutations occurring.
- 🧪 Saying point mutations always cause frameshifts: Frameshifts require INSERTIONS or DELETIONS of nucleotides (indels) whose length is not a multiple of three. Point mutations, in the sense of single-base SUBSTITUTIONS, never cause frameshifts. They change one codon to another codon, potentially changing one amino acid (missense), creating a stop codon (nonsense), or having no effect (silent).
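The last distinction in the list above (substitutions versus indels) can be made concrete with a small Python sketch. It assumes a partial codon table with just enough entries for the demonstration, and it assumes the original codon codes for an amino acid. The function names are hypothetical.

```python
STOP = "Stop"
# Partial codon table (standard genetic code), just enough for the demo calls.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu", "GUA": "Val",
    "UAA": STOP, "UAG": STOP, "UGA": STOP,
}


def classify_substitution(original_codon, mutant_codon):
    """Classify a single-base substitution within one mRNA codon.

    Assumes the original codon codes for an amino acid (not a stop codon).
    """
    if len(original_codon) != 3 or len(mutant_codon) != 3:
        raise ValueError("codons must be exactly three bases")
    changed = sum(a != b for a, b in zip(original_codon, mutant_codon))
    if changed != 1:
        raise ValueError("expected exactly one changed base")
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if before == after:
        return "silent"
    if after == STOP:
        return "nonsense"
    return "missense"


def classify_indel(bases_inserted_or_deleted):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    if bases_inserted_or_deleted <= 0:
        raise ValueError("indel length must be a positive number of bases")
    if bases_inserted_or_deleted % 3 == 0:
        return "in-frame indel"
    return "frameshift"


print(classify_substitution("GAA", "GAG"))  # silent
print(classify_substitution("GAA", "GUA"))  # missense
print(classify_substitution("GAA", "UAA"))  # nonsense
print(classify_indel(1))                    # frameshift
print(classify_indel(3))                    # in-frame indel
```

A substitution always stays inside one codon, so the reading frame downstream is untouched; only an indel can shift it.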
Unit 6 — Key Takeaways
Prokaryotes: circular chromosome + plasmids. Eukaryotes: linear chromosomes + histones + nucleosomes. Chromatin packing (hetero vs. euchromatin) regulates gene access.
DNA replication is semiconservative. Helicase (unwinds) → Primase (RNA primer) → DNA Polymerase 5'→3' (leading: continuous; lagging: Okazaki fragments) → Ligase (joins fragments). Meselson-Stahl evidence.
RNA polymerase reads template 3'→5', synthesizes RNA 5'→3'. Eukaryotic processing: 5' cap + poly-A tail + splicing (introns out, exons in). Alternative splicing → protein diversity. Prokaryotes: coupled transcription/translation.
mRNA → protein on ribosomes. Codons (triplets); AUG = start (Met); UAA/UAG/UGA = stop. tRNA anticodon pairs with codon. Initiation → Elongation → Termination. Genetic code is universal = common ancestry. Retroviruses: RNA→DNA via reverse transcriptase.
Prokaryotes: lac operon (repressor blocks operator; allolactose = inducer; cAMP/CAP = positive regulation). Eukaryotes: chromatin remodeling, DNA methylation, transcription factors, enhancers/silencers, miRNA. Epigenetics = heritable non-sequence changes.
All cells same genome → differential gene expression → different cell types. Transcription factors drive cell identity. HOX genes pattern body plans. miRNA and siRNA post-transcriptional silencing.
Silent (no AA change), Missense (different AA), Nonsense (premature stop), Frameshift (insertion/deletion shifts reading frame). Horizontal gene transfer in prokaryotes: transformation, transduction, conjugation, transposition.
PCR: amplifies DNA (denature→anneal→extend with Taq polymerase). Gel electrophoresis: separates by size (small = travels further). Bacterial transformation: insert foreign gene via plasmid vector. DNA sequencing: determines nucleotide order.
Unit 6 = 12–16% of the AP Biology Exam, tied with Unit 3 for the second-highest weight (only Unit 7 is weighted higher). The top exam topics are: transcription direction rules (template vs. coding strand), translation codon/anticodon problems with the genetic code chart, eukaryotic mRNA processing (cap, poly-A, splicing with introns/exons), lac operon regulation (all four states), mutation type identification and effect on protein, and gel electrophoresis interpretation (small fragments travel further). For FRQs, always trace from DNA mutation → mRNA change → amino acid change → protein structure change → phenotype change. This "molecular chain of effects" is the most common FRQ pattern in Unit 6.