Oklahoma State University

Molecular Genetics

Genetics is defined as the mechanism by which traits are passed on from one organism to another and how they are expressed.
How does information flow? Central dogma of molecular biology
Replication - to ensure that following generations acquire genetic material
Transcription - intermediate process between DNA and protein synthesis. Generates mRNA, rRNA and tRNA.
Translation - messenger RNA is translated by ribosomes into a polypeptide based on the genetic code of the nucleic acids. The genetic code is made up of "words" that are 3 bases long; each word is called a codon.
Reiterate - central dogma of molecular biology is that genetic information flows from DNA through RNA to protein.
Genetic information - genetic information is stored primarily in DNA.
What is DNA? DNA is composed of four nucleotides adenine, guanine, cytosine and thymine. Each nucleotide is composed of a nitrogenous base, deoxyribose, and phosphate groups. DNA is a polynucleotide molecule where the backbone is a sugar phosphate backbone and the nitrogenous bases stick out like rungs on a ladder. The phosphate linkage is called a phosphodiester linkage between the 5 prime phosphate group of one sugar and the 3 prime hydroxyl group on the second sugar.
DNA is a double helix where two complementary polynucleotide strands are held together by hydrogen bonds. Specifically, adenine and thymine complement each other and guanine and cytosine complement each other. The two strands are oriented in an antiparallel fashion, meaning that one strand runs 5 prime to 3 prime and its complement runs 3 prime to 5 prime. The DNA is not a flat ladder but rather has a twist to it, which gives it its helical nature.
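The pairing and antiparallel rules above can be sketched in a few lines of Python. This is only an illustration; the function name complement_strand and the example sequence are made up for this sketch.

```python
# Watson-Crick pairing from the notes: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5 prime to 3 prime."""
    # Pair every base with its partner.
    paired = [COMPLEMENT[base] for base in strand_5_to_3.upper()]
    # The partner strand runs antiparallel, so reverse it to read it 5' to 3'.
    return "".join(reversed(paired))

print(complement_strand("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original strand, which is one way to see that each strand carries the full information needed to rebuild the other.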
The amount of DNA in a microorganism may be in the thousands of kilobase pairs. E. coli has about 4700 kilobase pairs, which would be about 1.5 millimeters in length if it were linear, but bacterial chromosomes are circular.
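The length figure can be checked with a rough calculation. The sketch below assumes a rise of 0.34 nanometers per base pair (the usual value for B-form DNA) and a cell about 2 micrometers long; neither number comes from these notes, so treat both as assumptions.

```python
BASE_PAIRS = 4_700_000    # about 4700 kilobase pairs, from the notes
RISE_NM_PER_BP = 0.34     # assumed rise per base pair for B-form DNA
CELL_LENGTH_UM = 2.0      # assumed length of a typical E. coli cell

def contour_length_mm(base_pairs: int, rise_nm: float = RISE_NM_PER_BP) -> float:
    """Length of the DNA if stretched out straight, in millimeters."""
    return base_pairs * rise_nm * 1e-6  # 1 nm = 1e-6 mm

length_mm = contour_length_mm(BASE_PAIRS)
fold = (length_mm * 1000) / CELL_LENGTH_UM  # mm converted to micrometers

print(f"stretched length: {length_mm:.2f} mm")       # roughly 1.6 mm
print(f"about {fold:.0f} times longer than the cell")
```

Under these assumptions the chromosome is several hundred times longer than the cell that holds it, which is why packing is a real problem.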
The problem is still getting all of that DNA into a cell. How is the DNA packed into a cell? The answer is as a supercoiled molecule. Supercoiled DNA is double stranded DNA that is further twisted on itself, in either a negative or a positive sense. Most DNA in nature is negatively supercoiled. Supercoiled DNA is only stable if the two strands are intact. If one strand is nicked, the supercoil tension is released and the DNA is in a relaxed state.
How is DNA supercoiled in bacteria? An enzyme called DNA gyrase introduces the supercoils. DNA gyrase is one of the topoisomerases, enzymes which change the topology of DNA - specifically, it is topoisomerase II.
Topoisomerase I removes supercoils from DNA. A nick is made in one of the two strands and the supercoil relaxes. In bacteria, like eukaryotic cells, a single nick does not cause the entire chromosome to become relaxed since there are some 50 domains of supercoil which are independent of each other. The level of supercoiling is balanced between the activity of the two topoisomerases.
Genetic elements include the chromosome and other elements.
Chromosome - single circular double stranded helix that codes for indispensable functions - so called house keeping genes
Plasmid - small circular double stranded helix. Plasmids code for important properties for the cell such as antibiotic resistance. There may be more than one type of plasmid in a cell.
Transposable elements - pieces of DNA that can move about the chromosome. Three types of transposable elements:
insertion sequences that carry no genetic information other than what is required to move.
Transposons - which carry other genes in addition to what is required to move.
Some special viruses -
DNA Replication - making a second copy of the chromosome before the cell divides.
Beginning of replication - in a circular DNA molecule there is a single origin of replication (oriR), which is where replication begins. The origin of replication opens up and DNA replication begins on the two single strands. As the two strands are separated, a replication fork is formed and proceeds down the DNA. Usually bidirectional replication occurs.

leading strand - newly synthesized strand that grows towards the replication fork. Replication begins with a small stretch of RNA - a primer - laid down by primase. DNA polymerase III catalyzes the bonding between the 3 prime hydroxyl group of the primer and the 5 prime phosphate group of the incoming nucleotide with the concomitant hydrolysis of the terminal two phosphate groups. This leads to 5´ to 3´ extension of the newly synthesized strand.
At the replication fork, a helicase unwinds the DNA to expose the single strand. Helicases are ATP-dependent enzymes that hydrolyze ATP as they move in advance of the replication fork. Single stranded binding proteins stabilize single stranded DNA before it forms double stranded DNA.
lagging strand - because DNA is antiparallel, this strand is synthesized discontinuously. What do we mean by discontinuously? Synthesis of the lagging strand occurs in starts and stops. Why? Because the 3 prime hydroxyl group is pointed away from the replication fork. As the replication fork opens up, primase lays down 11 bases and then DNA polymerase III lays down nucleotides until it reaches the beginning of the previous short segment. These short segments are called Okazaki fragments and are about 1000 bases long. Here Pol III falls off and Pol I removes the primer sequence, laying down deoxynucleotides until it has removed all of the ribonucleotides of the primer, and then it falls off. DNA ligase forms a bond between the two new fragments.
The fidelity of replication is remarkable: about one error in between 10^8 and 10^11 base pairs. Errors are corrected by the proofreading activity of Pol III. Pol III has 3 prime to 5 prime exonuclease activity to remove any mistaken nucleotide that has been incorrectly inserted.
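To put the error rate in perspective, here is a small back-of-the-envelope calculation of how many errors to expect each time an E. coli sized chromosome (about 4700 kilobase pairs) is copied. The function name is made up for this sketch, and the calculation simply multiplies genome size by error rate.

```python
GENOME_BP = 4_700_000  # E. coli chromosome, about 4700 kilobase pairs

def expected_errors(genome_bp: int, errors_per_bp: float) -> float:
    """Average number of mistakes per complete copy of the genome."""
    return genome_bp * errors_per_bp

# The two ends of the range quoted in the notes.
for rate in (1e-8, 1e-11):
    errors = expected_errors(GENOME_BP, rate)
    print(f"error rate {rate:g}: {errors:g} errors per replication, "
          f"or one error every {1 / errors:,.0f} replications")
```

Even at the less accurate end of the range, most rounds of replication produce a perfect copy of the chromosome.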
Transcription
The synthesis of RNA from DNA template. There are three different RNA molecules: messenger RNA, ribosomal RNA, and transfer RNA. There are three key differences between RNA and DNA: 1. ribose instead of deoxyribose; 2. uracil instead of thymine; 3. RNA is usually single stranded.
RNA polymerase - the enzyme that catalyzes the formation of RNA by transcribing DNA into RNA. Requires a DNA template, ribonucleotides ATP, GTP, CTP and UTP. Elongation is 5 prime to 3 prime just like DNA synthesis. No need for a primer unlike DNA synthesis.
Template DNA is usually double stranded DNA but only one of the two strands is transcribed for any gene.
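A minimal sketch of what transcription does to the template strand is given below. The name transcribe and the example sequence are made up for illustration. The template is written 5 prime to 3 prime, and the function returns the RNA also written 5 prime to 3 prime.

```python
# Base pairing during transcription: uracil replaces thymine in the RNA.
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_5_to_3: str) -> str:
    """Return the RNA transcript (5' to 3') made from a DNA template strand."""
    # RNA grows 5' to 3', so the template is read from its 3 prime end backwards.
    read_order = reversed(template_5_to_3.upper())
    return "".join(RNA_PARTNER[base] for base in read_order)

template = "CATTGC"          # template strand, 5' to 3'
print(transcribe(template))  # GCAAUG
```

Note that the transcript has the same sequence as the non-template strand (GCAATG here), with U in place of T.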
RNA polymerase is composed of beta, beta prime, and two alpha subunits plus a sigma subunit. The core enzyme is beta, beta prime and the two alpha subunits.
RNA polymerase must start at the proper site of each gene to generate a complete, correct transcript. RNA polymerase binds at the promoter site to orient the enzyme at the correct start position. It is the sigma factor of the polymerase that recognizes the promoter region.
Features of the promoter sequence of DNA - Two highly conserved sequences in the promoter sequence of many different genes. -10 region or the Pribnow box has a sequence of TATAAT and the -35 region has the sequence TTGACA. Again these sequences are highly conserved but not perfectly conserved among all genes.
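Because the two boxes are conserved but not perfectly conserved, a search for promoters has to tolerate a few mismatches. The sketch below scans a DNA sequence for a -35 box followed by a -10 box. The spacing of 16 to 18 bases between the boxes and the default of one allowed mismatch per box are assumptions of this sketch, not values from these notes, and the function names are made up.

```python
MINUS_35 = "TTGACA"  # consensus -35 region
MINUS_10 = "TATAAT"  # consensus -10 region (Pribnow box)

def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def find_promoters(seq: str, max_mismatches: int = 1, spacers=range(16, 19)):
    """Return (start of -35 box, start of -10 box) pairs, 0-based."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        if mismatches(seq[i:i + 6], MINUS_35) > max_mismatches:
            continue
        # Look for a -10 box at each assumed spacing after the -35 box.
        for gap in spacers:
            j = i + 6 + gap
            box = seq[j:j + 6]
            if len(box) == 6 and mismatches(box, MINUS_10) <= max_mismatches:
                hits.append((i, j))
    return hits

example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GG"
print(find_promoters(example))  # [(2, 25)]
```

Raising max_mismatches finds weaker promoters but also more false hits, which mirrors the point that real promoters vary around the consensus.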
Features of transcription terminators - Some terminators are a result of secondary structures in the RNA transcript that are stem loops followed by runs of uracil. A second terminator is a GC rich region followed by an AT rich region. A third type of termination is due to an extrinsic factor called Rho, which binds to the RNA and moves toward the DNA/RNA polymerase complex. When the RNA polymerase stalls at a rho dependent site, rho causes RNA polymerase and RNA to leave the DNA.
Features of messenger RNA - mRNA is unstable, unlike tRNA or rRNA. In prokaryotes a single mRNA often codes for 2 or more polypeptides rather than just one. This is referred to as polycistronic mRNA.
Features of transfer RNA - have two important features: 1. carry an amino acid and 2. recognize a sequence on the messenger RNA - the codon.

tRNA structure - short single stranded RNAs of about 73-93 nucleotides in length. There are many unusual nucleotides in tRNAs which are a result of post-transcriptional modifications of the tRNA. The tRNA folds back upon itself to have secondary structure due to internal base pairing. It is often drawn to look like a clover leaf, which is slightly misleading.
One loop of the tRNA is very important - the anticodon loop - which is where you find the anticodon. The three nucleotides of the anticodon recognize the three nucleotides of the codon in the mRNA molecule.
The other important end of the tRNA is the 3 prime end, which is always CCA, where the amino acid attaches to the terminal adenine nucleotide via an ester linkage.
How is the correct amino acid attached to the correct tRNA? There are key regions of the tRNA like the anticodon region and the 5 prime end of the tRNA that are important in the recognition of the correct tRNA by the correct aminoacyl-tRNA synthetase protein.
The first reaction involves the activation of the amino acid by a reaction with ATP to form aminoacyl-AMP, which remains bound to the synthetase protein. A correct tRNA enters this complex and the amino acid is transferred to the tRNA to form a charged tRNA - aminoacyl-tRNA. The aminoacyl-tRNA participates in translation.
Translation - the synthesis of a polypeptide. Ribosomes are the site of protein synthesis. Ribosomes are composed of two subunits - 30S and 50S subunits. Each of these are composed of ribosomal RNA and proteins. For example the 50S subunit is composed of a 5S and 23S rRNA and about 34 different proteins.
Protein synthesis can be broken down into initiation, elongation, and termination. These processes go on continuously in the cell.
Initiation of protein synthesis.
30S subunit, mRNA, formylmethionine tRNA and initiation factors are required. These components form a complex near the 5 prime end of the mRNA at a site called the Shine-Dalgarno sequence, which interacts with the 16S rRNA to ensure that the ribosome is starting at the beginning of a gene. The 50S subunit then joins the complex to form a complete ribosome. The first codon of the mRNA is usually AUG - the start codon for translation. This codon is recognized by formyl-methionine tRNA. The formyl group and maybe even the methionine amino acid may be removed later.
Elongation - Important features of the 50S subunit are two sites called the A site and P site. The A site is the accepting site where incoming amino acids dock with the ribosome. The P site is the peptide site where the growing peptide is attached to a tRNA. Initially f-met-tRNA occupies the P site and the A site is empty. A charged tRNA then comes and occupies the A site. Which charged tRNA? That depends on the exposed codon in the A site. Now with both P and A sites occupied, a peptide bond forms between the carboxyl group of the P site amino acid and the amino group of the A site amino acid. Hydrolysis of the aminoacyl-tRNA bond is used to drive the peptide bond formation. Peptidyl transferase is responsible for forming the peptide bond. Now the peptide is attached to the tRNA in the A site. The peptide-tRNA must now move to the P site in a process called translocation, which expends another GTP, and a new codon is exposed in the A site. The empty tRNA is moved to what appears to be a third site in the ribosome called the E site. This continues until the ribosome reaches a specific termination signal.
Termination - there are three codons that are stop or nonsense codons - UAA, UAG, and UGA. There are no tRNAs that recognize these codons; rather, release factors recognize these codons, come in and cleave the peptide from the tRNA, and cause the ribosome to dissociate.
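Initiation at AUG, codon-by-codon elongation, and termination at a stop codon can be mimicked with a short Python sketch. The one-letter amino acid abbreviations and the compact way of building the codon table are conveniences of this sketch rather than anything from the notes; the stop codons are shown as "*", and the first residue (formylmethionine in bacteria) is simply shown as M.

```python
from itertools import product

# Standard genetic code, codons ordered by first, second, third base over U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG until a stop codon (or the end of the mRNA)."""
    mrna = mrna.upper()
    start = mrna.find("AUG")          # initiation: locate the start codon
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):   # elongation: one codon at a time
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":                  # termination: UAA, UAG or UGA
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("GGAUGGCUUAAGG"))  # MA
```

The sketch ignores the Shine-Dalgarno sequence and simply takes the first AUG it finds, so it is a simplification of how a real ribosome picks its start site.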
Energy requirements - 4 high energy phosphate bonds are hydrolyzed per amino acid incorporated into the peptide. 2 bonds are consumed by the aminoacyl-tRNA synthetase activity, 1 bond when the charged tRNA enters the A site, and 1 bond for the translocation process. The point is that protein synthesis is energy demanding, and we will see how it is controlled later.
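The cost adds up quickly for a whole protein. The sketch below just multiplies out the per-amino-acid numbers given above; the dictionary keys, the function name, and the 300 amino acid example protein are made up for illustration.

```python
# High energy phosphate bonds used per amino acid, as listed in the notes.
BONDS_PER_AMINO_ACID = {
    "charging the tRNA (synthetase)": 2,
    "charged tRNA entering the A site": 1,
    "translocation": 1,
}

def bonds_for_peptide(n_amino_acids: int) -> int:
    """Total high energy phosphate bonds hydrolyzed to build a peptide."""
    per_residue = sum(BONDS_PER_AMINO_ACID.values())  # 4 per amino acid
    return per_residue * n_amino_acids

print(bonds_for_peptide(300))  # 1200
```

This simple count follows the notes and charges every amino acid the same 4 bonds; it does not try to account for initiation or termination separately.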
Secreted proteins - proteins that must pass out the cytoplasmic membrane into the periplasmic space or extracellular. These proteins have an extra 15-20 amino acids on the N-terminal portion of the polypeptide. These amino acids are referred to as the signal sequence. The signal sequence is usually rich in hydrophobic amino acids to help it insert into the cytoplasmic membrane. Once secreted, the signal sequence is cleaved by specific peptidases in a posttranslational modification.
Antibiotics - protein synthesis is a great target for antibiotics to control growth of the bacteria. Specifically, antibiotics that interact with the ribosomes are medically useful. Streptomycin inhibits initiation; puromycin, chloramphenicol, cycloheximide, and tetracycline inhibit elongation. Puromycin will compete with incoming amino acids for the A site. Chloramphenicol inhibits peptide bond formation.
Because of the differences between eukaryotic and prokaryotic ribosomes, drugs can be used to stop one group and not the other. Streptomycin and chloramphenicol inhibit prokaryotes and cycloheximide inhibits eukaryotes.
Genetic code - recall that three nucleotides in the mRNA code for a specific amino acid and these three nucleotides are called a codon. If there are 4 possible bases for each position of a codon, then there are 4^3 = 64 possible codons. There are only 20 or so amino acids though. What does this mean? More than one codon may code for a specific amino acid - but no codon codes for more than one amino acid! This is referred to as degeneracy or redundancy.
Significance of degeneracy is that i) there is more than one tRNA for some amino acids and ii) a single tRNA may pair with more than one codon in what is called the wobble effect. There is not a tRNA for every codon, so some tRNAs recognize more than one codon. How? The first two bases are complementary but the third base may be a mismatch, and still the correct amino acid is incorporated into the polypeptide.
Open reading frames - it is essential that the ribosome starts at the right codon to initiate the polypeptide. There are many safeguards to ensure that this is so. Recall that most polypeptides begin with f-met, which is coded for by AUG. This is the start codon and sets the ribosome in the right reading frame.
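The codon count and the degeneracy of the code can be checked directly. The sketch below builds the standard codon table and counts how many codons specify each amino acid; the one-letter amino acid abbreviations and the variable names are conveniences of this sketch, with "*" marking a stop codon.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered by first, second, third base over U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
codon_table = dict(zip(codons, AMINO_ACIDS))

print(len(codons))  # 64, the same as 4 ** 3

# How many codons specify each amino acid ("*" marks a stop codon).
codons_per_amino_acid = Counter(codon_table.values())
for amino_acid, count in sorted(codons_per_amino_acid.items(), key=lambda kv: -kv[1]):
    print(amino_acid, count)
```

Leucine (L), serine (S) and arginine (R) each have six codons, while methionine (M) and tryptophan (W) have only one. Because the table is a dictionary keyed by codon, each codon maps to exactly one amino acid, which is the "no codon codes for more than one amino acid" rule.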
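The idea of a reading frame can be illustrated with a small open reading frame finder. It is a simplified sketch: it looks only at the mRNA as given, treats every AUG as a possible start, and reports a frame only if an in-frame stop codon follows. The function name and the example sequence are made up.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_orfs(mrna: str):
    """Return (start, end) positions, 0-based with end exclusive, of each AUG...stop frame."""
    mrna = mrna.upper()
    orfs = []
    start = mrna.find("AUG")
    while start != -1:
        # Step through the message three bases at a time from this AUG.
        for i in range(start, len(mrna) - 2, 3):
            if mrna[i:i + 3] in STOP_CODONS:
                orfs.append((start, i + 3))  # include the stop codon
                break
        start = mrna.find("AUG", start + 1)
    return orfs

message = "CCAUGAAAUGGUAGCC"
print(find_orfs(message))  # [(2, 14)]
```

The example contains a second AUG at position 7, but read in that frame (AUG, GUA, GCC) no stop codon is reached, so only the first frame is reported. This shows how the choice of start codon fixes which triplets the ribosome reads.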
Another feature of the genetic code is that it is universal. Essentially the same between Escherichia coli and human beings. Your book talks about some exceptions to the universality of the genetic code. But still, these exceptions appear to have evolved from the universal code. In some instances stop codons have been assigned an amino acid.