ADDRESS Department of Pharmacology, University of Cambridge, Tennis Court Road, Cambridge CB2 1PD UK
CONTACT t: 01223 334176
2023 Mark Howarth. All rights reserved.
Outreach: A protein alphabet

I was thinking about the diversity of protein shapes and so curated the alphabet below. (Some of the lab’s research relates to building new protein architectures.) Note that these are not structures from my lab; they come from groups around the world. It is those groups who did the hard work!

When academic researchers solve a protein structure, they make the structure freely available in the Protein Data Bank (PDB), for anyone to look at and learn from. In the table below, I briefly describe the function of the protein and give a link to the PDB code, so you can click on that code to easily find more about any structure that interests you.

This alphabet was published in Nature Structural and Molecular Biology in May 2015.

These structures are shown in cartoon format, which makes the overall path of the polypeptide chain easier to see; alpha-helices are shown as coils and beta-strands are shown as arrows. The structures are coloured with the N-terminus of each chain in blue and the C-terminus in red.

Movie of protein alphabet rotating or watch on YouTube

Can you help?
- What other alphabets are there representing shapes in science? I know of one from Kjell Bloch Sandved finding the alphabet on butterfly wings. Also, alphabets of human cell nuclei, tissues from histochemical staining, polymer lithography, or DNA self-assembly.
- Do you know other protein structures matching well to letters or numbers? (O and C are relatively easy to find. B, E, F, G, H, K, and R are hard to find.)

Where all written alphabets came from: BBC Four - The Secret History of Writing

Videos for non-scientists
What is a Protein? Learn about the 3D shape and function of macromolecules.
Introduction to crystallography through cartoons from the Royal Institution.

Protein alphabet resources
Please use these files freely for any non-commercial purposes. Copyright on files is mine, so you don’t need journal permission.
- Just type in to spell out words in Protein Alphabet: converter on webpage from NIH
- Complete protein alphabet image (as above): Low resolution (tif, 1MB), Low res jpg, High resolution (tif, 10MB)
- Editable protein alphabet image so easy to re-arrange letters in PowerPoint, or as .xcf (Zip file 9MB) for more control using the freeware GNU Image Manipulation Program.
- All individual protein letters (Zip of .pse files 15 MB) for viewing 3D structures with PyMOL software.
- Movie as mp4 (14 MB) or avi (20 MB)

Table of the protein alphabet
Click on the PDB code to go to the primary research paper. Hyperlinks on the right are helpful background.

PDB code  Function                                 Comments
3ifz      DNA topology                             DNA gyrase reaction core from M. tuberculosis. Target of antibiotic.
2qyc      Unknown                                  Ferredoxin-like protein from Bordetella bronchiseptica.
2bnh      Blocks RNA degradation                   Ribonuclease inhibitor from pig. Leucine-rich repeats. Ribonuclease will bind in centre super-tight.
4j3o      Pore to export surface proteins          Usher pore (24-stranded β-barrel) in outer membrane of E. coli. Non-pore subunits cut from the image.
2q5r      Milk sugar metabolism                    Tagatose-6-phosphate kinase from Staphylococcus aureus.
3j04      Muscle contraction                       Myosin fragment bound to regulatory chains, from chicken. From electron crystallography of 2D array.
4u48      Protease inhibitor                       α2-macroglobulin from Salmonella. Mimic of a protein in eukaryotic defence.
1xu9      Steroid metabolism                       Enzyme interconverting cortisone and cortisol from human. 4-helix bundle tetramerization site.
3h7x      Bacterial adhesion                       Part of adhesin from Yersinia enterocolitica. Trimeric coiled-coil.
1b3u      Cytosolic signaling                      Protein Phosphatase 2A regulatory subunit from human. 15 HEAT motifs.
4ox0      Gene regulation                          Keratin-like domain of transcription factor, SEPALLATA, from the model plant, Arabidopsis thaliana.
1ueb      Protein synthesis                        Elongation Factor P from Thermus thermophilus. Three β-barrels, mimicking negative charge and L-shape of transfer RNA.
1ou5      Protein synthesis                        Human enzyme adding CCA trinucleotide to 3’ end of transfer RNA.
1z85      RNA methyltransferase                    From Thermotoga maritima. β-barrel and 3-layer sandwich (predicted).
2wcd      Bacterial toxin                          Pore-forming toxin cytolysin A from E. coli. 12 copies of 3-helix bundle.
3afc      Development of nerves and blood vessels  Semaphorin 6A extracellular domain from mouse. Contains β-propeller fold.
3szv      Membrane channel                         Pseudomonas aeruginosa outer membrane channel. 18-stranded β-barrel.
2arp      Differentiation, inflammation            Human activin A bound to a fragment of follistatin.
2ot8      Nuclear import                           Human transportin recognizing a nuclear localization signal. HEAT repeats.
3e98      Unknown                                  GAF domain from Pseudomonas aeruginosa.
2vwe      Blood vessel formation                   Vascular Endothelial Growth Factor bound to neutralizing antibody fragment.
3h90      Metal ion transport                      E. coli transporter of zinc through inner membrane into the cytoplasm.
4cj9      DNA-binding protein                      DNA-binding domain from Burkholderia rhizoxinica. Helix-loop-helix repeats. Modular DNA-binding specificity useful for genome editing.
1w3b      Protein glycosylation                    Tetratricopeptide repeat domain of N-acetylglucosamine (GlcNAc) transferase from human.
1igt      Immune defence                           IgG antibody from mouse. The arms can flex to recognize different targets.
4bta      Collagen stabilization                   Part of collagen prolyl 4-hydroxylase from human, relating to role of Vitamin C in preventing scurvy.
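
If you would like to spell out words with the PDB codes yourself, much like the converter listed in the resources, here is a minimal Python sketch. It rests on two assumptions that the page does not state explicitly: that the 26 rows of the table run in order from A to Z, and that each structure has a page on the RCSB site at the URL pattern shown. The names PDB_CODES, LETTER_TO_PDB, RCSB_URL and spell are hypothetical helpers made up for this illustration.

```python
import string

# PDB codes copied from the table, in table order.
# ASSUMPTION: the 26 rows correspond to the letters A to Z, in that order.
PDB_CODES = [
    "3ifz", "2qyc", "2bnh", "4j3o", "2q5r", "3j04", "4u48", "1xu9", "3h7x",
    "1b3u", "4ox0", "1ueb", "1ou5", "1z85", "2wcd", "3afc", "3szv", "2arp",
    "2ot8", "3e98", "2vwe", "3h90", "4cj9", "1w3b", "1igt", "4bta",
]

# Pair each uppercase letter with the code in the same position.
LETTER_TO_PDB = dict(zip(string.ascii_uppercase, PDB_CODES))

# ASSUMPTION: structure pages on the RCSB PDB site follow this pattern.
RCSB_URL = "https://www.rcsb.org/structure/{code}"


def spell(word):
    """Return a (letter, PDB code, URL) tuple for each letter of `word`.

    Characters outside A-Z (spaces, digits, punctuation) are skipped.
    """
    result = []
    for ch in word.upper():
        code = LETTER_TO_PDB.get(ch)
        if code is not None:
            result.append((ch, code, RCSB_URL.format(code=code.upper())))
    return result


if __name__ == "__main__":
    for letter, code, url in spell("PDB"):
        print(letter, code, url)
```

Under these assumptions, spell("Lab") gives the codes 1ueb, 3ifz and 2qyc. If the letter order turns out to differ from the table order, only the PDB_CODES list needs changing.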
I knew the alphabet. Maybe I could be a writer. ― Hubert Selby
If plan A doesn't work, the alphabet has 25 more letters - 204 if you're in Japan. ― Claire Cook