Introduction: who am I? Language: English (negotiable). Handouts

1 Introduction: who am I? Language: English (negotiable). Handouts
Andrew Torda, winter semester 2006 / 2007, Angewandte …
who am I?
language: English (negotiable)
handouts (Zettel) + stine
exercises (Übungen) also on the web
where should information go? stine or our pages?

2 Administration
People
  Andrew Torda, 42838 7331, first floor / 105 (but use phone or stairs)
  secretary (Annette Schade) 7330
  Gundolf Schenk
  Paul Reuter (maths as well)
  Nasir Mahmood
  Stefan Bienert (more in DNA)
Lectures: Wednesday 12:30
Exercises (flexible): every second week, Friday 12:15
Andrew Torda Jan 2003

3 Homework / exercises (Übungen), textbooks
Homework: not too much, there is enough from other courses
Exercises: very short written report
Textbooks
  any biochemistry book (Stryer, Biochemistry, as per chem dept): expensive, not used too much
  statistics: Ewens, W.J. and Grant, G.R., Statistical Methods in Bioinformatics, Springer, 2001
  Leach, Andrew, "Molecular Modelling": very good for future semesters
Slides should be sufficient

4 Exams
any facts that are mentioned in these lectures and exercises
written exam 16 Feb

5 Protein structure - the problem - sociological
Easy? Boring? Essential
How many people have done biology? Chemistry?
My suggestions
  Friday 12:30 protein structure lecture
  or
  quickly here, for everybody in the normal lecture slot (the chemists can correct me) + Friday 12:30 optional protein structure with details

6 Broad themes
Theme of semester
given some information about a macromolecule (protein)
  what can be calculated? predicted?
  how much would you trust predictions?
  limitation, applicability, reliability
typical information
  a protein sequence (lots known)
  a protein structure (less known)
  a DNA sequence (think of genomes)

7 Specific and general models
Dream
  feed data to box and have it interpreted
  given my protein, what is the structure?
  given my spectrum, where is the centre of the peak?
Model types
  Specific: you know the structure of your data, fit points to the observations
  General: look for some patterns in data, with little understanding of the underlying theory
examples

8 Interpreting spectroscopic data
just an example (no spectroscopy in this course)
many kinds of peaks in spectroscopy look like this (figure not reproduced)
my mission: find centre (≈24) and height (≈0.08)
but they have noise

9 Noisy data
real world has noise
we still want centre, height
try simple smoothing
  no assumptions about data
  claim centre around 23
  looks believable

10 Using prior knowledge
I expect peaks like … (figure not reproduced)
A fit of a calculated peak…
something is clearly wrong
if a peak has a certain width, it must have an appropriate height
what looked good is not the correct form

11 More appropriate fitting
what if we used two peaks ? peaks centred at 20 and 26 very different explanation of data

12 General vs appropriate modelling
general smoothing method
  suggested one peak
  looks good, appears to explain observations
  generally applicable
testing with correct model
  suggested this is a trick
fitting with best model (two peaks)
  near perfect
summary
  if you know the underlying model, use it
  always applicable? back to biological questions
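The peak example on the last few slides can be sketched in a few lines. This is only an illustration under stated assumptions: the data are synthetic (two equal peaks at 20 and 26, width 3.5, noise level 0.002, all invented here), the "general" method is a plain moving average, and the "specific" models are normalised Gaussians fitted by a crude grid search. None of this is claimed to be the method used to make the original figures.

```python
import math
import random

def gauss(x, centre, sigma):
    """Normalised Gaussian peak: once the width is fixed, so is the height."""
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def moving_average(ys, half_window):
    """General-purpose smoothing that assumes nothing about peak shape."""
    smoothed = []
    for i in range(len(ys)):
        lo, hi = max(0, i - half_window), min(len(ys), i + half_window + 1)
        smoothed.append(sum(ys[lo:hi]) / (hi - lo))
    return smoothed

def sum_sq(xs, ys, model):
    """Sum of squared residuals between the data and a model function."""
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys))

# Synthetic data (assumed values): two equal peaks at 20 and 26 plus noise.
random.seed(0)
xs = [0.5 * i for i in range(93)]  # 0.0 .. 46.0
ys = [0.5 * gauss(x, 20, 3.5) + 0.5 * gauss(x, 26, 3.5) + random.gauss(0, 0.002)
      for x in xs]

# General approach: smooth, then call the highest point the centre.
smooth = moving_average(ys, 4)
centre_smooth = xs[max(range(len(xs)), key=lambda i: smooth[i])]

centres_lo = [15 + 0.5 * k for k in range(17)]  # 15 .. 23
centres_hi = [23 + 0.5 * k for k in range(17)]  # 23 .. 31
widths = [1 + 0.5 * k for k in range(13)]       # 1 .. 7

# Specific model 1: one normalised peak, grid search over centre and width.
sse_one, c_one, s_one = min(
    (sum_sq(xs, ys, lambda x: gauss(x, c, s)), c, s)
    for c in centres_lo + centres_hi for s in widths)

# Specific model 2: two equal normalised peaks sharing one width.
sse_two, c1, c2, s_two = min(
    (sum_sq(xs, ys, lambda x: 0.5 * gauss(x, a, s) + 0.5 * gauss(x, b, s)), a, b, s)
    for a in centres_lo for b in centres_hi for s in widths)

print("smoothing suggests one centre near", centre_smooth)
print("one-peak fit: centre", c_one, "width", s_one, "residual", sse_one)
print("two-peak fit: centres", c1, c2, "width", s_two, "residual", sse_two)
```

Smoothing happily reports a single centre between the two real peaks; only the comparison of residuals between the one-peak and two-peak models shows that the single peak is the wrong form.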

13 General purpose modelling
Proteins have "secondary structure"
It appears to reflect the sequence of amino acids
what is the rule?
  20 amino acids, N positions, 20^N sequences, patterns not clear
what to do?
  correct model: think of all atomic interactions, see where atoms should be placed (not practical)
  or forget physics: use dumb statistics / machine learning approaches
figure: bluetongue virus capsid protein
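To see why the patterns are "not clear", it helps to put numbers on 20^N. A minimal sketch (the function name is made up for illustration):

```python
def n_sequences(length, alphabet_size=20):
    """Number of distinct sequences of the given length over 20 amino acids."""
    return alphabet_size ** length

for n in (2, 10, 100):
    count = n_sequences(n)
    print(f"N = {n:3d}: {len(str(count))} digits")
```

Even for a short protein of 100 residues, the number of possible sequences has well over a hundred digits, so enumerating sequences to find a rule is not an option.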

14 Mixtures of specific and general
Will a ligand (drug, Wirkstoff) bind to a protein?
  with physics
    model all atomic interactions, best physical model
    calculate free energy (∆G) difference in solution / bound
  more generally
    gather idea of important terms (H-bonds, overlap, ..)
    try to find some function which often works
    do not stick to real physics
Will my drug dissolve in water or oil (lipid)? (important)
  sounds like chemistry
  usually approached by machine learning
  number of atoms, types of atoms, …
figure: bluetongue virus capsid protein

15 Similarity
Important in all bioinformatics
I have a protein of unknown structure / function / cell localisation
  is it similar to one of known structure, function …?
Similarity seems obvious
  two sets of numbers (above)
  two protein sequences, ACDEACDE and ADDEAQDE: rather similar, but quantified?
    how many positions differ?
    how long are proteins?
    could the similarity be by chance?
synteny plot: Dr. Brian Fristensky
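The questions on this slide can be made concrete for the two example sequences. The sketch below counts differing positions and then estimates how likely that level of identity is by chance. The chance model is a deliberately naive assumption made here (all 20 residues equally likely and independent, no alignment or gaps), not something the lecture specifies.

```python
from math import comb

def count_mismatches(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length (no alignment is done here)")
    return sum(1 for x, y in zip(a, b) if x != y)

def chance_of_at_least(matches, length, p_match=1 / 20):
    """Binomial probability of at least `matches` identical positions by chance.

    Assumes every position matches independently with probability p_match,
    which is a crude model (real residue frequencies are not uniform).
    """
    return sum(comb(length, k) * p_match ** k * (1 - p_match) ** (length - k)
               for k in range(matches, length + 1))

seq1, seq2 = "ACDEACDE", "ADDEAQDE"
mismatches = count_mismatches(seq1, seq2)
identity = 1 - mismatches / len(seq1)
p_chance = chance_of_at_least(len(seq1) - mismatches, len(seq1))

print(mismatches, "positions differ, identity", identity)
print("chance of this or better under the naive model:", p_chance)
```

The same fraction of identical positions would be far less convincing for very short sequences, which is why the slide asks about protein length.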

16 Similarity
Two genomes
  similarity: what are the descriptors?
  how many genes are common?
  is the order preserved?
Potential drugs
  drug 1 binds, will drug 2?
  how similar?
synteny plot: Fristensky, B.
ligands from Wang, N., DeLisle, R. K. and Diller, D.J. (2005), J. Med. Chem., 48,

17 Detection and Quantification
Models for prediction and interpretation often not well justified Similarity in these applications detection (finding / recognising) quantification Each in the context of applications first protein structure …

18 Summary so far
A model can explain observations, make predictions or both
A model may be
  based on a belief of the underlying chemistry / physics
  purely mathematical, probabilistic
Similarity
  we have objects with some information (proteins, ligands, genomes, sequences, …)
  we want to find similar objects and hope they have the same properties
  similarity has a different meaning in different areas

19 Proteins - who cares ? Most important molecules in life ? Ask the DNA / RNA people structural (keratin / hair) enzymes (catalysts) messengers (hormones) regulation (bind to other proteins, DNA, ..) industrial – biosensors to washing powder receptors transporters (O2, sugars, fats) anti-freeze …

20 Proteins are not friendly
Proteins are easy
  data (protein data bank, nearly … structures)
  literature on function, interactions, structure
  software: viewers, molecular dynamics simulators, docking, ..
  nomenclature and rules
Proteins are not friendly
  one cannot take a sequence and predict structure / function
  data formats are full of surprises, mostly old formats
  data contain errors and mistakes

21 Protein rules
Physics / chemistry versus rules / dogma / beliefs / folklore
Physics / chemistry
  protein + water = set of interacting atoms
  can be calculated (not really)
Rules (not quantified)
  proteins unfold if you heat them (exceptions?)
  if they contain lots of charged amino acids, they are soluble
  if they are more than 300 residues, they have more than one domain
  proteins fold to a unique structure (could you prove this?)
  lowest free energy structure

22 Protein chemistry
Chemists / biochemists may sleep (quietly)
Short version
  proteins are sets of building blocks (amino acids, residues, Reste)
  20 types of residue
  chains of length few to 10^3 (100 or 200 typical)
  small ones (< ≈50) are peptides
Longer version

23 Sizes
1 Å = 10^-10 m or 0.1 nm

structure            size
bond C-H             1 Å
bond C-C             1.5 Å
protein radius       … Å
α-helix spacing      5 ½ Å
Cα(i) to Cα(i+1)     3.8 Å

myoglobin picture 1gjn from

24 proteins are polymers simple polymers many times gives A X B A X B
example what kind of polymer would this give ? Is it obvious what R is ?

25 Why are proteins interesting polymers?
a boring polymer gives uninteresting structures
OK for plastic bags or cling film (Haushaltsfolie), but no nice regular structures
What can we do to make things more protein-like?

26 Giving proteins character 1
more complicated backbone with H-bond donor acceptor R basis of standard regular structures in proteins (secondary structure) repeating polymer unit: if this was all there was all proteins would be the same R

27 Protein chemistry
amino acids (monomers) all look like NH2-CH(R)-COOH, or maybe NH3+-CH(R)-COO-
  R = sidechain
  central carbon = α carbon or Cα
how can we construct specific structures?
  different kinds of "R" groups

28 Putting monomers together
(reaction diagram: amino acids with sidechains R1, R2, R3 joined through peptide bonds into one chain, NH2 at one end and OH at the other)
protein synthesis story (biochemistry lectures)?
peptides and proteins
  < 30 or 40 residues = peptide
  > 30 or 40 residues = protein

29 side chain possibilities
big / small charged +, charged -, polar hydrophobic (not water soluble), polar interactions between sites… A C T G B R W S

30 Backbone and consequences
peptide bond is planar
  partial double bond character
  shorter than other C-N
  nearly always trans
two bonds can rotate: phi (φ) and psi (ψ)
(backbone diagram with R1, R2, R3 and the φ and ψ angles marked)
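Phi and psi are ordinary dihedral (torsion) angles, each defined by four consecutive backbone atoms: phi by C(i-1), N, Cα, C and psi by N, Cα, C, N(i+1). A minimal sketch of the calculation from coordinates follows. The four points in the example are invented for illustration and are not real atom positions.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond (0 = cis, 180 = trans)."""
    b0 = sub(p0, p1)
    b1 = sub(p2, p1)
    b2 = sub(p3, p2)
    norm = math.sqrt(dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components along the central bond, leaving the two
    # directions as seen when looking down that bond.
    v = sub(b0, tuple(dot(b0, b1) * c for c in b1))
    w = sub(b2, tuple(dot(b2, b1) * c for c in b1))
    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Hypothetical points: the central bond runs along the y axis.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (-1, 1, 0))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1))
print(cis, trans, quarter)
```

Applying this function to every residue of a structure gives the (φ, ψ) pairs that are plotted in a Ramachandran plot.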

31 ramachandran plot can we rotate freely ? no… steric hindrance
diagrams from

32 Backbone H bonds
oxygen is slightly negative (δ-)
NH bond is polar (H is δ+)
H-bonds
  can be near or far in sequence
  fairly stable at room temperature

33 Secondary structure
regular structures using information so far
  rotate phi, psi angles so as to form H-bonds where possible
  do not force side chains to hit each other (steric clash)
two common structures
  α-helix
  β-strand / sheet

34 α-helix
each CO H-bonded to NH 3 or 4 away
3.6 residues per turn
2 H-bonds per residue
side chains well separated
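The numbers quoted for the α-helix hang together, which a short calculation shows. The sketch below places Cα atoms on an ideal helix using 3.6 residues per turn and the 5 ½ Å helix spacing from the sizes slide (taken here to be the pitch, which is an assumption). The helix radius of 2.3 Å is also an assumption made for this example and is not given in the slides.

```python
import math

RESIDUES_PER_TURN = 3.6  # from the slide
PITCH = 5.5              # Å per turn, assuming "α-helix spacing" means the pitch
RADIUS = 2.3             # Å, assumed radius of the Cα helix (not from the slides)

turn_angle = 360.0 / RESIDUES_PER_TURN  # degrees of rotation per residue
rise = PITCH / RESIDUES_PER_TURN        # Å travelled along the axis per residue

def ca_position(i):
    """Coordinates of the i-th Cα on an ideal helix along the z axis."""
    theta = math.radians(turn_angle * i)
    return (RADIUS * math.cos(theta), RADIUS * math.sin(theta), rise * i)

ca_ca = math.dist(ca_position(0), ca_position(1))
print(f"{turn_angle:.0f} degrees and {rise:.2f} Å per residue")
print(f"Cα(i) to Cα(i+1): {ca_ca:.2f} Å")
```

With these inputs the distance between neighbouring Cα atoms comes out close to the 3.8 Å listed on the sizes slide.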

35 β-sheet
β-strand: stretch out backbone and make NH and CO groups point out
β-sheet: join these strands together with H-bonds (2 H-bonds/residue)
anti-parallel or parallel
diagrams from

36 After α-helix and β-sheet
do helices and sheets explain everything ? no there is flexibility in the angles (look at plot) geometry is not perfectly defined there are local deviations and exceptions other common structures tighter helices some turns other structure coil, random, not named

37 What determines secondary structure ?
So far secondary structure pattern of H-bonding Almost all residues have H-bond acceptor and donor all could form α-helix or β-sheet ? No Difference ? sequence of side-chains – overall folding Why else are sidechains important chemistry of proteins (interactions, catalysis) Fundamental dogma the sequence of sidechains determines the protein shape why is dogma a good word ?

38 Side chain properties
properties
  big / small
  neutral / polar / charged
  special (…)
example: phenylalanine
  side chain looks like benzene (benzin)
  very insoluble
  benzene would rather interact with benzene than water
what if you have phe-phe-phe… poly-phe?
  does not happen in nature (can be made)
  would be insoluble
  not like a real peptide
phe is a constituent of real proteins and has a role

39 Properties are not clear cut
You can be big / small, hydrophobic / polar
combinations are possible
Do not memorise this figure
Taylor, W.R. (1986) J. Theor. Biol.,

40 Sidechain interactions
ionic (if the sidechains have charge) hydrophobic (insoluble sidechains) H-bonds (some donors and acceptors) repulsive

41 summary of amino acids from Diagram from MDL isis draw. Better pictures at

42 Amino Acids by property
aromatic tryptophan phenylalanine tyrosine

43 rather hydrophobic leucine isoleucine cysteine methionine alanine
proline glycine valine

44 Polar threonine serine glutamine asparagine

45 charged histidine arginine lysine aspartate glutamate

46 Hydrophobicity – how serious ?
very serious, but simplified
the lists above are pH dependent
difficult to measure experimentally (some aspects)
is hydrophobicity really defined?
Other properties: size, from small (gly, ala) … to big (trp)

47 Other properties – chemistry / geometry
proline only one rotatable angle ! peptide bond sometimes cis pro ramachandran plot 4000 points

48 gly and cys
glycine
  no side chain
  can visit forbidden parts of phi-psi map (4000 points here)
cysteine
  forms covalent links with other cys
picture from Stryer, L, Biochemistry, WH Freeman, 1981

49 Summary so far
proteins are heteropolymers
from the backbone alone, they form α-helices and β-strands (and more)
side-chains determine
  the pattern of secondary structure
  overall protein shape
special amino acids
  cys (forms disulfide bridges)
  gly (can visit "forbidden" regions of Ramachandran plot)
  pro (no H-bond donor)
last bits of nomenclature…

50 Nomenclature some rules are unavoidable
Alanine        Ala  A
Cysteine       Cys  C
Aspartic acid  Asp  D
Glutamic acid  Glu  E
Phenylalanine  Phe  F
Glycine        Gly  G
Histidine      His  H
Isoleucine     Ile  I
Lysine         Lys  K
Leucine        Leu  L
Methionine     Met  M
Asparagine     Asn  N
Proline        Pro  P
Glutamine      Gln  Q
Arginine       Arg  R
Serine         Ser  S
Threonine      Thr  T
Valine         Val  V
Tryptophan     Trp  W
Tyrosine       Tyr  Y

always write from N to C terminal (important convention)
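In practice this table lives in a lookup, because structure files use three-letter codes while sequence tools use one-letter codes. A minimal sketch (the function name is made up for illustration):

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "CYS": "C", "ASP": "D", "GLU": "E", "PHE": "F",
    "GLY": "G", "HIS": "H", "ILE": "I", "LYS": "K", "LEU": "L",
    "MET": "M", "ASN": "N", "PRO": "P", "GLN": "Q", "ARG": "R",
    "SER": "S", "THR": "T", "VAL": "V", "TRP": "W", "TYR": "Y",
}

def to_one_letter(residues):
    """Convert residue names, given in N to C order, to a one-letter string.

    Raises KeyError for anything that is not one of the 20 standard names,
    rather than silently guessing.
    """
    return "".join(THREE_TO_ONE[name.upper()] for name in residues)

sequence = to_one_letter("ala cys asp phe".split())
print(sequence)
```

Failing loudly on unknown names is a deliberate choice here, since real coordinate files also contain non-standard amino acids, nucleotides and ligands.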

51 Definitions, primary, secondary …
first, some more definitions
primary structure
  sequence of amino acids, ACDF… (ala cys asp phe…)
secondary structure
  α-helix, β-sheet (+ few more)
  structure defined by local backbone
tertiary structure
  how these units fold together
  coordinates of a protein
quaternary structure
  how proteins interact

52 Protein structure general comments
primary, secondary, tertiary structure … how real ? primary/secondary well defined edges can blur supersecondary struct / tertiary

53 Representation Ultimately, our representation of a structure…
ATOM N ARG BPI 137
ATOM CA ARG BPI 138
ATOM C ARG BPI 139
ATOM O ARG BPI 140
ATOM CB ARG BPI 141
ATOM CG ARG BPI 142
ATOM CD ARG BPI 143
ATOM NE ARG BPI 144
ATOM CZ ARG BPI 145
ATOM NH1 ARG BPI 146
ATOM NH2 ARG BPI 147
ATOM N PRO BPI 148
ATOM CA PRO BPI 149
ATOM C PRO BPI 150
ATOM O PRO BPI 151
ATOM CB PRO BPI 152
ATOM CG PRO BPI 153
ATOM CD PRO BPI 154
ATOM N ASP BPI 155
x, y, z coordinates
drawing the structure?

54 Drawings 3 ways of looking at “ras” protein
ribbons bare Cα trace ribbons and cylinders are these just cosmetic differences ? diagrams made with molscript

55 Different levels of abstraction
pictures from "Structural Bioinformatics", ed Bourne, PE and Weissig, H., Wiley New York (2003)

56 Atomistic / ribbons
Atomistic
  for details
  where does a ligand bind?
  which interactions is a residue involved in?
Ribbons
  overview
  shape
  number of secondary structure elements
  symmetry
  strands, helices
pictures from "Structural Bioinformatics", ed Bourne, PE and Weissig, H., Wiley New York (2003)

57 More abstract no idea of real shape
very quickly classify a protein – example lots of serine proteases lots of different sequences all very similar at this level of abstraction pictures from "Structural Bioinformatics", ed Bourne, PE and Weissig, H., Wiley New York (2003)

58 Why does structure matter ?
what residues can I change and preserve function?
what is the reaction mechanism of an enzyme?
what small molecules would bind and block the enzyme?
is this protein the same shape as some other of known function?
Where do structures come from? (topic of other course, lots of detail)
  X-ray crystallography
  NMR
  + a bit of small angle X-ray scattering, electron diffraction, …

59 Atomic coordinates - warnings
remember the coordinate file?
lots of problems
  atoms and residues missing
  numbering can be peculiar
  history: suits fortran 66 (think columns)
  non-standard amino acids
  nucleotides, ligands
  accuracy
(same ATOM records as on slide 53)
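"Think columns" has a practical consequence: ATOM records should be read by character position, not by splitting on whitespace. The sketch below uses the standard PDB column layout for ATOM records (serial in columns 7-11, atom name 13-16, residue name 18-20, chain 22, residue number 23-26, x, y, z in 31-54). The helper that builds the sample line and all values in it are invented for illustration.

```python
def format_atom(serial, name, res_name, chain, res_seq, x, y, z):
    """Build a minimal fixed-column ATOM line (illustrative helper only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res_name:>3s} {chain}{res_seq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

def parse_atom(line):
    """Read one ATOM record by column position; return None for other lines."""
    if not line.startswith("ATOM"):
        return None
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res_name": line[17:20].strip(),
        "chain": line[21],
        "res_seq": int(line[22:26]),
        "x": float(line[30:38]),
        "y": float(line[38:46]),
        "z": float(line[46:54]),
    }

# Made-up coordinates, chosen so that x and y fill their fields completely.
line = format_atom(138, "CA", "ARG", "A", 1, -100.123, -200.456, 3.5)
record = parse_atom(line)
print(line)
print(record)
print("whitespace split gives", len(line.split()), "fields")
```

In this example the x and y values run together with no space between them, so a whitespace split returns one field too few, while the column-based reader is unaffected.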

60 resolution, precision, accuracy
coordinates: what do they mean?
random errors
  non-systematic / noise / uncertainty
  should be scattered around correct point
from any measurement there are errors ±x.y
x-ray crystallography has a model for data
  uncertainty (probability)
  resolution (experimental)
    < 1 Å (good)
    > 5 Å (bad, but excusable: monster structures)

61 X-ray crystallography
non-systematic errors
  small problems (O and N look the same)
  few huge problems
  newer structures are better
proteins are not static
  overall motion
  local motion

62 NMR structures
different philosophy to X-ray
lots of little internal distances
  do not quite define structure
  generate 50 or 10^2 solutions
  look at scatter of solutions
as with X-ray
  some parts are well defined
  some not
structure 1sm7 from
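"Look at scatter of solutions" can be quantified per atom. A minimal sketch, assuming the models have already been superimposed and that each model lists the same atoms in the same order. The tiny three-model, two-atom ensemble is invented for illustration.

```python
import math

def per_atom_scatter(models):
    """RMS distance of each atom from its mean position across the models.

    `models` is a list of models, each a list of (x, y, z) tuples.
    Assumes the models are already superimposed on each other.
    """
    n_models = len(models)
    scatter = []
    for atom_positions in zip(*models):
        mean = tuple(sum(p[k] for p in atom_positions) / n_models for k in range(3))
        mean_sq = sum(math.dist(p, mean) ** 2 for p in atom_positions) / n_models
        scatter.append(math.sqrt(mean_sq))
    return scatter

# Hypothetical ensemble: atom 0 is identical in every model, atom 1 is not.
models = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)],
]
scatter = per_atom_scatter(models)
print(scatter)
```

Atoms with small scatter are the well defined parts of the structure; large values flag regions the distance data did not pin down.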

