SEMIHOM (algorithm of genetic semihomology)




Genetic semihomology algorithm
A new approach to the comparative analysis of protein sequences
From an evolutionary point of view, it is accepted that the genetic code and the amino acid 'language' evolved simultaneously and act in strict coherence with each other. Therefore, in the analysis of protein differentiation and variability, both levels should be considered simultaneously.


The tools currently used for comparative sequence analysis apply a Markovian model of amino acid replacement and are based on stochastic matrices of observed amino acid substitution frequencies.


BLOSUM62 matrix of amino acid replacements


BLASTP 2.2.2 [Dec-14-2001]
BLAST protein search - output


Genetic semihomology algorithm
The aim in developing the new algorithm is to overcome the basic disadvantages of existing protein sequence analysis tools and to eliminate some basic errors in the assumptions of the existing statistical methods. The algorithm should be able to explain the mechanism and pathway of protein evolution and differentiation, rather than only describe the initial and final steps of the observed changes.


Genetic semihomology algorithm
Another goal is to make the algorithm applicable to any group of proteins, whatever their nature, function and location. This is achievable for two reasons:
1) the basic assumptions are kept to a minimum: the general amino acid-codon translation table, and single point mutation as the principal, most common mechanism of protein variability;
2) the approach is non-statistical (no stochastic matrices).


Genetic semihomology algorithm
The algorithm of genetic semihomology assumes a close relation between the compared amino acids and their codons in related proteins. It is based on the network of genetic relationships between amino acids. This assumption makes identical residues at different positions of a sequence unequal with respect to their changeability.
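A minimal sketch of why identical residues can differ in changeability, assuming the standard genetic code. The helper name `neighbours` and the compact table encoding are illustrative assumptions, not part of the original software.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def neighbours(codon):
    """Amino acids reachable from `codon` by one nucleotide change.

    Hypothetical helper: synonymous changes and stop codons are left out.
    """
    own = CODON_TABLE[codon]
    reachable = set()
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            if aa not in (own, "*"):
                reachable.add(aa)
    return reachable


# Serine encoded by TCT and serine encoded by AGT have different
# one-step neighbours, so the two serines are not equally changeable.
print(sorted(neighbours("TCT")))
print(sorted(neighbours("AGT")))
```

The two printed sets differ. This is the sense in which the same residue at two positions may be unequal once its codon is taken into account.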


Genetic semihomology algorithm
The algorithm assumes that the basic mechanism of differentiation among related proteins is single point mutation. The general part of the algorithm is a three-dimensional diagram reflecting the network of genetic relationships between amino acids.
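The network of genetic relationships can be approximated as a plain graph in which two amino acids are linked when some pair of their codons differs by one nucleotide. This is a sketch under that assumption only. It does not reproduce the three-dimensional diagram, and the names `build_network` and `relation`, as well as the label "no single-step path", are hypothetical.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def build_network():
    """Map each amino acid to the amino acids one point mutation away."""
    network = {aa: set() for aa in set(AMINO) if aa != "*"}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                mutant = codon[:pos] + base + codon[pos + 1:]
                other = CODON_TABLE[mutant]
                # Skip synonymous changes (including the unchanged codon) and stops.
                if other not in (aa, "*"):
                    network[aa].add(other)
    return network


NETWORK = build_network()


def relation(a, b):
    """Classify a residue pair using the single-point-mutation network."""
    if a == b:
        return "identical"
    return "semihomologous" if b in NETWORK[a] else "no single-step path"


print(relation("D", "E"))
print(relation("W", "F"))
print(sorted(NETWORK["M"]))
```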


Semihomology algorithm
Input requirements
The minimum data required to start an analysis with the algorithm of genetic semihomology are protein sequences (at least two). The more sequences are used for the analysis and for multiple alignment construction, the more concise and accurate the results. The nucleotide sequences of the genes are not strictly required, but it is very helpful if they are known for at least some of the analyzed proteins, since this significantly increases the amount of information obtainable from the analysis. The results are best for sequences showing a sufficiently high degree of diversity.


Comparison of the fragments of 1st and 2nd domain of chicken ovomucoid using unitary matrix, GCM, PAM250 and algorithm of genetic semihomology


Semihomology algorithm
Advantages of the approach
The results obtained with this method are more comprehensive than those of the methods currently in use and reflect the actual mechanism of protein differentiation and evolution. They concern:
1) location of homologous and semihomologous sites in compared proteins,
2) precise estimation of gap location in non-identical fragments of different length,
3) analysis of internal homology and semihomology,
4) precise location of domains in multidomain proteins,
5) estimation of the genetic code of non-homologous fragments,


Semihomology algorithm
Advantages of the approach
6) construction of genetic probes,
7) studies on differentiation processes among related proteins,
8) estimation of the degree of relationship among related proteins,
9) studies on the evolution mechanism within homologous protein families,
10) confirmation of the actual relationship between sequences revealing a low degree of identity/similarity.


Semihomology algorithm
Advantages of the approach
Application of the semihomology approach has led to the discovery and description of some important mechanisms affecting protein evolution and differentiation. The most important are:
1) the mechanism and role of cryptic mutations at unusually variable positions (Leluk, 2000b-c),
2) the phenomenon of very long distance (dispersed) mutational correlation within sets of variable positions (Leluk, 2000a; Leluk and Grabiec, 2001; Leluk et al., 2001b; Leluk et al., 2002).


Semihomology algorithm
Limitations of the approach
The semihomology approach is limited when:
- too few sequences are taken for analysis,
- the degree of identity among the sequences is too high (too little diversity),
- long fragments of the compared sequences show no identity at all (e.g. N-terminal signal fragments of homologous proteins).


The analysis of genetic semihomology rules out the applicability of the Markov model to studies of protein variability at the amino acid level. Amino acid codons do contain information about the "ancestral" amino acids whose codons were the starting point for the codon of the current residue. This applies mainly to positions undergoing single-point mutations, the most basic mechanism of evolutionary variability.
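One way to picture the information carried by codons is to ask which codons of a residue are compatible, by a single point mutation, with the other amino acids observed at the same aligned position. The sketch below assumes the standard genetic code. The function `candidate_codons` is a hypothetical illustration, not the procedure used by the author's software.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def one_step(c1, c2):
    """True when two codons differ at exactly one nucleotide position."""
    return sum(x != y for x, y in zip(c1, c2)) == 1


def codons_of(aa):
    """All codons that encode the amino acid `aa`."""
    return [c for c, a in CODON_TABLE.items() if a == aa]


def candidate_codons(residue, observed):
    """Codons of `residue` that reach every amino acid in `observed`
    by a single point mutation (hypothetical helper)."""
    return sorted(
        codon
        for codon in codons_of(residue)
        if all(
            any(one_step(codon, other_codon) for other_codon in codons_of(other))
            for other in observed
        )
    )


# Serine aligned with Asn and Thr is only compatible with its AGT/AGC
# codons, while serine aligned with Pro points to the TCN codons.
print(candidate_codons("S", "NT"))
print(candidate_codons("S", "P"))
```

In this toy setting the amino acids seen at a position narrow down the likely codon of a residue even when no nucleotide sequence is available.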


GEISHA
Software based on the genetic semihomology algorithm
- Protein sequence similarity and homology analysis
- Multiple alignment construction on the basis of genetic relationships between amino acids
- Analysis of variability within homologous protein families
- Molecular phylogenetic studies


FQS Semihomology tool based on genetic semihomology algorithm


The semihomologous correlation between amino acids occurring at selected non-homologous positions of inhibitors from squash seeds. The solid lines indicate the transition type of semihomology; the dashed lines indicate the transversion type.
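The transition and transversion types can be told apart by the nucleotide change that links two codons: a transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T), and a transversion swaps one class for the other. The sketch below assumes the standard genetic code, and `semihomology_types` is a hypothetical helper name.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
PURINES = {"A", "G"}


def substitution_type(c1, c2):
    """Classify the single nucleotide difference between two codons."""
    diffs = [(x, y) for x, y in zip(c1, c2) if x != y]
    if len(diffs) != 1:
        raise ValueError("codons must differ at exactly one position")
    x, y = diffs[0]
    same_class = (x in PURINES) == (y in PURINES)
    return "transition" if same_class else "transversion"


def semihomology_types(aa1, aa2):
    """Set of substitution types over all one-step codon pairs of two residues."""
    types = set()
    for c1, r1 in CODON_TABLE.items():
        for c2, r2 in CODON_TABLE.items():
            if r1 == aa1 and r2 == aa2:
                if sum(x != y for x, y in zip(c1, c2)) == 1:
                    types.add(substitution_type(c1, c2))
    return types


print(semihomology_types("D", "N"))  # linked by G<->A changes
print(semihomology_types("D", "E"))  # linked by T/C<->A/G changes
```

An empty set means that no single point mutation connects the two residues.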


Application of the genetic semihomology algorithm to identify fragments that may have arisen by a mechanism other than single point mutation


Dot matrix pairwise alignment
Noise reduction
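The slide does not describe how SEMIHOM filters its dot matrices. The sketch below shows one common way to reduce noise, used here purely as an assumption: a dot is kept only when it belongs to a diagonal run of at least `min_run` consecutive matches. Matches are plain identities for brevity, and all function names are hypothetical.

```python
def dot_matrix(a, b):
    """Boolean matrix with True wherever residues of a and b are identical."""
    return [[x == y for y in b] for x in a]


def reduce_noise(matrix, min_run=3):
    """Keep only dots lying on diagonal runs of length >= min_run."""
    rows = len(matrix)
    cols = len(matrix[0]) if rows else 0
    kept = [[False] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if not matrix[i][j]:
                continue
            # Only start counting at the first dot of a diagonal run.
            if i > 0 and j > 0 and matrix[i - 1][j - 1]:
                continue
            run = 0
            while i + run < rows and j + run < cols and matrix[i + run][j + run]:
                run += 1
            if run >= min_run:
                for k in range(run):
                    kept[i + k][j + k] = True
    return kept


def render(matrix):
    """Text rendering: '*' for a dot, '.' for an empty cell."""
    return "\n".join("".join("*" if cell else "." for cell in row) for row in matrix)


raw = dot_matrix("ACDEFG", "XCDEYG")
print(render(raw))
print()
print(render(reduce_noise(raw, min_run=3)))
```

In this toy input the isolated G/G match disappears after filtering while the three-residue CDE diagonal survives.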


Dot matrix pairwise alignment
Internal homology (gene multiplication)
Chicken ovoinhibitor precursor (7 domains)
Chicken ovomucoid precursor (3 domains)
BLAST 2 SEQUENCES
SEMIHOM


Dot matrix comparison of selected homologous Kazal inhibitors
ovoinhibitor-ovoinhibitor, ovoinhibitor-ovomucoid, ovoinhibitor-PSTI
BLASTP 2.0.9 (Blosum62)
SEMIHOM (algorithm of genetic semihomology)


Consecutive steps of dot matrix results obtained from comparison of chicken ovoinhibitor with itself by program SEMIHOM


Precise gap location in non-identical fragments of different length by using the genetic semihomology algorithm. The compared proteins are trypsin inhibitors from squash seeds: CPGTI-I (horizontal) and CSTI-IIb (vertical). CPGTI-I has a non-homologous His25, while CSTI-IIb has two non-homologous residues, Asp25 and Ile26, at the corresponding site.

identities only
? RVCPKILMECKKDSDCLAECICLEH-GYCG
  MVCPKILMKCKHDSDCLLDCVCLEDIGYCGVS
? RVCPKILMECKKDSDCLAECICLE-HGYCG
  MVCPKILMKCKHDSDCLLDCVCLEDIGYCGVS

RVCPKILMECKKDSDCLAECICLEH-GYCG
MVCPKILMKCKHDSDCLLDCVCLEDIGYCGVS
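The choice between the two gap positions can be illustrated by counting, for each candidate alignment, the identical columns and the columns whose residues are linked by a single point mutation. This is a sketch assuming the standard genetic code. The scoring function is a hypothetical simplification, not the SEMIHOM procedure.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def semihomologous(a, b):
    """True when some codon of a and some codon of b differ by one nucleotide."""
    return any(
        sum(x != y for x, y in zip(c1, c2)) == 1
        for c1, r1 in CODON_TABLE.items() if r1 == a
        for c2, r2 in CODON_TABLE.items() if r2 == b
    )


def score(top, bottom):
    """Return (identical columns, semihomologous columns); gap columns are skipped."""
    identical = semi = 0
    for x, y in zip(top, bottom):
        if "-" in (x, y):
            continue
        if x == y:
            identical += 1
        elif semihomologous(x, y):
            semi += 1
    return identical, semi


CSTI_IIB = "MVCPKILMKCKHDSDCLLDCVCLEDIGYCGVS"
GAP_AFTER_HIS = "RVCPKILMECKKDSDCLAECICLEH-GYCG"   # His25 aligned with Asp25
GAP_BEFORE_HIS = "RVCPKILMECKKDSDCLAECICLE-HGYCG"  # His25 aligned with Ile26

print(score(GAP_AFTER_HIS, CSTI_IIB))
print(score(GAP_BEFORE_HIS, CSTI_IIB))
```

Both placements give the same number of identities, which is why identities alone cannot decide. His and Asp are linked by one nucleotide change (CAT/GAT), whereas His and Ile are not, so the first placement gains one semihomologous column.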


Multiple alignment


Last Updated: 8th March 2018
