This module introduces the student to PSI-BLAST, a bioinformatics tool offered through NCBI for identifying weak but biologically relevant sequence similarities.

PSI-BLAST (1) (Position-Specific Iterated BLAST) is a tool that constructs a position-specific scoring matrix from a multiple alignment of the top-scoring BLAST hits to a given query sequence. This scoring matrix serves as a profile that captures the key positions of conserved amino acids within a motif. When a profile is used to search a database, it can often detect subtle relationships between proteins that are distant structural or functional homologues. These relationships are often missed by a standard BLAST search that uses a single query sequence.
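To make the idea of a position-specific scoring matrix more concrete, here is a minimal Python sketch. It is not the PSI-BLAST algorithm: it assumes a uniform background frequency for all 20 amino acids and a flat pseudocount of 1, and the names `build_pssm`, `score`, and `toy_alignment` are hypothetical, chosen only for this illustration. Each column of a small ungapped alignment becomes a table of log-odds scores, so residues seen often at a position score high there and unseen residues score low.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Return one dict per alignment column mapping residue -> log2-odds score.

    Assumptions for illustration only: uniform background frequencies and a
    flat pseudocount. PSI-BLAST uses more elaborate weighting.
    """
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, sequence):
    """Sum the position-specific scores of an ungapped sequence."""
    return sum(column[aa] for column, aa in zip(pssm, sequence))


# Toy four-column alignment used only to exercise the functions.
toy_alignment = ["DKDG", "DADG", "DRDQ", "DIDG"]
pssm = build_pssm(toy_alignment)

print(round(score(pssm, "DKDG"), 2))  # a sequence that matches the alignment
print(round(score(pssm, "AAAA"), 2))  # a sequence that does not
```

A sequence that resembles the aligned sequences receives a higher total score than an unrelated one, which is the property a profile search relies on.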

For an oversimplified example of what a consensus sequence, or profile, looks like, consider that the EF-hand binding loop of the calmodulin family could be represented as follows:

Loop Position #  1       3   4   5   6       8               12
Profile          D   x   D   G   D/N G   x   I   x   x   x   E

Here "x" stands for positions where there is variability in amino acid type, and therefore, that position is not heavily weighted in the alignment. Comparing the profile to some actual binding loop sequences from different calmodulins is the best way to illustrate the derivation of this profile.

POSITION #    1     3  4  5  6     8           12
CALM_HUMAN_1  D  K  D  G  D  G  T  I  T  T  K  E
CALF_NAEGR_1  D  K  D  G  D  G  T  I  T  T  S  E
CALM_SCHPO_1  D  R  D  Q  D  G  N  I  T  S  N  E
CALM_HUMAN_2  D  A  D  G  N  G  T  I  D  F  P  E
CALF_NAEGR_2  D  A  D  G  N  G  T  I  D  F  T  E
CALM_SCHPO_2  D  A  D  G  N  G  T  I  D  F  T  E
CALM_HUMAN_3  D  K  D  G  N  G  Y  I  S  A  A  E
CALF_NAEGR_3  D  K  D  G  N  G  F  I  S  A  Q  E
CALM_SCHPO_3  D  K  D  G  N  G  Y  I  T  V  E  E
CALM_HUMAN_4  D  I  D  G  D  G  Q  V  N  Y  E  E
CALF_NAEGR_4  D  I  D  G  D  N  Q  I  N  Y  T  E
CALM_SCHPO_4  D  T  D  G  D  G  V  I  N  Y  E  E

The rules for deriving this simple profile are: 1) any position with 90% or greater amino acid identity is considered conserved in the profile, so a higher score is given when the conserved amino acid is found at that position in a sequence; and 2) any position that always contains one of only two types of amino acids is up-weighted, so a higher score is given whenever either of those two amino acids appears at that position. A program such as PSI-BLAST employs more sophisticated rules than these to create a profile. Even with these few sequences, it is easy to see that amino acid similarity, in addition to amino acid identity, could be taken into account and exploited in the profile.
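The two rules above can be sketched in a few lines of Python, applied to the twelve binding loop sequences listed earlier. This is only an illustration of the toy rules, not of how PSI-BLAST builds a profile. The function name `derive_profile` and its `identity_threshold` parameter are hypothetical, and the sketch assumes that rule 1 is checked before rule 2, which is what the profile shown earlier implies for positions 4, 6, and 8.

```python
from collections import Counter

# The twelve EF-hand binding loops from the alignment, one string per loop.
LOOPS = {
    "CALM_HUMAN_1": "DKDGDGTITTKE",
    "CALF_NAEGR_1": "DKDGDGTITTSE",
    "CALM_SCHPO_1": "DRDQDGNITSNE",
    "CALM_HUMAN_2": "DADGNGTIDFPE",
    "CALF_NAEGR_2": "DADGNGTIDFTE",
    "CALM_SCHPO_2": "DADGNGTIDFTE",
    "CALM_HUMAN_3": "DKDGNGYISAAE",
    "CALF_NAEGR_3": "DKDGNGFISAQE",
    "CALM_SCHPO_3": "DKDGNGYITVEE",
    "CALM_HUMAN_4": "DIDGDGQVNYEE",
    "CALF_NAEGR_4": "DIDGDNQINYTE",
    "CALM_SCHPO_4": "DTDGDGVINYEE",
}


def derive_profile(sequences, identity_threshold=0.9):
    """Apply the two toy rules column by column to equal-length sequences."""
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    profile = []
    for column in zip(*sequences):
        counts = Counter(column)
        residue, count = counts.most_common(1)[0]
        if count / len(column) >= identity_threshold:
            # Rule 1: at least 90% identity, so the position is conserved.
            profile.append(residue)
        elif len(counts) == 2:
            # Rule 2: only two residue types ever occur at this position.
            profile.append("/".join(sorted(counts)))
        else:
            # Variable position, not heavily weighted.
            profile.append("x")
    return profile


profile = derive_profile(list(LOOPS.values()))
print(" ".join(profile))  # D x D G D/N G x I x x x E
```

Positions 4, 6, and 8 each contain a single exception among the twelve loops (11 of 12 identical, about 92%), so they pass the 90% threshold and appear in the profile as a single conserved residue.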

Three categories of homology are commonly studied in biological molecules: sequence homology, structural homology, and functional homology. Sequence homology is the easiest to identify and is therefore the primary target of many bioinformatics methods. It has direct implications for the relatedness of proteins and their potential pathways of derivation. However, to understand how a protein is implicated in a certain disease state, or to design a pharmaceutical that interacts with a given protein, functional and/or structural information is necessary. Functional homologues are relatively easy to define: they are any two proteins, or protein domains, that perform similar functions. Structural homologues contain similar "folds", which are localized regions of a molecule that comprise a structural feature such as a "beta barrel" or "four helical bundle" motif. The fold can encompass the entire protein or just one domain of the protein. A good introduction to the topic of protein folds can be found at the website for the Internet Course on The Principles of Protein Structure organized by Birkbeck College (2).

When considering sequence, functional, or structural homology, it is important to understand that one type of homology between proteins does not always imply another. Nevertheless, it is a reasonable assumption that proteins related through evolutionary pathways are likely to have some degree of all three types of homology. PSI-BLAST was engineered to identify distant relationships between sequences that are too subtle to discover with a regular BLAST search.

Source:  OpenStax, Bios 533 bioinformatics. OpenStax CNX. Sep 24, 2008 Download for free at http://cnx.org/content/col10152/1.16