Sequence Alignment

Millions of protein and nucleotide sequences are known. They fall into many groups of related sequences known as protein families or gene families. Relationships between sequences are usually discovered by aligning them and assigning the alignment a score. There are two main types of sequence alignment: pairwise alignment compares two sequences at a time, while multiple sequence alignment compares many sequences at once. Two important algorithms for aligning pairs of sequences are the Needleman-Wunsch algorithm, which produces global alignments, and the Smith-Waterman algorithm, which produces local alignments. Popular tools for sequence alignment include:

  • Pairwise alignment - BLAST
  • Multiple alignment - ClustalW, PROBCONS, MUSCLE, MAFFT, and T-Coffee.
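To make the idea of scoring an alignment concrete, here is a minimal sketch of the Needleman-Wunsch dynamic program in Python. It computes only the optimal global alignment score, not the alignment itself. The function name and the scoring values (match +1, mismatch -1, linear gap penalty -1) are illustrative assumptions; real tools typically use substitution matrices and affine gap penalties.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    The scoring scheme is an assumption chosen for illustration only.
    """
    # prev[j] is the best score for aligning the processed prefix of a
    # with b[:j]; initially that prefix is empty, so b[:j] is all gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Aligning a[:i] against an empty prefix of b costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Align a[i-1] with b[j-1] (match or mismatch).
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Align a[i-1] with a gap in b.
            up = prev[j] + gap
            # Align b[j-1] with a gap in a.
            left = curr[j - 1] + gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("GATTACA", "GATTACA"))
    print(needleman_wunsch_score("AC", "AGC"))
```

Only two rows of the scoring table are kept in memory, which is enough when the score alone is needed; recovering the aligned sequences would require storing the full table and tracing back through it.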

A common use for pairwise sequence alignment is to take a sequence of interest and compare it to all known sequences in a database in order to identify homologous sequences. In general, the matches are ordered so that the most closely related sequences appear first, followed by sequences of diminishing similarity. Matches are usually reported with a measure of statistical significance such as an expectation value (E-value).
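The database search described above can be sketched as a toy example: score the query against every database entry with a Smith-Waterman local alignment and sort the hits by decreasing score. The function names, the scoring values, and the three-entry "database" are hypothetical and chosen only for illustration. This is not how BLAST works internally (BLAST uses a faster heuristic), and the sketch ranks by raw score rather than computing an expectation value.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    The scoring scheme is an assumption chosen for illustration only.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Unlike global alignment, a cell may reset to 0, which lets
            # a local alignment start anywhere in either sequence.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def rank_hits(query, database):
    """Return (name, score) pairs sorted from most to least similar."""
    hits = [(name, smith_waterman_score(query, seq))
            for name, seq in database.items()]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy database; the sequences are made up.
    toy_database = {
        "unrelated": "CCCCCCC",
        "partial": "CCGATTCC",
        "exact": "TTGATTACATT",
    }
    for name, score in rank_hits("GATTACA", toy_database):
        print(name, score)
```

A real search tool would go one step further and convert each score into a significance estimate, so that hits can be judged against what would be expected by chance in a database of that size.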
