
Multiple Sequence Alignment

Antibody Sequence Analysis
2025-08-11

1 Introduction

Multiple Sequence Alignment aligns DNA or protein sequences and visualizes the resulting alignment. It supports sequence clustering, analysis of sequence diversity, and identification of conserved regions and mutations. Two automatic aligners are available, ClustalW [2] and MUSCLE [1]. MUSCLE additionally offers a choice of clustering methods: NJ (Neighbor Joining [3]), UPGMA (Unweighted Pair Group Method with Arithmetic Mean [4]), and UPGMB (Unweighted Pair Group Method with Banded Mean).

2 Parameters

  • Sequences: Input protein or DNA sequences.
  • Aligners:
    • ClustalW: A widely used tool for multiple sequence alignment, combining memory-efficient dynamic programming algorithms with progressive alignment strategies, aiming to provide accurate, robust, and user-friendly sequence alignment results.
    • MUSCLE: A tool for generating multiple sequence alignments of amino acid and nucleotide sequences, with NJ, UPGMA, and UPGMB available as clustering methods. In the benchmarks reported in the MUSCLE paper [1], it was more accurate than ClustalW, comparable to T-Coffee and MAFFT, and the fastest of the tested aligners on large sequence sets.

Figure 1. Alignment speeds of various aligners from the MUSCLE paper.


  • Cluster Method:
    • NJ: Neighbor Joining is a distance-based clustering method. At each step it finds the pair of operational taxonomic units (OTUs) whose joining minimizes the total branch length of the tree, i.e., the "neighbors," and merges them. NJ is favored for its fast computation and high accuracy, especially for small evolutionary distances and short sequences, but it can be sensitive to outliers.
    • UPGMA: Unweighted Pair Group Method with Arithmetic Mean is a hierarchical clustering method. It builds the tree step by step by repeatedly merging the two closest clusters, using the average pairwise distance between their members as the distance between clusters. UPGMA is algorithmically simple and the fastest of these methods.
    • UPGMB: Unweighted Pair Group Method with Banded Mean is a variant of UPGMA offered by MUSCLE. Instead of the plain arithmetic mean, MUSCLE's implementation computes the distance to a newly merged cluster as a weighted mixture of the minimum and the average of the two child distances. This can change the resulting tree relative to UPGMA and may make it more robust in certain situations.
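
The difference between the NJ and UPGMA merge rules can be sketched in a few lines of Python. This is a minimal illustration, not the code that MUSCLE runs: the function names nj_pick_pair and upgma, the taxon labels a to e, and the toy distance values are all assumptions made for this example. UPGMA merges the pair with the smallest distance, while NJ picks the pair that minimizes Q(i, j) = (n - 2) d(i, j) - sum_k d(i, k) - sum_k d(j, k).

```python
from itertools import combinations

# Toy symmetric distances between five hypothetical taxa (illustrative values).
PAIRS = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
names = ["a", "b", "c", "d", "e"]
dist = {n: {} for n in names}
for (x, y), value in PAIRS.items():
    dist[x][y] = dist[y][x] = float(value)


def nj_pick_pair(names, dist):
    """Return the pair of taxa that minimizes the neighbor-joining Q criterion."""
    n = len(names)
    totals = {a: sum(dist[a][b] for b in names if b != a) for a in names}

    def q(pair):
        a, b = pair
        return (n - 2) * dist[a][b] - totals[a] - totals[b]

    return min(combinations(names, 2), key=q)


def upgma(names, dist):
    """Cluster with UPGMA and return the tree topology as nested tuples."""
    d = {a: dict(dist[a]) for a in names}   # working copy of the distances
    size = {a: 1 for a in names}            # number of leaves in each cluster
    active = list(names)
    while len(active) > 1:
        # Merge the two closest clusters.
        a, b = min(combinations(active, 2), key=lambda p: d[p[0]][p[1]])
        merged = (a, b)
        d[merged] = {}
        for other in active:
            if other == a or other == b:
                continue
            # Size-weighted mean equals the average over all leaf pairs.
            avg = (size[a] * d[a][other] + size[b] * d[b][other]) / (size[a] + size[b])
            d[merged][other] = avg
            d[other][merged] = avg
        size[merged] = size[a] + size[b]
        active = [c for c in active if c != a and c != b] + [merged]
    return active[0]


nj_pair = nj_pick_pair(names, dist)
tree = upgma(names, dist)
print("NJ joins first:", nj_pair)
print("UPGMA tree:", tree)
```

On this toy matrix UPGMA starts by merging d and e, the closest pair, whereas the NJ criterion selects a and b first because it also accounts for how far each taxon is from all the others.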

3 Results Explanation

  • Chart: A visualization of the sequence alignment.
  • alignment.fasta: The aligned sequences in FASTA format.

Figure 2. Result chart of Multiple Sequence Alignment example.
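
As a sketch of how alignment.fasta might be used downstream to locate conserved regions and mutations, the Python below parses aligned FASTA text and scores each alignment column. The three short sequences, the names seq1 to seq3, and the helper names parse_fasta and column_conservation are hypothetical. In practice the text would be read from the downloaded file, and the scoring rule shown here (share of sequences carrying the most common non-gap residue) is only one simple choice.

```python
from collections import Counter

# Hypothetical aligned FASTA content; "-" marks a gap.
ALIGNED_TEXT = """\
>seq1
EVQLVESGG
>seq2
EVQLLESGG
>seq3
QVQL-ESGG
"""


def parse_fasta(text):
    """Parse FASTA text into {name: sequence}, joining wrapped sequence lines."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def column_conservation(aligned):
    """Per column: fraction of sequences sharing the most common non-gap residue."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs if s[i] != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(seqs))
    return scores


aligned = parse_fasta(ALIGNED_TEXT)
scores = column_conservation(aligned)
conserved = [i for i, score in enumerate(scores) if score == 1.0]
variable = [i for i, score in enumerate(scores) if score < 1.0]
print("fully conserved columns:", conserved)
print("variable columns:", variable)
```

Column indices are zero-based. In this toy input, columns 0 and 4 vary between sequences while the remaining columns are identical across all three.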


4 References

[1] Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004). https://doi.org/10.1186/1471-2105-5-113
[2] Thompson, J. D., Higgins, D. G., Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research 22(22), 4673–4680 (1994). https://doi.org/10.1093/nar/22.22.4673
[3] Saitou, N., Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4(4), 406–425 (1987). https://doi.org/10.1093/oxfordjournals.molbev.a040454
[4] Sokal, R. R., Michener, C. D. A statistical method for evaluating systematic relationships. University of Kansas Science Bulletin 38(1), 1409–1438 (1958).