

Similarity AbScan
1 Introduction
Similarity AbScan is a specialized tool for similarity searches over antibody sequences from patents and scientific literature. It features a comprehensive database sourced from authoritative websites, divided into two sub-databases: Antibody and Nanobody.
The Nanobody sub-database contains nearly 5,000 VHH, VNAR, and single-domain antibody sequences derived from patents and academic papers. The Antibody sub-database includes over 670,000 heavy and light chain antibody sequences, as well as 170,000 pairs of antibody sequences matched through patents and academic papers. Users can search for similar antibody sequences or structures by entering a sequence, or search for related antibodies by keyword.
Among the paired antibody sequences, there are:
- Patented Antibodies ("patent" appears in the heavy chain definition) with 134,183 entries, accounting for 75.3%.
- Crystal Structures (Xtal structure) with 12,778 entries, accounting for 7.1%.
- Therapeutic Antibodies (TheraSAbDab) with 1,198 entries, accounting for 0.6%.
- Scientific Literature (Other) with 28,735 entries, accounting for 16.9%.

Figure 1. Distribution map of paired antibody sequences from different sources.

Figure 2. Distribution map of paired antibody sequences from different species.
2 Parameters
- Similar AbScan DB Name: Choose the Antibody or Nanobody sub-database.
- Search by structure:
  - Heavy Chain Sequence: Antibody heavy chain Fv region sequence.
  - Light Chain Sequence: Antibody light chain Fv region sequence.
  - RMSD cutoff (Å): Structural RMSD cutoff, in Å.
- Search by sequence:
  - Fv Heavy sequence: Antibody heavy chain Fv region sequence.
  - Fv Light sequence: Antibody light chain Fv region sequence.
  - Average Identity: Cutoff (%) for the average sequence identity of the heavy and light chains.
  - Heavy Identity Cutoff (%): Heavy chain sequence identity cutoff (%).
  - Light Identity Cutoff (%): Light chain sequence identity cutoff (%).
  - Regions: Antibody region to search: whole, cdrs, or cdr3.
- Search by keyword:
  - Keyword: Regular expression matched against the title of the source literature, for example 'RSV|Respiratory Syncytial Virus'.
  - Pair: Whether to search paired or unpaired sequences.
  - Max Sequences Return: Maximum number of sequences to return.
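To illustrate how a regular expression such as the Keyword example behaves, here is a minimal Python sketch. It is not the tool's implementation: the function name `title_matches`, the sample titles, and the choice of case-insensitive matching are all assumptions made for illustration.

```python
import re

# Example pattern taken from the Keyword parameter description.
PATTERN = r"RSV|Respiratory Syncytial Virus"


def title_matches(pattern: str, title: str) -> bool:
    """Return True if the regular expression matches anywhere in the title.

    Case-insensitive matching is an assumption made for this sketch;
    the tool's actual matching rules are not documented here.
    """
    return re.search(pattern, title, flags=re.IGNORECASE) is not None


# Hypothetical titles, invented only for illustration.
titles = [
    "Antibodies against RSV F protein",
    "Neutralizing antibodies to respiratory syncytial virus",
    "Anti-PD-1 antibodies and uses thereof",
]

# Keep only the titles that match either alternative of the pattern.
matched = [t for t in titles if title_matches(PATTERN, t)]
```

The `|` operator lets one query cover both an abbreviation and the full name, which is why the example pattern lists the two forms side by side.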
3 Results Explanation
Table: The retrieval results table can be downloaded and includes the following key fields:
| Fields | Description |
|---|---|
| ID | GenBank LOCUS. |
| sequence | Fv region of the GenBank sequence. |
| organism | GenBank ORGANISM. |
| definition | GenBank DEFINITION. |
| reference_authors | GenBank REFERENCE AUTHORS. |
| reference_title | GenBank REFERENCE TITLE. |
| update_date | GenBank update date. |
| cdr_lengths | CDR sequence lengths. |
| pairing | Pairing method. |
| targets_mentioned | Targets. |
| url | Link to the pairing basis. |
| chain | Chain type. |
| division | GenBank division, such as PAT. |
| GeneBank_accession-version | GenBank ACCESSION and VERSION. |
| other-seqids | Other sequence IDs. |
| model | GenBank LOCUS of the structure. |
| identity | Sequence identity with the query sequence. |
| rmsd | Structural RMSD relative to the query sequence. |
| cdr1/2/3 | CDR1/2/3 sequences. |
| cdr1/2/3 mismatch | CDR1/2/3 mismatches with the query sequence. |
| cdrs mismatch | Mismatches across all CDRs with the query sequence. |
| total_mismatch | Total mismatches with the query sequence. |
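As a rough illustration of working with a downloaded results table, the sketch below filters rows by the `identity` and `total_mismatch` fields. Several things here are assumptions: that the download is a CSV file, that `identity` is a percentage, and that `total_mismatch` is an integer count. The rows and the function name `filter_hits` are invented for the example.

```python
import csv
import io

# Hypothetical export: column names follow the fields table,
# but the CSV format and every value here are invented.
SAMPLE_CSV = """ID,chain,identity,total_mismatch
EXAMPLE_1,H,98.3,2
EXAMPLE_2,L,91.0,10
EXAMPLE_3,H,99.1,1
"""


def filter_hits(csv_text: str, min_identity: float, max_mismatch: int) -> list:
    """Keep rows meeting both thresholds, sorted by descending identity.

    Assumes `identity` is a percentage and `total_mismatch` is an
    integer count; neither is confirmed by the documentation.
    """
    rows = csv.DictReader(io.StringIO(csv_text))
    kept = [
        row
        for row in rows
        if float(row["identity"]) >= min_identity
        and int(row["total_mismatch"]) <= max_mismatch
    ]
    return sorted(kept, key=lambda r: float(r["identity"]), reverse=True)


hits = filter_hits(SAMPLE_CSV, min_identity=95.0, max_mismatch=5)
hit_ids = [row["ID"] for row in hits]
```

For a real export, replace `SAMPLE_CSV` with the contents of the downloaded file and adjust the parsing if the values are stored differently.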
Note:
- Structure search in the Antibody sub-database supports antibody pairs only: both a heavy and a light chain sequence must be entered, and the results contain only antibody pairs (Pair).
- Sequence search in the Antibody sub-database accepts antibody pairs, heavy chains, or light chains, and the results can include antibody pairs (Pair), heavy chains (H), and light chains (L).
- Keyword search in the Antibody sub-database covers antibody pairs, heavy chains, and light chains, and the results include either antibody pairs (Pair) or heavy chains (H) and light chains (L).
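The input requirements in the first two notes can be summarized as a small check. This is only a sketch of the stated rules: the mode labels `"structure"` and `"sequence"` and the function name `query_is_allowed` are hypothetical and are not identifiers used by the tool.

```python
def query_is_allowed(mode: str, heavy: str = "", light: str = "") -> bool:
    """Check which chain inputs a search mode accepts, per the notes.

    The mode names "structure" and "sequence" are hypothetical labels
    chosen for this sketch.
    """
    has_heavy = bool(heavy.strip())
    has_light = bool(light.strip())
    if mode == "structure":
        # Structure search needs a complete heavy/light pair.
        return has_heavy and has_light
    if mode == "sequence":
        # Sequence search accepts a pair, a heavy chain, or a light chain.
        return has_heavy or has_light
    raise ValueError(f"unknown mode: {mode!r}")
```

For instance, `query_is_allowed("structure", heavy="...")` with no light chain returns `False`, while the same input is accepted in `"sequence"` mode.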
4 Reference
[1] Abanades B, Olsen TH, Raybould MIJ, et al. The Patent and Literature Antibody Database (PLAbDab): an evolving reference set of functionally diverse, literature-annotated antibody sequences and structures. Nucleic Acids Res. 2024;52(D1):D545-D551. https://doi.org/10.1093/nar/gkad1056

