
Tool: Blast Protein

Blast Protein runs a protein sequence similarity search using a BLAST web service hosted by the UCSF Resource for Biocomputing, Visualization, and Informatics (RBVI). One use is to search with a target sequence of unknown structure to find templates for comparative modeling.

The related tool Foldseek (Similar Structures) can also search with BLAST and other methods, but only using a structure chain as the query; it facilitates exploring large sets of similar structures by efficiently showing them in 3D as backbone traces and in 2D as sequence alignment schematics or scatter plots based on conformation. See also: AlphaFold, ESMFold, Matchmaker, Task Manager

The Blast Protein tool can be started from the Sequence section of the Tools menu, or by using the Sequence Viewer context menu. It can be manipulated like other panels. It is also implemented as the blastprotein command. See also: alphafold search, esmfold search

Search Parameters

Clicking Apply (or OK, which also dismisses the dialog) runs the search, whereas Close dismisses the dialog without starting a search. Help opens this page in the Help Viewer, and Reset restores the parameters to factory default settings.

BLAST Protein Results

When results are returned, the table of hits is shown in a separate window. These results are saved in ChimeraX sessions. If you wish to prevent the results from docking into the main window (which may resize it), see Tool windows start undocked in the Window preferences.

Checkboxes in the bottom section of the panel control which columns of information are shown in the table of hits, with buttons:

List only best-matching chain per PDB entry – searches of the PDB usually give multiple hits per PDB entry (to multiple chains in that entry, typically redundant because the structure contains multiple copies of the same protein). This option allows collapsing the results list to show only a single hit per PDB entry, the one that best matches the query according to its BLAST score. If multiple chains from the same PDB entry have identical scores, the first in the list is retained.
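
The collapsing rule described above can be sketched in a few lines of Python. This is only an illustration of the stated logic (highest score wins, ties keep the first hit in the list), not the ChimeraX implementation. The dictionary keys "name" and "score" and the "1abc_A" style of hit name (PDB ID, underscore, chain ID) are assumptions made for the example.

```python
def best_chain_per_entry(hits):
    """Keep one hit per PDB entry: the one with the highest score.

    `hits` is a list of dicts in results-list order. The keys "name"
    and "score" and the "<pdb_id>_<chain>" naming are hypothetical.
    """
    best = {}
    for hit in hits:
        entry = hit["name"].split("_")[0].upper()
        # Strict ">" means an equal score never replaces an earlier hit,
        # so the first chain in the list is retained on ties.
        if entry not in best or hit["score"] > best[entry]["score"]:
            best[entry] = hit
    return list(best.values())


hits = [
    {"name": "1abc_A", "score": 200.0},
    {"name": "1abc_B", "score": 200.0},  # tie with chain A: dropped
    {"name": "2xyz_C", "score": 150.0},
    {"name": "2xyz_D", "score": 180.0},  # better than chain C: kept
]
collapsed = best_chain_per_entry(hits)
```

With the sample data, `collapsed` holds 1abc_A and 2xyz_D, in order of each entry's first appearance in the list.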

Clicking a column header sorts by the values in that column.

Double-clicking a row with a corresponding structure fetches it, and if a structure chain was used as the query, automatically superimposes the hit onto the query using matchmaker. If the query was sequence-only (not a structure chain), the first structure opened from the results will serve as the reference for superimposing the others. AlphaFold-predicted structures are colored by confidence on a scale of 0-100, and ESMFold-predicted structures are colored by confidence on a scale of 0-1.

One or more hits can be chosen (highlighted) in the list by clicking and dragging with the left mouse button; Ctrl-click (or command-click if using a Mac) toggles whether a row is chosen. The result panel's context menu or the buttons across the bottom of the results dialog can be used to:

Regardless of which hits are chosen and which columns are shown, clicking Save Results as TSV brings up a file browser to save the entire set of results as a tab-separated values file (*.tsv).
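
A saved TSV file can be processed outside ChimeraX with standard tools. The sketch below uses only Python's csv module to read tab-separated text and keep rows at or below an E-value cutoff. The column names "name", "evalue", and "score" and the sample rows are made up for illustration; check the header line of an actual saved file for the real column names.

```python
import csv
import io


def load_hits(tsv_text):
    """Parse tab-separated text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)


def filter_by_evalue(rows, column, cutoff):
    """Return rows whose value in `column` parses as a number <= cutoff.

    Rows with a missing or non-numeric value are skipped.
    """
    kept = []
    for row in rows:
        try:
            value = float(row[column])
        except (KeyError, TypeError, ValueError):
            continue
        if value <= cutoff:
            kept.append(row)
    return kept


# Hypothetical file contents; a real file would be read with open(path).read().
SAMPLE = (
    "name\tevalue\tscore\n"
    "1abc_A\t1e-50\t210\n"
    "2xyz_B\t0.5\t35\n"
    "3def_A\t\t90\n"
)

rows = load_hits(SAMPLE)
strong = filter_by_evalue(rows, "evalue", 1e-10)
```

Since the file contains the entire set of results, filtering like this does not depend on which columns were shown in the table when it was saved.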

Some columns of data are available no matter which database is searched:

Additional columns for AlphaFold entries:

Additional columns for PDB entries (from searching PDB or other databases that include it, such as NR):

Additional columns for UniRef entries:

Notes

Pseudo-multiple alignment. The pseudo-multiple alignment from BLAST is not a true multiple alignment, but a consolidation of the pairwise alignments of individual hits to the query. This output corresponds to the BLAST formatting option (alignment view) “flat query-anchored with letters for identities.”
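
To illustrate what "consolidation of pairwise alignments" means, the following Python sketch merges several query-hit pairwise alignments into one query-anchored set of rows. It is a simplified illustration, not the BLAST code: the input format (0-based query start plus two gapped strings per hit) is an assumption, and insertions from different hits at the same query position are simply left-justified in shared columns rather than aligned to each other, which is why the result is not a true multiple alignment.

```python
def merge_query_anchored(query, pairwise):
    """Merge pairwise alignments into query-anchored rows.

    query:    ungapped query sequence
    pairwise: list of (q_start, q_aln, h_aln), where q_start is the 0-based
              query index of the first aligned query residue and q_aln/h_aln
              are equal-length gapped strings ("-" marks a gap).
    Returns a list of rows: the gapped query first, then one row per hit.
    """
    n = len(query)
    max_ins = [0] * (n + 1)  # widest insertion before each query position
    parsed = []
    for q_start, q_aln, h_aln in pairwise:
        at = {}   # query index -> hit character aligned to it
        ins = {}  # query index -> hit residues inserted before it
        pos = q_start
        for qc, hc in zip(q_aln, h_aln):
            if qc == "-":
                ins[pos] = ins.get(pos, "") + hc
            else:
                at[pos] = hc
                pos += 1
        for k, residues in ins.items():
            max_ins[k] = max(max_ins[k], len(residues))
        parsed.append((at, ins))

    def build(at, ins):
        out = []
        for k in range(n + 1):
            # Pad each insertion slot to the widest insertion seen there.
            out.append(ins.get(k, "").ljust(max_ins[k], "-"))
            if k < n:
                out.append(at.get(k, "-"))
        return "".join(out)

    query_row = build(dict(enumerate(query)), {})
    return [query_row] + [build(at, ins) for at, ins in parsed]


rows = merge_query_anchored(
    "ACDEF",
    [
        (0, "ACDEF", "ACDQF"),  # full-length hit with one substitution
        (1, "CD-EF", "CDWEF"),  # starts at query position 1, one insertion
    ],
)
```

Here `rows` is ["ACD-EF", "ACD-QF", "-CDWEF"]: the insertion in the second hit opens a gap column in the query and in the first hit, and the unaligned first query position shows as a gap in the second hit.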

Basic Local Alignment Search Tool (BLAST). The BLAST software is provided by the NCBI and described in:

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Nucleic Acids Res. 1997 Sep 1;25(17):3389-402.

Basic local alignment search tool. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. J Mol Biol. 1990 Oct 5;215(3):403-10.

UCSF Resource for Biocomputing, Visualization, and Informatics / November 2024