
Search Statistics in BLAST® Search Summary

This page shows an example of the statistical details appended to a CAS Registry BLAST® result (in this case, a BLASTn result) and explains the meaning of the key statistics.

Example BLAST Search Summary:

BLAST® Search Summary
Lambda      K        H
    1.37    0.711     1.31 
Effective search space used: 24925054310455
Number of letters in database: 66,115,079,151
Number of sequences in database: 58,534,391
Matrix: blastn matrix 1 -3
Gap Penalties: Existence: 5, Extension: 2
Report Section Explanation

Search Statistics

Lambda

A statistical parameter used in calculating BLAST scores that can be thought of as a natural scale for the scoring system. The value of lambda is used in converting a raw score (S) to a bit score (S').

K
A statistical parameter used in calculating BLAST scores that can be thought of as a natural scale for search space size. The value K is used in converting a raw score (S) to a bit score (S').
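
As a rough illustration of how Lambda and K enter the conversion, the sketch below applies the standard Karlin-Altschul relation S' = (lambda * S - ln K) / ln 2 to the values from the example summary. The function name and the raw score of 20 are illustrative assumptions, not part of the CAS Registry BLAST output.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score S to a bit score S'.

    Uses S' = (lambda * S - ln K) / ln 2.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


# Lambda and K are taken from the example summary; the raw score is hypothetical.
example_bits = bit_score(raw_score=20, lam=1.37, k=0.711)
print(f"{example_bits:.1f} bits")
```

Because the bit score folds Lambda and K into the result, bit scores from searches that use different scoring systems can be compared directly, which raw scores cannot.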

H
The relative entropy of the target and background residue frequencies (Karlin and Altschul, 1990). H can be thought of as a measure of the average information (in bits) available per position that distinguishes an alignment from chance. At high values of H, short alignments can be distinguished from chance, whereas at lower H values, a longer alignment may be necessary (Altschul, 1991).
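
The link between the scoring system, Lambda, and H can be sketched numerically. The example below assumes, purely for illustration, equal background frequencies (0.25) for the four bases and ungapped scoring of 1 for a match and -3 for a mismatch. It solves sum(p_i * p_j * exp(lambda * s_ij)) = 1 for Lambda by bisection, then evaluates H = lambda * sum(q_ij * s_ij) using natural logarithms (dividing by ln 2 would express the result in bits). The function names are hypothetical.

```python
import math


def solve_lambda(match=1, mismatch=-3, p=0.25, lo=0.1, hi=5.0, tol=1e-10):
    """Find the positive root of sum p_i p_j exp(lambda * s_ij) = 1 by bisection.

    Assumes uniform base frequencies p and that the root lies in [lo, hi],
    which holds for the default 1/-3 scoring.
    """
    p_match = 4 * p * p          # chance that two random bases are identical
    p_mismatch = 1.0 - p_match   # chance that they differ

    def f(lam):
        return (p_match * math.exp(lam * match)
                + p_mismatch * math.exp(lam * mismatch) - 1.0)

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def relative_entropy(lam, match=1, mismatch=-3, p=0.25):
    """Relative entropy H (natural-log units) of target vs. background frequencies."""
    p_match = 4 * p * p
    p_mismatch = 1.0 - p_match
    q_match = p_match * math.exp(lam * match)           # target frequency of matches
    q_mismatch = p_mismatch * math.exp(lam * mismatch)  # target frequency of mismatches
    return lam * (q_match * match + q_mismatch * mismatch)


lam = solve_lambda()
h = relative_entropy(lam)
print(round(lam, 2), round(h, 2))
```

Under these assumptions the computed values round to the Lambda and H figures shown in the example summary (1.37 and 1.31).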

Effective search space used

 

Number of letters in database

 

Number of sequences in database

Values for the various steps in this BLAST search leading to the identification of HSPs.
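
One common use of the effective search space, sketched here as an assumption rather than as a description of CAS Registry BLAST internals, is to estimate how many alignments with at least a given bit score would be expected by chance: E = N * 2^(-S'), where N is the effective search space and S' is the bit score. The function name and the bit score of 40 are illustrative.

```python
def expected_chance_hits(bit_score, effective_search_space):
    """Expected number of chance alignments scoring at least bit_score.

    Uses E = N * 2**(-S'), with N the effective search space.
    """
    return effective_search_space * 2.0 ** (-bit_score)


# Search space from the example summary; the bit score of 40 is hypothetical.
e_value = expected_chance_hits(40.0, 24925054310455)
print(f"{e_value:.2f}")
```

Each additional bit of score halves the expected number of chance hits, so a larger search space requires a higher bit score for the same level of significance.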

Matrix

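blastn matrix 1 -3: the nucleotide scoring scheme used for this search, with a reward of 1 for each matching base and a penalty of -3 for each mismatch.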
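
A minimal sketch of how such a scoring scheme produces a raw score, assuming the matrix line in the example summary (blastn matrix 1 -3) denotes a reward of 1 per match and a penalty of -3 per mismatch. The function name and sequences are hypothetical.

```python
def ungapped_raw_score(seq_a, seq_b, match=1, mismatch=-3):
    """Raw score of two equal-length sequences aligned without gaps."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length for an ungapped alignment")
    return sum(match if a == b else mismatch for a, b in zip(seq_a, seq_b))


# Seven matches and one mismatch: 7 * 1 + 1 * (-3) = 4
print(ungapped_raw_score("ACGTACGT", "ACGTACGA"))
```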

Gap Penalties

See Gap cost.
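
As a sketch of how the two gap values combine, the example below assumes the affine convention commonly used by NCBI BLAST, in which a gap of length k costs existence + k * extension. The defaults are the values from the example summary; the function name is hypothetical.

```python
def gap_cost(length, existence=5, extension=2):
    """Total penalty for a single gap of the given length (affine model)."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return existence + extension * length


# A gap of three positions with Existence 5 and Extension 2 costs 5 + 3 * 2 = 11.
print(gap_cost(3))
```

Under this model, one long gap is cheaper than several short gaps of the same total length, because the existence cost is paid only once.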

T

Neighborhood word score threshold. This generates all words of length W that yield a score of at least T when aligned with some word of length W from the query sequence.

BLASTn: The value of T in a BLASTn result will be zero because BLASTn allows only identical matches for the initial word.
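
The role of T can be illustrated with a toy example. The sketch below enumerates every word of length W over a DNA alphabet and keeps those scoring at least T against a query word. It assumes a simple 1/-3 match/mismatch scoring purely for illustration; in practice neighborhood words matter for protein searches with a substitution matrix, and the function name is hypothetical.

```python
from itertools import product


def neighborhood_words(query_word, threshold, alphabet="ACGT", match=1, mismatch=-3):
    """Return all words of len(query_word) scoring >= threshold against query_word."""
    neighbors = []
    for letters in product(alphabet, repeat=len(query_word)):
        score = sum(match if a == b else mismatch
                    for a, b in zip(letters, query_word))
        if score >= threshold:
            neighbors.append("".join(letters))
    return neighbors


# With T equal to the self-score, only the identical word qualifies.
print(neighborhood_words("ACG", threshold=3))
# Lowering T admits words with one mismatch as well.
print(len(neighborhood_words("ACG", threshold=-1)))
```

Raising T shrinks the neighborhood and speeds up the search at the cost of sensitivity; lowering T does the opposite.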

A

A "distance" within which two initial word hits (HSPs) must occur.

BLASTn: Because BLASTn does not look for two words (HSPs) initially, this parameter is not used and the value is indicated as zero.

X1 . . . n:

Extending an alignment can cause the score to drop before additional matches raise it again and compensate for the drop. The X value sets how far the score may fall below the best score found so far before the extension stops.

Example: The three Xs in the gapped BLAST program indicate:

  1. Dropoff for the ungapped extension (the HSP).
  2. Dropoff parameter used in the first gapped extension.
  3. Dropoff parameter used in the second (final) gapped extension.
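
The dropoff idea can be sketched for the simplest case, an ungapped extension to the right of a seed. This is a simplified illustration, not the BLAST implementation: it assumes 1/-3 match/mismatch scoring, and the function name, sequences, and X value are hypothetical.

```python
def xdrop_extend_right(query, subject, q_start, s_start, x_drop, match=1, mismatch=-3):
    """Extend an ungapped alignment to the right until the score falls
    more than x_drop below the best score seen so far.

    Returns (best_score, best_length), where best_length is the number of
    aligned positions at which the best score was reached.
    """
    score = 0
    best_score = 0
    best_length = 0
    offset = 0
    while q_start + offset < len(query) and s_start + offset < len(subject):
        same = query[q_start + offset] == subject[s_start + offset]
        score += match if same else mismatch
        offset += 1
        if score > best_score:
            best_score = score
            best_length = offset
        elif best_score - score > x_drop:
            break  # the score dropped too far; stop exploring
    return best_score, best_length


# One mismatch is tolerated with X = 5, so the extension continues past it.
print(xdrop_extend_right("AAAACAAAAA", "AAAAGAAAAA", 0, 0, x_drop=5))
# With X = 2 the same mismatch ends the extension.
print(xdrop_extend_right("AAAACAAAAA", "AAAAGAAAAA", 0, 0, x_drop=2))
```

A larger X lets the extension explore further through low-scoring regions, which can recover longer alignments at the cost of more computation.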

S1 and S2

Cutoff scores for the maximal-scoring segment pair (MSP) and multiple HSPs, respectively. Pairwise comparisons scoring below these cutoff scores are ignored.

Cutoff scores: Cutoff scores are calculated by BLAST algorithms and cannot be changed.

_____

GenBank® is a registered trademark of the U.S. Department of Health and Human Services for the Genetic Sequence Data Bank.

BLAST® is a registered trademark of the National Library of Medicine.

BLAST® reference information provided in whole or in part from the National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health.

Unless designated otherwise, all other information Copyright © 1997-2014 by the American Chemical Society. All rights reserved.