Before submitting a request, remove any numerical digits in the query sequence or replace them with the appropriate letter codes (e.g., N for an unknown nucleotide residue or X for an unknown amino acid residue).
Supported nucleotide codes are:
Nucleotide Code | Base
A | Adenosine
C | Cytidine
G | Guanosine
T | Thymidine
U | Uridine
R | G or A (purine)
Y | T or C (pyrimidine)
K | G or T (keto)
M | A or C (amino)
S | G or C (strong)
W | A or T (weak)
B | G or T or C
D | G or A or T
H | A or C or T
V | G or C or A
N | A or G or C or T (any)
- | Gap of indeterminate length
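The nucleotide table above can be written as a lookup for checking a query before it is submitted. The Python sketch below is illustrative only: `NUCLEOTIDE_CODES` and `invalid_nucleotide_positions` are hypothetical names, not part of BLAST or of any service API, and treating lowercase letters as equivalent to uppercase is an assumption.

```python
# Hypothetical lookup: each supported code mapped to the bases it can represent,
# transcribed from the table above.
NUCLEOTIDE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "GA", "Y": "TC", "K": "GT", "M": "AC",
    "S": "GC", "W": "AT",
    "B": "GTC", "D": "GAT", "H": "ACT", "V": "GCA",
    "N": "AGCT",
    "-": "",  # gap of indeterminate length
}


def invalid_nucleotide_positions(sequence):
    """Return (index, character) pairs for characters that are not in the table.

    Assumption: case is ignored, so 'a' is treated like 'A'.
    """
    return [
        (index, char)
        for index, char in enumerate(sequence)
        if char.upper() not in NUCLEOTIDE_CODES
    ]
```

For example, `invalid_nucleotide_positions("ACGTN7R")` returns `[(5, "7")]`, flagging the digit that should be removed or replaced with a code such as N.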
For programs that use protein query sequences (BLASTp and tBLASTn), the accepted amino acid codes are:
Amino Acid Code | Three Letter Code | Amino Acid Name
A | Ala | Alanine
B | Asx | Aspartate or Asparagine
C | Cys | Cysteine
D | Asp | Aspartate
E | Glu | Glutamate
F | Phe | Phenylalanine
G | Gly | Glycine
H | His | Histidine
I | Ile | Isoleucine
J | Xle | Leucine or Isoleucine
K | Lys | Lysine
L | Leu | Leucine
M | Met | Methionine
N | Asn | Asparagine
O | Pyl | Pyrrolysine
P | Pro | Proline
Q | Gln | Glutamine
R | Arg | Arginine
S | Ser | Serine
T | Thr | Threonine
U | Sec | Selenocysteine
V | Val | Valine
W | Trp | Tryptophan
X | Xxx | Any (uncommon or unspecified)
Y | Tyr | Tyrosine
Z | Glx | Glutamate or Glutamine
* | | Translation stop
- | | Gap of indeterminate length
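Because the amino acid table above assigns a meaning to every letter from A to Z, checking a protein query reduces to finding characters that are neither letters nor `*` or `-`. The Python sketch below shows one way to do that and to report which ambiguity codes (B, J, X, Z) a query uses. The names `PROTEIN_CODES`, `AMBIGUOUS_CODES`, and `check_protein_query` are hypothetical, and case-insensitive handling is an assumption.

```python
import string

# All 26 letters appear in the table, plus "*" (translation stop) and "-" (gap).
PROTEIN_CODES = set(string.ascii_uppercase) | {"*", "-"}

# Codes from the table that stand for more than one possible residue.
AMBIGUOUS_CODES = {"B", "J", "X", "Z"}


def check_protein_query(sequence):
    """Return (invalid, ambiguous) as sorted lists of distinct characters.

    invalid:   characters that are not in the table (digits, spaces, punctuation)
    ambiguous: ambiguity codes that occur in the sequence
    Assumption: case is ignored.
    """
    upper = sequence.upper()
    invalid = sorted({char for char in upper if char not in PROTEIN_CODES})
    ambiguous = sorted({char for char in upper if char in AMBIGUOUS_CODES})
    return invalid, ambiguous
```

For example, `check_protein_query("MKBX1*")` returns `(["1"], ["B", "X"])`: the digit is not an accepted code, and B and X are accepted but ambiguous.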
Bare sequence (plain text) input can be lines of sequence data only, without a FASTA definition line.
Example:
QIKDLLVSSSTDLDTTLVLVNAIYFKGMWKTAFNAEDTREMPFHVTKQESKPVQMMCMNNSFNVATLPAE
KMKILELPFASGDLSMLVLLPDEVSDLERIEKTINFEKLTEWTNPNTMEKRRVKVYLPQMKIEEKYNLTS
VLMALGMTDLFIPSANLTGISSAESLKISQAVHGAFMELSEDGIEMAGSTGVIEDIKHSPESEQFRADHP
FLFLIKHNPTNTIVYFGRYWSP
Bare sequence input can also be interspersed with numbers and/or spaces, such as the sequence portion of a GenBank®/GenPept flatfile report:
Example:
  1 qikdllvsss tdldttlvlv naiyfkgmwk tafnaedtre mpfhvtkqes kpvqmmcmnn
 61 sfnvatlpae kmkilelpfa sgdlsmlvll pdevsdleri ektinfeklt ewtnpntmek
121 rrvkvylpqm kieekynlts vlmalgmtdl fipsanltgi ssaeslkisq avhgafmels
181 edgiemagst gviedikhsp eseqfradhp flflikhnpt ntivyfgryw sp
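The two examples above contain the same residues: the flatfile form only adds position numbers, spaces, and lowercase letters. A minimal Python sketch of that relationship follows. The function name `to_bare_sequence` is hypothetical, and converting to uppercase is a presentation choice made here to match the first example, not a stated requirement.

```python
import re


def to_bare_sequence(flatfile_sequence):
    """Strip position numbers and all whitespace from a flatfile-style sequence.

    Uppercasing is a choice made for this sketch, not a documented requirement.
    """
    return re.sub(r"[\d\s]+", "", flatfile_sequence).upper()
```

For example, `to_bare_sequence("  1 qikdllvsss tdldttlvlv")` returns `"QIKDLLVSSSTDLDTTLVLV"`, which matches the start of the bare sequence example.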
_____
See also
Similarity Search Input Formats
GenBank® is a registered trademark of the U.S. Department of Health and Human Services for the Genetic Sequence Data Bank.
BLAST® is a registered trademark of the National Library of Medicine.
BLAST® reference information provided in whole or in part from the National Center for Biotechnology Information (NCBI), National Library of Medicine, National Institutes of Health.
Unless designated otherwise, all other information Copyright © 1997-2017 by the American Chemical Society. All rights reserved.