Search Tip: Knowing whether to use U or T when searching RNA sequences on STN
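Introduction

When you are designing a search query for sequence information on STN, it is important to consider how specific databases index RNA sequences. Extensive sequence information is available in several STN databases, including the four covered in this article: DGENE, CAS REGISTRY, USGENE, and PCTGEN.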
These databases derive their sequence content from multiple sources, including journals, patents, and GenBank. RNA sequences in these databases are indexed with U or T, depending on the source of the data. For example, the National Center for Biotechnology Information (NCBI) has a policy of converting uracil (U) residues to thymine (T) for RNA sequences in GenBank. Other sources of sequence data, however, index RNA sequences with U. This article provides descriptions and examples of how you can effectively search for RNA sequences in STN databases.

DGENE (GENESEQ)

DGENE contains nucleotide and peptide sequences from basic patent documents of 41 patent-issuing authorities. RNA sequences in DGENE are reported with U residues. Therefore, you should use U, not T, in your RNA search query sequences.

=> FIL DGENE
L6 22 GGGAAUACCA/SQSN
=> D SQIDE
L6 ANSWER 1 OF 22 DGENE COPYRIGHT 2008 THOMSON REUTERS on STN

CAS REGISTRY

RNA sequence records in REGISTRY can include U or T, depending on their original source. RNA sequences from GenBank contain T, but those from non-GenBank sources contain U. REGISTRY compensates for this difference automatically: a subsequence query containing U allows for ambiguous matches on either T or U. In the following example, two records are retrieved using U in the Subsequence search. The first contains an RNA sequence indexed with U that is found in a PCT application.

=> FIL REGISTRY

The second record contains an RNA sequence indexed with T instead of U. The original source of this sequence is GenBank.

L1 ANSWER 2 OF 2 REGISTRY COPYRIGHT 2008 ACS on STN

USGENE (USPTO Genetic Sequence Database)

USGENE includes sequence records from the USPTO and GenBank. RNA sequences from the USPTO are indexed with U. RNA sequences from GenBank are indexed with T. Unlike REGISTRY, searches in USGENE for sequences containing U do not automatically result in ambiguous matches on T or U. Searching with U only retrieves sequences containing U.
Searching with T only retrieves sequences containing T. For comprehensive results, it is therefore necessary to conduct two searches: one with T and a second with U. The following example shows a search in USGENE for an RNA sequence containing U. The retrieved record is from the USPTO and is indexed with U.

=> FIL USGENE

The next example shows a search for the same RNA sequence with T instead of U. The retrieved record (indexed with T) is from GenBank (NCBI) with an earlier application date than the USPTO record shown above.

=> RUN GETSEQ GGGAATACCA/SQSN

PCTGEN (World Patent Application Biosequences)

PCTGEN covers nucleotide and amino acid sequence information as submitted by the patent applicant to the World Intellectual Property Organization (WIPO). Therefore, RNA sequences in PCTGEN records contain either U or T. Searching with U only retrieves sequences indexed with U. Searching with T only retrieves sequences indexed with T. For comprehensive results, it is necessary to conduct two searches, with U and with T, as you would in USGENE.

This example shows an RNA search in which the query sequence contains U residues.

=> FIL PCTGEN
=> RUN GETSEQ GGGAAUACCA/SQSN

The following example shows a search for the same RNA sequence used in the example above, but U has been replaced with T. The records in the answer set are indexed with T.

=> RUN GETSEQ GGGAATACCA/SQSN

When the retrieved records are limited to those containing RNA sequences (RNA in the Molecule Type field (/MTY)), three records are found. These records would have been missed if the second search (with T) had not been conducted.

=> S L2 AND RNA/MTY

Conclusion

As more patents for DNA-related inventions are issued, comprehensive searches for sequence patentability become increasingly important. Freedom-to-operate, prior art, validity, and infringement patent sequence searches can be conducted in STN databases that contain extensive sequence and patent information.
Keeping in mind the differences between these databases can help you improve your RNA sequence search strategies and retrieve more comprehensive results.

Overview of RNA sequence indexing.
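
The indexing rules described in this article can be summarized in a short Python sketch. It is only an illustration for planning queries offline: the table `RNA_INDEXING` and the function `rna_query_variants` are hypothetical names, not part of STN, and the behavior labels simply restate what the sections above say about each database. The sketch returns the sequence strings you would need to search so that no RNA records are missed because of U versus T indexing.

```python
# Hypothetical planning helper; it does not connect to STN or run any search.
# Behavior labels restate the article's description of each database:
#   "u_only"         : RNA records are reported with U, so query with U.
#   "u_matches_both" : a U in the query matches either T or U in a record.
#   "exact"          : U matches only U and T matches only T.
RNA_INDEXING = {
    "DGENE": "u_only",
    "REGISTRY": "u_matches_both",
    "USGENE": "exact",
    "PCTGEN": "exact",
}


def rna_query_variants(sequence, database):
    """Return the query sequences needed for comprehensive RNA retrieval."""
    seq = sequence.upper()
    with_u = seq.replace("T", "U")  # RNA-style spelling
    with_t = seq.replace("U", "T")  # GenBank-style spelling
    behavior = RNA_INDEXING[database.upper()]
    # Databases with exact matching need both spellings, unless the
    # sequence has no U/T residue and the two spellings are identical.
    if behavior == "exact" and with_u != with_t:
        return [with_u, with_t]
    return [with_u]


if __name__ == "__main__":
    for db in RNA_INDEXING:
        print(db, rna_query_variants("GGGAAUACCA", db))
```

For USGENE and PCTGEN the function returns both the U and the T spelling, which corresponds to the two separate searches recommended above. For DGENE and REGISTRY a single query written with U is returned.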
Additional Resources

For additional information about the databases mentioned above, refer to the STN Database Summary Sheets. For more information about sequence searching on STN, visit the Searching for Sequences documentation within STN Support & Training.

Updated: 10/9/2008 10:13:07 AM


