Database Tables

To refer to sequences in the databases, use the names listed in the Nucleic
Acid Databases and Protein Databases tables. Notice that in many cases there
is more than one name to refer to a database; use whichever you are most
comfortable with.

NOTE: Because the databases change frequently, these tables may not be
up-to-date.

NOTE: Upcoming versions of the data release will not contain separate PIR
database sequences.

GenEMBLPlus combines GenBank and EMBL into one database for comprehensive
nucleic acid database searching. Due to the large duplication between GenBank
and EMBL, GCG has eliminated EMBL sequences sharing the same primary accession
number as sequences in GenBank. Therefore EMBL is an abridged database. RefSeq
and GenBank share a genomic database, and these sequences are not processed in
RefSeq.

-----------------------------------------------------------------------------
Nucleic Acid Databases Names and Descriptions
-----------------------------------------------------------------------------
Description                   GenEMBL             GenBank         EMBL
                              (GenBank + EMBL)                    (Abridged)
-----------------------------------------------------------------------------
Entire sequence               GenEMBLPlus:*       GenBankPlus:*   EMBLPlus:*
database                      GEP:*               GBP:*           EMP:*

All database divisions        GenEMBL:*           GenBank:*       EMBL:*
except EST and GSS            GE:*                GB:*            EM:*
sequences

Only EST and GSS              Tags:*              GB_Tags:*       EM_Tags:*
sequences

Bacterial sequences           Bacterial:*         GB_Ba:*         EM_Ba:*
                              Ba:*

Expressed sequence tag        EST:*               GB_EST:*        EM_EST:*
(EST) sequences

Genome survey sequences       GSS:*               GB_GSS:*        EM_GSS:*
(GSS)

High throughput genome        HTG:*               GB_HTG:*        --

Invertebrate sequences        Invertebrate:*      GB_In:*         EM_In:*
                              In:*

Organelle sequences           Organelle:*         --              EM_Or:*
                              Or:*

Non-rodent, non-primate       Other_Mammalian:*   GB_Om:*         EM_Om:*
mammalian sequences           Om:*

Non-mammalian                 Other_Vertebrate:*  GB_Ov:*         EM_Ov:*
vertebrate sequences          Ov:*

Sequences from patents        Patent:*            GB_Pat:*        EM_Pat:*
and patent applications

Phage sequences               Phage:*             GB_Ph:*         EM_Ph:*
                              Ph:*

Plant and Fungal              Plant:*             GB_Pl:*         EM_Pl:*
sequences                     Pl:*                                EM_Fun:*

Primate sequences             Primate:*           GB_Pr:*         EM_Hum:*
                              Pr:*

Rodent sequences              Rodent:*            GB_Ro:*         EM_Ro:*
                              Ro:*

Sequence-tagged site          STS:*               GB_STS:*        EM_STS:*
(STS) sequences

Synthetic sequences           Synthetic:*         GB_Sy:*         EM_Sy:*
(plasmids, vectors)           Sy:*

Unannotated sequences         Unannotated:*       GB_Un:*         EM_Un:*
                              Un:*

Viral sequences               Viral:*             GB_Vi:*         EM_Vi:*
                              Vi:*

-----------------------------------------------------------------------------
Protein Databases Names and Descriptions
-----------------------------------------------------------------------------
Description                   PIR-Protein   SWISS-PROT    GenPept       REFSEQ
-----------------------------------------------------------------------------
GenBank translated proteins   --            --            GenPept:*     --
                                                          GP:*

RefSeq protein products       --            --            --            RS_PROT:*
-----------------------------------------------------------------------------

-----------------------------------------------------------------------------
RNA Databases Names and Descriptions
-----------------------------------------------------------------------------
Description                   REFSEQ
-----------------------------------------------------------------------------
RefSeq RNA                    RS_RNA:*
-----------------------------------------------------------------------------

Examples of Specifying Sequences
-----------------------------------------------------------------------------
To do this:                   Type something      Description:
                              like this:
-----------------------------------------------------------------------------
Specify an entire database:   GEP:*               Specifies all of GenBank and
                                                  EMBL, including EST and GSS
                                                  sequences

                              GE:*                Specifies all of GenBank and
                                                  EMBL, excluding EST and GSS
                                                  sequences

                              GB:*                Specifies all of GenBank

                              SWP:*               Specifies all of SWISS-PROT

Specify a division            Ba:*                Specifies all Bacterial
of a database:                                    sequences in GenEMBL

                              Tags:*              Specifies all EST and GSS
                                                  sequences in GenEMBLPlus

                              EST:*               Specifies all EST sequences
                                                  in GenEMBLPlus

Specify sequence(s)           Ba:Ecoompa          Specifies a single sequence
by name:                                          from the GenEMBL Bacterial
                                                  database division

                              Ba:Eco*             Specifies all sequence
                                                  names beginning with "eco"
                                                  from the GenEMBLPlus
                                                  database

                              EM_Fun:Agasa        Specifies a single sequence
                                                  from the EMBL Fungal
                                                  database division

                              GEP:Ecoompa         Specifies a single sequence
                                                  from the GenEMBLPlus
                                                  database

                              SW:Aap1_Yeast       Specifies a single sequence
                                                  from the SWISS-PROT
                                                  database

Specify a single sequence     GE:J01654           Specifies a single sequence
by accession number:                              from the GenEMBL database

                              STS:X76318          Specifies a single sequence
                                                  from the STS database
                                                  division

Specify the sequence by name  NM_117872           Specifies a single sequence
                                                  from the RNA divisions
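
The specifications in the examples above all follow the pattern "database name, colon, entry name or wildcard". The Python sketch below illustrates how such a string could be split and matched. It is not part of the package: the names SeqSpec, parse_spec, and matches are hypothetical, and the case-insensitive matching is an assumption drawn only from the Ba:Eco* example, which is described as matching names beginning with "eco".

```python
from fnmatch import fnmatchcase
from typing import NamedTuple, Optional


class SeqSpec(NamedTuple):
    """Hypothetical container for one sequence specification."""
    database: Optional[str]  # logical database name, e.g. "GE"; None if omitted
    pattern: str             # entry name, accession number, or wildcard


def parse_spec(spec: str) -> SeqSpec:
    """Split a 'Database:Entry' string into its two parts."""
    database, sep, pattern = spec.strip().partition(":")
    if not sep:
        # No colon, as in the NM_117872 example: only an entry was given.
        if not database:
            raise ValueError("empty sequence specification")
        return SeqSpec(None, database)
    if not database or not pattern:
        raise ValueError(f"malformed sequence specification: {spec!r}")
    return SeqSpec(database, pattern)


def matches(spec: SeqSpec, entry_name: str) -> bool:
    """Wildcard match of an entry name against the specification.

    Assumption: matching ignores case. fnmatchcase also understands
    '?' and '[...]', which the tables above do not mention.
    """
    return fnmatchcase(entry_name.lower(), spec.pattern.lower())


if __name__ == "__main__":
    for text in ("GEP:*", "Ba:Eco*", "GE:J01654", "NM_117872"):
        print(text, "->", parse_spec(text))
```

For example, parse_spec("Ba:Eco*") yields the database "Ba" and the pattern "Eco*", and matches() then accepts an entry named Ecoompa but rejects one named Agasa.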