FETCH+

 

Table of Contents

FUNCTION

DESCRIPTION

EXAMPLE

INPUT FILES

COMMAND-LINE SUMMARY

LOCAL DATA FILES

PARAMETER REFERENCE


FUNCTION

Fetch+ copies GCG sequences or data files from the GCG database into your directory or displays them on your terminal screen.

 

DESCRIPTION

Advantages of Plus “+” Programs:

 

- Plus programs can read sequences in a variety of native formats, such as GCG RSF, GCG SSF, GCG MSF, GenBank, EMBL, FastA, SwissProt, PIR, and BSML, without conversion.

- Plus programs remove the sequence length restriction of 350,000 bp.

 

If you do not need these features and want more interactivity, consider running the original version of the program.

The original program corresponding to Fetch+ is Fetch, which copies GCG sequences or data files from the GCG database into your directory or displays them on your terminal screen.

The expression % fetch+ *bov* will retrieve every GCG data file or sequence entry whose name contains the string bov. Sequence specification is described in detail in Section 2, Using Sequence Files and Databases of the User's Guide.

When copying a sequence from a database, Fetch creates a file in GCG format whose name is the entry name and whose extension is the database logical name. For example, % fetch+ Genbank:Hsrep2 copies the requested sequence into a file called hsrep2.gb_pr. The filename extension is taken from the logical name for the database. In this example, the extension .gb_pr indicates that the sequence was copied from the Primate division of the GenBank nucleotide sequence database. (See "Using Database Sequences" in Section 2, Using Sequence Files and Databases of the User's Guide for a complete listing of logical names for all GCG databases.) If the file being copied is not from a sequence database, for example enzyme.dat, then its name is not changed.
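The naming rule above can be sketched in a few lines of Python. This is an illustration only, not Fetch+ code: the function name output_filename and its arguments are hypothetical, and lowercasing the result is an assumption taken from the hsrep2.gb_pr example (the program's actual casing may differ).

```python
from typing import Optional


def output_filename(name: str, logical_name: Optional[str] = None) -> str:
    """Hypothetical helper that mirrors the naming rule described above."""
    if logical_name is None:
        # Not from a sequence database (for example enzyme.dat):
        # the name is left unchanged.
        return name
    # Entry name plus the database logical name as the extension.
    # Lowercasing is an assumption based on the hsrep2.gb_pr example.
    return f"{name.lower()}.{logical_name.lower()}"


print(output_filename("Hsrep2", "gb_pr"))  # hsrep2.gb_pr
print(output_filename("enzyme.dat"))       # enzyme.dat
```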

If your sequence specification contains no logical name, Fetch looks in all the databases and in all the GCG data directories to find all possible entry names. For example, % fetch+ hum* would do almost the same thing as % fetch+ Genbank:hum*, except that if any sequences beginning with hum were present in databases other than GenBank or in any GCG data directories, they would also be retrieved.

Special Considerations for Searching

Keep in mind that filenames are case sensitive and database entry names are case insensitive. Because this program searches for both filenames and database entry names, you must take care when you enter the character pattern that makes up your specification.

For example, if you entered Gamma* as a file specification, this program would find all entries in the databases whose names begin with Gamma, but it would find no files supplied with Accelrys GCG (GCG), because all the files in GCG are named using lowercase letters. Conversely, if you entered gamma*, this program would find all of the database entries and all of the GCG-supplied files whose names begin with gamma.
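The two matching rules can be illustrated with a short Python sketch. It is not how Fetch+ is implemented. The function matching_names and the sample file and entry names are made up for illustration, and the standard fnmatch module stands in for the program's own wildcard handling.

```python
from fnmatch import fnmatchcase


def matching_names(pattern, filenames, entry_names):
    """Hypothetical illustration of the case rules described above."""
    # Filenames are compared case-sensitively.
    files = [f for f in filenames if fnmatchcase(f, pattern)]
    # Database entry names are compared case-insensitively,
    # so both sides are lowercased before matching.
    entries = [e for e in entry_names
               if fnmatchcase(e.lower(), pattern.lower())]
    return files, entries


files = ["gamma.cmp", "gammaweights.dat"]  # made-up lowercase filenames
entries = ["GAMMA_HUMAN", "Gamma2"]        # made-up database entry names

# Gamma* finds the database entries but none of the lowercase files.
print(matching_names("Gamma*", files, entries))
# gamma* finds both the files and the database entries.
print(matching_names("gamma*", files, entries))
```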

 

EXAMPLE

The session shown later in this section uses Fetch+ to retrieve a local copy of the bacterial 17 kDa surface antigen precursor protein from the UniProt database.
 
First, the following commands show how to use fetch+ to retrieve sequences from a SeqStore database.
 
NOTE: The following commands work only if you have an Accelrys SeqStore database for storing your sequence data. Source GCG with the commands below to establish a connection with the SeqStore database.
 
           Steps to connect to a SeqStore database:
 
source SeqStore/setup_seqstore.csh
setenv GCG_ORACONNECT seqstore/seqstore@sstore32
source /data2/<user>/sandbox/bio/build/debug/solaris/startup
gcg
gcgsupport
 
Now fetch+ can talk to SeqStore.
If you want to fetch sequence data from a SeqStore database installed on your network, you can retrieve the sequences using the following fetch+ commands:
 
fetch+ gcggb:AJ226132
fetch+ gcggb^N:AJ226132 (use the '^' special character to specify the type of sequence, either N or P)
fetch+ gcggb^N:AJ22613% (use the '%' special character to retrieve sequences whose accession numbers start with AJ22613; here, AJ226131 and AJ226132)
fetch+ gcggb^N:AJ22613* (use the '*' special character to retrieve sequences whose accession numbers start with AJ22613; this fetches the same sequences as the previous command)
fetch+ gcggb^N:AJ226132^1 (append '^1' to an accession number to retrieve revision 1, that is, the sequence with accession number AJ226132 that has been modified once and stored in the SeqStore database as version 1; similarly, use ^2 to retrieve the second version of the same sequence, ^3 for the third, and so on)
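The parts of these SeqStore specifications can be made explicit with a small Python sketch. The grammar here (container, optional ^N or ^P type, a colon, the accession, optional ^revision) is inferred only from the example commands above and may not cover everything fetch+ accepts. The names SeqStoreSpec and parse_spec are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class SeqStoreSpec(NamedTuple):
    container: str            # for example gcggb
    seq_type: Optional[str]   # "N" or "P", if given
    accession: str            # may end in the % or * wildcard
    revision: Optional[int]   # 1, 2, 3, ... if given


# Pattern inferred from the examples above; an assumption, not a spec.
_SPEC = re.compile(
    r"^(?P<container>[^:^]+)"     # container name
    r"(?:\^(?P<type>[NP]))?"      # optional ^N or ^P
    r":(?P<accession>[^^]+)"      # accession number or pattern
    r"(?:\^(?P<revision>\d+))?$"  # optional ^1, ^2, ...
)


def parse_spec(spec: str) -> SeqStoreSpec:
    match = _SPEC.match(spec)
    if match is None:
        raise ValueError(f"unrecognized specification: {spec!r}")
    revision = match.group("revision")
    return SeqStoreSpec(
        match.group("container"),
        match.group("type"),
        match.group("accession"),
        int(revision) if revision else None,
    )


print(parse_spec("gcggb^N:AJ226132^1"))
```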
 
 
You can also use fetch+ to query other SeqStore sequence containers, such as GCGPROT for protein sequences and GCGEST for protein (translated EST) and nucleotide (EST) source sequences.
 
fetch+
 
Fetch+ copies sequences or data files from their installed locations into your current working directory.
 
Fetch what sequence(s) ? 17kd_ricam
 
 
Created '17KD_RICAM.uniprot_sprot'
 

OUTPUT

!!AA_SEQUENCE 1.0
ID   17KD_RICAM     STANDARD;      PRT;   154 AA.
AC   P50927;
DT   01-OCT-1996 (Rel. 34, Created)
DT   01-OCT-1996 (Rel. 34, Last sequence update)
DT   10-OCT-2003 (Rel. 42, Last annotation update)
DE   17 kDa surface antigen precursor (Fragment).
GN   OMP.
OS   Rickettsia amblyommii.
OC   Bacteria; Proteobacteria; Alphaproteobacteria; Rickettsiales;
OC   Rickettsiaceae; Rickettsieae; Rickettsia.
OX   NCBI_TaxID=33989;
RN   [1]
RP   SEQUENCE FROM N.A.
RC   STRAIN=MO 85-1084;
RA   Stothard D.R., Ralph D.A., Clark J.B., Fuerst P.A., Pretzman C.;
RL   Submitted (JAN-1995) to the EMBL/Genbank/DDBJ databases.
CC   -!- SUBCELLULAR LOCATION: Attached to the outer membrane by a lipid
CC       anchor (Probable).
CC   --------------------------------------------------------------------------
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between  the Swiss Institute of Bioinformatics  and the  EMBL outstation -
CC   the European Bioinformatics Institute.  There are no  restrictions on  its
CC   use  by  non-profit  institutions as long  as its content  is  in  no  way
CC   modified and this statement is not removed.  Usage  by  and for commercial
CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC   or send an email to license@isb-sib.ch).
CC   --------------------------------------------------------------------------
DR   EMBL; U11013; AAB07704.1; -.
DR   InterPro; IPR000437; Prok_lipoprot_S.
DR   InterPro; IPR008816; Rick_17kDa_Anti.
DR   Pfam; PF05433; Rick_17kDa_Anti; 1.
DR   PROSITE; PS00013; PROKAR_LIPOPROTEIN; 1.
KW   Outer membrane; Lipoprotein; Antigen; Signal; Palmitate.
FT   SIGNAL        1     19       By similarity.
FT   CHAIN        20   >154       17 kDa surface antigen.
FT   LIPID        20     20       N-palmitoyl cysteine (Probable).
FT   LIPID        20     20       S-diacylglycerol cysteine (Probable).
FT   NON_TER     154    154
SQ   SEQUENCE   154 AA;  15879 MW;  E4FBE4C29D943581 CRC64;

 17KD_RICAM  Length: 154  December 03, 2004 16:55  Type: P  Check: 4846  ..

       1  MKLLSKIMII ALAASTLQAC NGPGGMNKQG TGTLLGGAGG ALLGSQFGKG

      51  KGQLVGVGVG ALLGAVLGGQ VGAGMDEQDR RIAELTSQKA LETAPNGSNV

     101  EWRNPDNGNY GYVTPNKTYR NSTGQYCREY TQTVVIGGKQ QKAYGNACRQ

     151  PDGQ

 

 

 

INPUT FILES

Except as noted below, Fetch+ accepts valid single file specifications. You can also specify multiple files by using a list file, for example @project.list, or by using an ambiguous specification with an asterisk (*) wildcard, for example GenBank:*. You can use Fetch+ to obtain copies of multiple sequence format (MSF) or rich sequence format (RSF) files, for example GenDocData:*.msf, but you cannot use specifications such as hsp70.msf{hsp_yeast} to extract sequences from such files into individual sequence files. (To accomplish this task, use the Reformat or SeqConv+ program.)

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -check to view the summary below and to specify parameters before the program executes. In the syntax summary below, square brackets ([ and ]) enclose parameter values that are optional. For each program parameter, square brackets enclose the type of parameter value specified, the default parameter value, and the aliases (shortened forms of the parameter name). Programs with a plus in the name accept either the full parameter name or a specified alias. If “Type” is “Boolean”, the presence of the parameter on the command line indicates a true condition. A false condition must be stated as parameter=false.

Fetch+ copies sequences or data files from their installed locations into your current working directory.
 
 
Minimal Syntax: % fetch+ [-infile=]value -Default
 
 
Minimal Parameters (case-insensitive):
 
-infile         [Type: List / Default: EMPTY / Aliases: infile1 in]
                Inputs file specification.
 
Optional Parameters (case-insensitive):
 
-check          [Type: Boolean / Default: 'false' / Aliases: che help]
                Print out this usage message.
 
-default        [Type: Boolean / Default: 'false' / Aliases: d def]
                Specifies that sensible default values be used for all parameters where possible.
 
-documentation [Type: Boolean / Default: 'true' / Aliases: doc]
                Prints banner at program startup.
 
-quiet          [Type: Boolean / Default: 'false' / Aliases: qui]
                Tells application to print only a minimal amount of information.
 
-outfile        [Type: OutFile / Default: EMPTY / Aliases: out]
                Destination to which the fetched file/sequence is written. This option only works when fetching a single file or sequence.
 
-annotate       [Type: Boolean / Default: 'true' / Aliases: annot]
                Specifies whether to include annotation in the output Sequences.
 
-outformat      [Type: String / Default: 'SSF' / Aliases: outfmt format]
                The desired output format for sequence data retrieved from flatfile databases. Value should be one of the following: GB GENPEPT FSA EMBL SPT SW RSF MSF SSF BSML.
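As a rough sketch of how the parameters in this summary combine, the Python helper below assembles a non-interactive fetch+ command line and rejects -outformat values that are not in the list above. The helper build_fetch_command is hypothetical. The -name=value spelling and the -annotate=false form are assumptions based on the syntax notes in this section, so check them against the -check output before relying on them.

```python
# Output formats listed in the summary above.
OUTFORMATS = {"GB", "GENPEPT", "FSA", "EMBL", "SPT",
              "SW", "RSF", "MSF", "SSF", "BSML"}


def build_fetch_command(infile, outformat="SSF", annotate=True, outfile=None):
    """Hypothetical helper: return a fetch+ argument list for one input."""
    if outformat not in OUTFORMATS:
        raise ValueError(f"unsupported output format: {outformat!r}")
    args = ["fetch+", f"-infile={infile}", f"-outformat={outformat}"]
    if not annotate:
        # Boolean parameters are true when present, so a false
        # condition is written out explicitly (assumed spelling).
        args.append("-annotate=false")
    if outfile is not None:
        # Per the summary, -outfile applies to a single file or sequence.
        args.append(f"-outfile={outfile}")
    args.append("-default")  # accept defaults for everything else
    return args


print(" ".join(build_fetch_command("Genbank:Hsrep2", outformat="GB")))
```

The resulting list could be passed to subprocess.run on a system where GCG is installed and initialized.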

LOCAL DATA FILES

None.

PARAMETER REFERENCE

You can set the parameters listed below from the command line. Aliases (shortened forms of the parameter name) are shown, separated by commas.

-infile, -in, -infile1

 

Input file specification.

 

-doclines=6, -docl

 

Sets an individual GCG program to copy only six non-blank lines of documentation from input data files into the output files. Use the % doclines global switch to set this value for your whole session. Usually, Fetch copies all of the documentation from each sequence entry into your new files exactly as it appeared in the original entry.

 

-check, -che, -help

 

Prints out this usage message.

 

-default, -d, -def

 

Specifies that sensible default values be used for all parameters where possible.

 

-documentation, -doc

 

Prints banner at program startup.

 

-quiet, -qui

 

This parameter is not supported.

 

-outformat

 

The desired output format for retrieved sequence data. Value should be one of the following: GB GENPEPT FSA EMBL SPT SW RSF MSF SSF BSML

 

-annotate, -annot

 

Specifies whether to include annotation in the output sequences.

 

-outfile=filename, -out

 

Copies the sequence(s) and/or data file(s) into one file which you can name. If you leave out the name of the file, Fetch prompts you for one. (GCG programs will not read files containing more than one sequence unless they are in an MSF (multiple sequence format) or RSF (rich sequence format) file.)

It is often useful to use term for the filename so that the data are displayed on your terminal screen.

 

-monitor, -mon

 

The program monitors its progress by displaying a trace on your screen. However, when you use -default to suppress all program interaction, you also suppress the monitor. You can turn it back on with this parameter. If you are running the program in batch, the monitor will appear in the log file.

 

-reference, -ref

 

Copies only the documentation for the sequence or data file. Unless specified, the name of the output file is the entry name concatenated with _ref, followed by the database logical name as the extension.

 

Printed: May 27, 2005  12:17



Technical Support: support-us@accelrys.com, support-japan@accelrys.com,
or support-eu@accelrys.com

Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.

Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

www.accelrys.com/bio