NetFetch retrieves sequences from NCBI listed in a NetBLAST output file. You can also use it to retrieve sequences individually by sequence name or accession number. The output of NetFetch is an RSF file.
NetFetch is an interface to the NetEntrez service provided by NCBI's web server at www.ncbi.nlm.nih.gov. It uses this server to perform remote retrievals. NetFetch reads the NetBLAST output file, queries the NCBI web service, and returns the sequences in an RSF output file. You can also retrieve individual sequences with NetFetch.
NetFetch can retrieve sequences only from the databases maintained at NCBI. Sometimes these databases and the databases searched with NetBLAST differ, resulting in the total or partial failure of some requests. Remote searches require almost no resources from your own computer.
Here is a session using NetFetch to retrieve sequences listed in a NetBLAST output file:
NETFETCH what NCBI sequence or NetBLAST output file ? zizm99.blastp
What should I call the RSF output file (* zizm99.rsf *) ?
NETFETCH complete with:
Below is part of the output from the example session:
NETFETCH of: zizm99.blastp
August 11, 1998
from server: www.ncbi.nlm.nih.gov
25 Sequences Requested
25 Sequences Returned
descrip ZEIN-ALPHA PRECURSOR (19 KD) (CLONE ZG99).
longname Zea mays
LOCUS 141598 235 aa
DEFINITION ZEIN-ALPHA PRECURSOR (19 KD) (CLONE ZG99).
Since NetFetch completes successfully if any of the sequences requested are returned, the output file may not contain all of the sequences that were requested.
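Because a run can succeed with only some sequences returned, it may be worth comparing the two counts reported in the output. The following is a minimal Python sketch, not part of NetFetch; it assumes the "Sequences Requested" and "Sequences Returned" lines appear exactly as in the example session above, and the function name fetch_counts is hypothetical.

```python
import re


def fetch_counts(header_text):
    """Return (requested, returned) parsed from NetFetch summary lines.

    Assumes lines of the form '25 Sequences Requested' and
    '25 Sequences Returned', as in the example output.
    """
    requested = re.search(r"(\d+)\s+Sequences Requested", header_text)
    returned = re.search(r"(\d+)\s+Sequences Returned", header_text)
    if requested is None or returned is None:
        raise ValueError("summary lines not found")
    return int(requested.group(1)), int(returned.group(1))


# Header text copied from the example session.
sample = """NETFETCH of: zizm99.blastp
from server: www.ncbi.nlm.nih.gov
25 Sequences Requested
25 Sequences Returned
"""

requested, returned = fetch_counts(sample)
missing = requested - returned
print(f"{returned} of {requested} sequences returned; {missing} missing")
```

A nonzero difference suggests that some entries listed in the NetBLAST output were not available from the server.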
NetFetch accepts a NetBLAST output file or the sequence name or accession number of a sequence. You can specify several sequences by placing a comma between sequence names or accession numbers.
NetBLAST searches for sequences similar to a query sequence. The query and the database searched can be either peptide or nucleic acid in any combination. NetBLAST can search only databases maintained at the
NetFetch+ retrieves sequences from NCBI listed in a NetBLAST+ output file. You can also use it to retrieve sequences individually by sequence name or accession number. The output of NetFetch+ is an RSF file.
NetFetch was designed specifically to search the NetEntrez server at NCBI. It is unlikely that it will work with other similar servers.
Searching remote databases opens up the possibility of unauthorized access to your query sequence. You should not use confidential query sequences for remote searches.
NetFetch does not accept a conventional GCG sequence specification for the input. The input file is the NetBLAST output file, not a GCG list file. Sequence specifications must be consistent with those allowed by the NCBI web server.
The NCBI databases searched by NetFetch may differ from the databases searched by NetBLAST, so not all sequence names listed in the NetBLAST output file can necessarily be retrieved by NetFetch. For example, when this document was written, you could search the Alu database with NetBLAST, but that database was not available to the NetEntrez server at NCBI used by NetFetch.
Network bandwidth varies greatly from time to time and from site to site. You may want to retrieve sequences when the network is more likely to be quiet. However, be aware that waiting too long to fetch sequences may result in retrieval failures because sequences are sometimes replaced or deleted from the databases.
NetFetch retrieves all of the sequences into a single RSF file. Most Accelrys GCG (GCG) programs can read individual sequences directly from the RSF file. If you want to export a single sequence into a GCG single sequence file, use the program Reformat.
There are a number of possible problems with client/server applications running over the Internet. You should determine if you are charged for network communications, and note that the security and integrity of your sequences are at risk. There is also the possibility that a server will become overloaded and that your search will take much longer than normal, or that your output will be lost altogether because of a network or server computer glitch.
All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional.
Minimal Syntax: % netfetch [-INfile1=]zizm99.blastp -Default
-OUTfile1=name.rsf specifies the output file name
-TOP=10 fetch only the top 10 sequences
-MONitor displays screen trace
-NOSUMmary suppresses the screen summary
-RAW saves the entire server response in a .raw file
-URL='"www.your.url/script="' sends HTTP query to an alternate URL
rather than NCBI's Entrez server
-PROXY=gatekeeper.org uses proxy server to send request
-TYPe=s specify the type of database to search:
s = both
n = nucleotide
p = protein
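When NetFetch is driven from a script, the command line can be assembled from the parameters summarized above. The sketch below only builds the argument list and does not run anything. The helper name netfetch_argv and its keyword arguments are hypothetical; the parameter spellings are taken from the summary, and the file names come from the example session.

```python
import shlex


def netfetch_argv(infile, outfile=None, top=None, seq_type=None, monitor=False):
    """Build a netfetch argument list from parameters in the summary above."""
    argv = ["netfetch", f"-INfile1={infile}"]
    if outfile:
        argv.append(f"-OUTfile1={outfile}")
    if top:
        # Zero or None means "request everything", so -TOP is left out.
        argv.append(f"-TOP={top}")
    if seq_type:
        if seq_type not in ("s", "n", "p"):
            raise ValueError("type must be s (both), n (nucleotide) or p (protein)")
        argv.append(f"-TYPe={seq_type}")
    if monitor:
        argv.append("-MONitor")
    # -Default suppresses interactive prompts, as in the minimal syntax.
    argv.append("-Default")
    return argv


argv = netfetch_argv("zizm99.blastp", outfile="zizm99.rsf", top=10)
print(shlex.join(argv))
```

The resulting list could be passed to subprocess.run on a system where GCG is installed; that step is omitted here because it depends on the local installation.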
You can set the parameters listed below from the command line.
Limits the retrieval to the top sequences. You specify how many sequences you want to retrieve and NetFetch will request no more than that many. It always builds the request list from the sequences at the top of the list. If you specify more sequences than are listed in the input file, all of the sequences in the file will be requested. If you specify zero or omit -TOP, all of the sequences in the input file will be requested.
Displays a screen trace of the program's progress. Messages will display indicating the connection status to NCBI, the retrieval, and parsing of the result.
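The selection rule for -TOP can be illustrated with a few lines of Python. This is only a sketch of the behavior described above, not NetFetch's own code; the function name select_top and the sample identifiers are hypothetical.

```python
def select_top(sequence_ids, top=0):
    """Return the IDs that would be requested for a given -TOP value.

    top == 0 (or -TOP omitted) requests every listed sequence; a value
    larger than the list length also requests every listed sequence.
    """
    if top < 0:
        raise ValueError("top must be zero or a positive integer")
    if top == 0:
        return list(sequence_ids)
    # Slicing past the end of a list simply returns the whole list.
    return list(sequence_ids[:top])


ids = ["id_a", "id_b", "id_c"]  # hypothetical identifiers, in file order
print(select_top(ids, 2))
print(select_top(ids, 10))
print(select_top(ids))
```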
Writes a summary of the program's work to the screen when you've used -Default to suppress all program interaction. A summary typically displays at the end of a program run interactively. You can suppress the summary for a program run interactively with -NOSUMmary.
You can also use this parameter to cause a summary of the program's work to be written in the log file of a program run in batch.
Saves the response as it comes back from NCBI in a .raw file. The file will have the same basename as the RSF file. This file will contain the entire response from NCBI including any error or informational messages.
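To locate the .raw file for a given run, the RSF file name can be converted as sketched below. This assumes that "same basename" means the RSF file name with its extension replaced by .raw; the helper name raw_file_for is hypothetical.

```python
from pathlib import Path


def raw_file_for(rsf_name):
    """Return the expected .raw file name for an RSF output file name."""
    return str(Path(rsf_name).with_suffix(".raw"))


print(raw_file_for("zizm99.rsf"))
```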
Specifies the host and port of a proxy server to use. This parameter causes the request to be sent through the proxy, which might be your company's firewall. Not all firewalls require proxy settings; therefore, you should check with your network or system administrator before using this option. The complete URL for NCBI is passed in the GET or POST request. The syntax of the proxy specification is hostname:portnumber. If the ":portnumber" is omitted, port 80 is assumed.
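The hostname:portnumber syntax and its default can be sketched in Python as follows. This illustrates the rule stated above rather than NetFetch's actual parser; the function name parse_proxy and the port 8080 are hypothetical, while gatekeeper.org is the host used in the parameter summary.

```python
def parse_proxy(spec):
    """Split a 'hostname:portnumber' proxy specification.

    Port 80 is assumed when ':portnumber' is omitted.
    """
    host, separator, port = spec.partition(":")
    if not host:
        raise ValueError("proxy specification needs a hostname")
    if not separator:
        return host, 80
    if not port.isdigit():
        raise ValueError("port number must be numeric")
    return host, int(port)


print(parse_proxy("gatekeeper.org"))
print(parse_proxy("gatekeeper.org:8080"))
```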
Specifies the host, port, and command to use when making the request. You can specify the host only, in which case the default port and command are used. You must specify the host if you need to change the port or the command. Specifying the port is never necessary.
The syntax of the command assumes that a comma-separated list of sequence IDs will be concatenated to it before submission to NCBI. For example, if you specify:
% netfetch -URL="www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid=" drome_gpdh
The actual request made to NCBI will be equivalent to making the following request from a web browser:
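As a rough sketch of the concatenation rule described above (not NetFetch's actual code), the request string is the command given with -URL followed by the comma-separated sequence IDs. The function name build_request and the second identifier, example_id, are hypothetical; the prefix is the one from the example command.

```python
def build_request(command_prefix, sequence_ids):
    """Append a comma-separated list of sequence IDs to the -URL command."""
    if not sequence_ids:
        raise ValueError("at least one sequence ID is required")
    return command_prefix + ",".join(sequence_ids)


prefix = "www.blast.ncbi.nih.gov/htbin/Entrez/query?db=s&uid="
request = build_request(prefix, ["drome_gpdh", "example_id"])
print(request)
```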
You can read the current version of the NetEntrez documentation on the World Wide Web at http://www.ncbi.nlm.nih.gov/.
Printed: May 27, 2005 13:51
Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.
Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.
All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.