Names identifies GCG data files and sequence entries by name. It can show you what set of sequences is implied by any sequence specification.
The purpose of the Names program is to show which entries and filenames correspond to a GCG sequence specification. For example, the expression % names *bov* displays every GCG data file or sequence entry whose name contains the pattern bov. Sequence specification is described in detail in Section 2, "Using Sequence Files and Databases," of the User's Guide.
Names asks you what you want to call the output file. If you press <Return> in answer to this query, Names shows the names on your terminal screen in a format similar to a directory listing. Otherwise, Names writes a list file containing the names of all of the entries corresponding to your sequence specification. The list file is suitable for input to any Accelrys GCG (GCG) program that supports indirect file specification.
For each sequence entry or file in the list file, Names adds as much of the entry's documentation as will fit between the end of the entry's name and column 132 of the output. You can also assign a value to -SHOwfiles to specify a different column where the documentation should be truncated. You may use any number between 20 and 511. Or you can use -NOSHOwfiles to suppress the documentation completely.
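The truncation rule above can be illustrated with a minimal Python sketch. The function name format_entry and the single space placed between the name and the documentation are assumptions made for illustration; the exact spacing Names uses is not described here.

```python
def format_entry(name, documentation, column=132):
    """Return one list-file line, cut off at the given column.

    column=None mimics -NOSHOwfiles: only the entry name is written.
    """
    if column is None:
        return name
    if not 20 <= column <= 511:
        raise ValueError("column must be between 20 and 511")
    # Assumed layout: entry name, one space, then the documentation.
    line = f"{name} {documentation}"
    return line[:column]


doc = "LOCUS HUMHB24 2231 bp mRNA PRI 18-APR-1991"
line = format_entry("gb_pr1:humhb24", doc, column=40)
name_only = format_entry("gb_pr1:humhb24", doc, column=None)
print(line)
print(name_only)
```

The range check mirrors the limits stated above for -SHOwfiles (20 to 511).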
If your sequence specification contains no logical name or directory specification, Names looks in all the databases and in all the GCG data directories to find all possible entry names. The expression % names hum* will find the same sequences as % names GenBank:hum* plus any sequences beginning with hum that are present in databases other than GenBank and in any GCG data directories.
Special Considerations for Searching
Keep in mind that filenames are case sensitive, while database entry names are case insensitive. Because this program searches for both filenames and database entry names, you must take care with the case of the character pattern that makes up your specification.
For example, if you entered Gamma* as a file specification, this program would find all entries in the databases whose names begin with Gamma (in any combination of upper and lower case), but it would find no GCG-supplied files, because all of the files supplied with GCG are named using lowercase letters. Conversely, if you entered gamma*, this program would find all of the entries in the databases and all of the GCG-supplied files whose names begin with gamma.
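The difference between the two matching rules can be sketched in Python. This is only an illustration: it uses the standard fnmatch module as a stand-in for GCG's own wildcard handling, which may differ, and the file and entry names are hypothetical.

```python
from fnmatch import fnmatchcase


def matches_file(pattern, filename):
    # Filenames are matched case sensitively.
    return fnmatchcase(filename, pattern)


def matches_entry(pattern, entry_name):
    # Database entry names are matched case insensitively.
    return fnmatchcase(entry_name.lower(), pattern.lower())


# Hypothetical names, for illustration only.
data_file = "gamma.dat"
db_entry = "GAMMAGLOB"

for pattern in ("Gamma*", "gamma*"):
    print(pattern, matches_file(pattern, data_file), matches_entry(pattern, db_entry))
```

With Gamma*, only the database entry matches; with gamma*, both the file and the entry match.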
Here is a session using Names to make a documented list file for most of the GenBank human beta globin sequences:
NAMES for what GCG data file(s) ? GenBank:HumHb*
What (file of filenames) output file (* TERM *) ? humanbeta.fil
Names written into "humanbeta.fil".
Here is part of the output file (the lines in humanbeta.fil contain 132 characters each; we've truncated them for this document):
! NAMES from: GenBank:humhb*
October 21, 1998..
gb_pr1:humhb16aa LOCUS HUMHB16AA 1216 bp mRNA PRI
gb_pr1:humhb1az LOCUS HUMHB1AZ 483 bp DNA PRI
gb_pr1:humhb24 LOCUS HUMHB24 2231 bp mRNA PRI 18-APR-1991 DEFINITION Hum
gb_pr2:humhbgg LOCUS HUMHBGG 545 bp mRNA PRI 24-MAR-1997 DEFINITION Huma
gb_pr2:humhbl2a LOCUS HUMHBL2A 462 bp mRNA PRI 12-MAY-1994 DEFINITION Hu
gb_pr2:humhbp68 LOCUS HUMHBP68 28 bp mRNA PRI 05-APR-1995 DEFINITION Hum
The database programs LookUp, Names, StringSearch, FindPatterns, FastA, TFastA, FastX, TFastX, SSearch, and WordSearch can be used for list refinement if you are looking for sequences with something in common. For instance, you could identify human globin nucleotide sequences with LookUp. The output list from LookUp could then be refined further with FindPatterns to show only those human globin sequences containing EcoRI sites. If you run FindPatterns with -NAMes, you could then do a FastA sequence search on the FindPatterns list file output to see if a sequence you have is similar to any of these EcoRI-containing human globin sequences.
Adding Lists Together
You can add two lists together by simply appending one of the files to the other. It is better if you use a text editor to modify the heading of the combined list so that the annotation in the list correctly reflects what you have done. Remember to delete the text heading from the second file so that it does not occur in the middle of the list.
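As a rough sketch, the append-and-trim step could be automated in Python. It assumes that the heading of a list file ends with the first line that ends in two periods (..), as in the sample output shown earlier; that rule, the function names, and the second list's contents are assumptions for illustration. The heading of the first list is kept unchanged, so you would still edit it by hand to describe the combined list.

```python
def split_heading(lines):
    """Split list-file lines into (heading, body).

    Assumption: the heading ends with the first line ending in '..'.
    """
    for i, line in enumerate(lines):
        if line.rstrip().endswith(".."):
            return lines[:i + 1], lines[i + 1:]
    return [], lines  # no heading found; treat everything as body


def combine_lists(first, second):
    # Keep the first list whole; drop the heading of the second list
    # so that it does not appear in the middle of the combined list.
    _, second_body = split_heading(second)
    return first + second_body


first = ["! NAMES from: GenBank:humhb*", "October 21, 1998..", "gb_pr1:humhb24"]
# Hypothetical second list, for illustration only.
second = ["! NAMES from: a second specification", "October 21, 1998..", "gb_pr2:humhbgg"]
combined = combine_lists(first, second)
print("\n".join(combined))
```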
Suppress any item in a list by typing an exclamation point (!) in front of the item. You can also put comments into a list anywhere on a line by placing an exclamation point before the comment.
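The effect of the exclamation point can be sketched as follows. This Python function is a hypothetical reader, not part of GCG; it expects only the entry lines of a list (the heading already removed) and assumes that the entry name is the first whitespace-delimited word on a line.

```python
def active_entries(body_lines):
    """Return the entry names that are not suppressed or commented out."""
    entries = []
    for line in body_lines:
        # Everything from the first '!' onward is treated as a comment.
        text = line.split("!", 1)[0].strip()
        if text:
            entries.append(text.split()[0])
    return entries


body = [
    "gb_pr1:humhb16aa LOCUS HUMHB16AA 1216 bp mRNA PRI",
    "!gb_pr1:humhb1az LOCUS HUMHB1AZ 483 bp DNA PRI",
    "gb_pr1:humhb24 ! a comment placed after the entry name",
    "",
]
names = active_entries(body)
print(names)
```

The second entry is suppressed by its leading exclamation point, and the comment on the third line is ignored.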
All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional.
Minimal Syntax: % names [-INfile=]GenBank:humhb* -Default
[-OUTfile=]term names the output file (defaults to your terminal)
Local Data Files: None
-SHOwfiles=132 limits documentation in the output file to column 132
-NOHEAding suppresses the heading at the top of the file
-NOMONitor suppresses the screen monitor
You can set the parameters listed below from the command line.
-SHOwfiles=132
Sets the number of the column at which the sequence documentation is cut off. -NOSHOwfiles suppresses the sequence documentation altogether, so that only the sequence name is written to the list file.
-NOHEAding
Suppresses the heading at the top of the list file that shows the input specification and the time.
-MONitor
This program normally monitors its progress on your screen. However, when you use -Default to suppress all program interaction, you also suppress the monitor. You can turn it back on with this parameter. If you are running the program in batch, the monitor will appear in the log file.
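To show how these parameters combine on one command line, here is a small Python sketch that assembles an argument list using only the parameters documented above. The helper names_command and its keyword arguments are hypothetical; the sketch builds the command but does not run it.

```python
def names_command(spec, outfile=None, column=None, documentation=True, heading=True):
    """Build an argument list for a non-interactive Names run."""
    argv = ["names", f"-INfile={spec}"]
    if outfile is not None:
        argv.append(f"-OUTfile={outfile}")
    if not documentation:
        argv.append("-NOSHOwfiles")
    elif column is not None:
        if not 20 <= column <= 511:
            raise ValueError("column must be between 20 and 511")
        argv.append(f"-SHOwfiles={column}")
    if not heading:
        argv.append("-NOHEAding")
    # -Default suppresses all program interaction (and the monitor).
    argv.append("-Default")
    return argv


argv = names_command("GenBank:humhb*", outfile="humanbeta.fil", column=80)
print(" ".join(argv))
```

If you type the resulting command at a shell prompt, the asterisk in the sequence specification may need quoting, depending on your shell.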
Printed: May 27, 2005 13:48
Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.
Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.
All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.