TERMINATOR

 

Table of Contents

FUNCTION

DESCRIPTION

EXAMPLE

OUTPUT

INPUT FILES

RELATED PROGRAMS

RESTRICTIONS

ALGORITHM

CONSIDERATIONS

COMMAND-LINE SUMMARY

LOCAL DATA FILES

PARAMETER REFERENCE


FUNCTION

Terminator searches for prokaryotic factor-independent RNA polymerase terminators according to the method of Brendel and Trifonov.

DESCRIPTION

Terminator uses a table of dinucleotide frequencies, compiled for each position from a set of known terminators, to find places in a new sequence where terminator-like sequences occur. For each alignment of the table over the sequence, the measurement is the sum of the table values for the dinucleotides found at each position of the sequence. Terminator reports every site in the searched sequence where this measurement falls above a user-defined threshold value. The method can also restrict the set of terminator-like sequences shown to those that fall above a second threshold for the presence of a GC-rich dyad symmetry near the poly-U region.
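
The primary structure measurement described above can be illustrated with a minimal Python sketch. It is not the Terminator source code. The names TOY_TABLE, score_window, and scan are hypothetical, and the table values are invented for illustration; the real values are in pmatrix.dat and are not reproduced here. The sketch assumes that a window containing any symbol other than G, A, T, or C scores 0.0, as stated under CONSIDERATIONS.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table: one dictionary per dinucleotide position in the
# window. These numbers are invented; they are NOT the pmatrix.dat values.
TOY_TABLE: List[Dict[str, float]] = [
    {"GC": 1.0, "CG": 0.5},
    {"CT": 1.0, "GT": 0.5},
    {"TT": 2.0},
]


def score_window(window: str, table: List[Dict[str, float]]) -> float:
    """Sum the table value of the dinucleotide found at each position."""
    window = window.upper()
    # Windows with symbols other than G, A, T, C get a value of 0.0.
    if any(base not in "GATC" for base in window):
        return 0.0
    return sum(table[i].get(window[i:i + 2], 0.0) for i in range(len(table)))


def scan(sequence: str, table: List[Dict[str, float]],
         threshold: float) -> List[Tuple[int, float]]:
    """Return (1-based start, value) for each window scoring above threshold."""
    width = len(table) + 1  # k dinucleotide positions span k + 1 bases
    hits = []
    for start in range(len(sequence) - width + 1):
        value = score_window(sequence[start:start + width], table)
        if value > threshold:
            hits.append((start + 1, value))
    return hits
```

With this toy table, scan("AAGCTTAA", TOY_TABLE, 3.5) reports a single hit starting at position 3 with a value of 4.0, because only the window GCTT matches a table entry at every position.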

The method used by Terminator is described in detail in two papers: Brendel, V. and Trifonov, E. N., Nucl. Acids Res. 12, 4411-4427 (1984); and Brendel, V. and Trifonov, E. N., in CODATA Conference Proceedings, Jerusalem, 1984. Any use of Terminator that results in publication should cite these papers.

EXAMPLE

Here is a session using Terminator to search for terminator-like sequences in synpbr322:

 
 
% terminator
 
  TERMINATOR search of what sequence ?  GenBank:SynpBR322
 
                 Begin (* 1 *) ?
               End (*  4361 *) ?
              Reverse (* No *) ?
 
  Primary structure threshold value (* 3.50 *) ?
 
  Secondary structure threshold value (* 0 *) ?
 
  What should I call the output file (* synpbr322.trm *) ?
 
  Searching . . .
 
%
 

OUTPUT

Here is the output file:

 
 
 TERMINATOR search on: synpbr322  check: 5483  from: 1  to: 4361
 
J01749 Cloning vector pBR322, complete genome. 6/96
LOCUS       SYNPBR322    4361 bp    DNA   circular  SYN       07-JUN-1996
DEFINITION  Cloning vector pBR322, complete genome.
ACCESSION   J01749 K00005 L08654 M10282 M10283 M10286 M10356 M10784 M10785
            M10786 M33694 V01119
NID         g208958 . . .
 
 Primary structure threshold: 3.50  Secondary structure threshold: 0
 
                                   October 6, 1998 15:02  ..
 
           -40  -35  -30  -25  -20  -15  -10   -5  -1+  +5         p      s
             .    .    .    .    .    .    .    .   ..   .
     921=> CCATTATCGCCGGCATGGCGGCCGACGCGCTGGGCTACGTCTTGCTGGCGT   3.80      0
    1398=> CATCTCCAGCAGCCGCACGCGGCGCATCTCGGGCAGCGTTGGGTCCTGGCC   3.62      0
    1573=> TCTGCGACCTGAGCAACAACATGAATGGTCTTCGGTTTCCGTGTTTCGTAA   3.62      0
    1583=> GAGCAACAACATGAATGGTCTTCGGTTTCCGTGTTTCGTAAAGTCTGGAAA   4.32      0
    1881=> CATGAACAGAAATCCCCCTTACACGGAGGCATCAGTGACCAAACAGGAAAA   3.57     16
                                --- --    -- ---
                        -- --      -- --
    1914=> AGTGACCAAACAGGAAAAAACCGCCCTTAACATGGCCCGCTTTATCAGAAG   4.47      0
    2320=> GATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCCGCTTCCTCGCTCA   3.73     48
                          -- - ---     --- - --
    2492=> GCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAG   4.35      0
    2497=> AGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCC   3.95      0
    3039=> TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAG   6.92     95
                        -----------   -----------
                          ----    ----
             .    .    .    .    .    .    .    .   ..   .
    3101=> GCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTC   4.18     68
                      -------  - --   -- -  -------
                     ---------     ---------
    3199=> GATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTT   4.62     19
                    ---------      ---------
    3502=> GTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGG   3.59      0
    4226=> TATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGT   4.49      0
    4311=> ACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCAAGAA   3.69      0
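
If you want to process the hits further, the hit lines in a listing like the one above can be read with a short script. The following Python sketch is not part of the GCG Package. The names Hit, HIT_LINE, and parse_hits are hypothetical, and the pattern assumes that every hit line has the layout shown in this example: a number, the characters =>, the sequence window, the p value, and the s value. Header lines and the dash lines that mark the dyad symmetry are skipped.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    position: int   # number printed before "=>"
    sequence: str   # sequence window shown for the hit
    p: float        # primary structure value
    s: int          # secondary structure value


# Assumed layout: "<number>=> <sequence>   <p>   <s>"
HIT_LINE = re.compile(
    r"^\s*(\d+)=>\s+([A-Za-z]+)\s+(-?\d+(?:\.\d+)?)\s+(-?\d+)\s*$"
)


def parse_hits(text: str) -> List[Hit]:
    """Collect every line of the output that looks like a hit line."""
    hits = []
    for line in text.splitlines():
        match = HIT_LINE.match(line)
        if match:
            hits.append(Hit(int(match.group(1)), match.group(2),
                            float(match.group(3)), int(match.group(4))))
    return hits
```

For example, [h.position for h in parse_hits(text) if h.s > 0] lists only the hits that also show some dyad symmetry.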

INPUT FILES

Terminator takes a single nucleic acid sequence as input. If Terminator rejects your nucleotide sequence, turn to Appendix VI to see how to change or set the type of a sequence.

RELATED PROGRAMS

None

RESTRICTIONS

The pattern recognition method used by Terminator applies only to the search for prokaryotic factor-independent terminators. Terminator is not really a GCG Package program; it was adapted to run with the Package by Greg Hamm. Its behavior is not completely known, and it may not adhere to all GCG Package conventions. Accelrys (GCG) is very grateful to Drs. Brendel and Trifonov for generously allowing it to distribute their program.

ALGORITHM

The algorithm is described clearly in the CODATA paper.

CONSIDERATIONS

The default primary structure threshold is set so that about 95 percent of known prokaryotic factor-independent terminators should appear among the terminator-like sequences that Terminator reports, based on primary structure alone.

The program predicts terminators in those parts of the sequence composed entirely of lower- and uppercase G, A, T, and C. Parts of the sequence containing other sequence symbols are given a primary structure value of 0.0 and a secondary structure value of 0.

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional.

Minimal Syntax: % terminator [-INfile=]genbank:synpbr322 -Default
 
Prompted Parameters:
 
-BEGin=1 -END=4361        sets the range of interest
-REVerse                  uses the reverse strand
-PTHRESHold=3.50          sets the primary structure threshold value
-STHRESHold=0             sets the secondary structure threshold value
[-OUTfile=]synpbr322.trm  names the output file
 
Local Data Files:
 
-DATa1=pmatrix.dat       contains the normalized dinucleotide fractions
-DATa2=smatrix.dat       contains the significant GC-rich dyad diagonals
 
Optional Parameters: None
 

LOCAL DATA FILES

The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.
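
The lookup order described above can be summarized in a small Python sketch. It only illustrates the rule; it is not how the GCG Package is implemented. The function name resolve_data_file and its arguments are hypothetical. The text does not say which choice wins when a file is named on the command line and a file of the same name also exists in the working directory; the sketch assumes that the command line takes precedence.

```python
from pathlib import Path
from typing import Optional, Union

PathLike = Union[str, Path]


def resolve_data_file(name: str, working_dir: PathLike, public_dir: PathLike,
                      override: Optional[PathLike] = None) -> Path:
    """Pick the data file to read, following the order described in the text."""
    # A file named on the command line (for example -DATa1=myfile.dat).
    # Assumption: this takes precedence over a local file of the same name.
    if override is not None:
        return Path(override)
    # A file with exactly the same name in the current working directory.
    local = Path(working_dir) / name
    if local.is_file():
        return local
    # Otherwise, the copy in the public data directory.
    return Path(public_dir) / name
```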

The file pmatrix.dat is taken from Figure 3 of the CODATA paper. It is similar to Figure 3 of the NAR paper. It contains the normalized fractions of each dinucleotide observed in the set thought to be determining terminator structure. The file smatrix.dat is from Figure 2 of the CODATA paper. It contains the significant diagonals for the GC-rich dyad symmetry. Both pmatrix.dat and smatrix.dat must be provided to Terminator as local data files.

PARAMETER REFERENCE

You can set the parameters listed below from the command line.

-PTHRESHold=3.50

Sets the primary structure threshold value for display of sequence ranges. The default value is set to find about 95 percent of known prokaryotic factor-independent terminators.

-STHRESHold=0

Sets the secondary structure threshold value. This secondary structure is GC-rich dyad symmetry near poly-U regions of a sequence.
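
To make the idea of dyad symmetry concrete, the following Python sketch looks for the longest pair of segments in which the second segment is the exact reverse complement of the first, and counts the G and C bases in the stem. It is only an illustration of the concept. It does not reproduce the scoring that Terminator derives from smatrix.dat, and the names reverse_complement and longest_dyad, the min_stem and min_loop defaults, and the returned tuple are all hypothetical. The sketch assumes a sequence made up only of A, C, G, and T.

```python
from typing import Optional, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_dyad(seq: str, min_stem: int = 4,
                 min_loop: int = 3) -> Optional[Tuple[int, int, int, int]]:
    """Find the longest perfect dyad symmetry in seq.

    Returns (left_start, right_start, stem_length, gc_count) with 0-based
    starts, or None if no stem of at least min_stem bases is found.
    """
    seq = seq.upper()
    n = len(seq)
    # Try the longest possible stem first, so the first match is the longest.
    for stem in range((n - min_loop) // 2, min_stem - 1, -1):
        for i in range(n - 2 * stem - min_loop + 1):
            left = seq[i:i + stem]
            # The right arm must start after the left arm plus the loop.
            j = seq.find(reverse_complement(left), i + stem + min_loop)
            if j != -1:
                gc_count = left.count("G") + left.count("C")
                return (i, j, stem, gc_count)
    return None
```

Applied to AAACCACCGCTGGTAGCGGTGGTTT, which is taken from the window reported at 3039 in the example output, the sketch finds an 11-base stem (AAACCACCGCT paired with AGCGGTGGTTT) that contains six G or C bases.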

Printed:  April 5, 2005 15:43 



Technical Support: support-us@accelrys.com, support-japan@accelrys.com,
or support-eu@accelrys.com

Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.

Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

www.accelrys.com/bio