Terminator searches for prokaryotic factor-independent RNA polymerase terminators according to the method of Brendel and Trifonov.
Terminator uses a table of the dinucleotide frequencies for each position from a set of known terminators to find places in a new sequence where terminator-like sequences occur. Terminator finds all discrete examples in the searched sequence where a measurement falls above some user-defined threshold value. The measurement for each alignment of the table over the sequence is the sum of the values in the table for each dinucleotide from the sequence. The method can also restrict the set of terminator-like sequences shown to those that fall above some threshold for the presence of a GC-rich dyad symmetry near the poly-U region.
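The scan described above can be sketched in a few lines of Python. This is a minimal illustration, not the program's own code: the table here is a hypothetical toy with three dinucleotide positions, whereas the real values come from pmatrix.dat. The names `score_window`, `scan`, and `toy_table` are invented for the example, and how Terminator merges overlapping alignments into discrete hits is not reproduced.

```python
from typing import Dict, List, Tuple

# Hypothetical toy table: one dict per window position, mapping a
# dinucleotide to its weight. The real table is read from pmatrix.dat.
ToyTable = List[Dict[str, float]]


def score_window(window: str, table: ToyTable) -> float:
    """Sum the table value of the dinucleotide found at each position."""
    total = 0.0
    for pos, weights in enumerate(table):
        dinucleotide = window[pos:pos + 2]
        total += weights.get(dinucleotide, 0.0)
    return total


def scan(sequence: str, table: ToyTable, threshold: float) -> List[Tuple[int, float]]:
    """Return (1-based window start, score) for each alignment above threshold."""
    seq = sequence.upper()
    width = len(table) + 1  # n dinucleotide positions span n + 1 bases
    hits = []
    for start in range(len(seq) - width + 1):
        score = score_window(seq[start:start + width], table)
        if score > threshold:
            hits.append((start + 1, score))
    return hits


toy_table = [{"TT": 1.0, "GT": 0.5}, {"TT": 1.0}, {"TT": 1.0, "TC": 0.25}]
hits = scan("ACGTTTTCA", toy_table, threshold=2.0)
print(hits)
```

The sketch reports the window start for each hit. Which window position Terminator itself reports in its output is not stated here, so that choice is an assumption.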
The method used by Terminator is described in detail in two papers: Brendel, V. and Trifonov, E. N., Nucl. Acids Res. 12, 4411-4427 (1984) and Brendel, V. and Trifonov, E. N. in CODATA Conference
Here is a session using Terminator to search for terminator-like sequences in synpbr322:
TERMINATOR search of what sequence ? GenBank:SynpBR322
Begin (* 1 *) ?
End (* 4361 *) ?
Reverse (* No *) ?
Primary structure threshold value (* 3.50 *) ?
Secondary structure threshold value (* 0 *) ?
What should I call the output file (* synpbr322.trm *) ?
Searching . . .
Here is the output file:
TERMINATOR search on: synpbr322 check: 5483 from: 1 to: 4361
J01749 Cloning vector pBR322, complete genome. 6/96
LOCUS SYNPBR322 4361 bp DNA circular SYN
DEFINITION Cloning vector pBR322, complete genome.
ACCESSION J01749 K00005 L08654 M10282 M10283 M10286 M10356 M10784 M10785
M10786 M33694 V01119
NID g208958 . . .
Primary structure threshold: 3.50 Secondary structure threshold: 0
October 6, 1998 15:02 ..
-40 -35 -30 -25 -20 -15 -10 -5 -1+ +5 p s
. . . . . . . . .. .
921=> CCATTATCGCCGGCATGGCGGCCGACGCGCTGGGCTACGTCTTGCTGGCGT 3.80 0
1398=> CATCTCCAGCAGCCGCACGCGGCGCATCTCGGGCAGCGTTGGGTCCTGGCC 3.62 0
1573=> TCTGCGACCTGAGCAACAACATGAATGGTCTTCGGTTTCCGTGTTTCGTAA 3.62 0
1583=> GAGCAACAACATGAATGGTCTTCGGTTTCCGTGTTTCGTAAAGTCTGGAAA 4.32 0
1881=> CATGAACAGAAATCCCCCTTACACGGAGGCATCAGTGACCAAACAGGAAAA 3.57 16
--- -- -- ---
-- -- -- --
1914=> AGTGACCAAACAGGAAAAAACCGCCCTTAACATGGCCCGCTTTATCAGAAG 4.47 0
2320=> GATGCGTAAGGAGAAAATACCGCATCAGGCGCTCTTCCGCTTCCTCGCTCA 3.73 48
-- - --- --- - --
2492=> GCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAG 4.35 0
2497=> AGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCC 3.95 0
3039=> TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAG 6.92 95
. . . . . . . . .. .
3101=> GCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTC 4.18 68
------- - -- -- - -------
3199=> GATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTT 4.62 19
3502=> GTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGG 3.59 0
4226=> TATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGT 4.49 0
4311=> ACATTAACCTATAAAAATAGGCGTATCACGAGGCCCTTTCGTCTTCAAGAA 3.69 0
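If you want to post-process an output file like the one above, the hit lines can be picked out with a short script. The sketch below assumes each hit line has the form position, `=>`, sequence, primary value, secondary value, separated by whitespace, as in the listing shown. The names `Hit`, `HIT_LINE`, and `parse_hits` are invented for this example and are not part of the GCG Package.

```python
import re
from typing import List, NamedTuple


class Hit(NamedTuple):
    position: int
    sequence: str
    primary: float
    secondary: int


# Assumed layout, taken from the sample output: "3039=> ACGT... 6.92 95"
HIT_LINE = re.compile(r"^\s*(\d+)=>\s+([ACGTacgt]+)\s+(-?\d+\.\d+)\s+(\d+)\s*$")


def parse_hits(text: str) -> List[Hit]:
    """Collect hit lines; header, ruler, and dyad-marker lines are skipped."""
    hits = []
    for line in text.splitlines():
        match = HIT_LINE.match(line)
        if match:
            position, sequence, primary, secondary = match.groups()
            hits.append(Hit(int(position), sequence, float(primary), int(secondary)))
    return hits


sample = """\
3039=> TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAG 6.92 95
. . . . . . . . .. .
3101=> GCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTC 4.18 68
------- - -- -- - -------
"""
parsed = parse_hits(sample)
with_dyad = [hit for hit in parsed if hit.secondary > 0]
```

Filtering on `secondary > 0` keeps only the hits for which a GC-rich dyad symmetry value was reported.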
Terminator takes a single nucleic acid sequence as input. If Terminator rejects your nucleotide sequence, turn to Appendix VI to see how to change or set the type of a sequence.
The pattern recognition method used by Terminator is only applicable to the search for prokaryotic factor-independent terminators. Terminator is not really a GCG Package program, but was adapted to run with the Package by Greg Hamm. Its behavior is not completely known, and it may not adhere to all GCG Package conventions. Accelrys (GCG) is very grateful to Drs. Brendel and Trifonov for generously allowing it to distribute their program.
The algorithm is described clearly in the CODATA paper.
The default primary structure threshold is set so that, on the basis of primary structure alone, about 95 percent of known factor-independent prokaryotic terminators should appear among the terminator-like sequences that Terminator reports.
The program predicts terminators in those parts of the sequence composed entirely of lower- and uppercase G, A, T, and C. Parts of the sequence containing other sequence symbols are given a primary structure value of 0.0 and a secondary structure value of 0.
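The rule above can be illustrated with a small guard placed in front of whatever scoring functions are used. This is only a sketch: `window_values` is a hypothetical name, and the two toy scoring functions stand in for the real primary and secondary structure measurements, which are not reproduced here.

```python
VALID_BASES = set("ACGTacgt")


def window_values(window, primary_fn, secondary_fn):
    """Score a window only if it consists entirely of G, A, T, and C.

    Any other sequence symbol yields a primary structure value of 0.0
    and a secondary structure value of 0, as described in the text.
    """
    if not window or not set(window) <= VALID_BASES:
        return 0.0, 0
    return primary_fn(window), secondary_fn(window)


# Toy stand-ins for the real measurements (assumptions for illustration).
def toy_primary(window):
    return float(window.upper().count("T"))


def toy_secondary(window):
    upper = window.upper()
    return upper.count("G") + upper.count("C")


clean = window_values("acgTTT", toy_primary, toy_secondary)
masked = window_values("ACGNTT", toy_primary, toy_secondary)
```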
All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional.
Minimal Syntax: % terminator [-INfile=]genbank:synpbr322 -Default
-BEGin=1 -END=4361 sets the range of interest
-REVerse uses the reverse strand
-PTHRESHold=3.50 sets the primary structure threshold value
-STHRESHold=0 sets the secondary structure threshold value
[-OUTfile=]synpbr322.trm names the output file
Local Data Files:
-DATa1=pmatrix.dat contains the normalized dinucleotide fractions
-DATa2=smatrix.dat contains the significant GC-rich dyad diagonals
Optional Parameters: None
The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.
The file pmatrix.dat is taken from Figure 3 of the CODATA paper. It is similar to Figure 3 of the NAR paper. It contains the normalized fractions of each dinucleotide observed in the set thought to be determining terminator structure. The file smatrix.dat is from Figure 2 of the CODATA paper. It contains the significant diagonals for the GC-rich dyad symmetry. Both pmatrix.dat and smatrix.dat must be provided to Terminator as local data files.
You can set the parameters listed below from the command line.
-PTHRESHold=3.50
Sets the primary structure threshold value for display of sequence ranges. The default value is set to find 95 percent of known factor-independent prokaryotic terminators.
-STHRESHold=0
Sets the secondary structure threshold value. The secondary structure measured is GC-rich dyad symmetry near the poly-U region of a sequence.
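To make the idea of a dyad symmetry concrete, the sketch below looks for the longest perfect inverted repeat (a stem whose two arms are reverse complements, separated by a short loop) and counts its G-C pairs. It does not reproduce the secondary structure value that Terminator computes from smatrix.dat. The function `best_dyad`, the loop limits of 3 to 8 bases, and the requirement of perfect pairing are all assumptions made for illustration.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def best_dyad(seq, min_loop=3, max_loop=8):
    """Return (stem length, 0-based left-arm start, loop length, GC pairs).

    Searches for the longest perfect inverted repeat; ties are broken
    in favor of the stem with more G-C pairs, then the first one found.
    """
    seq = seq.upper()
    n = len(seq)
    best = (0, 0, 0, 0)
    for end in range(1, n):  # left arm ends just before index `end`
        for loop in range(min_loop, max_loop + 1):
            k = 0
            # Extend the stem outward while the bases pair.
            while (end - 1 - k >= 0 and end + loop + k < n
                   and PAIR.get(seq[end - 1 - k]) == seq[end + loop + k]):
                k += 1
            gc = sum(base in "GC" for base in seq[end - k:end])
            if (k, gc) > (best[0], best[3]):
                best = (k, end - k, loop, gc)
    return best


# The hit reported at 3039 in the sample output for synpbr322.
hit_3039 = "TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAG"
stem, start, loop, gc = best_dyad(hit_3039)
left_arm = hit_3039[start:start + stem]
right_arm = hit_3039[start + stem + loop:start + 2 * stem + loop]
print(left_arm, right_arm, loop, gc)
```

On this hit the stem lies immediately upstream of the run of T residues, which is the arrangement the secondary structure threshold is meant to select for.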
Printed: April 5, 2005 15:43
Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.
Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.
All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.