PAUPDISPLAY

Table of Contents

FUNCTION

DESCRIPTION

EXAMPLE

FUNCTION


PAUPDisplay provides a GCG interface to tree manipulation, diagnosis, and display options in PAUP (Phylogenetic Analysis Using Parsimony). Starting with a trees file that contains a sequence alignment and one or more trees reconstructed from this alignment (such as the output from PAUPSearch), you can plot the tree(s); compute the score of the tree(s) according to the criteria of parsimony, distance, or maximum likelihood; or calculate a consensus tree (this requires two or more input trees). PAUPDisplay can also plot the trees from a GrowTree trees file.

PAUP is the copyrighted property of the Smithsonian Institution. Use the program Fetch to obtain a copy of paup-license.txt to read about rights and limitations for using PAUP.

The version of PAUP that currently ships with Accelrys GCG (GCG) is a developmental version. Be sure to check your results carefully.

DESCRIPTION


PAUPDisplay is a GCG front-end to Dr. David Swofford's PAUP (Phylogenetic Analysis Using Parsimony) program. It allows you to access most of the tree diagnosis, manipulation, and display functions of PAUP from the command line using GCG conventions, or from SeqLab. You can plot trees to a figure file or to a GCG-supported graphics device. To use all of the functions of PAUPDisplay, the file used as input for this program must be a NEXUS format file that contains both a DATA block (the alignment) and a TREES block, such as the output file from PAUPSearch. If you want only to plot trees and not to analyze them, a NEXUS file containing only a TREES block, such as the trees file from GrowTree, can be used as input.

PAUPDisplay processes your input, performs some checks on the NEXUS file that is used for input, and writes this information into a temporary script file in NEXUS format. (The NEXUS format was designed to be used as a standard file format for the interchange of information between programs used in phylogeny and classification. It is described in Maddison et al., Systematic Biology 46: 590-621 (1997).) The script contains the alignment data, the trees reconstructed from the alignment, and commands in the PAUP command language. PAUPDisplay then calls PAUP itself, giving it the name of the script file as its input. The script is deleted afterward unless you save it with -SCRIPT=paup.paupscript. After PAUP has completed the analysis, control is returned to PAUPDisplay, which creates a graphical plot of the tree(s) if a plot has been requested.

The PAUP functions supported by PAUPDisplay include rooting an unrooted tree, plotting a tree, and displaying a text representation of a tree along with diagnostic information. In addition, if you have two or more trees, you can compute a consensus tree, an agreement subtree, and tree-to-tree distances.

This document provides only an overview of the types of analyses that PAUP can do. For detailed information about maximum parsimony, minimum evolution, maximum likelihood, consensus trees, rooting of trees, and PAUP itself, you can purchase additional copies of the PAUP User's Manual from the publisher, Sinauer Associates, Inc., 23 Plumtree Road, Sunderland MA 01375-0407 USA, phone 413-549-4300, FAX 413-549-1118. Information about the availability of the manual can be obtained on their web site (http://www.sinauer.com) or by e-mail (publish@sinauer.com).

EXAMPLE


Here is a session with PAUPDisplay that displays the description and graphical plot of two trees that were created in the sample session with PAUPSearch.

% paupdisplay

 What trees file to display ?  hum_gtr.pauptrees

 What should be displayed ?

   1 Description of trees only

   2 Plot of trees only

   3 Description and plot of trees

   4 Tree-to-tree distances

   5 Consensus tree(s)

   6 Agreement subtree

 Choose one (* 3 *) :

 Optimality criterion for tree description:

   P  Parsimony

   D  Distance (Minimum Evolution)

 Choose a criterion (* P *) :

 Setting criterion to parsimony.

 What should I call the output file (* hum_gtr.paupdisplay *) ?

 Available plot output options :

   1 Plot on LaserWriter attached to /dev/tty10

   2 FIGURE file named paupdisplay.figure

 Send plot to (* 1 *) : 2

 Creating NEXUS file for input to PAUP

 Calling PAUP...

P A U P *

Portable version 4.0.0d55 for Unix

Fri Oct 23 13:32:03 1998

        --------------------------NOTICE------------------------

          PAUP* is experimental in this release.

          Please check your results carefully!

        --------------------------------------------------------

Processing of file "paup1038092136331.trees" begins...

 Aligned sequences from GCG file(s) 'HumGtr.Msf{*}'

Data matrix has 5 taxa, 548 characters

Data read in 'protein' format

Valid character-state symbols: ACDEFGHIKLMNPQRSTVWY*

Missing data identified by '?'

"Equate" macros in effect:

   B,b ==> {DN}

   Z,z ==> {EQ}

Gaps identified by '.', treated as "missing"

>Heuristic search settings:

>   Optimality criterion = maximum parsimony

>      Character-status summary:

>        Of 548 total characters:

>          All characters are of type 'unord'

>          All characters have equal weight

>          178 characters are constant

>          286 variable characters are parsimony-uninformative

>          Number of parsimony-informative characters = 84

>   Starting tree(s) obtained via stepwise addition

>   Addition sequence: simple (reference taxon = Gtr1 Human)

>   1 tree held at each step during stepwise addition

>   Tree-bisection-reconnection (TBR) branch-swapping performed

>   MULPARS option in effect

>   Steepest descent option not in effect

>   'MaxTrees' setting = 100 (will not be increased)

>   Branches collapsed (creating polytomies) if maximum branch length = 0

>   Topological constraints not enforced

>   Trees are unrooted

>Heuristic search completed

>   Total number of rearrangements tried = 28

>   Score of best tree(s) found = 738

>   Number of trees retained = 2

>   Time used = <1 sec (CPU time = 0.02 sec)

2 trees read from TREES block

   Time used = <1 sec (CPU time = 0.02 sec)

Logging output to file "HumGtr.Paupdisplay".

Tree description:

   Unrooted tree(s) rooted using outgroup method

   Optimality criterion = maximum parsimony

   Character-status summary:

     Of 548 total characters:

       All characters are of type 'unord'

       All characters have equal weight

       178 characters are constant

       286 variable characters are parsimony-uninformative

       Number of parsimony-informative characters = 84

   Character-state optimization: Accelerated transformation (ACCTRAN)

Tree number 1 (rooted using default outgroup)

Branch lengths and linkages for tree  1 (unrooted)

                                     Assigned       Minimum       Maximum

                    Connected          branch      possible      possible

   Node              to node           length        length        length

-------------------------------------------------------------------------

Gtr1 Human (1)           8                 82            28           105

Gtr3 Human (2)           8                115            72           150

     7                   8                 70            17            91

Gtr4 Human (3)           7                100            47           135

     6                   7                 79            16           133

Gtr2 Human (4)           6                110            56           167

Gtr5 Human (5)           6                182           129           243

Tree length = 738

Consistency index (CI) = 0.9350

Homoplasy index (HI) = 0.0650

CI excluding uninformative characters = 0.7318

HI excluding uninformative characters = 0.2682

Retention index (RI) = 0.4286

Rescaled consistency index (RC) = 0.4007

f value = 60

f-ratio = 0.0862

   (multistate unordered and stepmatrix characters excluded from f-value

   calculations)

/------------------------------------------------------------------- Gtr1 Human

+------------------------------------------------------------------- Gtr3 Human

|                     /--------------------------------------------- Gtr4 Human

|                     |

\---------------------7                      /---------------------- Gtr2 Human

                      \----------------------6

                                             \---------------------- Gtr5 Human

Note: Multistate unordered and/or stepmatrix characters excluded from

      patristic distance calculations.

Patristic distance matrix

   Below diagonal: Adjusted character distances

   Above diagonal: Patristic distances

                  1    2    3    4    5

  1 Gtr1 Human    -   27   43   70  115

  2 Gtr3 Human   27    -   44   71  116

  3 Gtr4 Human   35   40    -   59  104

  4 Gtr2 Human   56   57   59    -  107

  5 Gtr5 Human  105  106  104  107    -

Pairwise homoplasy matrix

                 1   2   3   4   5

  1 Gtr1 Human   -

  2 Gtr3 Human   0   -

  3 Gtr4 Human   8   4   -

  4 Gtr2 Human  14  14   0   -

  5 Gtr5 Human  10  10   0   0   -

Tree number 2 (rooted using default outgroup)

/////////////////////////////////////////////

Loading trees...

Tree 1: (PAUP 1) taxa = 5

Tree 2: (PAUP 2) taxa = 5

 FIGURE instructions are now being written into paupdisplay.figure.

OUTPUT


Here is some of the output file:

P A U P *

Portable version 4.0.0d55 for Unix

Fri Oct 23 13:32:05 1998

        --------------------------NOTICE------------------------

          PAUP* is experimental in this release.

          Please check your results carefully!

        --------------------------------------------------------

Tree description:

   Unrooted tree(s) rooted using outgroup method

   Optimality criterion = maximum parsimony

   Character-status summary:

     Of 548 total characters:

       All characters are of type 'unord'

       All characters have equal weight

       178 characters are constant

       286 variable characters are parsimony-uninformative

       Number of parsimony-informative characters = 84

   Character-state optimization: Accelerated transformation (ACCTRAN)

Tree number 1 (rooted using default outgroup)

Branch lengths and linkages for tree #1 (unrooted)

                                     Assigned       Minimum       Maximum

                    Connected          branch      possible      possible

   Node              to node           length        length        length

-------------------------------------------------------------------------

Gtr1 Human (1)           8                 82            28           105

Gtr3 Human (2)           8                115            72           150

     7                   8                 70            17            91

Gtr4 Human (3)           7                100            47           135

     6                   7                 79            16           133

Gtr2 Human (4)           6                110            56           167

Gtr5 Human (5)           6                182           129           243

Tree length = 738

Consistency index (CI) = 0.9350

Homoplasy index (HI) = 0.0650

CI excluding uninformative characters = 0.7318

HI excluding uninformative characters = 0.2682

Retention index (RI) = 0.4286

Rescaled consistency index (RC) = 0.4007

f value = 60

f-ratio = 0.0862

   (multistate unordered and stepmatrix characters excluded from f-value

   calculations)

/------------------------------------------------------------------- Gtr1 Human

+------------------------------------------------------------------- Gtr3 Human

|                     /--------------------------------------------- Gtr4 Human

|                     |

\---------------------7                      /---------------------- Gtr2 Human

                      \----------------------6

                                             \---------------------- Gtr5 Human

Note: Multistate unordered and/or stepmatrix characters excluded from

      patristic distance calculations.

Patristic distance matrix

   Below diagonal: Adjusted character distances

   Above diagonal: Patristic distances

                  1    2    3    4    5

  1 Gtr1 Human    -   27   43   70  115

  2 Gtr3 Human   27    -   44   71  116

  3 Gtr4 Human   35   40    -   59  104

  4 Gtr2 Human   56   57   59    -  107

  5 Gtr5 Human  105  106  104  107    -

Pairwise homoplasy matrix

                 1   2   3   4   5

  1 Gtr1 Human   -

  2 Gtr3 Human   0   -

  3 Gtr4 Human   8   4   -

  4 Gtr2 Human  14  14   0   -

  5 Gtr5 Human  10  10   0   0   -

Tree number 2 (rooted using default outgroup)

/////////////////////////////////////////////

Here is the graphical plot of one of the trees:

INPUT FILES


To use all of the functions available through PAUPDisplay, you must use an input file in NEXUS format that contains both a DATA block and a TREES block. (If only a TREES block is present, the trees can be plotted, but no other manipulations or analyses can be performed.) Ordinarily you would use the output file from PAUPSearch. The PAUP User's Manual published by Sinauer contains information about the NEXUS file format.
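
The distinction between the two kinds of input file can be illustrated with a short Python sketch. It assumes only that each NEXUS block opens with a "BEGIN name;" statement (in any letter case); the function names and the abbreviated sample text are hypothetical, and this is not the check that PAUPDisplay itself performs.

```python
import re

def nexus_blocks(text):
    """Return the names (upper case) of the blocks opened in NEXUS text."""
    pattern = re.compile(r"^\s*begin\s+(\w+)\s*;", re.IGNORECASE | re.MULTILINE)
    return {name.upper() for name in pattern.findall(text)}

def available_functions(text):
    """Classify a file by what the passage above says can be done with it."""
    blocks = nexus_blocks(text)
    if {"DATA", "TREES"} <= blocks:
        return "all"          # DATA and TREES blocks are both present
    if "TREES" in blocks:
        return "plot only"    # trees can be plotted but not analyzed
    return "none"

# Hypothetical, heavily abbreviated trees-only file
sample = """#NEXUS
begin trees;
  tree t1 = (A,(B,C));
end;
"""
print(available_functions(sample))
```

A file that also carries a DATA block would be reported as "all".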

RELATED PROGRAMS


PileUp creates a multiple sequence alignment of a group of related sequences.

GCG includes several programs for evolutionary analysis of multiple sequence alignments. Distances creates a matrix of pairwise distances between the sequences in a multiple sequence alignment. Diverge measures the number of synonymous and nonsynonymous substitutions per site of two or more aligned protein coding regions and can output matrices of these values. GrowTree reconstructs a tree from a distance matrix or a matrix of synonymous or nonsynonymous substitutions. PAUPSearch reconstructs phylogenetic trees from a multiple sequence alignment using parsimony, distance, or maximum likelihood criteria; PAUPDisplay can manipulate and display the trees output by PAUPSearch and can also plot the trees output by GrowTree.

RESTRICTIONS


To use all of the functions provided through PAUPDisplay, you need a single file in NEXUS format containing both a DATA block with the original alignment and a TREES block containing the trees resulting from the analysis (normally you would use the output file produced by PAUPSearch). If only a TREES block is present, such as the trees file produced by GrowTree, PAUPDisplay can plot the tree to a GCG-supported graphics device, but no other display or manipulation actions can be performed.

ALGORITHM


Detailed information about the specific algorithms used by PAUP to create consensus trees and agreement subtrees, calculate likelihoods, root unrooted trees, and perform character optimizations, along with information about the statistics printed with the description of the tree, can be found in the PAUP User's Manual published by Sinauer. The following is a summary of some of these subjects.

The Describe Trees Command

When -ACTion=DEScribe, you will get the following information for each tree: the score of the tree according to the optimality criterion that you've set; a table of assigned, minimum-possible, and maximum-possible branch lengths; and a text "graphic" of the tree topology.
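
In the sample session, which uses the parsimony criterion, the reported tree length equals the sum of the assigned branch lengths in the table. The following Python lines simply redo that arithmetic with the values printed for tree 1; the dictionary name and the "node 6" and "node 7" labels are ours.

```python
# Assigned branch lengths for tree 1, copied from the sample session
assigned = {
    "Gtr1 Human": 82,
    "Gtr3 Human": 115,
    "node 7": 70,
    "Gtr4 Human": 100,
    "node 6": 79,
    "Gtr2 Human": 110,
    "Gtr5 Human": 182,
}

tree_length = sum(assigned.values())
print(tree_length)  # 738, matching "Tree length = 738"
```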

The criterion used here does not have to match the criterion that was used by PAUPSearch to obtain the tree(s). For example, you could specify -CRITerion=L in PAUPSearch to obtain the best tree according to maximum likelihood, then use -CRITerion=D in PAUPDisplay to display the score (sum of branch lengths) of this tree according to the minimum evolution algorithm.

When the optimality criterion is set to parsimony, you can get a great deal of additional information for each tree. (Again, the original tree(s) may have been obtained in PAUPSearch using a criterion other than parsimony.) A matrix containing adjusted character distances and patristic distances is displayed, along with the pairwise homoplasy matrix. (The patristic distance is the sum of the branch lengths on the path between two sequences on the tree. The character distance is the sum of the number of character changes between two sequences. The homoplasy distance is the patristic distance minus the character distance.) When -FVALue is used, you will get a listing of various goodness-of-fit statistics, such as the consistency index, homoplasy index, retention index, F value, and F-ratio. Other parameters allow you to examine character-state assignments.
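
The relationship between the three matrices can be checked against the sample session. This Python sketch subtracts the adjusted character distances (below the diagonal in the sample output) from the patristic distances (above the diagonal) and reproduces the pairwise homoplasy matrix. It also checks two standard relationships among the fit statistics, HI = 1 - CI and RC = CI x RI, which the sample values satisfy. The variable names are ours.

```python
# Values for tree 1 from the sample session, keyed by (sequence, sequence)
patristic = {(1, 2): 27, (1, 3): 43, (1, 4): 70, (1, 5): 115, (2, 3): 44,
             (2, 4): 71, (2, 5): 116, (3, 4): 59, (3, 5): 104, (4, 5): 107}
character = {(1, 2): 27, (1, 3): 35, (1, 4): 56, (1, 5): 105, (2, 3): 40,
             (2, 4): 57, (2, 5): 106, (3, 4): 59, (3, 5): 104, (4, 5): 107}

# Homoplasy distance = patristic distance - character distance
homoplasy = {pair: patristic[pair] - character[pair] for pair in patristic}
for (i, j), value in sorted(homoplasy.items()):
    print(f"sequences {i} and {j}: homoplasy = {value}")

# Fit statistics reported for the same tree
ci, ri = 0.9350, 0.4286
hi = round(1 - ci, 4)   # homoplasy index
rc = round(ci * ri, 4)  # rescaled consistency index
print(hi, rc)
```

For example, sequences 1 and 3 have a patristic distance of 43 and a character distance of 35, giving the homoplasy value of 8 shown in the sample output.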

Character-state Diagnosis

Character-state optimization is the reconstruction of character states at internal (ancestral) nodes of a tree. (This is not done while searching for trees; it is done only when you request this analysis on existing trees.) This analysis assigns character states that minimize the total number of changes required by a particular character on that tree. The set of assignments constitutes a "most-parsimonious reconstruction" (MPR) or an "optimal reconstruction." The MPR is affected by tree topology and by any assumptions that have been made about character order and about the cost of changing from one character state to another.

In the event that more than one optimal assignment can be made at a node, PAUP uses one of three transformation optimization methods to favor one reconstruction over another by "pushing" transformations up or down the tree.

* accelerated transformation (ACCTRAN). This assigns states at internal nodes so that character state changes occur as close as possible to the root of the tree.

* delayed transformation (DELTRAN). This assigns states at internal nodes in such a way that character state changes occur as far as possible from the root of the tree.

* minimization of F-value (MINF). This assigns states at internal nodes so that the F-value of Farris (1972) is minimized. It is only available for unrooted trees.

ACCTRAN leads to a preference for a single origin followed by a reversal. DELTRAN leads to a preference for two origins of a character state (parallelisms). MINF transfers the length from interior branches towards terminal branches whenever possible and often yields the same reconstruction as DELTRAN. The methods always involve the same number of steps on the tree, but the changes will be located at different nodes depending on the method chosen. You can set the transformation to use with -OPTimize.
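
The difference between the first two methods can be seen on a toy example. The Python sketch below is not PAUP's algorithm: it enumerates every assignment of a binary character to the internal nodes of a small hypothetical rooted tree, keeps the assignments with the fewest changes, and then ranks them by how close the changes lie to the root. The tree, the tip states, the fixed root state of 0, and the depth-based ranking are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical rooted tree, written as child -> parent
PARENT = {"O": "root", "n1": "root", "C": "n1", "n2": "n1", "A": "n2", "B": "n2"}
TIP_STATES = {"O": 0, "C": 1, "A": 1, "B": 0}
ROOT_STATE = 0            # assumed ancestral state
INTERNAL = ["n1", "n2"]

def depth(node):
    """Number of branches between a node and the root."""
    steps = 0
    while node != "root":
        node = PARENT[node]
        steps += 1
    return steps

def changed_branches(assignment):
    """List the (parent, child) branches on which the state changes."""
    states = {"root": ROOT_STATE, **TIP_STATES, **assignment}
    return [(PARENT[child], child) for child in PARENT
            if states[child] != states[PARENT[child]]]

def most_parsimonious():
    """Try every internal-node assignment and keep the shortest ones."""
    candidates = []
    for values in product((0, 1), repeat=len(INTERNAL)):
        assignment = dict(zip(INTERNAL, values))
        candidates.append((assignment, changed_branches(assignment)))
    fewest = min(len(changes) for _, changes in candidates)
    return [c for c in candidates if len(c[1]) == fewest]

def change_depth(candidate):
    """Summed depth of the changed branches (smaller = closer to the root)."""
    return sum(depth(child) for _, child in candidate[1])

mprs = most_parsimonious()
acctran_like = min(mprs, key=change_depth)   # changes pushed toward the root
deltran_like = max(mprs, key=change_depth)   # changes pushed toward the tips
print("ACCTRAN-like:", acctran_like)
print("DELTRAN-like:", deltran_like)
```

Both reconstructions need two steps. The ACCTRAN-like one places a gain on the branch leading to n1 and a reversal on the branch leading to B; the DELTRAN-like one places two independent gains on the branches leading to C and A.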

There are several parameters that you can use to display character-state information. -DIAG displays the minimum-possible, assigned, and maximum-possible length for each character, along with goodness-of-fit measures based on these lengths. -CSPoss displays a list of the possible character-state assignments (MPR-sets) for each tree. -XOUT=internal displays a table of character-states assigned to internal nodes of each tree.

Consensus Trees

PAUP provides four methods for generating consensus trees: strict, semistrict, majority-rule, and Adams.

* strict consensus. A strict consensus tree groups sequences only if that grouping appears in all of the trees. For many sets of trees, this method is too strict. For example, two trees may be identical except for the placement of a single sequence, yet their strict consensus tree might be completely unresolved (a star phylogeny). Strict consensus trees are the easiest type to interpret -- if a group appears in the consensus tree, it appears in all of the trees of the set.

* semistrict consensus. This method is also called "combinable component" consensus. If a particular grouping in one tree is not contradicted by the other trees, it will be retained in the consensus. For example, if all trees have either an (A,B,C) trichotomy or an ((A,B),C) dichotomy, then the consensus will have the group (A,B) since this grouping is not contradicted by the (A,B,C) trichotomy. When there is a conflict in grouping, semistrict consensus behaves like strict consensus.

* majority-rule consensus. This method allows a group to appear in the consensus even if some of the trees in the set contradict it, as long as a (pre-specified) majority of the trees support the grouping. Typically, the majority is considered to be more than 50 percent of the trees in the set. When comparing only two trees, this method is equivalent to the strict consensus method.

* Adams consensus. This method is based on the idea that a tree should be thought of as a "set of leaf subset nestings" rather than as a "set of clusters." A group nests within a larger group if the most recent common ancestor of the smaller group is a descendant of the most recent common ancestor of the larger group. Because the Adams method needs to know ancestor-descendant relationships, it can only be used for rooted trees. An advantage of this method is that it often preserves more structure than the strict methods. A drawback is that it may show groups in the consensus tree that do not occur in any of the trees in the set, which makes interpretation of the consensus tree somewhat more difficult. For example, the Adams consensus tree for (A,(((B,E),C),D)) and (A,(((B,D),C),E)) is (A,(B,C),D,E). The (B,C) grouping in this Adams consensus tree means only that B and C are more closely related to each other than either is to A, D, or E. (The strict consensus tree for these two trees would be the completely unresolved tree (A,B,C,D,E).)
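
The strict and majority-rule methods can be sketched in a few lines of Python by treating each rooted tree as a collection of groups. This is an illustration of the definitions above, not PAUP's implementation; the nested-tuple tree representation, the function names, and the three example trees are hypothetical, and the majority is taken to be more than 50 percent.

```python
from collections import Counter

def groups(tree):
    """Collect every group of two or more taxa in a rooted tree,
    leaving out the group that contains all of the taxa."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    found.discard(walk(tree))
    return found

def group_counts(trees):
    """Count on how many trees each group appears."""
    return Counter(g for tree in trees for g in groups(tree))

def strict_groups(trees):
    """Groups that appear on all of the trees."""
    return {g for g, n in group_counts(trees).items() if n == len(trees)}

def majority_groups(trees):
    """Groups that appear on more than 50 percent of the trees."""
    return {g for g, n in group_counts(trees).items() if n > len(trees) / 2}

def show(group_set):
    return sorted("".join(sorted(g)) for g in group_set)

t1 = ("A", ((("B", "C"), "D"), "E"))
t2 = ("A", ((("B", "C"), "E"), "D"))
t3 = ("A", ((("B", "D"), "C"), "E"))

print("strict:  ", show(strict_groups([t1, t2, t3])))
print("majority:", show(majority_groups([t1, t2, t3])))
```

With these three trees, only the group B, C, D, E survives in the strict consensus, while the majority-rule consensus also keeps (B,C) and (B,C,D), each of which appears on two of the three trees. With only two input trees the two functions return the same groups.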

In addition to calculating consensus trees, PAUP can also compute some consensus indices. A consensus index attempts to provide a quantitative measure of the extent to which a set of trees agrees. A number of different indices have been devised. Typically, their values range from 0.0 (no agreement) to 1.0 (maximum possible agreement).

Agreement Subtrees

An agreement subtree (also known as a common pruned tree) differs from a consensus tree in that branches are removed from the set of trees until the reduced trees become topologically equivalent. This can be useful in identifying "unstable" sequences in the data set -- those that appear in different places in different trees.
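
The idea can be illustrated by brute force in Python: remove sequences from every tree, starting with as few removals as possible, until the pruned trees have the same groups. This is only a sketch for small rooted trees and is not the algorithm PAUP uses; the nested-tuple representation, the function names, and the two example trees are hypothetical.

```python
from itertools import combinations

def prune(tree, keep):
    """Remove every taxon not in keep and collapse single-child nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [p for p in (prune(child, keep) for child in tree) if p is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)

def groups(tree):
    """Collect every group of two or more taxa in a rooted tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    walk(tree)
    return found

def largest_agreement_sets(trees, taxa):
    """Return the largest taxon sets on which all pruned trees agree."""
    for size in range(len(taxa), 0, -1):
        hits = [subset for subset in combinations(sorted(taxa), size)
                if len({frozenset(groups(prune(t, set(subset))))
                        for t in trees}) == 1]
        if hits:
            return hits
    return []

# Sequence X sits next to D in one tree and next to A in the other
t1 = ((("A", "B"), "C"), ("D", "X"))
t2 = (((("A", "X"), "B"), "C"), "D")

print(largest_agreement_sets([t1, t2], {"A", "B", "C", "D", "X"}))
```

Here the only four-sequence set on which the two trees agree is A, B, C, D, which singles out X as the unstable sequence.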

GRAPHICS


GCG must be configured for graphics before you run any program with graphics output! If the % setplot command is available in your installation, this is the easiest way to establish your graphics configuration, but you can also use commands like % postscript that correspond to the graphics languages GCG supports. See Section 5, Using Graphics in the User's Guide for more information about configuring your process for graphics.

<CTRL>C


If you need to stop this program, use <Ctrl>C to reset your terminal and session as gracefully as possible. Searches and comparisons write out the results from the part of the search that is complete when you use <Ctrl>C. The graphics device should stop plotting the current page and start plotting the next page. If the current page is the last page, plotters should put the pen away and graphic terminals should return to interactive mode.

COMMAND-LINE SUMMARY


All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional.

Minimal Syntax: % paupdisplay [-INfile=]paup.pauptrees -Default

Prompted Parameters:

[-OUTfile=]paup.paupdisplay  sets primary output filename

-ACTion=dandp                describes and plots input tree(s)

        DEScribe             describes input tree(s) only

        PLOT                 plots input tree(s) only

        TREEDist             shows matrix of tree-to-tree distances

        CONtree              shows consensus tree(s)

        AGRee                shows agreement subtree

-CRITerion=p                 sets parsimony...

           D                      distance (minimum evolution)...

           L                      likelihood...

                             as the optimality criterion for evaluating trees

When CRITerion=p, all parsimony parameters can be used.

When CRITerion=d, all distance parameters can be used.

When CRITerion=l, all likelihood parameters can be used.

Local Data Files:  None

Optional Parameters:

-SCRIPT[=paup.paupscript]    saves the NEXUS file used to run PAUP

  -NORUN                     doesn't perform analysis (just saves the

                               script if one is requested)

                             Describe Trees

Optional Parameters:

-TREElist=2               lists ID numbers of trees to display, or "all"

-OUTGroup=1,5             lists ID numbers of sequences to assign to

                            the outgroup

-ROOT=outgroup            sets how to root unrooted trees prior to output

      midpoint

-OUTRoot=polytomy         sets how to display outgroup in output

         paraphyl

         monophyl

-TCOMPress                compresses the text representation of the trees

-TREEFORMat=cladogram     sets how to display the tree branches

            PHYLogram

-ORDer=standard           sets how to order the tree branches

       right

       left

       alphabet

-NOLABELnode              doesn't label internal nodes with a node number

-NOBRLENs                 doesn't produce table of assigned,

                            minimum-possible, and maximum-possible

                            branch lengths

                             Parsimony Options

Optional Parameters:

-OPTimize=acctran         sets the character-state optimization to use

          deltran

          minf

-XOUT=none                sets how to output table of assigned

      terminal              character-state assignments for each tree

      internal

      both

-NOPATristic              doesn't display patristic distance matrix

-NOHOMOplasy              doesn't display pairwise homoplasy matrix

-FVALue                   displays F-value and F-ratio (goodness-of-fit

                            statistics)

-DIAG                     displays minimum-possible, assigned, and

                            maximum-possible length of each character and

                            goodness-of-fit measures based on these

                            lengths

-CSPoss                   lists possible character-state assignments

                            (MPR-sets) for each tree

                             Likelihood Options

Optional Parameters:

-NST=1                    sets number of substitution types for the

     2                      substitution model

-TRATio=2.0               sets transition (ti) : transversion (tv) ratio

        estimate

-RMATrix=1,1,1,1,1        sets rate matrix (6-parameter model)

         estimate

-VARiant=hky              sets variant for unequal base frequencies

         f84                 (2-parameter model)

-BASEFReq=empirical       sets base frequencies to use

          equal

          0.35,0.25,0.2

-RATes=equal              sets model for rate variation across sites

       gamma

  -SHApe=0.5                sets shape parameter (gamma distribution of rate

                              variation)

  -NCAT=4                   sets number of rate categories for gamma

                              distribution

  -REPRAte=mean             sets the representation of the rate categories

           median             for gamma distribution

-LOGITer                  outputs the iteration log

                             Distance Options

Optional Parameters:

-DISTance=mean            sets distance method to use for prot or nuc...

          total

          p               ...or for nuc only

          abs

          jc

          tajnei

          k2p

          k3p

          f81

          f84

          hky85

          tamnei

          gtr

          logdet

          ml

          custom

When DISTance=ml, all likelihood optional parameters can be used.

  -CLAss=a,a,a,a,a,a        sets substitution classes (when DISTance=custom)

  -BASEFReq=empirical       sets base frequencies to use (when DISTance=custom)

            equal

-MISSDist=infer           sets how to treat nucleotide gaps and ambiguous bases

          ignore

-SUBST=all                calculates distance estimate using all substitutions,

       ti                   transitions only, or

       tv                   transversions only

-RATes=equal              sets the model for the substitution rate

       gamma                variation across sites

  -SHApe=0.5                sets the shape parameter of the gamma

                              distribution equation

-NEGBRlen=setzero         sets how to treat negative branch lengths

          prohibit

          allow

          setabs

-LOGITer                  outputs the iteration log

                             Tree-to-tree Distances

Optional Parameters:

-METric=symdiff           sets metric used to compute tree-to-tree distances

        agd1

        agreement

-FROMtree=2               shows tree-to-tree distances with respect to

                            the single tree with ID number 2

                             Consensus Trees

Optional Parameters:

-TREElist=2               lists ID numbers of trees to consider, or "all"

-OUTGroup=1,5             lists ID numbers of sequences to assign to

                            the outgroup

-ROOT=outgroup            sets how to root unrooted trees prior to output

      midpoint

-OUTRoot=polytomy         sets how to display outgroup in output

         paraphyl

         monophyl

-TCOMPress                compresses the text representation of the trees

-ORDer=standard           sets how to order the tree branches

       right

       left

       alphabet

-NOSTRICT                 doesn't compute a strict consensus tree

-SEMIstrict               computes a semistrict (combinable component)

                            consensus tree

-MAJRULE                  computes a majority rule consensus tree,

                            according to the next two options:

  -CUToff=50                  sets minimum percent of trees on which a group

                                must appear in order to be retained in the

                                consensus tree if they are compatible with

                                the groups already on the tree (applies only

                                if MAJRULE in effect)

  -LE50                       retains groups occurring on fewer than

                                50 percent of the trees in the consensus if they

                                are compatible with the groups already on the

                                tree (applies only if MAJRULE in effect)

-ADAMS                    computes an Adams consensus tree

-INDices                  calculates a variety of consensus indices

                             Agreement Subtrees

Optional Parameters:

-TREElist=2               lists ID numbers of trees to consider, or "all"

-TCOMPress                compresses the text representation of the trees

-FINDAll                  finds all agreement subtrees

-SHOWAll=sets             sets how to display agreement subtrees

         trees

         both

                             Plot Trees

Optional Parameters:

-TREElist=2               lists ID numbers of trees to display, or "all"

-OUTGroup=1,5             lists ID numbers of sequences to assign to

                            the outgroup

-ROOT=outgroup            sets how to root unrooted trees prior to output

      midpoint

-OUTRoot=polytomy         sets how to display outgroup in output

         paraphyl

         monophyl

-TREEFORMat=cladogram     sets how to display the tree branches

            PHYLogram

-ORDer=standard           sets how to order the tree branches

       right

       left

       alphabet

All GCG graphics programs accept these and other switches. See the Using

Graphics section of the USERS GUIDE for descriptions.

-FIGure[=filename]  stores plot in a file for later input to FIGURE

-FONT=3             draws all text on the plot using font 3

-COLor=1            draws entire plot with pen in stall 1

-SCAle=1.2          enlarges the plot by 20 percent (zoom in)

-XPAN=10.0          moves plot to the right 10 platen units (pan right)

-YPAN=10.0          moves plot up 10 platen units (pan up)

-PORtrait           rotates plot 90 degrees

LOCAL DATA FILES

[ Previous | Top | Next ]

None.

PARAMETER REFERENCE

[ Previous | Top ]

You can set the parameters listed below from the command line.

-ACTion=DANDp

Indicates which type of analysis is to be performed: describe and plot trees (DANDp), describe trees only (DEScribe), plot trees only (PLOT), compute a matrix of tree-to-tree distances (TREEDist), compute consensus tree(s) (CONtree), compute agreement tree(s) (AGRee).

-CRITerion=P

Sets the criterion to be used to evaluate trees: parsimony (P), distance (D), or likelihood (L).

-BEGin=1

Sets the beginning position for all input sequences. When the beginning position is set from the command line, the program ignores beginning positions specified for individual sequences in a list file.

-END=100

Sets the ending position for all input sequences. When the ending position is set from the command line, the program ignores ending positions specified for individual sequences in a list file.

-SCRIPT=paup.paupscript

Saves the NEXUS file that is used as a script to run PAUP. This file can be used for documentation purposes or can be edited and used as input to the PAUP program.

-NORUN

Doesn't perform the analysis. This is used in conjunction with -SCRIPT when you want to create a script file and exit without performing the analysis.
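
The way -SCRIPT and -NORUN combine can be sketched in a few lines of Python. This is only an illustration: the helper name build_qualifiers is hypothetical, and the code formats the two parameters described above without running any program.

```python
def build_qualifiers(script_file=None, run=True):
    """Return the -SCRIPT and -NORUN qualifiers as a list of strings.

    Hypothetical helper for illustration only: it formats the two
    parameters and does not invoke PAUPDisplay or PAUP.
    """
    qualifiers = []
    if script_file is not None:
        # Keep the NEXUS script instead of letting it be deleted.
        qualifiers.append(f"-SCRIPT={script_file}")
    if not run:
        # Exit after writing the script, without performing the analysis.
        qualifiers.append("-NORUN")
    return qualifiers


print(" ".join(build_qualifiers("paup.paupscript", run=False)))
# -SCRIPT=paup.paupscript -NORUN
```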

Describe Trees

-TREElist=2

Lists the ID numbers of specific trees to display, or all trees. (For example: -TREElist=1,5-10,23 or -TREElist=all).
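
To make the list syntax concrete, the Python sketch below expands a -TREElist value into individual tree ID numbers. The function name expand_tree_list and the n_trees argument (the number of trees in the file, needed to interpret "all") are assumptions made for this example; the code does not reproduce how PAUPDisplay itself parses the value.

```python
def expand_tree_list(spec, n_trees):
    """Expand a -TREElist value such as "1,5-10,23" or "all" into tree IDs."""
    if spec.strip().lower() == "all":
        return list(range(1, n_trees + 1))
    ids = []
    for item in spec.split(","):
        item = item.strip()
        if "-" in item:
            # A range such as 5-10 includes both end points.
            first, last = (int(x) for x in item.split("-"))
            ids.extend(range(first, last + 1))
        else:
            ids.append(int(item))
    return ids


print(expand_tree_list("1,5-10,23", n_trees=25))
# [1, 5, 6, 7, 8, 9, 10, 23]
```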

-OUTGroup=1,5

Lists the ID numbers of sequences to assign to the outgroup.

-ROOT=outgroup

Sets the rooting method used for the output trees. The default is outgroup, which uses the previously designated sequence(s) as the outgroup. midpoint roots the tree at the midpoint of the longest path connecting any pair of sequences.
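
The idea behind midpoint rooting can be sketched as follows: find the two sequences joined by the longest path and place the root halfway along it. The Python below is a minimal illustration that starts from a table of path lengths between sequence pairs; the function name and the sequence names are hypothetical, and the sketch does not locate the branch on which the root falls.

```python
def midpoint_of_longest_path(path_lengths):
    """Return the most distant pair of sequences and half the length of their path.

    path_lengths maps a pair of sequence names to the length of the path
    connecting them on the tree.
    """
    (seq_a, seq_b), longest = max(path_lengths.items(), key=lambda kv: kv[1])
    return seq_a, seq_b, longest / 2.0


# Hypothetical path lengths for three sequences.
path_lengths = {
    ("human", "chimp"): 2.0,
    ("human", "mouse"): 9.0,
    ("chimp", "mouse"): 10.0,
}
print(midpoint_of_longest_path(path_lengths))
# ('chimp', 'mouse', 5.0)
```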

-OUTRoot=polytomy

Sets the outgroup's appearance in the output: the default shows the outgroup as a polytomy next to the ingroup; paraphyl displays the outgroup as paraphyletic relative to the ingroup; and monophyl displays the outgroup as the monophyletic sister group of the ingroup.

-TCOMPress

Compresses the text representation of the trees.

-TREEFORMat=CLADogram

Sets the tree display type. The default is CLADogram, a representation in which the branch lengths are not significant, and all sequence names are written in a column at the end of the tree. PHYLogram draws the branch lengths in proportion to the number of changes along that branch.

-ORDer=standard

Sets the order of the tree branches, within the constraints of the tree's topology: standard displays the branches in the order that the sequences were encountered in the alignment; right and left "ladder" the branches to the right or left according to the number of descendants each node has; and alphabet orders the sequence names alphabetically.

-NOBRLENs

Doesn't produce a table of assigned, minimum-possible, and maximum-possible branch lengths.

Parsimony Parameters for the Describe Trees Command

-OPTimize=acctran

Sets the character-state optimization to use. Characters on the tree(s) are optimized only when character state information is requested (by using -XOUT or -DIAG, for example), so this parameter has no effect on the display of the tree itself. The choices are acctran (accelerated transformation), the default; deltran (delayed transformation); and minf (minimize Farris' F-value (Am. Naturalist 106; 645-668 (1972))).

-XOUT=none

Selects the tables of character-state assignments for each tree that will be output: the default is none (no table is output); internal shows the character states assigned to the internal nodes; terminal shows a listing of the original data matrix; and both shows both of the tables.

-NOPATristic

Doesn't display the patristic distance matrix.

-NOHOMOplasy

Doesn't display the pairwise homoplasy matrix.

-FVALue

Displays F-value and F-ratio (goodness-of-fit statistics) from Farris (Am. Naturalist 106; 645-668 (1972)).

-DIAG

Displays the minimum-possible, assigned, and maximum-possible length of each character and goodness-of-fit measures based on these lengths.
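
As an illustration of how goodness-of-fit measures can be derived from these three lengths, the Python sketch below computes the consistency index (CI), retention index (RI), rescaled consistency index (RC), and homoplasy index (HI) using their conventional definitions. That these are exactly the measures this option reports is an assumption, and the function name is hypothetical.

```python
def fit_measures(min_len, assigned_len, max_len):
    """Compute conventional goodness-of-fit indices from three character lengths.

    An index is returned as None when it is undefined (for example, when the
    assigned length is zero, or when the minimum and maximum lengths are equal).
    """
    ci = min_len / assigned_len if assigned_len else None
    ri = ((max_len - assigned_len) / (max_len - min_len)
          if max_len != min_len else None)
    rc = ci * ri if ci is not None and ri is not None else None
    hi = 1 - ci if ci is not None else None
    return {"CI": ci, "RI": ri, "RC": rc, "HI": hi}


# A character that needs at least 1 change, is assigned 2, and could need 3.
print(fit_measures(min_len=1, assigned_len=2, max_len=3))
# {'CI': 0.5, 'RI': 0.5, 'RC': 0.25, 'HI': 0.5}
```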

-NOLABELnode

Doesn't label internal nodes with a node number.

-CSPoss

Lists possible character-state assignments (most parsimonious reconstruction sets) for each tree.

Likelihood Parameters for the Describe Trees Command

-NST=2

Specifies the number of substitution types for the substitution model. Accepted values are 1, 2 (the default), and 6.

-TRATio=2.0

Sets the transition (ti) : transversion (tv) ratio. The default ratio is 2.0. In addition to setting the ratio yourself, you can ask the program to estimate it from the sequence data by specifying -TRATio=estimate.

-RMATrix=1.0,1.0,1.0,1.0,1.0

Sets the rate matrix when a substitution model with six substitution types is specified. To set the rates yourself, supply a list of five integers or real numbers, separated by commas, after the parameter. These numbers represent the rates for AC, AG, AT, CG, and CT substitutions, respectively. The default is -RMATrix=1,1,1,1,1. In addition to setting the rates yourself, you can ask the program to estimate them by specifying -RMATrix=estimate. When the program estimates the rates, the search will be slower than when you specify the rate matrix.

-VARiant=hky

Sets the variant for unequal base frequencies when a substitution model with two substitution types (-NST=2) is specified. The two values for this parameter are hky (Hasegawa-Kishino-Yano's 1985 model) and f84 (Felsenstein's 1984 method).

-BASEFReq=empirical

Sets the base frequencies to use. You can supply a list of three real numbers, separated by commas, to represent the fraction of the bases that are A, C, and G, respectively (the fraction of bases that are T is calculated from the other three values), for example, -BASEFReq=0.25,0.33,0.3. Alternatively, you can tell the program that the frequencies are equal (all base frequencies will be set to 0.25) or ask it to calculate the base frequencies from the data in the alignment (empirical, the default).
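
The arithmetic for the fourth frequency can be sketched in Python as shown below. The function name is hypothetical, and the validation rules (no negative values, a sum of at most 1) are assumptions for the example rather than documented program behavior.

```python
def complete_base_frequencies(spec):
    """Turn a -BASEFReq value "A,C,G" into all four base frequencies."""
    a, c, g = (float(x) for x in spec.split(","))
    # T is whatever remains after A, C, and G are accounted for.
    t = 1.0 - (a + c + g)
    if min(a, c, g) < 0 or t < -1e-9:
        raise ValueError("frequencies must be non-negative and sum to at most 1")
    return {"A": a, "C": c, "G": g, "T": max(t, 0.0)}


freqs = complete_base_frequencies("0.25,0.33,0.3")
print(round(freqs["T"], 2))
# 0.12
```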

-RATes=equal

Sets the model for the substitution rate variation across sites. The substitution rate can be equal at all sites (the default) or can vary according to the gamma distribution.

-SHApe=0.5

Sets the value of the shape parameter of the gamma distribution equation when -RATes=gamma. The default is 0.5, and the value must be greater than 0.0.

-NCAT=4

Sets the number of rate categories for the discrete gamma distribution when -RATes=gamma. The higher the number of categories, the closer the discrete gamma distribution will conform to the continuous gamma distribution, but at an increasing cost in computer time and memory. The default value of 4 is a good compromise.

-REPRate=mean

Sets how the rate categories for the discrete gamma distribution are represented when -RATes=gamma. The rate categories can be represented by the mean (default) or the median value for that category.
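
The interaction of -SHApe, -NCAT, and -REPRate can be illustrated with a rough sampling approximation. The Python sketch below draws rates from a gamma distribution with mean 1, splits them into equally probable categories, and represents each category by its mean or median. It is not the calculation PAUP performs; the function name, the sample size, and the random seed are assumptions chosen for the example.

```python
import random
import statistics


def discrete_gamma_rates(shape, ncat, reprate="mean", n_samples=100_000, seed=1):
    """Approximate the rate categories of a discrete gamma distribution by sampling."""
    if shape <= 0.0:
        raise ValueError("shape must be greater than 0.0")
    rng = random.Random(seed)
    # gammavariate(alpha, beta) has mean alpha * beta, so beta = 1/alpha gives mean 1.
    samples = sorted(rng.gammavariate(shape, 1.0 / shape) for _ in range(n_samples))
    # Equal-sized slices of the sorted sample stand in for equally probable categories
    # (any remainder after integer division is left out).
    size = n_samples // ncat
    summarize = statistics.fmean if reprate == "mean" else statistics.median
    return [summarize(samples[i * size:(i + 1) * size]) for i in range(ncat)]


mean_rates = discrete_gamma_rates(shape=0.5, ncat=4, reprate="mean")
median_rates = discrete_gamma_rates(shape=0.5, ncat=4, reprate="median")
print([round(r, 2) for r in mean_rates])
print([round(r, 2) for r in median_rates])
```

With a small shape value such as 0.5, most categories have rates well below 1 and the last category has a much higher rate, which reflects strong rate variation across sites.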

-LOGITer

Displays the iteration log.

Distance Parameters for the Describe Trees Command

-DISTance=p

Sets the distance correction method to use. The default is the p distance for nucleic acid sequences and mean distance for protein sequences. The methods that can be used for both types of sequences are total distance and mean distance. The following methods can be used only with nucleic acid sequences: p distance (uncorrected distance), abs (absolute distance, not normalized by the number of sites), jc (Jukes-Cantor), tajnei (Tajima-Nei), k2p (Kimura 2-parameter), f81 (Felsenstein 1981), f84 (Felsenstein 1984), hky85 (Hasegawa-Kishino-Yano, 1985), k3p (Kimura 3-parameter), tamnei (Tamura-Nei), gtr (general time-reversible), logdet (log determinant), ml (maximum likelihood distance), and a custom distance, which allows you to design your own distance correction method by means of the -CLAss and -BASEFReq parameters. When -DISTance=ml, any of the likelihood optional parameters can be used.

-CLAss=a,a,a,a,a,a

Is used when -DISTance=custom to specify which of the six possible substitution types fall into the same class. The order of the substitutions in the parameter list is AC, AG, AT, CG, CT, GT. Classes are designated by the letters a through f. In the default case shown above, all six substitution types are assigned to the same class. To assign transitions (AG and CT) and transversions (AC, AT, CG, GT) to be in two separate classes, you would specify -CLAss=a,b,a,a,b,a.
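
The mapping from class letters to substitution types can be checked with a short Python sketch. The function name is hypothetical; the substitution order is the one given above.

```python
SUBSTITUTIONS = ("AC", "AG", "AT", "CG", "CT", "GT")


def group_by_class(spec):
    """Group the six substitution types by the class letters in a -CLAss value."""
    letters = [x.strip().lower() for x in spec.split(",")]
    if len(letters) != len(SUBSTITUTIONS):
        raise ValueError("expected six class letters")
    if any(letter not in "abcdef" or len(letter) != 1 for letter in letters):
        raise ValueError("classes are designated by the letters a through f")
    groups = {}
    for substitution, letter in zip(SUBSTITUTIONS, letters):
        groups.setdefault(letter, []).append(substitution)
    return groups


print(group_by_class("a,b,a,a,b,a"))
# {'a': ['AC', 'AT', 'CG', 'GT'], 'b': ['AG', 'CT']}
```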

-BASEFReq=empirical

Is used when -DISTance=custom to specify whether the base frequencies in the sequence alignment should be considered equal or if they should be calculated from the data (empirical). The default is empirical.

-MISSDist=infer

Specifies how to treat gaps and ambiguous bases in an alignment when computing distances between nucleic acid sequences. When -MISSDist=ignore, the program does not take any sites containing gaps or ambiguous bases into account when computing the pairwise distances between sequences. The default value, infer, directs the program to guess which nucleotide a gap or ambiguous base represents based on the composition of the sequence data. When both nucleotides of a sequence pair are maximally ambiguous at a site (gap or N) the site is ignored even if infer is specified.

-SUBST=all

Estimates distances based on all substitutions (all), transitions only (ti) or transversions only (tv). This parameter is ignored when -DISTance=logdet.

-RATes=equal

Sets the model for the substitution rate variation across sites. The substitution rate can be equal at all sites (the default) or can vary according to the gamma distribution.

-SHApe=0.5

Sets the value of the shape parameter of the gamma distribution equation when -RATes=gamma. The default is 0.5, and the value must be greater than 0.0.

-NEGBRlen=setzero

Sets how negative branch lengths are treated if they occur in a tree. You can allow negative branch lengths, or you can specify one of the following: prohibit (branch lengths are optimized under the constraint that they be nonnegative), setzero (sets the negative branch length to 0.0 without affecting any of the other branch lengths), setabs (resets the negative branch length to its absolute value without affecting any of the other branch lengths).
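
The difference between setzero and setabs can be seen in a small Python sketch operating on a list of branch lengths. The function name is hypothetical, and the keyword "allow" for leaving negative lengths untouched is an assumption made for this example. The prohibit choice is not covered because it requires re-optimizing the branch lengths.

```python
def treat_negative_branches(lengths, mode="setzero"):
    """Apply a simple negative-branch-length treatment to a list of lengths."""
    if mode == "setzero":
        # Negative lengths become 0.0; other lengths are unchanged.
        return [max(x, 0.0) for x in lengths]
    if mode == "setabs":
        # Negative lengths are replaced by their absolute values.
        return [abs(x) for x in lengths]
    if mode == "allow":  # assumed keyword, for illustration only
        return list(lengths)
    raise ValueError(f"unsupported mode in this sketch: {mode}")


branch_lengths = [0.10, -0.02, 0.05]
print(treat_negative_branches(branch_lengths, "setzero"))
# [0.1, 0.0, 0.05]
print(treat_negative_branches(branch_lengths, "setabs"))
# [0.1, 0.02, 0.05]
```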

-LOGITer

Displays the iteration log.

Tree-to-Tree Distances

-METric=symdiff

Sets the measure used to quantify the dissimilarity of pairs of trees. The default is the symmetric difference (symdiff). agd1 is the "d1" distance of Goddard, et al. It indicates the number of sequences that must be pruned to make the two trees identical. agreement is the "d" distance of Goddard, et al. This metric incorporates information on the distance between the pruned sequences on the original trees.
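
A symmetric-difference measure can be sketched as a count of the groups that appear on one tree but not on the other. The Python below is a minimal illustration that represents each tree as a set of groups of sequence names; the function name and the example trees are hypothetical, and treating rooted groups rather than the splits of an unrooted tree is a simplifying assumption.

```python
def symmetric_difference(groups_a, groups_b):
    """Count the groups found on exactly one of two trees."""
    return len(set(groups_a) ^ set(groups_b))


# Each tree is described by the groups of sequences (A, B, C, D) it contains.
tree1 = {frozenset("AB"), frozenset("ABC")}
tree2 = {frozenset("AB"), frozenset("ABD")}
print(symmetric_difference(tree1, tree2))
# 2
```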

-FROMtree=1

Displays the distance between a specific tree and each of the others. Otherwise all possible pairwise tree-to-tree distances are displayed (-FROMtree=0, the default).

Consensus Trees

-TREElist=2

Lists the ID numbers of specific trees to display, or all trees. (For example: -TREElist=1,5-10,23 or -TREElist=all).

-OUTGroup=1,5

Lists the ID numbers of sequences to assign to the outgroup.

-ROOT=outgroup

Sets the rooting method used for the output trees. The default is outgroup, using the sequence(s) you've designated as the outgroup. midpoint roots the tree at the midpoint of the longest path connecting any pair of sequences.

-OUTRoot=polytomy

-TCOMPress

Compresses the text representation of the trees.

-ORDer=standard

-NOSTRICT

Doesn't compute a strict consensus tree.

-SEMIstrict

Computes a semistrict (combinable component) consensus tree.

-MAJRULE

Computes a majority rule consensus tree, according to the next two parameters (-CUToff and -LE50):

-CUToff=50

Sets the minimum percent of trees on which a group must appear in order to be retained in the consensus tree, if it is compatible with the groups already on the tree.

-LE50

Retains groups that occur on fewer than 50 percent of the trees in the consensus if they are compatible with the groups already on the tree.
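
The effect of -CUToff and -LE50 can be sketched in Python as shown below. Each input tree is represented by the set of groups it contains; groups are considered from most to least frequent, and two groups are treated as compatible when one contains the other or they share no sequences. The function names, the example trees, the strict greater-than comparison against the cutoff, and the alphabetical tie-breaking are all assumptions made for this example, not a description of how PAUP builds the consensus.

```python
from collections import Counter


def compatible(group, other):
    """Two groups are compatible if one contains the other or they do not overlap."""
    return group <= other or other <= group or not (group & other)


def majority_rule_groups(trees, cutoff=50, le50=False):
    """Return the groups retained in a majority rule consensus (illustrative only)."""
    counts = Counter(group for tree in trees for group in tree)
    # Most frequent groups first; ties are broken alphabetically for repeatability.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    kept = []
    for group, count in ranked:
        percent = 100.0 * count / len(trees)
        fits = all(compatible(group, k) for k in kept)
        if fits and (percent > cutoff or le50):
            kept.append(group)
    return kept


# Three hypothetical trees for sequences A through E.
trees = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("CD")},
    {frozenset("AB"), frozenset("DE")},
]
print(["".join(sorted(g)) for g in majority_rule_groups(trees)])
# ['AB']
print(["".join(sorted(g)) for g in majority_rule_groups(trees, le50=True)])
# ['AB', 'ABC', 'DE']
```

In this example the group CD is left out even with le50=True because it overlaps the group ABC, which was already added to the consensus.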

-ADAMS

Computes an Adams consensus tree (Syst. Zoology 21; 390-397 (1972)).

-INDices

Calculates a variety of consensus indices.

Agreement Subtrees

-TREElist=2

Lists the ID numbers of specific trees to display, or all trees. (For example: -TREElist=1,5-10,23 or -TREElist=all).

-FINDAll

Finds all agreement subtrees.

-SHOWAll=sets

Shows agreement subtrees as sets, trees, both, or no (doesn't show them at all).

-TCOMPress

Compresses the text representation of the trees.

Plot Trees

-TREElist=2

Lists the ID numbers of specific trees to display, or all trees. (For example: -TREElist=1,5-10,23 or -TREElist=all).

-OUTGroup=1,5

Lists the ID numbers of sequences to assign to the outgroup.

-ROOT=outgroup

Sets the rooting method used for the output trees. The default is outgroup, using as the outgroup either the first sequence listed in the alignment or the sequence(s) you've designated using -OUTGroup. midpoint roots the tree at the midpoint of the longest path connecting any pair of sequences.

-OUTRoot=polytomy

-TREEFORMat=CLADogram

-ORDer=standard

The parameters below apply to all GCG graphics programs. These and many others are described in detail in Section 5, Using Graphics, of the User's Guide.

-FIGure=programname.figure

Writes the plot as a text file of plotting instructions suitable for input to the Figure program instead of sending it to the device specified in your graphics configuration.

-FONT=3

Draws all text characters on the plot using Font 3 (see Appendix I).

-COLor=1

Draws the entire plot with the pen in stall 1.

The parameters below let you expand or reduce the plot (zoom), move it in either direction (pan), or rotate it 90 degrees (rotate).

-SCAle=1.2

Expands the plot by 20 percent by resetting the scaling factor (normally 1.0) to 1.2 (zoom in). You can expand the axes independently with -XSCAle and -YSCAle. Numbers less than 1.0 contract the plot (zoom out).

-XPAN=30.0

Moves the plot to the right by 30 platen units (pan right).

-YPAN=30.0

Moves the plot up by 30 platen units (pan up).

-PORtrait

Rotates the plot 90 degrees. Usually, plots are displayed with the horizontal axis longer than the vertical (landscape). Note that plots are reduced or enlarged, depending on the platen size, to fill the page.

Printed: May 27, 2005 13:53

Technical Support: support-us@accelrys.com, support-japan@accelrys.com,
or support-eu@accelrys.com

Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.