Short Descriptions

This appendix lists and briefly describes programs in Accelrys GCG (GCG). Programs are grouped by function and may appear under multiple functional headings. For more information on using these programs, see the Program Manual.

 Table notes:

The following explains notations used in the tables.

 “2” These programs generate graphics that require a graphics output device. (Example of usage: DotPlot2)

 “+” These programs are new or enhanced in GCG 11.0. The “+” is part of the program name and is required when executing any of these programs. (Example of usage: ClustalW+)

Comparison

 

Pairwise Comparison

 

 

Gap

Uses the algorithm of Needleman and Wunsch to find the alignment of two complete sequences that maximizes the number of matches and minimizes the number of gaps.
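
The dynamic programming idea behind a global alignment can be sketched in a few lines of Python. This is an illustration only, not the implementation used by Gap: the function name, the simple match/mismatch scores, and the single linear gap penalty are assumptions chosen for brevity, and a production aligner would typically use a scoring matrix and separate gap creation and extension penalties.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the best global alignment score of sequences a and b.

    Hypothetical helper: linear gap penalty, simple match/mismatch scores.
    """
    # Row 0: aligning an empty prefix of a against b costs one gap per symbol.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: prefix of a against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    # The last cell covers both complete sequences (global alignment).
    return prev[-1]
```

Because the score is read from the last cell of the table, both sequences are always aligned over their full length, which is what distinguishes a global alignment from a local one.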

BestFit

Makes an optimal alignment of the best segment of similarity between two sequences. Optimal alignments are found by inserting gaps to maximize the number of matches using the local homology algorithm of Smith and Waterman.
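
The local alignment idea can be sketched as a small variation on the global recurrence: scores are never allowed to fall below zero, and the best cell anywhere in the table is reported. The sketch below is illustrative and is not BestFit's implementation; the function name, the scoring values, and the linear gap penalty are assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of sequences a and b.

    Hypothetical helper: linear gap penalty, simple match/mismatch scores.
    """
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at zero lets an alignment start fresh at any cell.
            score = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

With these assumed scores, two sequences that share only the segment ACGT score 8 (four matches), regardless of the unrelated flanking regions.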

FrameAlign

Creates an optimal alignment of the best segment of similarity (local alignment) between a protein sequence and the codons in all possible reading frames on a single strand of a nucleotide sequence. Optimal alignments may include reading frame shifts.

Compare

Compares two protein or nucleic acid sequences and creates a file of the points of similarity between them for plotting with DotPlot. Compare finds the points using either a window/stringency or a word match criterion. The word comparison is 1,000 times faster than the window/stringency comparison, but somewhat less sensitive.
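
A window/stringency criterion can be sketched as follows. This is a simplified illustration for nucleotide-style identity matching, not the code used by Compare: the function name and the use of a plain match count (rather than scoring-matrix values) are assumptions. A point is recorded wherever a window of the first sequence matches a window of the second in at least "stringency" positions.

```python
def window_matches(a, b, window=5, stringency=4):
    """Return (i, j) start positions (0-based) of windows that meet the stringency.

    Hypothetical helper: counts identical positions within each window pair.
    """
    points = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(1 for k in range(window) if a[i + k] == b[j + k])
            if matches >= stringency:
                points.append((i, j))
    return points
```

Every window of one sequence is tested against every window of the other, so the running time grows with the product of the sequence lengths and the window size. That cost is the reason a word-based comparison is so much faster.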

DotPlot2

Makes a dot-plot with the output file from Compare or StemLoop.

GapShow2

Displays an alignment by making a graph that shows the distribution of similarities and gaps. The two input sequences should be aligned with either Gap or BestFit before they are given to GapShow for display.

ProfileGap

Makes an optimal alignment between a profile and one or more sequences.

 

Multiple Comparison

 

ClustalW+

Creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments. It can also plot a tree showing the clustering relationships used to create the alignment.

PileUp2

Creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments. It can also plot a tree showing the clustering relationships used to create the alignment.

SeqLab

Is the graphical user interface for GCG. For additional information, refer to the SeqLab Guide.

PlotSimilarity2

Plots the running average of the similarity among the sequences in a multiple sequence alignment.

Pretty

Displays multiple sequence alignments and calculates a consensus sequence. It does not create the alignment; it simply displays it.

PrettyBox2

Displays multiple sequence alignments as shaded boxes in Postscript format for printing or displaying with a Postscript-compatible device. PrettyBox optionally calculates a consensus sequence. The program does not create the alignment; it simply displays it.

MEME

(Multiple EM for Motif Elicitation) Finds conserved motifs in a group of unaligned sequences. MEME saves these motifs as a set of profiles. You can search a database of sequences with these profiles using the MotifSearch program.

MEME+

(Multiple EM for Motif Elicitation) Finds conserved motifs in a group of unaligned sequences. MEME+ saves these motifs as a set of profiles. You can search a database of sequences with these profiles using the MotifSearch program.

ProfileMake

Creates a position-specific scoring table, called a profile, that quantitatively represents the information from a group of aligned sequences. The profile can then be used for database searching (ProfileSearch) or sequence alignment (ProfileGap).

ProfileGap

Makes an optimal alignment between a profile and one or more sequences.

HmmerAlign

Uses a profile hidden Markov model (HMM) as a template to create an optimal multiple alignment of a group of sequences.

Overlap

Compares two sets of DNA sequences to each other in both orientations using a WordSearch style comparison.

NoOverlap

Identifies the places where a group of nucleotide sequences do not share any common subsequences.

OldDistances

Makes a table of the pairwise similarities within a group of aligned sequences.

HmmerBuild

 

HmmerBuild creates a position-specific scoring table, called a profile hidden Markov model (HMM), that is a statistical model of the consensus of a multiple sequence alignment. The profile HMM can be used for database searching (HmmerSearch), sequence alignment (HmmerAlign) or generating random sequences that match the model (HmmerEmit).

 

HmmerCalibrate

 

HmmerCalibrate “calibrates” a profile hidden Markov model in order to increase the sensitivity of database searches performed using that profile HMM as a query. The program compares the original profile HMM with a large number of randomly generated sequences and computes the extreme value distribution (EVD) parameters for this simulated search. The original profile HMM is replaced with a new one that contains these EVD parameters.

Database Searching

 

Reference Searching

 

LookUp

Identifies sequence database entries by name, accession number, author, organism, keyword, title, reference, feature, definition, length, or date. The output is a list of sequences.

StringSearch

Identifies sequences by searching for character patterns such as "globin" or "human" in the sequence documentation.

Names

Identifies GCG data files and sequence entries by name. It can show you what set of sequences is implied by any sequence specification.

 

Sequence Searching

 

BLAST

Searches one or more nucleic acid or protein databases for sequences similar to one or more query sequences of any type. BLAST can produce gapped alignments for the matches it finds.

BLAST+

Searches one or more nucleic acid or protein databases for sequences similar to one or more query sequences of any type. BLAST+ can produce gapped alignments for the matches it finds.

NetBLAST

Searches for sequences similar to a query sequence. The query and the database searched can be either peptide or nucleic acid in any combination. NetBLAST can search only databases maintained at the National Center for Biotechnology Information (NCBI) in Bethesda, Maryland, USA.

NetBLAST+

Searches for sequences similar to a query sequence. The query and the database searched can be either peptide or nucleic acid in any combination. NetBLAST+ can search only databases maintained at the National Center for Biotechnology Information (NCBI) in Bethesda, Maryland, USA.

PSIBLAST

Iteratively searches one or more protein databases for sequences similar to one or more protein query sequences. PSIBLAST is similar to BLAST except that it uses position-specific scoring matrices derived during the search.

FastA

Does a Pearson and Lipman search for similarity between a query sequence and a group of sequences of the same type (nucleic acid or protein). For nucleotide searches, FastA may be more sensitive than BLAST.

FastA+

Does a Pearson and Lipman search for similarity between a query sequence and a group of sequences of the same type (nucleic acid or protein). For nucleotide searches, FastA+ may be more sensitive than BLAST.

SSearch

Does a rigorous Smith-Waterman search for similarity between a query sequence and a group of sequences of the same type (nucleic acid or protein). This may be the most sensitive method available for similarity searches. Compared to BLAST and FastA, it can be very slow.

SSearch+

Does a rigorous Smith-Waterman search for similarity between a query sequence and a group of sequences of the same type (nucleic acid or protein). This may be the most sensitive method available for similarity searches. Compared to BLAST and FastA, it can be very slow.

TFastA

Does a Pearson and Lipman search for similarity between a protein query sequence and any group of nucleotide sequences. TFastA translates the nucleotide sequences in all six reading frames before performing the comparison. It is designed to answer the question, "What implied protein sequences in a nucleotide sequence database are similar to my protein sequence?"

TFastA+

Does a Pearson and Lipman search for similarity between a protein query sequence and any group of nucleotide sequences. TFastA+ translates the nucleotide sequences in all six reading frames before performing the comparison. It is designed to answer the question, "What implied protein sequences in a nucleotide sequence database are similar to my protein sequence?"

TFastX

Does a Pearson and Lipman search for similarity between a protein query sequence and any group of nucleotide sequences, taking frameshifts into account. It is designed to be a replacement for TFastA, and like TFastA, it is designed to answer the question, "What implied protein sequences in a nucleotide sequence database are similar to my protein sequence?" TFastA treats each of the six reading frames of a nucleotide sequence as a separate sequence, resulting in three separate alignments for each strand. TFastX, on the other hand, compares the protein query sequence to only one translated protein per strand of the nucleotide sequence, resulting in one alignment per strand. It calculates a similarity score for alignments that takes frameshifts into account, allowing it to "join" short regions separated by frameshifts into a single long alignment. TFastX may alert you to more meaningful hits than TFastA does when the nucleotide sequences contain frameshift errors.

TFastX+

Does a Pearson and Lipman search for similarity between a protein query sequence and any group of nucleotide sequences, taking frameshifts into account. It is designed to be a replacement for TFastA+, and like TFastA+, it is designed to answer the question, "What implied protein sequences in a nucleotide sequence database are similar to my protein sequence?" TFastA+ treats each of the six reading frames of a nucleotide sequence as a separate sequence, resulting in three separate alignments for each strand. TFastX+, on the other hand, compares the protein query sequence to only one translated protein per strand of the nucleotide sequence, resulting in one alignment per strand. It calculates a similarity score for alignments that takes frameshifts into account, allowing it to "join" short regions separated by frameshifts into a single long alignment. TFastX+ may alert you to more meaningful hits than TFastA+ does when the nucleotide sequences contain frameshift errors.

FastX

Does a Pearson and Lipman search for similarity between a nucleotide query sequence and a group of protein sequences, taking frameshifts into account. FastX translates both strands of the nucleic acid sequence before performing the comparison. It is designed to answer the question, "What implied protein sequences in my nucleic acid sequence are similar to sequences in a protein database?"

FastX+

Does a Pearson and Lipman search for similarity between a nucleotide query sequence and a group of protein sequences, taking frameshifts into account. FastX+ translates both strands of the nucleic acid sequence before performing the comparison. It is designed to answer the question, "What implied protein sequences in my nucleic acid sequence are similar to sequences in a protein database?"

FrameSearch2

Searches a group of protein sequences for similarity to one or more nucleotide query sequences, or searches a group of nucleotide sequences for similarity to one or more protein query sequences. For each sequence comparison, the program finds an optimal alignment between the protein sequence and all possible codons on each strand of the nucleotide sequence. Optimal alignments may include reading frame shifts.

HmmerSearch

Uses a profile hidden Markov model as a query to search a sequence database to find sequences similar to the family from which the profile HMM was built. Profile HMMs can be created using HmmerBuild.

MotifSearch

Uses a set of profiles (representing similarities within a family of sequences) as a query to either a) search a database for new sequences similar to the original family, or b) annotate the members of the original family with details of the matches between the profiles and each of the members. Normally, the profiles are created with the program MEME.

ProfileSearch

Uses a profile (representing a group of aligned sequences) as a query to search the database for new sequences with similarity to the group. The profile is created with the program ProfileMake.

ProfileSegments

Makes optimal alignments showing the segments of similarity found by ProfileSearch.

FindPatterns

Identifies sequences that contain short patterns like GAATTC or YRYRYRYR. You can define the patterns ambiguously and allow mismatches. You can provide the patterns in a file or simply type them in from the terminal.
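
Ambiguous pattern matching with a mismatch allowance can be sketched as below. This is not the code used by FindPatterns and does not reproduce its pattern syntax: the function name, the 0-based positions, and the restriction to single-letter IUPAC nucleotide codes are assumptions made for illustration.

```python
# IUPAC nucleotide ambiguity codes mapped to the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_pattern(seq, pattern, max_mismatches=0):
    """Return 0-based start positions where pattern matches seq.

    Hypothetical helper: a position is a mismatch when the sequence base
    is not among the bases allowed by the pattern symbol.
    """
    seq = seq.upper()
    pattern = pattern.upper()
    hits = []
    for start in range(len(seq) - len(pattern) + 1):
        segment = seq[start:start + len(pattern)]
        mismatches = sum(1 for p, s in zip(pattern, segment) if s not in IUPAC[p])
        if mismatches <= max_mismatches:
            hits.append(start)
    return hits
```

For example, under these assumptions the pattern YRYR matches CGCACGTG at offsets 0, 2, and 4, and GAATTC matches GAATTG only when one mismatch is allowed.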

FindPatterns+

Identifies sequences that contain short patterns like GAATTC or YRYRYRYR. You can define the patterns ambiguously and allow mismatches. You can provide the patterns in a file or simply type them in from the terminal.

Motifs

Looks for sequence motifs by searching through proteins for the patterns defined in the PROSITE Dictionary of Protein Sites and Patterns. Motifs can display an abstract of the current literature on each of the motifs it finds.

HmmerBuild

 

HmmerBuild creates a position-specific scoring table, called a profile hidden Markov model (HMM), that is a statistical model of the consensus of a multiple sequence alignment. The profile HMM can be used for database searching (HmmerSearch), sequence alignment (HmmerAlign) or generating random sequences that match the model (HmmerEmit).

 

HmmerCalibrate

 

HmmerCalibrate “calibrates” a profile hidden Markov model in order to increase the sensitivity of database searches performed using that profile HMM as a query. The program compares the original profile HMM with a large number of randomly generated sequences and computes the extreme value distribution (EVD) parameters for this simulated search. The original profile HMM is replaced with a new one that contains these EVD parameters.

HmmerPfam

Compares one or more sequences to a database of profile hidden Markov models, such as the Pfam library, in order to identify known domains within the sequences.

WordSearch2

Identifies sequences in the database that share large numbers of common words in the same register of comparison with your query sequence. The output of WordSearch can be displayed with Segments.

Segments

Aligns and displays the segments of similarity found by WordSearch.

 

Sequence Retrieval

 

Fetch

Copies GCG sequences or data files from the GCG database into your directory or displays them on your terminal screen.

Fetch+

Copies GCG sequences or data files from the GCG database into your directory or displays them on your terminal screen.

NetFetch

Retrieves entries from NCBI listed in a NetBLAST output file. It can also be used to retrieve entries individually by entry name or accession number. The output of NetFetch is an RSF file.

NetFetch+

Retrieves entries from NCBI listed in a NetBLAST+ output file. It can also be used to retrieve entries individually by entry name or accession number. The output of NetFetch+ is an RSF file.

DNA/RNA Secondary Structure

MFold

Predicts optimal and suboptimal secondary structures for an RNA or DNA molecule using the most recent energy minimization method of Zuker.

PlotFold

Displays the optimal and suboptimal secondary structures for an RNA or DNA molecule predicted by MFold.

StemLoop

Finds stems (inverted repeats) within a sequence. You specify the minimum stem length, minimum and maximum loop sizes, and the minimum number of bonds per stem. All loops or only the best loops can be displayed on your screen or written into a file.

DotPlot2

Makes a dot-plot with the output file from Compare or StemLoop.

Editing and Publication

SeqLab

Is the graphical user interface for GCG. For additional information, refer to the SeqLab Guide.

Assemble

Constructs new sequences from pieces of existing sequences. It concatenates the fragments you specify and writes them out as a new sequence file. SeqEd is a better tool for assembling sequences interactively, but Assemble is best for assembling sequences from fragments defined in a list file.

Pretty

Displays multiple sequence alignments and calculates a consensus sequence. It does not create the alignment; it simply displays it.

PrettyBox2

Displays multiple sequence alignments as shaded boxes in Postscript format for printing or displaying with a Postscript-compatible device. PrettyBox optionally calculates a consensus sequence. The program does not create the alignment; it simply displays it.

PlasmidMap2

Draws a circular plot of a plasmid construct. It can display restriction patterns, inserts, and known genetic elements. The plot is suitable for publication, record keeping, or analysis. It is drawn from one or more labeling files such as those written by MapSort.

Figure2

Makes figures and posters by drawing graphics and text together. You can include output from other GCG graphics programs as part of a figure.

Evolution

PAUPSearch

Provides a GCG interface to the tree-searching options in PAUP (Phylogenetic Analysis Using Parsimony). Starting with a set of aligned sequences, you can search for phylogenetic trees that are optimal according to parsimony, distance, or maximum likelihood criteria; reconstruct a neighbor-joining tree; or perform a bootstrap analysis. The program PAUPDisplay can produce a graphical version of a PAUPSearch trees file. PAUP is the copyrighted property of the Smithsonian Institution. Use the program Fetch to obtain a copy of paup-license.txt to read about rights and limitations for using PAUP.

PAUPDisplay

Provides a GCG interface to tree manipulation, diagnosis, and display options in PAUP (Phylogenetic Analysis Using Parsimony). Starting with a trees file that contains a sequence alignment and one or more trees reconstructed from this alignment (such as the output from PAUPSearch), you can plot the tree(s); compute the score of the tree(s) according to the criteria of parsimony, distance, or maximum likelihood; or calculate a consensus tree (two or more input trees). PAUPDisplay can also plot the trees from a GrowTree trees file. PAUP is the copyrighted property of the Smithsonian Institution. Use the program Fetch to obtain a copy of paup-license.txt to read about rights and limitations for using PAUP.

Distances

Creates a table of the pairwise distances within a group of aligned sequences.
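
As an illustration of what a pairwise distance can mean, the sketch below computes the uncorrected proportion of differing sites between two aligned sequences and the Jukes-Cantor correction for nucleotide data. It is not the code used by Distances, and it is not meant to imply which methods that program offers: the function names and the treatment of "-" and "." as gap characters are assumptions.

```python
import math


def p_distance(a, b, gap_chars="-."):
    """Proportion of differing sites between two aligned sequences.

    Hypothetical helper: columns containing a gap in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = differences = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in gap_chars or y in gap_chars:
            continue
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differences."""
    if not 0 <= p < 0.75:
        raise ValueError("p must be in [0, 0.75) for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)
```

Applying either function to every pair in an alignment yields a symmetric table of the kind a tree-building method can take as input.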

GrowTree

Creates a phylogenetic tree from a distance matrix created by Distances using either the UPGMA or neighbor-joining method. You can create a text or graphics output file.

Diverge

Estimates the pairwise number of synonymous and nonsynonymous substitutions per site between two or more aligned nucleic acid sequences that code for proteins. It uses a variant of the method published by Li et al.

Fragment Assembly

SeqMerge

SeqMerge is GCG’s powerful new fragment assembly application with an X Windows graphical user interface. SeqMerge allows you to intuitively assemble fragments in a sequencing project into contigs, or alignments of overlapping fragments. From the contig, SeqMerge creates a consensus sequence representing the underlying sequence from which your fragments were derived.

Gene Finding and Pattern Recognition

TestCode

Helps you identify protein coding sequences by plotting a measure of the non-randomness of the composition at every third base. The statistic does not require a codon frequency table.

CodonPreference

Is a frame-specific gene finder that tries to recognize protein coding sequences by virtue of the similarity of their codon usage to a codon frequency table or by the bias of their composition (usually GC) in the third position of each codon.

Frames

Shows open reading frames for the six translation frames of a DNA sequence. Frames can superimpose the pattern of rare codon choices if you provide it with a codon frequency table.

Terminator

Searches for prokaryotic factor-independent RNA polymerase terminators according to the method of Brendel and Trifonov.

Motifs

Looks for sequence motifs by searching through proteins for the patterns defined in the PROSITE Dictionary of Protein Sites and Patterns. Motifs can display an abstract of the current literature on each of the motifs it finds.

MEME

(Multiple EM for Motif Elicitation) Finds conserved motifs in a group of unaligned sequences. MEME saves these motifs as a set of profiles. You can search a database of sequences with these profiles using the MotifSearch program.

MEME+

(Multiple EM for Motif Elicitation) Finds conserved motifs in a group of unaligned sequences. MEME+ saves these motifs as a set of profiles. You can search a database of sequences with these profiles using the MotifSearch program.

Repeat

Finds direct repeats in sequences. You must set the size, stringency, and range within which the repeat must occur; all the repeats of that size or greater are displayed as short alignments.

FindPatterns

Identifies sequences that contain short patterns like GAATTC or YRYRYRYR. You can define the patterns ambiguously and allow mismatches. You can provide the patterns in a file or simply type them in from the terminal.

FindPatterns+

Identifies sequences that contain short patterns like GAATTC or YRYRYRYR. You can define the patterns ambiguously and allow mismatches. You can provide the patterns in a file or simply type them in from the terminal.

Composition

Determines the composition of sequence(s). For nucleotide sequence(s), Composition also determines dinucleotide and trinucleotide content.
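
Counting mononucleotide, dinucleotide, and trinucleotide content amounts to counting overlapping words of length 1, 2, and 3. The sketch below shows the idea; it is not the code used by Composition, and the function name and the choice to count overlapping words are assumptions.

```python
from collections import Counter


def composition(seq, k=1):
    """Count every overlapping word of length k in seq (case-insensitive).

    Hypothetical helper: k=1 gives base composition, k=2 dinucleotides,
    k=3 trinucleotides.
    """
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```

Calling composition("AACG") counts two A, one C, and one G, while composition("AACG", 2) counts the words AA, AC, and CG once each.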

CodonFrequency

Tabulates codon usage from sequences and/or existing codon usage tables. The output file is correctly formatted for input to the CodonPreference, Correspond, and Frames programs.

Correspond

Looks for similar patterns of codon usage by comparing codon frequency tables.

Window

Makes a table of the frequencies of different sequence patterns within a window as it is moved along a sequence. A pattern is any short sequence like GC or R or ATG. You can plot the output with the program StatPlot.

StatPlot2

Plots a set of parallel curves from a table of numbers like the table written by the Window program. The statistics in each column of the table are associated with a position in the analyzed sequence.

FitConsensus

Uses a consensus table written by Consensus as a probe to find the best examples of the consensus in a DNA sequence. You can specify the number of fits you want to see, and FitConsensus tabulates them with their position, frame, and a statistical measure of their quality.

Consensus

Calculates a consensus sequence for a set of pre-aligned short nucleic acid sequences by tabulating the percent of G, A, T, and C for each position in the set. FitConsensus uses the Consensus output table as a probe to search for the best examples of the derived consensus in other nucleotide sequences.
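
The tabulation described above can be sketched as follows. This is an illustration, not the code or the output format of Consensus: the function names, the list-of-dictionaries table, and the tie-breaking rule are assumptions.

```python
def consensus_table(aligned):
    """Return, for each position, the percent of G, A, T, and C.

    Hypothetical helper: aligned is a list of equal-length nucleotide strings.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    table = []
    for column in zip(*aligned):
        column = [base.upper() for base in column]
        table.append({b: 100.0 * column.count(b) / len(column) for b in "GATC"})
    return table


def consensus_sequence(table):
    """Pick the most frequent base per position (ties go to the first of G, A, T, C)."""
    return "".join(max("GATC", key=column.get) for column in table)
```

For the four aligned sequences ACGT, ACGA, ACTT, and AGGT, the second position is 75 percent C and 25 percent G, and the derived consensus is ACGT.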

Xnu

Replaces statistically significant tandem repeats in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.

Seg

Replaces low complexity regions in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.

FromTrace

 

FromTrace converts one or more ABI or SCF trace files into GCG single sequence files.

Importing / Exporting

SeqConv+

SeqConv+ is a utility program that provides batch conversions between different sequence formats. It is intended to let an end user convert between file formats in order to import data into Accelrys’ bioinformatics applications. In addition, the converter allows the user to convert our internally used formats (e.g. BSML, RSF) into formats more commonly accepted by third-party tools. The supported file formats will include BSML, GenBank, FastA, and RSF.

Reformat

Rewrites sequence file(s), scoring matrix file(s), or enzyme data file(s) so that they can be read by GCG programs.

BreakUp

BreakUp reads a GCG-format sequence file containing more than 350,000 sequence characters and writes it as a set of separate, shorter, overlapping sequence files that can be analyzed by GCG programs.
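
Splitting a long sequence into overlapping pieces can be sketched as below. This is not the code used by BreakUp and does not read or write GCG-format files: the function name and both parameters are assumptions, and the sketch works on a plain string.

```python
def break_up(seq, segment_length, overlap):
    """Split seq into consecutive segments that share `overlap` characters.

    Hypothetical helper: the final segment may be shorter than segment_length.
    """
    if not 0 <= overlap < segment_length:
        raise ValueError("overlap must be >= 0 and smaller than segment_length")
    step = segment_length - overlap
    segments = []
    start = 0
    while True:
        segments.append(seq[start:start + segment_length])
        if start + segment_length >= len(seq):
            break  # this segment reached the end of the sequence
        start += step
    return segments
```

The overlap ensures that a feature spanning a cut point still appears intact in at least one segment, provided the feature is no longer than the overlap.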

HmmerConvert

HmmerConvert converts profile hidden Markov model files into different profile formats.

 

Mapping

Map

Maps a DNA sequence and displays both strands of the mapped sequence with restriction enzyme cut points above the sequence and protein translations below. Map can also create a peptide map of an amino acid sequence.

Map+

Maps a DNA sequence and displays both strands of the mapped sequence with restriction enzyme cut points above the sequence and protein translations below. Map+ can also create a peptide map of an amino acid sequence.

MapPlot2

Displays restriction sites graphically. If you don't have a plotter, MapPlot can write a text file that approximates the graph.

MapSort

Finds the coordinates of the restriction enzyme cuts in a DNA sequence and sorts the fragments of the resulting digest by size. MapSort can sort the fragments from single or multiple enzyme digests.

Fingerprint

Identifies the products of T1 ribonuclease digestion.

PeptideMap

Creates a peptide map of an amino acid sequence.

PlasmidMap2

Draws a circular plot of a plasmid construct. It can display restriction patterns, inserts, and known genetic elements. The plot is suitable for publication, record keeping, or analysis. It is drawn from one or more labeling files such as those written by MapSort.

PeptideSort

Shows the peptide fragments from a digest of an amino acid sequence. It sorts the peptides by weight, position, and HPLC retention at pH 2.1, and shows the composition of each peptide. It also prints a summary of the composition of the whole protein.

Primer Selection

Prime

Selects oligonucleotide primers for a template DNA sequence. The primers may be useful for the polymerase chain reaction (PCR) or for DNA sequencing. You can allow Prime to choose primers from the whole template or limit the choices to a particular set of primers listed in a file.

Prime+

Selects oligonucleotide primers for a template DNA sequence. The primers may be useful for the polymerase chain reaction (PCR) or for DNA sequencing. You can allow Prime+ to choose primers from the whole template or limit the choices to a particular set of primers listed in a file.

PrimePair

Evaluates individual primers to determine their compatibility for use as PCR primer pairs. You can provide the primers in files (one for forward, one for reverse primers) or on the command line, or you can enter them interactively from the keyboard.

MeltTemp

Computes the melting temperature of oligonucleotides. You can provide the oligonucleotide sequences in a file or simply type them in at the keyboard.
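
As a rough illustration of a melting temperature estimate, the sketch below applies the Wallace rule, a rule of thumb for short oligonucleotides: 2 degrees C for each A or T plus 4 degrees C for each G or C. This is a substituted, simpler technique and is not claimed to be the method MeltTemp uses; the function name is an assumption.

```python
def wallace_tm(oligo):
    """Estimate the melting temperature (degrees C) of a short DNA oligo.

    Hypothetical helper using the Wallace rule: Tm = 2*(A+T) + 4*(G+C).
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    if at + gc != len(oligo):
        raise ValueError("oligo must contain only A, C, G, and T")
    return 2 * at + 4 * gc
```

The rule ignores salt concentration, oligonucleotide concentration, and base stacking, so it is only suitable for a quick estimate.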

HMMER

HmmerAlign

Uses a profile hidden Markov model (HMM) as a template to create an optimal multiple alignment of a group of sequences.

HmmerBuild

Creates a position-specific scoring table, called a profile hidden Markov model (HMM), that is a statistical model of the consensus of a multiple sequence alignment. The profile HMM can be used for database searching (HmmerSearch), sequence alignment (HmmerAlign) or generating random sequences that match the model (HmmerEmit).

HmmerCalibrate

"Calibrates" a profile hidden Markov model in order to increase the sensitivity of database searches performed using that profile HMM as a query. The program compares the original profile HMM with a large number of randomly generated sequences and computes the extreme value distribution (EVD) parameters for this simulated search. The original profile HMM is replaced with a new one that contains these EVD parameters.

HmmerConvert

Converts profile hidden Markov model files into different profile formats.

HmmerEmit

Generates sequences that match a profile hidden Markov model.

HmmerFetch

Retrieves a profile hidden Markov model (HMM) from a database of profile HMMs that has been indexed by HmmerIndex.

HmmerIndex

Creates an index for a profile hidden Markov model database so that profile HMMs can be retrieved from the database with HmmerFetch.

HmmerPfam

Compares one or more sequences to a database of profile hidden Markov models, such as the Pfam library, in order to identify known domains within the sequences.

HmmerSearch

Uses a profile hidden Markov model as a query to search a sequence database to find sequences similar to the family from which the profile HMM was built. Profile HMMs can be created using HmmerBuild.

Protein Analysis

Motifs

Looks for sequence motifs by searching through proteins for the patterns defined in the PROSITE Dictionary of Protein Sites and Patterns. Motifs can display an abstract of the current literature on each of the motifs it finds.

ProfileScan

Uses a database of profiles to find structural and sequence motifs in protein sequences.

HmmerPfam

Compares one or more sequences to a database of profile hidden Markov models, such as the Pfam library, in order to identify known domains within the sequences.

TransMem

Scans for likely transmembrane helices in one or more input protein sequences.

TransMem+

Scans for likely transmembrane helices in one or more input protein sequences.

CoilScan

Locates coiled-coil segments in protein sequences.

HTHScan

Scans protein sequences for the presence of helix-turn-helix motifs, indicative of sequence-specific DNA-binding structures often associated with gene regulation.

SPScan

Scans protein sequences for the presence of secretory signal peptides (SPs).

CoilScan+

Locates coiled-coil segments in protein sequences.

HTHScan+

Scans protein sequences for the presence of helix-turn-helix motifs, indicative of sequence-specific DNA-binding structures often associated with gene regulation.

SPScan+

Scans protein sequences for the presence of secretory signal peptides (SPs).

PeptideSort

Shows the peptide fragments from a digest of an amino acid sequence. It sorts the peptides by weight, position, and HPLC retention at pH 2.1, and shows the composition of each peptide. It also prints a summary of the composition of the whole protein.

Isoelectric

Plots the charge as a function of pH for any peptide sequence.

PeptideMap

Creates a peptide map of an amino acid sequence.

PepPlot2

Plots measures of protein secondary structure and hydrophobicity in parallel panels of the same plot.

PeptideStructure

Makes secondary structure predictions for a peptide sequence. The predictions include (in addition to alpha, beta, coil, and turn) measures for antigenicity, flexibility, hydrophobicity, and surface probability. PlotStructure displays the predictions graphically.

PlotStructure

Plots the measures of protein secondary structure in the output file from PeptideStructure. The measures can be shown on parallel panels of a graph or with a two-dimensional "squiggly" representation.

Moment

Makes a contour plot of the helical hydrophobic moment of a peptide sequence.

HelicalWheel

Plots a peptide sequence as a helical wheel to help you recognize amphiphilic regions.

Xnu

Replaces statistically significant tandem repeats in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.

Seg

Replaces low complexity regions in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.

Translation

Translate

Translates nucleotide sequences into peptide sequences.
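
The following minimal sketch shows the idea behind translation using the standard genetic code. It is not the Translate program itself. The function name `translate`, its `frame` parameter, and the use of X for unrecognized codons are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides, frame=0):
    """Translate a nucleotide string, starting at offset `frame` (0, 1, or 2)."""
    seq = nucleotides.upper().replace("U", "T")  # accept RNA input as well
    peptide = []
    for i in range(frame, len(seq) - 2, 3):
        # Codons with ambiguous or unknown symbols become X (an assumption).
        peptide.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(peptide)


protein = translate("ATGGCCTAA")  # "MA*", where * marks a stop codon
```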

BackTranslate

Backtranslates an amino acid sequence into a nucleotide sequence. The output helps you recognize minimally ambiguous regions that might be good for constructing synthetic probes.

Map

Maps a DNA sequence and displays both strands of the mapped sequence with restriction enzyme cut points above the sequence and protein translations below. Map can also create a peptide map of an amino acid sequence.

Reverse

Reverses and/or complements a sequence.

DataSet

Creates a GCG data library from any set of sequences in GCG format. To translate nucleotide sequences into peptide sequences, include the ToProt parameter.

Map+

Maps a DNA sequence and displays both strands of the mapped sequence with restriction enzyme cut points above the sequence and protein translations below. Map can also create a peptide map of an amino acid sequence.

DataSet+

Creates a GCG data library from any set of sequences in GCG format.

Utilities

 

Sequence Utilities

 

 

SeqManip+

SeqManip+ is a utility program that performs common sequence manipulations, including translation, back-translation of protein sequences, and splitting of sequences. Individual programs for these tasks already exist in Wisconsin Package 10.3, but SeqManip+ provides a single platform for all of them, so you do not have to find and run several different applications to carry out basic sequence manipulations.

 

SeqStat+

SeqStat+ is a utility program that reads through any number of input sequences and provides basic statistics about them, including total length, number of sequences, and average length. It also provides extended information that depends on the sequence type (protein or nucleotide), such as percent G+C content.
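
The statistics named above can be sketched in a few lines. This is an illustration only, not the SeqStat+ implementation. The function name `sequence_stats` and the dictionary keys are hypothetical, and the input is assumed to be a list of nucleotide strings already read from file.

```python
def sequence_stats(seqs):
    """Return basic statistics for a list of nucleotide sequence strings."""
    lengths = [len(s) for s in seqs]
    total = sum(lengths)
    # Count G and C symbols across all sequences, ignoring case.
    gc = sum(s.upper().count("G") + s.upper().count("C") for s in seqs)
    return {
        "count": len(seqs),
        "total_length": total,
        "average_length": total / len(seqs) if seqs else 0.0,
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


stats = sequence_stats(["ATGC", "GGCC", "AT"])
```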

 

SeqConv+

SeqConv+ is a utility program that provides batch conversion between sequence formats. It lets you convert files for import into Accelrys’ bioinformatics applications, and convert the internally used formats (e.g., BSML, RSF) into formats more commonly accepted by third-party tools. Supported file formats include BSML, GenBank, FastA, and RSF.

Reverse

Reverses and/or complements a sequence.
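
The three operations (reverse, complement, and both together) can be sketched as follows for unambiguous DNA symbols. This is an illustration, not the Reverse program. The function names are hypothetical, and IUPAC ambiguity codes are left unchanged by the complement step.

```python
# Complement table for unambiguous DNA symbols, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse(seq):
    """Return the sequence read in the opposite direction."""
    return seq[::-1]


def complement(seq):
    """Return the complementary strand in the same orientation."""
    return seq.translate(_COMPLEMENT)


def reverse_complement(seq):
    """Return the opposite strand read 5' to 3'."""
    return complement(seq)[::-1]


opposite = reverse_complement("ATGC")  # "GCAT"
```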

Shuffle

Randomizes the order of the symbols in a sequence without changing the composition.
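
A minimal sketch of composition-preserving randomization, assuming a simple uniform shuffle; the Shuffle program may work differently. The function name `shuffle_sequence` and the `seed` parameter are hypothetical.

```python
import random
from collections import Counter


def shuffle_sequence(seq, seed=None):
    """Return the symbols of seq in random order; composition is unchanged."""
    rng = random.Random(seed)  # a seed makes the result reproducible
    symbols = list(seq)
    rng.shuffle(symbols)
    return "".join(symbols)


original = "ACDEFGHIKLACDE"
shuffled = shuffle_sequence(original, seed=1)
# Same symbols in the same quantities, only the order differs.
same_composition = Counter(shuffled) == Counter(original)
```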

Simplify

Lets you reduce the number of symbols in a sequence. Such a simplification would allow you, for instance, to treat all hydrophobic amino acids as equivalent.
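
The idea can be sketched with a lookup table that maps each symbol to the representative of its group. The grouping below is a hypothetical scheme chosen for illustration, not a GCG simplification file, and the function names are assumptions.

```python
# Hypothetical scheme: each key replaces any symbol listed in its value.
HYPOTHETICAL_SCHEME = {
    "I": "AVLIMFW",  # treat these hydrophobic residues as equivalent
    "K": "KRH",      # treat these basic residues as equivalent
    "D": "DE",       # treat these acidic residues as equivalent
}


def build_lookup(scheme):
    """Map every member symbol to the representative of its group."""
    lookup = {}
    for representative, members in scheme.items():
        for symbol in members:
            lookup[symbol] = representative
    return lookup


def simplify(seq, scheme=HYPOTHETICAL_SCHEME):
    """Replace each symbol with its group representative; others pass through."""
    lookup = build_lookup(scheme)
    return "".join(lookup.get(symbol, symbol) for symbol in seq.upper())


reduced = simplify("MVLKRDEG")  # "IIIKKDDG"
```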

CompTable

Creates a scoring matrix using equivalences defined in a simplification scheme such as the one used for Simplify.

HmmerEmit

Generates sequences that match a profile hidden Markov model.

Corrupt

Randomly introduces small numbers of substitutions, insertions, and deletions into nucleotide or protein sequence(s).
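
One way to picture this kind of corruption is a single pass over the sequence in which each position may be deleted, substituted, or followed by an inserted symbol. The sketch below is an assumption about how such a tool could work, not the Corrupt program's algorithm. The function name, the rate parameters, and their default values are hypothetical.

```python
import random


def corrupt(seq, alphabet="ACGT", sub_rate=0.02, ins_rate=0.01,
            del_rate=0.01, seed=None):
    """Return a copy of seq with random substitutions, insertions, deletions."""
    rng = random.Random(seed)
    out = []
    for symbol in seq:
        r = rng.random()
        if r < del_rate:
            continue  # deletion: drop this symbol
        if r < del_rate + sub_rate:
            # Substitution: pick a different symbol from the alphabet.
            symbol = rng.choice([a for a in alphabet if a != symbol])
        out.append(symbol)
        if rng.random() < ins_rate:
            out.append(rng.choice(alphabet))  # insertion after this position
    return "".join(out)


mutated = corrupt("ACGT" * 25, seed=42)
```

For protein sequences, pass the amino acid alphabet instead of the default nucleotide one.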

Xnu

Replaces statistically significant tandem repeats in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.

Seg

Replaces low complexity regions in protein sequences with X characters. If a resulting protein sequence is used as a query for a BLAST search, the regions with X characters are ignored.

Sample

Extracts sequence fragments randomly from sequence(s). You can set a sampling rate to determine how many fragments Sample extracts.

 

Database Utilities

 

DataSet

Creates a GCG data library from any set of sequences in GCG format.

DataSet+

Creates a GCG data library from any set of sequences in GCG format.

Sample

Extracts sequence fragments randomly from sequence(s). You can set a sampling rate to determine how many fragments Sample extracts.

 

Printing / Plotting Utilities

 

StatPlot2

Allows you to choose a plotting configuration from a menu of available graphics devices at your site.

Figure2

Makes figures and posters by drawing graphics and text together. You can include output from other GCG graphics programs as part of a figure.

PlotTest2

Plots an example graphic to test your graphics configuration. The graphic created by PlotTest uses every GCG graphics feature. It should resemble the example graphic in the Program Manual.

 

Miscellaneous Utilities

 

Reformat

Rewrites sequence file(s), scoring matrix file(s), or enzyme data file(s) so that they can be read by GCG programs.

Name

Displays GCG logical name(s) from the GCG logical names table.

Symbol

Displays GCG symbol(s) from the GCG symbol table.




Technical Support: support-us@accelrys.com, support-japan@accelrys.com,
or support-eu@accelrys.com

Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.

Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

www.accelrys.com/bio