SeqConv+ is a utility program that
provides batch conversions between different sequence formats.
Advantages of Plus “+” Programs:
- Plus programs are enhanced to read sequences in a variety of native
formats, such as GCG RSF, GCG SSF, GCG MSF, GenBank, EMBL, FastA,
SwissProt, PIR, and BSML, without conversion.
- Plus programs remove the sequence length restriction of 350,000 bp.
If you do not need these features and prefer more interactivity,
consider running the original version of the program.
SeqConv+ rewrites the sequence files into
any of the standard sequence formats. The following are some of the operations
that SeqConv+ can perform:
- Converting single sequence files that
were prepared or edited with a text editor into GCG format.
- Converting between the multiple sequence (MSF), rich sequence (RSF),
and single sequence (SSF) GCG formats.
- Interconverting standard sequence file formats: GenBank, GenPept,
SwissProt, SP-TrEMBL, and FastA.
- Supporting the BSML (Bio-Sequence Markup Language) format.
- Performing the functions of individual GCG programs such as
FromEMBL, FromGenBank, FromFastA, and ToFastA, which have
been incorporated into SeqConv+.
Here is an example of using SeqConv+ to convert the hemolysin
precursor sequence from RSF format into FastA format.
% SeqConv+
SeqConv+ is a new (batch) sequence
conversion utility. The program can convert one or more sequence files into a
specified format [BSML, GB (GenBank), FASTA, EMBL, SPT (SPTrEMBL),
SW (SwissProt), RSF, SSF (GCG) and MSF]. With multiple files, SeqConv can
either convert each file into a separate file or concatenate them all into one
file.
SeqConv+ of what sequence(s) ? P15320.RSF
Desired output format (* BSML *) ? FASTA
Written 1 sequence from P15320.RSF to P15320.fa
>P15320 Hemolysin precursor.
MKNNNFRLSAAGKLAAALAIILAASAGAYAAEIVAANGANGPGVSTAATGAQVVDIVAPN
GNGLSHNQYQDFNVNQPGAVLNNSREAGLSQLAGQLGANPNLGGREASVILNEVIGRNPS
LLHGQQEIFGMAADYVLANPNGISCQSCGFINTSHSSLVVGNPLVENGVLQGYSTFGNRN
TLSLNGTLNAGGVLDLIAPKIDSRGEVIVQDFKQSNGKVTSAAINAISGLNRVARDGTVQ
ASQQMPTALDSYYLGSMQAGRINIINTAQGSGVKLAGSLNAGDELKVKAYDIRSESRVDD
ASSNKNGGDNYQNYRGGIYVNDRSSSQTLTRTELKGKNISLVADNHAHLTATDIRGEDIT
LQGGKLTLDGQQLKQTQGHTDDRWFYSWQYDVTREREQLQQAGSTVAASGSAKLISTQED
VKLLGANVSADRALSVKAARDVHLAGLVEKDKSSERGYQRNHTSSLRTGRWSNSDESESL
KASELRSEGELTLKAGRNVSTQGAKVHAQRDLTIDADNQIQVGVQKTANAKAVRDDKTSW
GGIGGGDNKNNSNRREISHASELTSGGTLRLNGQQGVTITGSKARGQKGGEVTATHGGLR
IDNALSTTVDKIDARTGTAFNITSSSHKADNSYQSSTASELKSDTNLTLVSHKDADVIGS
QVASGGELSVESKTGNINVKAAERQQNIDEQKTALTVNGYAKEAGDKQYRAGLRIEHTRD
SEKTTRTENSASSLSGGSVKLKAEKDVTFSGSKLVADKGDASVSGNKVSFLAADDKTASN
TEQTKIGGGFYYTGGIDKLGSGVEAGYENNKTQAQSSKAITSGSDVKGNLTINARDKLTQ
QGAQHSVGGAYQENAAGVDHLAAADTASTTTTKTDVGVNIGANVDYSAVTRPVERAVGKA
AKLDATGVINDIGGIGAPNVGLDIGAQGGSSEKRSSSSQAVVSSVQAGSIDINAKGEVRD
QGTQYQASKGAVNLTADSHRSEAAANRQDEQSRDTRGSAGVRVYTTTGSDLTVDAKGEGG
TQRSNSSASQAVTGSIDAANGINVNVKKDAIYQGTALNGGRGKTAVNAGGDIRLDQASDK
QSESRSGFNVKASAKGGFTADSKNFGAGFGGGTHNGESSSSTAQVGNISGQQGVELKAGR
DLTLQGTDVKSQGDVSLSAGNKVALQAAESTQTRKESKLSGNIDLGAGSSDSKEKTGGNL
SAGGAFDIAKVNESATERQGATIASDGKVTLSANGKGDDALHLQGAKVSGGSAALEAKNG
GILLESAKNEQHKDNWSLGIKANAKGGQTFNKDAGGKVDPNTGKDTHTLGAGLKVGVEQQ
DKTTHANTGITAGDVTLNSGKDTRLAGARVDADSVQGKVGGDLHVESRKDVENGVKVDVD
AGLSHSNDPGSSITSKLSKVGTPRYAGKVKEKLEAGVNKVADATTDKYNSVARRLDPQQD
TTGAVSFSKAEGKVTLPATPAGEKPQGPLWDRGARTVGGAVKDSITGPAGRQGHLKVNAD
VVNNNAVGEQSAIAGKNGVALQVGGQTQLTGGEIRSQQGKVELGGSQVSQQDVNGQRYQG
GGRVDAAATVGGLLGGAAKQSVAGNVPFASGHASTQQADAKAGVFSGK
The input to SeqConv+ is one or more
nucleotide or protein sequences. You can specify multiple sequences in a number
of ways: by using a list file, for example @project.list;
by using an MSF or RSF file, for example project.msf{*}; or by using a sequence specification with an asterisk
(*) wildcard, for example GenBank:*.
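The three ways of naming multiple sequences can be told apart by their syntax alone. The following is a minimal Python sketch of that distinction, not part of SeqConv+ itself; the function name `classify_input_spec` and the returned labels are hypothetical and chosen only for illustration.

```python
def classify_input_spec(spec: str) -> str:
    """Label an input specification by its syntax (hypothetical helper).

    The rules mirror the three forms described above; SeqConv+ itself
    may apply further checks that this sketch does not attempt.
    """
    if spec.startswith("@"):
        # A leading "@" names a list file, e.g. @project.list
        return "list file"
    if spec.endswith("{*}"):
        # "{*}" selects every sequence in an MSF or RSF file
        return "all sequences in an MSF or RSF file"
    if "*" in spec:
        # Any other asterisk is treated as a wildcard, e.g. GenBank:*
        return "wildcard specification"
    return "single sequence"


for spec in ("@project.list", "project.msf{*}", "GenBank:*", "P15320.RSF"):
    print(f"{spec}: {classify_input_spec(spec)}")
```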
SeqManip+:
SeqManip+ is a utility program that lets you manipulate sequences,
including translation, back translation of protein sequences, and
splitting of sequences.
Reformat:
Reformat rewrites sequence file(s), scoring matrix file(s), or enzyme data
file(s) so that they can be read by GCG programs.
All parameters for this program may be added to the command line. Use -check to
view the summary below and to specify parameters before the program executes. In
the syntax summary below, square brackets ([ and ]) enclose parameter values that
are optional. For each program parameter, square brackets enclose the type of
parameter value specified, the default parameter value, and the aliases (shortened
forms of the parameter name). Programs with a plus in the name accept either the
full parameter name or a specified alias. If “Type” is “Boolean”, the presence of
the parameter on the command line indicates a true condition. A false condition
must be stated explicitly as parameter=false.
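As an illustration of how full names and aliases relate, here is a small Python sketch that maps any accepted spelling back to the full parameter name. The alias table is copied from the parameter reference on this page; the helper `canonical_name` is hypothetical and is not something SeqConv+ provides.

```python
# Alias table copied from the parameter reference on this page.
# An empty tuple means no alias is documented for that parameter.
ALIASES = {
    "infile": ("infile1", "in"),
    "format": ("fmt",),
    "check": ("che", "help"),
    "default": ("d", "def"),
    "documentation": ("doc",),
    "quiet": ("qui",),
    "outfile": ("out",),
    "concat": (),
    "informat": ("infmt",),
    "force": (),
    "preserveannot": ("annot",),
    "summary": (),
    "breakup": ("extract", "split"),
}

# Reverse lookup: alias -> full parameter name.
_BY_ALIAS = {alias: name for name, aliases in ALIASES.items() for alias in aliases}


def canonical_name(parameter: str) -> str:
    """Return the full parameter name for a full name or an alias."""
    # Parameter names are documented as case-insensitive.
    key = parameter.lstrip("-").lower()
    if key in ALIASES:
        return key
    if key in _BY_ALIAS:
        return _BY_ALIAS[key]
    raise ValueError(f"unknown SeqConv+ parameter: {parameter}")


print(canonical_name("-FMT"))    # format
print(canonical_name("-split"))  # breakup
```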
Minimal Syntax: % seqconv+ [-infile=]value -Default
Minimal Parameters (case-insensitive):
-infile [Type: List / Default: EMPTY / Aliases: infile1 in]
Input file specification.
Prompted Parameters (case-insensitive):
-format [Type: String / Default: 'BSML' / Aliases: fmt]
The desired output format for the files. Should be one of BSML, GB
(GenBank), FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG), or MSF.
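To show how the syntax rules combine, the Python sketch below assembles a non-interactive command line for the conversion in the example session. It only builds the argument list and does not run the program. The helper `build_seqconv_command` is hypothetical, and writing string parameters as `-name=value` is an assumption based on the `[-infile=]value` and `parameter=false` forms shown on this page.

```python
def build_seqconv_command(infile, fmt=None, use_defaults=True, **booleans):
    """Build an argument list for seqconv+ (hypothetical helper).

    Assumption: string parameters are written as -name=value, following
    the "[-infile=]value" form in the minimal syntax line.
    """
    args = ["seqconv+", f"-infile={infile}"]
    if fmt is not None:
        args.append(f"-format={fmt}")
    for name, value in booleans.items():
        # A Boolean parameter is true when present; false must be
        # written explicitly as name=false.
        args.append(f"-{name}" if value else f"-{name}=false")
    if use_defaults:
        # -default asks the program to use default values where possible.
        args.append("-default")
    return args


command = build_seqconv_command("P15320.RSF", fmt="FASTA", summary=False)
print(" ".join(command))
# seqconv+ -infile=P15320.RSF -format=FASTA -summary=false -default
```

Returning a list rather than a single string keeps each argument separate, which is the form most process-launching APIs expect.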
Optional Parameters (case-insensitive):
-check [Type: Boolean / Default: 'false' / Aliases: che help]
Prints out this usage message.
-default [Type: Boolean / Default: 'false' / Aliases: d def]
Specifies that sensible default values be used for all
parameters where possible.
-documentation [Type: Boolean / Default: 'true' / Aliases: doc]
Prints banner at program startup.
-quiet [Type: Boolean / Default: 'false' / Aliases: qui]
Tells application to print only a minimal amount of
information.
-outfile [Type: OutFile / Default: EMPTY / Aliases: out]
File to which all input files are concatenated. A value of '-' means STDOUT.
Specifying this option also turns on the 'concat'
option. Default value is 'SeqConvOut.EXT'.
-concat [Type: Boolean / Default: 'false']
Flag which governs whether all input files should be concatenated into a
single output file.
-informat [Type: String / Default: 'BSML' / Aliases: infmt]
The specified input format for files whose format can not be detected
automatically. Note that this is not a way to filter out only files of
the desired format. Also, if the format can be determined automatically, it
will not be overridden by the given informat value
unless you also specify the 'force' flag. Should be one of BSML, GB (GenBank),
FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF,
SSF (GCG), or PHY (Phylip).
-force [Type: Boolean / Default: 'false']
Forces all input files to be read
according to the format specified by the 'informat'
parameter. If a file doesn't conform to the given format, a warning will be
written.
-preserveannot [Type: Boolean / Default: 'true' / Aliases: annot]
Attempts to preserve all of the
data (sequence + annotations) rather than just the
file name and sequence data.
-summary [Type: Boolean / Default: 'true']
Prints a summary of all conversions.
-breakup [Type: String / Default: EMPTY / Aliases: extract split]
Each sequence converted will be saved to its own output file. This option is
incompatible with the concatenation option.
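The interaction between automatic format detection, -informat, and -force can be summarized as a small decision rule. The Python sketch below is one reading of the -informat and -force descriptions above, not the program's actual implementation; the function name and the use of `None` for "format not detected" are assumptions made for illustration.

```python
from typing import Optional


def effective_input_format(detected: Optional[str],
                           informat: str = "BSML",
                           force: bool = False) -> str:
    """Pick the format a file would be read as (illustrative only).

    detected -- the automatically detected format, or None when
                detection fails (an assumption of this sketch)
    informat -- the value given with -informat (default 'BSML')
    force    -- True when -force is present
    """
    if force:
        # -force: every file is read as 'informat', detected or not.
        return informat
    if detected is not None:
        # Automatic detection wins unless -force is given.
        return detected
    # Detection failed, so fall back to the -informat value.
    return informat


print(effective_input_format("FASTA", informat="GB"))              # FASTA
print(effective_input_format(None, informat="GB"))                 # GB
print(effective_input_format("FASTA", informat="GB", force=True))  # GB
```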
You can set the parameters listed below from the command line. Aliases
(shortened forms of the parameter name) are shown, separated by commas.
-format, -fmt
The desired output format for the files. This should be one of BSML, GB (GenBank),
FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG), or MSF.
-infile, -in, -infile1
Input file specification.
-check, -che, -help
Prints out this usage message.
-default, -def, -d
Specifies that sensible default values be used for all parameters where possible.
-documentation, -doc
Prints banner at program startup.
-quiet, -qui
This parameter is not supported.
-outfile, -out
File to which all input files are concatenated. Specifying this option also
turns on the 'concat' option. Default value is 'SeqConvOut.EXT'.
-concat
Flag which governs whether all input files should be concatenated into a
single output file.
-informat, -infmt
The specified input format for files whose format cannot be detected automatically.
Note that this is not a way to filter out only files of the desired format.
Also, if the format can be determined automatically, it will not be overridden
by the given informat value unless you also specify the 'force'
flag. This value should be one of BSML, GB (GenBank), FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG), or PHY (Phylip).
-force
Forces all input files to be read according
to the format specified by the 'informat' parameter.
If a file doesn't conform to the given format, a warning will be written.
-preserveannot, -annot
Attempts to preserve all of the data (sequence + annotations) rather
than just the file name and sequence data.
-summary
Writes a summary of the program's completion to the screen.
A summary typically displays at the end of a program run interactively. You can
suppress the summary for a program run interactively with -summary=false.
You can also use this parameter to cause a summary of the program's work to be
written in the log file of a program run in batch.
-breakup, -extract, -split
Each sequence converted will be saved to its own output file. Do not use this
option together with the -concat option.
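The output options interact in two documented ways: giving -outfile turns on concatenation, and -breakup cannot be combined with concatenation. The Python sketch below checks a set of options against those two rules before a run. The helper `resolve_output_options` is hypothetical, and treating -breakup as a simple on/off switch is an assumption of this sketch (the reference lists its type as String).

```python
def resolve_output_options(outfile=None, concat=False, breakup=False):
    """Apply the documented output rules (hypothetical helper).

    Assumption: -breakup is modeled as a plain on/off switch here.
    """
    if outfile is not None:
        # Specifying -outfile also turns on the 'concat' option.
        concat = True
    if concat and breakup:
        # -breakup is documented as incompatible with concatenation.
        raise ValueError(
            "-breakup cannot be combined with concatenation "
            "(-concat or -outfile)"
        )
    return {"outfile": outfile, "concat": concat, "breakup": breakup}


print(resolve_output_options(outfile="all.fa"))
# {'outfile': 'all.fa', 'concat': True, 'breakup': False}
```

Raising an error before the run is a design choice of the sketch; how SeqConv+ itself reports the conflict is not stated on this page.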
Printed: May 26, 2005 11:40
Technical Support: support-us@accelrys.com, support-japan@accelrys.com,
or support-eu@accelrys.com
Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.
Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®,
GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.
All other product names mentioned in this
documentation may be trademarks, and if so, are trademarks or registered
trademarks of their respective holders and are used in this documentation for
identification purposes only.