SEQCONV+


 

Table of Contents

FUNCTION

DESCRIPTION

EXAMPLE

OUTPUT

INPUT FILES

RELATED PROGRAMS

COMMAND-LINE SUMMARY

PARAMETER REFERENCE


FUNCTION


SeqConv+ is a utility program that provides batch conversions between different sequence formats.

DESCRIPTION


Advantages of Plus “+” Programs:

 

- Plus programs are enhanced to be able to read sequences in a variety of native formats such as GCG RSF, GCG SSF, GCG MSF, GenBank, EMBL, FastA, SwissProt, PIR, and BSML without conversion.

 

- Plus programs remove the sequence length restriction of 350,000 bp.

 

If you do not need these features and prefer more interactivity, consider running the original version of the program.

SeqConv+ rewrites the sequence files into any of the standard sequence formats. The following are some of the operations that SeqConv+ can perform:

- Converting single sequence files that were prepared or edited with a text editor into GCG format.

- Converting between multiple sequence (MSF), rich sequence (RSF), and single sequence (SSF) GCG formats.

- Interconverting between standard sequence file formats: GenBank, GenPept, SwissProt, SP-TrEMBL, and FastA.

- Supporting the BSML format (Bio-Sequence Markup Language).

- The functions of individual GCG programs such as FromEMBL, FromGenBank, FromFastA, and ToFastA have been incorporated in SeqConv+.

EXAMPLE


Here is an example of using SeqConv+ for converting the Hemolysin Precursor sequence in RSF format into a FastA format sequence.

% SeqConv+

SeqConv+ is a new (batch) sequence conversion utility. The program can convert one or more sequence files into a specified format [BSML, GB (GenBank), FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG) and MSF]. With multiple files, SeqConv can either convert each file into a separate file or concatenate them all into one file.

SeqConv+ of what sequence(s) ? P15320.RSF

Desired output format (* BSML *) ? FASTA

Written 1 sequence from P15320.RSF to P15320.fa

 

OUTPUT


>P15320 Hemolysin precursor.

MKNNNFRLSAAGKLAAALAIILAASAGAYAAEIVAANGANGPGVSTAATGAQVVDIVAPN

GNGLSHNQYQDFNVNQPGAVLNNSREAGLSQLAGQLGANPNLGGREASVILNEVIGRNPS

LLHGQQEIFGMAADYVLANPNGISCQSCGFINTSHSSLVVGNPLVENGVLQGYSTFGNRN

TLSLNGTLNAGGVLDLIAPKIDSRGEVIVQDFKQSNGKVTSAAINAISGLNRVARDGTVQ

ASQQMPTALDSYYLGSMQAGRINIINTAQGSGVKLAGSLNAGDELKVKAYDIRSESRVDD

ASSNKNGGDNYQNYRGGIYVNDRSSSQTLTRTELKGKNISLVADNHAHLTATDIRGEDIT

LQGGKLTLDGQQLKQTQGHTDDRWFYSWQYDVTREREQLQQAGSTVAASGSAKLISTQED

VKLLGANVSADRALSVKAARDVHLAGLVEKDKSSERGYQRNHTSSLRTGRWSNSDESESL

KASELRSEGELTLKAGRNVSTQGAKVHAQRDLTIDADNQIQVGVQKTANAKAVRDDKTSW

GGIGGGDNKNNSNRREISHASELTSGGTLRLNGQQGVTITGSKARGQKGGEVTATHGGLR

IDNALSTTVDKIDARTGTAFNITSSSHKADNSYQSSTASELKSDTNLTLVSHKDADVIGS

QVASGGELSVESKTGNINVKAAERQQNIDEQKTALTVNGYAKEAGDKQYRAGLRIEHTRD

SEKTTRTENSASSLSGGSVKLKAEKDVTFSGSKLVADKGDASVSGNKVSFLAADDKTASN

TEQTKIGGGFYYTGGIDKLGSGVEAGYENNKTQAQSSKAITSGSDVKGNLTINARDKLTQ

QGAQHSVGGAYQENAAGVDHLAAADTASTTTTKTDVGVNIGANVDYSAVTRPVERAVGKA

AKLDATGVINDIGGIGAPNVGLDIGAQGGSSEKRSSSSQAVVSSVQAGSIDINAKGEVRD

QGTQYQASKGAVNLTADSHRSEAAANRQDEQSRDTRGSAGVRVYTTTGSDLTVDAKGEGG

TQRSNSSASQAVTGSIDAANGINVNVKKDAIYQGTALNGGRGKTAVNAGGDIRLDQASDK

QSESRSGFNVKASAKGGFTADSKNFGAGFGGGTHNGESSSSTAQVGNISGQQGVELKAGR

DLTLQGTDVKSQGDVSLSAGNKVALQAAESTQTRKESKLSGNIDLGAGSSDSKEKTGGNL

SAGGAFDIAKVNESATERQGATIASDGKVTLSANGKGDDALHLQGAKVSGGSAALEAKNG

GILLESAKNEQHKDNWSLGIKANAKGGQTFNKDAGGKVDPNTGKDTHTLGAGLKVGVEQQ

DKTTHANTGITAGDVTLNSGKDTRLAGARVDADSVQGKVGGDLHVESRKDVENGVKVDVD

AGLSHSNDPGSSITSKLSKVGTPRYAGKVKEKLEAGVNKVADATTDKYNSVARRLDPQQD

TTGAVSFSKAEGKVTLPATPAGEKPQGPLWDRGARTVGGAVKDSITGPAGRQGHLKVNAD

VVNNNAVGEQSAIAGKNGVALQVGGQTQLTGGEIRSQQGKVELGGSQVSQQDVNGQRYQG

GGRVDAAATVGGLLGGAAKQSVAGNVPFASGHASTQQADAKAGVFSGK
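A FastA file such as the one above consists of a header line beginning with ">" followed by the sequence split across several lines. As a minimal sketch, assuming you want to check converted output outside of GCG, the following Python reads FastA text back into records. The helper names read_fasta and _make_record are hypothetical and are not part of SeqConv+; only the first two sequence lines of the example are used as sample data.

```python
def _make_record(header, chunks):
    # The first space-delimited token of the header is the identifier;
    # the remainder, if any, is a free-text description.
    identifier, _, description = header.partition(" ")
    return identifier, description.strip(), "".join(chunks)


def read_fasta(text):
    """Parse FastA text into a list of (identifier, description, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between sequence rows
        if line.startswith(">"):
            if header is not None:
                records.append(_make_record(header, chunks))
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append(_make_record(header, chunks))
    return records


# Sample data: header and first two sequence lines from the example output.
sample = """>P15320 Hemolysin precursor.
MKNNNFRLSAAGKLAAALAIILAASAGAYAAEIVAANGANGPGVSTAATGAQVVDIVAPN
GNGLSHNQYQDFNVNQPGAVLNNSREAGLSQLAGQLGANPNLGGREASVILNEVIGRNPS
"""

records = read_fasta(sample)
identifier, description, sequence = records[0]
print(identifier, description, len(sequence))
```

The parser joins the wrapped sequence lines into one string, so the reported length is the number of residues regardless of how the lines were wrapped.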

 

INPUT FILES


The input to SeqConv+ is one or more nucleotide or protein sequences. You can specify multiple sequences in a number of ways: by using a list file, for example @project.list; by using an MSF or RSF file, for example project.msf{*}; or by using a sequence specification with an asterisk (*) wildcard, for example GenBank:*.
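The three ways of naming multiple sequences can be told apart by their syntax alone. The following Python is only an illustrative sketch based on the three examples given above; classify_input_spec is a hypothetical helper, and the rules SeqConv+ actually applies when it parses a specification may differ.

```python
def classify_input_spec(spec):
    """Guess which kind of input specification a string represents."""
    if spec.startswith("@"):
        return "list file"                 # e.g. @project.list
    if "{" in spec and spec.endswith("}"):
        return "multiple-sequence file"    # e.g. project.msf{*}
    if "*" in spec:
        return "wildcard specification"    # e.g. GenBank:*
    return "single sequence"


for spec in ("@project.list", "project.msf{*}", "GenBank:*", "P15320.RSF"):
    print(spec, "->", classify_input_spec(spec))
```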

RELATED PROGRAMS


SeqManip+: SeqManip+ is a utility program that allows the user to perform some manipulations of sequences, including translation, back translation of protein sequences, and splitting of sequences.

Reformat: Reformat rewrites sequence file(s), scoring matrix file(s), or enzyme data file(s) so that they can be read by GCG programs.

COMMAND-LINE SUMMARY


All parameters for this program may be added to the command line. Use -check to view the summary below and to specify parameters before the program executes. In the syntax summary below, square brackets ([ and ]) enclose parameter values that are optional. For each program parameter, square brackets enclose the type of parameter value specified, the default parameter value, and any aliases (shortened forms of the parameter name). Programs with a plus in the name accept either the full parameter name or a specified alias. If “Type” is “Boolean”, the presence of the parameter on the command line indicates a true condition; a false condition must be stated explicitly as parameter=false.

Minimal Syntax: % seqconv+ [-infile=]value -Default
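The syntax rules above can be sketched as a small Python helper that assembles an argument list for a scripted run. This is an illustration only: build_seqconv_args is a hypothetical function, not part of GCG, and it assumes that every non-Boolean parameter is written as -name=value, as the minimal syntax and the parameter=false form suggest.

```python
def build_seqconv_args(infile, fmt=None, use_defaults=True, **options):
    """Assemble an argument list for seqconv+ (hypothetical helper)."""
    args = ["seqconv+", f"-infile={infile}"]
    if fmt is not None:
        args.append(f"-format={fmt}")
    for name, value in options.items():
        if value is True:
            args.append(f"-{name}")         # Boolean: presence means true
        elif value is False:
            args.append(f"-{name}=false")   # Boolean false is stated explicitly
        else:
            args.append(f"-{name}={value}")
    if use_defaults:
        args.append("-default")
    return args


args = build_seqconv_args("P15320.RSF", fmt="FASTA", summary=False, outfile="P15320.fa")
print(" ".join(args))
```

Building a list rather than a single string keeps each parameter as a separate argument, which avoids quoting problems if the list is later handed to a process launcher.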

 

 

Minimal Parameters (case-insensitive):

 

-infile         [Type: List / Default: EMPTY / Aliases: infile1 in]

                Input file specification.

 

Prompted Parameters (case-insensitive):

 

-format         [Type: String / Default: 'BSML' / Aliases: fmt]

                The desired output format for the files. Should be one of BSML, GB (GenBank), FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG), and MSF.

 

Optional Parameters (case-insensitive):

 

-check          [Type: Boolean / Default: 'false' / Aliases: che help]

                Prints out this usage message.

 

-default        [Type: Boolean / Default: 'false' / Aliases: d def]

                Specifies that sensible default values be used for all parameters where possible.

 

-documentation  [Type: Boolean / Default: 'true' / Aliases: doc]

                Prints banner at program startup.

 

-quiet          [Type: Boolean / Default: 'false' / Aliases: qui]

                Tells application to print only a minimal amount of information.

 

-outfile        [Type: OutFile / Default: EMPTY / Aliases: out]

                File to which all input files are concatenated. A value of '-' means STDOUT. Specifying this option also turns on the 'concat' option. Default value is 'SeqConvOut.EXT'.

 

-concat         [Type: Boolean / Default: 'false']

                Flag which governs whether all input files should be concatenated into a single output file.

 

-informat       [Type: String / Default: 'BSML' / Aliases: infmt]

The specified input format for files whose format cannot be detected automatically. Note that this is not a way to filter out only files of the desired format. Also, if the format can be determined automatically, it will not be overridden by the given informat value unless you also specify the 'force' flag. Should be one of BSML, GB (GenBank), FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG), and PHY (Phylip).

 

-force          [Type: Boolean / Default: 'false']

Forces all input files to be read according to the format specified by the 'informat' parameter. If a file doesn't conform to the given format, a warning will be written.

 

-preserveannot  [Type: Boolean / Default: 'true' / Aliases: annot]

Attempts to preserve all of the data (sequence and annotations) rather than just the file name and sequence data.

 

-summary        [Type: Boolean / Default: 'true']

                Print a summary of all conversions.

 

-breakup        [Type: String / Default: EMPTY / Aliases: extract split]

                Each sequence converted will be saved to its own output file. This option is incompatible with the concatenation option.
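Two of the rules in this summary interact: -outfile turns on concatenation, and -breakup is incompatible with concatenation. The following Python sketch shows one way a wrapper script might catch the conflict before launching the program. The function check_output_options is hypothetical and only encodes the rules as stated in this summary; how SeqConv+ itself reports the conflict is not described here.

```python
def check_output_options(options):
    """Return a list of problems found in a dict of seqconv+ option values."""
    problems = []
    # -outfile implies concatenation, so either option counts as "concat on".
    concat = options.get("concat") is True or options.get("outfile") is not None
    # -breakup counts as requested unless it is absent or explicitly false.
    breakup = "breakup" in options and options["breakup"] is not False
    if concat and breakup:
        problems.append("-breakup cannot be combined with -concat or -outfile")
    return problems


print(check_output_options({"outfile": "all.fa", "breakup": True}))
print(check_output_options({"breakup": True}))
```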

 

 

PARAMETER REFERENCE


You can set the parameters listed below from the command line. Aliases (shortened forms of the parameter name) are shown, separated by commas.

-format, -fmt

 

The desired output format for the files. This should be one of BSML, GB (GenBank), FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG), and MSF.

 

-infile, -in, -infile1

 

Input file specification.

 

-check, -che, -help   

 

Prints out this usage message.

 

-default, -def, -d

 

 Specifies that sensible default values be used for all parameters where possible.

 

-documentation, -doc

 

 Prints banner at program startup.

 

-quiet, -qui  

      

This parameter is not supported.

 

-outfile, -out

     

File to which all input files are concatenated. Specifying this option also turns on the 'concat' option. Default value is 'SeqConvOut.EXT'.

 

-concat 

     

Flag which governs whether all input files should be concatenated into a single output file.

 

-informat, -infmt 

    

The specified input format for files whose format cannot be detected automatically. Note that this is not a way to filter out only files of the desired format. Also, if the format can be determined automatically, it will not be overridden by the given informat value unless you also specify the 'force' flag. This value should be one of BSML, GB (GenBank), FASTA, EMBL, SPT (SPTrEMBL), SW (SwissProt), RSF, SSF (GCG), or PHY (Phylip).

 

-force   

     

Forces all input files to be read according to the format specified by the 'informat' parameter. If a file doesn't conform to the given format, a warning will be written.

 

-preserveannot, -annot 

 

Attempts to preserve all of the data (sequence and annotations) rather than just the file name and sequence data.

           

-summary   

Writes a summary of the program's completion to the screen. A summary typically displays at the end of a program run interactively. You can suppress the summary for a program run interactively with -summary=false.

You can also use this parameter to cause a summary of the program's work to be written in the log file of a program run in batch.

-breakup, -extract, -split 

     

Each sequence converted will be saved to its own output file. This option must not be used together with the -concat option.

 

Printed: May 26, 2005 11:40




Technical Support: support-us@accelrys.com, support-japan@accelrys.com,
or support-eu@accelrys.com

Copyright (c) 1982-2005 Accelrys Inc. All rights reserved.

Licenses and Trademarks: Discovery Studio ®, SeqLab ®, SeqWeb ®, SeqMerge ®, GCG ®, and the GCG logo are registered trademarks of Accelrys Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

www.accelrys.com/bio