GELASSEMBLE

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
SELECTING A CONTIG
THE EDIT SCREEN
SCREEN MODE
COMMAND MODE
RELATED PROGRAMS
RESTRICTIONS
CONSIDERATIONS
SUGGESTIONS
COMMAND-LINE SUMMARY
ACKNOWLEDGEMENTS
LOCAL DATA FILES
PARAMETER REFERENCE

FUNCTION

GelAssemble is a multiple sequence editor for viewing and editing contigs assembled by GelMerge.

DESCRIPTION

See the Fragment Assembly System (FAS) Introduction for an overview of working with the programs within the FAS to assemble sequences in a sequencing project.

GelMerge takes unassembled fragment sequences in a sequencing project database and creates complete assemblies, called contigs. Each contig is a multiple alignment of contiguous, overlapping sequences in the project database. You can edit these sequence alignments in GelAssemble, allowing you to improve the alignments and resolve inconsistencies among the aligned fragments in regions of overlap. With GelAssemble, you can also combine separate contigs into a single assembly, or conversely split a single assembly into more than one contig. Because GelAssemble gives you complete freedom to manipulate the contigs in the project database, without relying solely on the automatic assembly process of GelMerge, you maintain complete control over your sequencing project.

GelAssemble also provides commands to export information from the project database into your local directory. You can copy any fragment or consensus sequence to a file in your current working directory where you can use it as input to other GCG programs. You can write a Big Picture schematic representation of the alignment of fragments in a contig. You can also write the contig's aligned sequences into a file that resembles the output displayed by the Pretty program.

EXAMPLE

The example screens below are from a session with GelAssemble using "myproject", the fragment assembly project created in the example sessions of GelStart, GelEnter, and GelMerge.


% gelassemble

SELECTING A CONTIG

When you first run GelAssemble, your screen looks similar to this:

The first contig in the sequencing project is displayed on the screen. Each contig is named after the first fragment in the contig -- the fragment on line 2 of the screen display. All of the available commands are displayed at the bottom of the screen.

Scrolling Through the Contig List

You can scroll through the list of contigs using the <Left-arrow> and <Right-arrow> keys to display the preceding and following contigs, respectively. If there is not enough room for the entire contig to display on the screen at once, you can use the <Up-arrow> and <Down-arrow> keys to scroll a line at a time.

Selecting a Contig for Editing

You can choose a new contig to load into the GelAssemble Edit Screen by pressing <Ctrl>K when that contig is displayed on the screen. You can move to the Edit Screen without selecting a new contig by pressing <Ctrl>D. If a contig was previously loaded into the Edit Screen, press <Ctrl>D to return to that contig.

THE EDIT SCREEN

After you select a contig for editing, GelAssemble loads it into the multiple alignment editor Edit Screen where you can view and edit the alignment. If you select the first contig in the example fragment assembly project, "myproject," your screen should look something like this:

The upper half of the screen displays the sequence alignment. The bottom row of the alignment, row 1, is reserved for the consensus sequence. Symbols in a fragment sequence that disagree with the consensus appear in reverse video. The cursor also appears in reverse video, indicating your current position in the alignment. Above the sequence alignment is a summary of the current cursor position. This summary includes the name of the fragment on which the cursor is positioned, the absolute position of the cursor in the alignment, and the position of the cursor relative to the beginning of the current fragment.

The lower half of the screen contains the Big Picture schematic. Each fragment sequence on the screen is represented by a bar diagram in this schematic. The bar indicates the length of each fragment as well as its relative position and orientation in the alignment. The letters A, M, and L occasionally appear to the left of a bar, indicating that the corresponding fragment is either Anchored, Modified, or Locked, respectively (these terms are described below). The position of the cursor in the sequence alignment is indicated by an asterisk (*) at the corresponding position of the Big Picture schematic.

When the cursor is positioned in the alignment, you can enter Screen Mode commands to move the cursor around the alignment, edit the fragment sequences, and modify the alignment. These commands are described below under the SCREEN MODE topic. From Screen Mode, use <Ctrl>D to enter Command Mode. The cursor moves down to the lower-left corner of the screen next to the Command: prompt where you can type commands that enable you to: 1) save your changes to the contig to the sequencing project database; 2) write a file containing the contig sequence alignment; 3) export any fragment or consensus sequences out of the Fragment Assembly project database into individual sequence files for use with other GCG programs; and 4) choose a different contig for editing. These commands are described below under the COMMAND MODE topic. There is some overlap in function between Screen Mode and Command Mode commands. In general, Screen Mode commands either move the cursor or edit a sequence, while Command Mode commands manipulate entire sequences or contigs.

SCREEN MODE

Moving the Cursor

In Screen Mode, the cursor shows your position in the current fragment sequence. You can move around in the sequence, add and delete symbols, and search for patterns. Use the <Left-arrow> and <Right-arrow> keys to move the cursor within a single sequence. Use the <Up-arrow> and <Down-arrow> keys to move the cursor to different sequences on different lines in the Edit Screen. The entire alignment is loaded, but only a portion of it may be visible at any time on your terminal screen. If you try to move the cursor near the edge of the screen with the <Left-arrow> and <Right-arrow> keys, the display scrolls to show more of the alignment in that direction. If you try to move beyond the top or bottom of the currently-displayed alignment, the display scrolls to show additional sequences in that direction. Use the < and > keys to scroll the display horizontally, one screen at a time. These keys provide a quick way to review the entire alignment.

You can also quickly move through the alignment. For example, type a number and press <Return> to move the cursor to that position in the current sequence. Use <Ctrl>E to move the cursor to the end of the current sequence. Use <Ctrl>A to move the cursor to the next position in the alignment where at least one of the sequences disagrees with the consensus symbol. Use <Ctrl>R to move the cursor to the next position where the current fragment sequence disagrees with the consensus symbol. Use <Ctrl>V to move the cursor to the next position in the alignment where a gap character is found in the contig consensus.
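The <Ctrl>A and <Ctrl>R searches described above can be pictured with a short Python sketch. It is only an illustration, not GelAssemble's own code: the function name, the 0-based column numbers, the use of a space for columns a fragment does not cover, and the case-insensitive comparison are all assumptions made for the example.

```python
def next_disagreement(consensus, fragments, start):
    """Return the first column after `start` where a fragment differs from
    the consensus, or -1 if there is none.

    Assumptions for this sketch: every row is padded to the same length,
    a space marks a column the fragment does not cover, and columns are
    0-based (the editor itself displays 1-based positions).
    """
    for col in range(start + 1, len(consensus)):
        for frag in fragments:
            symbol = frag[col]
            if symbol != " " and symbol.upper() != consensus[col].upper():
                return col
    return -1


consensus = "ACGTAC"
rows = ["ACGTAC", " CGAAC"]
print(next_disagreement(consensus, rows, 0))        # like <Ctrl>A: all rows
print(next_disagreement(consensus, [rows[0]], 0))   # like <Ctrl>R: one row
```

Passing every fragment row mimics the alignment-wide search; passing only the current row mimics the single-fragment search.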

Other commands for moving the cursor to different parts of the alignment are described under the COMMAND MODE topic, below.

Expanding the Alignment Display

A contig may contain so many sequences overlapping at the same position that the entire alignment at that position cannot be viewed on the screen at once. Use <Ctrl>L to remove the Big Picture display from the screen and display additional lines of the sequence alignment. You can toggle the screen to restore the Big Picture display by pressing <Ctrl>L again.

Editing a Sequence

You can insert any valid GCG sequence symbol (see Appendix III) into the sequence by typing the symbol; it is inserted at the cursor. <Ctrl>H or the <Delete> key deletes symbols to the left of the cursor, one by one. Any sequence symbol, <Ctrl>H, and the <Delete> key can be preceded by a number, indicating how many symbols to add or delete. For example, use 10<Delete> to remove the 10 symbols to the left of the cursor. You can edit a different fragment sequence on the screen simply by moving the cursor to the line containing that sequence.

<Ctrl>O toggles between Insert and OverStrike mode for editing. OverStrike mode can be useful for changing sequence symbols because it first deletes a symbol and then inserts a new symbol at the same position. The current editing mode is indicated in the lower-right corner of the Edit Screen.

To delete an entire column of the alignment, position the cursor on that column and press <Ctrl>X. Each time you use <Ctrl>X, the entire contig to the right of the cursor shifts to the left by one position to fill in the deleted column and maintain the appropriate alignment. Use <Ctrl>I to restore an entire column that was just deleted with <Ctrl>X. You can restore a deleted column only if you haven't moved the cursor away from the position of deletion on the screen. If you've deleted adjacent columns in the alignment by sequentially pressing <Ctrl>X several times, you can restore those deleted columns by sequentially pressing <Ctrl>I the same number of times. Up to 10 adjacent columns in the alignment can be restored with <Ctrl>I; if you've deleted more than 10, only the last 10 are restored.
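The 10-column restore limit behaves like a bounded stack. The following Python sketch shows one way to model it. The class and method names are hypothetical, rows are assumed to be equal-length strings, and the rule that moving the cursor discards the restore history is not modeled.

```python
from collections import deque


class ColumnEditor:
    """Toy model of <Ctrl>X / <Ctrl>I on an alignment of equal-length rows."""

    def __init__(self, rows, max_restore=10):
        self.rows = list(rows)
        # A bounded deque silently drops the oldest column once it is full,
        # so only the most recent `max_restore` deletions can be undone.
        self.deleted = deque(maxlen=max_restore)

    def delete_column(self, col):
        """Remove column `col` (0-based) from every row and remember it."""
        self.deleted.append([row[col] for row in self.rows])
        self.rows = [row[:col] + row[col + 1:] for row in self.rows]

    def restore_column(self, col):
        """Reinsert the most recently deleted column at `col`.

        Returns False when nothing is left to restore.
        """
        if not self.deleted:
            return False
        column = self.deleted.pop()
        self.rows = [row[:col] + sym + row[col:]
                     for row, sym in zip(self.rows, column)]
        return True


editor = ColumnEditor(["ACGT", "A.GT"])
editor.delete_column(1)
print(editor.rows)
editor.restore_column(1)
print(editor.rows)
```

Because restored columns come back in reverse order of deletion, pressing the restore key repeatedly at the same position rebuilds the original run of columns.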

Removing Sequences from a Contig

You may, at times, wish to remove individual fragment sequences from a contig without beginning the assembly process from scratch in GelMerge. For instance, GelMerge may have included a sequence in a contig that you think should not be aligned as part of that contig. You can remove any sequence from the current contig by placing the cursor on the line containing that sequence and pressing the minus key (-). If the remaining alignment is subsequently written to the sequencing project database as a new contig (see COMMAND MODE for more information), the removed fragment will be written to the sequencing project database as a single-fragment contig. If you quit GelAssemble without writing the contig to the sequencing project database, the fragment remains part of the larger contig.

You also can manually assemble separate contigs that do not share sufficient overlap to be automatically assembled by GelMerge. This function is available only in Command Mode (see COMMAND MODE for more information).

The Consensus Sequence

The consensus sequence on the bottom row of the sequence alignment is not updated automatically when you edit any of the fragment sequences. If you add a new sequence to the current contig (see COMMAND MODE for more information) the existing consensus sequence is not completely recalculated; only that part of the new fragment that extends beyond the ends of the current consensus is added to the consensus sequence. If you remove a sequence from the current contig (see "Removing Sequences from a Contig", above), only that part of the removed fragment that extended beyond the ends of the remaining alignment is removed from the consensus sequence.

To completely recalculate the consensus sequence from the fragment sequences currently loaded into the GelAssemble editor, use <Ctrl>G. This replaces what was on row 1 with the computed consensus. For information on recalculating only part of the consensus sequence, see the Command Mode Summary, below.

The consensus function is a simple measure of plurality. IUB nucleotide ambiguity symbols in the fragment sequences are treated as weighted representations of their constituent bases for the purpose of generating a consensus. For example, an R represents half A and half G. If there is no absolute plurality (that is, two or more bases are tied), then those bases tied for plurality are used to generate an IUB nucleotide ambiguity symbol for the consensus. If the most common symbol at a position is a gap character (. or ~), then the consensus contains a gap character at that position. If there is no unanimity among the loaded fragments at a particular column, then the displayed consensus symbol for that column is in lowercase.
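As a rough illustration of the plurality rule, here is a minimal Python sketch for a single alignment column. It is not the program's actual code. It assumes DNA symbols only and a non-empty column, always writes the gap as a period, and lets a gap win a tie with a base. None of those details is specified above.

```python
from collections import defaultdict

# Standard IUB nucleotide codes and the bases each one stands for.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}
# Reverse lookup: alphabetically sorted bases -> IUB symbol.
SYMBOL_FOR = {bases: symbol for symbol, bases in IUB.items()}
GAPS = ".~"


def consensus_symbol(column):
    """Return the plurality consensus symbol for one non-empty column."""
    weights = defaultdict(float)
    for symbol in column:
        symbol = symbol.upper()
        if symbol in GAPS:
            weights["."] += 1.0
        else:
            bases = IUB[symbol]
            for base in bases:
                # An ambiguity symbol splits its vote among its bases.
                weights[base] += 1.0 / len(bases)
    best = max(weights.values())
    tied = sorted(k for k, w in weights.items() if abs(w - best) < 1e-9)
    if "." in tied:
        result = "."  # assumption: a gap wins any tie with a base
    else:
        result = SYMBOL_FOR["".join(tied)]
    # Lowercase marks a column whose fragments are not unanimous.
    unanimous = len({s.upper() for s in column}) == 1
    return result if unanimous else result.lower()


for column in ("AAAA", "AAG", "AG", "RA", "..A"):
    print(column, "->", consensus_symbol(column))
```

In the "RA" column, R contributes half a vote each to A and G, so A wins with 1.5 votes and the result is shown in lowercase because the fragments disagree.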

The consensus sequence can be edited directly, just like other fragment sequences.

Finding Patterns

To search for a pattern, type a / (slash) in Screen Mode. The cursor moves to the lower-left corner of the screen where you can enter the sequence pattern you want to find. You can repeat the last search by simply using /<Return>. The search is case-insensitive, and GelAssemble does not understand nucleotide ambiguity symbols in pattern matching. For instance, /RAC matches RAC, but not GAC.
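The matching behaviour amounts to a literal, case-insensitive substring search. A minimal Python sketch follows, in which the function name and the 0-based return value are assumptions for illustration.

```python
def find_pattern(sequence, pattern, start=0):
    """Return the 0-based index of the next literal, case-insensitive
    match of `pattern` at or after `start`, or -1 if there is none.

    Ambiguity symbols are compared as plain characters, so R only
    matches R.
    """
    return sequence.upper().find(pattern.upper(), start)


print(find_pattern("ttgacRACgg", "rac"))  # matches the literal RAC
print(find_pattern("ttGACtt", "RAC"))     # GAC is not a match for RAC
```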

Positioning Fragments in the Contig Manually

To shift the entire sequence to the left, position the cursor at the beginning of the sequence and press <Delete>. To move the sequence to the right, place the cursor anywhere within the sequence and press the <Space Bar>. To move two or more positions either way, type the number of positions you want to move and press the appropriate key. For example, to move the sequence three positions to the right, type 3<Space Bar>.

Sequences aligned automatically with GelMerge should rarely require manual positioning.

Selecting Ranges for Deletion and Insertion

You can delete a sequence range from one position and insert it at another. To delete a sequence range, move the cursor to the beginning of the section you want to delete. Press <Ctrl>B and use the <Left-arrow> and <Right-arrow> keys to select the sequence range. The selected sequence appears in reverse video. Press <Ctrl>N to delete the selected range and temporarily place it in a buffer. Note: Only the last deleted range is saved in the temporary buffer.

To insert the last deleted range into the alignment again, position the cursor at the point of insertion and press <Ctrl>P. The sequence range appears at the new position.
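The remove-and-insert cycle works like a one-slot clipboard. This Python sketch models it on a single sequence string. The class name and the 0-based, inclusive indices are assumptions made for the example.

```python
class RangeBuffer:
    """One-slot buffer modeled on <Ctrl>N (remove) and <Ctrl>P (insert)."""

    def __init__(self):
        self.buffer = ""  # only the last removed range is kept

    def remove(self, seq, start, end):
        """Cut seq[start..end] (0-based, inclusive) into the buffer."""
        self.buffer = seq[start:end + 1]
        return seq[:start] + seq[end + 1:]

    def insert(self, seq, pos):
        """Insert the buffered range before position `pos`."""
        return seq[:pos] + self.buffer + seq[pos:]


clip = RangeBuffer()
shortened = clip.remove("AACCGGTT", 2, 3)
print(shortened, clip.buffer)
print(clip.insert(shortened, 4))
```

A second call to `remove` overwrites the buffer, which mirrors the note that only the last deleted range is saved.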

Leaving Screen Mode

Use <Ctrl>D to leave Screen Mode and enter Command Mode.

Remember, to save any changes you make to an alignment, you must explicitly write the contig to the sequencing project database using an appropriate Command Mode command. (See "Writing Contigs" under the COMMAND MODE topic for more information.)

Screen Mode Summary

Here is the summary of Screen Mode commands you would see by typing ? in Screen Mode:


                                  SCREEN MODE

                     [n] is an optional numeric parameter

Keys Pressed                             Action

[n]<Right-arrow>                      move ahead [n bases]
[n]<Left-arrow>                       move back [n bases]
[n]<Up-arrow>                         move up [to row n]
[n]<Down-arrow>                       move down [to row n]
 >                                    scroll one screen to the right
 <                                    scroll one screen to the left
1<Return>                             move to start of the sequence
<Ctrl>E                               move to end of the sequence
/GATTC<Return>                        find next occurrence of GATTC
165<Return>                           move to base 165 in sequence
<Ctrl>A                               move to next ambiguity in alignment
<Ctrl>R                               move to next ambiguity in sequence
<Ctrl>V                               move to next gap in consensus
<Ctrl>D                               enter Command Mode
<Ctrl>L                               toggle alignment display enlargement
<Ctrl>W                               redraw the screen
<Ctrl>O                               toggle INSERT/OVERSTRIKE mode
 !                                    summary of current sequence
 ?                                    display these help screens
<Ctrl>G                               recalculate the consensus
G A T C ....                          add base at the cursor
<Delete>                              delete a base, or move sequence left
<Ctrl>H                               delete a base, or move sequence left
<Space bar>                           move the sequence to the right
<Ctrl>X                               delete alignment column
<Ctrl>I                               restore alignment column
<Ctrl>B                               begin selecting a range for removal
<Ctrl>N                               remove the selected range
<Ctrl>P                               insert the removed range
  -                                   reject current fragment

COMMAND MODE

To enter Command Mode from Screen Mode, use <Ctrl>D. The cursor moves down to the lower-left corner of the screen next to Command: where you can enter any of the commands shown below followed by a <Return>.

There is some overlap in function between Command Mode and Screen Mode commands. In general, Screen Mode commands either move the cursor or edit a sequence; Command Mode commands manipulate entire sequences or contigs. To save changes you've made to the contig currently loaded in the GelAssemble editor, you must enter one of the appropriate Command Mode commands. (See "Writing Contigs" in this topic for more information.)

Editing GelAssemble Commands

GelAssemble command editing is modeled on the OpenVMS DCL command-line editing. The <Left-arrow> and <Right-arrow> keys let you move your cursor around in a command you've typed; you can insert or delete characters at any position. <Ctrl>E moves the cursor to the end of the line. <Ctrl>U deletes all characters from the current cursor position to the start of the command.

Returning to Screen Mode

Press <Return> in Command Mode to return to Screen Mode (described above).

Commands May Be Shortened

Only the capitalized portion of the commands described in the documentation below needs to be typed.
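The abbreviation rule can be sketched in Python as prefix matching against the capitalized part of each command name. This is an illustration only. The command list is a subset taken from the Command Mode Summary, and the example assumes that typing is case-insensitive and that prefixes longer than the capitalized part are also accepted.

```python
# A subset of command names as written in the Command Mode Summary.
COMMANDS = ["EDit", "CONTIGs", "WRite", "EXit", "QUIT",
            "CONSensus", "LOad", "LOCk", "Unlock", "Help"]


def required_prefix(name):
    """Return the leading run of capital letters, e.g. 'WRite' -> 'WR'."""
    n = 0
    while n < len(name) and name[n].isupper():
        n += 1
    return name[:n]


def resolve(typed, commands=COMMANDS):
    """Return the command that `typed` abbreviates, or None.

    Assumptions: matching ignores case, and the typed text must be a
    prefix of the full name at least as long as the capitalized part.
    """
    typed = typed.upper()
    for name in commands:
        if (name.upper().startswith(typed)
                and len(typed) >= len(required_prefix(name))):
            return name
    return None


for text in ("wr", "write", "w", "LO", "LOC", "CON"):
    print(text, "->", resolve(text))
```

Under these assumptions "CON" resolves to nothing, because : CONTIGs needs CONTIG and : CONSensus needs CONS.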

Parameters Are Used With Commands

Some commands can be preceded with numeric parameters or followed with a file name. The square brackets ([ and ]) in the documentation below show command parameters that are optional, meaning you can leave them out.

Anchored, Locked, and Modified Fragments

By default, GelAssemble allows you to move and edit the fragments loaded in the contig editor independently of one another. Also, by default, you can freely edit any fragment sequence on which the cursor is positioned. You may want to override these default settings under certain circumstances.

Using the : ANChor command, you can link, or anchor together, selected sequences that are loaded into the Edit Screen. Modifications to any anchored fragments, such as insertions, deletions, and sequence reversals, are propagated through all anchored fragment sequences in the current alignment. This preserves the alignment of all anchored fragments. For instance, the insertion of a single base within one anchored fragment causes the same base to be inserted at the same position in all other anchored fragments. An anchored fragment is denoted by an A next to the corresponding bar diagram in the Big Picture display. Even if all sequences are unanchored, you can still delete an entire column of the alignment at once with <Ctrl>X (see SCREEN MODE for more information).
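To picture how an insertion propagates through anchored fragments, consider this small Python sketch. It is a simplified model, not the editor's implementation. The Fragment class and its fields are hypothetical. The model also assumes that an anchored fragment starting to the right of the insertion point simply shifts right by one column, and that an anchored fragment ending to the left of it is unchanged.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    name: str
    offset: int            # 0-based alignment column of the first symbol
    seq: str
    anchored: bool = False


def insert_base(fragments, current, col, base):
    """Insert `base` at alignment column `col` of `current`.

    If `current` is anchored, the same insertion is applied to every
    anchored fragment so that their relative alignment is preserved.
    """
    targets = [f for f in fragments if f.anchored] if current.anchored else [current]
    for frag in targets:
        local = col - frag.offset
        if 0 <= local <= len(frag.seq):
            frag.seq = frag.seq[:local] + base + frag.seq[local:]
        elif local < 0:
            # Assumption: a fragment lying wholly to the right shifts along.
            frag.offset += 1


a = Fragment("a", 0, "ACGT", anchored=True)
b = Fragment("b", 2, "GTAA", anchored=True)
c = Fragment("c", 0, "ACGT")
insert_base([a, b, c], a, 2, "N")
print(a.seq, b.seq, c.seq)
```

Only the two anchored fragments receive the new base; the unanchored fragment is left as it was.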

When you load the first contig into the GelAssemble editor, all of the fragments in that contig are initially unanchored to one another. If you subsequently load another contig on top of an existing one in the editor using the : LOad command, all of the fragments in the new contig are anchored to one another. Furthermore, all previously anchored fragments become unanchored. This permits easy positioning of the new contig when you are manually attempting to align two existing contigs.

The : LOCk command protects the specified sequence(s) from accidental modification in the GelAssemble editor. GelAssemble will not allow insertions, deletions, or reversal of a locked fragment. A locked fragment is denoted by an L next to the corresponding bar diagram in the Big Picture display.

An M adjacent to a bar in the Big Picture display means that the corresponding fragment sequence has been modified in some way (e.g. insertion, deletion, reversal) in the current editing session.

Writing Contigs

The only way to save modifications you've made to a contig during the current session with GelAssemble is by writing the contig to the sequencing project database.

You can save your modifications to a contig using two commands: : WRite and : EXit. The : WRite command saves the contig alignment as well as the contig consensus sequence to the project database. The contig is named after the left-most sequence in the contig. The : EXit command works the same as : WRite, but GelAssemble exits after saving the contig to the sequencing project database. To quit GelAssemble without writing the contig to the project database, use : QUIT.

The consensus is not recalculated automatically when you write a contig to the sequencing project database; you must recalculate the consensus sequence explicitly. Use either <Ctrl>G in Screen Mode or the : CONSensus command in Command Mode to update the consensus sequence before writing the contig.

Manually Assembling Contigs

You can use GelAssemble to assemble contigs manually that do not share sufficient overlap to be assembled automatically with GelMerge. Once you've entered a single contig into the editor, you can use the : LOad command to enter other contigs on top of the existing one. You can position the contigs manually (see "Positioning Fragments in the Contig Manually", above) and insert gap characters to create the desired alignment. If you then use the : WRite command, the entire alignment is written to the sequencing project database as a single contig. The new contig is named after the left-most fragment sequence in the contig.

Conversely, you can remove individual fragment sequences from a contig as described above (see "Removing Sequences from a Contig", above). The : REJect command in Command Mode performs the same function as pressing the minus key (-) in Screen Mode.

To prevent the sequencing project database from being corrupted, you are not allowed to store duplicate entries of any fragment sequence in the database during manual assembly. (Automatic assembly by GelMerge never stores duplicate entries of any sequence in the database.) GelAssemble does not permit you to store a contig containing a single fragment found at more than one position in the alignment. If you accidentally create such an alignment on the screen, you can remove redundant copies of each duplicated sequence with the : NODUPlicate command (see "Command Mode Summary" for more information). If you use the : LOad command to purposely place copies of a single fragment at more than one position in the same contig, you can retain the duplicate copies by renaming them. Position the cursor on a duplicated fragment sequence and use the : SPAWN command (see "Command Mode Summary" for more information) to rename that fragment. Essentially, you are adding a new fragment sequence to the project database.
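The duplicate check can be thought of as counting fragment names in the loaded alignment. The Python sketch below is hypothetical. The function names and fragment names are invented for the example, and the choice to keep the first copy of each name is an assumption, since the text above does not say which copy : NODUPlicate retains.

```python
from collections import Counter


def duplicated_fragments(names):
    """Return the fragment names that appear more than once."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n > 1)


def drop_duplicates(names):
    """Keep only the first copy of each name (an assumption of this sketch)."""
    seen = set()
    kept = []
    for name in names:
        if name not in seen:
            seen.add(name)
            kept.append(name)
    return kept


loaded = ["frag1", "frag2", "frag1", "frag3"]  # hypothetical fragment names
print(duplicated_fragments(loaded))
print(drop_duplicates(loaded))
```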

Displaying Contig Alignments

You can write the contig alignment to a file in your local directory with the : PRETTYout command (see "Command Mode Summary" for more information). You can write the Big Picture schematic (lower half of the Edit Screen) to a file in your local directory with the : BIGPICture command.

Analyzing Fragment Sequences with Other GCG Programs

Use the : SEQOUT command to write the fragment and consensus sequences to individual sequence files in your local directory (see "Command Mode Summary" for more information). You can then use other GCG programs to analyze the sequences in these files.

Selecting Another Contig to Edit

When you first run GelAssemble, you scroll through the list of contigs from which you select one to load into the GelAssemble Edit Screen. Once you move to the Edit Screen, you can use the : CONTIGs Command Mode command to return to this contig list, enabling you to select a new contig for editing. CAUTION -- Selecting a new contig for editing erases all previous work from the Edit Screen. Therefore, use the : WRite command to save the current contig before using the : CONTIGs command. However, you can restore the current Edit Screen after issuing the : CONTIGs command if you press <Ctrl>D while scrolling through the list of contigs.

Command Mode Summary

Here is a summary of Command Mode commands you see with the : Help command:


                                 COMMAND MODE

                     [a,b] specifies a range of fragments.
                       [x,y] specifies a range of bases.
                     [n] is an optional numeric parameter.

      EDit [ContigName]     replace current contig with a new contig
      CONTIGs               select another contig for editing
      WRite                 write a contig to the database
      EXit                  write the contig and quit
      QUIT                  quit without writing
      ERASE                 delete current contig from the database
      238                   move to position 238 in the current fragment
[x,y] PRETTYout [FileName]  write the sequence alignment [position x - y]
[a,b] SEQOUT                write fragments [a - b] to sequence files
      BIGPICture [FileName] write bar schematic to an output file
      OVERstrike            select OVERSTRIKE sequence edit mode
      NOOVERstrike          select INSERT sequence edit mode
[x,y] CONSensus             recalculate the consensus sequence
[a,b] LOCk                  lock strands [a through b]
[a,b] Unlock                unlock strands [a through b]
[x,y] SELect                select bases [x through y]
      REMove                remove the selected bases
[n]   INSert                insert the removed bases [at position n]
      CAncel                cancel the selection
[x,y] DElete                delete bases [x through y]
      GOTo [FragmentName]   move to strand by name
      FInd GAATC            find the next occurrence of GAATC
      DIfferences           show differences from the consensus
      MAtches               show matches with the consensus
      Neither               show neither matches nor differences
      REDraw                redraw the screen
      Help                  display these help screens
      SORt [DEScending]     sorts strands by their offsets in alignment
[a,b] MOve                  moves a strand [from line a to line b]
      OPen                  opens a blank line at the cursor position
[a,b] ANChor                anchors strands [a through b]
[a,b] NOANchor              unanchors strands [a through b]
      LOad [ContigName]     loads another contig into the Edit Screen
      REVerse               reverse-complement the (anchored) strand(s)
[n]   Offset                shifts the current fragment [to begin at n]
      REJect                removes the current fragment from the screen
      NODUPlicate           removes a duplicated fragment from the screen
      SPAWN                 renames a duplicated fragment
      SEParate              makes two contigs from anchored and
                                 unanchored strands

For all commands requiring row numbers, the consensus strand is row 1.

: EDit [ContigName]

loads the aligned fragment sequences from the specified contig into the Edit Screen, replacing any fragment sequences currently on your screen. All unsaved changes to the replaced contig alignment and fragment sequences are lost. Use this command if you know the name of the new contig you wish to edit; otherwise scroll through the list of available contigs displayed with the : CONTIGs command.

: CONTIGs

returns you to the list of contigs, letting you select a new contig for editing. All modifications to the contig currently in the Edit Screen are lost if you haven't written them to the sequencing project database before selecting a new contig. If you haven't yet selected a new contig, use <Ctrl>D to return to the contig currently in the Edit Screen.

: WRite

writes the contig, including the alignment and all fragment sequences currently loaded into the Edit Screen, to the sequencing project database. All edits made to the contig alignment and the individual fragment sequences are preserved. The contig is named after the left-most fragment in the contig.

Note: The contig consensus sequence is not automatically recalculated before writing the contig.

: EXit

writes the contig, including the alignment and all fragment sequences currently loaded into the Edit Screen, to the sequencing project database and then exits GelAssemble. All changes made to the alignment and the individual fragment sequences are preserved. The contig is named after the left-most fragment in the contig.

Note: The contig consensus sequence is not automatically recalculated before writing the contig.

: QUIT

exits GelAssemble without saving any of the changes made to the contig loaded in the Edit Screen. All changes made to the alignment and the individual fragment sequences since the last time the contig was written are lost.

: ERASE

If the fragment on which the cursor is positioned is a single-fragment contig, this command deletes it from the sequencing project database. You are prompted to confirm that you wish to delete the fragment. You cannot delete a fragment if any other fragments are loaded in the Edit Screen or if any fragments have been rejected from the Edit Screen. CAUTION -- If you use the : ERASE command, all copies of the deleted fragment are removed from the database, including the archival copy.

: [x,y] PRETTYout [filename]

writes the alignment of all the fragments in the Edit Screen between absolute positions x and y (inclusive) to an output file in your local directory. If any fragments are anchored, only the alignment of those fragments is written. If you do not specify beginning and ending positions in the alignment, the entire length of the alignment is written.

: [a,b] SEQOUT [*]

writes each fragment from row a to row b (inclusive) to a separate output file in your current working directory. You can use other GCG programs to analyze the sequences in these files. GelAssemble prompts you for a file name for each fragment unless you add an asterisk (*) after the : SEQOUT command. If you do not specify any row numbers, : SEQOUT writes out the current fragment. If you specify only one row number, : SEQOUT writes out all fragments between that row and the row on which the cursor is positioned. Remember, the consensus row is 1.
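The three ways of giving row numbers can be summarized with a small Python sketch. The function name is hypothetical, and the sketch assumes the rows are written in ascending order even when the single row given lies below the cursor row.

```python
def seqout_rows(cursor_row, a=None, b=None):
    """Return the rows : SEQOUT would write, as described above.

    No row numbers  -> the current fragment only.
    One row number  -> every row between that row and the cursor row.
    Two row numbers -> rows a through b, inclusive.
    """
    if a is None and b is None:
        return [cursor_row]
    if b is None:
        b = cursor_row
    low, high = sorted((a, b))
    return list(range(low, high + 1))


print(seqout_rows(4))         # current fragment only
print(seqout_rows(4, 2))      # from row 2 to the cursor row
print(seqout_rows(4, 2, 3))   # explicit range
```

Row 1 is the consensus, so a range that starts at 1 includes the consensus sequence.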

: BIGPICture [filename]

writes the bar schematic (the lower half of the Edit Screen) to a file in your local directory.

: OVERstrike

sets OverStrike edit mode. Any sequence symbol typed at the cursor position will replace the existing sequence symbol at that position.

: NOOVERstrike

sets Insert edit mode. Any sequence symbol typed at the cursor position will be inserted at that position, shifting the entire sequence from that position to the right by one column.

: [x,y] CONSensus

calculates the consensus sequence from position x through position y in the alignment, and replaces what is on row 1 with the new consensus. If you don't supply a range, the entire consensus is recalculated.

The consensus function is a simple measure of plurality. IUB nucleotide ambiguity symbols in the fragment sequences are treated as weighted representations of their constituent bases for the purpose of generating a consensus. For example, an R represents half A and half G. If there is no absolute plurality (that is, two or more bases are tied), then those bases tied for plurality are used to generate an IUB nucleotide ambiguity symbol for the consensus. If the most common symbol at a position is a gap character (. or ~), then the consensus contains a gap character at that position. If there is no unanimity among the loaded fragments at a particular column, then the displayed consensus symbol for that column is in lowercase.

: [a,b] LOCk

locks the fragments on rows a through b to prevent accidental modification. If you omit the row numbers, the current fragment is locked.

: [a,b] Unlock

unlocks the fragments on rows a through b, once again permitting modification. If you omit the row numbers, the current fragment is unlocked.

: [x,y] SELect

selects and highlights bases x through y of the current fragment. If you omit the range, the selected range begins at the cursor position and is extended using the <Left-arrow> and <Right-arrow> keys.

: REMove

deletes the selected range and copies it into a buffer. If the selection was in an anchored fragment, the corresponding bases in the other anchored fragments are not removed.

: [n] INSert

inserts a copy of the bases removed with the : REMove command at position n in the current fragment. If you omit the position, the bases are inserted at the cursor. If the current fragment is anchored, the corresponding bases are not inserted in the other anchored fragments.

: CAncel

cancels the : SELect command, unhighlighting the currently selected range.

: [x,y] DElete

deletes bases x through y from the current fragment. If you omit the range, GelAssemble prompts you for one. If the current fragment is anchored, the corresponding bases are not deleted from the other anchored fragments.

: GOTo [FragmentName]

moves the cursor to the row containing the specified fragment.

: FInd GTATTC

finds the next occurrence of GTATTC in the current fragment sequence.

: DIfferences [attribute]

highlights bases that differ from the consensus symbol at each position in the alignment. This is the default for GelAssemble. The default highlight is reverse-video. The optional attribute lets you specify these other kinds of highlighting: B = blinking, U = underlining, and D = bold.

: MAtches [attribute]

highlights bases that match the consensus symbol at each position in the alignment. The default highlight is reverse-video. The optional attribute lets you specify other kinds of highlighting: B = blinking, U = underlining, and D = bold.

: Neither

turns off all highlighting of bases that match or differ from the consensus symbol at each position in the alignment.

: REDraw

redraws your terminal screen. This is useful if line noise between your terminal and the computer has garbled the display or if a system message appears on your screen.

: Help

displays the commands available to the Screen and Command Modes of GelAssemble.

: SORt [DEScending]

sorts the loaded fragments on the screen in order of increasing offset (beginning position of the fragment in the contig alignment), beginning with the left-most fragment on row 2 of the sequence alignment. If you specify DEScending, the sort is in order of decreasing offset, beginning with the right-most fragment on row 2 of the sequence alignment.
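
As a rough illustration of the resulting row order, the following Python sketch assigns sorted fragments to rows starting at row 2. The function name `sort_rows` and the representation of a fragment as a (name, offset) pair are assumptions for this example. The order of fragments that share the same offset is not described here; the sketch simply keeps their original order.

```python
def sort_rows(fragments, descending=False):
    """Map screen rows (starting at row 2) to fragment names.

    fragments is a list of (name, offset) pairs; hypothetical representation.
    """
    ordered = sorted(fragments, key=lambda frag: frag[1], reverse=descending)
    # Row 1 holds the consensus, so the first fragment lands on row 2.
    return {row: name for row, (name, _offset) in enumerate(ordered, start=2)}


loaded = [("frag1", 40), ("frag2", 0), ("frag3", 12)]
print(sort_rows(loaded))
print(sort_rows(loaded, descending=True))
```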

: [a,b] MOve

moves a fragment from row a to row b. If no row numbers are supplied, the fragment on which the cursor is positioned is moved to the next vacant row.

: OPen

creates a vacant row at the cursor by pushing all fragment sequences up one row, including the fragment on which the cursor is positioned.

: [a,b] ANChor

anchors rows a through b. If you omit the row numbers, the current fragment is anchored. (See "Anchored, Locked, and Modified Fragments" for more information.)

: [a,b] NOANchor

unanchors rows a through b. If you omit the row numbers, the current fragment is unanchored. (See "Anchored, Locked, and Modified Fragments" for more information.)

: LOad [ContigName]

loads the aligned fragment sequences in the specified contig onto the lowest empty rows of the Edit Screen. The left-most sequence in the contig is loaded beginning at the cursor position, and each remaining fragment is loaded so that its alignment within its contig is retained on the screen. The loaded fragments are all anchored, and all fragments already in the Edit Screen are unanchored.

This command is used to assemble contigs manually that do not share sufficient overlap to be assembled automatically by GelMerge.

: REVerse

reverse complements the current fragment. If the current fragment is anchored, the entire anchored group is also reversed.

This command is used most often during manual contig assembly to align fragments entered into the Edit Screen with the : LOad command.
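
Reverse complementing can be sketched in Python as shown below. This is an illustration only, not GelAssemble's code. The function names and the (offset, sequence) representation of a fragment are hypothetical. How GelAssemble repositions an anchored group after reversal is not described here, so `reverse_group` assumes the group is mirrored within the span it already occupies, which preserves the relative alignment of its fragments.

```python
# Complement of every IUB nucleotide code; gap symbols are left unchanged.
COMPLEMENT = str.maketrans(
    "ACGTRYMKSWHBVDNacgtrymkswhbvdn",
    "TGCAYRKMSWDVBHNtgcayrkmswdvbhn",
)


def reverse_complement(seq):
    """Complement each symbol, then reverse the order of the symbols."""
    return seq.translate(COMPLEMENT)[::-1]


def reverse_group(fragments):
    """Reverse complement a group of (offset, sequence) pairs.

    Assumption: the group is mirrored within the span it already occupies.
    """
    start = min(off for off, _ in fragments)
    end = max(off + len(seq) for off, seq in fragments)
    return [
        (start + end - (off + len(seq)), reverse_complement(seq))
        for off, seq in fragments
    ]


print(reverse_complement("AC.R~"))
print(reverse_group([(0, "AAGG"), (2, "GGTTC")]))
```

In the second call, the two fragments overlap on GG before reversal and on CC afterward, so the overlap is retained.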

: [n] Offset

shifts the current fragment (fragment on which the cursor is positioned) so that its left end is at position n in the sequence alignment. If you omit the position, the fragment is shifted so that its left end begins at the cursor.

This command is used during manual contig assembly to position fragments entered into the Edit Screen with the : LOad command.

: REJect

removes the fragment on which your cursor is positioned from the screen and stores it in a buffer. If the remaining fragments on the screen are written to the sequencing project database as an aligned contig, each rejected fragment is written to the database as a separate contig.

This command is used during manual contig assembly to remove individual fragments from an existing contig.

: NODUPlicate

removes the fragment on which your cursor is positioned from the screen if that fragment also occurs on another line in the current Edit Screen.

This command is used during manual contig assembly to remove redundant occurrences of fragments from the Edit Screen before saving a contig to the sequencing project database.

: SPAWN [NewContigName]

If a fragment is duplicated on another row of the Edit Screen, this command renames the fragment sequence on which the cursor is positioned and stores it in the database as a new fragment. You can then write the duplicated fragments into the same contig since they have different names.

This command is used during manual contig assembly to allow multiple copies of individual fragment sequences to exist in a sequencing project database.

: SEParate

divides the fragment sequences loaded into the Edit Screen into two separate contigs that are then written to the sequencing project database. One contig contains the fragment on which the cursor is positioned and all fragments anchored to that fragment. After they are stored in the project database, the fragments in this contig are removed from the Edit Screen. The remaining fragment sequences are written to the sequencing project database as the second contig. The consensus sequence stored with each contig is recalculated before writing. The consensus sequence displayed on row 1 of the sequence alignment, however, is not recalculated automatically; it is simply truncated to reflect the boundaries of the remaining contig.

This command is used during manual contig assembly to separate contigs created by GelMerge that you believe should not be aligned together.

RELATED PROGRAMS

GelStart begins a fragment assembly session by creating a new fragment assembly project or by identifying an existing project. GelEnter adds fragment sequences to a fragment assembly project. It accepts sequence data from your terminal keyboard, a digitizer, or existing sequence files. GelMerge aligns the sequences in a fragment assembly project into assemblies called contigs. You can view and edit these assemblies in GelAssemble. GelAssemble is a multiple sequence editor for viewing and editing contigs assembled by GelMerge. GelView displays the structure of the contigs in a fragment assembly project. GelDisassemble breaks up the contigs in a fragment assembly project into single fragments.

RESTRICTIONS

A contig may not contain more than 1,650 fragments and may not be longer than 200,000 bases. No single fragment may be longer than 2,500 bases.
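
If you prepare fragment data outside the Fragment Assembly System, a check like the following Python sketch can flag data that would exceed these limits. It is not part of GelAssemble. The function name `check_contig` and the (offset, sequence) representation are hypothetical, and the sketch assumes that lengths are counted in alignment columns, including gap symbols.

```python
MAX_FRAGMENTS = 1650          # fragments per contig
MAX_CONTIG_LENGTH = 200_000   # bases per contig
MAX_FRAGMENT_LENGTH = 2_500   # bases per fragment


def check_contig(fragments):
    """Return a list of limit violations for (offset, sequence) pairs."""
    problems = []
    if len(fragments) > MAX_FRAGMENTS:
        problems.append(f"{len(fragments)} fragments exceeds {MAX_FRAGMENTS}")
    for index, (_offset, seq) in enumerate(fragments):
        if len(seq) > MAX_FRAGMENT_LENGTH:
            problems.append(f"fragment {index} is {len(seq)} bases long")
    if fragments:
        # Assumption: contig length is the span from the left-most start
        # to the right-most end, measured in alignment columns.
        start = min(off for off, _ in fragments)
        end = max(off + len(seq) for off, seq in fragments)
        if end - start > MAX_CONTIG_LENGTH:
            problems.append(f"contig spans {end - start} bases")
    return problems


print(check_contig([(0, "ACGT"), (2, "GTAC")]))
```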

CONSIDERATIONS

Sequence Symbols

GelEnter accepts any valid GCG sequence character (see Appendix III). GelMerge and GelAssemble recognize all IUB nucleotide ambiguity codes (see Appendix III) and the period (.) and tilde (~) as gap symbols for the generation of consensus sequences. All other sequence characters are treated as non-nucleotide symbols in GelMerge and GelAssemble.

SUGGESTIONS

The number of fragments you can view at any one time and the number of bases from each sequence you can view at once are constrained by the number of lines and columns on your terminal screen. If your terminal can display more than 24 lines and 80 columns, GelAssemble uses the extra space to display additional fragment sequences and larger ranges from each sequence.

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.


Minimal Syntax: % gelassemble

Prompted Parameters: None

Local Data Files:

set.keys (must be in your current working directory to be used)

Optional Parameters:

-CONTIG=ContigName          loads the specified contig

ACKNOWLEDGEMENTS

GelAssemble is derived from the MSE (Multiple Sequence Editor) program written by Dr. William Gilbert at the Massachusetts Institute of Technology, to whom we are very grateful. The screen layout and vertical scrolling are his invention, as are the concepts of anchoring and locking strands. He also designed the mechanism for displaying matches and differences. MSE was converted into GelAssemble by Philip Delaquess, Lisa Caballero, and Irv Edelman. MSE evolved from SeqEd, which was developed by John Devereux and Paul Haeberli.

LOCAL DATA FILES

The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.

Customizing Your Keyboard With SetKeys

You can use the program SetKeys to create a set.keys file that tells the SeqEd, GelEnter, LineUp, GelAssemble, and SeqLab sequence editors how to interpret the letters you type at the terminal. When entering gel readings, it is useful to have the symbols for G, A, T, and C under the fingers of one hand in the same positions as the lanes in your gel. SeqEd, GelEnter, LineUp, GelAssemble, and the SeqLab sequence editor automatically read the file set.keys if it is present in your local directory. If set.keys is absent, or if the sequence type is set to Protein (in SeqEd and LineUp only), the terminal keys retain their conventional meanings.

If you have a set.keys file in your directory, SeqEd, GelEnter, LineUp, and GelAssemble respond only to the keys that it redefines. Any keys not mentioned in set.keys appear to be dead in these sequence editors. If some of the keys you want to use are not in set.keys, you can add them by editing the file with a text editor. In the SeqLab sequence editor, keys that are not redefined retain their normal meanings.

Several keys are vital for the control of SeqEd, LineUp, GelEnter, and GelAssemble; this means you are not allowed to redefine the keys for /, [, ], {, }, (, ), :, ,, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, <Ctrl>R, <Ctrl>D, <Ctrl>H, <Return>, and <Ctrl>E.

PARAMETER REFERENCE

You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-CONTIG=ContigName

loads a contig directly into the Edit Screen.

Printed: January 9, 2002 13:45 (1162)

Technical Support: support-us@accelrys.com
or support-eu@accelrys.com

Copyright (c) 1982-2002 Accelrys Inc. A subsidiary of Pharmacopeia, Inc. All rights reserved.

Licenses and Trademarks Wisconsin Package is a trademark and GCG and the GCG logo are registered trademarks of Accelrys Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Accelrys Inc.

www.accelrys.com/bio