ExtractFromSequenceFiles.pl - Extract data from sequence and alignment files
ExtractFromSequenceFiles.pl SequenceFile(s) AlignmentFile(s)...
ExtractFromSequenceFiles.pl [-h, --help] [-i, --IgnoreGaps yes | no] [-m, --mode SequenceID | SequenceNum | SequenceNumRange] [-o, --overwrite] [-r, --root rootname] [-s, --Sequences "SequenceID, [SequenceID,...]" | "SequenceNum, [SequenceNum,...]" | "StartingSeqNum, EndingSeqNum"] [--SequenceIDMatch Exact | Relaxed] [-w, --WorkingDir dirname] SequenceFile(s) AlignmentFile(s)...
Extract specific data from SequenceFile(s) and AlignmentFile(s) and generate FASTA files. You can extract sequences using sequence IDs or sequence numbers.
The file names are separated by spaces. All the sequence files in the current directory can be specified by *.aln, *.msf, *.fasta, *.fta, *.pir, or any other supported format; additionally, DirName corresponds to all the sequence files in the current directory with any of the supported file extensions: .aln, .msf, .fasta, .fta, and .pir.
Supported sequence formats are: ALN/ClustalW, GCG/MSF, PILEUP/MSF, Pearson/FASTA, and NBRF/PIR. Instead of using file extensions, file formats are detected by parsing the contents of SequenceFile(s) and AlignmentFile(s).
To remove gap columns, all the sequences must have the same length; otherwise, the -i, --IgnoreGaps option is ignored.
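To make the equal-length requirement concrete, here is a minimal awk sketch; it is not the script's own code. It assumes that a gap column is a column holding '-' in every sequence, which this page does not define. The sample rows and the names aligned, degapped, and all_gaps are made up. A column index refers to the same alignment position in every row only when the rows have equal length, so the sketch leaves the rows untouched otherwise.

```shell
# Sketch under an assumption: a "gap column" holds '-' in every sequence.
# The rows are made-up data, one aligned sequence per line.
aligned='AC-GT-
A--GTA
AC-G-A'
degapped=$(printf '%s\n' "$aligned" | awk '
    function all_gaps(col,    r) {      # 1 when every row has "-" at col
        for (r = 1; r <= NR; r++)
            if (substr(rows[r], col, 1) != "-") return 0
        return 1
    }
    { rows[NR] = $0; if (length($0) != length(rows[1])) unequal = 1 }
    END {
        for (r = 1; r <= NR; r++) {
            if (unequal) { print rows[r]; continue }   # lengths differ: no removal
            out = ""
            for (c = 1; c <= length(rows[1]); c++)
                if (!all_gaps(c)) out = out substr(rows[r], c, 1)
            print out
        }
    }')
printf '%s\n' "$degapped"
```

Only the third column is a gap in all three rows, so it is the only one dropped; gaps that appear in some rows but not others are kept.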
The sequence numbers correspond to the position of each sequence, starting from 1 for the first sequence in SequenceFile(s) and AlignmentFile(s).
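To see which number belongs to which sequence before choosing a value, a FASTA file can be numbered with a short awk one-liner. This is a sketch outside MayaChemTools that handles only FASTA-style '>' header lines; the two records and the variable names are made up.

```shell
# Sketch, not part of MayaChemTools: number FASTA records starting from 1.
fasta='>SeqA first made-up record
ACGT
>SeqB second made-up record
TTGA'
# Each ">" header starts a new record; print its number and its ID
# (the first word of the header, without the leading ">").
numbered=$(printf '%s\n' "$fasta" | awk '/^>/ { n++; print n, substr($1, 2) }')
printf '%s\n' "$numbered"
```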
For the SequenceID value of the -m, --mode option, the input value format is: SequenceID,.... Examples:
For the SequenceNum value of the -m, --mode option, the input value format is: SequenceNum,.... Examples:
For the SequenceNumRange value of the -m, --mode option, the input value format is: StartingSeqNum,EndingSeqNum. Examples:
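The three value formats can be put side by side as follows. This sketch only prints the option strings and does not run the script. The numeric values are arbitrary, the ID is the one used in the examples on this page, and the variable names are illustrative.

```shell
# Illustrative --Sequences values, one per --mode setting.
ids='Q9P993/104-387'   # SequenceID: a single ID; more IDs would be comma-separated
nums='1,5'             # SequenceNum: sequence numbers 1 and 5
range='2,4'            # SequenceNumRange: starting and ending sequence numbers
printf '%s\n' \
    "-m SequenceID -s \"$ids\"" \
    "-m SequenceNum -s \"$nums\"" \
    "-m SequenceNumRange -s \"$range\""
```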
To extract the first sequence from the Sample1.fasta sequence file and generate the Sample1SequenceNum.fasta sequence file, type:
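A command along the following lines should do this. It names the mode and the sequence number explicitly instead of relying on defaults, which this page does not list. The shell variables and the command -v guard, which skips the call when the script is not on PATH, are additions for illustration.

```shell
# Assumes ExtractFromSequenceFiles.pl is on PATH and Sample1.fasta is in the
# current directory; -o overwrites an existing output file.
mode=SequenceNum
sequences=1
if command -v ExtractFromSequenceFiles.pl >/dev/null 2>&1; then
    ExtractFromSequenceFiles.pl -m "$mode" -s "$sequences" -o Sample1.fasta
fi
```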
To extract the first sequence from the Sample1.aln alignment file and generate the Sample1SequenceNum.fasta sequence file without any gap columns, type:
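One possible invocation, assuming that -i yes is the setting that drops the gaps. All options are spelled out because this page does not list the defaults; the variables and the command -v guard are illustrative additions.

```shell
# Assumes the script is on PATH and Sample1.aln is in the current directory.
ignore_gaps=yes        # assumption: "yes" removes gaps from the output
mode=SequenceNum
sequences=1
if command -v ExtractFromSequenceFiles.pl >/dev/null 2>&1; then
    ExtractFromSequenceFiles.pl -i "$ignore_gaps" -m "$mode" -s "$sequences" -o Sample1.aln
fi
```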
To extract the first sequence from the Sample1.aln alignment file and generate the Sample1SequenceNum.fasta sequence file with gap columns retained, type:
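A sketch of the matching command, assuming that -i no is the setting that keeps the gaps in the output. The explicit mode and sequence number, the variables, and the command -v guard are illustrative additions rather than required syntax.

```shell
# Assumes the script is on PATH and Sample1.aln is in the current directory.
ignore_gaps=no         # assumption: "no" keeps gaps in the output
mode=SequenceNum
sequences=1
if command -v ExtractFromSequenceFiles.pl >/dev/null 2>&1; then
    ExtractFromSequenceFiles.pl -i "$ignore_gaps" -m "$mode" -s "$sequences" -o Sample1.aln
fi
```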
To extract sequence numbers 1 and 4 from the Sample1.fasta sequence file and generate the Sample1SequenceNum.fasta sequence file, type:
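Using the SequenceNum mode from the synopsis, the command could look like this. The variables and the command -v guard, which skips the call when the script is not installed, are illustrative additions.

```shell
# Assumes the script is on PATH and Sample1.fasta is in the current directory.
mode=SequenceNum
sequences="1,4"        # a list: sequence 1 and sequence 4
if command -v ExtractFromSequenceFiles.pl >/dev/null 2>&1; then
    ExtractFromSequenceFiles.pl -m "$mode" -s "$sequences" -o Sample1.fasta
fi
```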
To extract sequence numbers 1 through 4 from the Sample1.fasta sequence file and generate the Sample1SequenceNumRange.fasta sequence file, type:
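A hedged sketch of the range form: the --Sequences value is the same string "1,4", and only the mode changes how it is read. The variables and the command -v guard are illustrative additions.

```shell
# Assumes the script is on PATH and Sample1.fasta is in the current directory.
mode=SequenceNumRange
sequences="1,4"        # a range: starting number 1, ending number 4
if command -v ExtractFromSequenceFiles.pl >/dev/null 2>&1; then
    ExtractFromSequenceFiles.pl -m "$mode" -s "$sequences" -o Sample1.fasta
fi
```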
To extract the sequence with ID "Q9P993/104-387" from the Sample1.fasta sequence file and generate the Sample1SequenceID.fasta sequence file, type:
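With the SequenceID mode, the command might be written as below. The ID is quoted so that the shell passes it through as a single argument; the variables and the command -v guard are illustrative additions.

```shell
# Assumes the script is on PATH and Sample1.fasta is in the current directory.
mode=SequenceID
sequences='Q9P993/104-387'
if command -v ExtractFromSequenceFiles.pl >/dev/null 2>&1; then
    ExtractFromSequenceFiles.pl -m "$mode" -s "$sequences" -o Sample1.fasta
fi
```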
AnalyzeSequenceFilesData.pl, InfoSequenceFiles.pl
Copyright (C) 2004-2008 Manish Sud. All rights reserved.
This file is part of MayaChemTools.
MayaChemTools is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.