<tool id="quikr_train" name="Quikr Train">
	<description>Train Quikr Matrix</description>
	<command>quikr_train -f -v -k $kmer -i $input -o $output</command>
	<inputs>
		<param name="input" type="data" format="fasta" label="input fasta"/>
		<param name="kmer" type="integer" size="5" value="6" min="6" max="12" label="What k-mer size to use?" help="range 6 - 12"/>
	</inputs>
	<outputs>
		<data name="output" format="data"/>
	</outputs>
	<help>
**What it does**

This tool counts the length of each FASTA sequence in the file. The output file has two tab-separated columns per line: the FASTA title and the length of the sequence. The option *How many characters to keep?* keeps only the specified number of characters from the beginning of each FASTA title.

-----	

**Example**

Suppose you have the following FASTA formatted sequences from a Roche (454) FLX sequencing run::

    &gt;EYKX4VC02EQLO5 length=108 xy=1826_0455 region=2 run=R_2007_11_07_16_15_57_
    TCCGCGCCGAGCATGCCCATCTTGGATTCCGGCGCGATGACCATCGCCCGCTCCACCACG
    TTCGGCCGGCCCTTCTCGTCGAGGAATGACACCAGCGCTTCGCCCACG
    &gt;EYKX4VC02D4GS2 length=60 xy=1573_3972 region=2 run=R_2007_11_07_16_15_57_
    AATAAAACTAAATCAGCAAAGACTGGCAAATACTCACAGGCTTATACAATACAAATGTAA

Running this tool while setting **How many characters to keep?** to **14** will produce this::
	
	EYKX4VC02EQLO5	108
	EYKX4VC02D4GS2	60


	</help>
</tool>