author Calvin <calvin@EESI> 2013-03-13 16:12:08 -0400
committer Calvin <calvin@EESI> 2013-03-13 16:12:08 -0400
commit 1e59439b88e1649ecf2d1b58cff0836f01c1068c
tree dc1af0d5603abda1b835ffbb15b17d9c0534ed7f /doc
parent 7127badd2a1ba320eb96fea0d5543664d92912c4
added multi_fasta_otu to cli.markdown
Diffstat (limited to 'doc')
-rw-r--r-- doc/cli.markdown 34
1 files changed, 30 insertions, 4 deletions
diff --git a/doc/cli.markdown b/doc/cli.markdown
index 8d25337..b065240 100644
--- a/doc/cli.markdown
+++ b/doc/cli.markdown
@@ -5,7 +5,7 @@ module and the matlab implementation. The advantage of this is ease of scripting
and job management. These utilities are written in python and wrap the quikr
module.
-## quikr\_train ##
+## Quikr\_train ##
The quikr\_train is a tool to train a database for use with the quikr tool.
Before running the quikr utility, you need to generate the trained matrix or
@@ -15,13 +15,20 @@ download a pretrained matrix from our database\_download.html.
quikr\_train returns a custom trained matrix that can be used with the quikr
function. You must supply a kmer.
-quikr\_train's optional arguments:
+quikr\_train's arguments:
-i, --input, the database of sequences (fasta format)
-o, --output, the trained matrix (text file)
-k, --kmer, the kmer size (integer)
-z, --compress compress the output matrix with gzip (flag)
-## quikr ##
+### Example ###
+Here is an example of how to train a database. It uses the -z flag to compress
+the output matrix, since it can be very large. It takes gg94\_database.fasta
+as input and writes the trained matrix to gg94\_trained\_database.npy.gz.
+
+ quikr_train -i gg94_database.fasta -o gg94_trained_database.npy.gz -k 6 -z
+
+## Quikr ##
Quikr returns the estimated frequencies of bacteria present when given an
input FASTA file. A default trained matrix will be used if none is supplied.
You must supply a kmer and default lambda if using a custom trained matrix.
@@ -29,13 +36,32 @@ You must supply a kmer and default lambda if using a custom trained matrix.
### Usage ###
quikr returns the solution vector as a csv file.
-quikr's optional arguments:
+quikr's arguments:
-f, --fasta, the fasta file sample
-o, --output OUTPUT, the output path (csv output)
-t, --trained-matrix, the trained matrix
-l, --lamb, the lambda size. (the default lambda value is 10,000)
-k, --kmer, this specifies which kmer to use (default is 6)
+## Multifasta\_to\_otu ##
+The Multifasta\_to\_otu tool is a wrapper for quikr that lets the user
+input as many fasta files as they like and returns an OTU table of the
+number of times a specimen was seen across all of the samples.
+
+Warning: this program uses a large amount of memory and CPU time. You can
+reduce the number of cores used, and thus the memory, by passing the -j flag
+with the number of jobs to run. Otherwise python will run one job per CPU core.
+
+### Usage ###
+multifasta\_to\_otu's arguments:
+ -i, --input-directory, the directory containing fasta files
+ -o, --otu-table, the output OTU table
+ -t, --trained-matrix, the trained database to use
+ -f, --trained-fasta, the fasta file used to train your matrix
+ -d, --output-directory, quikr output directory
+ -l, --lamb, specify what lambda to use (the default value is 10,000)
+ -k, --kmer, specify which kmer to use (default value is 6)
+ -j, --jobs, specifies how many jobs to run at once (default is the number of CPUs)
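
Because these utilities are meant for scripting, the argument list above can be put together programmatically. The following is a minimal sketch, not part of the commit: it assumes the executable is installed as `multifasta_to_otu`, and the helper name `build_multifasta_to_otu_command` and every path are hypothetical. It only builds and prints the command; it does not run it.

```python
import shlex


def build_multifasta_to_otu_command(input_dir, otu_table, trained_matrix,
                                    trained_fasta, output_dir,
                                    lamb=10000, kmer=6, jobs=None):
    """Assemble an argument list using only the flags documented above."""
    cmd = [
        "multifasta_to_otu",  # assumed executable name
        "-i", input_dir,
        "-o", otu_table,
        "-t", trained_matrix,
        "-f", trained_fasta,
        "-d", output_dir,
        "-l", str(lamb),
        "-k", str(kmer),
    ]
    # Leaving out -j keeps the documented default of one job per CPU core.
    if jobs is not None:
        cmd += ["-j", str(jobs)]
    return cmd


# Hypothetical paths, for illustration only.
cmd = build_multifasta_to_otu_command(
    input_dir="samples/",
    otu_table="otu_table.txt",
    trained_matrix="gg94_trained_database.npy.gz",
    trained_fasta="gg94_database.fasta",
    output_dir="quikr_output/",
    jobs=2,
)
print(shlex.join(cmd))
# To run it for real: subprocess.run(cmd, check=True)
```

Passing `jobs=2` here limits the run to two concurrent jobs, which is the memory-saving measure the warning describes.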
# Troubleshooting #