added multi_fasta_otu to cli.markdown

author: Calvin <calvin@EESI> 2013-03-13 16:12:08 -0400
committer: Calvin <calvin@EESI> 2013-03-13 16:12:08 -0400
commit: 1e59439b88e1649ecf2d1b58cff0836f01c1068c (patch)
tree: dc1af0d5603abda1b835ffbb15b17d9c0534ed7f /doc
parent: 7127badd2a1ba320eb96fea0d5543664d92912c4 (diff)
1 files changed, 30 insertions, 4 deletions
diff --git a/doc/cli.markdown b/doc/cli.markdown
index 8d25337..b065240 100644
--- a/doc/cli.markdown
+++ b/doc/cli.markdown
@@ -5,7 +5,7 @@ module and the matlab implementation. The advantage of this is ease of scripting
 and job management. These utilities are written in python and wrap the quikr
 module.
 
-## quikr\_train ##
+## Quikr\_train ##
 
 The quikr\_train is a tool to train a database for use with the quikr tool.
 Before running the quikr utility, you need to generate the trained matrix or
@@ -15,13 +15,20 @@ download a pretrained matrix from our database\_download.html.
 quikr\_train returns a custom trained matrix that can be used with the quikr
 function. You must supply a kmer.
 
-quikr\_train's optional arguments:
+quikr\_train's arguments:
   -i, --input, the database of sequences (fasta format)
   -o, --output, the trained matrix (text file)
   -k, --kmer, the kmer size (integer)
   -z, --compress  compress the output matrix with gzip (flag)
 
-## quikr ##
+### Example ###
+Here is an example of how to train a database. It uses the -z flag to compress
+the output matrix, since it can be very large. It takes gg94\_database.fasta
+as input and writes the trained matrix to gg94\_trained\_database.npy.gz.
+
+    quikr_train -i gg94_database.fasta -o gg94_trained_database.npy.gz -k 6 -z 
+
+## Quikr ##
 Quikr returns the estimated frequencies of bacteria present when given an
 input FASTA file. A default trained matrix will be used if none is supplied.
 You must supply a kmer and default lambda if using a custom trained matrix.
@@ -29,13 +36,32 @@ You must supply a kmer and default lambda if using a custom trained matrix.
 ### Usage ###
 quikr returns the solution vector as a csv file.
 
-quikr's optional arguments:
+quikr's arguments:
   -f, --fasta, the fasta file sample
   -o, --output OUTPUT, the output path (csv output)
   -t, --trained-matrix, the trained matrix
   -l, --lamb, the lambda size. (the default lambda value is 10,000)
   -k, --kmer, this specifies which kmer to use (default is 6)
 
+## Multifasta\_to\_otu ##
+The Multifasta\_to\_otu tool is a handy wrapper for quikr which lets the user
+input as many fasta files as they like, and then returns an OTU table of the
+number of times a specimen was seen in all of the samples.
+
+Warning: this program will use a large amount of memory and CPU time. You can
+reduce the number of cores used, and thus memory, by specifying the -j flag
+with a specified number of jobs. Otherwise python will run one job per CPU core.
+
+### Usage ###
+multifasta\_to\_otu's arguments:
+  -i, --input-directory, the directory containing fasta files
+  -o, --otu-table, the output OTU table
+  -t, --trained-matrix, the trained database to use
+  -f, --trained-fasta, the fasta file used to train your matrix
+  -d, --output-directory, quikr output directory
+  -l, --lamb, specify what lambda to use (the default value is 10,000)
+  -k, --kmer, specify which kmer to use, (default value is 6)
+  -j, --jobs, specifies how many jobs to run at once, (default=number of CPUs)
 
 # Troubleshooting #
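
The patch documents the multifasta\_to\_otu arguments but, unlike quikr\_train, gives no example invocation. As a rough sketch only, the Python snippet below assembles a command line from the flags listed in the patch. The helper name `build_multifasta_to_otu_cmd` and every path are hypothetical placeholders, and the defaults (lambda 10,000, kmer 6) are simply copied from the argument list above; nothing here is taken from the tool's source.

```python
import shlex


def build_multifasta_to_otu_cmd(input_dir, otu_table, trained_matrix,
                                trained_fasta, output_dir,
                                lamb=10000, kmer=6, jobs=None):
    """Build an argv list for multifasta_to_otu (hypothetical helper).

    Only the flags come from the documentation in this patch; the
    function itself is an illustration, not part of quikr.
    """
    cmd = [
        "multifasta_to_otu",
        "-i", input_dir,        # directory containing fasta files
        "-o", otu_table,        # output OTU table
        "-t", trained_matrix,   # trained database to use
        "-f", trained_fasta,    # fasta file used to train the matrix
        "-d", output_dir,       # quikr output directory
        "-l", str(lamb),        # lambda (documented default: 10,000)
        "-k", str(kmer),        # kmer size (documented default: 6)
    ]
    # Omitting -j leaves the documented default of one job per CPU core.
    if jobs is not None:
        cmd += ["-j", str(jobs)]
    return cmd


# Placeholder paths, chosen only for illustration.
cmd = build_multifasta_to_otu_cmd(
    "samples", "otu_table.txt", "trained_matrix.npy",
    "gg94_database.fasta", "quikr_output", jobs=2,
)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Limiting `jobs` to a small number follows the warning in the patch about memory use. Assuming the tool is installed and on the PATH, the list could be passed to `subprocess.run(cmd, check=True)` to actually run it.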