Diffstat (limited to 'doc/cli.markdown')
-rw-r--r-- doc/cli.markdown 61
1 files changed, 19 insertions, 42 deletions
diff --git a/doc/cli.markdown b/doc/cli.markdown
index 843bae9..214c3b7 100644
--- a/doc/cli.markdown
+++ b/doc/cli.markdown
@@ -1,39 +1,36 @@
# Quikr Command Line Utilities #
-
Quikr has three command-line utilities that mirror the behavior of the python
module and the matlab implementation. The advantage of this is ease of scripting
-and job management. These utilities are written in python and wrap the quikr
-module.
+and job management, as well as faster processing and lower memory usage. These
+utilities are written in C and utilize OpenMP for multithreading.
## Quikr\_train ##
-
The quikr\_train is a tool to train a database for use with the quikr tool.
-Before running the quikr utility, you need to generate the sensing matrix or
+Before running the quikr utility, you need to generate the sensing matrix or
download a pretrained matrix from our database\_download.html.
### Usage ###
-quikr\_train returns a custom trained matrix that can be used with the quikr
-function. You must supply a kmer.
+quikr\_train returns a custom sensing matrix that can be used with the quikr
+function.
quikr\_train's arguments:
-i, --input, the database of sequences (fasta format)
- -o, --output, the trained matrix (text file)
- -k, --kmer, the kmer size, the default is 6 (integer)
- -z, --compress compress the output matrix with gzip (flag)
+ -o, --output, the sensing matrix (text file)
+ -k, --kmer, specify what size of kmer to use (default value is 6)
+ -v, --verbose, verbose mode.
### Example ###
Here is an example on how to train a database. This uses the -z flag to compress
the output matrix since it can be very large. Because of the sparse nature of
the database, the matrix easily achieves a high compression ratio, even with
-gzip. It takes the gg94\_database.fasta as an input and outputs the trained
-matrix as gg94\_trained\_databse.npy.gz
+gzip. It takes the gg94\_database.fasta as an input and outputs the sensing
+matrix as gg94\_sensing\_database.matrix.gz
- quikr_train -i gg94_database.fasta -o gg94_trained_database.npy.gz -k 6 -z
+ quikr_train -i gg94_database.fasta -o gg94_sensing_database.matrix.gz -k 6
## Quikr ##
Quikr returns the estimated frequencies of bacteria present when given an
-input FASTA file. A default trained matrix will be used if none is supplied
-You must supply a kmer and default lambda if using a custom trained matrix.
+input FASTA file. You need to train a matrix or download a pretrained matrix.
### Usage ###
quikr returns the solution vector as a csv file.
@@ -42,8 +39,8 @@ quikr's arguments:
-f, --fasta, the sample's fasta file of NGS READS
-o, --output OTU\_FRACTION\_PRESENT, a vector representing the percentage of
database sequence's presence in sample (csv output)
- -t, --trained-matrix, the trained matrix
- -l, --lamb, the lambda size. (the default lambda value is 10,000)
+ -s, --sensing-matrix the sensing matrix. (generated by quikr\_train)
+ -l, --lambda, the lambda size. (the default lambda value is 10,000)
-k, --kmer, this specifies the size of the kmer to use (default is 6)
## Multifasta\_to\_otu ##
@@ -66,14 +63,14 @@ with aspecified number of jobs. Otherwise python with run one job per cpu core.
### Usage ###
multifasta\_to\_otu's arguments:
- -i, --input-directory, the directory containing the samples' fasta files of
+ -i, --input, the directory containing the samples' fasta files of
reads (note each fasta file should correspond to a separate sample)
-o, --otu-table, the OTU table, with OTU\_FRACTION\_PRESENT for each sample,
which is compatible with QIIME's convert\_biom.py (or sequence table if not
OTU's)
- -t, --trained-matrix, the trained matrix
- -f, --trained-fasta, the fasta file database of sequences
- -l, --lamb, specify what size of lambda to use (the default value is 10,000)
+ -s, --sensing-matrix, the sensing matrix
+ -f, --sensing-fasta, the fasta file database of sequences
+ -l, --lambda, specify what size of lambda to use (the default value is 10,000)
-k, --kmer, specify what size of kmer to use, (default value is 6)
-j, --jobs, specifies how many jobs to run at once, (default=number of CPUs)
@@ -98,12 +95,6 @@ The QIIME procedue:
principal_coordinates.py -i beta_div/weighted_unifrac_<quikr_otu>.txt -o <quikr_otu_project_name>_weighted.txt
make_3d_plots.py -i <quikr_otu_project_name>_weighted.txt -o <3d_pcoa_plotdirectory> -m <qiime_metadata_file>
-
-# Python Quikr Troubleshooting #
-
-If you are having trouble, and these solutions don't work. Please contact the
-developers with questions and issues.
-
#### Broken Pipe Errors ####
Make sure that you have the count-kmers and probabilities-by-read in your
$PATH, and that they are executable.
@@ -111,19 +102,5 @@ $PATH, and that they are executable.
If you have not installed quikr system-wide, you'll need to add the folder
location of these binaries in the terminal before running the command:
+ mv /path/to/quikr/src/nbc/count /path/to/quikr/src/nbc/count-kmers
export PATH=$PATH:/path/to/quikr/src/nbc/
-
-Make sure that the binaries are executable by running:
-
- chmod +x probabilities-by-read
- chmod +x count-kmers
-
-#### Python Cannot Find XYZ ####
-
-Ensure that you have Python 2.7, Scipy, Numpy, and BIOpython installed
-and that python is setup correctly. You should be able to do this from a python
-prompt without any errors:
- >>> import numpy
- >>> import scipy
- >>> from Bio import SeqIO
-
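
Review note on the Broken Pipe Errors hunk: a shell assignment takes no spaces around `=`, so the PATH step in the document is easy to get wrong when copied. The sketch below is one possible way to add the directory and confirm that both helper binaries resolve. The `QUIKR_NBC_DIR` variable and the `check_tools` function are hypothetical names introduced for illustration, and `/path/to/quikr/src/nbc` is the placeholder path from the document, not a real location.

```shell
#!/bin/sh
# Hypothetical variable: replace the placeholder with the real quikr checkout path.
QUIKR_NBC_DIR="${QUIKR_NBC_DIR:-/path/to/quikr/src/nbc}"

# Append the directory to PATH (no spaces around '=').
PATH="$PATH:$QUIKR_NBC_DIR"
export PATH

# check_tools (hypothetical helper): report each named tool that cannot be
# resolved as a command on PATH. Returns 0 if all were found, 1 otherwise.
check_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            status=1
        fi
    done
    return "$status"
}

# Tool names are the ones the troubleshooting section mentions.
check_tools count-kmers probabilities-by-read ||
    echo "fix PATH, or check that the binaries exist and are executable"
```

If both names resolve, the script prints nothing. Otherwise it lists what is missing, which is usually quicker to act on than a broken pipe error from the tool itself.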