path: root/kmer_utils.c
Age  Commit message  Author
2014-03-12  reverse complement  Calvin Morrison
2014-03-06  update readme, add a check_null_ptr function to clear up clutter  Calvin Morrison
2014-03-06  add kmer_continuous_count  Calvin Morrison
    This tool counts continuously instead of line by line. For example, given test.fa:
        > header 1
        AAAAATTTTT
        > header 2
        GGGGGAAAAA
    when counting 6-mers, the program will count TTTGGG, TTGGGG, TGGGGG as if there were no header separating them. This can be useful for certain types of processing, such as when the sequences are continuous pieces of a genome. initial commit
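    The continuous counting described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in kmer_utils.c: the names base_to_code and count_continuous, the A=0, C=1, G=2, T=3 encoding, and the decision to restart the window at any non-ACGT character are all assumptions made for the example.

```c
/* Hypothetical helper (not from kmer_utils.c): map a base to a 2-bit code,
 * or return -1 for anything that is not A, C, G or T. */
static int base_to_code(char c)
{
    switch (c | 0x20) {          /* fold ASCII letters to lowercase */
    case 'a': return 0;
    case 'c': return 1;
    case 'g': return 2;
    case 't': return 3;
    default:  return -1;
    }
}

/* Count every k-mer in a FASTA-formatted, NUL-terminated buffer as if the
 * header lines and line breaks were not there.
 * Assumptions: 1 <= k <= 31, and counts points at 4^k zeroed entries. */
static void count_continuous(const char *fasta, unsigned k,
                             unsigned long long *counts)
{
    const unsigned long long mask = (1ULL << (2 * k)) - 1;
    unsigned long long index = 0;  /* rolling base-4 value of the window */
    unsigned filled = 0;           /* valid bases currently in the window */
    int in_header = 0;

    for (const char *p = fasta; *p != '\0'; p++) {
        if (in_header) {           /* skip header text up to end of line */
            if (*p == '\n')
                in_header = 0;
            continue;
        }
        if (*p == '>') {           /* a header starts; the window is kept */
            in_header = 1;
            continue;
        }
        if (*p == '\n' || *p == '\r')
            continue;              /* line breaks do not split k-mers */

        int code = base_to_code(*p);
        if (code < 0) {            /* e.g. N: restart the window */
            filled = 0;
            index = 0;
            continue;
        }
        index = ((index << 2) | (unsigned long long)code) & mask;
        if (filled < k)
            filled++;
        if (filled == k)
            counts[index]++;
    }
}
```

    With the test.fa contents above and k = 6, the two 10-base records are treated as one 20-base sequence, so 15 windows are counted, including TTTGGG, TTGGGG and TGGGGG, which span the header.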
2014-02-24  add more verbose error messages and add more memory checks  Calvin Morrison
2014-02-04  don't inline hint, breaks clang  Calvin Morrison
2014-02-02  add a helper function for python  Calvin Morrison
2014-02-02  add comment  Calvin Morrison
2014-02-01  remove str alloc and replace inside s for strstr  Calvin Morrison
2014-02-01  use proper types, fix warnings, declare vars at top of section, take FILE instead of char  Calvin Morrison
2014-02-01  fix spacing  Calvin Morrison
2014-02-01  update types of functions, remove nonexistent include and use size_t for strnstrip  Calvin Morrison
2014-01-30  kmer_count_per_sequence: add option to load specific mers from file, add multiline counting  Calvin Morrison
2014-01-07  fix kmer_counts_per_sequence, make sure we convert the array fully, and update kmer_utils for str[i] == 5 instead of >> 2  Calvin Morrison
2013-11-24  performance boost from skipping our first newline. It seems crazy, but this could be up to a 10-15% improvement because of our strstrip function. Each time, we were copying the entire array even if we didn't need to. The benefit will be larger on a sequence file with single-line sequences, but all files will see a speedup  Calvin Morrison
2013-11-23  fix labels, fix spelling of position  Calvin Morrison
2013-11-23  better allocation of memory, make sure to free other memory  Calvin Morrison
2013-11-15  Merge branch 'master' of github.com:mutantturkey/dna-utils  Calvin Morrison
2013-11-15  instead of bitshift, use an equality operator  Calvin Morrison
2013-11-11  fix memleak  Calvin Morrison
2013-11-11  index to kmer function  Calvin Morrison
2013-10-17  update kmer utils  Calvin Morrison
2013-10-16  added new functions  Calvin Morrison
2013-10-04  no more branching  Calvin Morrison
2013-10-02  use an external iterator so that we can skip over anything in range of an error  Calvin Morrison
2013-10-02  remove unused headers  Calvin Morrison
2013-10-01  update headers, use const for better performance (.500ms on ~2gb file), update comments for functions  Calvin Morrison
2013-09-28  idea  Calvin Morrison
2013-09-14  improve performance of convert_kmer_to_index by using a bitwise OR to convert our characters to lowercase: str[i] | 0x20, and reduce the number of switches as a result  Calvin Morrison
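    The lowercase trick mentioned in this commit can be illustrated with a short sketch. In ASCII, an uppercase letter and its lowercase form differ only in bit 0x20, so OR-ing with 0x20 lets a switch handle four cases instead of eight. The function below is a hypothetical stand-in, not the actual convert_kmer_to_index: its name, signature, return type, and the A=0, C=1, G=2, T=3 ordering are assumptions.

```c
#include <stddef.h>

/* Hypothetical sketch: return the base-4 index of the first k characters
 * of str, or -1 if any of them is not A, C, G or T (in either case).
 * Assumes k is small enough that 4^k fits in a long long. */
static long long kmer_to_index_sketch(const char *str, size_t k)
{
    long long index = 0;

    for (size_t i = 0; i < k; i++) {
        int code;

        /* 'A' (0x41) | 0x20 == 'a' (0x61), so one case covers both. */
        switch (str[i] | 0x20) {
        case 'a': code = 0; break;
        case 'c': code = 1; break;
        case 'g': code = 2; break;
        case 't': code = 3; break;
        default:  return -1;   /* also stops at an early NUL terminator */
        }
        index = index * 4 + code;
    }
    return index;
}
```

    Under these assumptions, "ACGT" and "acgt" both map to the same index, and a k-mer containing N is rejected.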
2013-09-12  don't use strtol  Calvin Morrison
2013-09-11  update convert_kmer_to_index for brevity and clarity  Calvin Morrison
2013-09-11  add headers  Calvin Morrison
2013-09-10  Initial commit of some kmer utilities.  Calvin Morrison
    There are two utilities included. One is kmer_frequency_per_sequence, which outputs an (m x n) matrix where each row m is a sequence and each column n holds the frequency of that n-mer in the given sequence. The other tool is kmer_total_count, which counts kmers for the total file, not just one sequence.
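    The difference between the two outputs can be sketched as follows: a per-sequence matrix with one row per sequence, and a total that is the column-wise sum of those rows. This is only an illustration under stated assumptions. The names base_code, count_row and count_matrix_and_total are hypothetical, the sequences are passed as in-memory strings rather than read from a file, and the A=0, C=1, G=2, T=3 ordering is assumed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 2-bit code for a base, -1 for anything else. */
static int base_code(char c)
{
    switch (c | 0x20) {
    case 'a': return 0;
    case 'c': return 1;
    case 'g': return 2;
    case 't': return 3;
    default:  return -1;
    }
}

/* Fill one matrix row: counts of every k-mer inside a single sequence.
 * row must point at 4^k zeroed entries. */
static void count_row(const char *seq, size_t k, unsigned long *row)
{
    size_t len = strlen(seq);

    for (size_t start = 0; start + k <= len; start++) {
        size_t index = 0;
        size_t i;

        for (i = 0; i < k; i++) {
            int code = base_code(seq[start + i]);
            if (code < 0)
                break;             /* window contains a non-ACGT character */
            index = index * 4 + (size_t)code;
        }
        if (i == k)
            row[index]++;
    }
}

/* matrix is m rows by width columns (width == 4^k), stored row-major and
 * zeroed by the caller; total has width zeroed entries. */
static void count_matrix_and_total(const char *const *seqs, size_t m,
                                   size_t k, size_t width,
                                   unsigned long *matrix,
                                   unsigned long *total)
{
    for (size_t s = 0; s < m; s++) {
        count_row(seqs[s], k, matrix + s * width);
        for (size_t j = 0; j < width; j++)
            total[j] += matrix[s * width + j];
    }
}
```

    For the sequences AAAA and ACGT with k = 2, the first row counts AA three times, the second row counts AC, CG and GT once each, and the total row holds the sum of both.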