dna-utils.git - Tools for parsing FASTA files very fast

Age	Commit message	Author
2014-04-14	add some testing scripts I have had lying around, some FASTA files, and ↵	Calvin Morrison
	make kmerlocations work
2014-04-09	MERGE sparse trunk into master	Calvin Morrison

2014-03-12	reverse complement	Calvin Morrison

2014-03-06	update readme, add a check_null_ptr function to clear up clutter	Calvin Morrison

2014-03-06	add kmer_continuous_count	Calvin Morrison
	This tool counts continuously instead of line by line. For example, given test.fa:
	
	    >header 1
	    AAAAATTTTT
	    >header 2
	    GGGGGAAAAA
	
	when counting 6-mers, the program also counts k-mers such as TTTGGG, TTGGGG, and TGGGGG, as if there were no header separating the sequences. This can be useful for certain types of processing, such as when the sequences are contiguous within a genome. Initial commit.
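
The continuous counting described in this commit can be illustrated with a minimal sketch. It is not the repository's kmer_continuous_count: the function name, the in-memory input buffer, and the limit on k are assumptions made for the example. Header lines and line breaks are skipped without resetting the sliding window, so k-mers that span two records are counted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not the repository's kmer_continuous_count.
 * Counts k-mers over FASTA text held in memory. Header lines and line
 * breaks are skipped without resetting the window, so k-mers may span
 * two records. counts must hold 4^k zero-initialized entries, indexed
 * in base 4 with a=0, c=1, g=2, t=3. Requires 1 <= k <= 15. */
void count_kmers_continuous_sketch(const char *fasta, unsigned k,
                                   unsigned long *counts)
{
    unsigned long mask, digit, index = 0; /* index: last k bases, base 4 */
    unsigned run = 0;                     /* valid bases in the window */
    int line_start = 1;
    const char *p;

    assert(fasta != NULL && counts != NULL && k >= 1 && k <= 15);
    mask = (1UL << (2 * k)) - 1;

    for (p = fasta; *p != '\0'; p++) {
        if (*p == '\n') { line_start = 1; continue; }
        if (*p == '\r') continue;
        if (line_start && *p == '>') {
            /* Skip the rest of the header line; keep the window intact. */
            while (p[1] != '\0' && p[1] != '\n')
                p++;
            continue;
        }
        line_start = 0;

        switch (*p | 0x20) {              /* fold ASCII case */
        case 'a': digit = 0; break;
        case 'c': digit = 1; break;
        case 'g': digit = 2; break;
        case 't': digit = 3; break;
        default:                          /* e.g. N: restart the window */
            index = 0;
            run = 0;
            continue;
        }

        index = ((index << 2) | digit) & mask;
        if (run < k)
            run++;
        if (run == k)
            counts[index]++;
    }
}
```

Calling this sketch with k = 6 on the test.fa contents from the commit message counts each of the boundary-spanning 6-mers once, in addition to the 6-mers that lie inside a single record.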
2014-02-24	add more verbose error messages and add more memory checks	Calvin Morrison

2014-02-04	don't inline hint, breaks clang	Calvin Morrison

2014-02-02	add a helper function for python	Calvin Morrison

2014-02-02	add comment	Calvin Morrison

2014-02-01	remove str alloc and replace inside s for strstr	Calvin Morrison

2014-02-01	use proper types, fix warnings, declare vars at top of section, take FILE ↵	Calvin Morrison
	instead of char
2014-02-01	fix spacing	Calvin Morrison

2014-02-01	update types of functions, remove nonexistent include and use size_t for ↵	Calvin Morrison
	strnstrip
2014-01-30	kmer_count_per_sequence: add option to load specific mers from file, add ↵	Calvin Morrison
	multiline counting
2014-01-07	fix kmer_counts_per_sequence, make sure we convert the array fully, and ↵	Calvin Morrison
	update kmer_utils for str[i] == 5 instead of >> 2
2013-11-24	performance boost from skipping our first newline. It seems crazy, but this ↵	Calvin Morrison
	could be up to a 10-15% improvement because of our strstrip function. Each time, we were copying the entire array even if we didn't need to. Files with single-line sequences will benefit the most, but all files should see a speedup.
2013-11-23	fix labels, fix spelling of position	Calvin Morrison

2013-11-23	better allocation of memory, make sure to free other memory	Calvin Morrison

2013-11-15	Merge branch 'master' of github.com:mutantturkey/dna-utils	Calvin Morrison

2013-11-15	instead of bitshift, use an equality operator	Calvin Morrison

2013-11-11	fix memleak	Calvin Morrison

2013-11-11	index to kmerfunction	Calvin Morrison

2013-10-17	update kmer utils	Calvin Morrison

2013-10-16	added new functions	Calvin Morrison

2013-10-04	no more branching	Calvin Morrison

2013-10-02	use an external iterator so that we can skip over anything in range of an error	Calvin Morrison

2013-10-02	remove unused headers	Calvin Morrison

2013-10-01	update headers, use const for better performance (.500ms on ~2gb file), ↵	Calvin Morrison
	update comments for functions
2013-09-28	idea	Calvin Morrison

2013-09-14	improve performance of convert_kmer_to_index by using a bitwise OR to ↵	Calvin Morrison
	convert our characters to lowercase: str[i] | 0x20, and reduce the number of switch cases as a result
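
The case-folding trick mentioned in this commit can be sketched as below. This is an illustration only, not the repository's convert_kmer_to_index: the function name, return type, and error value are assumptions. In ASCII, OR-ing a letter with 0x20 sets the lowercase bit, so 'A' and 'a' both become 'a' and the switch needs four cases instead of eight.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not the repository's convert_kmer_to_index.
 * Maps a k-mer of length k to a base-4 index (a=0, c=1, g=2, t=3).
 * Returns -1 if any character is not A, C, G, or T in either case. */
long long kmer_to_index_sketch(const char *kmer, size_t k)
{
    long long index = 0;
    size_t i;

    /* 4^31 still fits in a signed 64-bit value. */
    assert(kmer != NULL && k > 0 && k <= 31);

    for (i = 0; i < k; i++) {
        int digit;

        /* kmer[i] | 0x20 folds ASCII uppercase onto lowercase. */
        switch (kmer[i] | 0x20) {
        case 'a': digit = 0; break;
        case 'c': digit = 1; break;
        case 'g': digit = 2; break;
        case 't': digit = 3; break;
        default:  return -1;
        }
        index = index * 4 + digit;
    }
    return index;
}
```

Under this encoding, "ACGT" and "acgt" map to the same index, and a k-mer containing any other character (for example N) is rejected.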
2013-09-12	don't use strtol	Calvin Morrison

2013-09-11	update convert_kmer_to_index for brevity and clarity	Calvin Morrison

2013-09-11	add headers	Calvin Morrison

2013-09-10	Initial commit of some kmer utilities.	Calvin Morrison
	There are two utilities included. One is kmer_frequency_per_sequence, which outputs an (m x n) matrix where each of the m rows is a sequence and each of the n columns gives the frequency of one k-mer in that sequence. The other is kmer_total_count, which counts k-mers for the whole file, not just one sequence.
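
The relationship between the two outputs can be shown with a small sketch. It is not the code of either utility: the function name, the in-memory array of sequences, the row-major matrix layout, and the choice of raw counts rather than normalized frequencies are all assumptions for illustration. Each row holds the k-mer counts of one sequence, and the whole-file total is the column-wise sum of the rows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch, not the repository's utilities.
 * Fills an m x n row-major matrix of k-mer counts (n = 4^k), one row
 * per sequence, and a length-n vector of totals over all sequences.
 * Columns are indexed in base 4 with a=0, c=1, g=2, t=3.
 * Requires 1 <= k <= 15. */
void kmer_matrix_sketch(const char **seqs, size_t m, unsigned k,
                        unsigned long *matrix, unsigned long *total)
{
    unsigned long n, s, j;

    assert(seqs != NULL && matrix != NULL && total != NULL);
    assert(k >= 1 && k <= 15);
    n = 1UL << (2 * k);

    memset(matrix, 0, m * n * sizeof *matrix);
    memset(total, 0, n * sizeof *total);

    for (s = 0; s < m; s++) {
        unsigned long *row = matrix + s * n;
        unsigned long digit, index = 0;
        unsigned run = 0;       /* window restarts for every sequence */
        const char *p;

        for (p = seqs[s]; *p != '\0'; p++) {
            switch (*p | 0x20) {
            case 'a': digit = 0; break;
            case 'c': digit = 1; break;
            case 'g': digit = 2; break;
            case 't': digit = 3; break;
            default:            /* unknown base: restart the window */
                index = 0;
                run = 0;
                continue;
            }
            index = ((index << 2) | digit) & (n - 1);
            if (run < k)
                run++;
            if (run == k)
                row[index]++;
        }

        /* The file-wide total is the sum of the per-sequence rows. */
        for (j = 0; j < n; j++)
            total[j] += row[j];
    }
}
```

Unlike continuous counting, the window here restarts at every sequence, so no k-mer spans two records.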