********************************************************************************
MEME - Motif discovery tool
********************************************************************************
MEME version 4.10.0 (Release date: Wed May 21 10:35:36 2014 +1000)

For further information on how to interpret these results or to get
a copy of the MEME software please access http://meme.nbcr.net.

This file may be used as input to the MAST algorithm for searching
sequence databases for matches to groups of motifs.  MAST is available
for interactive use and downloading at http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
If you use this program in your research, please cite:

Timothy L. Bailey and Charles Elkan,
"Fitting a mixture model by expectation maximization to discover
motifs in biopolymers", Proceedings of the Second International
Conference on Intelligent Systems for Molecular Biology, pp. 28-36,
AAAI Press, Menlo Park, California, 1994.
********************************************************************************


********************************************************************************
TRAINING SET
********************************************************************************
DATAFILE= motifs/55/55.seqs.fa
ALPHABET= ACGT
Sequence name            Weight Length  Sequence name            Weight Length
-------------            ------ ------  -------------            ------ ------
42501                    1.0000    500  31948                    1.0000    500
42917                    1.0000    500  42940                    1.0000    500
43038                    1.0000    500  36485                    1.0000    500
21354                    1.0000    500  47237                    1.0000    500
47417                    1.0000    500  13987                    1.0000    500
43532                    1.0000    500  32759                    1.0000    500
25422                    1.0000    500  49342                    1.0000    500
40261                    1.0000    500  44231                    1.0000    500
33914                    1.0000    500  44891                    1.0000    500
47512                    1.0000    500  48014                    1.0000    500
37585                    1.0000    500  39040                    1.0000    500
********************************************************************************

********************************************************************************
COMMAND LINE SUMMARY
********************************************************************************
This information can also be useful in the event you wish to report a
problem with the MEME software.
command: meme motifs/55/55.seqs.fa -oc motifs/55 -dna -minw 12 -maxw 21 -nmotifs 3 -maxsize 500000

model:  mod=         zoops    nmotifs=         3    evt=           inf
object function=  E-value of product of p-values
width:  minw=           12    maxw=           21    minic=        0.00
width:  wg=             11    ws=              1    endgaps=       yes
nsites: minsites=        2    maxsites=       22    wnsites=       0.8
theta:  prob=            1    spmap=         uni    spfuzz=        0.5
global: substring=     yes    branching=      no    wbranch=        no
em:     prior=   dirichlet    b=            0.01    maxiter=        50
        distance=    1e-05
data:   n=           11000    N=              22
strands: +
sample: seed=            0    seqfrac=         1
Letter frequencies in dataset:
A 0.269 C 0.227 G 0.224 T 0.280
Background letter frequencies (from dataset with add-one prior applied):
A 0.269 C 0.227 G 0.224 T 0.280
********************************************************************************


********************************************************************************
MOTIF  1 MEME    width =  12  sites =  22  llr = 190  E-value = 2.6e-001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 1 Description
--------------------------------------------------------------------------------
Simplified        A  1:1a:6::2a5:
pos.-specific     C  2:5:a:::4:12
probability       G  2:1:::9:2:51
matrix            T  5a4::4192::7

         bits    2.2
                 1.9  *  *
                 1.7  * **    *
                 1.5  * ** *  *
Relative         1.3  * ** ** *
Entropy          1.1  * ** ** *
(12.5 bits)      0.9  * ***** *
                 0.6  * ***** ***
                 0.4  ******* ***
                 0.2 ************
                 0.0 ------------

Multilevel           TTCACAGTCAAT
consensus            C T  T  G G
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47237                       308  6.32e-07 TGCGTTCACA TTTACAGTCAAT TTGAACAGAC
32759                       294  9.98e-07 AACAGCGAAA TTTACTGTCAGT TGAACATGCC
44231                       226  2.23e-06 TTCATTCACA GTCACTGTCAGT CACTATCATT
37585                       434  3.73e-06 AGGTAACTGT TTCACTGTCAGC CGCTTCAGGT
21354                         4  6.83e-06        TGC TTTACTGTAAGT CTTGCTCTAT
49342                       236  8.56e-06 TACAGTTCGG TTCACAGTAAGC AGAGTTTCCC
25422                       312  9.74e-06 TACAGTTTCA GTCACTGTTAGT TATCTGACTG
43038                       224  9.74e-06 ATACGGACGC CTTACAGTTAAT CAATTGACAG
44891                        51  1.75e-05 GTTGTTTTCA TTCACTGTGACT TCGAATCCCA
40261                        46  1.99e-05 CAGGCCATTG TTCACAGTCAAA AGCGAAAACG
33914                        85  2.31e-05 ACAAGAGCAC CTGACAGTGAAT TCAATAACAA
47512                       142  3.07e-05 GTTGGTTCCA TTCACAATCAAT TCTAATCTGT
43532                        56  3.30e-05 AAAAATACGC TTCACAGGGAGT GTGTGAGAAA
36485                       205  3.92e-05 GCTGGTAACG CTAACAGTAAAT GAAAATGCAA
31948                        16  6.50e-05 GACGAAGATA ATCACAGCCAAT TGACACTCCA
39040                       223  9.05e-05 GTGCACCTGG CTAACAGTAAGC TCGCCGGGTA
13987                       283  9.05e-05 ATTTACATCA ATTACAGTTAGG TTAGTGAGCT
48014                        72  1.21e-04 TGGATTGGTT GTCGCAGTCAAC CTACCAATGT
42940                       393  2.18e-04 AAGTCGGCTG ATTAAAGTGAAT GGGCTGCTCT
42917                       305  2.18e-04 AGCGACTCCA TTTACTTTGACT TCGTTGCTTC
42501                        32  2.63e-04 CGAAATTGTA CTTACTTTCAAG CCCCCCCTAT
47417                       394  4.82e-04 CACAAATCAC GTGACTGTTTGT GAAAAATTTA
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47237                             6.3e-07  307_[+1]_181
32759                               1e-06  293_[+1]_195
44231                             2.2e-06  225_[+1]_263
37585                             3.7e-06  433_[+1]_55
21354                             6.8e-06  3_[+1]_485
49342                             8.6e-06  235_[+1]_253
25422                             9.7e-06  311_[+1]_177
43038                             9.7e-06  223_[+1]_265
44891                             1.8e-05  50_[+1]_438
40261                               2e-05  45_[+1]_443
33914                             2.3e-05  84_[+1]_404
47512                             3.1e-05  141_[+1]_347
43532                             3.3e-05  55_[+1]_433
36485                             3.9e-05  204_[+1]_284
31948                             6.5e-05  15_[+1]_473
39040                               9e-05  222_[+1]_266
13987                               9e-05  282_[+1]_206
48014                             0.00012  71_[+1]_417
42940                             0.00022  392_[+1]_96
42917                             0.00022  304_[+1]_184
42501                             0.00026  31_[+1]_457
47417                             0.00048  393_[+1]_95
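
Note on reading the block diagrams above: each diagram has the form `<leading>_[<strand><motif>]_<trailing>`, where the two numbers count the residues before and after the site. The following is a minimal Python sketch of that arithmetic, not part of MEME itself; the helper name `parse_single_site_diagram` is hypothetical, and the motif width (12) is taken from the Motif 1 header.

```python
import re

# Hypothetical helper (not part of MEME): decode a single-site block diagram
# such as "307_[+1]_181" as printed in the "block diagrams" table.
DIAGRAM = re.compile(r"^(\d+)_\[([+-])(\d+)\]_(\d+)$")


def parse_single_site_diagram(diagram, motif_width):
    """Return the motif number, strand, 1-based start and implied sequence length."""
    match = DIAGRAM.match(diagram)
    if match is None:
        raise ValueError(f"unrecognised diagram: {diagram!r}")
    leading, strand, motif, trailing = match.groups()
    leading, trailing = int(leading), int(trailing)
    return {
        "motif": int(motif),
        "strand": strand,
        # The sites table uses 1-based starts, so add one to the leading gap.
        "start": leading + 1,
        # Leading gap + motif width + trailing gap gives the sequence length.
        "sequence_length": leading + motif_width + trailing,
    }


site = parse_single_site_diagram("307_[+1]_181", motif_width=12)
print(site)
```

For sequence 47237 this gives start 308, matching the sites table, and the three parts add up to the 500-residue length listed in the training set.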
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 1 width=12 seqs=22
47237                    (  308) TTTACAGTCAAT  1
32759                    (  294) TTTACTGTCAGT  1
44231                    (  226) GTCACTGTCAGT  1
37585                    (  434) TTCACTGTCAGC  1
21354                    (    4) TTTACTGTAAGT  1
49342                    (  236) TTCACAGTAAGC  1
25422                    (  312) GTCACTGTTAGT  1
43038                    (  224) CTTACAGTTAAT  1
44891                    (   51) TTCACTGTGACT  1
40261                    (   46) TTCACAGTCAAA  1
33914                    (   85) CTGACAGTGAAT  1
47512                    (  142) TTCACAATCAAT  1
43532                    (   56) TTCACAGGGAGT  1
36485                    (  205) CTAACAGTAAAT  1
31948                    (   16) ATCACAGCCAAT  1
39040                    (  223) CTAACAGTAAGC  1
13987                    (  283) ATTACAGTTAGG  1
48014                    (   72) GTCGCAGTCAAC  1
42940                    (  393) ATTAAAGTGAAT  1
42917                    (  305) TTTACTTTGACT  1
42501                    (   32) CTTACTTTCAAG  1
47417                    (  394) GTGACTGTTTGT  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10758 bayes= 8.93074 E= 2.6e-001
   -98      0    -30     70
 -1110  -1110  -1110    184
  -156    100   -130     38
   183  -1110   -230  -1110
  -256    207  -1110  -1110
   114  -1110  -1110     55
  -256  -1110    195   -162
 -1110   -232   -230    170
   -56     85      2    -62
   183  -1110  -1110   -262
    76   -132    102  -1110
  -256    -32   -130    128
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 22 E= 2.6e-001
 0.136364  0.227273  0.181818  0.454545
 0.000000  0.000000  0.000000  1.000000
 0.090909  0.454545  0.090909  0.363636
 0.954545  0.000000  0.045455  0.000000
 0.045455  0.954545  0.000000  0.000000
 0.590909  0.000000  0.000000  0.409091
 0.045455  0.000000  0.863636  0.090909
 0.000000  0.045455  0.045455  0.909091
 0.181818  0.409091  0.227273  0.181818
 0.954545  0.000000  0.000000  0.045455
 0.454545  0.090909  0.454545  0.000000
 0.045455  0.181818  0.090909  0.681818
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 1 regular expression
--------------------------------------------------------------------------------
[TC]T[CT]AC[AT]GT[CG]A[AG]T
--------------------------------------------------------------------------------




Time  4.22 secs.

********************************************************************************


********************************************************************************
MOTIF  2 MEME    width =  16  sites =   7  llr = 105  E-value = 8.8e+001
********************************************************************************
--------------------------------------------------------------------------------
        Motif 2 Description
--------------------------------------------------------------------------------
Simplified        A  941:::::a:6::73:
pos.-specific     C  :63:179::a:17:79
probability       G  1:1::1::::493::1
matrix            T  ::4a911a:::::3::

         bits    2.2          *
                 1.9    *   ***
                 1.7    *   ***
                 1.5    *  **** *   *
Relative         1.3 *  ** **** ** **
Entropy          1.1 ** ** **********
(21.6 bits)      0.9 ** *************
                 0.6 ** *************
                 0.4 ** *************
                 0.2 ****************
                 0.0 ----------------

Multilevel           ACTTTCCTACAGCACC
consensus             AC       G GTA
sequence

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                  Site
-------------             ----- ---------            ----------------
42501                       329  1.23e-09 AAACGGCGAA AATTTCCTACGGCACC CAAATCCATC
42940                       175  1.45e-08 TTTCTCGTCA ACTTTCTTACAGCACC AACCATCACC
43532                       268  8.34e-08 GTCAGCACTC ACGTCCCTACGGGACC CGTTGTCATT
25422                       373  9.76e-08 TCATAAAACA ACATTGCTACAGCTCC CGAGAGAGGC
47237                       411  1.07e-07 AGTCACTCAT AACTTTCTACAGCAAC GAAGCCATCG
47512                       417  2.00e-07 GCTTTCCCCT ACTTTCCTACACGACG ACAGTGCTGT
39040                       100  2.14e-07 CGAAATTCTG GACTTCCTACGGCTAC TTTGTCGTTT
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42501                             1.2e-09  328_[+2]_156
42940                             1.5e-08  174_[+2]_310
43532                             8.3e-08  267_[+2]_217
25422                             9.8e-08  372_[+2]_112
47237                             1.1e-07  410_[+2]_74
47512                               2e-07  416_[+2]_68
39040                             2.1e-07  99_[+2]_385
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 2 width=16 seqs=7
42501                    (  329) AATTTCCTACGGCACC  1
42940                    (  175) ACTTTCTTACAGCACC  1
43532                    (  268) ACGTCCCTACGGGACC  1
25422                    (  373) ACATTGCTACAGCTCC  1
47237                    (  411) AACTTTCTACAGCAAC  1
47512                    (  417) ACTTTCCTACACGACG  1
39040                    (  100) GACTTCCTACGGCTAC  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 16 n= 10670 bayes= 10.4167 E= 8.8e+001
   167   -945    -65   -945
    67    133   -945   -945
   -91     33    -65     61
  -945   -945   -945    184
  -945    -67   -945    161
  -945    165    -65    -97
  -945    191   -945    -97
  -945   -945   -945    184
   189   -945   -945   -945
  -945    213   -945   -945
   109   -945     94   -945
  -945    -67    194   -945
  -945    165     35   -945
   141   -945   -945      3
     9    165   -945   -945
  -945    191    -65   -945
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 16 nsites= 7 E= 8.8e+001
 0.857143  0.000000  0.142857  0.000000
 0.428571  0.571429  0.000000  0.000000
 0.142857  0.285714  0.142857  0.428571
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.142857  0.000000  0.857143
 0.000000  0.714286  0.142857  0.142857
 0.000000  0.857143  0.000000  0.142857
 0.000000  0.000000  0.000000  1.000000
 1.000000  0.000000  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.571429  0.000000  0.428571  0.000000
 0.000000  0.142857  0.857143  0.000000
 0.000000  0.714286  0.285714  0.000000
 0.714286  0.000000  0.000000  0.285714
 0.285714  0.714286  0.000000  0.000000
 0.000000  0.857143  0.142857  0.000000
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 2 regular expression
--------------------------------------------------------------------------------
A[CA][TC]TTCCTAC[AG]G[CG][AT][CA]C
--------------------------------------------------------------------------------




Time  8.16 secs.
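
Note on the "(21.6 bits)" figure in the Motif 2 description: assuming it is the usual relative entropy of the motif against the background, i.e. the sum over positions and letters of p * log2(p / background), it can be recomputed from the letter-probability matrix above. The Python sketch below does this with the background frequencies printed in the command line summary; the names `MOTIF2_PPM` and `relative_entropy_bits` are illustrative only and are not MEME identifiers. The result should land close to the reported value, though small differences are possible because the background frequencies are printed to three decimals only.

```python
import math

# Background letter frequencies as printed in the command line summary.
BACKGROUND = {"A": 0.269, "C": 0.227, "G": 0.224, "T": 0.280}

# Motif 2 letter-probability matrix; columns follow ALPHABET= ACGT.
MOTIF2_PPM = [
    (0.857143, 0.000000, 0.142857, 0.000000),
    (0.428571, 0.571429, 0.000000, 0.000000),
    (0.142857, 0.285714, 0.142857, 0.428571),
    (0.000000, 0.000000, 0.000000, 1.000000),
    (0.000000, 0.142857, 0.000000, 0.857143),
    (0.000000, 0.714286, 0.142857, 0.142857),
    (0.000000, 0.857143, 0.000000, 0.142857),
    (0.000000, 0.000000, 0.000000, 1.000000),
    (1.000000, 0.000000, 0.000000, 0.000000),
    (0.000000, 1.000000, 0.000000, 0.000000),
    (0.571429, 0.000000, 0.428571, 0.000000),
    (0.000000, 0.142857, 0.857143, 0.000000),
    (0.000000, 0.714286, 0.285714, 0.000000),
    (0.714286, 0.000000, 0.000000, 0.285714),
    (0.285714, 0.714286, 0.000000, 0.000000),
    (0.000000, 0.857143, 0.142857, 0.000000),
]


def relative_entropy_bits(ppm, background, letters="ACGT"):
    """Sum p * log2(p / background) over all positions and letters."""
    total = 0.0
    for row in ppm:
        for letter, p in zip(letters, row):
            if p > 0.0:  # zero-probability letters contribute nothing
                total += p * math.log2(p / background[letter])
    return total


total_bits = relative_entropy_bits(MOTIF2_PPM, BACKGROUND)
print(f"Motif 2 relative entropy: {total_bits:.1f} bits")
```

Applying the same function to a single row gives the per-position value that the star columns in the "bits" diagram visualise.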
********************************************************************************


********************************************************************************
MOTIF  3 MEME    width =  12  sites =   9  llr = 104  E-value = 1.5e+002
********************************************************************************
--------------------------------------------------------------------------------
        Motif 3 Description
--------------------------------------------------------------------------------
Simplified        A  43:::9:4:::9
pos.-specific     C  :22::1a1::::
probability       G  6221a:::a:a:
matrix            T  :269:::4:a:1

         bits    2.2     * * * *
                 1.9     * * ***
                 1.7     * * ***
                 1.5     *** ***
Relative         1.3    **** ****
Entropy          1.1 *  **** ****
(16.7 bits)      0.9 *  **** ****
                 0.6 * ***** ****
                 0.4 * **********
                 0.2 * **********
                 0.0 ------------

Multilevel           GATTGACAGTGA
consensus            ACC    T
sequence              GG
                      T
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 sites sorted by position p-value
--------------------------------------------------------------------------------
Sequence name             Start   P-value                Site
-------------             ----- ---------            ------------
47512                        13  1.34e-07 GCGACTGATG GATTGACAGTGA TGGATTTGTG
32759                       244  1.34e-07 ACTGAAACAC GATTGACTGTGA ACGGGATCCA
44231                       194  1.67e-06 CGGTGTTAGT GGCTGACAGTGA CAACGAATTT
47237                       194  2.65e-06 ACATTCACTC ACGTGACTGTGA AATATTACTG
25422                       325  3.01e-06 ACTGTTAGTT ATCTGACTGTGA GACAGTGAGT
37585                       102  4.29e-06 AAGTCATTGC GGTTGCCTGTGA GGAAGGACTA
48014                       280  5.00e-06 AATACCGATC GTTGGACAGTGA CACGGTTACG
43038                       237  5.77e-06 ACAGTTAATC AATTGACAGTGT TTCGCGATGA
42940                       315  6.51e-06 ACCATATTCG ACGTGACCGTGA GAACAATTGC
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 block diagrams
--------------------------------------------------------------------------------
SEQUENCE NAME            POSITION P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
47512                             1.3e-07  12_[+3]_476
32759                             1.3e-07  243_[+3]_245
44231                             1.7e-06  193_[+3]_295
47237                             2.6e-06  193_[+3]_295
25422                               3e-06  324_[+3]_164
37585                             4.3e-06  101_[+3]_387
48014                               5e-06  279_[+3]_209
43038                             5.8e-06  236_[+3]_252
42940                             6.5e-06  314_[+3]_174
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 in BLOCKS format
--------------------------------------------------------------------------------
BL   MOTIF 3 width=12 seqs=9
47512                    (   13) GATTGACAGTGA  1
32759                    (  244) GATTGACTGTGA  1
44231                    (  194) GGCTGACAGTGA  1
47237                    (  194) ACGTGACTGTGA  1
25422                    (  325) ATCTGACTGTGA  1
37585                    (  102) GGTTGCCTGTGA  1
48014                    (  280) GTTGGACAGTGA  1
43038                    (  237) AATTGACAGTGT  1
42940                    (  315) ACGTGACCGTGA  1
//

--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific scoring matrix
--------------------------------------------------------------------------------
log-odds matrix: alength= 4 w= 12 n= 10758 bayes= 10.3564 E= 1.5e+002
    72   -982    131   -982
    31     -3     -1    -33
  -982     -3     -1     99
  -982   -982   -101    167
  -982   -982    216   -982
   172   -103   -982   -982
  -982    214   -982   -982
    72   -103   -982     67
  -982   -982    216   -982
  -982   -982   -982    184
  -982   -982    216   -982
   172   -982   -982   -133
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 position-specific probability matrix
--------------------------------------------------------------------------------
letter-probability matrix: alength= 4 w= 12 nsites= 9 E= 1.5e+002
 0.444444  0.000000  0.555556  0.000000
 0.333333  0.222222  0.222222  0.222222
 0.000000  0.222222  0.222222  0.555556
 0.000000  0.000000  0.111111  0.888889
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.111111  0.000000  0.000000
 0.000000  1.000000  0.000000  0.000000
 0.444444  0.111111  0.000000  0.444444
 0.000000  0.000000  1.000000  0.000000
 0.000000  0.000000  0.000000  1.000000
 0.000000  0.000000  1.000000  0.000000
 0.888889  0.000000  0.000000  0.111111
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
        Motif 3 regular expression
--------------------------------------------------------------------------------
[GA][ACGT][TCG]TGAC[AT]GTGA
--------------------------------------------------------------------------------




Time 12.35 secs.

********************************************************************************


********************************************************************************
SUMMARY OF MOTIFS
********************************************************************************

--------------------------------------------------------------------------------
        Combined block diagrams: non-overlapping sites with p-value < 0.0001
--------------------------------------------------------------------------------
SEQUENCE NAME            COMBINED P-VALUE  MOTIF DIAGRAM
-------------            ----------------  -------------
42501                            1.07e-05  328_[+2(1.23e-09)]_156
31948                            1.62e-01  15_[+1(6.50e-05)]_473
42917                            1.07e-01  500
42940                            4.95e-07  174_[+2(1.45e-08)]_124_\
    [+3(6.51e-06)]_174
43038                            6.26e-04  223_[+1(9.74e-06)]_1_[+3(5.77e-06)]_\
    252
36485                            2.96e-02  204_[+1(3.92e-05)]_284
21354                            2.14e-02  3_[+1(6.83e-06)]_485
47237                            6.82e-09  96_[+3(8.13e-05)]_85_[+3(2.65e-06)]_\
    102_[+1(6.32e-07)]_91_[+2(1.07e-07)]_74
47417                            2.24e-01  500
13987                            2.97e-01  282_[+1(9.05e-05)]_206
43532                            1.40e-05  55_[+1(3.30e-05)]_200_\
    [+2(8.34e-08)]_217
32759                            4.48e-06  243_[+3(1.34e-07)]_38_\
    [+1(9.98e-07)]_195
25422                            8.64e-08  311_[+1(9.74e-06)]_1_[+3(3.01e-06)]_\
    36_[+2(9.76e-08)]_112
49342                            8.98e-03  131_[+1(1.99e-05)]_92_\
    [+1(8.56e-06)]_253
40261                            5.68e-02  45_[+1(1.99e-05)]_443
44231                            4.08e-05  193_[+3(1.67e-06)]_20_\
    [+1(2.23e-06)]_263
33914                            1.07e-03  82_[+3(2.65e-06)]_406
44891                            5.72e-02  50_[+1(1.75e-05)]_32_[+1(5.99e-05)]_\
    394
47512                            2.75e-08  12_[+3(1.34e-07)]_117_\
    [+1(3.07e-05)]_263_[+2(2.00e-07)]_68
48014                            3.73e-03  279_[+3(5.00e-06)]_209
37585                            2.32e-04  101_[+3(4.29e-06)]_320_\
    [+1(3.73e-06)]_55
39040                            1.25e-04  99_[+2(2.14e-07)]_107_\
    [+1(9.05e-05)]_266
--------------------------------------------------------------------------------

********************************************************************************


********************************************************************************
Stopped because nmotifs = 3 reached.
********************************************************************************

CPU: seaotter.hsd1.wa.comcast.net

********************************************************************************
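
Note on the combined block diagrams above: each diagram alternates gap lengths with site tokens of the form `[<strand><motif>(<p-value>)]`, and a trailing backslash marks a diagram continued on the next line. As a rough sketch (not MEME or MAST code), the Python below turns one diagram into a list of sites with 1-based start positions. The function name `parse_combined_diagram` and the `MOTIF_WIDTHS` table are assumptions made for this example; the widths are copied from the three motif headers.

```python
import re

# Motif widths taken from the MOTIF 1/2/3 header lines of this report.
MOTIF_WIDTHS = {1: 12, 2: 16, 3: 12}

# Matches one site token, e.g. "[+2(1.45e-08)]".
SITE = re.compile(r"\[([+-])(\d+)\(([^)]+)\)\]")


def parse_combined_diagram(diagram, widths):
    """Return (sites, total_length) for one combined block diagram."""
    # Join continuation lines: drop each backslash and the whitespace after it.
    diagram = re.sub(r"\\\s*", "", diagram).strip()
    sites = []
    position = 0  # number of residues consumed so far
    for token in diagram.split("_"):
        match = SITE.fullmatch(token)
        if match:
            strand, motif, p_value = match.groups()
            motif = int(motif)
            sites.append({
                "motif": motif,
                "strand": strand,
                "start": position + 1,  # 1-based, as in the sites tables
                "p_value": float(p_value),
            })
            position += widths[motif]
        else:
            position += int(token)  # plain number: a gap between sites
    return sites, position


# Diagram for sequence 42940, including its line continuation.
example = "174_[+2(1.45e-08)]_124_\\\n    [+3(6.51e-06)]_174"
sites, length = parse_combined_diagram(example, MOTIF_WIDTHS)
for site in sites:
    print(site)
print("sequence length:", length)
```

For sequence 42940 the recovered starts (175 for motif 2, 315 for motif 3) agree with the per-motif sites tables, and the consumed length equals the 500 residues listed in the training set. A diagram with no sites, such as `500`, yields an empty site list.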